[2024-06-12 12:58:55,378][62882] Saving configuration to /workspace/metta/train_dir/p2.death/config.json... [2024-06-12 12:58:55,411][62882] Rollout worker 0 uses device cpu [2024-06-12 12:58:55,412][62882] Rollout worker 1 uses device cpu [2024-06-12 12:58:55,412][62882] Rollout worker 2 uses device cpu [2024-06-12 12:58:55,412][62882] Rollout worker 3 uses device cpu [2024-06-12 12:58:55,412][62882] Rollout worker 4 uses device cpu [2024-06-12 12:58:55,413][62882] Rollout worker 5 uses device cpu [2024-06-12 12:58:55,413][62882] Rollout worker 6 uses device cpu [2024-06-12 12:58:55,413][62882] Rollout worker 7 uses device cpu [2024-06-12 12:58:55,413][62882] Rollout worker 8 uses device cpu [2024-06-12 12:58:55,413][62882] Rollout worker 9 uses device cpu [2024-06-12 12:58:55,414][62882] Rollout worker 10 uses device cpu [2024-06-12 12:58:55,414][62882] Rollout worker 11 uses device cpu [2024-06-12 12:58:55,414][62882] Rollout worker 12 uses device cpu [2024-06-12 12:58:55,414][62882] Rollout worker 13 uses device cpu [2024-06-12 12:58:55,414][62882] Rollout worker 14 uses device cpu [2024-06-12 12:58:55,415][62882] Rollout worker 15 uses device cpu [2024-06-12 12:58:55,415][62882] Rollout worker 16 uses device cpu [2024-06-12 12:58:55,415][62882] Rollout worker 17 uses device cpu [2024-06-12 12:58:55,415][62882] Rollout worker 18 uses device cpu [2024-06-12 12:58:55,415][62882] Rollout worker 19 uses device cpu [2024-06-12 12:58:55,415][62882] Rollout worker 20 uses device cpu [2024-06-12 12:58:55,416][62882] Rollout worker 21 uses device cpu [2024-06-12 12:58:55,416][62882] Rollout worker 22 uses device cpu [2024-06-12 12:58:55,416][62882] Rollout worker 23 uses device cpu [2024-06-12 12:58:55,416][62882] Rollout worker 24 uses device cpu [2024-06-12 12:58:55,416][62882] Rollout worker 25 uses device cpu [2024-06-12 12:58:55,417][62882] Rollout worker 26 uses device cpu [2024-06-12 12:58:55,417][62882] Rollout worker 27 uses device cpu [2024-06-12 
12:58:55,417][62882] Rollout worker 28 uses device cpu [2024-06-12 12:58:55,417][62882] Rollout worker 29 uses device cpu [2024-06-12 12:58:55,417][62882] Rollout worker 30 uses device cpu [2024-06-12 12:58:55,418][62882] Rollout worker 31 uses device cpu [2024-06-12 12:58:55,984][62882] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 12:58:55,984][62882] InferenceWorker_p0-w0: min num requests: 10 [2024-06-12 12:58:56,027][62882] Starting all processes... [2024-06-12 12:58:56,027][62882] Starting process learner_proc0 [2024-06-12 12:58:56,290][62882] Starting all processes... [2024-06-12 12:58:56,292][62882] Starting process inference_proc0-0 [2024-06-12 12:58:56,292][62882] Starting process rollout_proc0 [2024-06-12 12:58:56,292][62882] Starting process rollout_proc1 [2024-06-12 12:58:56,292][62882] Starting process rollout_proc2 [2024-06-12 12:58:56,292][62882] Starting process rollout_proc3 [2024-06-12 12:58:56,292][62882] Starting process rollout_proc4 [2024-06-12 12:58:56,292][62882] Starting process rollout_proc5 [2024-06-12 12:58:56,292][62882] Starting process rollout_proc6 [2024-06-12 12:58:56,293][62882] Starting process rollout_proc7 [2024-06-12 12:58:56,293][62882] Starting process rollout_proc8 [2024-06-12 12:58:56,294][62882] Starting process rollout_proc9 [2024-06-12 12:58:56,295][62882] Starting process rollout_proc10 [2024-06-12 12:58:56,295][62882] Starting process rollout_proc11 [2024-06-12 12:58:56,296][62882] Starting process rollout_proc12 [2024-06-12 12:58:56,297][62882] Starting process rollout_proc13 [2024-06-12 12:58:56,297][62882] Starting process rollout_proc14 [2024-06-12 12:58:56,297][62882] Starting process rollout_proc15 [2024-06-12 12:58:56,298][62882] Starting process rollout_proc16 [2024-06-12 12:58:56,301][62882] Starting process rollout_proc17 [2024-06-12 12:58:56,301][62882] Starting process rollout_proc18 [2024-06-12 12:58:56,301][62882] Starting process rollout_proc19 [2024-06-12 12:58:56,303][62882] 
Starting process rollout_proc20 [2024-06-12 12:58:56,304][62882] Starting process rollout_proc21 [2024-06-12 12:58:56,307][62882] Starting process rollout_proc22 [2024-06-12 12:58:56,308][62882] Starting process rollout_proc23 [2024-06-12 12:58:56,308][62882] Starting process rollout_proc24 [2024-06-12 12:58:56,311][62882] Starting process rollout_proc25 [2024-06-12 12:58:56,311][62882] Starting process rollout_proc26 [2024-06-12 12:58:56,314][62882] Starting process rollout_proc27 [2024-06-12 12:58:56,316][62882] Starting process rollout_proc28 [2024-06-12 12:58:56,316][62882] Starting process rollout_proc29 [2024-06-12 12:58:56,317][62882] Starting process rollout_proc30 [2024-06-12 12:58:56,318][62882] Starting process rollout_proc31 [2024-06-12 12:58:58,360][63127] Worker 7 uses CPU cores [7] [2024-06-12 12:58:58,387][63099] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 12:58:58,387][63099] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-06-12 12:58:58,396][63099] Num visible devices: 1 [2024-06-12 12:58:58,408][63125] Worker 5 uses CPU cores [5] [2024-06-12 12:58:58,420][63099] Setting fixed seed 0 [2024-06-12 12:58:58,421][63099] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 12:58:58,421][63099] Initializing actor-critic model on device cuda:0 [2024-06-12 12:58:58,479][63131] Worker 13 uses CPU cores [13] [2024-06-12 12:58:58,480][63140] Worker 19 uses CPU cores [19] [2024-06-12 12:58:58,487][63145] Worker 26 uses CPU cores [26] [2024-06-12 12:58:58,504][63146] Worker 25 uses CPU cores [25] [2024-06-12 12:58:58,534][63132] Worker 12 uses CPU cores [12] [2024-06-12 12:58:58,535][63135] Worker 15 uses CPU cores [15] [2024-06-12 12:58:58,544][63138] Worker 18 uses CPU cores [18] [2024-06-12 12:58:58,564][63120] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 12:58:58,564][63120] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for 
inference process 0 [2024-06-12 12:58:58,578][63137] Worker 17 uses CPU cores [17] [2024-06-12 12:58:58,580][63121] Worker 2 uses CPU cores [2] [2024-06-12 12:58:58,580][63147] Worker 27 uses CPU cores [27] [2024-06-12 12:58:58,580][63120] Num visible devices: 1 [2024-06-12 12:58:58,592][63143] Worker 23 uses CPU cores [23] [2024-06-12 12:58:58,592][63144] Worker 24 uses CPU cores [24] [2024-06-12 12:58:58,596][63141] Worker 21 uses CPU cores [21] [2024-06-12 12:58:58,603][63136] Worker 16 uses CPU cores [16] [2024-06-12 12:58:58,604][63126] Worker 6 uses CPU cores [6] [2024-06-12 12:58:58,611][63139] Worker 20 uses CPU cores [20] [2024-06-12 12:58:58,640][63123] Worker 1 uses CPU cores [1] [2024-06-12 12:58:58,644][63130] Worker 9 uses CPU cores [9] [2024-06-12 12:58:58,658][63133] Worker 11 uses CPU cores [11] [2024-06-12 12:58:58,687][63119] Worker 0 uses CPU cores [0] [2024-06-12 12:58:58,692][63124] Worker 4 uses CPU cores [4] [2024-06-12 12:58:58,726][63134] Worker 14 uses CPU cores [14] [2024-06-12 12:58:58,728][63128] Worker 8 uses CPU cores [8] [2024-06-12 12:58:58,738][63142] Worker 22 uses CPU cores [22] [2024-06-12 12:58:58,763][63149] Worker 31 uses CPU cores [31] [2024-06-12 12:58:58,771][63122] Worker 3 uses CPU cores [3] [2024-06-12 12:58:58,827][63151] Worker 29 uses CPU cores [29] [2024-06-12 12:58:58,864][63150] Worker 28 uses CPU cores [28] [2024-06-12 12:58:58,888][63129] Worker 10 uses CPU cores [10] [2024-06-12 12:58:58,889][63148] Worker 30 uses CPU cores [30] [2024-06-12 12:58:59,247][63099] RunningMeanStd input shape: (11, 11) [2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11) [2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11) [2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11) [2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11) [2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11) [2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11) 
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,248][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,252][63099] RunningMeanStd input shape: (1,)
[2024-06-12 12:58:59,252][63099] RunningMeanStd input shape: (1,)
[2024-06-12 12:58:59,252][63099] RunningMeanStd input shape: (1,)
[2024-06-12 12:58:59,252][63099] RunningMeanStd input shape: (1,)
[2024-06-12 12:58:59,252][63099] RunningMeanStd input shape: (11, 11)
[2024-06-12 12:58:59,292][63099] RunningMeanStd input shape: (1,)
[2024-06-12 12:58:59,297][63099] Created Actor Critic model with architecture:
[2024-06-12 12:58:59,297][63099] SampleFactoryAgentWrapper(
  (obs_normalizer): ObservationNormalizer()
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (agent): MettaAgent(
    (_encoder): MultiFeatureSetEncoder(
      (feature_set_encoders): ModuleDict(
        (grid_obs): FeatureSetEncoder(
          (_normalizer): FeatureListNormalizer(
            (_norms_dict): ModuleDict(
              (agent): RunningMeanStdInPlace()
              (altar): RunningMeanStdInPlace()
              (converter): RunningMeanStdInPlace()
              (generator): RunningMeanStdInPlace()
              (wall): RunningMeanStdInPlace()
              (agent:dir): RunningMeanStdInPlace()
              (agent:energy): RunningMeanStdInPlace()
              (agent:frozen): RunningMeanStdInPlace()
              (agent:hp): RunningMeanStdInPlace()
              (agent:id): RunningMeanStdInPlace()
              (agent:inv_r1): RunningMeanStdInPlace()
              (agent:inv_r2): RunningMeanStdInPlace()
              (agent:inv_r3): RunningMeanStdInPlace()
              (agent:shield): RunningMeanStdInPlace()
              (altar:hp): RunningMeanStdInPlace()
              (altar:state): RunningMeanStdInPlace()
              (converter:hp): RunningMeanStdInPlace()
              (converter:state): RunningMeanStdInPlace()
              (generator:amount): RunningMeanStdInPlace()
              (generator:hp): RunningMeanStdInPlace()
              (generator:state): RunningMeanStdInPlace()
              (wall:hp): RunningMeanStdInPlace()
            )
          )
          (embedding_net): Sequential(
            (0): Linear(in_features=125, out_features=512, bias=True)
            (1): ELU(alpha=1.0)
            (2): Linear(in_features=512, out_features=512, bias=True)
            (3): ELU(alpha=1.0)
            (4): Linear(in_features=512, out_features=512, bias=True)
            (5): ELU(alpha=1.0)
            (6): Linear(in_features=512, out_features=512, bias=True)
            (7): ELU(alpha=1.0)
          )
        )
        (global_vars): FeatureSetEncoder(
          (_normalizer): FeatureListNormalizer(
            (_norms_dict): ModuleDict(
              (_steps): RunningMeanStdInPlace()
            )
          )
          (embedding_net): Sequential(
            (0): Linear(in_features=5, out_features=8, bias=True)
            (1): ELU(alpha=1.0)
            (2): Linear(in_features=8, out_features=8, bias=True)
            (3): ELU(alpha=1.0)
          )
        )
        (last_action): FeatureSetEncoder(
          (_normalizer): FeatureListNormalizer(
            (_norms_dict): ModuleDict(
              (last_action_id): RunningMeanStdInPlace()
              (last_action_val): RunningMeanStdInPlace()
            )
          )
          (embedding_net): Sequential(
            (0): Linear(in_features=5, out_features=8, bias=True)
            (1): ELU(alpha=1.0)
            (2): Linear(in_features=8, out_features=8, bias=True)
            (3): ELU(alpha=1.0)
          )
        )
        (last_reward): FeatureSetEncoder(
          (_normalizer): FeatureListNormalizer(
            (_norms_dict): ModuleDict(
              (last_reward): RunningMeanStdInPlace()
            )
          )
          (embedding_net): Sequential(
            (0): Linear(in_features=5, out_features=8, bias=True)
            (1): ELU(alpha=1.0)
            (2): Linear(in_features=8, out_features=8, bias=True)
            (3): ELU(alpha=1.0)
          )
        )
        (kinship): FeatureSetEncoder(
          (_normalizer): FeatureListNormalizer(
            (_norms_dict): ModuleDict(
              (kinship): RunningMeanStdInPlace()
            )
          )
          (embedding_net): Sequential(
            (0): Linear(in_features=125, out_features=8, bias=True)
            (1): ELU(alpha=1.0)
            (2): Linear(in_features=8, out_features=8, bias=True)
            (3): ELU(alpha=1.0)
          )
        )
      )
      (merged_encoder): Sequential(
        (0): Linear(in_features=544, out_features=512, bias=True)
        (1): ELU(alpha=1.0)
        (2): Linear(in_features=512, out_features=512, bias=True)
        (3): ELU(alpha=1.0)
        (4): Linear(in_features=512, out_features=512, bias=True)
        (5): ELU(alpha=1.0)
      )
    )
    (_core): ModelCoreRNN(
      (core): GRU(512, 512)
    )
    (_decoder): Decoder(
      (mlp): Identity()
    )
    (_critic_linear): Linear(in_features=512, out_features=1, bias=True)
    (_action_parameterization): ActionParameterizationDefault(
      (distribution_linear): Linear(in_features=512, out_features=16, bias=True)
    )
  )
)
[2024-06-12 12:58:59,355][63099] Using optimizer
[2024-06-12 12:58:59,535][63099] No checkpoints found
[2024-06-12 12:58:59,535][63099] Did not load from checkpoint, starting from scratch!
[2024-06-12 12:58:59,535][63099] Initialized policy 0 weights for model version 0
[2024-06-12 12:58:59,536][63099] LearnerWorker_p0 finished initialization!
[2024-06-12 12:58:59,536][63099] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 12:59:00,234][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,235][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,236][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,236][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,236][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,236][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,236][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,239][63120] RunningMeanStd input shape: (1,) [2024-06-12 12:59:00,239][63120] RunningMeanStd input shape: (1,) [2024-06-12 12:59:00,239][63120] RunningMeanStd input shape: (1,) [2024-06-12 12:59:00,239][63120] RunningMeanStd input shape: (1,) [2024-06-12 12:59:00,240][63120] RunningMeanStd input shape: (11, 11) [2024-06-12 12:59:00,278][63120] 
RunningMeanStd input shape: (1,) [2024-06-12 12:59:00,299][62882] Inference worker 0-0 is ready! [2024-06-12 12:59:00,300][62882] All inference workers are ready! Signal rollout workers to start! [2024-06-12 12:59:02,606][63136] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,630][63139] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,631][63140] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,632][63141] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,639][63146] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,643][63143] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,644][63137] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,645][63151] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,649][63145] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,649][63144] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,650][63142] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,652][63138] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,655][63149] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,657][63148] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,685][63150] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,698][63127] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,706][63130] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,709][63123] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,712][63133] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,717][63122] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,720][63124] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,721][63135] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,731][63129] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,733][63131] Decorrelating experience for 0 frames... 
[2024-06-12 12:59:02,734][63121] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,735][63125] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,742][63119] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,742][63134] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,743][63132] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,744][63128] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,748][63126] Decorrelating experience for 0 frames... [2024-06-12 12:59:02,758][63147] Decorrelating experience for 0 frames... [2024-06-12 12:59:03,134][62882] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-12 12:59:03,905][63136] Decorrelating experience for 256 frames... [2024-06-12 12:59:03,937][63139] Decorrelating experience for 256 frames... [2024-06-12 12:59:03,941][63141] Decorrelating experience for 256 frames... [2024-06-12 12:59:03,959][63146] Decorrelating experience for 256 frames... [2024-06-12 12:59:03,960][63143] Decorrelating experience for 256 frames... [2024-06-12 12:59:03,963][63151] Decorrelating experience for 256 frames... [2024-06-12 12:59:03,964][63137] Decorrelating experience for 256 frames... [2024-06-12 12:59:03,965][63140] Decorrelating experience for 256 frames... [2024-06-12 12:59:03,980][63142] Decorrelating experience for 256 frames... [2024-06-12 12:59:03,982][63149] Decorrelating experience for 256 frames... [2024-06-12 12:59:03,983][63145] Decorrelating experience for 256 frames... [2024-06-12 12:59:03,984][63144] Decorrelating experience for 256 frames... [2024-06-12 12:59:03,985][63138] Decorrelating experience for 256 frames... [2024-06-12 12:59:03,995][63148] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,001][63127] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,019][63130] Decorrelating experience for 256 frames... 
[2024-06-12 12:59:04,020][63123] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,038][63133] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,039][63122] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,044][63135] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,052][63126] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,053][63124] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,060][63129] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,062][63131] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,065][63128] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,067][63121] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,068][63125] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,068][63119] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,074][63134] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,077][63132] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,077][63150] Decorrelating experience for 256 frames... [2024-06-12 12:59:04,096][63147] Decorrelating experience for 256 frames... [2024-06-12 12:59:08,134][62882] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 12820.0. Samples: 64100. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-12 12:59:08,135][62882] Avg episode reward: [(0, '0.000')] [2024-06-12 12:59:10,716][63123] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-06-12 12:59:10,756][63122] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-06-12 12:59:10,795][63121] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-06-12 12:59:10,829][63127] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-06-12 12:59:10,834][63124] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-06-12 12:59:10,840][63125] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-06-12 12:59:10,843][63130] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-06-12 12:59:10,857][63129] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-06-12 12:59:10,869][63126] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-06-12 12:59:10,877][63135] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-06-12 12:59:10,883][63131] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-06-12 12:59:10,887][63134] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-06-12 12:59:10,898][63128] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-06-12 12:59:10,907][63099] Signal inference workers to stop experience collection... 
[2024-06-12 12:59:10,910][63132] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-06-12 12:59:10,912][63144] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-06-12 12:59:10,916][63120] InferenceWorker_p0-w0: stopping experience collection [2024-06-12 12:59:10,917][63138] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-06-12 12:59:11,455][63099] Signal inference workers to resume experience collection... [2024-06-12 12:59:11,456][63120] InferenceWorker_p0-w0: resuming experience collection [2024-06-12 12:59:11,475][63133] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-06-12 12:59:11,482][63143] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-06-12 12:59:11,486][63141] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-06-12 12:59:11,491][63145] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-06-12 12:59:11,691][63151] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-06-12 12:59:11,741][63142] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-06-12 12:59:11,806][63136] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-06-12 12:59:11,811][63139] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-06-12 12:59:11,811][63140] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-06-12 12:59:11,828][63137] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-06-12 12:59:11,854][63148] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-06-12 12:59:11,854][63146] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-06-12 12:59:11,912][63149] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-06-12 12:59:11,965][63147] Worker 27, sleep for 126.562 sec to decorrelate experience 
collection [2024-06-12 12:59:12,242][63150] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-06-12 12:59:12,585][63120] Updated weights for policy 0, policy_version 10 (0.0013) [2024-06-12 12:59:13,135][62882] Fps is (10 sec: 16383.5, 60 sec: 16383.5, 300 sec: 16383.5). Total num frames: 163840. Throughput: 0: 32879.1. Samples: 328800. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-12 12:59:13,135][62882] Avg episode reward: [(0, '0.000')] [2024-06-12 12:59:13,147][63099] Saving new best policy, reward=0.000! [2024-06-12 12:59:15,427][63123] Worker 1 awakens! [2024-06-12 12:59:15,981][62882] Heartbeat connected on Batcher_0 [2024-06-12 12:59:15,983][62882] Heartbeat connected on LearnerWorker_p0 [2024-06-12 12:59:15,999][62882] Heartbeat connected on RolloutWorker_w0 [2024-06-12 12:59:16,001][62882] Heartbeat connected on RolloutWorker_w1 [2024-06-12 12:59:16,019][62882] Heartbeat connected on InferenceWorker_p0-w0 [2024-06-12 12:59:18,134][62882] Fps is (10 sec: 16383.6, 60 sec: 10922.5, 300 sec: 10922.5). Total num frames: 163840. Throughput: 0: 22138.3. Samples: 332080. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-12 12:59:18,135][62882] Avg episode reward: [(0, '0.000')] [2024-06-12 12:59:20,218][63121] Worker 2 awakens! [2024-06-12 12:59:20,226][62882] Heartbeat connected on RolloutWorker_w2 [2024-06-12 12:59:23,134][62882] Fps is (10 sec: 1638.4, 60 sec: 9011.1, 300 sec: 9011.1). Total num frames: 180224. Throughput: 0: 17305.9. Samples: 346120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 10.0) [2024-06-12 12:59:23,135][62882] Avg episode reward: [(0, '0.000')] [2024-06-12 12:59:24,890][63122] Worker 3 awakens! [2024-06-12 12:59:24,894][62882] Heartbeat connected on RolloutWorker_w3 [2024-06-12 12:59:28,134][62882] Fps is (10 sec: 3276.9, 60 sec: 7864.3, 300 sec: 7864.3). Total num frames: 196608. Throughput: 0: 14762.4. Samples: 369060. 
Policy #0 lag: (min: 0.0, avg: 1.4, max: 11.0) [2024-06-12 12:59:28,134][62882] Avg episode reward: [(0, '0.000')] [2024-06-12 12:59:29,678][63124] Worker 4 awakens! [2024-06-12 12:59:29,687][62882] Heartbeat connected on RolloutWorker_w4 [2024-06-12 12:59:33,134][62882] Fps is (10 sec: 6553.7, 60 sec: 8192.0, 300 sec: 8192.0). Total num frames: 245760. Throughput: 0: 12822.0. Samples: 384660. Policy #0 lag: (min: 0.0, avg: 4.4, max: 12.0) [2024-06-12 12:59:33,134][62882] Avg episode reward: [(0, '0.000')] [2024-06-12 12:59:34,316][63125] Worker 5 awakens! [2024-06-12 12:59:34,322][62882] Heartbeat connected on RolloutWorker_w5 [2024-06-12 12:59:37,846][63120] Updated weights for policy 0, policy_version 20 (0.0015) [2024-06-12 12:59:38,134][62882] Fps is (10 sec: 13107.2, 60 sec: 9362.3, 300 sec: 9362.3). Total num frames: 327680. Throughput: 0: 13304.6. Samples: 465660. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2024-06-12 12:59:38,134][62882] Avg episode reward: [(0, '0.000')] [2024-06-12 12:59:39,094][63126] Worker 6 awakens! [2024-06-12 12:59:39,099][62882] Heartbeat connected on RolloutWorker_w6 [2024-06-12 12:59:43,134][62882] Fps is (10 sec: 16383.7, 60 sec: 10240.0, 300 sec: 10240.0). Total num frames: 409600. Throughput: 0: 14285.9. Samples: 571440. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2024-06-12 12:59:43,135][62882] Avg episode reward: [(0, '0.000')] [2024-06-12 12:59:43,670][63127] Worker 7 awakens! [2024-06-12 12:59:43,676][62882] Heartbeat connected on RolloutWorker_w7 [2024-06-12 12:59:56,583][65383] Saving configuration to /workspace/metta/train_dir/p2.death/config.json... 
[2024-06-12 12:59:56,600][65383] Rollout worker 0 uses device cpu [2024-06-12 12:59:56,600][65383] Rollout worker 1 uses device cpu [2024-06-12 12:59:56,600][65383] Rollout worker 2 uses device cpu [2024-06-12 12:59:56,601][65383] Rollout worker 3 uses device cpu [2024-06-12 12:59:56,601][65383] Rollout worker 4 uses device cpu [2024-06-12 12:59:56,601][65383] Rollout worker 5 uses device cpu [2024-06-12 12:59:56,601][65383] Rollout worker 6 uses device cpu [2024-06-12 12:59:56,602][65383] Rollout worker 7 uses device cpu [2024-06-12 12:59:56,602][65383] Rollout worker 8 uses device cpu [2024-06-12 12:59:56,602][65383] Rollout worker 9 uses device cpu [2024-06-12 12:59:56,602][65383] Rollout worker 10 uses device cpu [2024-06-12 12:59:56,603][65383] Rollout worker 11 uses device cpu [2024-06-12 12:59:56,603][65383] Rollout worker 12 uses device cpu [2024-06-12 12:59:56,603][65383] Rollout worker 13 uses device cpu [2024-06-12 12:59:56,603][65383] Rollout worker 14 uses device cpu [2024-06-12 12:59:56,604][65383] Rollout worker 15 uses device cpu [2024-06-12 12:59:56,604][65383] Rollout worker 16 uses device cpu [2024-06-12 12:59:56,604][65383] Rollout worker 17 uses device cpu [2024-06-12 12:59:56,605][65383] Rollout worker 18 uses device cpu [2024-06-12 12:59:56,605][65383] Rollout worker 19 uses device cpu [2024-06-12 12:59:56,605][65383] Rollout worker 20 uses device cpu [2024-06-12 12:59:56,605][65383] Rollout worker 21 uses device cpu [2024-06-12 12:59:56,605][65383] Rollout worker 22 uses device cpu [2024-06-12 12:59:56,605][65383] Rollout worker 23 uses device cpu [2024-06-12 12:59:56,605][65383] Rollout worker 24 uses device cpu [2024-06-12 12:59:56,605][65383] Rollout worker 25 uses device cpu [2024-06-12 12:59:56,605][65383] Rollout worker 26 uses device cpu [2024-06-12 12:59:56,606][65383] Rollout worker 27 uses device cpu [2024-06-12 12:59:56,606][65383] Rollout worker 28 uses device cpu [2024-06-12 12:59:56,606][65383] Rollout worker 29 uses device cpu 
[2024-06-12 12:59:56,606][65383] Rollout worker 30 uses device cpu [2024-06-12 12:59:56,606][65383] Rollout worker 31 uses device cpu [2024-06-12 12:59:57,172][65383] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 12:59:57,172][65383] InferenceWorker_p0-w0: min num requests: 10 [2024-06-12 12:59:57,216][65383] Starting all processes... [2024-06-12 12:59:57,216][65383] Starting process learner_proc0 [2024-06-12 12:59:57,479][65383] Starting all processes... [2024-06-12 12:59:57,482][65383] Starting process inference_proc0-0 [2024-06-12 12:59:57,482][65383] Starting process rollout_proc0 [2024-06-12 12:59:57,482][65383] Starting process rollout_proc1 [2024-06-12 12:59:57,482][65383] Starting process rollout_proc2 [2024-06-12 12:59:57,482][65383] Starting process rollout_proc3 [2024-06-12 12:59:57,483][65383] Starting process rollout_proc4 [2024-06-12 12:59:57,485][65383] Starting process rollout_proc5 [2024-06-12 12:59:57,485][65383] Starting process rollout_proc6 [2024-06-12 12:59:57,485][65383] Starting process rollout_proc7 [2024-06-12 12:59:57,485][65383] Starting process rollout_proc8 [2024-06-12 12:59:57,485][65383] Starting process rollout_proc9 [2024-06-12 12:59:57,486][65383] Starting process rollout_proc10 [2024-06-12 12:59:57,486][65383] Starting process rollout_proc11 [2024-06-12 12:59:57,486][65383] Starting process rollout_proc12 [2024-06-12 12:59:57,487][65383] Starting process rollout_proc13 [2024-06-12 12:59:57,487][65383] Starting process rollout_proc14 [2024-06-12 12:59:57,487][65383] Starting process rollout_proc15 [2024-06-12 12:59:57,488][65383] Starting process rollout_proc16 [2024-06-12 12:59:57,488][65383] Starting process rollout_proc17 [2024-06-12 12:59:57,490][65383] Starting process rollout_proc18 [2024-06-12 12:59:57,491][65383] Starting process rollout_proc19 [2024-06-12 12:59:57,494][65383] Starting process rollout_proc20 [2024-06-12 12:59:57,495][65383] Starting process rollout_proc21 [2024-06-12 
12:59:57,495][65383] Starting process rollout_proc22 [2024-06-12 12:59:57,498][65383] Starting process rollout_proc23 [2024-06-12 12:59:57,499][65383] Starting process rollout_proc24 [2024-06-12 12:59:57,499][65383] Starting process rollout_proc25 [2024-06-12 12:59:57,501][65383] Starting process rollout_proc26 [2024-06-12 12:59:57,502][65383] Starting process rollout_proc27 [2024-06-12 12:59:57,503][65383] Starting process rollout_proc28 [2024-06-12 12:59:57,505][65383] Starting process rollout_proc29 [2024-06-12 12:59:57,507][65383] Starting process rollout_proc30 [2024-06-12 12:59:57,509][65383] Starting process rollout_proc31 [2024-06-12 12:59:59,262][65624] Worker 6 uses CPU cores [6] [2024-06-12 12:59:59,300][65623] Worker 7 uses CPU cores [7] [2024-06-12 12:59:59,503][65619] Worker 3 uses CPU cores [3] [2024-06-12 12:59:59,532][65628] Worker 12 uses CPU cores [12] [2024-06-12 12:59:59,576][65626] Worker 10 uses CPU cores [10] [2024-06-12 12:59:59,588][65618] Worker 2 uses CPU cores [2] [2024-06-12 12:59:59,599][65631] Worker 15 uses CPU cores [15] [2024-06-12 12:59:59,616][65639] Worker 26 uses CPU cores [26] [2024-06-12 12:59:59,664][65641] Worker 23 uses CPU cores [23] [2024-06-12 12:59:59,676][65629] Worker 13 uses CPU cores [13] [2024-06-12 12:59:59,715][65642] Worker 28 uses CPU cores [28] [2024-06-12 12:59:59,722][65644] Worker 27 uses CPU cores [27] [2024-06-12 12:59:59,746][65630] Worker 14 uses CPU cores [14] [2024-06-12 12:59:59,756][65643] Worker 25 uses CPU cores [25] [2024-06-12 12:59:59,773][65595] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 12:59:59,773][65595] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-06-12 12:59:59,781][65595] Num visible devices: 1 [2024-06-12 12:59:59,784][65622] Worker 5 uses CPU cores [5] [2024-06-12 12:59:59,800][65595] Setting fixed seed 0 [2024-06-12 12:59:59,801][65595] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 
12:59:59,801][65595] Initializing actor-critic model on device cuda:0 [2024-06-12 12:59:59,806][65621] Worker 8 uses CPU cores [8] [2024-06-12 12:59:59,816][65633] Worker 17 uses CPU cores [17] [2024-06-12 12:59:59,864][65646] Worker 30 uses CPU cores [30] [2024-06-12 12:59:59,880][65640] Worker 24 uses CPU cores [24] [2024-06-12 12:59:59,884][65615] Worker 0 uses CPU cores [0] [2024-06-12 12:59:59,895][65627] Worker 11 uses CPU cores [11] [2024-06-12 12:59:59,900][65625] Worker 9 uses CPU cores [9] [2024-06-12 12:59:59,908][65632] Worker 16 uses CPU cores [16] [2024-06-12 12:59:59,918][65636] Worker 20 uses CPU cores [20] [2024-06-12 12:59:59,919][65635] Worker 19 uses CPU cores [19] [2024-06-12 12:59:59,936][65638] Worker 22 uses CPU cores [22] [2024-06-12 12:59:59,936][65647] Worker 31 uses CPU cores [31] [2024-06-12 12:59:59,948][65617] Worker 1 uses CPU cores [1] [2024-06-12 12:59:59,970][65620] Worker 4 uses CPU cores [4] [2024-06-12 12:59:59,978][65637] Worker 21 uses CPU cores [21] [2024-06-12 12:59:59,985][65634] Worker 18 uses CPU cores [18] [2024-06-12 13:00:00,065][65616] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 13:00:00,065][65616] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-06-12 13:00:00,072][65616] Num visible devices: 1 [2024-06-12 13:00:00,080][65645] Worker 29 uses CPU cores [29] [2024-06-12 13:00:00,522][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,522][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,522][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] 
RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,523][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,526][65595] RunningMeanStd input shape: (1,) [2024-06-12 13:00:00,527][65595] RunningMeanStd input shape: (1,) [2024-06-12 13:00:00,527][65595] RunningMeanStd input shape: (1,) [2024-06-12 13:00:00,527][65595] RunningMeanStd input shape: (1,) [2024-06-12 13:00:00,527][65595] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:00,567][65595] RunningMeanStd input shape: (1,) [2024-06-12 13:00:00,571][65595] Created Actor Critic model with architecture: [2024-06-12 13:00:00,571][65595] SampleFactoryAgentWrapper( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (agent): MettaAgent( (_encoder): MultiFeatureSetEncoder( (feature_set_encoders): ModuleDict( (grid_obs): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (converter): RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): 
RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:hp): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv_r1): RunningMeanStdInPlace() (agent:inv_r2): RunningMeanStdInPlace() (agent:inv_r3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (altar:hp): RunningMeanStdInPlace() (altar:state): RunningMeanStdInPlace() (converter:hp): RunningMeanStdInPlace() (converter:state): RunningMeanStdInPlace() (generator:amount): RunningMeanStdInPlace() (generator:hp): RunningMeanStdInPlace() (generator:state): RunningMeanStdInPlace() (wall:hp): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (global_vars): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (_steps): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_action): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_reward): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_reward): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): 
ELU(alpha=1.0) ) ) (kinship): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (kinship): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) ) (merged_encoder): Sequential( (0): Linear(in_features=544, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (_core): ModelCoreRNN( (core): GRU(512, 512) ) (_decoder): Decoder( (mlp): Identity() ) (_critic_linear): Linear(in_features=512, out_features=1, bias=True) (_action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) ) [2024-06-12 13:00:00,637][65595] Using optimizer [2024-06-12 13:00:00,822][65595] No checkpoints found [2024-06-12 13:00:00,822][65595] Did not load from checkpoint, starting from scratch! [2024-06-12 13:00:00,822][65595] Initialized policy 0 weights for model version 0 [2024-06-12 13:00:00,824][65595] LearnerWorker_p0 finished initialization! 
[2024-06-12 13:00:00,824][65595] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 13:00:01,523][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,523][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,523][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,523][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,523][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,524][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,527][65616] RunningMeanStd input shape: (1,) [2024-06-12 13:00:01,528][65616] RunningMeanStd input shape: (1,) [2024-06-12 13:00:01,528][65616] RunningMeanStd input shape: (1,) [2024-06-12 13:00:01,528][65616] RunningMeanStd input shape: (1,) [2024-06-12 13:00:01,528][65616] RunningMeanStd input shape: (11, 11) [2024-06-12 13:00:01,568][65616] 
RunningMeanStd input shape: (1,) [2024-06-12 13:00:01,589][65383] Inference worker 0-0 is ready! [2024-06-12 13:00:01,590][65383] All inference workers are ready! Signal rollout workers to start! [2024-06-12 13:00:03,918][65632] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,924][65638] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,924][65641] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,931][65637] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,933][65639] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,933][65636] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,934][65643] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,934][65633] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,934][65635] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,941][65640] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,943][65647] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,947][65646] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,953][65645] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,957][65642] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,975][65634] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,976][65644] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,989][65625] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,989][65623] Decorrelating experience for 0 frames... [2024-06-12 13:00:03,994][65622] Decorrelating experience for 0 frames... [2024-06-12 13:00:04,009][65627] Decorrelating experience for 0 frames... [2024-06-12 13:00:04,011][65617] Decorrelating experience for 0 frames... [2024-06-12 13:00:04,013][65615] Decorrelating experience for 0 frames... [2024-06-12 13:00:04,013][65619] Decorrelating experience for 0 frames... [2024-06-12 13:00:04,016][65631] Decorrelating experience for 0 frames... 
[2024-06-12 13:00:04,018][65629] Decorrelating experience for 0 frames... [2024-06-12 13:00:04,019][65626] Decorrelating experience for 0 frames... [2024-06-12 13:00:04,025][65618] Decorrelating experience for 0 frames... [2024-06-12 13:00:04,027][65621] Decorrelating experience for 0 frames... [2024-06-12 13:00:04,028][65628] Decorrelating experience for 0 frames... [2024-06-12 13:00:04,031][65620] Decorrelating experience for 0 frames... [2024-06-12 13:00:04,036][65630] Decorrelating experience for 0 frames... [2024-06-12 13:00:04,036][65624] Decorrelating experience for 0 frames... [2024-06-12 13:00:04,332][65383] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-12 13:00:05,204][65632] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,242][65638] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,248][65633] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,248][65641] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,253][65635] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,254][65639] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,259][65637] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,260][65643] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,261][65636] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,273][65640] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,279][65647] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,283][65646] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,294][65645] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,304][65642] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,305][65625] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,305][65623] Decorrelating experience for 256 frames... 
[2024-06-12 13:00:05,306][65622] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,342][65631] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,343][65627] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,347][65621] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,351][65619] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,351][65626] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,355][65620] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,356][65644] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,359][65624] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,359][65617] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,360][65629] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,361][65615] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,363][65618] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,368][65628] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,372][65630] Decorrelating experience for 256 frames... [2024-06-12 13:00:05,383][65634] Decorrelating experience for 256 frames... [2024-06-12 13:00:09,332][65383] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 13220.1. Samples: 66100. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-12 13:00:09,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:00:12,072][65631] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-06-12 13:00:12,094][65623] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-06-12 13:00:12,095][65620] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-06-12 13:00:12,095][65618] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-06-12 13:00:12,104][65626] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-06-12 13:00:12,106][65630] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-06-12 13:00:12,138][65619] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-06-12 13:00:12,138][65627] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-06-12 13:00:12,141][65624] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-06-12 13:00:12,152][65622] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-06-12 13:00:12,166][65639] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-06-12 13:00:12,181][65638] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-06-12 13:00:12,184][65621] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-06-12 13:00:12,185][65632] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-06-12 13:00:12,189][65625] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-06-12 13:00:12,214][65617] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-06-12 13:00:12,222][65595] Signal inference workers to stop experience collection... 
[2024-06-12 13:00:12,224][65636] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-06-12 13:00:12,224][65642] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-06-12 13:00:12,227][65616] InferenceWorker_p0-w0: stopping experience collection [2024-06-12 13:00:12,233][65647] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-06-12 13:00:12,234][65628] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-06-12 13:00:12,737][65595] Signal inference workers to resume experience collection... [2024-06-12 13:00:12,737][65616] InferenceWorker_p0-w0: resuming experience collection [2024-06-12 13:00:12,751][65629] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-06-12 13:00:12,752][65637] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-06-12 13:00:12,762][65646] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-06-12 13:00:12,763][65640] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-06-12 13:00:12,763][65635] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-06-12 13:00:12,767][65643] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-06-12 13:00:12,775][65645] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-06-12 13:00:12,775][65644] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-06-12 13:00:13,006][65641] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-06-12 13:00:13,006][65633] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-06-12 13:00:13,063][65634] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-06-12 13:00:13,837][65616] Updated weights for policy 0, policy_version 10 (0.0011) [2024-06-12 13:00:14,332][65383] Fps is (10 sec: 16383.8, 60 sec: 16383.8, 300 sec: 16383.8). 
Total num frames: 163840. Throughput: 0: 32823.6. Samples: 328240. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-12 13:00:14,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:00:14,358][65595] Saving new best policy, reward=0.000! [2024-06-12 13:00:16,925][65617] Worker 1 awakens! [2024-06-12 13:00:17,168][65383] Heartbeat connected on Batcher_0 [2024-06-12 13:00:17,170][65383] Heartbeat connected on LearnerWorker_p0 [2024-06-12 13:00:17,175][65383] Heartbeat connected on RolloutWorker_w0 [2024-06-12 13:00:17,176][65383] Heartbeat connected on RolloutWorker_w1 [2024-06-12 13:00:17,222][65383] Heartbeat connected on InferenceWorker_p0-w0 [2024-06-12 13:00:19,332][65383] Fps is (10 sec: 16384.0, 60 sec: 10922.7, 300 sec: 10922.7). Total num frames: 163840. Throughput: 0: 22104.1. Samples: 331560. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-12 13:00:19,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:00:21,479][65618] Worker 2 awakens! [2024-06-12 13:00:21,488][65383] Heartbeat connected on RolloutWorker_w2 [2024-06-12 13:00:24,333][65383] Fps is (10 sec: 1638.4, 60 sec: 9011.1, 300 sec: 9011.1). Total num frames: 180224. Throughput: 0: 17265.8. Samples: 345320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 10.0) [2024-06-12 13:00:24,340][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:00:26,271][65619] Worker 3 awakens! [2024-06-12 13:00:26,284][65383] Heartbeat connected on RolloutWorker_w3 [2024-06-12 13:00:29,332][65383] Fps is (10 sec: 3276.8, 60 sec: 7864.3, 300 sec: 7864.3). Total num frames: 196608. Throughput: 0: 14699.2. Samples: 367480. Policy #0 lag: (min: 0.0, avg: 4.3, max: 11.0) [2024-06-12 13:00:29,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:00:30,938][65620] Worker 4 awakens! [2024-06-12 13:00:30,943][65383] Heartbeat connected on RolloutWorker_w4 [2024-06-12 13:00:34,332][65383] Fps is (10 sec: 6553.8, 60 sec: 8192.0, 300 sec: 8192.0). Total num frames: 245760. Throughput: 0: 12915.3. 
Samples: 387460. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0) [2024-06-12 13:00:34,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:00:35,688][65622] Worker 5 awakens! [2024-06-12 13:00:35,693][65383] Heartbeat connected on RolloutWorker_w5 [2024-06-12 13:00:39,332][65383] Fps is (10 sec: 9830.5, 60 sec: 8426.1, 300 sec: 8426.1). Total num frames: 294912. Throughput: 0: 13123.4. Samples: 459320. Policy #0 lag: (min: 0.0, avg: 2.1, max: 15.0) [2024-06-12 13:00:39,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:00:39,587][65616] Updated weights for policy 0, policy_version 20 (0.0011) [2024-06-12 13:00:40,364][65624] Worker 6 awakens! [2024-06-12 13:00:40,368][65383] Heartbeat connected on RolloutWorker_w6 [2024-06-12 13:00:44,332][65383] Fps is (10 sec: 16384.0, 60 sec: 10240.0, 300 sec: 10240.0). Total num frames: 409600. Throughput: 0: 14183.5. Samples: 567340. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2024-06-12 13:00:44,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:00:44,916][65623] Worker 7 awakens! [2024-06-12 13:00:44,920][65383] Heartbeat connected on RolloutWorker_w7 [2024-06-12 13:00:47,453][65616] Updated weights for policy 0, policy_version 30 (0.0012) [2024-06-12 13:00:49,332][65383] Fps is (10 sec: 21299.3, 60 sec: 11286.8, 300 sec: 11286.8). Total num frames: 507904. Throughput: 0: 14092.9. Samples: 634180. Policy #0 lag: (min: 0.0, avg: 2.9, max: 5.0) [2024-06-12 13:00:49,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:00:49,716][65621] Worker 8 awakens! [2024-06-12 13:00:49,720][65383] Heartbeat connected on RolloutWorker_w8 [2024-06-12 13:00:54,332][65383] Fps is (10 sec: 22937.6, 60 sec: 12779.5, 300 sec: 12779.5). Total num frames: 638976. Throughput: 0: 15685.3. Samples: 771940. Policy #0 lag: (min: 0.0, avg: 3.2, max: 6.0) [2024-06-12 13:00:54,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:00:54,476][65625] Worker 9 awakens! 
[2024-06-12 13:00:54,483][65383] Heartbeat connected on RolloutWorker_w9 [2024-06-12 13:00:54,793][65616] Updated weights for policy 0, policy_version 40 (0.0012) [2024-06-12 13:00:59,078][65626] Worker 10 awakens! [2024-06-12 13:00:59,083][65383] Heartbeat connected on RolloutWorker_w10 [2024-06-12 13:00:59,332][65383] Fps is (10 sec: 26214.2, 60 sec: 14000.9, 300 sec: 14000.9). Total num frames: 770048. Throughput: 0: 13234.7. Samples: 923800. Policy #0 lag: (min: 0.0, avg: 2.5, max: 7.0) [2024-06-12 13:00:59,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:01:00,332][65616] Updated weights for policy 0, policy_version 50 (0.0013) [2024-06-12 13:01:03,800][65627] Worker 11 awakens! [2024-06-12 13:01:03,805][65383] Heartbeat connected on RolloutWorker_w11 [2024-06-12 13:01:04,332][65383] Fps is (10 sec: 29490.8, 60 sec: 15564.8, 300 sec: 15564.8). Total num frames: 933888. Throughput: 0: 15163.5. Samples: 1013920. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-06-12 13:01:04,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:01:06,096][65616] Updated weights for policy 0, policy_version 60 (0.0017) [2024-06-12 13:01:08,584][65628] Worker 12 awakens! [2024-06-12 13:01:08,590][65383] Heartbeat connected on RolloutWorker_w12 [2024-06-12 13:01:09,332][65383] Fps is (10 sec: 29491.3, 60 sec: 17749.3, 300 sec: 16384.0). Total num frames: 1064960. Throughput: 0: 18936.1. Samples: 1197440. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-06-12 13:01:09,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:01:10,749][65616] Updated weights for policy 0, policy_version 70 (0.0013) [2024-06-12 13:01:13,788][65629] Worker 13 awakens! [2024-06-12 13:01:13,796][65383] Heartbeat connected on RolloutWorker_w13 [2024-06-12 13:01:14,332][65383] Fps is (10 sec: 31129.8, 60 sec: 18022.4, 300 sec: 17788.3). Total num frames: 1245184. Throughput: 0: 23232.4. Samples: 1412940. 
Policy #0 lag: (min: 0.0, avg: 4.8, max: 10.0) [2024-06-12 13:01:14,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:01:14,467][65595] Saving new best policy, reward=0.001! [2024-06-12 13:01:15,188][65616] Updated weights for policy 0, policy_version 80 (0.0017) [2024-06-12 13:01:17,800][65630] Worker 14 awakens! [2024-06-12 13:01:17,807][65383] Heartbeat connected on RolloutWorker_w14 [2024-06-12 13:01:19,332][65383] Fps is (10 sec: 39321.0, 60 sec: 21572.2, 300 sec: 19442.3). Total num frames: 1458176. Throughput: 0: 25142.6. Samples: 1518880. Policy #0 lag: (min: 0.0, avg: 3.5, max: 9.0) [2024-06-12 13:01:19,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:01:19,716][65616] Updated weights for policy 0, policy_version 90 (0.0021) [2024-06-12 13:01:22,485][65631] Worker 15 awakens! [2024-06-12 13:01:22,494][65383] Heartbeat connected on RolloutWorker_w15 [2024-06-12 13:01:24,131][65616] Updated weights for policy 0, policy_version 100 (0.0021) [2024-06-12 13:01:24,332][65383] Fps is (10 sec: 39321.7, 60 sec: 24303.0, 300 sec: 20480.0). Total num frames: 1638400. Throughput: 0: 28172.4. Samples: 1727080. Policy #0 lag: (min: 0.0, avg: 3.2, max: 11.0) [2024-06-12 13:01:24,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:01:27,286][65632] Worker 16 awakens! [2024-06-12 13:01:27,295][65383] Heartbeat connected on RolloutWorker_w16 [2024-06-12 13:01:29,004][65616] Updated weights for policy 0, policy_version 110 (0.0027) [2024-06-12 13:01:29,332][65383] Fps is (10 sec: 34406.5, 60 sec: 26760.5, 300 sec: 21202.8). Total num frames: 1802240. Throughput: 0: 30280.8. Samples: 1929980. Policy #0 lag: (min: 0.0, avg: 6.4, max: 12.0) [2024-06-12 13:01:29,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:01:32,792][65633] Worker 17 awakens! 
[2024-06-12 13:01:32,801][65383] Heartbeat connected on RolloutWorker_w17 [2024-06-12 13:01:33,691][65616] Updated weights for policy 0, policy_version 120 (0.0023) [2024-06-12 13:01:34,332][65383] Fps is (10 sec: 34406.3, 60 sec: 28945.0, 300 sec: 22027.4). Total num frames: 1982464. Throughput: 0: 31387.5. Samples: 2046620. Policy #0 lag: (min: 1.0, avg: 5.8, max: 11.0) [2024-06-12 13:01:34,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:01:37,539][65634] Worker 18 awakens! [2024-06-12 13:01:37,549][65383] Heartbeat connected on RolloutWorker_w18 [2024-06-12 13:01:38,391][65616] Updated weights for policy 0, policy_version 130 (0.0024) [2024-06-12 13:01:39,332][65383] Fps is (10 sec: 36045.0, 60 sec: 31129.6, 300 sec: 22765.1). Total num frames: 2162688. Throughput: 0: 32867.1. Samples: 2250960. Policy #0 lag: (min: 0.0, avg: 43.6, max: 127.0) [2024-06-12 13:01:39,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:01:41,924][65635] Worker 19 awakens! [2024-06-12 13:01:41,934][65383] Heartbeat connected on RolloutWorker_w19 [2024-06-12 13:01:43,185][65616] Updated weights for policy 0, policy_version 140 (0.0023) [2024-06-12 13:01:44,332][65383] Fps is (10 sec: 39321.7, 60 sec: 32768.0, 300 sec: 23756.8). Total num frames: 2375680. Throughput: 0: 34459.1. Samples: 2474460. Policy #0 lag: (min: 0.0, avg: 6.3, max: 13.0) [2024-06-12 13:01:44,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:01:46,072][65636] Worker 20 awakens! [2024-06-12 13:01:46,083][65383] Heartbeat connected on RolloutWorker_w20 [2024-06-12 13:01:47,595][65616] Updated weights for policy 0, policy_version 150 (0.0033) [2024-06-12 13:01:49,332][65383] Fps is (10 sec: 37682.9, 60 sec: 33860.2, 300 sec: 24185.9). Total num frames: 2539520. Throughput: 0: 35130.7. Samples: 2594800. Policy #0 lag: (min: 0.0, avg: 6.6, max: 14.0) [2024-06-12 13:01:49,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:01:51,288][65637] Worker 21 awakens! 
[2024-06-12 13:01:51,299][65383] Heartbeat connected on RolloutWorker_w21 [2024-06-12 13:01:51,819][65616] Updated weights for policy 0, policy_version 160 (0.0030) [2024-06-12 13:01:54,332][65383] Fps is (10 sec: 34406.1, 60 sec: 34679.4, 300 sec: 24724.9). Total num frames: 2719744. Throughput: 0: 36308.8. Samples: 2831340. Policy #0 lag: (min: 0.0, avg: 7.3, max: 15.0) [2024-06-12 13:01:54,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:01:54,362][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000000167_2736128.pth... [2024-06-12 13:01:55,215][65616] Updated weights for policy 0, policy_version 170 (0.0024) [2024-06-12 13:01:55,350][65638] Worker 22 awakens! [2024-06-12 13:01:55,361][65383] Heartbeat connected on RolloutWorker_w22 [2024-06-12 13:01:59,332][65383] Fps is (10 sec: 39321.8, 60 sec: 36044.8, 300 sec: 25502.0). Total num frames: 2932736. Throughput: 0: 36907.1. Samples: 3073760. Policy #0 lag: (min: 0.0, avg: 6.7, max: 18.0) [2024-06-12 13:01:59,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:01:59,638][65616] Updated weights for policy 0, policy_version 180 (0.0025) [2024-06-12 13:02:00,916][65641] Worker 23 awakens! [2024-06-12 13:02:00,927][65383] Heartbeat connected on RolloutWorker_w23 [2024-06-12 13:02:04,194][65616] Updated weights for policy 0, policy_version 190 (0.0026) [2024-06-12 13:02:04,332][65383] Fps is (10 sec: 39321.8, 60 sec: 36317.9, 300 sec: 25941.3). Total num frames: 3112960. Throughput: 0: 37032.5. Samples: 3185340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 16.0) [2024-06-12 13:02:04,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:02:05,363][65640] Worker 24 awakens! [2024-06-12 13:02:05,376][65383] Heartbeat connected on RolloutWorker_w24 [2024-06-12 13:02:07,377][65616] Updated weights for policy 0, policy_version 200 (0.0020) [2024-06-12 13:02:09,332][65383] Fps is (10 sec: 39321.2, 60 sec: 37683.1, 300 sec: 26607.6). Total num frames: 3325952. 
Throughput: 0: 38116.4. Samples: 3442320. Policy #0 lag: (min: 0.0, avg: 8.0, max: 17.0) [2024-06-12 13:02:09,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:02:10,016][65643] Worker 25 awakens! [2024-06-12 13:02:10,028][65383] Heartbeat connected on RolloutWorker_w25 [2024-06-12 13:02:11,267][65616] Updated weights for policy 0, policy_version 210 (0.0028) [2024-06-12 13:02:14,140][65639] Worker 26 awakens! [2024-06-12 13:02:14,153][65383] Heartbeat connected on RolloutWorker_w26 [2024-06-12 13:02:14,332][65383] Fps is (10 sec: 45874.9, 60 sec: 38775.4, 300 sec: 27474.7). Total num frames: 3571712. Throughput: 0: 39300.4. Samples: 3698500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 17.0) [2024-06-12 13:02:14,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:02:15,527][65616] Updated weights for policy 0, policy_version 220 (0.0024) [2024-06-12 13:02:19,204][65616] Updated weights for policy 0, policy_version 230 (0.0026) [2024-06-12 13:02:19,332][65383] Fps is (10 sec: 44236.6, 60 sec: 38502.4, 300 sec: 27913.5). Total num frames: 3768320. Throughput: 0: 39583.0. Samples: 3827860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 16.0) [2024-06-12 13:02:19,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:02:19,350][65644] Worker 27 awakens! [2024-06-12 13:02:19,362][65383] Heartbeat connected on RolloutWorker_w27 [2024-06-12 13:02:22,931][65616] Updated weights for policy 0, policy_version 240 (0.0030) [2024-06-12 13:02:23,486][65642] Worker 28 awakens! [2024-06-12 13:02:23,499][65383] Heartbeat connected on RolloutWorker_w28 [2024-06-12 13:02:24,332][65383] Fps is (10 sec: 44236.8, 60 sec: 39594.6, 300 sec: 28672.0). Total num frames: 4014080. Throughput: 0: 40985.7. Samples: 4095320. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-12 13:02:24,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:02:26,854][65616] Updated weights for policy 0, policy_version 250 (0.0026) [2024-06-12 13:02:28,819][65645] Worker 29 awakens! [2024-06-12 13:02:28,832][65383] Heartbeat connected on RolloutWorker_w29 [2024-06-12 13:02:29,333][65383] Fps is (10 sec: 42598.4, 60 sec: 39867.7, 300 sec: 28926.2). Total num frames: 4194304. Throughput: 0: 41883.4. Samples: 4359220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 13:02:29,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:02:29,892][65616] Updated weights for policy 0, policy_version 260 (0.0034) [2024-06-12 13:02:33,486][65646] Worker 30 awakens! [2024-06-12 13:02:33,501][65383] Heartbeat connected on RolloutWorker_w30 [2024-06-12 13:02:34,080][65616] Updated weights for policy 0, policy_version 270 (0.0036) [2024-06-12 13:02:34,332][65383] Fps is (10 sec: 40960.0, 60 sec: 40686.9, 300 sec: 29491.2). Total num frames: 4423680. Throughput: 0: 42076.0. Samples: 4488220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 13:02:34,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:02:37,406][65616] Updated weights for policy 0, policy_version 280 (0.0032) [2024-06-12 13:02:37,644][65647] Worker 31 awakens! [2024-06-12 13:02:37,659][65383] Heartbeat connected on RolloutWorker_w31 [2024-06-12 13:02:39,332][65383] Fps is (10 sec: 47514.5, 60 sec: 41779.2, 300 sec: 30125.4). Total num frames: 4669440. Throughput: 0: 42739.7. Samples: 4754620. Policy #0 lag: (min: 1.0, avg: 96.3, max: 281.0) [2024-06-12 13:02:39,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:02:41,330][65616] Updated weights for policy 0, policy_version 290 (0.0037) [2024-06-12 13:02:44,332][65383] Fps is (10 sec: 47513.7, 60 sec: 42052.2, 300 sec: 30617.6). Total num frames: 4898816. Throughput: 0: 43595.1. Samples: 5035540. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-12 13:02:44,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:02:44,491][65616] Updated weights for policy 0, policy_version 300 (0.0028) [2024-06-12 13:02:48,635][65616] Updated weights for policy 0, policy_version 310 (0.0034) [2024-06-12 13:02:49,332][65383] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 31079.9). Total num frames: 5128192. Throughput: 0: 44121.7. Samples: 5170820. Policy #0 lag: (min: 0.0, avg: 7.9, max: 18.0) [2024-06-12 13:02:49,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:02:51,358][65595] Signal inference workers to stop experience collection... (50 times) [2024-06-12 13:02:51,389][65616] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-06-12 13:02:51,412][65595] Signal inference workers to resume experience collection... (50 times) [2024-06-12 13:02:51,413][65616] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-06-12 13:02:51,546][65616] Updated weights for policy 0, policy_version 320 (0.0037) [2024-06-12 13:02:54,332][65383] Fps is (10 sec: 44236.6, 60 sec: 43690.6, 300 sec: 31418.7). Total num frames: 5341184. Throughput: 0: 44415.5. Samples: 5441020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-12 13:02:54,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:02:55,863][65616] Updated weights for policy 0, policy_version 330 (0.0029) [2024-06-12 13:02:58,896][65616] Updated weights for policy 0, policy_version 340 (0.0037) [2024-06-12 13:02:59,332][65383] Fps is (10 sec: 45875.3, 60 sec: 44236.7, 300 sec: 31925.4). Total num frames: 5586944. Throughput: 0: 44715.1. Samples: 5710680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 13:02:59,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:02:59,333][65595] Saving new best policy, reward=0.002! 
[2024-06-12 13:03:03,353][65616] Updated weights for policy 0, policy_version 350 (0.0039) [2024-06-12 13:03:04,333][65383] Fps is (10 sec: 45874.9, 60 sec: 44782.8, 300 sec: 32221.8). Total num frames: 5799936. Throughput: 0: 44851.1. Samples: 5846160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 13:03:04,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:03:05,898][65616] Updated weights for policy 0, policy_version 360 (0.0030) [2024-06-12 13:03:09,332][65383] Fps is (10 sec: 42599.0, 60 sec: 44783.0, 300 sec: 32502.3). Total num frames: 6012928. Throughput: 0: 45036.6. Samples: 6121960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 13:03:09,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:03:10,607][65616] Updated weights for policy 0, policy_version 370 (0.0026) [2024-06-12 13:03:13,393][65616] Updated weights for policy 0, policy_version 380 (0.0030) [2024-06-12 13:03:14,332][65383] Fps is (10 sec: 45875.6, 60 sec: 44782.9, 300 sec: 32940.4). Total num frames: 6258688. Throughput: 0: 45118.7. Samples: 6389560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 13:03:14,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:03:17,872][65616] Updated weights for policy 0, policy_version 390 (0.0025) [2024-06-12 13:03:19,332][65383] Fps is (10 sec: 47513.4, 60 sec: 45329.2, 300 sec: 33272.1). Total num frames: 6488064. Throughput: 0: 45474.8. Samples: 6534580. Policy #0 lag: (min: 2.0, avg: 9.8, max: 20.0) [2024-06-12 13:03:19,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:03:20,554][65616] Updated weights for policy 0, policy_version 400 (0.0026) [2024-06-12 13:03:24,332][65383] Fps is (10 sec: 42598.8, 60 sec: 44509.9, 300 sec: 33423.4). Total num frames: 6684672. Throughput: 0: 45440.4. Samples: 6799440. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 13:03:24,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:03:24,874][65616] Updated weights for policy 0, policy_version 410 (0.0033) [2024-06-12 13:03:27,833][65616] Updated weights for policy 0, policy_version 420 (0.0031) [2024-06-12 13:03:29,332][65383] Fps is (10 sec: 45874.6, 60 sec: 45875.2, 300 sec: 33886.9). Total num frames: 6946816. Throughput: 0: 45438.6. Samples: 7080280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 13:03:29,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:03:31,845][65616] Updated weights for policy 0, policy_version 430 (0.0039) [2024-06-12 13:03:34,332][65383] Fps is (10 sec: 49152.0, 60 sec: 45875.3, 300 sec: 34172.3). Total num frames: 7176192. Throughput: 0: 45523.3. Samples: 7219360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 13:03:34,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:03:34,875][65616] Updated weights for policy 0, policy_version 440 (0.0025) [2024-06-12 13:03:39,262][65616] Updated weights for policy 0, policy_version 450 (0.0035) [2024-06-12 13:03:39,332][65383] Fps is (10 sec: 42598.8, 60 sec: 45055.9, 300 sec: 34292.1). Total num frames: 7372800. Throughput: 0: 45422.8. Samples: 7485040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 13:03:39,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:03:41,945][65616] Updated weights for policy 0, policy_version 460 (0.0024) [2024-06-12 13:03:44,336][65383] Fps is (10 sec: 44220.9, 60 sec: 45326.4, 300 sec: 34629.3). Total num frames: 7618560. Throughput: 0: 45545.4. Samples: 7760380. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 13:03:44,337][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:03:46,418][65616] Updated weights for policy 0, policy_version 470 (0.0037) [2024-06-12 13:03:49,238][65616] Updated weights for policy 0, policy_version 480 (0.0030) [2024-06-12 13:03:49,333][65383] Fps is (10 sec: 49151.5, 60 sec: 45602.1, 300 sec: 34952.5). Total num frames: 7864320. Throughput: 0: 45752.5. Samples: 7905020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-12 13:03:49,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:03:53,450][65616] Updated weights for policy 0, policy_version 490 (0.0030) [2024-06-12 13:03:54,333][65383] Fps is (10 sec: 45890.0, 60 sec: 45601.9, 300 sec: 35118.7). Total num frames: 8077312. Throughput: 0: 45645.4. Samples: 8176020. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-12 13:03:54,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:03:54,340][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000000493_8077312.pth... [2024-06-12 13:03:56,363][65616] Updated weights for policy 0, policy_version 500 (0.0032) [2024-06-12 13:03:59,332][65383] Fps is (10 sec: 42598.5, 60 sec: 45056.0, 300 sec: 35277.9). Total num frames: 8290304. Throughput: 0: 45728.4. Samples: 8447340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 13:03:59,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:04:00,659][65616] Updated weights for policy 0, policy_version 510 (0.0029) [2024-06-12 13:04:04,024][65616] Updated weights for policy 0, policy_version 520 (0.0034) [2024-06-12 13:04:04,332][65383] Fps is (10 sec: 45876.3, 60 sec: 45602.2, 300 sec: 35566.9). Total num frames: 8536064. Throughput: 0: 45540.8. Samples: 8583920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 13:04:04,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:04:06,872][65595] Signal inference workers to stop experience collection... 
(100 times) [2024-06-12 13:04:06,907][65616] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-06-12 13:04:06,929][65595] Signal inference workers to resume experience collection... (100 times) [2024-06-12 13:04:06,930][65616] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-06-12 13:04:07,984][65616] Updated weights for policy 0, policy_version 530 (0.0025) [2024-06-12 13:04:09,332][65383] Fps is (10 sec: 45876.0, 60 sec: 45602.1, 300 sec: 35710.4). Total num frames: 8749056. Throughput: 0: 45767.1. Samples: 8858960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 13:04:09,341][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:04:10,761][65616] Updated weights for policy 0, policy_version 540 (0.0036) [2024-06-12 13:04:14,337][65383] Fps is (10 sec: 44217.0, 60 sec: 45325.6, 300 sec: 35913.1). Total num frames: 8978432. Throughput: 0: 45571.5. Samples: 9131200. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-12 13:04:14,338][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:04:14,768][65616] Updated weights for policy 0, policy_version 550 (0.0038) [2024-06-12 13:04:18,338][65616] Updated weights for policy 0, policy_version 560 (0.0039) [2024-06-12 13:04:19,333][65383] Fps is (10 sec: 47511.9, 60 sec: 45601.9, 300 sec: 36173.3). Total num frames: 9224192. Throughput: 0: 45528.1. Samples: 9268140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 13:04:19,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:04:22,019][65616] Updated weights for policy 0, policy_version 570 (0.0028) [2024-06-12 13:04:24,332][65383] Fps is (10 sec: 45896.1, 60 sec: 45875.1, 300 sec: 36296.9). Total num frames: 9437184. Throughput: 0: 45718.2. Samples: 9542360. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-12 13:04:24,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:04:25,233][65616] Updated weights for policy 0, policy_version 580 (0.0038) [2024-06-12 13:04:29,094][65616] Updated weights for policy 0, policy_version 590 (0.0038) [2024-06-12 13:04:29,332][65383] Fps is (10 sec: 44238.1, 60 sec: 45329.1, 300 sec: 36477.6). Total num frames: 9666560. Throughput: 0: 45737.4. Samples: 9818400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-12 13:04:29,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:04:32,580][65616] Updated weights for policy 0, policy_version 600 (0.0039) [2024-06-12 13:04:34,332][65383] Fps is (10 sec: 47513.9, 60 sec: 45602.1, 300 sec: 36712.3). Total num frames: 9912320. Throughput: 0: 45465.0. Samples: 9950940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 13:04:34,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:04:36,397][65616] Updated weights for policy 0, policy_version 610 (0.0042) [2024-06-12 13:04:39,333][65383] Fps is (10 sec: 45874.1, 60 sec: 45875.0, 300 sec: 36819.3). Total num frames: 10125312. Throughput: 0: 45583.2. Samples: 10227260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 13:04:39,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:04:40,027][65616] Updated weights for policy 0, policy_version 620 (0.0031) [2024-06-12 13:04:43,434][65616] Updated weights for policy 0, policy_version 630 (0.0033) [2024-06-12 13:04:44,333][65383] Fps is (10 sec: 44236.0, 60 sec: 45604.7, 300 sec: 36981.0). Total num frames: 10354688. Throughput: 0: 45599.0. Samples: 10499300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 13:04:44,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:04:47,143][65616] Updated weights for policy 0, policy_version 640 (0.0029) [2024-06-12 13:04:49,332][65383] Fps is (10 sec: 45876.2, 60 sec: 45329.1, 300 sec: 37137.1). Total num frames: 10584064. 
Throughput: 0: 45645.4. Samples: 10637960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 13:04:49,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:04:50,552][65616] Updated weights for policy 0, policy_version 650 (0.0031) [2024-06-12 13:04:54,332][65383] Fps is (10 sec: 44237.3, 60 sec: 45329.3, 300 sec: 37231.2). Total num frames: 10797056. Throughput: 0: 45588.8. Samples: 10910460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 13:04:54,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:04:54,422][65616] Updated weights for policy 0, policy_version 660 (0.0035) [2024-06-12 13:04:57,875][65616] Updated weights for policy 0, policy_version 670 (0.0033) [2024-06-12 13:04:59,332][65383] Fps is (10 sec: 45875.7, 60 sec: 45875.3, 300 sec: 37433.3). Total num frames: 11042816. Throughput: 0: 45544.7. Samples: 11180500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 13:04:59,340][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:05:01,313][65616] Updated weights for policy 0, policy_version 680 (0.0037) [2024-06-12 13:05:04,332][65383] Fps is (10 sec: 47513.3, 60 sec: 45602.1, 300 sec: 38210.8). Total num frames: 11272192. Throughput: 0: 45575.7. Samples: 11319040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 13:05:04,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:05:04,947][65616] Updated weights for policy 0, policy_version 690 (0.0032) [2024-06-12 13:05:08,610][65616] Updated weights for policy 0, policy_version 700 (0.0031) [2024-06-12 13:05:09,332][65383] Fps is (10 sec: 44236.1, 60 sec: 45602.0, 300 sec: 38377.4). Total num frames: 11485184. Throughput: 0: 45528.0. Samples: 11591120. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 13:05:09,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:05:12,118][65616] Updated weights for policy 0, policy_version 710 (0.0027) [2024-06-12 13:05:14,332][65383] Fps is (10 sec: 44236.9, 60 sec: 45605.6, 300 sec: 39155.0). Total num frames: 11714560. Throughput: 0: 45452.8. Samples: 11863780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 13:05:14,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:05:15,948][65616] Updated weights for policy 0, policy_version 720 (0.0038) [2024-06-12 13:05:19,311][65616] Updated weights for policy 0, policy_version 730 (0.0031) [2024-06-12 13:05:19,332][65383] Fps is (10 sec: 47513.5, 60 sec: 45602.3, 300 sec: 39932.6). Total num frames: 11960320. Throughput: 0: 45669.7. Samples: 12006080. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-12 13:05:19,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:05:19,333][65595] Saving new best policy, reward=0.004! [2024-06-12 13:05:23,171][65616] Updated weights for policy 0, policy_version 740 (0.0037) [2024-06-12 13:05:24,332][65383] Fps is (10 sec: 45875.6, 60 sec: 45602.2, 300 sec: 40599.0). Total num frames: 12173312. Throughput: 0: 45414.5. Samples: 12270900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 13:05:24,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:05:26,348][65616] Updated weights for policy 0, policy_version 750 (0.0033) [2024-06-12 13:05:29,332][65383] Fps is (10 sec: 44237.5, 60 sec: 45602.2, 300 sec: 41209.9). Total num frames: 12402688. Throughput: 0: 45515.3. Samples: 12547480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 13:05:29,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:05:30,053][65616] Updated weights for policy 0, policy_version 760 (0.0033) [2024-06-12 13:05:33,303][65616] Updated weights for policy 0, policy_version 770 (0.0043) [2024-06-12 13:05:34,332][65383] Fps is (10 sec: 45875.3, 60 sec: 45329.1, 300 sec: 41820.9). Total num frames: 12632064. Throughput: 0: 45524.9. Samples: 12686580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 13:05:34,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:05:37,491][65616] Updated weights for policy 0, policy_version 780 (0.0031) [2024-06-12 13:05:39,332][65383] Fps is (10 sec: 47513.5, 60 sec: 45875.4, 300 sec: 42265.2). Total num frames: 12877824. Throughput: 0: 45589.4. Samples: 12961980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 13:05:39,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:05:40,691][65616] Updated weights for policy 0, policy_version 790 (0.0027) [2024-06-12 13:05:43,559][65595] Signal inference workers to stop experience collection... (150 times) [2024-06-12 13:05:43,559][65595] Signal inference workers to resume experience collection... (150 times) [2024-06-12 13:05:43,601][65616] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-06-12 13:05:43,601][65616] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-06-12 13:05:44,332][65383] Fps is (10 sec: 45874.7, 60 sec: 45602.2, 300 sec: 42653.9). Total num frames: 13090816. Throughput: 0: 45644.3. Samples: 13234500. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 13:05:44,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:05:44,806][65616] Updated weights for policy 0, policy_version 800 (0.0046) [2024-06-12 13:05:48,206][65616] Updated weights for policy 0, policy_version 810 (0.0035) [2024-06-12 13:05:49,332][65383] Fps is (10 sec: 42598.1, 60 sec: 45329.1, 300 sec: 42931.6). Total num frames: 13303808. Throughput: 0: 45448.5. Samples: 13364220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 13:05:49,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:05:51,749][65616] Updated weights for policy 0, policy_version 820 (0.0034) [2024-06-12 13:05:54,332][65383] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 43320.4). Total num frames: 13549568. Throughput: 0: 45510.3. Samples: 13639080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 13:05:54,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:05:54,358][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000000828_13565952.pth... [2024-06-12 13:05:54,418][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000000167_2736128.pth [2024-06-12 13:05:55,398][65616] Updated weights for policy 0, policy_version 830 (0.0022) [2024-06-12 13:05:59,028][65616] Updated weights for policy 0, policy_version 840 (0.0031) [2024-06-12 13:05:59,332][65383] Fps is (10 sec: 45874.9, 60 sec: 45328.9, 300 sec: 43487.0). Total num frames: 13762560. Throughput: 0: 45492.9. Samples: 13910960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 13:05:59,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:06:02,243][65616] Updated weights for policy 0, policy_version 850 (0.0031) [2024-06-12 13:06:04,332][65383] Fps is (10 sec: 44236.9, 60 sec: 45329.1, 300 sec: 43820.2). Total num frames: 13991936. Throughput: 0: 45346.3. Samples: 14046660. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 13:06:04,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:06:06,420][65616] Updated weights for policy 0, policy_version 860 (0.0037) [2024-06-12 13:06:09,332][65383] Fps is (10 sec: 47514.3, 60 sec: 45875.3, 300 sec: 44042.4). Total num frames: 14237696. Throughput: 0: 45567.6. Samples: 14321440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 13:06:09,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:06:09,633][65616] Updated weights for policy 0, policy_version 870 (0.0034) [2024-06-12 13:06:13,247][65616] Updated weights for policy 0, policy_version 880 (0.0039) [2024-06-12 13:06:14,332][65383] Fps is (10 sec: 47513.6, 60 sec: 45875.3, 300 sec: 44098.0). Total num frames: 14467072. Throughput: 0: 45631.0. Samples: 14600880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 13:06:14,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:06:16,661][65616] Updated weights for policy 0, policy_version 890 (0.0033) [2024-06-12 13:06:19,332][65383] Fps is (10 sec: 42598.0, 60 sec: 45056.0, 300 sec: 44153.5). Total num frames: 14663680. Throughput: 0: 45363.0. Samples: 14727920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 13:06:19,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:06:20,808][65616] Updated weights for policy 0, policy_version 900 (0.0037) [2024-06-12 13:06:23,885][65616] Updated weights for policy 0, policy_version 910 (0.0033) [2024-06-12 13:06:24,332][65383] Fps is (10 sec: 44237.0, 60 sec: 45602.2, 300 sec: 44431.2). Total num frames: 14909440. Throughput: 0: 45449.3. Samples: 15007200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 13:06:24,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:06:27,799][65616] Updated weights for policy 0, policy_version 920 (0.0032) [2024-06-12 13:06:29,333][65383] Fps is (10 sec: 49151.2, 60 sec: 45875.0, 300 sec: 44653.3). Total num frames: 15155200. 
Throughput: 0: 45389.6. Samples: 15277040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 13:06:29,333][65383] Avg episode reward: [(0, '0.000')] [2024-06-12 13:06:30,870][65616] Updated weights for policy 0, policy_version 930 (0.0040) [2024-06-12 13:06:34,332][65383] Fps is (10 sec: 44237.2, 60 sec: 45329.1, 300 sec: 44708.9). Total num frames: 15351808. Throughput: 0: 45575.3. Samples: 15415100. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-12 13:06:34,332][65383] Avg episode reward: [(0, '0.006')] [2024-06-12 13:06:34,345][65595] Saving new best policy, reward=0.006! [2024-06-12 13:06:34,936][65616] Updated weights for policy 0, policy_version 940 (0.0033) [2024-06-12 13:06:38,654][65616] Updated weights for policy 0, policy_version 950 (0.0035) [2024-06-12 13:06:39,332][65383] Fps is (10 sec: 44237.7, 60 sec: 45329.0, 300 sec: 44820.0). Total num frames: 15597568. Throughput: 0: 45715.1. Samples: 15696260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-12 13:06:39,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:06:42,083][65616] Updated weights for policy 0, policy_version 960 (0.0028) [2024-06-12 13:06:44,332][65383] Fps is (10 sec: 45875.1, 60 sec: 45329.2, 300 sec: 44986.6). Total num frames: 15810560. Throughput: 0: 45574.9. Samples: 15961820. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 13:06:44,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:06:45,621][65616] Updated weights for policy 0, policy_version 970 (0.0035) [2024-06-12 13:06:49,332][65383] Fps is (10 sec: 44237.0, 60 sec: 45602.2, 300 sec: 45153.2). Total num frames: 16039936. Throughput: 0: 45850.7. Samples: 16109940. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 13:06:49,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:06:49,380][65616] Updated weights for policy 0, policy_version 980 (0.0031) [2024-06-12 13:06:52,711][65616] Updated weights for policy 0, policy_version 990 (0.0026) [2024-06-12 13:06:54,332][65383] Fps is (10 sec: 47513.5, 60 sec: 45602.2, 300 sec: 45264.3). Total num frames: 16285696. Throughput: 0: 45697.8. Samples: 16377840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 13:06:54,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:06:56,459][65595] Signal inference workers to stop experience collection... (200 times) [2024-06-12 13:06:56,460][65595] Signal inference workers to resume experience collection... (200 times) [2024-06-12 13:06:56,498][65616] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-06-12 13:06:56,499][65616] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-06-12 13:06:56,590][65616] Updated weights for policy 0, policy_version 1000 (0.0035) [2024-06-12 13:06:59,332][65383] Fps is (10 sec: 47513.3, 60 sec: 45875.3, 300 sec: 45430.9). Total num frames: 16515072. Throughput: 0: 45592.0. Samples: 16652520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-12 13:06:59,341][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:07:00,000][65616] Updated weights for policy 0, policy_version 1010 (0.0033) [2024-06-12 13:07:03,937][65616] Updated weights for policy 0, policy_version 1020 (0.0028) [2024-06-12 13:07:04,332][65383] Fps is (10 sec: 44236.5, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 16728064. Throughput: 0: 45721.8. Samples: 16785400. 
Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-12 13:07:04,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:07:07,398][65616] Updated weights for policy 0, policy_version 1030 (0.0029) [2024-06-12 13:07:09,332][65383] Fps is (10 sec: 44236.9, 60 sec: 45329.0, 300 sec: 45375.4). Total num frames: 16957440. Throughput: 0: 45435.5. Samples: 17051800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 13:07:09,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:07:11,040][65616] Updated weights for policy 0, policy_version 1040 (0.0026) [2024-06-12 13:07:14,325][65616] Updated weights for policy 0, policy_version 1050 (0.0032) [2024-06-12 13:07:14,332][65383] Fps is (10 sec: 47513.8, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 17203200. Throughput: 0: 45627.4. Samples: 17330260. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-12 13:07:14,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:07:18,162][65616] Updated weights for policy 0, policy_version 1060 (0.0025) [2024-06-12 13:07:19,332][65383] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 17416192. Throughput: 0: 45615.0. Samples: 17467780. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 13:07:19,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:07:21,663][65616] Updated weights for policy 0, policy_version 1070 (0.0026) [2024-06-12 13:07:24,332][65383] Fps is (10 sec: 42597.9, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 17629184. Throughput: 0: 45393.7. Samples: 17738980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 13:07:24,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:07:25,482][65616] Updated weights for policy 0, policy_version 1080 (0.0030) [2024-06-12 13:07:28,631][65616] Updated weights for policy 0, policy_version 1090 (0.0032) [2024-06-12 13:07:29,332][65383] Fps is (10 sec: 45875.1, 60 sec: 45329.2, 300 sec: 45597.5). 
Total num frames: 17874944. Throughput: 0: 45597.7. Samples: 18013720. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-12 13:07:29,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:07:32,471][65616] Updated weights for policy 0, policy_version 1100 (0.0042) [2024-06-12 13:07:34,332][65383] Fps is (10 sec: 49152.5, 60 sec: 46148.2, 300 sec: 45597.5). Total num frames: 18120704. Throughput: 0: 45561.3. Samples: 18160200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 13:07:34,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:07:35,826][65616] Updated weights for policy 0, policy_version 1110 (0.0032) [2024-06-12 13:07:39,306][65616] Updated weights for policy 0, policy_version 1120 (0.0038) [2024-06-12 13:07:39,333][65383] Fps is (10 sec: 47513.0, 60 sec: 45875.1, 300 sec: 45597.5). Total num frames: 18350080. Throughput: 0: 45671.8. Samples: 18433080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 13:07:39,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:07:42,896][65616] Updated weights for policy 0, policy_version 1130 (0.0029) [2024-06-12 13:07:44,332][65383] Fps is (10 sec: 45875.0, 60 sec: 46148.2, 300 sec: 45597.5). Total num frames: 18579456. Throughput: 0: 45790.7. Samples: 18713100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 13:07:44,336][65383] Avg episode reward: [(0, '0.007')] [2024-06-12 13:07:46,478][65616] Updated weights for policy 0, policy_version 1140 (0.0026) [2024-06-12 13:07:49,332][65383] Fps is (10 sec: 44237.4, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 18792448. Throughput: 0: 45717.8. Samples: 18842700. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 13:07:49,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:07:50,243][65616] Updated weights for policy 0, policy_version 1150 (0.0028) [2024-06-12 13:07:53,441][65616] Updated weights for policy 0, policy_version 1160 (0.0026) [2024-06-12 13:07:54,332][65383] Fps is (10 sec: 42598.3, 60 sec: 45329.0, 300 sec: 45486.4). Total num frames: 19005440. Throughput: 0: 45897.3. Samples: 19117180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 13:07:54,333][65383] Avg episode reward: [(0, '0.006')] [2024-06-12 13:07:54,357][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000001161_19021824.pth... [2024-06-12 13:07:54,406][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000000493_8077312.pth [2024-06-12 13:07:57,457][65616] Updated weights for policy 0, policy_version 1170 (0.0026) [2024-06-12 13:07:59,332][65383] Fps is (10 sec: 45875.0, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 19251200. Throughput: 0: 45909.7. Samples: 19396200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 13:07:59,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:08:01,006][65616] Updated weights for policy 0, policy_version 1180 (0.0032) [2024-06-12 13:08:04,332][65383] Fps is (10 sec: 45875.3, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 19464192. Throughput: 0: 45790.6. Samples: 19528360. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 13:08:04,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:08:04,940][65616] Updated weights for policy 0, policy_version 1190 (0.0041) [2024-06-12 13:08:07,974][65616] Updated weights for policy 0, policy_version 1200 (0.0031) [2024-06-12 13:08:09,332][65383] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 19709952. Throughput: 0: 45839.6. Samples: 19801760. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 13:08:09,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:08:11,766][65616] Updated weights for policy 0, policy_version 1210 (0.0031) [2024-06-12 13:08:14,332][65383] Fps is (10 sec: 47513.5, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 19939328. Throughput: 0: 45855.1. Samples: 20077200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 13:08:14,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:08:15,248][65616] Updated weights for policy 0, policy_version 1220 (0.0031) [2024-06-12 13:08:19,296][65616] Updated weights for policy 0, policy_version 1230 (0.0038) [2024-06-12 13:08:19,333][65383] Fps is (10 sec: 44236.2, 60 sec: 45602.0, 300 sec: 45653.0). Total num frames: 20152320. Throughput: 0: 45423.0. Samples: 20204240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 13:08:19,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:08:22,563][65616] Updated weights for policy 0, policy_version 1240 (0.0034) [2024-06-12 13:08:24,332][65383] Fps is (10 sec: 42598.6, 60 sec: 45602.2, 300 sec: 45486.4). Total num frames: 20365312. Throughput: 0: 45453.0. Samples: 20478460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 13:08:24,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:08:26,593][65616] Updated weights for policy 0, policy_version 1250 (0.0028) [2024-06-12 13:08:27,592][65595] Signal inference workers to stop experience collection... (250 times) [2024-06-12 13:08:27,615][65616] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-06-12 13:08:27,648][65595] Signal inference workers to resume experience collection... (250 times) [2024-06-12 13:08:27,648][65616] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-06-12 13:08:29,332][65383] Fps is (10 sec: 47514.1, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 20627456. Throughput: 0: 45265.3. Samples: 20750040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 13:08:29,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:08:29,959][65616] Updated weights for policy 0, policy_version 1260 (0.0034) [2024-06-12 13:08:33,234][65616] Updated weights for policy 0, policy_version 1270 (0.0032) [2024-06-12 13:08:34,332][65383] Fps is (10 sec: 45875.5, 60 sec: 45056.0, 300 sec: 45597.5). Total num frames: 20824064. Throughput: 0: 45581.4. Samples: 20893860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 13:08:34,333][65383] Avg episode reward: [(0, '0.001')] [2024-06-12 13:08:36,604][65616] Updated weights for policy 0, policy_version 1280 (0.0026) [2024-06-12 13:08:39,332][65383] Fps is (10 sec: 42598.6, 60 sec: 45056.1, 300 sec: 45542.5). Total num frames: 21053440. Throughput: 0: 45577.8. Samples: 21168180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 13:08:39,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:08:40,841][65616] Updated weights for policy 0, policy_version 1290 (0.0035) [2024-06-12 13:08:43,962][65616] Updated weights for policy 0, policy_version 1300 (0.0030) [2024-06-12 13:08:44,332][65383] Fps is (10 sec: 47513.5, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 21299200. Throughput: 0: 45449.0. Samples: 21441400. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-12 13:08:44,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:08:47,850][65616] Updated weights for policy 0, policy_version 1310 (0.0033) [2024-06-12 13:08:49,332][65383] Fps is (10 sec: 47513.6, 60 sec: 45602.2, 300 sec: 45597.6). Total num frames: 21528576. Throughput: 0: 45698.7. Samples: 21584800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 13:08:49,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:08:50,834][65616] Updated weights for policy 0, policy_version 1320 (0.0027) [2024-06-12 13:08:54,332][65383] Fps is (10 sec: 45874.8, 60 sec: 45875.2, 300 sec: 45653.1). 
Total num frames: 21757952. Throughput: 0: 45800.0. Samples: 21862760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 13:08:54,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:08:54,956][65616] Updated weights for policy 0, policy_version 1330 (0.0028) [2024-06-12 13:08:57,803][65616] Updated weights for policy 0, policy_version 1340 (0.0032) [2024-06-12 13:08:59,332][65383] Fps is (10 sec: 44236.7, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 21970944. Throughput: 0: 45728.9. Samples: 22135000. Policy #0 lag: (min: 2.0, avg: 11.5, max: 22.0) [2024-06-12 13:08:59,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:09:01,769][65616] Updated weights for policy 0, policy_version 1350 (0.0029) [2024-06-12 13:09:04,332][65383] Fps is (10 sec: 45875.8, 60 sec: 45875.3, 300 sec: 45653.0). Total num frames: 22216704. Throughput: 0: 45915.3. Samples: 22270420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-12 13:09:04,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:09:05,298][65616] Updated weights for policy 0, policy_version 1360 (0.0033) [2024-06-12 13:09:09,192][65616] Updated weights for policy 0, policy_version 1370 (0.0032) [2024-06-12 13:09:09,333][65383] Fps is (10 sec: 47513.1, 60 sec: 45602.0, 300 sec: 45653.7). Total num frames: 22446080. Throughput: 0: 45987.9. Samples: 22547920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 13:09:09,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:09:12,506][65616] Updated weights for policy 0, policy_version 1380 (0.0033) [2024-06-12 13:09:14,332][65383] Fps is (10 sec: 44236.7, 60 sec: 45329.2, 300 sec: 45542.0). Total num frames: 22659072. Throughput: 0: 46087.6. Samples: 22823980. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-12 13:09:14,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:09:16,475][65616] Updated weights for policy 0, policy_version 1390 (0.0031) [2024-06-12 13:09:19,332][65383] Fps is (10 sec: 45875.7, 60 sec: 45875.3, 300 sec: 45653.0). Total num frames: 22904832. Throughput: 0: 45857.7. Samples: 22957460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 13:09:19,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:09:19,645][65616] Updated weights for policy 0, policy_version 1400 (0.0029) [2024-06-12 13:09:23,517][65616] Updated weights for policy 0, policy_version 1410 (0.0025) [2024-06-12 13:09:24,332][65383] Fps is (10 sec: 49152.0, 60 sec: 46421.4, 300 sec: 45708.6). Total num frames: 23150592. Throughput: 0: 45940.0. Samples: 23235480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 13:09:24,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:09:27,041][65616] Updated weights for policy 0, policy_version 1420 (0.0030) [2024-06-12 13:09:29,332][65383] Fps is (10 sec: 45875.2, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 23363584. Throughput: 0: 45926.2. Samples: 23508080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 13:09:29,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:09:30,564][65616] Updated weights for policy 0, policy_version 1430 (0.0027) [2024-06-12 13:09:33,937][65616] Updated weights for policy 0, policy_version 1440 (0.0027) [2024-06-12 13:09:34,333][65383] Fps is (10 sec: 44236.1, 60 sec: 46148.1, 300 sec: 45653.1). Total num frames: 23592960. Throughput: 0: 45699.9. Samples: 23641300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 13:09:34,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:09:37,716][65616] Updated weights for policy 0, policy_version 1450 (0.0024) [2024-06-12 13:09:39,332][65383] Fps is (10 sec: 47513.5, 60 sec: 46421.3, 300 sec: 45708.6). 
Total num frames: 23838720. Throughput: 0: 45630.7. Samples: 23916140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 13:09:39,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:09:41,461][65616] Updated weights for policy 0, policy_version 1460 (0.0030) [2024-06-12 13:09:44,332][65383] Fps is (10 sec: 44237.3, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 24035328. Throughput: 0: 45717.3. Samples: 24192280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 13:09:44,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:09:44,994][65616] Updated weights for policy 0, policy_version 1470 (0.0031) [2024-06-12 13:09:48,727][65616] Updated weights for policy 0, policy_version 1480 (0.0029) [2024-06-12 13:09:49,335][65383] Fps is (10 sec: 44224.6, 60 sec: 45873.1, 300 sec: 45708.2). Total num frames: 24281088. Throughput: 0: 45543.3. Samples: 24320000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 13:09:49,336][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:09:51,767][65595] Signal inference workers to stop experience collection... (300 times) [2024-06-12 13:09:51,768][65595] Signal inference workers to resume experience collection... (300 times) [2024-06-12 13:09:51,810][65616] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-06-12 13:09:51,810][65616] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-06-12 13:09:51,899][65616] Updated weights for policy 0, policy_version 1490 (0.0029) [2024-06-12 13:09:54,332][65383] Fps is (10 sec: 47513.4, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 24510464. Throughput: 0: 45634.3. Samples: 24601460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 13:09:54,333][65383] Avg episode reward: [(0, '0.007')] [2024-06-12 13:09:54,345][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000001496_24510464.pth... 
[2024-06-12 13:09:54,387][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000000828_13565952.pth [2024-06-12 13:09:56,055][65616] Updated weights for policy 0, policy_version 1500 (0.0032) [2024-06-12 13:09:59,332][65383] Fps is (10 sec: 44249.0, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 24723456. Throughput: 0: 45619.5. Samples: 24876860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 13:09:59,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:09:59,432][65616] Updated weights for policy 0, policy_version 1510 (0.0035) [2024-06-12 13:10:03,443][65616] Updated weights for policy 0, policy_version 1520 (0.0030) [2024-06-12 13:10:04,333][65383] Fps is (10 sec: 44236.4, 60 sec: 45602.0, 300 sec: 45653.0). Total num frames: 24952832. Throughput: 0: 45475.0. Samples: 25003840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-12 13:10:04,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:10:06,573][65616] Updated weights for policy 0, policy_version 1530 (0.0035) [2024-06-12 13:10:09,332][65383] Fps is (10 sec: 47513.6, 60 sec: 45875.3, 300 sec: 45708.6). Total num frames: 25198592. Throughput: 0: 45397.3. Samples: 25278360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 13:10:09,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:10:10,263][65616] Updated weights for policy 0, policy_version 1540 (0.0034) [2024-06-12 13:10:13,431][65616] Updated weights for policy 0, policy_version 1550 (0.0028) [2024-06-12 13:10:14,332][65383] Fps is (10 sec: 45875.2, 60 sec: 45875.1, 300 sec: 45597.5). Total num frames: 25411584. Throughput: 0: 45628.3. Samples: 25561360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 13:10:14,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:10:17,499][65616] Updated weights for policy 0, policy_version 1560 (0.0026) [2024-06-12 13:10:19,332][65383] Fps is (10 sec: 44237.0, 60 sec: 45602.2, 300 sec: 45653.0). 
Total num frames: 25640960. Throughput: 0: 45772.6. Samples: 25701060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 13:10:19,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:10:20,909][65616] Updated weights for policy 0, policy_version 1570 (0.0034) [2024-06-12 13:10:24,073][65616] Updated weights for policy 0, policy_version 1580 (0.0028) [2024-06-12 13:10:24,332][65383] Fps is (10 sec: 47513.8, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 25886720. Throughput: 0: 45788.8. Samples: 25976640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 13:10:24,333][65383] Avg episode reward: [(0, '0.006')] [2024-06-12 13:10:27,625][65616] Updated weights for policy 0, policy_version 1590 (0.0035) [2024-06-12 13:10:29,332][65383] Fps is (10 sec: 47513.3, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 26116096. Throughput: 0: 45780.0. Samples: 26252380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 13:10:29,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:10:31,367][65616] Updated weights for policy 0, policy_version 1600 (0.0034) [2024-06-12 13:10:34,332][65383] Fps is (10 sec: 44237.4, 60 sec: 45602.3, 300 sec: 45597.5). Total num frames: 26329088. Throughput: 0: 46035.4. Samples: 26391460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 13:10:34,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:10:35,081][65616] Updated weights for policy 0, policy_version 1610 (0.0027) [2024-06-12 13:10:38,459][65616] Updated weights for policy 0, policy_version 1620 (0.0026) [2024-06-12 13:10:39,332][65383] Fps is (10 sec: 44236.5, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 26558464. Throughput: 0: 46003.5. Samples: 26671620. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 13:10:39,333][65383] Avg episode reward: [(0, '0.006')] [2024-06-12 13:10:42,345][65616] Updated weights for policy 0, policy_version 1630 (0.0041) [2024-06-12 13:10:44,332][65383] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 26787840. Throughput: 0: 45820.9. Samples: 26938800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 13:10:44,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:10:45,723][65616] Updated weights for policy 0, policy_version 1640 (0.0034) [2024-06-12 13:10:49,332][65383] Fps is (10 sec: 45875.5, 60 sec: 45604.2, 300 sec: 45653.0). Total num frames: 27017216. Throughput: 0: 46112.1. Samples: 27078880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 13:10:49,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:10:49,346][65616] Updated weights for policy 0, policy_version 1650 (0.0033) [2024-06-12 13:10:52,863][65616] Updated weights for policy 0, policy_version 1660 (0.0032) [2024-06-12 13:10:54,333][65383] Fps is (10 sec: 49151.1, 60 sec: 46148.2, 300 sec: 45819.7). Total num frames: 27279360. Throughput: 0: 46281.6. Samples: 27361040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 13:10:54,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:10:56,595][65616] Updated weights for policy 0, policy_version 1670 (0.0032) [2024-06-12 13:10:59,332][65383] Fps is (10 sec: 47513.7, 60 sec: 46148.3, 300 sec: 45764.1). Total num frames: 27492352. Throughput: 0: 46204.6. Samples: 27640560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 13:10:59,333][65383] Avg episode reward: [(0, '0.006')] [2024-06-12 13:10:59,706][65616] Updated weights for policy 0, policy_version 1680 (0.0024) [2024-06-12 13:11:03,374][65616] Updated weights for policy 0, policy_version 1690 (0.0026) [2024-06-12 13:11:04,332][65383] Fps is (10 sec: 45875.6, 60 sec: 46421.4, 300 sec: 45764.1). 
Total num frames: 27738112. Throughput: 0: 46202.1. Samples: 27780160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 13:11:04,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:11:06,870][65616] Updated weights for policy 0, policy_version 1700 (0.0027) [2024-06-12 13:11:09,332][65383] Fps is (10 sec: 47513.7, 60 sec: 46148.3, 300 sec: 45764.1). Total num frames: 27967488. Throughput: 0: 46144.5. Samples: 28053140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 13:11:09,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:11:10,193][65616] Updated weights for policy 0, policy_version 1710 (0.0039) [2024-06-12 13:11:13,678][65616] Updated weights for policy 0, policy_version 1720 (0.0036) [2024-06-12 13:11:14,332][65383] Fps is (10 sec: 45875.6, 60 sec: 46421.4, 300 sec: 45875.2). Total num frames: 28196864. Throughput: 0: 46428.0. Samples: 28341640. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-12 13:11:14,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:11:17,212][65616] Updated weights for policy 0, policy_version 1730 (0.0044) [2024-06-12 13:11:19,332][65383] Fps is (10 sec: 45875.0, 60 sec: 46421.3, 300 sec: 45819.6). Total num frames: 28426240. Throughput: 0: 46413.7. Samples: 28480080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 13:11:19,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:11:20,940][65616] Updated weights for policy 0, policy_version 1740 (0.0031) [2024-06-12 13:11:20,943][65595] Signal inference workers to stop experience collection... (350 times) [2024-06-12 13:11:20,943][65595] Signal inference workers to resume experience collection... 
(350 times) [2024-06-12 13:11:20,962][65616] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-06-12 13:11:20,962][65616] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-06-12 13:11:24,168][65616] Updated weights for policy 0, policy_version 1750 (0.0026) [2024-06-12 13:11:24,333][65383] Fps is (10 sec: 47512.9, 60 sec: 46421.3, 300 sec: 45819.7). Total num frames: 28672000. Throughput: 0: 46392.4. Samples: 28759280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 13:11:24,333][65383] Avg episode reward: [(0, '0.007')] [2024-06-12 13:11:27,898][65616] Updated weights for policy 0, policy_version 1760 (0.0030) [2024-06-12 13:11:29,332][65383] Fps is (10 sec: 45875.5, 60 sec: 46148.3, 300 sec: 45875.2). Total num frames: 28884992. Throughput: 0: 46476.4. Samples: 29030240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 13:11:29,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:11:31,171][65616] Updated weights for policy 0, policy_version 1770 (0.0039) [2024-06-12 13:11:34,332][65383] Fps is (10 sec: 42599.1, 60 sec: 46148.2, 300 sec: 45764.1). Total num frames: 29097984. Throughput: 0: 46633.8. Samples: 29177400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 13:11:34,333][65383] Avg episode reward: [(0, '0.008')] [2024-06-12 13:11:34,420][65595] Saving new best policy, reward=0.008! [2024-06-12 13:11:34,891][65616] Updated weights for policy 0, policy_version 1780 (0.0026) [2024-06-12 13:11:38,315][65616] Updated weights for policy 0, policy_version 1790 (0.0033) [2024-06-12 13:11:39,332][65383] Fps is (10 sec: 45875.2, 60 sec: 46421.4, 300 sec: 45875.2). Total num frames: 29343744. Throughput: 0: 46560.6. Samples: 29456260. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-12 13:11:39,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:11:42,061][65616] Updated weights for policy 0, policy_version 1800 (0.0030) [2024-06-12 13:11:44,332][65383] Fps is (10 sec: 50789.9, 60 sec: 46967.4, 300 sec: 45986.3). Total num frames: 29605888. Throughput: 0: 46463.5. Samples: 29731420. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-12 13:11:44,333][65383] Avg episode reward: [(0, '0.008')] [2024-06-12 13:11:45,664][65616] Updated weights for policy 0, policy_version 1810 (0.0034) [2024-06-12 13:11:48,662][65616] Updated weights for policy 0, policy_version 1820 (0.0039) [2024-06-12 13:11:49,332][65383] Fps is (10 sec: 49151.4, 60 sec: 46967.4, 300 sec: 45930.7). Total num frames: 29835264. Throughput: 0: 46827.1. Samples: 29887380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 13:11:49,339][65383] Avg episode reward: [(0, '0.007')] [2024-06-12 13:11:52,194][65616] Updated weights for policy 0, policy_version 1830 (0.0029) [2024-06-12 13:11:54,332][65383] Fps is (10 sec: 44236.9, 60 sec: 46148.4, 300 sec: 45875.2). Total num frames: 30048256. Throughput: 0: 46986.2. Samples: 30167520. Policy #0 lag: (min: 1.0, avg: 8.2, max: 19.0) [2024-06-12 13:11:54,333][65383] Avg episode reward: [(0, '0.006')] [2024-06-12 13:11:54,369][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000001835_30064640.pth... [2024-06-12 13:11:54,416][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000001161_19021824.pth [2024-06-12 13:11:55,833][65616] Updated weights for policy 0, policy_version 1840 (0.0032) [2024-06-12 13:11:59,215][65616] Updated weights for policy 0, policy_version 1850 (0.0029) [2024-06-12 13:11:59,332][65383] Fps is (10 sec: 47514.1, 60 sec: 46967.5, 300 sec: 46041.8). Total num frames: 30310400. Throughput: 0: 46792.4. Samples: 30447300. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 13:11:59,333][65383] Avg episode reward: [(0, '0.006')] [2024-06-12 13:12:02,742][65616] Updated weights for policy 0, policy_version 1860 (0.0025) [2024-06-12 13:12:04,332][65383] Fps is (10 sec: 52428.6, 60 sec: 47240.5, 300 sec: 46152.9). Total num frames: 30572544. Throughput: 0: 47070.6. Samples: 30598260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 13:12:04,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:12:05,795][65616] Updated weights for policy 0, policy_version 1870 (0.0028) [2024-06-12 13:12:09,313][65616] Updated weights for policy 0, policy_version 1880 (0.0024) [2024-06-12 13:12:09,332][65383] Fps is (10 sec: 49152.2, 60 sec: 47240.6, 300 sec: 46097.4). Total num frames: 30801920. Throughput: 0: 47210.4. Samples: 30883740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 13:12:09,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:12:12,779][65616] Updated weights for policy 0, policy_version 1890 (0.0032) [2024-06-12 13:12:14,332][65383] Fps is (10 sec: 42599.0, 60 sec: 46694.4, 300 sec: 46041.8). Total num frames: 30998528. Throughput: 0: 47589.4. Samples: 31171760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 13:12:14,333][65383] Avg episode reward: [(0, '0.007')] [2024-06-12 13:12:16,405][65616] Updated weights for policy 0, policy_version 1900 (0.0029) [2024-06-12 13:12:19,332][65383] Fps is (10 sec: 47513.5, 60 sec: 47513.6, 300 sec: 46264.0). Total num frames: 31277056. Throughput: 0: 47482.2. Samples: 31314100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-12 13:12:19,333][65383] Avg episode reward: [(0, '0.003')] [2024-06-12 13:12:19,731][65616] Updated weights for policy 0, policy_version 1910 (0.0032) [2024-06-12 13:12:22,379][65595] Signal inference workers to stop experience collection... (400 times) [2024-06-12 13:12:22,379][65595] Signal inference workers to resume experience collection... 
(400 times) [2024-06-12 13:12:22,391][65616] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-06-12 13:12:22,391][65616] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-06-12 13:12:23,250][65616] Updated weights for policy 0, policy_version 1920 (0.0030) [2024-06-12 13:12:24,332][65383] Fps is (10 sec: 52428.8, 60 sec: 47513.8, 300 sec: 46264.0). Total num frames: 31522816. Throughput: 0: 47771.6. Samples: 31605980. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-12 13:12:24,333][65383] Avg episode reward: [(0, '0.006')] [2024-06-12 13:12:26,330][65616] Updated weights for policy 0, policy_version 1930 (0.0026) [2024-06-12 13:12:29,332][65383] Fps is (10 sec: 44236.5, 60 sec: 47240.5, 300 sec: 46097.3). Total num frames: 31719424. Throughput: 0: 47940.5. Samples: 31888740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 13:12:29,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:12:30,082][65616] Updated weights for policy 0, policy_version 1940 (0.0034) [2024-06-12 13:12:33,320][65616] Updated weights for policy 0, policy_version 1950 (0.0038) [2024-06-12 13:12:34,333][65383] Fps is (10 sec: 42593.6, 60 sec: 47512.7, 300 sec: 46097.2). Total num frames: 31948800. Throughput: 0: 47280.8. Samples: 32015060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-12 13:12:34,334][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:12:37,226][65616] Updated weights for policy 0, policy_version 1960 (0.0033) [2024-06-12 13:12:39,332][65383] Fps is (10 sec: 50790.2, 60 sec: 48059.7, 300 sec: 46264.0). Total num frames: 32227328. Throughput: 0: 47288.0. Samples: 32295480. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-12 13:12:39,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:12:40,310][65616] Updated weights for policy 0, policy_version 1970 (0.0034) [2024-06-12 13:12:44,029][65616] Updated weights for policy 0, policy_version 1980 (0.0035) [2024-06-12 13:12:44,332][65383] Fps is (10 sec: 49157.4, 60 sec: 47240.6, 300 sec: 46264.0). Total num frames: 32440320. Throughput: 0: 47536.0. Samples: 32586420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 13:12:44,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:12:47,118][65616] Updated weights for policy 0, policy_version 1990 (0.0046) [2024-06-12 13:12:49,332][65383] Fps is (10 sec: 44237.4, 60 sec: 47240.7, 300 sec: 46319.5). Total num frames: 32669696. Throughput: 0: 47162.4. Samples: 32720560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 13:12:49,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:12:51,153][65616] Updated weights for policy 0, policy_version 2000 (0.0036) [2024-06-12 13:12:54,217][65616] Updated weights for policy 0, policy_version 2010 (0.0028) [2024-06-12 13:12:54,332][65383] Fps is (10 sec: 49151.6, 60 sec: 48059.7, 300 sec: 46375.1). Total num frames: 32931840. Throughput: 0: 47175.9. Samples: 33006660. Policy #0 lag: (min: 1.0, avg: 13.0, max: 27.0) [2024-06-12 13:12:54,333][65383] Avg episode reward: [(0, '0.006')] [2024-06-12 13:12:58,218][65616] Updated weights for policy 0, policy_version 2020 (0.0033) [2024-06-12 13:12:59,332][65383] Fps is (10 sec: 49151.3, 60 sec: 47513.5, 300 sec: 46430.6). Total num frames: 33161216. Throughput: 0: 46911.9. Samples: 33282800. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-12 13:12:59,333][65383] Avg episode reward: [(0, '0.009')] [2024-06-12 13:13:01,323][65616] Updated weights for policy 0, policy_version 2030 (0.0026) [2024-06-12 13:13:04,332][65383] Fps is (10 sec: 45875.3, 60 sec: 46967.5, 300 sec: 46375.0). 
Total num frames: 33390592. Throughput: 0: 47133.3. Samples: 33435100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 13:13:04,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:13:04,998][65616] Updated weights for policy 0, policy_version 2040 (0.0029) [2024-06-12 13:13:08,518][65616] Updated weights for policy 0, policy_version 2050 (0.0024) [2024-06-12 13:13:09,332][65383] Fps is (10 sec: 42598.8, 60 sec: 46421.3, 300 sec: 46264.0). Total num frames: 33587200. Throughput: 0: 46698.2. Samples: 33707400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 13:13:09,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:13:12,136][65616] Updated weights for policy 0, policy_version 2060 (0.0030) [2024-06-12 13:13:14,332][65383] Fps is (10 sec: 44237.0, 60 sec: 47240.5, 300 sec: 46375.1). Total num frames: 33832960. Throughput: 0: 46569.0. Samples: 33984340. Policy #0 lag: (min: 1.0, avg: 12.7, max: 23.0) [2024-06-12 13:13:14,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:13:15,698][65616] Updated weights for policy 0, policy_version 2070 (0.0034) [2024-06-12 13:13:19,129][65616] Updated weights for policy 0, policy_version 2080 (0.0035) [2024-06-12 13:13:19,332][65383] Fps is (10 sec: 49151.6, 60 sec: 46694.3, 300 sec: 46486.1). Total num frames: 34078720. Throughput: 0: 47071.7. Samples: 34133240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 13:13:19,333][65383] Avg episode reward: [(0, '0.007')] [2024-06-12 13:13:22,695][65616] Updated weights for policy 0, policy_version 2090 (0.0029) [2024-06-12 13:13:24,332][65383] Fps is (10 sec: 45875.5, 60 sec: 46148.3, 300 sec: 46319.5). Total num frames: 34291712. Throughput: 0: 46962.8. Samples: 34408800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 13:13:24,333][65383] Avg episode reward: [(0, '0.002')] [2024-06-12 13:13:25,867][65616] Updated weights for policy 0, policy_version 2100 (0.0025) [2024-06-12 13:13:29,303][65616] Updated weights for policy 0, policy_version 2110 (0.0037) [2024-06-12 13:13:29,332][65383] Fps is (10 sec: 49152.7, 60 sec: 47513.7, 300 sec: 46597.2). Total num frames: 34570240. Throughput: 0: 46944.9. Samples: 34698940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-12 13:13:29,333][65383] Avg episode reward: [(0, '0.008')] [2024-06-12 13:13:32,828][65616] Updated weights for policy 0, policy_version 2120 (0.0033) [2024-06-12 13:13:33,566][65595] Signal inference workers to stop experience collection... (450 times) [2024-06-12 13:13:33,568][65595] Signal inference workers to resume experience collection... (450 times) [2024-06-12 13:13:33,598][65616] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-06-12 13:13:33,598][65616] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-06-12 13:13:34,332][65383] Fps is (10 sec: 52428.0, 60 sec: 47787.4, 300 sec: 46652.7). Total num frames: 34816000. Throughput: 0: 47279.9. Samples: 34848160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 13:13:34,333][65383] Avg episode reward: [(0, '0.009')] [2024-06-12 13:13:36,179][65616] Updated weights for policy 0, policy_version 2130 (0.0032) [2024-06-12 13:13:39,332][65383] Fps is (10 sec: 45875.0, 60 sec: 46694.5, 300 sec: 46541.7). Total num frames: 35028992. Throughput: 0: 47287.2. Samples: 35134580. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 13:13:39,333][65383] Avg episode reward: [(0, '0.009')] [2024-06-12 13:13:39,647][65616] Updated weights for policy 0, policy_version 2140 (0.0026) [2024-06-12 13:13:43,126][65616] Updated weights for policy 0, policy_version 2150 (0.0038) [2024-06-12 13:13:44,332][65383] Fps is (10 sec: 42598.6, 60 sec: 46694.4, 300 sec: 46486.1). Total num frames: 35241984. Throughput: 0: 47225.4. Samples: 35407940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 13:13:44,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:13:46,498][65616] Updated weights for policy 0, policy_version 2160 (0.0029) [2024-06-12 13:13:49,332][65383] Fps is (10 sec: 47512.8, 60 sec: 47240.4, 300 sec: 46597.2). Total num frames: 35504128. Throughput: 0: 47055.0. Samples: 35552580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 13:13:49,333][65383] Avg episode reward: [(0, '0.006')] [2024-06-12 13:13:50,199][65616] Updated weights for policy 0, policy_version 2170 (0.0027) [2024-06-12 13:13:53,737][65616] Updated weights for policy 0, policy_version 2180 (0.0022) [2024-06-12 13:13:54,332][65383] Fps is (10 sec: 50790.6, 60 sec: 46967.5, 300 sec: 46708.3). Total num frames: 35749888. Throughput: 0: 47369.3. Samples: 35839020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 13:13:54,333][65383] Avg episode reward: [(0, '0.006')] [2024-06-12 13:13:54,411][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000002183_35766272.pth... [2024-06-12 13:13:54,456][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000001496_24510464.pth [2024-06-12 13:13:57,552][65616] Updated weights for policy 0, policy_version 2190 (0.0026) [2024-06-12 13:13:59,332][65383] Fps is (10 sec: 47514.3, 60 sec: 46967.6, 300 sec: 46652.7). Total num frames: 35979264. Throughput: 0: 47494.2. Samples: 36121580. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 13:13:59,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:14:00,339][65616] Updated weights for policy 0, policy_version 2200 (0.0036) [2024-06-12 13:14:04,332][65383] Fps is (10 sec: 44236.6, 60 sec: 46694.4, 300 sec: 46597.2). Total num frames: 36192256. Throughput: 0: 47374.3. Samples: 36265080. Policy #0 lag: (min: 1.0, avg: 12.6, max: 23.0) [2024-06-12 13:14:04,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:14:04,485][65616] Updated weights for policy 0, policy_version 2210 (0.0024) [2024-06-12 13:14:07,505][65616] Updated weights for policy 0, policy_version 2220 (0.0034) [2024-06-12 13:14:09,333][65383] Fps is (10 sec: 47510.2, 60 sec: 47786.1, 300 sec: 46763.7). Total num frames: 36454400. Throughput: 0: 47467.2. Samples: 36544860. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-12 13:14:09,333][65383] Avg episode reward: [(0, '0.008')] [2024-06-12 13:14:11,045][65616] Updated weights for policy 0, policy_version 2230 (0.0026) [2024-06-12 13:14:14,230][65616] Updated weights for policy 0, policy_version 2240 (0.0032) [2024-06-12 13:14:14,332][65383] Fps is (10 sec: 50790.0, 60 sec: 47786.6, 300 sec: 46763.8). Total num frames: 36700160. Throughput: 0: 47550.9. Samples: 36838740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 13:14:14,333][65383] Avg episode reward: [(0, '0.008')] [2024-06-12 13:14:17,895][65616] Updated weights for policy 0, policy_version 2250 (0.0031) [2024-06-12 13:14:19,332][65383] Fps is (10 sec: 49155.3, 60 sec: 47786.7, 300 sec: 46763.8). Total num frames: 36945920. Throughput: 0: 47450.7. Samples: 36983440. Policy #0 lag: (min: 2.0, avg: 11.7, max: 24.0) [2024-06-12 13:14:19,333][65383] Avg episode reward: [(0, '0.008')] [2024-06-12 13:14:21,187][65616] Updated weights for policy 0, policy_version 2260 (0.0029) [2024-06-12 13:14:24,332][65383] Fps is (10 sec: 45875.4, 60 sec: 47786.6, 300 sec: 46763.8). 
Total num frames: 37158912. Throughput: 0: 47328.8. Samples: 37264380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 13:14:24,333][65383] Avg episode reward: [(0, '0.007')] [2024-06-12 13:14:24,865][65616] Updated weights for policy 0, policy_version 2270 (0.0031) [2024-06-12 13:14:27,789][65616] Updated weights for policy 0, policy_version 2280 (0.0030) [2024-06-12 13:14:29,332][65383] Fps is (10 sec: 47513.8, 60 sec: 47513.5, 300 sec: 46874.9). Total num frames: 37421056. Throughput: 0: 47618.7. Samples: 37550780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-12 13:14:29,333][65383] Avg episode reward: [(0, '0.009')] [2024-06-12 13:14:31,922][65616] Updated weights for policy 0, policy_version 2290 (0.0036) [2024-06-12 13:14:34,333][65383] Fps is (10 sec: 49151.6, 60 sec: 47240.5, 300 sec: 46819.3). Total num frames: 37650432. Throughput: 0: 47664.4. Samples: 37697480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 13:14:34,333][65383] Avg episode reward: [(0, '0.004')] [2024-06-12 13:14:34,694][65616] Updated weights for policy 0, policy_version 2300 (0.0032) [2024-06-12 13:14:38,626][65616] Updated weights for policy 0, policy_version 2310 (0.0033) [2024-06-12 13:14:39,332][65383] Fps is (10 sec: 45874.7, 60 sec: 47513.5, 300 sec: 46930.4). Total num frames: 37879808. Throughput: 0: 47714.1. Samples: 37986160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 13:14:39,333][65383] Avg episode reward: [(0, '0.011')] [2024-06-12 13:14:39,393][65595] Saving new best policy, reward=0.011! [2024-06-12 13:14:41,371][65616] Updated weights for policy 0, policy_version 2320 (0.0037) [2024-06-12 13:14:44,332][65383] Fps is (10 sec: 45875.9, 60 sec: 47786.7, 300 sec: 46875.3). Total num frames: 38109184. Throughput: 0: 47758.7. Samples: 38270720. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 13:14:44,333][65383] Avg episode reward: [(0, '0.006')] [2024-06-12 13:14:45,406][65616] Updated weights for policy 0, policy_version 2330 (0.0025) [2024-06-12 13:14:48,189][65616] Updated weights for policy 0, policy_version 2340 (0.0021) [2024-06-12 13:14:49,332][65383] Fps is (10 sec: 50790.8, 60 sec: 48059.8, 300 sec: 47041.5). Total num frames: 38387712. Throughput: 0: 47656.9. Samples: 38409640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-12 13:14:49,333][65383] Avg episode reward: [(0, '0.011')] [2024-06-12 13:14:50,063][65595] Signal inference workers to stop experience collection... (500 times) [2024-06-12 13:14:50,111][65595] Signal inference workers to resume experience collection... (500 times) [2024-06-12 13:14:50,112][65616] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-06-12 13:14:50,138][65616] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-06-12 13:14:52,502][65616] Updated weights for policy 0, policy_version 2350 (0.0036) [2024-06-12 13:14:54,332][65383] Fps is (10 sec: 50790.6, 60 sec: 47786.7, 300 sec: 47097.1). Total num frames: 38617088. Throughput: 0: 47941.2. Samples: 38702180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 13:14:54,333][65383] Avg episode reward: [(0, '0.010')] [2024-06-12 13:14:54,986][65616] Updated weights for policy 0, policy_version 2360 (0.0028) [2024-06-12 13:14:59,122][65616] Updated weights for policy 0, policy_version 2370 (0.0034) [2024-06-12 13:14:59,332][65383] Fps is (10 sec: 44237.3, 60 sec: 47513.6, 300 sec: 47041.5). Total num frames: 38830080. Throughput: 0: 47774.0. Samples: 38988560. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 13:14:59,333][65383] Avg episode reward: [(0, '0.007')] [2024-06-12 13:15:01,998][65616] Updated weights for policy 0, policy_version 2380 (0.0027) [2024-06-12 13:15:04,332][65383] Fps is (10 sec: 44236.4, 60 sec: 47786.6, 300 sec: 46986.0). Total num frames: 39059456. Throughput: 0: 47401.8. Samples: 39116520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 13:15:04,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:15:06,268][65616] Updated weights for policy 0, policy_version 2390 (0.0033) [2024-06-12 13:15:08,683][65616] Updated weights for policy 0, policy_version 2400 (0.0035) [2024-06-12 13:15:09,332][65383] Fps is (10 sec: 49151.4, 60 sec: 47787.2, 300 sec: 47152.6). Total num frames: 39321600. Throughput: 0: 47631.6. Samples: 39407800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 13:15:09,333][65383] Avg episode reward: [(0, '0.008')] [2024-06-12 13:15:12,762][65616] Updated weights for policy 0, policy_version 2410 (0.0029) [2024-06-12 13:15:14,333][65383] Fps is (10 sec: 50789.9, 60 sec: 47786.6, 300 sec: 47208.1). Total num frames: 39567360. Throughput: 0: 47743.4. Samples: 39699240. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-12 13:15:14,333][65383] Avg episode reward: [(0, '0.008')] [2024-06-12 13:15:15,520][65616] Updated weights for policy 0, policy_version 2420 (0.0024) [2024-06-12 13:15:19,332][65383] Fps is (10 sec: 45875.1, 60 sec: 47240.5, 300 sec: 47097.1). Total num frames: 39780352. Throughput: 0: 47733.4. Samples: 39845480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 13:15:19,333][65383] Avg episode reward: [(0, '0.011')] [2024-06-12 13:15:19,646][65616] Updated weights for policy 0, policy_version 2430 (0.0023) [2024-06-12 13:15:22,426][65616] Updated weights for policy 0, policy_version 2440 (0.0036) [2024-06-12 13:15:24,332][65383] Fps is (10 sec: 45876.0, 60 sec: 47786.7, 300 sec: 47152.6). 
Total num frames: 40026112. Throughput: 0: 47628.6. Samples: 40129440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 13:15:24,333][65383] Avg episode reward: [(0, '0.011')] [2024-06-12 13:15:26,589][65616] Updated weights for policy 0, policy_version 2450 (0.0032) [2024-06-12 13:15:29,332][65383] Fps is (10 sec: 50790.8, 60 sec: 47786.7, 300 sec: 47319.2). Total num frames: 40288256. Throughput: 0: 47565.3. Samples: 40411160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-12 13:15:29,333][65383] Avg episode reward: [(0, '0.008')] [2024-06-12 13:15:29,366][65616] Updated weights for policy 0, policy_version 2460 (0.0037) [2024-06-12 13:15:33,422][65616] Updated weights for policy 0, policy_version 2470 (0.0031) [2024-06-12 13:15:34,332][65383] Fps is (10 sec: 49151.6, 60 sec: 47786.8, 300 sec: 47319.2). Total num frames: 40517632. Throughput: 0: 47571.1. Samples: 40550340. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-12 13:15:34,333][65383] Avg episode reward: [(0, '0.009')] [2024-06-12 13:15:36,503][65616] Updated weights for policy 0, policy_version 2480 (0.0031) [2024-06-12 13:15:39,332][65383] Fps is (10 sec: 42598.6, 60 sec: 47240.6, 300 sec: 47208.1). Total num frames: 40714240. Throughput: 0: 47689.8. Samples: 40848220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 13:15:39,333][65383] Avg episode reward: [(0, '0.010')] [2024-06-12 13:15:40,255][65616] Updated weights for policy 0, policy_version 2490 (0.0026) [2024-06-12 13:15:43,469][65616] Updated weights for policy 0, policy_version 2500 (0.0028) [2024-06-12 13:15:44,332][65383] Fps is (10 sec: 47513.7, 60 sec: 48059.7, 300 sec: 47374.8). Total num frames: 40992768. Throughput: 0: 47677.2. Samples: 41134040. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 13:15:44,333][65383] Avg episode reward: [(0, '0.009')] [2024-06-12 13:15:46,873][65616] Updated weights for policy 0, policy_version 2510 (0.0023) [2024-06-12 13:15:49,332][65383] Fps is (10 sec: 54066.4, 60 sec: 47786.6, 300 sec: 47374.8). Total num frames: 41254912. Throughput: 0: 48191.5. Samples: 41285140. Policy #0 lag: (min: 1.0, avg: 12.0, max: 26.0) [2024-06-12 13:15:49,333][65383] Avg episode reward: [(0, '0.008')] [2024-06-12 13:15:50,196][65616] Updated weights for policy 0, policy_version 2520 (0.0029) [2024-06-12 13:15:53,395][65595] Signal inference workers to stop experience collection... (550 times) [2024-06-12 13:15:53,404][65616] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-06-12 13:15:53,503][65595] Signal inference workers to resume experience collection... (550 times) [2024-06-12 13:15:53,504][65616] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-06-12 13:15:53,652][65616] Updated weights for policy 0, policy_version 2530 (0.0034) [2024-06-12 13:15:54,332][65383] Fps is (10 sec: 49151.9, 60 sec: 47786.6, 300 sec: 47430.3). Total num frames: 41484288. Throughput: 0: 48335.1. Samples: 41582880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 13:15:54,333][65383] Avg episode reward: [(0, '0.009')] [2024-06-12 13:15:54,461][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000002533_41500672.pth... [2024-06-12 13:15:54,504][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000001835_30064640.pth [2024-06-12 13:15:57,156][65616] Updated weights for policy 0, policy_version 2540 (0.0030) [2024-06-12 13:15:59,332][65383] Fps is (10 sec: 42598.7, 60 sec: 47513.5, 300 sec: 47263.7). Total num frames: 41680896. Throughput: 0: 47921.9. Samples: 41855720. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 13:15:59,333][65383] Avg episode reward: [(0, '0.012')] [2024-06-12 13:15:59,333][65595] Saving new best policy, reward=0.012! [2024-06-12 13:16:00,603][65616] Updated weights for policy 0, policy_version 2550 (0.0028) [2024-06-12 13:16:03,838][65616] Updated weights for policy 0, policy_version 2560 (0.0026) [2024-06-12 13:16:04,333][65383] Fps is (10 sec: 47513.2, 60 sec: 48332.7, 300 sec: 47430.3). Total num frames: 41959424. Throughput: 0: 47787.0. Samples: 41995900. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-12 13:16:04,333][65383] Avg episode reward: [(0, '0.009')] [2024-06-12 13:16:07,211][65616] Updated weights for policy 0, policy_version 2570 (0.0027) [2024-06-12 13:16:09,332][65383] Fps is (10 sec: 52428.9, 60 sec: 48059.8, 300 sec: 47485.8). Total num frames: 42205184. Throughput: 0: 47980.4. Samples: 42288560. Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-12 13:16:09,333][65383] Avg episode reward: [(0, '0.015')] [2024-06-12 13:16:09,418][65595] Saving new best policy, reward=0.015! [2024-06-12 13:16:10,726][65616] Updated weights for policy 0, policy_version 2580 (0.0030) [2024-06-12 13:16:13,967][65616] Updated weights for policy 0, policy_version 2590 (0.0027) [2024-06-12 13:16:14,333][65383] Fps is (10 sec: 49152.0, 60 sec: 48059.7, 300 sec: 47541.4). Total num frames: 42450944. Throughput: 0: 48231.0. Samples: 42581560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 13:16:14,333][65383] Avg episode reward: [(0, '0.008')] [2024-06-12 13:16:17,428][65616] Updated weights for policy 0, policy_version 2600 (0.0026) [2024-06-12 13:16:19,332][65383] Fps is (10 sec: 45875.2, 60 sec: 48059.8, 300 sec: 47430.3). Total num frames: 42663936. Throughput: 0: 48333.8. Samples: 42725360. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 13:16:19,333][65383] Avg episode reward: [(0, '0.010')] [2024-06-12 13:16:20,661][65616] Updated weights for policy 0, policy_version 2610 (0.0026) [2024-06-12 13:16:23,963][65616] Updated weights for policy 0, policy_version 2620 (0.0028) [2024-06-12 13:16:24,332][65383] Fps is (10 sec: 47514.0, 60 sec: 48332.7, 300 sec: 47596.9). Total num frames: 42926080. Throughput: 0: 48231.4. Samples: 43018640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-12 13:16:24,333][65383] Avg episode reward: [(0, '0.012')] [2024-06-12 13:16:27,496][65616] Updated weights for policy 0, policy_version 2630 (0.0035) [2024-06-12 13:16:29,332][65383] Fps is (10 sec: 52428.4, 60 sec: 48332.7, 300 sec: 47763.5). Total num frames: 43188224. Throughput: 0: 48306.6. Samples: 43307840. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-12 13:16:29,333][65383] Avg episode reward: [(0, '0.014')] [2024-06-12 13:16:30,980][65616] Updated weights for policy 0, policy_version 2640 (0.0034) [2024-06-12 13:16:34,226][65616] Updated weights for policy 0, policy_version 2650 (0.0036) [2024-06-12 13:16:34,332][65383] Fps is (10 sec: 49152.2, 60 sec: 48332.8, 300 sec: 47708.0). Total num frames: 43417600. Throughput: 0: 48272.1. Samples: 43457380. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-06-12 13:16:34,333][65383] Avg episode reward: [(0, '0.016')] [2024-06-12 13:16:37,761][65616] Updated weights for policy 0, policy_version 2660 (0.0024) [2024-06-12 13:16:39,332][65383] Fps is (10 sec: 44237.3, 60 sec: 48605.9, 300 sec: 47541.4). Total num frames: 43630592. Throughput: 0: 48228.1. Samples: 43753140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 13:16:39,333][65383] Avg episode reward: [(0, '0.009')] [2024-06-12 13:16:40,998][65616] Updated weights for policy 0, policy_version 2670 (0.0024) [2024-06-12 13:16:44,332][65383] Fps is (10 sec: 45874.9, 60 sec: 48059.7, 300 sec: 47596.9). 
Total num frames: 43876352. Throughput: 0: 48462.2. Samples: 44036520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 13:16:44,333][65383] Avg episode reward: [(0, '0.009')] [2024-06-12 13:16:44,613][65616] Updated weights for policy 0, policy_version 2680 (0.0036) [2024-06-12 13:16:47,663][65616] Updated weights for policy 0, policy_version 2690 (0.0025) [2024-06-12 13:16:49,332][65383] Fps is (10 sec: 54067.3, 60 sec: 48606.0, 300 sec: 47874.6). Total num frames: 44171264. Throughput: 0: 48687.3. Samples: 44186820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 13:16:49,333][65383] Avg episode reward: [(0, '0.006')] [2024-06-12 13:16:51,402][65616] Updated weights for policy 0, policy_version 2700 (0.0025) [2024-06-12 13:16:53,757][65595] Signal inference workers to stop experience collection... (600 times) [2024-06-12 13:16:53,792][65616] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-06-12 13:16:53,812][65595] Signal inference workers to resume experience collection... (600 times) [2024-06-12 13:16:53,813][65616] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-06-12 13:16:54,332][65383] Fps is (10 sec: 50790.9, 60 sec: 48332.8, 300 sec: 47708.0). Total num frames: 44384256. Throughput: 0: 48800.0. Samples: 44484560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 13:16:54,333][65383] Avg episode reward: [(0, '0.008')] [2024-06-12 13:16:54,357][65616] Updated weights for policy 0, policy_version 2710 (0.0027) [2024-06-12 13:16:57,827][65616] Updated weights for policy 0, policy_version 2720 (0.0030) [2024-06-12 13:16:59,332][65383] Fps is (10 sec: 44236.6, 60 sec: 48879.0, 300 sec: 47596.9). Total num frames: 44613632. Throughput: 0: 48840.1. Samples: 44779360. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-12 13:16:59,333][65383] Avg episode reward: [(0, '0.014')] [2024-06-12 13:17:01,178][65616] Updated weights for policy 0, policy_version 2730 (0.0027) [2024-06-12 13:17:04,332][65383] Fps is (10 sec: 47513.0, 60 sec: 48332.8, 300 sec: 47652.4). Total num frames: 44859392. Throughput: 0: 48743.5. Samples: 44918820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-12 13:17:04,333][65383] Avg episode reward: [(0, '0.009')] [2024-06-12 13:17:04,792][65616] Updated weights for policy 0, policy_version 2740 (0.0037) [2024-06-12 13:17:07,712][65616] Updated weights for policy 0, policy_version 2750 (0.0027) [2024-06-12 13:17:09,332][65383] Fps is (10 sec: 50790.5, 60 sec: 48605.9, 300 sec: 47874.6). Total num frames: 45121536. Throughput: 0: 48596.1. Samples: 45205460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 13:17:09,333][65383] Avg episode reward: [(0, '0.012')] [2024-06-12 13:17:11,677][65616] Updated weights for policy 0, policy_version 2760 (0.0039) [2024-06-12 13:17:14,332][65383] Fps is (10 sec: 49152.3, 60 sec: 48332.9, 300 sec: 47708.0). Total num frames: 45350912. Throughput: 0: 48431.2. Samples: 45487240. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-12 13:17:14,333][65383] Avg episode reward: [(0, '0.009')] [2024-06-12 13:17:14,819][65616] Updated weights for policy 0, policy_version 2770 (0.0029) [2024-06-12 13:17:18,748][65616] Updated weights for policy 0, policy_version 2780 (0.0043) [2024-06-12 13:17:19,332][65383] Fps is (10 sec: 44236.4, 60 sec: 48332.7, 300 sec: 47596.9). Total num frames: 45563904. Throughput: 0: 48187.5. Samples: 45625820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 13:17:19,333][65383] Avg episode reward: [(0, '0.009')] [2024-06-12 13:17:21,485][65616] Updated weights for policy 0, policy_version 2790 (0.0026) [2024-06-12 13:17:24,332][65383] Fps is (10 sec: 47513.7, 60 sec: 48332.8, 300 sec: 47819.1). 
Total num frames: 45826048. Throughput: 0: 48121.3. Samples: 45918600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 13:17:24,333][65383] Avg episode reward: [(0, '0.013')] [2024-06-12 13:17:25,643][65616] Updated weights for policy 0, policy_version 2800 (0.0036) [2024-06-12 13:17:28,497][65616] Updated weights for policy 0, policy_version 2810 (0.0038) [2024-06-12 13:17:29,332][65383] Fps is (10 sec: 50790.4, 60 sec: 48059.7, 300 sec: 47874.8). Total num frames: 46071808. Throughput: 0: 48083.6. Samples: 46200280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-12 13:17:29,333][65383] Avg episode reward: [(0, '0.007')] [2024-06-12 13:17:32,702][65616] Updated weights for policy 0, policy_version 2820 (0.0036) [2024-06-12 13:17:34,333][65383] Fps is (10 sec: 47510.6, 60 sec: 48059.2, 300 sec: 47707.9). Total num frames: 46301184. Throughput: 0: 48079.3. Samples: 46350420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 13:17:34,334][65383] Avg episode reward: [(0, '0.009')] [2024-06-12 13:17:35,554][65616] Updated weights for policy 0, policy_version 2830 (0.0041) [2024-06-12 13:17:39,023][65616] Updated weights for policy 0, policy_version 2840 (0.0025) [2024-06-12 13:17:39,332][65383] Fps is (10 sec: 45875.4, 60 sec: 48332.8, 300 sec: 47763.5). Total num frames: 46530560. Throughput: 0: 47882.2. Samples: 46639260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 13:17:39,333][65383] Avg episode reward: [(0, '0.012')] [2024-06-12 13:17:42,357][65616] Updated weights for policy 0, policy_version 2850 (0.0030) [2024-06-12 13:17:44,333][65383] Fps is (10 sec: 49150.7, 60 sec: 48605.2, 300 sec: 47874.4). Total num frames: 46792704. Throughput: 0: 47741.7. Samples: 46927780. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 13:17:44,334][65383] Avg episode reward: [(0, '0.011')] [2024-06-12 13:17:46,112][65616] Updated weights for policy 0, policy_version 2860 (0.0028) [2024-06-12 13:17:49,191][65616] Updated weights for policy 0, policy_version 2870 (0.0026) [2024-06-12 13:17:49,332][65383] Fps is (10 sec: 49152.1, 60 sec: 47513.5, 300 sec: 47763.5). Total num frames: 47022080. Throughput: 0: 47944.1. Samples: 47076300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-12 13:17:49,333][65383] Avg episode reward: [(0, '0.010')] [2024-06-12 13:17:52,791][65616] Updated weights for policy 0, policy_version 2880 (0.0023) [2024-06-12 13:17:54,332][65383] Fps is (10 sec: 47517.8, 60 sec: 48059.7, 300 sec: 47819.1). Total num frames: 47267840. Throughput: 0: 48025.7. Samples: 47366620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 13:17:54,336][65383] Avg episode reward: [(0, '0.011')] [2024-06-12 13:17:54,416][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000002886_47284224.pth... [2024-06-12 13:17:54,469][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000002183_35766272.pth [2024-06-12 13:17:55,665][65616] Updated weights for policy 0, policy_version 2890 (0.0031) [2024-06-12 13:17:59,214][65616] Updated weights for policy 0, policy_version 2900 (0.0029) [2024-06-12 13:17:59,332][65383] Fps is (10 sec: 49151.6, 60 sec: 48332.7, 300 sec: 47874.6). Total num frames: 47513600. Throughput: 0: 48212.4. Samples: 47656800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 13:17:59,333][65383] Avg episode reward: [(0, '0.005')] [2024-06-12 13:18:02,673][65616] Updated weights for policy 0, policy_version 2910 (0.0028) [2024-06-12 13:18:04,332][65383] Fps is (10 sec: 47513.8, 60 sec: 48059.8, 300 sec: 47985.7). Total num frames: 47742976. Throughput: 0: 48324.1. Samples: 47800400. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-12 13:18:04,333][65383] Avg episode reward: [(0, '0.008')] [2024-06-12 13:18:06,278][65616] Updated weights for policy 0, policy_version 2920 (0.0035) [2024-06-12 13:18:09,332][65383] Fps is (10 sec: 49152.3, 60 sec: 48059.7, 300 sec: 48041.2). Total num frames: 48005120. Throughput: 0: 48073.3. Samples: 48081900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 13:18:09,333][65383] Avg episode reward: [(0, '0.016')] [2024-06-12 13:18:09,338][65616] Updated weights for policy 0, policy_version 2930 (0.0039) [2024-06-12 13:18:12,934][65616] Updated weights for policy 0, policy_version 2940 (0.0028) [2024-06-12 13:18:14,332][65383] Fps is (10 sec: 50790.4, 60 sec: 48332.8, 300 sec: 48041.2). Total num frames: 48250880. Throughput: 0: 48479.6. Samples: 48381860. Policy #0 lag: (min: 1.0, avg: 9.3, max: 23.0) [2024-06-12 13:18:14,333][65383] Avg episode reward: [(0, '0.014')] [2024-06-12 13:18:16,064][65616] Updated weights for policy 0, policy_version 2950 (0.0031) [2024-06-12 13:18:19,332][65383] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 48096.7). Total num frames: 48480256. Throughput: 0: 48418.5. Samples: 48529220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 13:18:19,333][65383] Avg episode reward: [(0, '0.016')] [2024-06-12 13:18:19,523][65616] Updated weights for policy 0, policy_version 2960 (0.0035) [2024-06-12 13:18:19,990][65595] Signal inference workers to stop experience collection... (650 times) [2024-06-12 13:18:20,043][65616] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-06-12 13:18:20,101][65595] Signal inference workers to resume experience collection... 
(650 times) [2024-06-12 13:18:20,101][65616] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-06-12 13:18:22,758][65616] Updated weights for policy 0, policy_version 2970 (0.0022) [2024-06-12 13:18:24,334][65383] Fps is (10 sec: 49143.1, 60 sec: 48604.4, 300 sec: 48040.9). Total num frames: 48742400. Throughput: 0: 48590.5. Samples: 48825920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 13:18:24,335][65383] Avg episode reward: [(0, '0.013')] [2024-06-12 13:18:25,966][65616] Updated weights for policy 0, policy_version 2980 (0.0034) [2024-06-12 13:18:29,332][65383] Fps is (10 sec: 49152.5, 60 sec: 48333.0, 300 sec: 47985.7). Total num frames: 48971776. Throughput: 0: 48805.1. Samples: 49123960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 13:18:29,333][65383] Avg episode reward: [(0, '0.014')] [2024-06-12 13:18:29,541][65616] Updated weights for policy 0, policy_version 2990 (0.0032) [2024-06-12 13:18:32,600][65616] Updated weights for policy 0, policy_version 3000 (0.0024) [2024-06-12 13:18:34,332][65383] Fps is (10 sec: 49161.0, 60 sec: 48879.5, 300 sec: 48152.3). Total num frames: 49233920. Throughput: 0: 48759.6. Samples: 49270480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 13:18:34,333][65383] Avg episode reward: [(0, '0.008')] [2024-06-12 13:18:36,416][65616] Updated weights for policy 0, policy_version 3010 (0.0029) [2024-06-12 13:18:39,267][65616] Updated weights for policy 0, policy_version 3020 (0.0026) [2024-06-12 13:18:39,332][65383] Fps is (10 sec: 50789.5, 60 sec: 49152.0, 300 sec: 48263.4). Total num frames: 49479680. Throughput: 0: 48768.0. Samples: 49561180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 13:18:39,333][65383] Avg episode reward: [(0, '0.012')] [2024-06-12 13:18:42,823][65616] Updated weights for policy 0, policy_version 3030 (0.0034) [2024-06-12 13:18:44,332][65383] Fps is (10 sec: 47513.6, 60 sec: 48606.6, 300 sec: 48152.3). Total num frames: 49709056. 
Throughput: 0: 48993.9. Samples: 49861520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-12 13:18:44,333][65383] Avg episode reward: [(0, '0.012')] [2024-06-12 13:18:45,822][65616] Updated weights for policy 0, policy_version 3040 (0.0024) [2024-06-12 13:18:49,333][65383] Fps is (10 sec: 47511.1, 60 sec: 48878.5, 300 sec: 48152.2). Total num frames: 49954816. Throughput: 0: 48780.2. Samples: 49995540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-12 13:18:49,334][65383] Avg episode reward: [(0, '0.017')] [2024-06-12 13:18:49,334][65595] Saving new best policy, reward=0.017! [2024-06-12 13:18:49,891][65616] Updated weights for policy 0, policy_version 3050 (0.0029) [2024-06-12 13:18:52,482][65616] Updated weights for policy 0, policy_version 3060 (0.0026) [2024-06-12 13:18:54,332][65383] Fps is (10 sec: 50790.0, 60 sec: 49152.0, 300 sec: 48263.4). Total num frames: 50216960. Throughput: 0: 49260.4. Samples: 50298620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 13:18:54,333][65383] Avg episode reward: [(0, '0.011')] [2024-06-12 13:18:56,291][65616] Updated weights for policy 0, policy_version 3070 (0.0030) [2024-06-12 13:18:59,332][65383] Fps is (10 sec: 49154.8, 60 sec: 48879.0, 300 sec: 48318.9). Total num frames: 50446336. Throughput: 0: 49037.3. Samples: 50588540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 13:18:59,333][65383] Avg episode reward: [(0, '0.018')] [2024-06-12 13:18:59,492][65616] Updated weights for policy 0, policy_version 3080 (0.0025) [2024-06-12 13:19:03,190][65616] Updated weights for policy 0, policy_version 3090 (0.0028) [2024-06-12 13:19:04,332][65383] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 48263.5). Total num frames: 50692096. Throughput: 0: 49161.8. Samples: 50741500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 13:19:04,333][65383] Avg episode reward: [(0, '0.020')] [2024-06-12 13:19:04,350][65595] Saving new best policy, reward=0.020! 
[2024-06-12 13:19:05,916][65616] Updated weights for policy 0, policy_version 3100 (0.0025) [2024-06-12 13:19:09,332][65383] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 48263.4). Total num frames: 50937856. Throughput: 0: 49068.2. Samples: 51033900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 13:19:09,333][65383] Avg episode reward: [(0, '0.011')] [2024-06-12 13:19:09,752][65616] Updated weights for policy 0, policy_version 3110 (0.0033) [2024-06-12 13:19:12,847][65616] Updated weights for policy 0, policy_version 3120 (0.0026) [2024-06-12 13:19:14,332][65383] Fps is (10 sec: 49151.3, 60 sec: 48878.9, 300 sec: 48263.4). Total num frames: 51183616. Throughput: 0: 48821.1. Samples: 51320920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 13:19:14,333][65383] Avg episode reward: [(0, '0.010')] [2024-06-12 13:19:16,712][65616] Updated weights for policy 0, policy_version 3130 (0.0026) [2024-06-12 13:19:19,232][65616] Updated weights for policy 0, policy_version 3140 (0.0027) [2024-06-12 13:19:19,332][65383] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 48430.0). Total num frames: 51445760. Throughput: 0: 49010.1. Samples: 51475940. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-12 13:19:19,333][65383] Avg episode reward: [(0, '0.011')] [2024-06-12 13:19:22,946][65616] Updated weights for policy 0, policy_version 3150 (0.0030) [2024-06-12 13:19:24,332][65383] Fps is (10 sec: 47514.1, 60 sec: 48607.4, 300 sec: 48263.4). Total num frames: 51658752. Throughput: 0: 48966.3. Samples: 51764660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 13:19:24,333][65383] Avg episode reward: [(0, '0.014')] [2024-06-12 13:19:26,012][65616] Updated weights for policy 0, policy_version 3160 (0.0030) [2024-06-12 13:19:29,332][65383] Fps is (10 sec: 45875.5, 60 sec: 48878.8, 300 sec: 48318.9). Total num frames: 51904512. Throughput: 0: 48816.4. Samples: 52058260. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 13:19:29,333][65383] Avg episode reward: [(0, '0.014')] [2024-06-12 13:19:29,610][65616] Updated weights for policy 0, policy_version 3170 (0.0022) [2024-06-12 13:19:33,142][65616] Updated weights for policy 0, policy_version 3180 (0.0028) [2024-06-12 13:19:34,332][65383] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 48485.6). Total num frames: 52183040. Throughput: 0: 49149.6. Samples: 52207240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-12 13:19:34,333][65383] Avg episode reward: [(0, '0.016')] [2024-06-12 13:19:36,361][65616] Updated weights for policy 0, policy_version 3190 (0.0034) [2024-06-12 13:19:39,332][65383] Fps is (10 sec: 49151.9, 60 sec: 48605.9, 300 sec: 48430.0). Total num frames: 52396032. Throughput: 0: 48931.1. Samples: 52500520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 13:19:39,333][65383] Avg episode reward: [(0, '0.020')] [2024-06-12 13:19:39,638][65595] Signal inference workers to stop experience collection... (700 times) [2024-06-12 13:19:39,639][65595] Signal inference workers to resume experience collection... (700 times) [2024-06-12 13:19:39,656][65616] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-06-12 13:19:39,656][65616] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-06-12 13:19:39,775][65616] Updated weights for policy 0, policy_version 3200 (0.0031) [2024-06-12 13:19:42,993][65616] Updated weights for policy 0, policy_version 3210 (0.0024) [2024-06-12 13:19:44,333][65383] Fps is (10 sec: 45874.2, 60 sec: 48878.8, 300 sec: 48318.9). Total num frames: 52641792. Throughput: 0: 49045.2. Samples: 52795580. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-12 13:19:44,333][65383] Avg episode reward: [(0, '0.015')] [2024-06-12 13:19:46,510][65616] Updated weights for policy 0, policy_version 3220 (0.0028) [2024-06-12 13:19:49,332][65383] Fps is (10 sec: 49152.0, 60 sec: 48879.4, 300 sec: 48374.4). Total num frames: 52887552. Throughput: 0: 48736.3. Samples: 52934640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 13:19:49,333][65383] Avg episode reward: [(0, '0.012')] [2024-06-12 13:19:50,000][65616] Updated weights for policy 0, policy_version 3230 (0.0032) [2024-06-12 13:19:53,323][65616] Updated weights for policy 0, policy_version 3240 (0.0032) [2024-06-12 13:19:54,332][65383] Fps is (10 sec: 52430.0, 60 sec: 49152.1, 300 sec: 48596.6). Total num frames: 53166080. Throughput: 0: 48957.4. Samples: 53236980. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-12 13:19:54,333][65383] Avg episode reward: [(0, '0.015')] [2024-06-12 13:19:54,341][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000003245_53166080.pth... [2024-06-12 13:19:54,395][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000002533_41500672.pth [2024-06-12 13:19:56,159][65616] Updated weights for policy 0, policy_version 3250 (0.0029) [2024-06-12 13:19:59,332][65383] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 48541.1). Total num frames: 53379072. Throughput: 0: 49029.9. Samples: 53527260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 13:19:59,333][65383] Avg episode reward: [(0, '0.014')] [2024-06-12 13:19:59,824][65616] Updated weights for policy 0, policy_version 3260 (0.0027) [2024-06-12 13:20:03,023][65616] Updated weights for policy 0, policy_version 3270 (0.0034) [2024-06-12 13:20:04,332][65383] Fps is (10 sec: 45875.1, 60 sec: 48878.9, 300 sec: 48485.5). Total num frames: 53624832. Throughput: 0: 48810.8. Samples: 53672420. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 13:20:04,333][65383] Avg episode reward: [(0, '0.016')] [2024-06-12 13:20:06,589][65616] Updated weights for policy 0, policy_version 3280 (0.0029) [2024-06-12 13:20:09,332][65383] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 48485.6). Total num frames: 53870592. Throughput: 0: 48927.1. Samples: 53966380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 13:20:09,333][65383] Avg episode reward: [(0, '0.012')] [2024-06-12 13:20:09,505][65616] Updated weights for policy 0, policy_version 3290 (0.0025) [2024-06-12 13:20:13,187][65616] Updated weights for policy 0, policy_version 3300 (0.0026) [2024-06-12 13:20:14,332][65383] Fps is (10 sec: 50790.6, 60 sec: 49152.1, 300 sec: 48652.2). Total num frames: 54132736. Throughput: 0: 48956.1. Samples: 54261280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 13:20:14,333][65383] Avg episode reward: [(0, '0.028')] [2024-06-12 13:20:14,357][65595] Saving new best policy, reward=0.028! [2024-06-12 13:20:16,293][65616] Updated weights for policy 0, policy_version 3310 (0.0035) [2024-06-12 13:20:19,332][65383] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 48596.6). Total num frames: 54362112. Throughput: 0: 48948.4. Samples: 54409920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 13:20:19,333][65383] Avg episode reward: [(0, '0.018')] [2024-06-12 13:20:19,817][65616] Updated weights for policy 0, policy_version 3320 (0.0032) [2024-06-12 13:20:22,963][65616] Updated weights for policy 0, policy_version 3330 (0.0034) [2024-06-12 13:20:24,332][65383] Fps is (10 sec: 45875.3, 60 sec: 48879.0, 300 sec: 48485.5). Total num frames: 54591488. Throughput: 0: 48936.6. Samples: 54702660. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 13:20:24,333][65383] Avg episode reward: [(0, '0.019')] [2024-06-12 13:20:26,606][65616] Updated weights for policy 0, policy_version 3340 (0.0026) [2024-06-12 13:20:29,332][65383] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 48596.6). Total num frames: 54853632. Throughput: 0: 48901.5. Samples: 54996140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 13:20:29,333][65383] Avg episode reward: [(0, '0.013')] [2024-06-12 13:20:29,483][65616] Updated weights for policy 0, policy_version 3350 (0.0030) [2024-06-12 13:20:33,043][65616] Updated weights for policy 0, policy_version 3360 (0.0028) [2024-06-12 13:20:34,332][65383] Fps is (10 sec: 50789.6, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 55099392. Throughput: 0: 49203.1. Samples: 55148780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 13:20:34,333][65383] Avg episode reward: [(0, '0.013')] [2024-06-12 13:20:36,065][65616] Updated weights for policy 0, policy_version 3370 (0.0027) [2024-06-12 13:20:39,332][65383] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 48652.1). Total num frames: 55345152. Throughput: 0: 49024.7. Samples: 55443100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 13:20:39,333][65383] Avg episode reward: [(0, '0.020')] [2024-06-12 13:20:39,614][65616] Updated weights for policy 0, policy_version 3380 (0.0035) [2024-06-12 13:20:42,890][65616] Updated weights for policy 0, policy_version 3390 (0.0034) [2024-06-12 13:20:44,332][65383] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 48596.6). Total num frames: 55590912. Throughput: 0: 48964.8. Samples: 55730680. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 13:20:44,333][65383] Avg episode reward: [(0, '0.014')] [2024-06-12 13:20:46,541][65616] Updated weights for policy 0, policy_version 3400 (0.0037) [2024-06-12 13:20:49,330][65616] Updated weights for policy 0, policy_version 3410 (0.0023) [2024-06-12 13:20:49,332][65383] Fps is (10 sec: 52428.9, 60 sec: 49698.1, 300 sec: 48763.2). Total num frames: 55869440. Throughput: 0: 49096.8. Samples: 55881780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 13:20:49,333][65383] Avg episode reward: [(0, '0.022')] [2024-06-12 13:20:53,226][65616] Updated weights for policy 0, policy_version 3420 (0.0024) [2024-06-12 13:20:54,332][65383] Fps is (10 sec: 49151.9, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 56082432. Throughput: 0: 49309.7. Samples: 56185320. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-12 13:20:54,333][65383] Avg episode reward: [(0, '0.018')] [2024-06-12 13:20:55,769][65595] Signal inference workers to stop experience collection... (750 times) [2024-06-12 13:20:55,807][65616] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-06-12 13:20:55,878][65595] Signal inference workers to resume experience collection... (750 times) [2024-06-12 13:20:55,878][65616] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-06-12 13:20:56,009][65616] Updated weights for policy 0, policy_version 3430 (0.0027) [2024-06-12 13:20:59,332][65383] Fps is (10 sec: 45875.5, 60 sec: 49152.0, 300 sec: 48707.7). Total num frames: 56328192. Throughput: 0: 49228.4. Samples: 56476560. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 13:20:59,333][65383] Avg episode reward: [(0, '0.016')] [2024-06-12 13:20:59,595][65616] Updated weights for policy 0, policy_version 3440 (0.0023) [2024-06-12 13:21:02,628][65616] Updated weights for policy 0, policy_version 3450 (0.0019) [2024-06-12 13:21:04,332][65383] Fps is (10 sec: 52429.0, 60 sec: 49698.1, 300 sec: 48818.8). Total num frames: 56606720. Throughput: 0: 49320.9. Samples: 56629360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 13:21:04,333][65383] Avg episode reward: [(0, '0.018')] [2024-06-12 13:21:06,145][65616] Updated weights for policy 0, policy_version 3460 (0.0029) [2024-06-12 13:21:08,923][65616] Updated weights for policy 0, policy_version 3470 (0.0026) [2024-06-12 13:21:09,332][65383] Fps is (10 sec: 54067.2, 60 sec: 49971.2, 300 sec: 48874.3). Total num frames: 56868864. Throughput: 0: 49752.3. Samples: 56941520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 13:21:09,333][65383] Avg episode reward: [(0, '0.015')] [2024-06-12 13:21:12,708][65616] Updated weights for policy 0, policy_version 3480 (0.0028) [2024-06-12 13:21:14,332][65383] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 57081856. Throughput: 0: 49922.2. Samples: 57242640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 13:21:14,333][65383] Avg episode reward: [(0, '0.012')] [2024-06-12 13:21:15,678][65616] Updated weights for policy 0, policy_version 3490 (0.0027) [2024-06-12 13:21:19,332][65383] Fps is (10 sec: 45875.4, 60 sec: 49425.1, 300 sec: 48818.8). Total num frames: 57327616. Throughput: 0: 49410.8. Samples: 57372260. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-12 13:21:19,333][65383] Avg episode reward: [(0, '0.023')] [2024-06-12 13:21:19,418][65616] Updated weights for policy 0, policy_version 3500 (0.0034) [2024-06-12 13:21:22,266][65616] Updated weights for policy 0, policy_version 3510 (0.0030) [2024-06-12 13:21:24,332][65383] Fps is (10 sec: 50790.5, 60 sec: 49971.1, 300 sec: 48818.8). Total num frames: 57589760. Throughput: 0: 49527.2. Samples: 57671820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 13:21:24,333][65383] Avg episode reward: [(0, '0.024')] [2024-06-12 13:21:26,130][65616] Updated weights for policy 0, policy_version 3520 (0.0027) [2024-06-12 13:21:28,950][65616] Updated weights for policy 0, policy_version 3530 (0.0028) [2024-06-12 13:21:29,332][65383] Fps is (10 sec: 52428.6, 60 sec: 49971.2, 300 sec: 48929.8). Total num frames: 57851904. Throughput: 0: 49821.3. Samples: 57972640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-12 13:21:29,333][65383] Avg episode reward: [(0, '0.017')] [2024-06-12 13:21:32,931][65616] Updated weights for policy 0, policy_version 3540 (0.0029) [2024-06-12 13:21:34,332][65383] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 48985.4). Total num frames: 58081280. Throughput: 0: 49832.5. Samples: 58124240. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-12 13:21:34,333][65383] Avg episode reward: [(0, '0.025')] [2024-06-12 13:21:35,623][65616] Updated weights for policy 0, policy_version 3550 (0.0027) [2024-06-12 13:21:39,332][65383] Fps is (10 sec: 45875.0, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 58310656. Throughput: 0: 49689.8. Samples: 58421360. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-12 13:21:39,333][65383] Avg episode reward: [(0, '0.018')] [2024-06-12 13:21:39,613][65616] Updated weights for policy 0, policy_version 3560 (0.0030) [2024-06-12 13:21:42,163][65616] Updated weights for policy 0, policy_version 3570 (0.0021) [2024-06-12 13:21:44,333][65383] Fps is (10 sec: 49151.3, 60 sec: 49698.0, 300 sec: 48818.7). Total num frames: 58572800. Throughput: 0: 49833.2. Samples: 58719060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 13:21:44,333][65383] Avg episode reward: [(0, '0.019')] [2024-06-12 13:21:46,219][65616] Updated weights for policy 0, policy_version 3580 (0.0025) [2024-06-12 13:21:48,981][65616] Updated weights for policy 0, policy_version 3590 (0.0031) [2024-06-12 13:21:49,333][65383] Fps is (10 sec: 54066.9, 60 sec: 49698.1, 300 sec: 49040.9). Total num frames: 58851328. Throughput: 0: 49829.7. Samples: 58871700. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 13:21:49,333][65383] Avg episode reward: [(0, '0.023')] [2024-06-12 13:21:52,809][65616] Updated weights for policy 0, policy_version 3600 (0.0026) [2024-06-12 13:21:54,332][65383] Fps is (10 sec: 49152.7, 60 sec: 49698.2, 300 sec: 48985.4). Total num frames: 59064320. Throughput: 0: 49547.6. Samples: 59171160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 13:21:54,333][65383] Avg episode reward: [(0, '0.016')] [2024-06-12 13:21:54,349][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000003605_59064320.pth... [2024-06-12 13:21:54,401][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000002886_47284224.pth [2024-06-12 13:21:55,261][65616] Updated weights for policy 0, policy_version 3610 (0.0032) [2024-06-12 13:21:56,020][65595] Signal inference workers to stop experience collection... (800 times) [2024-06-12 13:21:56,020][65595] Signal inference workers to resume experience collection... 
(800 times) [2024-06-12 13:21:56,060][65616] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-06-12 13:21:56,061][65616] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-06-12 13:21:59,308][65616] Updated weights for policy 0, policy_version 3620 (0.0034) [2024-06-12 13:21:59,332][65383] Fps is (10 sec: 45875.8, 60 sec: 49698.1, 300 sec: 48985.4). Total num frames: 59310080. Throughput: 0: 49436.9. Samples: 59467300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 13:21:59,333][65383] Avg episode reward: [(0, '0.012')] [2024-06-12 13:22:02,139][65616] Updated weights for policy 0, policy_version 3630 (0.0025) [2024-06-12 13:22:04,332][65383] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 59572224. Throughput: 0: 49772.5. Samples: 59612020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 13:22:04,333][65383] Avg episode reward: [(0, '0.028')] [2024-06-12 13:22:05,996][65616] Updated weights for policy 0, policy_version 3640 (0.0030) [2024-06-12 13:22:08,857][65616] Updated weights for policy 0, policy_version 3650 (0.0041) [2024-06-12 13:22:09,332][65383] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 59817984. Throughput: 0: 49709.8. Samples: 59908760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 13:22:09,333][65383] Avg episode reward: [(0, '0.012')] [2024-06-12 13:22:12,655][65616] Updated weights for policy 0, policy_version 3660 (0.0023) [2024-06-12 13:22:14,332][65383] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 60047360. Throughput: 0: 49618.3. Samples: 60205460. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 13:22:14,333][65383] Avg episode reward: [(0, '0.020')] [2024-06-12 13:22:15,279][65616] Updated weights for policy 0, policy_version 3670 (0.0023) [2024-06-12 13:22:18,705][65616] Updated weights for policy 0, policy_version 3680 (0.0027) [2024-06-12 13:22:19,332][65383] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 60309504. Throughput: 0: 49628.0. Samples: 60357500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 13:22:19,333][65383] Avg episode reward: [(0, '0.016')] [2024-06-12 13:22:21,784][65616] Updated weights for policy 0, policy_version 3690 (0.0030) [2024-06-12 13:22:24,332][65383] Fps is (10 sec: 52428.2, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 60571648. Throughput: 0: 49746.6. Samples: 60659960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 13:22:24,333][65383] Avg episode reward: [(0, '0.017')] [2024-06-12 13:22:25,188][65616] Updated weights for policy 0, policy_version 3700 (0.0034) [2024-06-12 13:22:28,483][65616] Updated weights for policy 0, policy_version 3710 (0.0032) [2024-06-12 13:22:29,332][65383] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49152.1). Total num frames: 60801024. Throughput: 0: 49709.9. Samples: 60956000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-06-12 13:22:29,333][65383] Avg episode reward: [(0, '0.020')] [2024-06-12 13:22:31,801][65616] Updated weights for policy 0, policy_version 3720 (0.0024) [2024-06-12 13:22:34,332][65383] Fps is (10 sec: 50790.7, 60 sec: 49971.2, 300 sec: 49318.6). Total num frames: 61079552. Throughput: 0: 49780.6. Samples: 61111820. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-12 13:22:34,333][65383] Avg episode reward: [(0, '0.024')] [2024-06-12 13:22:35,198][65616] Updated weights for policy 0, policy_version 3730 (0.0023) [2024-06-12 13:22:38,281][65616] Updated weights for policy 0, policy_version 3740 (0.0030) [2024-06-12 13:22:39,332][65383] Fps is (10 sec: 50790.5, 60 sec: 49971.3, 300 sec: 49207.7). Total num frames: 61308928. Throughput: 0: 49678.2. Samples: 61406680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-12 13:22:39,333][65383] Avg episode reward: [(0, '0.020')] [2024-06-12 13:22:42,022][65616] Updated weights for policy 0, policy_version 3750 (0.0024) [2024-06-12 13:22:44,332][65383] Fps is (10 sec: 45875.1, 60 sec: 49425.2, 300 sec: 49207.5). Total num frames: 61538304. Throughput: 0: 49548.9. Samples: 61697000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 13:22:44,333][65383] Avg episode reward: [(0, '0.017')] [2024-06-12 13:22:45,071][65616] Updated weights for policy 0, policy_version 3760 (0.0027) [2024-06-12 13:22:48,369][65616] Updated weights for policy 0, policy_version 3770 (0.0026) [2024-06-12 13:22:49,332][65383] Fps is (10 sec: 49151.7, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 61800448. Throughput: 0: 49756.4. Samples: 61851060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-12 13:22:49,333][65383] Avg episode reward: [(0, '0.014')] [2024-06-12 13:22:51,546][65616] Updated weights for policy 0, policy_version 3780 (0.0027) [2024-06-12 13:22:54,332][65383] Fps is (10 sec: 50790.4, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 62046208. Throughput: 0: 49922.7. Samples: 62155280. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-12 13:22:54,333][65383] Avg episode reward: [(0, '0.022')] [2024-06-12 13:22:55,638][65616] Updated weights for policy 0, policy_version 3790 (0.0041) [2024-06-12 13:22:57,983][65616] Updated weights for policy 0, policy_version 3800 (0.0027) [2024-06-12 13:22:59,332][65383] Fps is (10 sec: 50790.6, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 62308352. Throughput: 0: 49753.7. Samples: 62444380. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-12 13:22:59,333][65383] Avg episode reward: [(0, '0.019')] [2024-06-12 13:23:01,998][65616] Updated weights for policy 0, policy_version 3810 (0.0036) [2024-06-12 13:23:04,332][65383] Fps is (10 sec: 52429.1, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 62570496. Throughput: 0: 49798.8. Samples: 62598440. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-12 13:23:04,333][65383] Avg episode reward: [(0, '0.016')] [2024-06-12 13:23:04,378][65616] Updated weights for policy 0, policy_version 3820 (0.0028) [2024-06-12 13:23:08,553][65616] Updated weights for policy 0, policy_version 3830 (0.0023) [2024-06-12 13:23:09,332][65383] Fps is (10 sec: 49151.6, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 62799872. Throughput: 0: 49897.8. Samples: 62905360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 13:23:09,333][65383] Avg episode reward: [(0, '0.011')] [2024-06-12 13:23:10,687][65616] Updated weights for policy 0, policy_version 3840 (0.0023) [2024-06-12 13:23:12,736][65595] Signal inference workers to stop experience collection... (850 times) [2024-06-12 13:23:12,777][65616] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-06-12 13:23:12,783][65595] Signal inference workers to resume experience collection... 
(850 times) [2024-06-12 13:23:12,789][65616] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-06-12 13:23:14,332][65383] Fps is (10 sec: 45874.8, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 63029248. Throughput: 0: 49878.2. Samples: 63200520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 13:23:14,333][65383] Avg episode reward: [(0, '0.028')] [2024-06-12 13:23:14,925][65616] Updated weights for policy 0, policy_version 3850 (0.0030) [2024-06-12 13:23:17,376][65616] Updated weights for policy 0, policy_version 3860 (0.0026) [2024-06-12 13:23:19,332][65383] Fps is (10 sec: 50791.2, 60 sec: 49971.3, 300 sec: 49374.5). Total num frames: 63307776. Throughput: 0: 49552.6. Samples: 63341680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 13:23:19,333][65383] Avg episode reward: [(0, '0.019')] [2024-06-12 13:23:21,786][65616] Updated weights for policy 0, policy_version 3870 (0.0032) [2024-06-12 13:23:23,911][65616] Updated weights for policy 0, policy_version 3880 (0.0022) [2024-06-12 13:23:24,332][65383] Fps is (10 sec: 54067.5, 60 sec: 49971.3, 300 sec: 49485.2). Total num frames: 63569920. Throughput: 0: 49738.7. Samples: 63644920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-12 13:23:24,333][65383] Avg episode reward: [(0, '0.022')] [2024-06-12 13:23:28,244][65616] Updated weights for policy 0, policy_version 3890 (0.0024) [2024-06-12 13:23:29,332][65383] Fps is (10 sec: 47513.4, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 63782912. Throughput: 0: 50071.2. Samples: 63950200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 13:23:29,333][65383] Avg episode reward: [(0, '0.025')] [2024-06-12 13:23:30,539][65616] Updated weights for policy 0, policy_version 3900 (0.0023) [2024-06-12 13:23:34,332][65383] Fps is (10 sec: 45874.7, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 64028672. Throughput: 0: 49799.5. Samples: 64092040. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 13:23:34,333][65383] Avg episode reward: [(0, '0.019')] [2024-06-12 13:23:34,767][65616] Updated weights for policy 0, policy_version 3910 (0.0034) [2024-06-12 13:23:37,170][65616] Updated weights for policy 0, policy_version 3920 (0.0035) [2024-06-12 13:23:39,332][65383] Fps is (10 sec: 54067.5, 60 sec: 50244.3, 300 sec: 49540.8). Total num frames: 64323584. Throughput: 0: 49747.7. Samples: 64393920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 18.0) [2024-06-12 13:23:39,333][65383] Avg episode reward: [(0, '0.021')] [2024-06-12 13:23:41,422][65616] Updated weights for policy 0, policy_version 3930 (0.0037) [2024-06-12 13:23:44,055][65616] Updated weights for policy 0, policy_version 3940 (0.0025) [2024-06-12 13:23:44,333][65383] Fps is (10 sec: 54067.0, 60 sec: 50517.2, 300 sec: 49540.8). Total num frames: 64569344. Throughput: 0: 49896.3. Samples: 64689720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 27.0) [2024-06-12 13:23:44,333][65383] Avg episode reward: [(0, '0.025')] [2024-06-12 13:23:47,943][65616] Updated weights for policy 0, policy_version 3950 (0.0033) [2024-06-12 13:23:49,333][65383] Fps is (10 sec: 47512.4, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 64798720. Throughput: 0: 49838.9. Samples: 64841200. Policy #0 lag: (min: 1.0, avg: 7.7, max: 21.0) [2024-06-12 13:23:49,333][65383] Avg episode reward: [(0, '0.025')] [2024-06-12 13:23:50,695][65616] Updated weights for policy 0, policy_version 3960 (0.0027) [2024-06-12 13:23:54,332][65383] Fps is (10 sec: 45875.9, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 65028096. Throughput: 0: 49528.1. Samples: 65134120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 13:23:54,333][65383] Avg episode reward: [(0, '0.022')] [2024-06-12 13:23:54,453][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000003970_65044480.pth... 
[2024-06-12 13:23:54,458][65616] Updated weights for policy 0, policy_version 3970 (0.0035) [2024-06-12 13:23:54,509][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000003245_53166080.pth [2024-06-12 13:23:57,515][65616] Updated weights for policy 0, policy_version 3980 (0.0030) [2024-06-12 13:23:59,332][65383] Fps is (10 sec: 50791.5, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 65306624. Throughput: 0: 49481.0. Samples: 65427160. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-12 13:23:59,333][65383] Avg episode reward: [(0, '0.020')] [2024-06-12 13:24:00,990][65616] Updated weights for policy 0, policy_version 3990 (0.0028) [2024-06-12 13:24:04,005][65616] Updated weights for policy 0, policy_version 4000 (0.0024) [2024-06-12 13:24:04,332][65383] Fps is (10 sec: 52428.2, 60 sec: 49698.0, 300 sec: 49540.8). Total num frames: 65552384. Throughput: 0: 49892.2. Samples: 65586840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 13:24:04,341][65383] Avg episode reward: [(0, '0.024')] [2024-06-12 13:24:07,564][65616] Updated weights for policy 0, policy_version 4010 (0.0036) [2024-06-12 13:24:07,847][65595] Signal inference workers to stop experience collection... (900 times) [2024-06-12 13:24:07,848][65595] Signal inference workers to resume experience collection... (900 times) [2024-06-12 13:24:07,900][65616] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-06-12 13:24:07,900][65616] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-06-12 13:24:09,332][65383] Fps is (10 sec: 47513.4, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 65781760. Throughput: 0: 49908.4. Samples: 65890800. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-12 13:24:09,333][65383] Avg episode reward: [(0, '0.024')] [2024-06-12 13:24:10,547][65616] Updated weights for policy 0, policy_version 4020 (0.0041) [2024-06-12 13:24:14,179][65616] Updated weights for policy 0, policy_version 4030 (0.0031) [2024-06-12 13:24:14,332][65383] Fps is (10 sec: 47513.9, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 66027520. Throughput: 0: 49661.7. Samples: 66184980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 13:24:14,333][65383] Avg episode reward: [(0, '0.018')] [2024-06-12 13:24:17,207][65616] Updated weights for policy 0, policy_version 4040 (0.0031) [2024-06-12 13:24:19,332][65383] Fps is (10 sec: 50790.5, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 66289664. Throughput: 0: 49765.9. Samples: 66331500. Policy #0 lag: (min: 1.0, avg: 9.7, max: 23.0) [2024-06-12 13:24:19,333][65383] Avg episode reward: [(0, '0.020')] [2024-06-12 13:24:20,766][65616] Updated weights for policy 0, policy_version 4050 (0.0033) [2024-06-12 13:24:23,722][65616] Updated weights for policy 0, policy_version 4060 (0.0022) [2024-06-12 13:24:24,332][65383] Fps is (10 sec: 52429.3, 60 sec: 49698.2, 300 sec: 49651.9). Total num frames: 66551808. Throughput: 0: 49772.9. Samples: 66633700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-12 13:24:24,333][65383] Avg episode reward: [(0, '0.025')] [2024-06-12 13:24:27,545][65616] Updated weights for policy 0, policy_version 4070 (0.0028) [2024-06-12 13:24:29,332][65383] Fps is (10 sec: 47513.7, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 66764800. Throughput: 0: 49670.0. Samples: 66924860. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-12 13:24:29,333][65383] Avg episode reward: [(0, '0.017')] [2024-06-12 13:24:30,385][65616] Updated weights for policy 0, policy_version 4080 (0.0028) [2024-06-12 13:24:33,717][65616] Updated weights for policy 0, policy_version 4090 (0.0032) [2024-06-12 13:24:34,332][65383] Fps is (10 sec: 49152.1, 60 sec: 50244.4, 300 sec: 49651.9). Total num frames: 67043328. Throughput: 0: 49573.6. Samples: 67072000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 13:24:34,333][65383] Avg episode reward: [(0, '0.026')] [2024-06-12 13:24:37,141][65616] Updated weights for policy 0, policy_version 4100 (0.0026) [2024-06-12 13:24:39,332][65383] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49596.4). Total num frames: 67272704. Throughput: 0: 49584.9. Samples: 67365440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 13:24:39,333][65383] Avg episode reward: [(0, '0.028')] [2024-06-12 13:24:40,488][65616] Updated weights for policy 0, policy_version 4110 (0.0032) [2024-06-12 13:24:43,816][65616] Updated weights for policy 0, policy_version 4120 (0.0024) [2024-06-12 13:24:44,333][65383] Fps is (10 sec: 50785.4, 60 sec: 49697.5, 300 sec: 49707.2). Total num frames: 67551232. Throughput: 0: 49873.2. Samples: 67671500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 13:24:44,334][65383] Avg episode reward: [(0, '0.020')] [2024-06-12 13:24:46,944][65616] Updated weights for policy 0, policy_version 4130 (0.0024) [2024-06-12 13:24:49,332][65383] Fps is (10 sec: 49151.6, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 67764224. Throughput: 0: 49674.8. Samples: 67822200. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-12 13:24:49,333][65383] Avg episode reward: [(0, '0.024')] [2024-06-12 13:24:50,106][65616] Updated weights for policy 0, policy_version 4140 (0.0030) [2024-06-12 13:24:53,514][65616] Updated weights for policy 0, policy_version 4150 (0.0028) [2024-06-12 13:24:54,332][65383] Fps is (10 sec: 49156.5, 60 sec: 50244.3, 300 sec: 49707.4). Total num frames: 68042752. Throughput: 0: 49657.8. Samples: 68125400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 13:24:54,333][65383] Avg episode reward: [(0, '0.031')] [2024-06-12 13:24:54,375][65595] Saving new best policy, reward=0.031! [2024-06-12 13:24:56,832][65616] Updated weights for policy 0, policy_version 4160 (0.0025) [2024-06-12 13:24:59,332][65383] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49651.8). Total num frames: 68272128. Throughput: 0: 49641.7. Samples: 68418860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 26.0) [2024-06-12 13:24:59,333][65383] Avg episode reward: [(0, '0.030')] [2024-06-12 13:25:00,000][65616] Updated weights for policy 0, policy_version 4170 (0.0031) [2024-06-12 13:25:03,340][65616] Updated weights for policy 0, policy_version 4180 (0.0022) [2024-06-12 13:25:04,332][65383] Fps is (10 sec: 47513.2, 60 sec: 49425.1, 300 sec: 49651.8). Total num frames: 68517888. Throughput: 0: 49699.5. Samples: 68567980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 13:25:04,342][65383] Avg episode reward: [(0, '0.031')] [2024-06-12 13:25:06,552][65616] Updated weights for policy 0, policy_version 4190 (0.0036) [2024-06-12 13:25:09,332][65383] Fps is (10 sec: 49152.9, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 68763648. Throughput: 0: 49600.9. Samples: 68865740. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 13:25:09,333][65383] Avg episode reward: [(0, '0.022')] [2024-06-12 13:25:09,959][65616] Updated weights for policy 0, policy_version 4200 (0.0025) [2024-06-12 13:25:12,985][65616] Updated weights for policy 0, policy_version 4210 (0.0029) [2024-06-12 13:25:13,245][65595] Signal inference workers to stop experience collection... (950 times) [2024-06-12 13:25:13,245][65595] Signal inference workers to resume experience collection... (950 times) [2024-06-12 13:25:13,260][65616] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-06-12 13:25:13,260][65616] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-06-12 13:25:14,333][65383] Fps is (10 sec: 50790.1, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 69025792. Throughput: 0: 49770.0. Samples: 69164520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 13:25:14,333][65383] Avg episode reward: [(0, '0.029')] [2024-06-12 13:25:16,378][65616] Updated weights for policy 0, policy_version 4220 (0.0022) [2024-06-12 13:25:19,332][65383] Fps is (10 sec: 52428.1, 60 sec: 49971.2, 300 sec: 49818.5). Total num frames: 69287936. Throughput: 0: 50006.1. Samples: 69322280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 13:25:19,333][65383] Avg episode reward: [(0, '0.025')] [2024-06-12 13:25:19,420][65616] Updated weights for policy 0, policy_version 4230 (0.0030) [2024-06-12 13:25:22,996][65616] Updated weights for policy 0, policy_version 4240 (0.0027) [2024-06-12 13:25:24,332][65383] Fps is (10 sec: 49152.7, 60 sec: 49425.0, 300 sec: 49707.4). Total num frames: 69517312. Throughput: 0: 50184.8. Samples: 69623760. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 13:25:24,333][65383] Avg episode reward: [(0, '0.026')] [2024-06-12 13:25:26,092][65616] Updated weights for policy 0, policy_version 4250 (0.0031) [2024-06-12 13:25:29,332][65383] Fps is (10 sec: 47513.3, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 69763072. Throughput: 0: 49818.2. Samples: 69913280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-12 13:25:29,333][65383] Avg episode reward: [(0, '0.027')] [2024-06-12 13:25:29,865][65616] Updated weights for policy 0, policy_version 4260 (0.0029) [2024-06-12 13:25:32,525][65616] Updated weights for policy 0, policy_version 4270 (0.0038) [2024-06-12 13:25:34,332][65383] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 49707.4). Total num frames: 70008832. Throughput: 0: 49790.8. Samples: 70062780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 13:25:34,333][65383] Avg episode reward: [(0, '0.028')] [2024-06-12 13:25:36,158][65616] Updated weights for policy 0, policy_version 4280 (0.0028) [2024-06-12 13:25:39,076][65616] Updated weights for policy 0, policy_version 4290 (0.0031) [2024-06-12 13:25:39,332][65383] Fps is (10 sec: 54067.0, 60 sec: 50517.2, 300 sec: 49874.0). Total num frames: 70303744. Throughput: 0: 49955.8. Samples: 70373420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 13:25:39,333][65383] Avg episode reward: [(0, '0.032')] [2024-06-12 13:25:39,342][65595] Saving new best policy, reward=0.032! [2024-06-12 13:25:42,950][65616] Updated weights for policy 0, policy_version 4300 (0.0025) [2024-06-12 13:25:44,332][65383] Fps is (10 sec: 49151.6, 60 sec: 49152.7, 300 sec: 49596.3). Total num frames: 70500352. Throughput: 0: 49799.2. Samples: 70659820. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 13:25:44,333][65383] Avg episode reward: [(0, '0.029')] [2024-06-12 13:25:45,642][65616] Updated weights for policy 0, policy_version 4310 (0.0033) [2024-06-12 13:25:49,332][65383] Fps is (10 sec: 45875.8, 60 sec: 49971.2, 300 sec: 49762.9). Total num frames: 70762496. Throughput: 0: 49577.8. Samples: 70798980. Policy #0 lag: (min: 1.0, avg: 12.7, max: 24.0) [2024-06-12 13:25:49,333][65383] Avg episode reward: [(0, '0.028')] [2024-06-12 13:25:49,693][65616] Updated weights for policy 0, policy_version 4320 (0.0030) [2024-06-12 13:25:52,349][65616] Updated weights for policy 0, policy_version 4330 (0.0030) [2024-06-12 13:25:54,332][65383] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49707.4). Total num frames: 70991872. Throughput: 0: 49518.9. Samples: 71094100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 13:25:54,333][65383] Avg episode reward: [(0, '0.034')] [2024-06-12 13:25:54,393][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000004334_71008256.pth... [2024-06-12 13:25:54,440][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000003605_59064320.pth [2024-06-12 13:25:54,444][65595] Saving new best policy, reward=0.034! [2024-06-12 13:25:56,319][65616] Updated weights for policy 0, policy_version 4340 (0.0024) [2024-06-12 13:25:58,740][65616] Updated weights for policy 0, policy_version 4350 (0.0022) [2024-06-12 13:25:59,332][65383] Fps is (10 sec: 50790.4, 60 sec: 49971.3, 300 sec: 49707.4). Total num frames: 71270400. Throughput: 0: 49661.5. Samples: 71399280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 13:25:59,333][65383] Avg episode reward: [(0, '0.029')] [2024-06-12 13:26:02,958][65616] Updated weights for policy 0, policy_version 4360 (0.0029) [2024-06-12 13:26:04,332][65383] Fps is (10 sec: 50790.7, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 71499776. Throughput: 0: 49582.2. Samples: 71553480. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-12 13:26:04,333][65383] Avg episode reward: [(0, '0.027')] [2024-06-12 13:26:05,418][65616] Updated weights for policy 0, policy_version 4370 (0.0023) [2024-06-12 13:26:09,332][65383] Fps is (10 sec: 47513.4, 60 sec: 49698.0, 300 sec: 49707.4). Total num frames: 71745536. Throughput: 0: 49633.3. Samples: 71857260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 13:26:09,333][65383] Avg episode reward: [(0, '0.027')] [2024-06-12 13:26:09,644][65616] Updated weights for policy 0, policy_version 4380 (0.0024) [2024-06-12 13:26:10,793][65595] Signal inference workers to stop experience collection... (1000 times) [2024-06-12 13:26:10,793][65595] Signal inference workers to resume experience collection... (1000 times) [2024-06-12 13:26:10,809][65616] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-06-12 13:26:10,810][65616] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-06-12 13:26:12,026][65616] Updated weights for policy 0, policy_version 4390 (0.0027) [2024-06-12 13:26:14,332][65383] Fps is (10 sec: 49152.5, 60 sec: 49425.2, 300 sec: 49707.4). Total num frames: 71991296. Throughput: 0: 49756.6. Samples: 72152320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 13:26:14,333][65383] Avg episode reward: [(0, '0.033')] [2024-06-12 13:26:16,019][65616] Updated weights for policy 0, policy_version 4400 (0.0032) [2024-06-12 13:26:18,477][65616] Updated weights for policy 0, policy_version 4410 (0.0020) [2024-06-12 13:26:19,332][65383] Fps is (10 sec: 52429.1, 60 sec: 49698.2, 300 sec: 49762.9). Total num frames: 72269824. Throughput: 0: 49927.5. Samples: 72309520. 
Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-12 13:26:19,333][65383] Avg episode reward: [(0, '0.024')] [2024-06-12 13:26:22,618][65616] Updated weights for policy 0, policy_version 4420 (0.0026) [2024-06-12 13:26:24,332][65383] Fps is (10 sec: 52428.5, 60 sec: 49971.2, 300 sec: 49707.4). Total num frames: 72515584. Throughput: 0: 49671.2. Samples: 72608620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 13:26:24,333][65383] Avg episode reward: [(0, '0.024')] [2024-06-12 13:26:25,139][65616] Updated weights for policy 0, policy_version 4430 (0.0026) [2024-06-12 13:26:29,326][65616] Updated weights for policy 0, policy_version 4440 (0.0035) [2024-06-12 13:26:29,332][65383] Fps is (10 sec: 47513.7, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 72744960. Throughput: 0: 49897.8. Samples: 72905220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 13:26:29,333][65383] Avg episode reward: [(0, '0.035')] [2024-06-12 13:26:31,668][65616] Updated weights for policy 0, policy_version 4450 (0.0028) [2024-06-12 13:26:34,332][65383] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 49818.5). Total num frames: 73007104. Throughput: 0: 49909.4. Samples: 73044900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 13:26:34,333][65383] Avg episode reward: [(0, '0.033')] [2024-06-12 13:26:35,558][65616] Updated weights for policy 0, policy_version 4460 (0.0025) [2024-06-12 13:26:38,256][65616] Updated weights for policy 0, policy_version 4470 (0.0028) [2024-06-12 13:26:39,333][65383] Fps is (10 sec: 52427.9, 60 sec: 49425.0, 300 sec: 49818.5). Total num frames: 73269248. Throughput: 0: 50138.6. Samples: 73350340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-12 13:26:39,333][65383] Avg episode reward: [(0, '0.029')] [2024-06-12 13:26:42,269][65616] Updated weights for policy 0, policy_version 4480 (0.0042) [2024-06-12 13:26:44,332][65383] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 49707.4). 
Total num frames: 73515008. Throughput: 0: 50023.6. Samples: 73650340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 13:26:44,333][65383] Avg episode reward: [(0, '0.035')] [2024-06-12 13:26:44,825][65616] Updated weights for policy 0, policy_version 4490 (0.0023) [2024-06-12 13:26:48,947][65616] Updated weights for policy 0, policy_version 4500 (0.0026) [2024-06-12 13:26:49,332][65383] Fps is (10 sec: 47514.6, 60 sec: 49698.2, 300 sec: 49762.9). Total num frames: 73744384. Throughput: 0: 49884.1. Samples: 73798260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-12 13:26:49,333][65383] Avg episode reward: [(0, '0.024')] [2024-06-12 13:26:51,643][65616] Updated weights for policy 0, policy_version 4510 (0.0037) [2024-06-12 13:26:54,332][65383] Fps is (10 sec: 47513.3, 60 sec: 49971.2, 300 sec: 49762.9). Total num frames: 73990144. Throughput: 0: 49718.2. Samples: 74094580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 13:26:54,333][65383] Avg episode reward: [(0, '0.036')] [2024-06-12 13:26:54,342][65595] Saving new best policy, reward=0.036! [2024-06-12 13:26:55,231][65616] Updated weights for policy 0, policy_version 4520 (0.0033) [2024-06-12 13:26:57,890][65616] Updated weights for policy 0, policy_version 4530 (0.0027) [2024-06-12 13:26:59,332][65383] Fps is (10 sec: 54066.8, 60 sec: 50244.3, 300 sec: 49874.0). Total num frames: 74285056. Throughput: 0: 50083.1. Samples: 74406060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 13:26:59,333][65383] Avg episode reward: [(0, '0.029')] [2024-06-12 13:27:01,870][65616] Updated weights for policy 0, policy_version 4540 (0.0026) [2024-06-12 13:27:04,332][65383] Fps is (10 sec: 54067.3, 60 sec: 50517.4, 300 sec: 49874.0). Total num frames: 74530816. Throughput: 0: 50002.2. Samples: 74559620. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 13:27:04,333][65383] Avg episode reward: [(0, '0.031')] [2024-06-12 13:27:04,398][65616] Updated weights for policy 0, policy_version 4550 (0.0020) [2024-06-12 13:27:08,546][65616] Updated weights for policy 0, policy_version 4560 (0.0030) [2024-06-12 13:27:09,332][65383] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 49874.0). Total num frames: 74760192. Throughput: 0: 50178.2. Samples: 74866640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 13:27:09,333][65383] Avg episode reward: [(0, '0.030')] [2024-06-12 13:27:10,682][65616] Updated weights for policy 0, policy_version 4570 (0.0028) [2024-06-12 13:27:14,332][65383] Fps is (10 sec: 45875.2, 60 sec: 49971.1, 300 sec: 49762.9). Total num frames: 74989568. Throughput: 0: 50077.3. Samples: 75158700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 13:27:14,333][65383] Avg episode reward: [(0, '0.036')] [2024-06-12 13:27:14,904][65616] Updated weights for policy 0, policy_version 4580 (0.0027) [2024-06-12 13:27:16,484][65595] Signal inference workers to stop experience collection... (1050 times) [2024-06-12 13:27:16,518][65616] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-06-12 13:27:16,540][65595] Signal inference workers to resume experience collection... (1050 times) [2024-06-12 13:27:16,541][65616] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-06-12 13:27:17,911][65616] Updated weights for policy 0, policy_version 4590 (0.0029) [2024-06-12 13:27:19,332][65383] Fps is (10 sec: 50790.4, 60 sec: 49971.2, 300 sec: 49818.5). Total num frames: 75268096. Throughput: 0: 50009.7. Samples: 75295340. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 13:27:19,333][65383] Avg episode reward: [(0, '0.032')] [2024-06-12 13:27:21,343][65616] Updated weights for policy 0, policy_version 4600 (0.0029) [2024-06-12 13:27:24,181][65616] Updated weights for policy 0, policy_version 4610 (0.0032) [2024-06-12 13:27:24,332][65383] Fps is (10 sec: 54066.9, 60 sec: 50244.2, 300 sec: 49929.5). Total num frames: 75530240. Throughput: 0: 50135.2. Samples: 75606420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 13:27:24,333][65383] Avg episode reward: [(0, '0.039')] [2024-06-12 13:27:24,344][65595] Saving new best policy, reward=0.039! [2024-06-12 13:27:28,002][65616] Updated weights for policy 0, policy_version 4620 (0.0030) [2024-06-12 13:27:29,332][65383] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 49818.5). Total num frames: 75776000. Throughput: 0: 50081.4. Samples: 75904000. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-12 13:27:29,333][65383] Avg episode reward: [(0, '0.037')] [2024-06-12 13:27:30,665][65616] Updated weights for policy 0, policy_version 4630 (0.0039) [2024-06-12 13:27:34,213][65616] Updated weights for policy 0, policy_version 4640 (0.0024) [2024-06-12 13:27:34,332][65383] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 49874.0). Total num frames: 76021760. Throughput: 0: 50442.6. Samples: 76068180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 13:27:34,333][65383] Avg episode reward: [(0, '0.038')] [2024-06-12 13:27:37,068][65616] Updated weights for policy 0, policy_version 4650 (0.0025) [2024-06-12 13:27:39,333][65383] Fps is (10 sec: 49151.0, 60 sec: 49971.2, 300 sec: 49929.5). Total num frames: 76267520. Throughput: 0: 50733.2. Samples: 76377580. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 13:27:39,333][65383] Avg episode reward: [(0, '0.032')] [2024-06-12 13:27:40,774][65616] Updated weights for policy 0, policy_version 4660 (0.0030) [2024-06-12 13:27:43,316][65616] Updated weights for policy 0, policy_version 4670 (0.0030) [2024-06-12 13:27:44,332][65383] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 49929.6). Total num frames: 76529664. Throughput: 0: 50212.9. Samples: 76665640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 13:27:44,333][65383] Avg episode reward: [(0, '0.029')] [2024-06-12 13:27:47,202][65616] Updated weights for policy 0, policy_version 4680 (0.0030) [2024-06-12 13:27:49,332][65383] Fps is (10 sec: 52429.0, 60 sec: 50790.3, 300 sec: 49985.1). Total num frames: 76791808. Throughput: 0: 50519.5. Samples: 76833000. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-12 13:27:49,333][65383] Avg episode reward: [(0, '0.039')] [2024-06-12 13:27:50,165][65616] Updated weights for policy 0, policy_version 4690 (0.0028) [2024-06-12 13:27:53,739][65616] Updated weights for policy 0, policy_version 4700 (0.0035) [2024-06-12 13:27:54,332][65383] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 49985.1). Total num frames: 77053952. Throughput: 0: 50219.5. Samples: 77126520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-12 13:27:54,333][65383] Avg episode reward: [(0, '0.034')] [2024-06-12 13:27:54,343][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000004703_77053952.pth... [2024-06-12 13:27:54,384][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000003970_65044480.pth [2024-06-12 13:27:56,516][65616] Updated weights for policy 0, policy_version 4710 (0.0028) [2024-06-12 13:27:59,332][65383] Fps is (10 sec: 47513.9, 60 sec: 49698.1, 300 sec: 49818.5). Total num frames: 77266944. Throughput: 0: 50365.3. Samples: 77425140. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 13:27:59,333][65383] Avg episode reward: [(0, '0.026')] [2024-06-12 13:28:00,429][65616] Updated weights for policy 0, policy_version 4720 (0.0029) [2024-06-12 13:28:03,168][65616] Updated weights for policy 0, policy_version 4730 (0.0024) [2024-06-12 13:28:04,332][65383] Fps is (10 sec: 45875.6, 60 sec: 49698.2, 300 sec: 49874.0). Total num frames: 77512704. Throughput: 0: 50525.4. Samples: 77568980. Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-12 13:28:04,333][65383] Avg episode reward: [(0, '0.041')] [2024-06-12 13:28:04,448][65595] Saving new best policy, reward=0.041! [2024-06-12 13:28:06,212][65595] Signal inference workers to stop experience collection... (1100 times) [2024-06-12 13:28:06,264][65616] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-06-12 13:28:06,321][65595] Signal inference workers to resume experience collection... (1100 times) [2024-06-12 13:28:06,321][65616] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-06-12 13:28:06,849][65616] Updated weights for policy 0, policy_version 4740 (0.0021) [2024-06-12 13:28:09,333][65383] Fps is (10 sec: 49151.6, 60 sec: 49971.1, 300 sec: 49929.5). Total num frames: 77758464. Throughput: 0: 50188.0. Samples: 77864880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 13:28:09,333][65383] Avg episode reward: [(0, '0.023')] [2024-06-12 13:28:09,997][65616] Updated weights for policy 0, policy_version 4750 (0.0034) [2024-06-12 13:28:13,277][65616] Updated weights for policy 0, policy_version 4760 (0.0025) [2024-06-12 13:28:14,332][65383] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 49929.5). Total num frames: 78036992. Throughput: 0: 50129.7. Samples: 78159840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 13:28:14,333][65383] Avg episode reward: [(0, '0.040')] [2024-06-12 13:28:16,477][65616] Updated weights for policy 0, policy_version 4770 (0.0035) [2024-06-12 13:28:19,332][65383] Fps is (10 sec: 49153.0, 60 sec: 49698.2, 300 sec: 49762.9). Total num frames: 78249984. Throughput: 0: 49900.5. Samples: 78313700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 13:28:19,333][65383] Avg episode reward: [(0, '0.035')] [2024-06-12 13:28:19,795][65616] Updated weights for policy 0, policy_version 4780 (0.0030) [2024-06-12 13:28:23,434][65616] Updated weights for policy 0, policy_version 4790 (0.0025) [2024-06-12 13:28:24,332][65383] Fps is (10 sec: 49151.6, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 78528512. Throughput: 0: 50027.1. Samples: 78628800. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-12 13:28:24,333][65383] Avg episode reward: [(0, '0.024')] [2024-06-12 13:28:26,218][65616] Updated weights for policy 0, policy_version 4800 (0.0028) [2024-06-12 13:28:29,332][65383] Fps is (10 sec: 52428.1, 60 sec: 49971.1, 300 sec: 49985.1). Total num frames: 78774272. Throughput: 0: 50141.7. Samples: 78922020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 28.0) [2024-06-12 13:28:29,333][65383] Avg episode reward: [(0, '0.031')] [2024-06-12 13:28:30,018][65616] Updated weights for policy 0, policy_version 4810 (0.0025) [2024-06-12 13:28:32,756][65616] Updated weights for policy 0, policy_version 4820 (0.0021) [2024-06-12 13:28:34,332][65383] Fps is (10 sec: 52429.8, 60 sec: 50517.4, 300 sec: 49929.5). Total num frames: 79052800. Throughput: 0: 49884.2. Samples: 79077780. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-12 13:28:34,333][65383] Avg episode reward: [(0, '0.025')] [2024-06-12 13:28:36,525][65616] Updated weights for policy 0, policy_version 4830 (0.0029) [2024-06-12 13:28:39,325][65616] Updated weights for policy 0, policy_version 4840 (0.0026) [2024-06-12 13:28:39,335][65383] Fps is (10 sec: 52417.5, 60 sec: 50515.6, 300 sec: 49929.2). Total num frames: 79298560. Throughput: 0: 49812.3. Samples: 79368180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-12 13:28:39,335][65383] Avg episode reward: [(0, '0.031')] [2024-06-12 13:28:43,119][65616] Updated weights for policy 0, policy_version 4850 (0.0026) [2024-06-12 13:28:44,332][65383] Fps is (10 sec: 45874.9, 60 sec: 49698.1, 300 sec: 49874.0). Total num frames: 79511552. Throughput: 0: 50020.9. Samples: 79676080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 13:28:44,333][65383] Avg episode reward: [(0, '0.026')] [2024-06-12 13:28:45,707][65616] Updated weights for policy 0, policy_version 4860 (0.0027) [2024-06-12 13:28:49,332][65383] Fps is (10 sec: 47524.4, 60 sec: 49698.3, 300 sec: 49985.1). Total num frames: 79773696. Throughput: 0: 49904.9. Samples: 79814700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 13:28:49,333][65383] Avg episode reward: [(0, '0.035')] [2024-06-12 13:28:49,341][65616] Updated weights for policy 0, policy_version 4870 (0.0028) [2024-06-12 13:28:52,446][65616] Updated weights for policy 0, policy_version 4880 (0.0019) [2024-06-12 13:28:54,332][65383] Fps is (10 sec: 52428.8, 60 sec: 49698.2, 300 sec: 49929.5). Total num frames: 80035840. Throughput: 0: 49996.6. Samples: 80114720. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 13:28:54,333][65383] Avg episode reward: [(0, '0.032')] [2024-06-12 13:28:56,033][65616] Updated weights for policy 0, policy_version 4890 (0.0022) [2024-06-12 13:28:58,974][65616] Updated weights for policy 0, policy_version 4900 (0.0029) [2024-06-12 13:28:59,332][65383] Fps is (10 sec: 50789.7, 60 sec: 50244.2, 300 sec: 49929.5). Total num frames: 80281600. Throughput: 0: 50017.7. Samples: 80410640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 13:28:59,333][65383] Avg episode reward: [(0, '0.032')] [2024-06-12 13:29:02,621][65616] Updated weights for policy 0, policy_version 4910 (0.0026) [2024-06-12 13:29:04,332][65383] Fps is (10 sec: 49151.6, 60 sec: 50244.2, 300 sec: 49985.1). Total num frames: 80527360. Throughput: 0: 50075.0. Samples: 80567080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 13:29:04,341][65383] Avg episode reward: [(0, '0.041')] [2024-06-12 13:29:05,408][65616] Updated weights for policy 0, policy_version 4920 (0.0024) [2024-06-12 13:29:06,182][65595] Signal inference workers to stop experience collection... (1150 times) [2024-06-12 13:29:06,182][65595] Signal inference workers to resume experience collection... (1150 times) [2024-06-12 13:29:06,193][65616] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-06-12 13:29:06,194][65616] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-06-12 13:29:09,332][65383] Fps is (10 sec: 47514.2, 60 sec: 49971.3, 300 sec: 49929.6). Total num frames: 80756736. Throughput: 0: 49595.3. Samples: 80860580. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-12 13:29:09,333][65383] Avg episode reward: [(0, '0.039')] [2024-06-12 13:29:09,354][65616] Updated weights for policy 0, policy_version 4930 (0.0030) [2024-06-12 13:29:12,044][65616] Updated weights for policy 0, policy_version 4940 (0.0026) [2024-06-12 13:29:14,332][65383] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 49929.5). Total num frames: 81018880. Throughput: 0: 49740.9. Samples: 81160360. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-12 13:29:14,333][65383] Avg episode reward: [(0, '0.026')] [2024-06-12 13:29:15,805][65616] Updated weights for policy 0, policy_version 4950 (0.0030) [2024-06-12 13:29:18,446][65616] Updated weights for policy 0, policy_version 4960 (0.0024) [2024-06-12 13:29:19,332][65383] Fps is (10 sec: 54066.9, 60 sec: 50790.3, 300 sec: 49985.1). Total num frames: 81297408. Throughput: 0: 49954.1. Samples: 81325720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 13:29:19,333][65383] Avg episode reward: [(0, '0.030')] [2024-06-12 13:29:22,295][65616] Updated weights for policy 0, policy_version 4970 (0.0029) [2024-06-12 13:29:24,332][65383] Fps is (10 sec: 52429.1, 60 sec: 50244.4, 300 sec: 50096.2). Total num frames: 81543168. Throughput: 0: 50303.8. Samples: 81631740. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-06-12 13:29:24,333][65383] Avg episode reward: [(0, '0.041')] [2024-06-12 13:29:25,212][65616] Updated weights for policy 0, policy_version 4980 (0.0031) [2024-06-12 13:29:28,961][65616] Updated weights for policy 0, policy_version 4990 (0.0032) [2024-06-12 13:29:29,332][65383] Fps is (10 sec: 45874.9, 60 sec: 49698.1, 300 sec: 49874.0). Total num frames: 81756160. Throughput: 0: 49989.2. Samples: 81925600. 
Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-06-12 13:29:29,333][65383] Avg episode reward: [(0, '0.029')] [2024-06-12 13:29:31,839][65616] Updated weights for policy 0, policy_version 5000 (0.0027) [2024-06-12 13:29:34,333][65383] Fps is (10 sec: 47512.9, 60 sec: 49424.9, 300 sec: 49985.1). Total num frames: 82018304. Throughput: 0: 50066.9. Samples: 82067720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 13:29:34,333][65383] Avg episode reward: [(0, '0.035')] [2024-06-12 13:29:35,331][65616] Updated weights for policy 0, policy_version 5010 (0.0021) [2024-06-12 13:29:38,159][65616] Updated weights for policy 0, policy_version 5020 (0.0031) [2024-06-12 13:29:39,334][65383] Fps is (10 sec: 54057.7, 60 sec: 49971.5, 300 sec: 49984.9). Total num frames: 82296832. Throughput: 0: 50249.5. Samples: 82376040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 13:29:39,335][65383] Avg episode reward: [(0, '0.035')] [2024-06-12 13:29:41,742][65616] Updated weights for policy 0, policy_version 5030 (0.0023) [2024-06-12 13:29:44,332][65383] Fps is (10 sec: 49152.6, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 82509824. Throughput: 0: 50164.5. Samples: 82668040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 13:29:44,333][65383] Avg episode reward: [(0, '0.040')] [2024-06-12 13:29:44,907][65616] Updated weights for policy 0, policy_version 5040 (0.0029) [2024-06-12 13:29:48,765][65616] Updated weights for policy 0, policy_version 5050 (0.0028) [2024-06-12 13:29:49,332][65383] Fps is (10 sec: 47521.8, 60 sec: 49971.1, 300 sec: 49929.5). Total num frames: 82771968. Throughput: 0: 49975.1. Samples: 82815960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 13:29:49,333][65383] Avg episode reward: [(0, '0.044')] [2024-06-12 13:29:49,456][65595] Saving new best policy, reward=0.044! 
[2024-06-12 13:29:51,502][65616] Updated weights for policy 0, policy_version 5060 (0.0037) [2024-06-12 13:29:54,332][65383] Fps is (10 sec: 52428.6, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 83034112. Throughput: 0: 50208.8. Samples: 83119980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-12 13:29:54,333][65383] Avg episode reward: [(0, '0.040')] [2024-06-12 13:29:54,342][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000005068_83034112.pth... [2024-06-12 13:29:54,379][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000004334_71008256.pth [2024-06-12 13:29:55,166][65616] Updated weights for policy 0, policy_version 5070 (0.0029) [2024-06-12 13:29:58,030][65616] Updated weights for policy 0, policy_version 5080 (0.0027) [2024-06-12 13:29:59,332][65383] Fps is (10 sec: 50791.2, 60 sec: 49971.3, 300 sec: 50040.6). Total num frames: 83279872. Throughput: 0: 50307.6. Samples: 83424200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 13:29:59,333][65383] Avg episode reward: [(0, '0.042')] [2024-06-12 13:30:01,710][65616] Updated weights for policy 0, policy_version 5090 (0.0025) [2024-06-12 13:30:04,332][65383] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50096.1). Total num frames: 83542016. Throughput: 0: 50143.9. Samples: 83582200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 13:30:04,333][65383] Avg episode reward: [(0, '0.040')] [2024-06-12 13:30:04,460][65616] Updated weights for policy 0, policy_version 5100 (0.0027) [2024-06-12 13:30:08,140][65616] Updated weights for policy 0, policy_version 5110 (0.0025) [2024-06-12 13:30:09,332][65383] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 50096.2). Total num frames: 83804160. Throughput: 0: 49931.5. Samples: 83878660. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 13:30:09,333][65383] Avg episode reward: [(0, '0.032')] [2024-06-12 13:30:11,161][65616] Updated weights for policy 0, policy_version 5120 (0.0027) [2024-06-12 13:30:11,632][65595] Signal inference workers to stop experience collection... (1200 times) [2024-06-12 13:30:11,633][65595] Signal inference workers to resume experience collection... (1200 times) [2024-06-12 13:30:11,662][65616] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-06-12 13:30:11,662][65616] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-06-12 13:30:14,332][65383] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 49985.1). Total num frames: 84033536. Throughput: 0: 50072.5. Samples: 84178860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 13:30:14,333][65383] Avg episode reward: [(0, '0.035')] [2024-06-12 13:30:14,661][65616] Updated weights for policy 0, policy_version 5130 (0.0029) [2024-06-12 13:30:17,338][65616] Updated weights for policy 0, policy_version 5140 (0.0026) [2024-06-12 13:30:19,332][65383] Fps is (10 sec: 47513.6, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 84279296. Throughput: 0: 50301.4. Samples: 84331280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-12 13:30:19,333][65383] Avg episode reward: [(0, '0.033')] [2024-06-12 13:30:20,941][65616] Updated weights for policy 0, policy_version 5150 (0.0031) [2024-06-12 13:30:24,175][65616] Updated weights for policy 0, policy_version 5160 (0.0028) [2024-06-12 13:30:24,332][65383] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 84541440. Throughput: 0: 50142.5. Samples: 84632360. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-12 13:30:24,333][65383] Avg episode reward: [(0, '0.035')] [2024-06-12 13:30:27,544][65616] Updated weights for policy 0, policy_version 5170 (0.0028) [2024-06-12 13:30:29,332][65383] Fps is (10 sec: 52429.5, 60 sec: 50790.5, 300 sec: 50151.7). Total num frames: 84803584. Throughput: 0: 50399.2. Samples: 84936000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 13:30:29,333][65383] Avg episode reward: [(0, '0.039')] [2024-06-12 13:30:30,562][65616] Updated weights for policy 0, policy_version 5180 (0.0028) [2024-06-12 13:30:34,003][65616] Updated weights for policy 0, policy_version 5190 (0.0030) [2024-06-12 13:30:34,333][65383] Fps is (10 sec: 50789.8, 60 sec: 50517.3, 300 sec: 49985.1). Total num frames: 85049344. Throughput: 0: 50582.2. Samples: 85092160. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-12 13:30:34,333][65383] Avg episode reward: [(0, '0.050')] [2024-06-12 13:30:34,391][65595] Saving new best policy, reward=0.050! [2024-06-12 13:30:37,205][65616] Updated weights for policy 0, policy_version 5200 (0.0034) [2024-06-12 13:30:39,332][65383] Fps is (10 sec: 49151.3, 60 sec: 49972.7, 300 sec: 50151.7). Total num frames: 85295104. Throughput: 0: 50402.7. Samples: 85388100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 13:30:39,333][65383] Avg episode reward: [(0, '0.044')] [2024-06-12 13:30:40,590][65616] Updated weights for policy 0, policy_version 5210 (0.0024) [2024-06-12 13:30:43,670][65616] Updated weights for policy 0, policy_version 5220 (0.0027) [2024-06-12 13:30:44,332][65383] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50151.7). Total num frames: 85557248. Throughput: 0: 50459.5. Samples: 85694880. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 13:30:44,333][65383] Avg episode reward: [(0, '0.033')] [2024-06-12 13:30:46,909][65616] Updated weights for policy 0, policy_version 5230 (0.0028) [2024-06-12 13:30:49,332][65383] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 85819392. Throughput: 0: 50370.2. Samples: 85848860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 13:30:49,333][65383] Avg episode reward: [(0, '0.044')] [2024-06-12 13:30:50,051][65616] Updated weights for policy 0, policy_version 5240 (0.0030) [2024-06-12 13:30:53,319][65616] Updated weights for policy 0, policy_version 5250 (0.0029) [2024-06-12 13:30:54,332][65383] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 86065152. Throughput: 0: 50411.1. Samples: 86147160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 13:30:54,333][65383] Avg episode reward: [(0, '0.036')] [2024-06-12 13:30:56,431][65616] Updated weights for policy 0, policy_version 5260 (0.0032) [2024-06-12 13:30:59,332][65383] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 86327296. Throughput: 0: 50784.1. Samples: 86464140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 13:30:59,333][65383] Avg episode reward: [(0, '0.036')] [2024-06-12 13:30:59,526][65616] Updated weights for policy 0, policy_version 5270 (0.0025) [2024-06-12 13:31:02,813][65616] Updated weights for policy 0, policy_version 5280 (0.0035) [2024-06-12 13:31:04,332][65383] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 86573056. Throughput: 0: 50617.0. Samples: 86609040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-12 13:31:04,333][65383] Avg episode reward: [(0, '0.036')] [2024-06-12 13:31:06,152][65616] Updated weights for policy 0, policy_version 5290 (0.0033) [2024-06-12 13:31:09,332][65383] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50262.8). 
Total num frames: 86818816. Throughput: 0: 50444.0. Samples: 86902340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 13:31:09,333][65383] Avg episode reward: [(0, '0.036')] [2024-06-12 13:31:09,660][65616] Updated weights for policy 0, policy_version 5300 (0.0032) [2024-06-12 13:31:12,904][65616] Updated weights for policy 0, policy_version 5310 (0.0030) [2024-06-12 13:31:14,332][65383] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50151.7). Total num frames: 87064576. Throughput: 0: 50364.3. Samples: 87202400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 13:31:14,333][65383] Avg episode reward: [(0, '0.038')] [2024-06-12 13:31:15,893][65616] Updated weights for policy 0, policy_version 5320 (0.0021) [2024-06-12 13:31:19,090][65616] Updated weights for policy 0, policy_version 5330 (0.0028) [2024-06-12 13:31:19,332][65383] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50207.2). Total num frames: 87326720. Throughput: 0: 50384.5. Samples: 87359460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 13:31:19,333][65383] Avg episode reward: [(0, '0.033')] [2024-06-12 13:31:22,539][65616] Updated weights for policy 0, policy_version 5340 (0.0031) [2024-06-12 13:31:24,332][65383] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 87572480. Throughput: 0: 50616.9. Samples: 87665860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 13:31:24,333][65383] Avg episode reward: [(0, '0.043')] [2024-06-12 13:31:25,612][65616] Updated weights for policy 0, policy_version 5350 (0.0027) [2024-06-12 13:31:28,931][65616] Updated weights for policy 0, policy_version 5360 (0.0025) [2024-06-12 13:31:29,332][65383] Fps is (10 sec: 50790.5, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 87834624. Throughput: 0: 50535.5. Samples: 87968980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 13:31:29,333][65383] Avg episode reward: [(0, '0.043')] [2024-06-12 13:31:32,240][65616] Updated weights for policy 0, policy_version 5370 (0.0032) [2024-06-12 13:31:34,333][65383] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 88080384. Throughput: 0: 50550.2. Samples: 88123620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-12 13:31:34,333][65383] Avg episode reward: [(0, '0.045')] [2024-06-12 13:31:35,445][65616] Updated weights for policy 0, policy_version 5380 (0.0022) [2024-06-12 13:31:38,466][65616] Updated weights for policy 0, policy_version 5390 (0.0026) [2024-06-12 13:31:39,332][65383] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50318.3). Total num frames: 88358912. Throughput: 0: 50658.6. Samples: 88426800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 13:31:39,333][65383] Avg episode reward: [(0, '0.035')] [2024-06-12 13:31:42,132][65616] Updated weights for policy 0, policy_version 5400 (0.0023) [2024-06-12 13:31:44,332][65383] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 88571904. Throughput: 0: 50424.8. Samples: 88733260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 13:31:44,333][65383] Avg episode reward: [(0, '0.041')] [2024-06-12 13:31:44,389][65595] Signal inference workers to stop experience collection... (1250 times) [2024-06-12 13:31:44,390][65595] Signal inference workers to resume experience collection... 
(1250 times) [2024-06-12 13:31:44,413][65616] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-06-12 13:31:44,413][65616] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-06-12 13:31:44,837][65616] Updated weights for policy 0, policy_version 5410 (0.0038) [2024-06-12 13:31:48,207][65616] Updated weights for policy 0, policy_version 5420 (0.0033) [2024-06-12 13:31:49,332][65383] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 88834048. Throughput: 0: 50436.0. Samples: 88878660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 13:31:49,333][65383] Avg episode reward: [(0, '0.042')] [2024-06-12 13:31:51,524][65616] Updated weights for policy 0, policy_version 5430 (0.0024) [2024-06-12 13:31:54,332][65383] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50207.3). Total num frames: 89096192. Throughput: 0: 50655.2. Samples: 89181820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 13:31:54,333][65383] Avg episode reward: [(0, '0.035')] [2024-06-12 13:31:54,337][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000005438_89096192.pth... [2024-06-12 13:31:54,379][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000004703_77053952.pth [2024-06-12 13:31:55,084][65616] Updated weights for policy 0, policy_version 5440 (0.0034) [2024-06-12 13:31:57,873][65616] Updated weights for policy 0, policy_version 5450 (0.0021) [2024-06-12 13:31:59,332][65383] Fps is (10 sec: 50790.8, 60 sec: 50244.3, 300 sec: 50207.3). Total num frames: 89341952. Throughput: 0: 50771.2. Samples: 89487100. 
Policy #0 lag: (min: 3.0, avg: 12.4, max: 24.0) [2024-06-12 13:31:59,333][65383] Avg episode reward: [(0, '0.048')] [2024-06-12 13:32:01,567][65616] Updated weights for policy 0, policy_version 5460 (0.0027) [2024-06-12 13:32:04,224][65616] Updated weights for policy 0, policy_version 5470 (0.0025) [2024-06-12 13:32:04,332][65383] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 89620480. Throughput: 0: 50733.4. Samples: 89642460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 13:32:04,333][65383] Avg episode reward: [(0, '0.042')] [2024-06-12 13:32:07,701][65616] Updated weights for policy 0, policy_version 5480 (0.0020) [2024-06-12 13:32:09,332][65383] Fps is (10 sec: 49151.5, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 89833472. Throughput: 0: 50615.1. Samples: 89943540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 13:32:09,333][65383] Avg episode reward: [(0, '0.045')] [2024-06-12 13:32:10,893][65616] Updated weights for policy 0, policy_version 5490 (0.0024) [2024-06-12 13:32:14,332][65383] Fps is (10 sec: 45875.0, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 90079232. Throughput: 0: 50504.0. Samples: 90241660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 13:32:14,333][65383] Avg episode reward: [(0, '0.044')] [2024-06-12 13:32:14,502][65616] Updated weights for policy 0, policy_version 5500 (0.0027) [2024-06-12 13:32:17,093][65616] Updated weights for policy 0, policy_version 5510 (0.0027) [2024-06-12 13:32:19,332][65383] Fps is (10 sec: 54067.5, 60 sec: 50790.5, 300 sec: 50318.3). Total num frames: 90374144. Throughput: 0: 50648.6. Samples: 90402800. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 13:32:19,333][65383] Avg episode reward: [(0, '0.049')] [2024-06-12 13:32:20,803][65616] Updated weights for policy 0, policy_version 5520 (0.0025) [2024-06-12 13:32:23,543][65616] Updated weights for policy 0, policy_version 5530 (0.0032) [2024-06-12 13:32:24,332][65383] Fps is (10 sec: 54067.3, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 90619904. Throughput: 0: 50806.7. Samples: 90713100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-12 13:32:24,333][65383] Avg episode reward: [(0, '0.058')] [2024-06-12 13:32:24,342][65595] Saving new best policy, reward=0.058! [2024-06-12 13:32:27,191][65616] Updated weights for policy 0, policy_version 5540 (0.0029) [2024-06-12 13:32:29,332][65383] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 90849280. Throughput: 0: 50632.4. Samples: 91011720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 13:32:29,333][65383] Avg episode reward: [(0, '0.043')] [2024-06-12 13:32:30,046][65616] Updated weights for policy 0, policy_version 5550 (0.0028) [2024-06-12 13:32:33,641][65616] Updated weights for policy 0, policy_version 5560 (0.0033) [2024-06-12 13:32:34,332][65383] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 91111424. Throughput: 0: 50687.1. Samples: 91159580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-12 13:32:34,333][65383] Avg episode reward: [(0, '0.042')] [2024-06-12 13:32:36,549][65616] Updated weights for policy 0, policy_version 5570 (0.0023) [2024-06-12 13:32:37,892][65595] Signal inference workers to stop experience collection... (1300 times) [2024-06-12 13:32:37,939][65595] Signal inference workers to resume experience collection... 
(1300 times) [2024-06-12 13:32:37,942][65616] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-06-12 13:32:37,971][65616] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-06-12 13:32:39,332][65383] Fps is (10 sec: 52428.8, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 91373568. Throughput: 0: 50851.5. Samples: 91470140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 13:32:39,333][65383] Avg episode reward: [(0, '0.051')] [2024-06-12 13:32:39,726][65616] Updated weights for policy 0, policy_version 5580 (0.0024) [2024-06-12 13:32:42,965][65616] Updated weights for policy 0, policy_version 5590 (0.0028) [2024-06-12 13:32:44,332][65383] Fps is (10 sec: 54067.8, 60 sec: 51336.6, 300 sec: 50373.9). Total num frames: 91652096. Throughput: 0: 50990.6. Samples: 91781680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-12 13:32:44,333][65383] Avg episode reward: [(0, '0.049')] [2024-06-12 13:32:46,501][65616] Updated weights for policy 0, policy_version 5600 (0.0025) [2024-06-12 13:32:49,333][65383] Fps is (10 sec: 52428.1, 60 sec: 51063.3, 300 sec: 50318.3). Total num frames: 91897856. Throughput: 0: 50931.4. Samples: 91934380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-12 13:32:49,333][65383] Avg episode reward: [(0, '0.046')] [2024-06-12 13:32:49,711][65616] Updated weights for policy 0, policy_version 5610 (0.0024) [2024-06-12 13:32:52,811][65616] Updated weights for policy 0, policy_version 5620 (0.0031) [2024-06-12 13:32:54,332][65383] Fps is (10 sec: 49151.3, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 92143616. Throughput: 0: 51041.7. Samples: 92240420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 13:32:54,333][65383] Avg episode reward: [(0, '0.055')] [2024-06-12 13:32:56,271][65616] Updated weights for policy 0, policy_version 5630 (0.0022) [2024-06-12 13:32:59,332][65383] Fps is (10 sec: 47514.1, 60 sec: 50517.2, 300 sec: 50373.8). 
Total num frames: 92372992. Throughput: 0: 51006.7. Samples: 92536960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 13:32:59,333][65383] Avg episode reward: [(0, '0.053')] [2024-06-12 13:32:59,759][65616] Updated weights for policy 0, policy_version 5640 (0.0026) [2024-06-12 13:33:02,725][65616] Updated weights for policy 0, policy_version 5650 (0.0027) [2024-06-12 13:33:04,332][65383] Fps is (10 sec: 52429.5, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 92667904. Throughput: 0: 50737.3. Samples: 92685980. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-06-12 13:33:04,333][65383] Avg episode reward: [(0, '0.044')] [2024-06-12 13:33:06,212][65616] Updated weights for policy 0, policy_version 5660 (0.0027) [2024-06-12 13:33:09,333][65383] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 92880896. Throughput: 0: 50515.8. Samples: 92986320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 13:33:09,333][65383] Avg episode reward: [(0, '0.045')] [2024-06-12 13:33:09,432][65616] Updated weights for policy 0, policy_version 5670 (0.0026) [2024-06-12 13:33:12,927][65616] Updated weights for policy 0, policy_version 5680 (0.0027) [2024-06-12 13:33:14,332][65383] Fps is (10 sec: 49151.7, 60 sec: 51336.6, 300 sec: 50540.5). Total num frames: 93159424. Throughput: 0: 50450.2. Samples: 93281980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 13:33:14,333][65383] Avg episode reward: [(0, '0.047')] [2024-06-12 13:33:16,070][65616] Updated weights for policy 0, policy_version 5690 (0.0031) [2024-06-12 13:33:19,332][65383] Fps is (10 sec: 47514.7, 60 sec: 49698.2, 300 sec: 50262.8). Total num frames: 93356032. Throughput: 0: 50411.2. Samples: 93428080. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 13:33:19,333][65383] Avg episode reward: [(0, '0.052')] [2024-06-12 13:33:19,526][65616] Updated weights for policy 0, policy_version 5700 (0.0035) [2024-06-12 13:33:22,359][65616] Updated weights for policy 0, policy_version 5710 (0.0019) [2024-06-12 13:33:24,332][65383] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 93650944. Throughput: 0: 50548.8. Samples: 93744840. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-12 13:33:24,333][65383] Avg episode reward: [(0, '0.041')] [2024-06-12 13:33:25,593][65616] Updated weights for policy 0, policy_version 5720 (0.0024) [2024-06-12 13:33:28,758][65616] Updated weights for policy 0, policy_version 5730 (0.0021) [2024-06-12 13:33:29,332][65383] Fps is (10 sec: 54067.4, 60 sec: 50790.5, 300 sec: 50318.3). Total num frames: 93896704. Throughput: 0: 50288.0. Samples: 94044640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-12 13:33:29,333][65383] Avg episode reward: [(0, '0.053')] [2024-06-12 13:33:32,188][65616] Updated weights for policy 0, policy_version 5740 (0.0023) [2024-06-12 13:33:32,542][65595] Signal inference workers to stop experience collection... (1350 times) [2024-06-12 13:33:32,589][65616] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-06-12 13:33:32,595][65595] Signal inference workers to resume experience collection... (1350 times) [2024-06-12 13:33:32,605][65616] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-06-12 13:33:34,332][65383] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50374.2). Total num frames: 94158848. Throughput: 0: 50386.8. Samples: 94201780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 13:33:34,333][65383] Avg episode reward: [(0, '0.058')] [2024-06-12 13:33:35,126][65616] Updated weights for policy 0, policy_version 5750 (0.0028) [2024-06-12 13:33:38,658][65616] Updated weights for policy 0, policy_version 5760 (0.0031) [2024-06-12 13:33:39,332][65383] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50484.9). Total num frames: 94404608. Throughput: 0: 50211.7. Samples: 94499940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 13:33:39,333][65383] Avg episode reward: [(0, '0.043')] [2024-06-12 13:33:41,614][65616] Updated weights for policy 0, policy_version 5770 (0.0030) [2024-06-12 13:33:44,332][65383] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 94666752. Throughput: 0: 50392.0. Samples: 94804600. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-12 13:33:44,333][65383] Avg episode reward: [(0, '0.044')] [2024-06-12 13:33:45,002][65616] Updated weights for policy 0, policy_version 5780 (0.0025) [2024-06-12 13:33:47,929][65616] Updated weights for policy 0, policy_version 5790 (0.0025) [2024-06-12 13:33:49,332][65383] Fps is (10 sec: 54067.4, 60 sec: 50790.6, 300 sec: 50540.5). Total num frames: 94945280. Throughput: 0: 50552.9. Samples: 94960860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-12 13:33:49,333][65383] Avg episode reward: [(0, '0.041')] [2024-06-12 13:33:51,123][65616] Updated weights for policy 0, policy_version 5800 (0.0021) [2024-06-12 13:33:54,332][65383] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50485.0). Total num frames: 95174656. Throughput: 0: 50790.9. Samples: 95271900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 13:33:54,333][65383] Avg episode reward: [(0, '0.044')] [2024-06-12 13:33:54,361][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000005810_95191040.pth... 
[2024-06-12 13:33:54,367][65616] Updated weights for policy 0, policy_version 5810 (0.0027) [2024-06-12 13:33:54,411][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000005068_83034112.pth [2024-06-12 13:33:57,650][65616] Updated weights for policy 0, policy_version 5820 (0.0025) [2024-06-12 13:33:59,332][65383] Fps is (10 sec: 49151.7, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 95436800. Throughput: 0: 50879.1. Samples: 95571540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 13:33:59,333][65383] Avg episode reward: [(0, '0.044')] [2024-06-12 13:34:00,635][65616] Updated weights for policy 0, policy_version 5830 (0.0024) [2024-06-12 13:34:04,159][65616] Updated weights for policy 0, policy_version 5840 (0.0028) [2024-06-12 13:34:04,332][65383] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50596.0). Total num frames: 95682560. Throughput: 0: 51031.9. Samples: 95724520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 13:34:04,333][65383] Avg episode reward: [(0, '0.051')] [2024-06-12 13:34:07,137][65616] Updated weights for policy 0, policy_version 5850 (0.0023) [2024-06-12 13:34:09,332][65383] Fps is (10 sec: 50790.6, 60 sec: 51063.7, 300 sec: 50596.0). Total num frames: 95944704. Throughput: 0: 50760.1. Samples: 96029040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 13:34:09,333][65383] Avg episode reward: [(0, '0.044')] [2024-06-12 13:34:10,414][65616] Updated weights for policy 0, policy_version 5860 (0.0033) [2024-06-12 13:34:13,753][65616] Updated weights for policy 0, policy_version 5870 (0.0019) [2024-06-12 13:34:14,333][65383] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 50540.4). Total num frames: 96206848. Throughput: 0: 50907.7. Samples: 96335500. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 13:34:14,333][65383] Avg episode reward: [(0, '0.042')] [2024-06-12 13:34:17,024][65616] Updated weights for policy 0, policy_version 5880 (0.0026) [2024-06-12 13:34:19,332][65383] Fps is (10 sec: 49152.0, 60 sec: 51336.5, 300 sec: 50484.9). Total num frames: 96436224. Throughput: 0: 50784.1. Samples: 96487060. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-12 13:34:19,333][65383] Avg episode reward: [(0, '0.044')] [2024-06-12 13:34:20,221][65616] Updated weights for policy 0, policy_version 5890 (0.0032) [2024-06-12 13:34:23,237][65616] Updated weights for policy 0, policy_version 5900 (0.0021) [2024-06-12 13:34:24,332][65383] Fps is (10 sec: 50790.7, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 96714752. Throughput: 0: 50982.5. Samples: 96794160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 13:34:24,333][65383] Avg episode reward: [(0, '0.052')] [2024-06-12 13:34:26,478][65616] Updated weights for policy 0, policy_version 5910 (0.0027) [2024-06-12 13:34:29,332][65383] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50651.6). Total num frames: 96960512. Throughput: 0: 50849.8. Samples: 97092840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 13:34:29,333][65383] Avg episode reward: [(0, '0.056')] [2024-06-12 13:34:29,980][65616] Updated weights for policy 0, policy_version 5920 (0.0023) [2024-06-12 13:34:32,958][65616] Updated weights for policy 0, policy_version 5930 (0.0031) [2024-06-12 13:34:34,332][65383] Fps is (10 sec: 50791.3, 60 sec: 51063.5, 300 sec: 50596.3). Total num frames: 97222656. Throughput: 0: 50977.8. Samples: 97254860. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 13:34:34,333][65383] Avg episode reward: [(0, '0.047')] [2024-06-12 13:34:36,189][65616] Updated weights for policy 0, policy_version 5940 (0.0028) [2024-06-12 13:34:39,288][65616] Updated weights for policy 0, policy_version 5950 (0.0027) [2024-06-12 13:34:39,332][65383] Fps is (10 sec: 52428.8, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 97484800. Throughput: 0: 50785.3. Samples: 97557240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 13:34:39,333][65383] Avg episode reward: [(0, '0.047')] [2024-06-12 13:34:42,788][65616] Updated weights for policy 0, policy_version 5960 (0.0022) [2024-06-12 13:34:44,333][65383] Fps is (10 sec: 50788.9, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 97730560. Throughput: 0: 50898.8. Samples: 97862000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 13:34:44,333][65383] Avg episode reward: [(0, '0.051')] [2024-06-12 13:34:45,783][65616] Updated weights for policy 0, policy_version 5970 (0.0024) [2024-06-12 13:34:48,748][65595] Signal inference workers to stop experience collection... (1400 times) [2024-06-12 13:34:48,800][65616] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-06-12 13:34:48,801][65595] Signal inference workers to resume experience collection... (1400 times) [2024-06-12 13:34:48,813][65616] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-06-12 13:34:48,936][65616] Updated weights for policy 0, policy_version 5980 (0.0030) [2024-06-12 13:34:49,332][65383] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 97992704. Throughput: 0: 51027.2. Samples: 98020740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 13:34:49,333][65383] Avg episode reward: [(0, '0.042')] [2024-06-12 13:34:51,891][65616] Updated weights for policy 0, policy_version 5990 (0.0022) [2024-06-12 13:34:54,332][65383] Fps is (10 sec: 50791.8, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 98238464. Throughput: 0: 51108.0. Samples: 98328900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 13:34:54,333][65383] Avg episode reward: [(0, '0.049')] [2024-06-12 13:34:55,605][65616] Updated weights for policy 0, policy_version 6000 (0.0030) [2024-06-12 13:34:58,535][65616] Updated weights for policy 0, policy_version 6010 (0.0027) [2024-06-12 13:34:59,332][65383] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 98484224. Throughput: 0: 50904.7. Samples: 98626200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 27.0) [2024-06-12 13:34:59,333][65383] Avg episode reward: [(0, '0.054')] [2024-06-12 13:35:02,004][65616] Updated weights for policy 0, policy_version 6020 (0.0023) [2024-06-12 13:35:04,332][65383] Fps is (10 sec: 50790.3, 60 sec: 51063.6, 300 sec: 50651.6). Total num frames: 98746368. Throughput: 0: 51036.9. Samples: 98783720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 13:35:04,333][65383] Avg episode reward: [(0, '0.043')] [2024-06-12 13:35:05,150][65616] Updated weights for policy 0, policy_version 6030 (0.0030) [2024-06-12 13:35:08,470][65616] Updated weights for policy 0, policy_version 6040 (0.0035) [2024-06-12 13:35:09,332][65383] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 98992128. Throughput: 0: 50985.5. Samples: 99088500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 13:35:09,333][65383] Avg episode reward: [(0, '0.052')] [2024-06-12 13:35:11,514][65616] Updated weights for policy 0, policy_version 6050 (0.0035) [2024-06-12 13:35:14,333][65383] Fps is (10 sec: 49151.0, 60 sec: 50517.3, 300 sec: 50707.1). 
Total num frames: 99237888. Throughput: 0: 51046.5. Samples: 99389940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 13:35:14,333][65383] Avg episode reward: [(0, '0.053')] [2024-06-12 13:35:14,785][65616] Updated weights for policy 0, policy_version 6060 (0.0024) [2024-06-12 13:35:17,937][65616] Updated weights for policy 0, policy_version 6070 (0.0027) [2024-06-12 13:35:19,333][65383] Fps is (10 sec: 50789.5, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 99500032. Throughput: 0: 50870.9. Samples: 99544060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 13:35:19,333][65383] Avg episode reward: [(0, '0.056')] [2024-06-12 13:35:21,399][65616] Updated weights for policy 0, policy_version 6080 (0.0029) [2024-06-12 13:35:24,332][65383] Fps is (10 sec: 52429.6, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 99762176. Throughput: 0: 50872.0. Samples: 99846480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 13:35:24,333][65383] Avg episode reward: [(0, '0.051')] [2024-06-12 13:35:24,675][65616] Updated weights for policy 0, policy_version 6090 (0.0025) [2024-06-12 13:35:27,637][65616] Updated weights for policy 0, policy_version 6100 (0.0038) [2024-06-12 13:35:29,332][65383] Fps is (10 sec: 52429.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 100024320. Throughput: 0: 50850.0. Samples: 100150240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 13:35:29,333][65383] Avg episode reward: [(0, '0.052')] [2024-06-12 13:35:31,278][65616] Updated weights for policy 0, policy_version 6110 (0.0029) [2024-06-12 13:35:34,332][65383] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 100253696. Throughput: 0: 50814.3. Samples: 100307380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 13:35:34,333][65383] Avg episode reward: [(0, '0.054')] [2024-06-12 13:35:34,426][65616] Updated weights for policy 0, policy_version 6120 (0.0026) [2024-06-12 13:35:37,743][65616] Updated weights for policy 0, policy_version 6130 (0.0031) [2024-06-12 13:35:39,332][65383] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 100515840. Throughput: 0: 50744.9. Samples: 100612420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 13:35:39,333][65383] Avg episode reward: [(0, '0.058')] [2024-06-12 13:35:40,713][65616] Updated weights for policy 0, policy_version 6140 (0.0034) [2024-06-12 13:35:44,147][65616] Updated weights for policy 0, policy_version 6150 (0.0026) [2024-06-12 13:35:44,332][65383] Fps is (10 sec: 50789.7, 60 sec: 50517.5, 300 sec: 50651.5). Total num frames: 100761600. Throughput: 0: 50859.8. Samples: 100914900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 13:35:44,333][65383] Avg episode reward: [(0, '0.054')] [2024-06-12 13:35:47,027][65616] Updated weights for policy 0, policy_version 6160 (0.0030) [2024-06-12 13:35:49,332][65383] Fps is (10 sec: 54067.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 101056512. Throughput: 0: 50756.0. Samples: 101067740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 13:35:49,333][65383] Avg episode reward: [(0, '0.049')] [2024-06-12 13:35:50,177][65595] Signal inference workers to stop experience collection... (1450 times) [2024-06-12 13:35:50,178][65595] Signal inference workers to resume experience collection... 
(1450 times) [2024-06-12 13:35:50,189][65616] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-06-12 13:35:50,189][65616] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-06-12 13:35:50,316][65616] Updated weights for policy 0, policy_version 6170 (0.0026) [2024-06-12 13:35:53,442][65616] Updated weights for policy 0, policy_version 6180 (0.0031) [2024-06-12 13:35:54,332][65383] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 101269504. Throughput: 0: 50936.4. Samples: 101380640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 13:35:54,333][65383] Avg episode reward: [(0, '0.053')] [2024-06-12 13:35:54,404][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000006182_101285888.pth... [2024-06-12 13:35:54,463][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000005438_89096192.pth [2024-06-12 13:35:56,949][65616] Updated weights for policy 0, policy_version 6190 (0.0023) [2024-06-12 13:35:59,332][65383] Fps is (10 sec: 49152.2, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 101548032. Throughput: 0: 51046.5. Samples: 101687020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 13:35:59,333][65383] Avg episode reward: [(0, '0.052')] [2024-06-12 13:35:59,838][65616] Updated weights for policy 0, policy_version 6200 (0.0022) [2024-06-12 13:36:03,646][65616] Updated weights for policy 0, policy_version 6210 (0.0027) [2024-06-12 13:36:04,332][65383] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 101793792. Throughput: 0: 50873.5. Samples: 101833360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 27.0) [2024-06-12 13:36:04,333][65383] Avg episode reward: [(0, '0.056')] [2024-06-12 13:36:06,202][65616] Updated weights for policy 0, policy_version 6220 (0.0025) [2024-06-12 13:36:09,332][65383] Fps is (10 sec: 50789.9, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 102055936. 
Throughput: 0: 51198.6. Samples: 102150420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-12 13:36:09,333][65383] Avg episode reward: [(0, '0.053')] [2024-06-12 13:36:09,563][65616] Updated weights for policy 0, policy_version 6230 (0.0027) [2024-06-12 13:36:12,646][65616] Updated weights for policy 0, policy_version 6240 (0.0019) [2024-06-12 13:36:14,332][65383] Fps is (10 sec: 54067.4, 60 sec: 51609.8, 300 sec: 50873.7). Total num frames: 102334464. Throughput: 0: 51319.7. Samples: 102459620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-12 13:36:14,333][65383] Avg episode reward: [(0, '0.059')] [2024-06-12 13:36:14,347][65595] Saving new best policy, reward=0.059! [2024-06-12 13:36:15,755][65616] Updated weights for policy 0, policy_version 6250 (0.0033) [2024-06-12 13:36:18,736][65616] Updated weights for policy 0, policy_version 6260 (0.0023) [2024-06-12 13:36:19,332][65383] Fps is (10 sec: 52429.1, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 102580224. Throughput: 0: 51235.0. Samples: 102612960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 13:36:19,333][65383] Avg episode reward: [(0, '0.058')] [2024-06-12 13:36:22,770][65616] Updated weights for policy 0, policy_version 6270 (0.0030) [2024-06-12 13:36:24,332][65383] Fps is (10 sec: 49152.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 102825984. Throughput: 0: 51457.3. Samples: 102928000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 13:36:24,333][65383] Avg episode reward: [(0, '0.059')] [2024-06-12 13:36:25,265][65616] Updated weights for policy 0, policy_version 6280 (0.0026) [2024-06-12 13:36:28,990][65616] Updated weights for policy 0, policy_version 6290 (0.0040) [2024-06-12 13:36:29,332][65383] Fps is (10 sec: 47513.6, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 103055360. Throughput: 0: 51164.1. Samples: 103217280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 13:36:29,333][65383] Avg episode reward: [(0, '0.054')] [2024-06-12 13:36:32,050][65616] Updated weights for policy 0, policy_version 6300 (0.0027) [2024-06-12 13:36:34,332][65383] Fps is (10 sec: 52428.9, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 103350272. Throughput: 0: 51084.5. Samples: 103366540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 13:36:34,333][65383] Avg episode reward: [(0, '0.059')] [2024-06-12 13:36:35,505][65616] Updated weights for policy 0, policy_version 6310 (0.0030) [2024-06-12 13:36:38,516][65616] Updated weights for policy 0, policy_version 6320 (0.0029) [2024-06-12 13:36:39,332][65383] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 103579648. Throughput: 0: 50916.5. Samples: 103671880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 13:36:39,333][65383] Avg episode reward: [(0, '0.058')] [2024-06-12 13:36:41,907][65616] Updated weights for policy 0, policy_version 6330 (0.0035) [2024-06-12 13:36:44,332][65383] Fps is (10 sec: 49151.5, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 103841792. Throughput: 0: 50861.7. Samples: 103975800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 13:36:44,333][65383] Avg episode reward: [(0, '0.053')] [2024-06-12 13:36:44,723][65616] Updated weights for policy 0, policy_version 6340 (0.0026) [2024-06-12 13:36:48,170][65616] Updated weights for policy 0, policy_version 6350 (0.0026) [2024-06-12 13:36:49,332][65383] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 104087552. Throughput: 0: 51100.4. Samples: 104132880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 13:36:49,333][65383] Avg episode reward: [(0, '0.064')] [2024-06-12 13:36:49,333][65595] Saving new best policy, reward=0.064! 
[2024-06-12 13:36:51,102][65616] Updated weights for policy 0, policy_version 6360 (0.0023) [2024-06-12 13:36:52,233][65595] Signal inference workers to stop experience collection... (1500 times) [2024-06-12 13:36:52,279][65616] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-06-12 13:36:52,284][65595] Signal inference workers to resume experience collection... (1500 times) [2024-06-12 13:36:52,291][65616] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-06-12 13:36:54,332][65383] Fps is (10 sec: 50790.6, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 104349696. Throughput: 0: 50826.7. Samples: 104437620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-12 13:36:54,333][65383] Avg episode reward: [(0, '0.057')] [2024-06-12 13:36:54,660][65616] Updated weights for policy 0, policy_version 6370 (0.0024) [2024-06-12 13:36:57,625][65616] Updated weights for policy 0, policy_version 6380 (0.0022) [2024-06-12 13:36:59,332][65383] Fps is (10 sec: 52429.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 104611840. Throughput: 0: 50840.0. Samples: 104747420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-12 13:36:59,333][65383] Avg episode reward: [(0, '0.060')] [2024-06-12 13:37:00,935][65616] Updated weights for policy 0, policy_version 6390 (0.0023) [2024-06-12 13:37:03,770][65616] Updated weights for policy 0, policy_version 6400 (0.0028) [2024-06-12 13:37:04,333][65383] Fps is (10 sec: 52428.1, 60 sec: 51336.4, 300 sec: 50984.8). Total num frames: 104873984. Throughput: 0: 51031.8. Samples: 104909400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-12 13:37:04,333][65383] Avg episode reward: [(0, '0.058')] [2024-06-12 13:37:06,923][65616] Updated weights for policy 0, policy_version 6410 (0.0028) [2024-06-12 13:37:09,332][65383] Fps is (10 sec: 50789.8, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 105119744. Throughput: 0: 50762.5. Samples: 105212320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 13:37:09,333][65383] Avg episode reward: [(0, '0.043')] [2024-06-12 13:37:10,123][65616] Updated weights for policy 0, policy_version 6420 (0.0030) [2024-06-12 13:37:13,650][65616] Updated weights for policy 0, policy_version 6430 (0.0026) [2024-06-12 13:37:14,332][65383] Fps is (10 sec: 49152.8, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 105365504. Throughput: 0: 51106.3. Samples: 105517060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 13:37:14,333][65383] Avg episode reward: [(0, '0.055')] [2024-06-12 13:37:16,474][65616] Updated weights for policy 0, policy_version 6440 (0.0024) [2024-06-12 13:37:19,332][65383] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 105627648. Throughput: 0: 51065.2. Samples: 105664480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 13:37:19,333][65383] Avg episode reward: [(0, '0.069')] [2024-06-12 13:37:19,367][65595] Saving new best policy, reward=0.069! [2024-06-12 13:37:19,940][65616] Updated weights for policy 0, policy_version 6450 (0.0031) [2024-06-12 13:37:22,968][65616] Updated weights for policy 0, policy_version 6460 (0.0025) [2024-06-12 13:37:24,332][65383] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 105889792. Throughput: 0: 51311.5. Samples: 105980900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 13:37:24,333][65383] Avg episode reward: [(0, '0.060')] [2024-06-12 13:37:26,487][65616] Updated weights for policy 0, policy_version 6470 (0.0032) [2024-06-12 13:37:29,332][65383] Fps is (10 sec: 52429.2, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 106151936. Throughput: 0: 51283.6. Samples: 106283560. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-12 13:37:29,333][65383] Avg episode reward: [(0, '0.067')] [2024-06-12 13:37:29,406][65616] Updated weights for policy 0, policy_version 6480 (0.0028) [2024-06-12 13:37:33,131][65616] Updated weights for policy 0, policy_version 6490 (0.0034) [2024-06-12 13:37:34,332][65383] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 106397696. Throughput: 0: 51218.4. Samples: 106437700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 13:37:34,333][65383] Avg episode reward: [(0, '0.053')] [2024-06-12 13:37:35,931][65616] Updated weights for policy 0, policy_version 6500 (0.0030) [2024-06-12 13:37:39,333][65383] Fps is (10 sec: 47513.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 106627072. Throughput: 0: 51057.7. Samples: 106735220. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-12 13:37:39,333][65383] Avg episode reward: [(0, '0.062')] [2024-06-12 13:37:39,649][65616] Updated weights for policy 0, policy_version 6510 (0.0025) [2024-06-12 13:37:42,261][65616] Updated weights for policy 0, policy_version 6520 (0.0026) [2024-06-12 13:37:44,332][65383] Fps is (10 sec: 49152.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 106889216. Throughput: 0: 51031.6. Samples: 107043840. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-12 13:37:44,333][65383] Avg episode reward: [(0, '0.060')] [2024-06-12 13:37:45,714][65616] Updated weights for policy 0, policy_version 6530 (0.0024) [2024-06-12 13:37:48,773][65616] Updated weights for policy 0, policy_version 6540 (0.0022) [2024-06-12 13:37:49,332][65383] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 107151360. Throughput: 0: 50791.8. Samples: 107195020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 13:37:49,333][65383] Avg episode reward: [(0, '0.062')] [2024-06-12 13:37:52,039][65595] Signal inference workers to stop experience collection... 
(1550 times) [2024-06-12 13:37:52,051][65616] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-06-12 13:37:52,097][65595] Signal inference workers to resume experience collection... (1550 times) [2024-06-12 13:37:52,097][65616] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-06-12 13:37:52,252][65616] Updated weights for policy 0, policy_version 6550 (0.0023) [2024-06-12 13:37:54,332][65383] Fps is (10 sec: 54066.8, 60 sec: 51336.6, 300 sec: 51040.3). Total num frames: 107429888. Throughput: 0: 50853.0. Samples: 107500700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 13:37:54,333][65383] Avg episode reward: [(0, '0.054')] [2024-06-12 13:37:54,337][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000006557_107429888.pth... [2024-06-12 13:37:54,377][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000005810_95191040.pth [2024-06-12 13:37:55,495][65616] Updated weights for policy 0, policy_version 6560 (0.0035) [2024-06-12 13:37:58,732][65616] Updated weights for policy 0, policy_version 6570 (0.0027) [2024-06-12 13:37:59,333][65383] Fps is (10 sec: 52428.1, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 107675648. Throughput: 0: 50878.1. Samples: 107806580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-12 13:37:59,333][65383] Avg episode reward: [(0, '0.070')] [2024-06-12 13:37:59,334][65595] Saving new best policy, reward=0.070! [2024-06-12 13:38:01,628][65616] Updated weights for policy 0, policy_version 6580 (0.0028) [2024-06-12 13:38:04,332][65383] Fps is (10 sec: 45875.1, 60 sec: 50244.4, 300 sec: 50873.7). Total num frames: 107888640. Throughput: 0: 50873.0. Samples: 107953760. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 13:38:04,333][65383] Avg episode reward: [(0, '0.053')] [2024-06-12 13:38:05,029][65616] Updated weights for policy 0, policy_version 6590 (0.0031) [2024-06-12 13:38:08,362][65616] Updated weights for policy 0, policy_version 6600 (0.0030) [2024-06-12 13:38:09,332][65383] Fps is (10 sec: 49152.6, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 108167168. Throughput: 0: 50767.6. Samples: 108265440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 13:38:09,333][65383] Avg episode reward: [(0, '0.057')] [2024-06-12 13:38:11,564][65616] Updated weights for policy 0, policy_version 6610 (0.0027) [2024-06-12 13:38:14,332][65383] Fps is (10 sec: 52428.5, 60 sec: 50790.3, 300 sec: 51040.3). Total num frames: 108412928. Throughput: 0: 50561.7. Samples: 108558840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 13:38:14,333][65383] Avg episode reward: [(0, '0.077')] [2024-06-12 13:38:14,341][65595] Saving new best policy, reward=0.077! [2024-06-12 13:38:14,696][65616] Updated weights for policy 0, policy_version 6620 (0.0027) [2024-06-12 13:38:18,091][65616] Updated weights for policy 0, policy_version 6630 (0.0031) [2024-06-12 13:38:19,332][65383] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 108691456. Throughput: 0: 50828.4. Samples: 108724980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 13:38:19,333][65383] Avg episode reward: [(0, '0.068')] [2024-06-12 13:38:21,353][65616] Updated weights for policy 0, policy_version 6640 (0.0025) [2024-06-12 13:38:24,332][65383] Fps is (10 sec: 52429.5, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 108937216. Throughput: 0: 51078.0. Samples: 109033720. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-12 13:38:24,333][65383] Avg episode reward: [(0, '0.075')] [2024-06-12 13:38:24,346][65616] Updated weights for policy 0, policy_version 6650 (0.0020) [2024-06-12 13:38:27,818][65616] Updated weights for policy 0, policy_version 6660 (0.0026) [2024-06-12 13:38:29,332][65383] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 109166592. Throughput: 0: 50903.5. Samples: 109334500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 24.0) [2024-06-12 13:38:29,333][65383] Avg episode reward: [(0, '0.061')] [2024-06-12 13:38:30,835][65616] Updated weights for policy 0, policy_version 6670 (0.0028) [2024-06-12 13:38:33,943][65616] Updated weights for policy 0, policy_version 6680 (0.0028) [2024-06-12 13:38:34,332][65383] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 109445120. Throughput: 0: 50903.1. Samples: 109485660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 24.0) [2024-06-12 13:38:34,333][65383] Avg episode reward: [(0, '0.061')] [2024-06-12 13:38:37,526][65616] Updated weights for policy 0, policy_version 6690 (0.0025) [2024-06-12 13:38:39,332][65383] Fps is (10 sec: 54067.1, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 109707264. Throughput: 0: 50854.6. Samples: 109789160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 13:38:39,333][65383] Avg episode reward: [(0, '0.072')] [2024-06-12 13:38:40,805][65616] Updated weights for policy 0, policy_version 6700 (0.0028) [2024-06-12 13:38:43,939][65616] Updated weights for policy 0, policy_version 6710 (0.0024) [2024-06-12 13:38:44,332][65383] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 109953024. Throughput: 0: 50956.1. Samples: 110099600. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 13:38:44,333][65383] Avg episode reward: [(0, '0.072')] [2024-06-12 13:38:47,047][65616] Updated weights for policy 0, policy_version 6720 (0.0024) [2024-06-12 13:38:49,332][65383] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 110198784. Throughput: 0: 50992.4. Samples: 110248420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 13:38:49,333][65383] Avg episode reward: [(0, '0.072')] [2024-06-12 13:38:50,554][65616] Updated weights for policy 0, policy_version 6730 (0.0021) [2024-06-12 13:38:53,434][65616] Updated weights for policy 0, policy_version 6740 (0.0030) [2024-06-12 13:38:54,332][65383] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 110460928. Throughput: 0: 50786.1. Samples: 110550820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 13:38:54,333][65383] Avg episode reward: [(0, '0.066')] [2024-06-12 13:38:56,942][65616] Updated weights for policy 0, policy_version 6750 (0.0029) [2024-06-12 13:38:57,476][65595] Signal inference workers to stop experience collection... (1600 times) [2024-06-12 13:38:57,519][65616] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-06-12 13:38:57,529][65595] Signal inference workers to resume experience collection... (1600 times) [2024-06-12 13:38:57,533][65616] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-06-12 13:38:59,332][65383] Fps is (10 sec: 52429.6, 60 sec: 50790.6, 300 sec: 50984.8). Total num frames: 110723072. Throughput: 0: 51030.4. Samples: 110855200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 13:38:59,333][65383] Avg episode reward: [(0, '0.070')] [2024-06-12 13:38:59,631][65616] Updated weights for policy 0, policy_version 6760 (0.0019) [2024-06-12 13:39:03,366][65616] Updated weights for policy 0, policy_version 6770 (0.0027) [2024-06-12 13:39:04,332][65383] Fps is (10 sec: 52429.3, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 110985216. Throughput: 0: 50793.8. Samples: 111010700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-12 13:39:04,333][65383] Avg episode reward: [(0, '0.066')] [2024-06-12 13:39:06,078][65616] Updated weights for policy 0, policy_version 6780 (0.0035) [2024-06-12 13:39:09,332][65383] Fps is (10 sec: 49151.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 111214592. Throughput: 0: 50693.3. Samples: 111314920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-12 13:39:09,333][65383] Avg episode reward: [(0, '0.065')] [2024-06-12 13:39:09,828][65616] Updated weights for policy 0, policy_version 6790 (0.0034) [2024-06-12 13:39:12,616][65616] Updated weights for policy 0, policy_version 6800 (0.0026) [2024-06-12 13:39:14,332][65383] Fps is (10 sec: 47513.2, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 111460352. Throughput: 0: 50735.5. Samples: 111617600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 13:39:14,333][65383] Avg episode reward: [(0, '0.065')] [2024-06-12 13:39:16,335][65616] Updated weights for policy 0, policy_version 6810 (0.0029) [2024-06-12 13:39:19,168][65616] Updated weights for policy 0, policy_version 6820 (0.0024) [2024-06-12 13:39:19,332][65383] Fps is (10 sec: 52428.2, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 111738880. Throughput: 0: 50622.1. Samples: 111763660. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 13:39:19,333][65383] Avg episode reward: [(0, '0.066')] [2024-06-12 13:39:22,453][65616] Updated weights for policy 0, policy_version 6830 (0.0031) [2024-06-12 13:39:24,332][65383] Fps is (10 sec: 54067.1, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 112001024. Throughput: 0: 50845.7. Samples: 112077220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 13:39:24,333][65383] Avg episode reward: [(0, '0.066')] [2024-06-12 13:39:25,571][65616] Updated weights for policy 0, policy_version 6840 (0.0024) [2024-06-12 13:39:29,047][65616] Updated weights for policy 0, policy_version 6850 (0.0022) [2024-06-12 13:39:29,336][65383] Fps is (10 sec: 49135.2, 60 sec: 51060.5, 300 sec: 50873.1). Total num frames: 112230400. Throughput: 0: 50873.9. Samples: 112389100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 13:39:29,336][65383] Avg episode reward: [(0, '0.071')] [2024-06-12 13:39:32,128][65616] Updated weights for policy 0, policy_version 6860 (0.0033) [2024-06-12 13:39:34,332][65383] Fps is (10 sec: 47513.7, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 112476160. Throughput: 0: 50797.4. Samples: 112534300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 13:39:34,333][65383] Avg episode reward: [(0, '0.068')] [2024-06-12 13:39:35,266][65616] Updated weights for policy 0, policy_version 6870 (0.0033) [2024-06-12 13:39:38,634][65616] Updated weights for policy 0, policy_version 6880 (0.0035) [2024-06-12 13:39:39,332][65383] Fps is (10 sec: 49169.1, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 112721920. Throughput: 0: 50716.1. Samples: 112833040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 13:39:39,333][65383] Avg episode reward: [(0, '0.079')] [2024-06-12 13:39:39,392][65595] Saving new best policy, reward=0.079! 
[2024-06-12 13:39:41,751][65616] Updated weights for policy 0, policy_version 6890 (0.0025) [2024-06-12 13:39:44,332][65383] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 113000448. Throughput: 0: 50739.0. Samples: 113138460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 13:39:44,333][65383] Avg episode reward: [(0, '0.067')] [2024-06-12 13:39:45,057][65616] Updated weights for policy 0, policy_version 6900 (0.0028) [2024-06-12 13:39:48,134][65616] Updated weights for policy 0, policy_version 6910 (0.0026) [2024-06-12 13:39:49,332][65383] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 113246208. Throughput: 0: 50880.5. Samples: 113300320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 13:39:49,333][65383] Avg episode reward: [(0, '0.067')] [2024-06-12 13:39:51,504][65616] Updated weights for policy 0, policy_version 6920 (0.0026) [2024-06-12 13:39:54,332][65383] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 113508352. Throughput: 0: 50861.6. Samples: 113603700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-12 13:39:54,333][65383] Avg episode reward: [(0, '0.073')] [2024-06-12 13:39:54,342][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000006928_113508352.pth... [2024-06-12 13:39:54,406][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000006182_101285888.pth [2024-06-12 13:39:54,723][65616] Updated weights for policy 0, policy_version 6930 (0.0023) [2024-06-12 13:39:58,068][65616] Updated weights for policy 0, policy_version 6940 (0.0028) [2024-06-12 13:39:59,332][65383] Fps is (10 sec: 50789.9, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 113754112. Throughput: 0: 50782.7. Samples: 113902820. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 13:39:59,333][65383] Avg episode reward: [(0, '0.065')] [2024-06-12 13:40:01,101][65616] Updated weights for policy 0, policy_version 6950 (0.0036) [2024-06-12 13:40:04,332][65383] Fps is (10 sec: 49152.3, 60 sec: 50244.2, 300 sec: 50873.7). Total num frames: 113999872. Throughput: 0: 50768.0. Samples: 114048220. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-12 13:40:04,333][65383] Avg episode reward: [(0, '0.074')] [2024-06-12 13:40:04,737][65616] Updated weights for policy 0, policy_version 6960 (0.0025) [2024-06-12 13:40:07,731][65616] Updated weights for policy 0, policy_version 6970 (0.0033) [2024-06-12 13:40:09,286][65595] Signal inference workers to stop experience collection... (1650 times) [2024-06-12 13:40:09,332][65383] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 114245632. Throughput: 0: 50385.0. Samples: 114344540. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-12 13:40:09,333][65383] Avg episode reward: [(0, '0.068')] [2024-06-12 13:40:09,335][65616] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-06-12 13:40:09,337][65595] Signal inference workers to resume experience collection... (1650 times) [2024-06-12 13:40:09,344][65616] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-06-12 13:40:10,803][65616] Updated weights for policy 0, policy_version 6980 (0.0022) [2024-06-12 13:40:13,979][65616] Updated weights for policy 0, policy_version 6990 (0.0022) [2024-06-12 13:40:14,332][65383] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 114524160. Throughput: 0: 50469.3. Samples: 114660040. 
Policy #0 lag: (min: 1.0, avg: 13.3, max: 25.0) [2024-06-12 13:40:14,333][65383] Avg episode reward: [(0, '0.072')] [2024-06-12 13:40:17,262][65616] Updated weights for policy 0, policy_version 7000 (0.0029) [2024-06-12 13:40:19,332][65383] Fps is (10 sec: 50790.7, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 114753536. Throughput: 0: 50685.4. Samples: 114815140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 13:40:19,333][65383] Avg episode reward: [(0, '0.068')] [2024-06-12 13:40:20,398][65616] Updated weights for policy 0, policy_version 7010 (0.0029) [2024-06-12 13:40:23,706][65616] Updated weights for policy 0, policy_version 7020 (0.0028) [2024-06-12 13:40:24,332][65383] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 115032064. Throughput: 0: 50712.9. Samples: 115115120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 13:40:24,333][65383] Avg episode reward: [(0, '0.062')] [2024-06-12 13:40:27,104][65616] Updated weights for policy 0, policy_version 7030 (0.0022) [2024-06-12 13:40:29,332][65383] Fps is (10 sec: 52428.3, 60 sec: 50793.3, 300 sec: 50929.2). Total num frames: 115277824. Throughput: 0: 50569.3. Samples: 115414080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 13:40:29,333][65383] Avg episode reward: [(0, '0.068')] [2024-06-12 13:40:30,201][65616] Updated weights for policy 0, policy_version 7040 (0.0027) [2024-06-12 13:40:33,589][65616] Updated weights for policy 0, policy_version 7050 (0.0020) [2024-06-12 13:40:34,332][65383] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 115523584. Throughput: 0: 50430.6. Samples: 115569700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 13:40:34,333][65383] Avg episode reward: [(0, '0.070')] [2024-06-12 13:40:36,691][65616] Updated weights for policy 0, policy_version 7060 (0.0022) [2024-06-12 13:40:39,333][65383] Fps is (10 sec: 49151.6, 60 sec: 50790.3, 300 sec: 50873.7). 
Total num frames: 115769344. Throughput: 0: 50505.7. Samples: 115876460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 13:40:39,333][65383] Avg episode reward: [(0, '0.078')] [2024-06-12 13:40:40,228][65616] Updated weights for policy 0, policy_version 7070 (0.0032) [2024-06-12 13:40:43,433][65616] Updated weights for policy 0, policy_version 7080 (0.0028) [2024-06-12 13:40:44,332][65383] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 116031488. Throughput: 0: 50544.9. Samples: 116177340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 13:40:44,333][65383] Avg episode reward: [(0, '0.062')] [2024-06-12 13:40:46,514][65616] Updated weights for policy 0, policy_version 7090 (0.0029) [2024-06-12 13:40:49,332][65383] Fps is (10 sec: 52429.9, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 116293632. Throughput: 0: 50709.5. Samples: 116330140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 13:40:49,333][65383] Avg episode reward: [(0, '0.066')] [2024-06-12 13:40:49,950][65616] Updated weights for policy 0, policy_version 7100 (0.0025) [2024-06-12 13:40:53,131][65616] Updated weights for policy 0, policy_version 7110 (0.0020) [2024-06-12 13:40:54,332][65383] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 116555776. Throughput: 0: 50830.1. Samples: 116631900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-12 13:40:54,333][65383] Avg episode reward: [(0, '0.068')] [2024-06-12 13:40:56,601][65616] Updated weights for policy 0, policy_version 7120 (0.0024) [2024-06-12 13:40:59,321][65616] Updated weights for policy 0, policy_version 7130 (0.0023) [2024-06-12 13:40:59,332][65383] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 116817920. Throughput: 0: 50732.4. Samples: 116943000. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-12 13:40:59,333][65383] Avg episode reward: [(0, '0.070')] [2024-06-12 13:41:02,790][65616] Updated weights for policy 0, policy_version 7140 (0.0026) [2024-06-12 13:41:04,332][65383] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 117047296. Throughput: 0: 50587.5. Samples: 117091580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 13:41:04,333][65383] Avg episode reward: [(0, '0.077')] [2024-06-12 13:41:06,166][65616] Updated weights for policy 0, policy_version 7150 (0.0026) [2024-06-12 13:41:09,332][65383] Fps is (10 sec: 47514.0, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 117293056. Throughput: 0: 50413.9. Samples: 117383740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-12 13:41:09,333][65383] Avg episode reward: [(0, '0.080')] [2024-06-12 13:41:09,378][65595] Saving new best policy, reward=0.080! [2024-06-12 13:41:09,384][65616] Updated weights for policy 0, policy_version 7160 (0.0022) [2024-06-12 13:41:12,530][65616] Updated weights for policy 0, policy_version 7170 (0.0027) [2024-06-12 13:41:14,332][65383] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 117555200. Throughput: 0: 50504.4. Samples: 117686780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 13:41:14,333][65383] Avg episode reward: [(0, '0.063')] [2024-06-12 13:41:16,108][65616] Updated weights for policy 0, policy_version 7180 (0.0022) [2024-06-12 13:41:18,854][65616] Updated weights for policy 0, policy_version 7190 (0.0034) [2024-06-12 13:41:19,332][65383] Fps is (10 sec: 52427.9, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 117817344. Throughput: 0: 50635.9. Samples: 117848320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-12 13:41:19,333][65383] Avg episode reward: [(0, '0.077')] [2024-06-12 13:41:21,566][65595] Signal inference workers to stop experience collection... 
(1700 times) [2024-06-12 13:41:21,566][65595] Signal inference workers to resume experience collection... (1700 times) [2024-06-12 13:41:21,575][65616] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-06-12 13:41:21,604][65616] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-06-12 13:41:22,643][65616] Updated weights for policy 0, policy_version 7200 (0.0024) [2024-06-12 13:41:24,332][65383] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 118063104. Throughput: 0: 50773.4. Samples: 118161260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-12 13:41:24,333][65383] Avg episode reward: [(0, '0.074')] [2024-06-12 13:41:25,066][65616] Updated weights for policy 0, policy_version 7210 (0.0025) [2024-06-12 13:41:28,849][65616] Updated weights for policy 0, policy_version 7220 (0.0025) [2024-06-12 13:41:29,332][65383] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 118308864. Throughput: 0: 50698.2. Samples: 118458760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 13:41:29,333][65383] Avg episode reward: [(0, '0.081')] [2024-06-12 13:41:31,684][65616] Updated weights for policy 0, policy_version 7230 (0.0032) [2024-06-12 13:41:34,332][65383] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 118571008. Throughput: 0: 50682.0. Samples: 118610840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 13:41:34,333][65383] Avg episode reward: [(0, '0.069')] [2024-06-12 13:41:35,295][65616] Updated weights for policy 0, policy_version 7240 (0.0024) [2024-06-12 13:41:37,980][65616] Updated weights for policy 0, policy_version 7250 (0.0030) [2024-06-12 13:41:39,333][65383] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 118833152. Throughput: 0: 50766.6. Samples: 118916400. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-12 13:41:39,333][65383] Avg episode reward: [(0, '0.076')] [2024-06-12 13:41:41,726][65616] Updated weights for policy 0, policy_version 7260 (0.0031) [2024-06-12 13:41:44,332][65383] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 119078912. Throughput: 0: 50397.7. Samples: 119210900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-12 13:41:44,333][65383] Avg episode reward: [(0, '0.077')] [2024-06-12 13:41:44,953][65616] Updated weights for policy 0, policy_version 7270 (0.0036) [2024-06-12 13:41:48,244][65616] Updated weights for policy 0, policy_version 7280 (0.0026) [2024-06-12 13:41:49,332][65383] Fps is (10 sec: 49152.4, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 119324672. Throughput: 0: 50468.8. Samples: 119362680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-12 13:41:49,333][65383] Avg episode reward: [(0, '0.070')] [2024-06-12 13:41:51,398][65616] Updated weights for policy 0, policy_version 7290 (0.0025) [2024-06-12 13:41:54,332][65383] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 119570432. Throughput: 0: 50740.4. Samples: 119667060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 13:41:54,333][65383] Avg episode reward: [(0, '0.076')] [2024-06-12 13:41:54,512][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000007300_119603200.pth... [2024-06-12 13:41:54,516][65616] Updated weights for policy 0, policy_version 7300 (0.0027) [2024-06-12 13:41:54,550][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000006557_107429888.pth [2024-06-12 13:41:57,764][65616] Updated weights for policy 0, policy_version 7310 (0.0034) [2024-06-12 13:41:59,332][65383] Fps is (10 sec: 50790.9, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 119832576. Throughput: 0: 50642.8. Samples: 119965700. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 13:41:59,333][65383] Avg episode reward: [(0, '0.080')] [2024-06-12 13:42:01,288][65616] Updated weights for policy 0, policy_version 7320 (0.0025) [2024-06-12 13:42:04,330][65616] Updated weights for policy 0, policy_version 7330 (0.0034) [2024-06-12 13:42:04,332][65383] Fps is (10 sec: 52428.5, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 120094720. Throughput: 0: 50684.5. Samples: 120129120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 13:42:04,333][65383] Avg episode reward: [(0, '0.069')] [2024-06-12 13:42:07,950][65616] Updated weights for policy 0, policy_version 7340 (0.0024) [2024-06-12 13:42:09,332][65383] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 120340480. Throughput: 0: 50434.8. Samples: 120430820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 13:42:09,333][65383] Avg episode reward: [(0, '0.073')] [2024-06-12 13:42:10,561][65616] Updated weights for policy 0, policy_version 7350 (0.0025) [2024-06-12 13:42:14,332][65383] Fps is (10 sec: 47513.4, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 120569856. Throughput: 0: 50430.2. Samples: 120728120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 13:42:14,333][65383] Avg episode reward: [(0, '0.076')] [2024-06-12 13:42:14,528][65616] Updated weights for policy 0, policy_version 7360 (0.0029) [2024-06-12 13:42:17,241][65616] Updated weights for policy 0, policy_version 7370 (0.0030) [2024-06-12 13:42:19,332][65383] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 120832000. Throughput: 0: 50239.1. Samples: 120871600. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 13:42:19,333][65383] Avg episode reward: [(0, '0.079')] [2024-06-12 13:42:20,754][65616] Updated weights for policy 0, policy_version 7380 (0.0020) [2024-06-12 13:42:23,555][65616] Updated weights for policy 0, policy_version 7390 (0.0025) [2024-06-12 13:42:24,332][65383] Fps is (10 sec: 50791.1, 60 sec: 50244.4, 300 sec: 50596.0). Total num frames: 121077760. Throughput: 0: 50437.6. Samples: 121186080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 13:42:24,333][65383] Avg episode reward: [(0, '0.076')] [2024-06-12 13:42:27,105][65616] Updated weights for policy 0, policy_version 7400 (0.0030) [2024-06-12 13:42:29,332][65383] Fps is (10 sec: 52429.0, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 121356288. Throughput: 0: 50601.4. Samples: 121487960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 13:42:29,333][65383] Avg episode reward: [(0, '0.072')] [2024-06-12 13:42:30,054][65616] Updated weights for policy 0, policy_version 7410 (0.0028) [2024-06-12 13:42:33,731][65616] Updated weights for policy 0, policy_version 7420 (0.0024) [2024-06-12 13:42:34,332][65383] Fps is (10 sec: 50789.9, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 121585664. Throughput: 0: 50628.9. Samples: 121640980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 13:42:34,333][65383] Avg episode reward: [(0, '0.077')] [2024-06-12 13:42:34,911][65595] Signal inference workers to stop experience collection... (1750 times) [2024-06-12 13:42:34,911][65595] Signal inference workers to resume experience collection... 
(1750 times) [2024-06-12 13:42:34,920][65616] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-06-12 13:42:34,944][65616] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-06-12 13:42:36,661][65616] Updated weights for policy 0, policy_version 7430 (0.0031) [2024-06-12 13:42:39,332][65383] Fps is (10 sec: 49152.0, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 121847808. Throughput: 0: 50633.8. Samples: 121945580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 13:42:39,333][65383] Avg episode reward: [(0, '0.077')] [2024-06-12 13:42:40,327][65616] Updated weights for policy 0, policy_version 7440 (0.0036) [2024-06-12 13:42:43,071][65616] Updated weights for policy 0, policy_version 7450 (0.0028) [2024-06-12 13:42:44,332][65383] Fps is (10 sec: 49152.2, 60 sec: 49971.3, 300 sec: 50596.0). Total num frames: 122077184. Throughput: 0: 50713.3. Samples: 122247800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 13:42:44,333][65383] Avg episode reward: [(0, '0.070')] [2024-06-12 13:42:46,597][65616] Updated weights for policy 0, policy_version 7460 (0.0024) [2024-06-12 13:42:49,333][65383] Fps is (10 sec: 50789.5, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 122355712. Throughput: 0: 50285.2. Samples: 122391960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 13:42:49,333][65383] Avg episode reward: [(0, '0.073')] [2024-06-12 13:42:50,114][65616] Updated weights for policy 0, policy_version 7470 (0.0021) [2024-06-12 13:42:52,994][65616] Updated weights for policy 0, policy_version 7480 (0.0022) [2024-06-12 13:42:54,332][65383] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 122601472. Throughput: 0: 50261.3. Samples: 122692580. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 13:42:54,333][65383] Avg episode reward: [(0, '0.077')] [2024-06-12 13:42:56,859][65616] Updated weights for policy 0, policy_version 7490 (0.0022) [2024-06-12 13:42:59,332][65383] Fps is (10 sec: 49152.8, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 122847232. Throughput: 0: 50417.4. Samples: 122996900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 13:42:59,333][65383] Avg episode reward: [(0, '0.072')] [2024-06-12 13:42:59,634][65616] Updated weights for policy 0, policy_version 7500 (0.0028) [2024-06-12 13:43:03,393][65616] Updated weights for policy 0, policy_version 7510 (0.0025) [2024-06-12 13:43:04,332][65383] Fps is (10 sec: 47514.0, 60 sec: 49698.2, 300 sec: 50540.5). Total num frames: 123076608. Throughput: 0: 50368.5. Samples: 123138180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 13:43:04,333][65383] Avg episode reward: [(0, '0.065')] [2024-06-12 13:43:06,111][65616] Updated weights for policy 0, policy_version 7520 (0.0034) [2024-06-12 13:43:09,332][65383] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 123355136. Throughput: 0: 49977.7. Samples: 123435080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 13:43:09,333][65383] Avg episode reward: [(0, '0.078')] [2024-06-12 13:43:10,010][65616] Updated weights for policy 0, policy_version 7530 (0.0022) [2024-06-12 13:43:12,838][65616] Updated weights for policy 0, policy_version 7540 (0.0031) [2024-06-12 13:43:14,333][65383] Fps is (10 sec: 54066.1, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 123617280. Throughput: 0: 49943.8. Samples: 123735440. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 13:43:14,333][65383] Avg episode reward: [(0, '0.067')] [2024-06-12 13:43:16,568][65616] Updated weights for policy 0, policy_version 7550 (0.0025) [2024-06-12 13:43:19,328][65616] Updated weights for policy 0, policy_version 7560 (0.0024) [2024-06-12 13:43:19,332][65383] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 123863040. Throughput: 0: 50114.3. Samples: 123896120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 13:43:19,333][65383] Avg episode reward: [(0, '0.078')] [2024-06-12 13:43:23,033][65616] Updated weights for policy 0, policy_version 7570 (0.0023) [2024-06-12 13:43:24,332][65383] Fps is (10 sec: 49152.9, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 124108800. Throughput: 0: 50056.9. Samples: 124198140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 13:43:24,333][65383] Avg episode reward: [(0, '0.072')] [2024-06-12 13:43:25,927][65616] Updated weights for policy 0, policy_version 7580 (0.0022) [2024-06-12 13:43:29,214][65616] Updated weights for policy 0, policy_version 7590 (0.0025) [2024-06-12 13:43:29,332][65383] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 50540.5). Total num frames: 124354560. Throughput: 0: 49964.0. Samples: 124496180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 13:43:29,333][65383] Avg episode reward: [(0, '0.078')] [2024-06-12 13:43:32,327][65616] Updated weights for policy 0, policy_version 7600 (0.0028) [2024-06-12 13:43:34,332][65383] Fps is (10 sec: 50791.1, 60 sec: 50517.5, 300 sec: 50540.5). Total num frames: 124616704. Throughput: 0: 49927.9. Samples: 124638700. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-12 13:43:34,333][65383] Avg episode reward: [(0, '0.077')] [2024-06-12 13:43:35,978][65616] Updated weights for policy 0, policy_version 7610 (0.0024) [2024-06-12 13:43:39,108][65616] Updated weights for policy 0, policy_version 7620 (0.0026) [2024-06-12 13:43:39,332][65383] Fps is (10 sec: 49152.2, 60 sec: 49971.2, 300 sec: 50484.9). Total num frames: 124846080. Throughput: 0: 50037.0. Samples: 124944240. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-12 13:43:39,333][65383] Avg episode reward: [(0, '0.077')] [2024-06-12 13:43:42,483][65616] Updated weights for policy 0, policy_version 7630 (0.0027) [2024-06-12 13:43:43,811][65595] Signal inference workers to stop experience collection... (1800 times) [2024-06-12 13:43:43,811][65595] Signal inference workers to resume experience collection... (1800 times) [2024-06-12 13:43:43,830][65616] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-06-12 13:43:43,830][65616] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-06-12 13:43:44,332][65383] Fps is (10 sec: 47513.1, 60 sec: 50244.3, 300 sec: 50485.0). Total num frames: 125091840. Throughput: 0: 50000.1. Samples: 125246900. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-12 13:43:44,332][65383] Avg episode reward: [(0, '0.080')] [2024-06-12 13:43:45,438][65616] Updated weights for policy 0, policy_version 7640 (0.0031) [2024-06-12 13:43:48,948][65616] Updated weights for policy 0, policy_version 7650 (0.0024) [2024-06-12 13:43:49,332][65383] Fps is (10 sec: 49151.7, 60 sec: 49698.3, 300 sec: 50429.4). Total num frames: 125337600. Throughput: 0: 50064.4. Samples: 125391080. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 13:43:49,333][65383] Avg episode reward: [(0, '0.072')] [2024-06-12 13:43:52,067][65616] Updated weights for policy 0, policy_version 7660 (0.0023) [2024-06-12 13:43:54,332][65383] Fps is (10 sec: 52427.9, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 125616128. Throughput: 0: 50119.4. Samples: 125690460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 13:43:54,333][65383] Avg episode reward: [(0, '0.072')] [2024-06-12 13:43:54,343][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000007667_125616128.pth... [2024-06-12 13:43:54,383][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000006928_113508352.pth [2024-06-12 13:43:55,678][65616] Updated weights for policy 0, policy_version 7670 (0.0023) [2024-06-12 13:43:58,730][65616] Updated weights for policy 0, policy_version 7680 (0.0027) [2024-06-12 13:43:59,332][65383] Fps is (10 sec: 50790.6, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 125845504. Throughput: 0: 50097.5. Samples: 125989820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 13:43:59,333][65383] Avg episode reward: [(0, '0.079')] [2024-06-12 13:44:02,150][65616] Updated weights for policy 0, policy_version 7690 (0.0034) [2024-06-12 13:44:04,332][65383] Fps is (10 sec: 45875.1, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 126074880. Throughput: 0: 49787.0. Samples: 126136540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 13:44:04,333][65383] Avg episode reward: [(0, '0.084')] [2024-06-12 13:44:04,483][65595] Saving new best policy, reward=0.084! [2024-06-12 13:44:05,467][65616] Updated weights for policy 0, policy_version 7700 (0.0022) [2024-06-12 13:44:08,368][65616] Updated weights for policy 0, policy_version 7710 (0.0030) [2024-06-12 13:44:09,332][65383] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 50429.4). Total num frames: 126337024. Throughput: 0: 49535.1. 
Samples: 126427220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 13:44:09,333][65383] Avg episode reward: [(0, '0.080')] [2024-06-12 13:44:12,058][65616] Updated weights for policy 0, policy_version 7720 (0.0029) [2024-06-12 13:44:14,332][65383] Fps is (10 sec: 52429.1, 60 sec: 49698.2, 300 sec: 50373.9). Total num frames: 126599168. Throughput: 0: 49605.3. Samples: 126728420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 13:44:14,333][65383] Avg episode reward: [(0, '0.074')] [2024-06-12 13:44:14,902][65616] Updated weights for policy 0, policy_version 7730 (0.0030) [2024-06-12 13:44:18,589][65616] Updated weights for policy 0, policy_version 7740 (0.0025) [2024-06-12 13:44:19,332][65383] Fps is (10 sec: 49151.4, 60 sec: 49424.9, 300 sec: 50262.8). Total num frames: 126828544. Throughput: 0: 49874.8. Samples: 126883080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 13:44:19,333][65383] Avg episode reward: [(0, '0.080')] [2024-06-12 13:44:21,292][65616] Updated weights for policy 0, policy_version 7750 (0.0024) [2024-06-12 13:44:24,332][65383] Fps is (10 sec: 49151.7, 60 sec: 49698.0, 300 sec: 50374.4). Total num frames: 127090688. Throughput: 0: 49743.4. Samples: 127182700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 13:44:24,333][65383] Avg episode reward: [(0, '0.080')] [2024-06-12 13:44:25,130][65616] Updated weights for policy 0, policy_version 7760 (0.0026) [2024-06-12 13:44:27,921][65616] Updated weights for policy 0, policy_version 7770 (0.0028) [2024-06-12 13:44:29,332][65383] Fps is (10 sec: 52429.3, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 127352832. Throughput: 0: 49655.0. Samples: 127481380. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-12 13:44:29,333][65383] Avg episode reward: [(0, '0.077')] [2024-06-12 13:44:31,469][65616] Updated weights for policy 0, policy_version 7780 (0.0025) [2024-06-12 13:44:34,332][65383] Fps is (10 sec: 50790.6, 60 sec: 49697.9, 300 sec: 50429.4). Total num frames: 127598592. Throughput: 0: 49932.4. Samples: 127638040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 13:44:34,335][65383] Avg episode reward: [(0, '0.080')] [2024-06-12 13:44:34,649][65616] Updated weights for policy 0, policy_version 7790 (0.0025) [2024-06-12 13:44:38,385][65616] Updated weights for policy 0, policy_version 7800 (0.0025) [2024-06-12 13:44:39,332][65383] Fps is (10 sec: 47513.8, 60 sec: 49698.1, 300 sec: 50262.8). Total num frames: 127827968. Throughput: 0: 49985.9. Samples: 127939820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 13:44:39,333][65383] Avg episode reward: [(0, '0.079')] [2024-06-12 13:44:40,809][65595] Signal inference workers to stop experience collection... (1850 times) [2024-06-12 13:44:40,809][65595] Signal inference workers to resume experience collection... (1850 times) [2024-06-12 13:44:40,819][65616] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-06-12 13:44:40,830][65616] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-06-12 13:44:41,325][65616] Updated weights for policy 0, policy_version 7810 (0.0024) [2024-06-12 13:44:44,332][65383] Fps is (10 sec: 49151.9, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 128090112. Throughput: 0: 49833.2. Samples: 128232320. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 13:44:44,333][65383] Avg episode reward: [(0, '0.080')] [2024-06-12 13:44:44,810][65616] Updated weights for policy 0, policy_version 7820 (0.0025) [2024-06-12 13:44:48,092][65616] Updated weights for policy 0, policy_version 7830 (0.0042) [2024-06-12 13:44:49,332][65383] Fps is (10 sec: 52428.6, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 128352256. Throughput: 0: 49908.1. Samples: 128382400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 13:44:49,333][65383] Avg episode reward: [(0, '0.080')] [2024-06-12 13:44:51,590][65616] Updated weights for policy 0, policy_version 7840 (0.0025) [2024-06-12 13:44:54,336][65383] Fps is (10 sec: 49135.1, 60 sec: 49422.2, 300 sec: 50262.2). Total num frames: 128581632. Throughput: 0: 50258.7. Samples: 128689040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 13:44:54,336][65383] Avg episode reward: [(0, '0.082')] [2024-06-12 13:44:54,694][65616] Updated weights for policy 0, policy_version 7850 (0.0032) [2024-06-12 13:44:58,536][65616] Updated weights for policy 0, policy_version 7860 (0.0025) [2024-06-12 13:44:59,332][65383] Fps is (10 sec: 45875.4, 60 sec: 49425.1, 300 sec: 50207.2). Total num frames: 128811008. Throughput: 0: 50085.8. Samples: 128982280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 13:44:59,333][65383] Avg episode reward: [(0, '0.081')] [2024-06-12 13:45:01,298][65616] Updated weights for policy 0, policy_version 7870 (0.0024) [2024-06-12 13:45:04,332][65383] Fps is (10 sec: 49169.2, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 129073152. Throughput: 0: 49933.4. Samples: 129130080. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 13:45:04,333][65383] Avg episode reward: [(0, '0.081')] [2024-06-12 13:45:04,620][65616] Updated weights for policy 0, policy_version 7880 (0.0027) [2024-06-12 13:45:07,695][65616] Updated weights for policy 0, policy_version 7890 (0.0024) [2024-06-12 13:45:09,332][65383] Fps is (10 sec: 52428.9, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 129335296. Throughput: 0: 49569.9. Samples: 129413340. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-12 13:45:09,333][65383] Avg episode reward: [(0, '0.081')] [2024-06-12 13:45:11,418][65616] Updated weights for policy 0, policy_version 7900 (0.0029) [2024-06-12 13:45:14,243][65616] Updated weights for policy 0, policy_version 7910 (0.0022) [2024-06-12 13:45:14,332][65383] Fps is (10 sec: 52429.0, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 129597440. Throughput: 0: 49855.6. Samples: 129724880. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-12 13:45:14,333][65383] Avg episode reward: [(0, '0.082')] [2024-06-12 13:45:18,181][65616] Updated weights for policy 0, policy_version 7920 (0.0026) [2024-06-12 13:45:19,332][65383] Fps is (10 sec: 47513.8, 60 sec: 49698.3, 300 sec: 50096.2). Total num frames: 129810432. Throughput: 0: 49555.7. Samples: 129868040. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-12 13:45:19,333][65383] Avg episode reward: [(0, '0.081')] [2024-06-12 13:45:20,886][65616] Updated weights for policy 0, policy_version 7930 (0.0029) [2024-06-12 13:45:24,332][65383] Fps is (10 sec: 47513.8, 60 sec: 49698.3, 300 sec: 50151.7). Total num frames: 130072576. Throughput: 0: 49539.6. Samples: 130169100. 
Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-12 13:45:24,333][65383] Avg episode reward: [(0, '0.080')] [2024-06-12 13:45:24,652][65616] Updated weights for policy 0, policy_version 7940 (0.0031) [2024-06-12 13:45:27,585][65616] Updated weights for policy 0, policy_version 7950 (0.0025) [2024-06-12 13:45:29,332][65383] Fps is (10 sec: 52428.3, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 130334720. Throughput: 0: 49589.9. Samples: 130463860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 13:45:29,333][65383] Avg episode reward: [(0, '0.082')] [2024-06-12 13:45:31,408][65616] Updated weights for policy 0, policy_version 7960 (0.0033) [2024-06-12 13:45:34,032][65616] Updated weights for policy 0, policy_version 7970 (0.0026) [2024-06-12 13:45:34,332][65383] Fps is (10 sec: 50790.0, 60 sec: 49698.2, 300 sec: 50207.3). Total num frames: 130580480. Throughput: 0: 49618.6. Samples: 130615240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 13:45:34,333][65383] Avg episode reward: [(0, '0.076')] [2024-06-12 13:45:35,005][65595] Signal inference workers to stop experience collection... (1900 times) [2024-06-12 13:45:35,005][65595] Signal inference workers to resume experience collection... (1900 times) [2024-06-12 13:45:35,023][65616] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-06-12 13:45:35,023][65616] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-06-12 13:45:37,886][65616] Updated weights for policy 0, policy_version 7980 (0.0034) [2024-06-12 13:45:39,332][65383] Fps is (10 sec: 47513.5, 60 sec: 49698.1, 300 sec: 50096.2). Total num frames: 130809856. Throughput: 0: 49507.4. Samples: 130916700. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 13:45:39,333][65383] Avg episode reward: [(0, '0.081')] [2024-06-12 13:45:40,982][65616] Updated weights for policy 0, policy_version 7990 (0.0029) [2024-06-12 13:45:44,304][65616] Updated weights for policy 0, policy_version 8000 (0.0029) [2024-06-12 13:45:44,332][65383] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 50096.1). Total num frames: 131072000. Throughput: 0: 49479.1. Samples: 131208840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 13:45:44,333][65383] Avg episode reward: [(0, '0.077')] [2024-06-12 13:45:47,543][65616] Updated weights for policy 0, policy_version 8010 (0.0028) [2024-06-12 13:45:49,332][65383] Fps is (10 sec: 52428.3, 60 sec: 49698.1, 300 sec: 50096.2). Total num frames: 131334144. Throughput: 0: 49396.4. Samples: 131352920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 13:45:49,333][65383] Avg episode reward: [(0, '0.085')] [2024-06-12 13:45:51,272][65616] Updated weights for policy 0, policy_version 8020 (0.0024) [2024-06-12 13:45:54,138][65616] Updated weights for policy 0, policy_version 8030 (0.0030) [2024-06-12 13:45:54,332][65383] Fps is (10 sec: 49151.7, 60 sec: 49701.0, 300 sec: 49985.1). Total num frames: 131563520. Throughput: 0: 49900.7. Samples: 131658880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 13:45:54,333][65383] Avg episode reward: [(0, '0.081')] [2024-06-12 13:45:54,344][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000008030_131563520.pth... [2024-06-12 13:45:54,388][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000007300_119603200.pth [2024-06-12 13:45:57,552][65616] Updated weights for policy 0, policy_version 8040 (0.0023) [2024-06-12 13:45:59,332][65383] Fps is (10 sec: 44237.3, 60 sec: 49425.0, 300 sec: 49929.5). Total num frames: 131776512. Throughput: 0: 49450.7. Samples: 131950160. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 13:45:59,333][65383] Avg episode reward: [(0, '0.080')] [2024-06-12 13:46:00,823][65616] Updated weights for policy 0, policy_version 8050 (0.0029) [2024-06-12 13:46:04,211][65616] Updated weights for policy 0, policy_version 8060 (0.0026) [2024-06-12 13:46:04,333][65383] Fps is (10 sec: 49151.6, 60 sec: 49698.0, 300 sec: 50040.6). Total num frames: 132055040. Throughput: 0: 49501.5. Samples: 132095620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 13:46:04,333][65383] Avg episode reward: [(0, '0.083')] [2024-06-12 13:46:07,321][65616] Updated weights for policy 0, policy_version 8070 (0.0022) [2024-06-12 13:46:09,332][65383] Fps is (10 sec: 54067.4, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 132317184. Throughput: 0: 49629.3. Samples: 132402420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 13:46:09,333][65383] Avg episode reward: [(0, '0.083')] [2024-06-12 13:46:10,732][65616] Updated weights for policy 0, policy_version 8080 (0.0024) [2024-06-12 13:46:14,198][65616] Updated weights for policy 0, policy_version 8090 (0.0026) [2024-06-12 13:46:14,332][65383] Fps is (10 sec: 49152.5, 60 sec: 49151.9, 300 sec: 49929.6). Total num frames: 132546560. Throughput: 0: 49501.3. Samples: 132691420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 13:46:14,333][65383] Avg episode reward: [(0, '0.079')] [2024-06-12 13:46:16,912][65616] Updated weights for policy 0, policy_version 8100 (0.0035) [2024-06-12 13:46:19,332][65383] Fps is (10 sec: 45875.3, 60 sec: 49425.0, 300 sec: 49874.0). Total num frames: 132775936. Throughput: 0: 49289.4. Samples: 132833260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 13:46:19,333][65383] Avg episode reward: [(0, '0.082')] [2024-06-12 13:46:20,796][65616] Updated weights for policy 0, policy_version 8110 (0.0023) [2024-06-12 13:46:23,703][65616] Updated weights for policy 0, policy_version 8120 (0.0025) [2024-06-12 13:46:24,332][65383] Fps is (10 sec: 50790.8, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 133054464. Throughput: 0: 49196.0. Samples: 133130520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-12 13:46:24,333][65383] Avg episode reward: [(0, '0.082')] [2024-06-12 13:46:27,419][65616] Updated weights for policy 0, policy_version 8130 (0.0024) [2024-06-12 13:46:29,332][65383] Fps is (10 sec: 52428.8, 60 sec: 49425.1, 300 sec: 49929.6). Total num frames: 133300224. Throughput: 0: 49560.1. Samples: 133439040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 13:46:29,333][65383] Avg episode reward: [(0, '0.082')] [2024-06-12 13:46:30,316][65616] Updated weights for policy 0, policy_version 8140 (0.0025) [2024-06-12 13:46:34,017][65616] Updated weights for policy 0, policy_version 8150 (0.0029) [2024-06-12 13:46:34,332][65383] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49874.0). Total num frames: 133545984. Throughput: 0: 49688.6. Samples: 133588900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 13:46:34,333][65383] Avg episode reward: [(0, '0.085')] [2024-06-12 13:46:37,151][65616] Updated weights for policy 0, policy_version 8160 (0.0027) [2024-06-12 13:46:39,332][65383] Fps is (10 sec: 47512.8, 60 sec: 49425.0, 300 sec: 49818.5). Total num frames: 133775360. Throughput: 0: 49132.4. Samples: 133869840. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-12 13:46:39,333][65383] Avg episode reward: [(0, '0.085')] [2024-06-12 13:46:40,486][65616] Updated weights for policy 0, policy_version 8170 (0.0024) [2024-06-12 13:46:43,426][65595] Signal inference workers to stop experience collection... 
(1950 times) [2024-06-12 13:46:43,470][65616] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-06-12 13:46:43,479][65595] Signal inference workers to resume experience collection... (1950 times) [2024-06-12 13:46:43,490][65616] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-06-12 13:46:43,918][65616] Updated weights for policy 0, policy_version 8180 (0.0024) [2024-06-12 13:46:44,332][65383] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49818.5). Total num frames: 134021120. Throughput: 0: 49184.8. Samples: 134163480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 13:46:44,333][65383] Avg episode reward: [(0, '0.086')] [2024-06-12 13:46:44,339][65595] Saving new best policy, reward=0.086! [2024-06-12 13:46:47,301][65616] Updated weights for policy 0, policy_version 8190 (0.0030) [2024-06-12 13:46:49,332][65383] Fps is (10 sec: 47514.1, 60 sec: 48606.0, 300 sec: 49762.9). Total num frames: 134250496. Throughput: 0: 49185.1. Samples: 134308940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 13:46:49,333][65383] Avg episode reward: [(0, '0.086')] [2024-06-12 13:46:50,857][65616] Updated weights for policy 0, policy_version 8200 (0.0033) [2024-06-12 13:46:54,015][65616] Updated weights for policy 0, policy_version 8210 (0.0030) [2024-06-12 13:46:54,333][65383] Fps is (10 sec: 49151.6, 60 sec: 49151.9, 300 sec: 49762.9). Total num frames: 134512640. Throughput: 0: 48765.6. Samples: 134596880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 13:46:54,333][65383] Avg episode reward: [(0, '0.079')] [2024-06-12 13:46:57,865][65616] Updated weights for policy 0, policy_version 8220 (0.0031) [2024-06-12 13:46:59,332][65383] Fps is (10 sec: 50790.6, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 134758400. Throughput: 0: 48906.3. Samples: 134892200. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 13:46:59,333][65383] Avg episode reward: [(0, '0.081')] [2024-06-12 13:47:00,559][65616] Updated weights for policy 0, policy_version 8230 (0.0029) [2024-06-12 13:47:04,332][65383] Fps is (10 sec: 47514.4, 60 sec: 48879.1, 300 sec: 49651.9). Total num frames: 134987776. Throughput: 0: 49097.7. Samples: 135042660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 13:47:04,333][65383] Avg episode reward: [(0, '0.080')] [2024-06-12 13:47:04,382][65616] Updated weights for policy 0, policy_version 8240 (0.0027) [2024-06-12 13:47:07,483][65616] Updated weights for policy 0, policy_version 8250 (0.0027) [2024-06-12 13:47:09,332][65383] Fps is (10 sec: 47513.2, 60 sec: 48605.8, 300 sec: 49707.4). Total num frames: 135233536. Throughput: 0: 49051.5. Samples: 135337840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 13:47:09,333][65383] Avg episode reward: [(0, '0.080')] [2024-06-12 13:47:11,067][65616] Updated weights for policy 0, policy_version 8260 (0.0024) [2024-06-12 13:47:13,897][65616] Updated weights for policy 0, policy_version 8270 (0.0027) [2024-06-12 13:47:14,332][65383] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49707.4). Total num frames: 135495680. Throughput: 0: 48554.1. Samples: 135623980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-12 13:47:14,333][65383] Avg episode reward: [(0, '0.071')] [2024-06-12 13:47:17,747][65616] Updated weights for policy 0, policy_version 8280 (0.0024) [2024-06-12 13:47:19,333][65383] Fps is (10 sec: 50789.8, 60 sec: 49424.9, 300 sec: 49707.4). Total num frames: 135741440. Throughput: 0: 48826.5. Samples: 135786100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 13:47:19,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:47:20,629][65616] Updated weights for policy 0, policy_version 8290 (0.0029) [2024-06-12 13:47:24,332][65383] Fps is (10 sec: 47513.2, 60 sec: 48605.8, 300 sec: 49540.8). 
Total num frames: 135970816. Throughput: 0: 49014.7. Samples: 136075500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 13:47:24,333][65383] Avg episode reward: [(0, '0.082')] [2024-06-12 13:47:24,471][65616] Updated weights for policy 0, policy_version 8300 (0.0026) [2024-06-12 13:47:27,708][65616] Updated weights for policy 0, policy_version 8310 (0.0029) [2024-06-12 13:47:29,332][65383] Fps is (10 sec: 44237.1, 60 sec: 48059.6, 300 sec: 49485.2). Total num frames: 136183808. Throughput: 0: 48899.5. Samples: 136363960. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-12 13:47:29,333][65383] Avg episode reward: [(0, '0.081')] [2024-06-12 13:47:31,142][65616] Updated weights for policy 0, policy_version 8320 (0.0026) [2024-06-12 13:47:34,333][65383] Fps is (10 sec: 49151.4, 60 sec: 48605.7, 300 sec: 49540.7). Total num frames: 136462336. Throughput: 0: 48634.0. Samples: 136497480. Policy #0 lag: (min: 2.0, avg: 11.6, max: 23.0) [2024-06-12 13:47:34,333][65383] Avg episode reward: [(0, '0.085')] [2024-06-12 13:47:34,741][65616] Updated weights for policy 0, policy_version 8330 (0.0033) [2024-06-12 13:47:38,415][65616] Updated weights for policy 0, policy_version 8340 (0.0023) [2024-06-12 13:47:39,332][65383] Fps is (10 sec: 52429.1, 60 sec: 48879.0, 300 sec: 49596.3). Total num frames: 136708096. Throughput: 0: 48735.7. Samples: 136789980. Policy #0 lag: (min: 2.0, avg: 11.6, max: 23.0) [2024-06-12 13:47:39,333][65383] Avg episode reward: [(0, '0.086')] [2024-06-12 13:47:41,656][65616] Updated weights for policy 0, policy_version 8350 (0.0031) [2024-06-12 13:47:44,332][65383] Fps is (10 sec: 44237.6, 60 sec: 48059.7, 300 sec: 49318.6). Total num frames: 136904704. Throughput: 0: 48475.5. Samples: 137073600. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-12 13:47:44,333][65383] Avg episode reward: [(0, '0.082')] [2024-06-12 13:47:45,435][65616] Updated weights for policy 0, policy_version 8360 (0.0025) [2024-06-12 13:47:47,982][65616] Updated weights for policy 0, policy_version 8370 (0.0028) [2024-06-12 13:47:49,332][65383] Fps is (10 sec: 44237.1, 60 sec: 48332.8, 300 sec: 49318.6). Total num frames: 137150464. Throughput: 0: 48072.0. Samples: 137205900. Policy #0 lag: (min: 0.0, avg: 13.2, max: 22.0) [2024-06-12 13:47:49,333][65383] Avg episode reward: [(0, '0.086')] [2024-06-12 13:47:52,286][65616] Updated weights for policy 0, policy_version 8380 (0.0028) [2024-06-12 13:47:54,332][65383] Fps is (10 sec: 54067.1, 60 sec: 48879.0, 300 sec: 49485.2). Total num frames: 137445376. Throughput: 0: 48038.6. Samples: 137499580. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-12 13:47:54,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:47:54,344][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000008389_137445376.pth... [2024-06-12 13:47:54,388][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000007667_125616128.pth [2024-06-12 13:47:54,391][65595] Saving new best policy, reward=0.088! [2024-06-12 13:47:54,600][65616] Updated weights for policy 0, policy_version 8390 (0.0022) [2024-06-12 13:47:59,103][65616] Updated weights for policy 0, policy_version 8400 (0.0023) [2024-06-12 13:47:59,332][65383] Fps is (10 sec: 47513.6, 60 sec: 47786.7, 300 sec: 49318.6). Total num frames: 137625600. Throughput: 0: 47808.9. Samples: 137775380. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-12 13:47:59,333][65383] Avg episode reward: [(0, '0.086')] [2024-06-12 13:48:01,355][65595] Signal inference workers to stop experience collection... (2000 times) [2024-06-12 13:48:01,356][65595] Signal inference workers to resume experience collection... 
(2000 times) [2024-06-12 13:48:01,375][65616] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-06-12 13:48:01,376][65616] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-06-12 13:48:02,232][65616] Updated weights for policy 0, policy_version 8410 (0.0040) [2024-06-12 13:48:04,332][65383] Fps is (10 sec: 42598.6, 60 sec: 48059.7, 300 sec: 49207.5). Total num frames: 137871360. Throughput: 0: 47112.6. Samples: 137906160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 13:48:04,333][65383] Avg episode reward: [(0, '0.084')] [2024-06-12 13:48:06,046][65616] Updated weights for policy 0, policy_version 8420 (0.0022) [2024-06-12 13:48:09,332][65383] Fps is (10 sec: 47513.1, 60 sec: 47786.6, 300 sec: 49096.5). Total num frames: 138100736. Throughput: 0: 47229.8. Samples: 138200840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 13:48:09,333][65383] Avg episode reward: [(0, '0.086')] [2024-06-12 13:48:09,388][65616] Updated weights for policy 0, policy_version 8430 (0.0029) [2024-06-12 13:48:12,658][65616] Updated weights for policy 0, policy_version 8440 (0.0029) [2024-06-12 13:48:14,332][65383] Fps is (10 sec: 47513.9, 60 sec: 47513.7, 300 sec: 49096.5). Total num frames: 138346496. Throughput: 0: 47485.9. Samples: 138500820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 13:48:14,333][65383] Avg episode reward: [(0, '0.082')] [2024-06-12 13:48:16,218][65616] Updated weights for policy 0, policy_version 8450 (0.0028) [2024-06-12 13:48:19,332][65383] Fps is (10 sec: 49152.7, 60 sec: 47513.8, 300 sec: 49096.5). Total num frames: 138592256. Throughput: 0: 47723.4. Samples: 138645020. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 13:48:19,333][65383] Avg episode reward: [(0, '0.084')] [2024-06-12 13:48:19,347][65616] Updated weights for policy 0, policy_version 8460 (0.0030) [2024-06-12 13:48:22,643][65616] Updated weights for policy 0, policy_version 8470 (0.0025) [2024-06-12 13:48:24,332][65383] Fps is (10 sec: 49151.8, 60 sec: 47786.8, 300 sec: 49096.5). Total num frames: 138838016. Throughput: 0: 47609.4. Samples: 138932400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 13:48:24,333][65383] Avg episode reward: [(0, '0.085')] [2024-06-12 13:48:26,110][65616] Updated weights for policy 0, policy_version 8480 (0.0027) [2024-06-12 13:48:29,332][65383] Fps is (10 sec: 47513.4, 60 sec: 48059.8, 300 sec: 48985.4). Total num frames: 139067392. Throughput: 0: 47680.5. Samples: 139219220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 13:48:29,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:48:29,722][65616] Updated weights for policy 0, policy_version 8490 (0.0028) [2024-06-12 13:48:32,657][65616] Updated weights for policy 0, policy_version 8500 (0.0025) [2024-06-12 13:48:34,332][65383] Fps is (10 sec: 44236.9, 60 sec: 46967.7, 300 sec: 48929.8). Total num frames: 139280384. Throughput: 0: 47711.1. Samples: 139352900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-12 13:48:34,333][65383] Avg episode reward: [(0, '0.081')] [2024-06-12 13:48:36,556][65616] Updated weights for policy 0, policy_version 8510 (0.0029) [2024-06-12 13:48:39,332][65383] Fps is (10 sec: 49152.1, 60 sec: 47513.7, 300 sec: 49040.9). Total num frames: 139558912. Throughput: 0: 47562.8. Samples: 139639900. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-12 13:48:39,333][65383] Avg episode reward: [(0, '0.085')] [2024-06-12 13:48:39,577][65616] Updated weights for policy 0, policy_version 8520 (0.0027) [2024-06-12 13:48:43,489][65616] Updated weights for policy 0, policy_version 8530 (0.0029) [2024-06-12 13:48:44,332][65383] Fps is (10 sec: 52429.1, 60 sec: 48332.9, 300 sec: 49040.9). Total num frames: 139804672. Throughput: 0: 47749.4. Samples: 139924100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-12 13:48:44,333][65383] Avg episode reward: [(0, '0.084')] [2024-06-12 13:48:46,471][65616] Updated weights for policy 0, policy_version 8540 (0.0032) [2024-06-12 13:48:49,333][65383] Fps is (10 sec: 47512.4, 60 sec: 48059.5, 300 sec: 48874.3). Total num frames: 140034048. Throughput: 0: 48178.5. Samples: 140074200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-12 13:48:49,333][65383] Avg episode reward: [(0, '0.080')] [2024-06-12 13:48:50,404][65616] Updated weights for policy 0, policy_version 8550 (0.0032) [2024-06-12 13:48:53,465][65616] Updated weights for policy 0, policy_version 8560 (0.0036) [2024-06-12 13:48:54,332][65383] Fps is (10 sec: 45874.7, 60 sec: 46967.5, 300 sec: 48874.3). Total num frames: 140263424. Throughput: 0: 48068.9. Samples: 140363940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 13:48:54,333][65383] Avg episode reward: [(0, '0.083')] [2024-06-12 13:48:56,982][65616] Updated weights for policy 0, policy_version 8570 (0.0028) [2024-06-12 13:48:59,332][65383] Fps is (10 sec: 47514.7, 60 sec: 48059.7, 300 sec: 48929.9). Total num frames: 140509184. Throughput: 0: 47767.1. Samples: 140650340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 13:48:59,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:49:00,547][65616] Updated weights for policy 0, policy_version 8580 (0.0030) [2024-06-12 13:49:02,044][65595] Signal inference workers to stop experience collection... 
(2050 times) [2024-06-12 13:49:02,081][65616] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-06-12 13:49:02,091][65595] Signal inference workers to resume experience collection... (2050 times) [2024-06-12 13:49:02,099][65616] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-06-12 13:49:04,128][65616] Updated weights for policy 0, policy_version 8590 (0.0032) [2024-06-12 13:49:04,332][65383] Fps is (10 sec: 47513.6, 60 sec: 47786.7, 300 sec: 48818.8). Total num frames: 140738560. Throughput: 0: 47957.7. Samples: 140803120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 13:49:04,333][65383] Avg episode reward: [(0, '0.086')] [2024-06-12 13:49:07,222][65616] Updated weights for policy 0, policy_version 8600 (0.0031) [2024-06-12 13:49:09,332][65383] Fps is (10 sec: 49151.8, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 141000704. Throughput: 0: 47882.2. Samples: 141087100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 13:49:09,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:49:10,981][65616] Updated weights for policy 0, policy_version 8610 (0.0027) [2024-06-12 13:49:13,750][65616] Updated weights for policy 0, policy_version 8620 (0.0026) [2024-06-12 13:49:14,332][65383] Fps is (10 sec: 49152.4, 60 sec: 48059.8, 300 sec: 48818.8). Total num frames: 141230080. Throughput: 0: 47631.2. Samples: 141362620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 13:49:14,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:49:17,847][65616] Updated weights for policy 0, policy_version 8630 (0.0027) [2024-06-12 13:49:19,332][65383] Fps is (10 sec: 47513.7, 60 sec: 48059.7, 300 sec: 48763.2). Total num frames: 141475840. Throughput: 0: 48084.4. Samples: 141516700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-12 13:49:19,333][65383] Avg episode reward: [(0, '0.084')] [2024-06-12 13:49:20,736][65616] Updated weights for policy 0, policy_version 8640 (0.0028) [2024-06-12 13:49:24,332][65383] Fps is (10 sec: 44236.4, 60 sec: 47240.5, 300 sec: 48541.1). Total num frames: 141672448. Throughput: 0: 48006.2. Samples: 141800180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-12 13:49:24,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:49:24,766][65616] Updated weights for policy 0, policy_version 8650 (0.0028) [2024-06-12 13:49:27,646][65616] Updated weights for policy 0, policy_version 8660 (0.0023) [2024-06-12 13:49:29,332][65383] Fps is (10 sec: 45875.1, 60 sec: 47786.6, 300 sec: 48596.6). Total num frames: 141934592. Throughput: 0: 47837.2. Samples: 142076780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 13:49:29,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:49:29,346][65595] Saving new best policy, reward=0.090! [2024-06-12 13:49:31,883][65616] Updated weights for policy 0, policy_version 8670 (0.0024) [2024-06-12 13:49:34,332][65383] Fps is (10 sec: 52429.2, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 142196736. Throughput: 0: 47858.5. Samples: 142227820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 13:49:34,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:49:34,566][65616] Updated weights for policy 0, policy_version 8680 (0.0030) [2024-06-12 13:49:38,468][65616] Updated weights for policy 0, policy_version 8690 (0.0028) [2024-06-12 13:49:39,332][65383] Fps is (10 sec: 45874.9, 60 sec: 47240.4, 300 sec: 48485.5). Total num frames: 142393344. Throughput: 0: 47608.4. Samples: 142506320. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-06-12 13:49:39,333][65383] Avg episode reward: [(0, '0.085')] [2024-06-12 13:49:41,778][65616] Updated weights for policy 0, policy_version 8700 (0.0023) [2024-06-12 13:49:44,332][65383] Fps is (10 sec: 44236.5, 60 sec: 47240.5, 300 sec: 48430.0). Total num frames: 142639104. Throughput: 0: 47528.0. Samples: 142789100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-12 13:49:44,333][65383] Avg episode reward: [(0, '0.085')] [2024-06-12 13:49:45,294][65616] Updated weights for policy 0, policy_version 8710 (0.0026) [2024-06-12 13:49:48,933][65616] Updated weights for policy 0, policy_version 8720 (0.0026) [2024-06-12 13:49:49,332][65383] Fps is (10 sec: 49152.8, 60 sec: 47513.8, 300 sec: 48486.1). Total num frames: 142884864. Throughput: 0: 47206.3. Samples: 142927400. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-12 13:49:49,333][65383] Avg episode reward: [(0, '0.085')] [2024-06-12 13:49:52,275][65616] Updated weights for policy 0, policy_version 8730 (0.0033) [2024-06-12 13:49:54,332][65383] Fps is (10 sec: 44237.2, 60 sec: 46967.5, 300 sec: 48374.5). Total num frames: 143081472. Throughput: 0: 47222.8. Samples: 143212120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 13:49:54,333][65383] Avg episode reward: [(0, '0.086')] [2024-06-12 13:49:54,519][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000008735_143114240.pth... [2024-06-12 13:49:54,570][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000008030_131563520.pth [2024-06-12 13:49:55,835][65616] Updated weights for policy 0, policy_version 8740 (0.0029) [2024-06-12 13:49:59,332][65383] Fps is (10 sec: 45874.8, 60 sec: 47240.5, 300 sec: 48374.5). Total num frames: 143343616. Throughput: 0: 47308.8. Samples: 143491520. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-12 13:49:59,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:49:59,350][65616] Updated weights for policy 0, policy_version 8750 (0.0025) [2024-06-12 13:50:02,550][65616] Updated weights for policy 0, policy_version 8760 (0.0031) [2024-06-12 13:50:04,332][65383] Fps is (10 sec: 52428.2, 60 sec: 47786.7, 300 sec: 48374.4). Total num frames: 143605760. Throughput: 0: 47459.5. Samples: 143652380. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-12 13:50:04,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:50:06,328][65616] Updated weights for policy 0, policy_version 8770 (0.0024) [2024-06-12 13:50:09,332][65383] Fps is (10 sec: 47513.3, 60 sec: 46967.4, 300 sec: 48207.8). Total num frames: 143818752. Throughput: 0: 47361.7. Samples: 143931460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-12 13:50:09,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:50:09,908][65616] Updated weights for policy 0, policy_version 8780 (0.0024) [2024-06-12 13:50:10,521][65595] Signal inference workers to stop experience collection... (2100 times) [2024-06-12 13:50:10,552][65616] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-06-12 13:50:10,569][65595] Signal inference workers to resume experience collection... (2100 times) [2024-06-12 13:50:10,574][65616] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-06-12 13:50:13,530][65616] Updated weights for policy 0, policy_version 8790 (0.0031) [2024-06-12 13:50:14,332][65383] Fps is (10 sec: 42598.7, 60 sec: 46694.4, 300 sec: 48207.8). Total num frames: 144031744. Throughput: 0: 47500.9. Samples: 144214320. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 13:50:14,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:50:16,474][65616] Updated weights for policy 0, policy_version 8800 (0.0021) [2024-06-12 13:50:19,332][65383] Fps is (10 sec: 49152.0, 60 sec: 47240.5, 300 sec: 48263.4). Total num frames: 144310272. Throughput: 0: 47197.2. Samples: 144351700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 13:50:19,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:50:20,089][65616] Updated weights for policy 0, policy_version 8810 (0.0021) [2024-06-12 13:50:23,583][65616] Updated weights for policy 0, policy_version 8820 (0.0031) [2024-06-12 13:50:24,332][65383] Fps is (10 sec: 50790.1, 60 sec: 47786.7, 300 sec: 48152.3). Total num frames: 144539648. Throughput: 0: 47302.3. Samples: 144634920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-12 13:50:24,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:50:26,787][65616] Updated weights for policy 0, policy_version 8830 (0.0022) [2024-06-12 13:50:29,332][65383] Fps is (10 sec: 44237.4, 60 sec: 46967.5, 300 sec: 48041.2). Total num frames: 144752640. Throughput: 0: 47291.7. Samples: 144917220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 13:50:29,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:50:30,309][65616] Updated weights for policy 0, policy_version 8840 (0.0032) [2024-06-12 13:50:33,724][65616] Updated weights for policy 0, policy_version 8850 (0.0025) [2024-06-12 13:50:34,332][65383] Fps is (10 sec: 47513.7, 60 sec: 46967.4, 300 sec: 48152.3). Total num frames: 145014784. Throughput: 0: 47506.1. Samples: 145065180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 13:50:34,333][65383] Avg episode reward: [(0, '0.083')] [2024-06-12 13:50:37,054][65616] Updated weights for policy 0, policy_version 8860 (0.0027) [2024-06-12 13:50:39,332][65383] Fps is (10 sec: 50789.9, 60 sec: 47786.7, 300 sec: 48096.8). 
Total num frames: 145260544. Throughput: 0: 47418.6. Samples: 145345960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 13:50:39,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:50:40,765][65616] Updated weights for policy 0, policy_version 8870 (0.0031) [2024-06-12 13:50:44,313][65616] Updated weights for policy 0, policy_version 8880 (0.0026) [2024-06-12 13:50:44,332][65383] Fps is (10 sec: 47513.8, 60 sec: 47513.6, 300 sec: 47985.7). Total num frames: 145489920. Throughput: 0: 47505.4. Samples: 145629260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 13:50:44,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:50:47,833][65616] Updated weights for policy 0, policy_version 8890 (0.0029) [2024-06-12 13:50:49,332][65383] Fps is (10 sec: 44237.1, 60 sec: 46967.4, 300 sec: 47930.2). Total num frames: 145702912. Throughput: 0: 46961.0. Samples: 145765620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 13:50:49,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:50:50,687][65616] Updated weights for policy 0, policy_version 8900 (0.0026) [2024-06-12 13:50:54,332][65383] Fps is (10 sec: 42598.2, 60 sec: 47240.5, 300 sec: 47930.1). Total num frames: 145915904. Throughput: 0: 46836.1. Samples: 146039080. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-12 13:50:54,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:50:54,977][65616] Updated weights for policy 0, policy_version 8910 (0.0030) [2024-06-12 13:50:58,046][65616] Updated weights for policy 0, policy_version 8920 (0.0026) [2024-06-12 13:50:59,332][65383] Fps is (10 sec: 47513.5, 60 sec: 47240.6, 300 sec: 47874.6). Total num frames: 146178048. Throughput: 0: 46718.2. Samples: 146316640. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-12 13:50:59,333][65383] Avg episode reward: [(0, '0.084')] [2024-06-12 13:51:01,929][65616] Updated weights for policy 0, policy_version 8930 (0.0025) [2024-06-12 13:51:04,332][65383] Fps is (10 sec: 47513.9, 60 sec: 46421.4, 300 sec: 47708.0). Total num frames: 146391040. Throughput: 0: 46813.9. Samples: 146458320. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-12 13:51:04,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:51:05,335][65616] Updated weights for policy 0, policy_version 8940 (0.0033) [2024-06-12 13:51:08,991][65616] Updated weights for policy 0, policy_version 8950 (0.0025) [2024-06-12 13:51:09,332][65383] Fps is (10 sec: 45875.4, 60 sec: 46967.6, 300 sec: 47763.5). Total num frames: 146636800. Throughput: 0: 46833.4. Samples: 146742420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 13:51:09,333][65383] Avg episode reward: [(0, '0.084')] [2024-06-12 13:51:12,453][65616] Updated weights for policy 0, policy_version 8960 (0.0028) [2024-06-12 13:51:14,332][65383] Fps is (10 sec: 47513.0, 60 sec: 47240.4, 300 sec: 47763.5). Total num frames: 146866176. Throughput: 0: 46606.9. Samples: 147014540. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-12 13:51:14,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:51:16,013][65616] Updated weights for policy 0, policy_version 8970 (0.0028) [2024-06-12 13:51:19,291][65616] Updated weights for policy 0, policy_version 8980 (0.0028) [2024-06-12 13:51:19,332][65383] Fps is (10 sec: 49151.6, 60 sec: 46967.5, 300 sec: 47708.0). Total num frames: 147128320. Throughput: 0: 46543.1. Samples: 147159620. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-12 13:51:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:51:20,772][65595] Signal inference workers to stop experience collection... 
(2150 times) [2024-06-12 13:51:20,817][65616] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-06-12 13:51:20,827][65595] Signal inference workers to resume experience collection... (2150 times) [2024-06-12 13:51:20,830][65616] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-06-12 13:51:23,526][65616] Updated weights for policy 0, policy_version 8990 (0.0030) [2024-06-12 13:51:24,332][65383] Fps is (10 sec: 44236.8, 60 sec: 46148.2, 300 sec: 47485.8). Total num frames: 147308544. Throughput: 0: 46463.9. Samples: 147436840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 13:51:24,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:51:26,515][65616] Updated weights for policy 0, policy_version 9000 (0.0042) [2024-06-12 13:51:29,332][65383] Fps is (10 sec: 42598.6, 60 sec: 46694.4, 300 sec: 47485.8). Total num frames: 147554304. Throughput: 0: 46200.4. Samples: 147708280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 13:51:29,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:51:30,331][65616] Updated weights for policy 0, policy_version 9010 (0.0032) [2024-06-12 13:51:33,403][65616] Updated weights for policy 0, policy_version 9020 (0.0025) [2024-06-12 13:51:34,332][65383] Fps is (10 sec: 49152.1, 60 sec: 46421.3, 300 sec: 47541.4). Total num frames: 147800064. Throughput: 0: 46179.9. Samples: 147843720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 13:51:34,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:51:37,449][65616] Updated weights for policy 0, policy_version 9030 (0.0037) [2024-06-12 13:51:39,333][65383] Fps is (10 sec: 47513.0, 60 sec: 46148.2, 300 sec: 47485.8). Total num frames: 148029440. Throughput: 0: 46371.0. Samples: 148125780. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 13:51:39,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:51:40,646][65616] Updated weights for policy 0, policy_version 9040 (0.0024) [2024-06-12 13:51:44,333][65383] Fps is (10 sec: 45875.0, 60 sec: 46148.1, 300 sec: 47485.8). Total num frames: 148258816. Throughput: 0: 46571.8. Samples: 148412380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 13:51:44,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:51:44,485][65616] Updated weights for policy 0, policy_version 9050 (0.0031) [2024-06-12 13:51:47,854][65616] Updated weights for policy 0, policy_version 9060 (0.0029) [2024-06-12 13:51:49,332][65383] Fps is (10 sec: 47514.6, 60 sec: 46694.4, 300 sec: 47430.3). Total num frames: 148504576. Throughput: 0: 46415.2. Samples: 148547000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 13:51:49,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:51:51,499][65616] Updated weights for policy 0, policy_version 9070 (0.0023) [2024-06-12 13:51:54,332][65383] Fps is (10 sec: 47513.7, 60 sec: 46967.4, 300 sec: 47374.7). Total num frames: 148733952. Throughput: 0: 46390.0. Samples: 148829980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 13:51:54,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:51:54,455][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000009079_148750336.pth... [2024-06-12 13:51:54,511][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000008389_137445376.pth [2024-06-12 13:51:54,647][65616] Updated weights for policy 0, policy_version 9080 (0.0032) [2024-06-12 13:51:58,372][65616] Updated weights for policy 0, policy_version 9090 (0.0030) [2024-06-12 13:51:59,332][65383] Fps is (10 sec: 45874.7, 60 sec: 46421.3, 300 sec: 47374.7). Total num frames: 148963328. Throughput: 0: 46553.4. Samples: 149109440. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-06-12 13:51:59,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:52:01,674][65616] Updated weights for policy 0, policy_version 9100 (0.0028) [2024-06-12 13:52:04,332][65383] Fps is (10 sec: 44237.1, 60 sec: 46421.3, 300 sec: 47263.7). Total num frames: 149176320. Throughput: 0: 46508.0. Samples: 149252480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-06-12 13:52:04,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:52:05,616][65616] Updated weights for policy 0, policy_version 9110 (0.0028) [2024-06-12 13:52:08,582][65616] Updated weights for policy 0, policy_version 9120 (0.0023) [2024-06-12 13:52:09,333][65383] Fps is (10 sec: 47513.0, 60 sec: 46694.2, 300 sec: 47263.7). Total num frames: 149438464. Throughput: 0: 46448.0. Samples: 149527000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-12 13:52:09,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:52:12,410][65616] Updated weights for policy 0, policy_version 9130 (0.0033) [2024-06-12 13:52:14,333][65383] Fps is (10 sec: 49151.6, 60 sec: 46694.4, 300 sec: 47208.1). Total num frames: 149667840. Throughput: 0: 46577.2. Samples: 149804260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 13:52:14,333][65383] Avg episode reward: [(0, '0.086')] [2024-06-12 13:52:15,660][65616] Updated weights for policy 0, policy_version 9140 (0.0024) [2024-06-12 13:52:19,333][65383] Fps is (10 sec: 44236.7, 60 sec: 45875.1, 300 sec: 47152.6). Total num frames: 149880832. Throughput: 0: 46683.9. Samples: 149944500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 13:52:19,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:52:19,836][65616] Updated weights for policy 0, policy_version 9150 (0.0027) [2024-06-12 13:52:23,540][65616] Updated weights for policy 0, policy_version 9160 (0.0029) [2024-06-12 13:52:24,332][65383] Fps is (10 sec: 42598.8, 60 sec: 46421.4, 300 sec: 47152.6). 
Total num frames: 150093824. Throughput: 0: 46378.8. Samples: 150212820. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 13:52:24,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:52:26,752][65616] Updated weights for policy 0, policy_version 9170 (0.0027) [2024-06-12 13:52:29,332][65383] Fps is (10 sec: 49152.7, 60 sec: 46967.4, 300 sec: 47152.6). Total num frames: 150372352. Throughput: 0: 46117.0. Samples: 150487640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-12 13:52:29,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:52:30,271][65616] Updated weights for policy 0, policy_version 9180 (0.0028) [2024-06-12 13:52:34,010][65616] Updated weights for policy 0, policy_version 9190 (0.0024) [2024-06-12 13:52:34,332][65383] Fps is (10 sec: 49151.9, 60 sec: 46421.4, 300 sec: 47041.5). Total num frames: 150585344. Throughput: 0: 46336.3. Samples: 150632140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-12 13:52:34,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:52:37,445][65616] Updated weights for policy 0, policy_version 9200 (0.0036) [2024-06-12 13:52:39,332][65383] Fps is (10 sec: 44237.2, 60 sec: 46421.5, 300 sec: 47152.6). Total num frames: 150814720. Throughput: 0: 46247.8. Samples: 150911120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-06-12 13:52:39,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:52:40,860][65616] Updated weights for policy 0, policy_version 9210 (0.0027) [2024-06-12 13:52:41,442][65595] Signal inference workers to stop experience collection... (2200 times) [2024-06-12 13:52:41,442][65595] Signal inference workers to resume experience collection... 
(2200 times) [2024-06-12 13:52:41,478][65616] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-06-12 13:52:41,478][65616] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-06-12 13:52:44,333][65383] Fps is (10 sec: 44236.3, 60 sec: 46148.2, 300 sec: 47041.5). Total num frames: 151027712. Throughput: 0: 46297.6. Samples: 151192840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-06-12 13:52:44,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:52:44,530][65616] Updated weights for policy 0, policy_version 9220 (0.0031) [2024-06-12 13:52:47,959][65616] Updated weights for policy 0, policy_version 9230 (0.0030) [2024-06-12 13:52:49,332][65383] Fps is (10 sec: 49151.5, 60 sec: 46694.3, 300 sec: 46986.0). Total num frames: 151306240. Throughput: 0: 46464.9. Samples: 151343400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 13:52:49,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:52:51,239][65616] Updated weights for policy 0, policy_version 9240 (0.0028) [2024-06-12 13:52:54,332][65383] Fps is (10 sec: 44237.5, 60 sec: 45602.2, 300 sec: 46930.4). Total num frames: 151470080. Throughput: 0: 46364.1. Samples: 151613380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 13:52:54,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:52:54,969][65616] Updated weights for policy 0, policy_version 9250 (0.0028) [2024-06-12 13:52:58,284][65616] Updated weights for policy 0, policy_version 9260 (0.0032) [2024-06-12 13:52:59,332][65383] Fps is (10 sec: 42598.5, 60 sec: 46148.3, 300 sec: 46986.0). Total num frames: 151732224. Throughput: 0: 46146.8. Samples: 151880860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 13:52:59,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:53:02,066][65616] Updated weights for policy 0, policy_version 9270 (0.0035) [2024-06-12 13:53:04,333][65383] Fps is (10 sec: 50789.7, 60 sec: 46694.3, 300 sec: 47041.5). 
Total num frames: 151977984. Throughput: 0: 46348.5. Samples: 152030180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-12 13:53:04,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:53:05,002][65616] Updated weights for policy 0, policy_version 9280 (0.0028) [2024-06-12 13:53:09,332][65383] Fps is (10 sec: 44236.8, 60 sec: 45602.2, 300 sec: 46874.9). Total num frames: 152174592. Throughput: 0: 46749.8. Samples: 152316560. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-12 13:53:09,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:53:09,660][65616] Updated weights for policy 0, policy_version 9290 (0.0022) [2024-06-12 13:53:11,986][65616] Updated weights for policy 0, policy_version 9300 (0.0024) [2024-06-12 13:53:14,332][65383] Fps is (10 sec: 44237.2, 60 sec: 45875.3, 300 sec: 46874.9). Total num frames: 152420352. Throughput: 0: 46656.4. Samples: 152587180. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-12 13:53:14,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:53:16,758][65616] Updated weights for policy 0, policy_version 9310 (0.0033) [2024-06-12 13:53:18,881][65616] Updated weights for policy 0, policy_version 9320 (0.0028) [2024-06-12 13:53:19,332][65383] Fps is (10 sec: 54067.1, 60 sec: 47240.6, 300 sec: 47041.5). Total num frames: 152715264. Throughput: 0: 46639.1. Samples: 152730900. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-12 13:53:19,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:53:23,823][65616] Updated weights for policy 0, policy_version 9330 (0.0035) [2024-06-12 13:53:24,332][65383] Fps is (10 sec: 47513.6, 60 sec: 46694.4, 300 sec: 46874.9). Total num frames: 152895488. Throughput: 0: 46509.2. Samples: 153004040. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 24.0) [2024-06-12 13:53:24,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:53:26,114][65616] Updated weights for policy 0, policy_version 9340 (0.0027) [2024-06-12 13:53:29,332][65383] Fps is (10 sec: 40959.6, 60 sec: 45875.1, 300 sec: 46930.4). Total num frames: 153124864. Throughput: 0: 46356.5. Samples: 153278880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 24.0) [2024-06-12 13:53:29,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:53:30,867][65616] Updated weights for policy 0, policy_version 9350 (0.0026) [2024-06-12 13:53:33,533][65616] Updated weights for policy 0, policy_version 9360 (0.0023) [2024-06-12 13:53:34,335][65383] Fps is (10 sec: 49141.3, 60 sec: 46692.7, 300 sec: 46874.5). Total num frames: 153387008. Throughput: 0: 46004.0. Samples: 153413680. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-06-12 13:53:34,335][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:53:37,659][65616] Updated weights for policy 0, policy_version 9370 (0.0025) [2024-06-12 13:53:39,332][65383] Fps is (10 sec: 47514.1, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 153600000. Throughput: 0: 46227.6. Samples: 153693620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-12 13:53:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 13:53:39,333][65595] Saving new best policy, reward=0.092! [2024-06-12 13:53:40,478][65616] Updated weights for policy 0, policy_version 9380 (0.0023) [2024-06-12 13:53:44,332][65383] Fps is (10 sec: 42607.6, 60 sec: 46421.4, 300 sec: 46708.3). Total num frames: 153812992. Throughput: 0: 46634.6. Samples: 153979420. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-12 13:53:44,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:53:44,652][65616] Updated weights for policy 0, policy_version 9390 (0.0029) [2024-06-12 13:53:47,682][65616] Updated weights for policy 0, policy_version 9400 (0.0030) [2024-06-12 13:53:49,332][65383] Fps is (10 sec: 47513.2, 60 sec: 46148.2, 300 sec: 46819.4). Total num frames: 154075136. Throughput: 0: 46348.5. Samples: 154115860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 13:53:49,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:53:51,744][65616] Updated weights for policy 0, policy_version 9410 (0.0024) [2024-06-12 13:53:54,332][65383] Fps is (10 sec: 47513.9, 60 sec: 46967.5, 300 sec: 46708.3). Total num frames: 154288128. Throughput: 0: 46071.5. Samples: 154389780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 13:53:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 13:53:54,404][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000009418_154304512.pth... [2024-06-12 13:53:54,451][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000008735_143114240.pth [2024-06-12 13:53:54,839][65616] Updated weights for policy 0, policy_version 9420 (0.0024) [2024-06-12 13:53:58,303][65616] Updated weights for policy 0, policy_version 9430 (0.0028) [2024-06-12 13:53:59,332][65383] Fps is (10 sec: 45875.5, 60 sec: 46694.4, 300 sec: 46763.8). Total num frames: 154533888. Throughput: 0: 46197.8. Samples: 154666080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 13:53:59,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:54:01,593][65595] Signal inference workers to stop experience collection... (2250 times) [2024-06-12 13:54:01,594][65595] Signal inference workers to resume experience collection... 
(2250 times) [2024-06-12 13:54:01,603][65616] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-06-12 13:54:01,604][65616] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-06-12 13:54:02,108][65616] Updated weights for policy 0, policy_version 9440 (0.0028) [2024-06-12 13:54:04,332][65383] Fps is (10 sec: 47513.8, 60 sec: 46421.5, 300 sec: 46652.8). Total num frames: 154763264. Throughput: 0: 46314.3. Samples: 154815040. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-12 13:54:04,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:54:05,671][65616] Updated weights for policy 0, policy_version 9450 (0.0032) [2024-06-12 13:54:09,089][65616] Updated weights for policy 0, policy_version 9460 (0.0030) [2024-06-12 13:54:09,332][65383] Fps is (10 sec: 45875.2, 60 sec: 46967.5, 300 sec: 46652.7). Total num frames: 154992640. Throughput: 0: 46177.8. Samples: 155082040. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-12 13:54:09,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:54:12,371][65616] Updated weights for policy 0, policy_version 9470 (0.0033) [2024-06-12 13:54:14,332][65383] Fps is (10 sec: 45874.5, 60 sec: 46694.4, 300 sec: 46597.2). Total num frames: 155222016. Throughput: 0: 46380.9. Samples: 155366020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 13:54:14,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:54:16,655][65616] Updated weights for policy 0, policy_version 9480 (0.0028) [2024-06-12 13:54:19,319][65616] Updated weights for policy 0, policy_version 9490 (0.0023) [2024-06-12 13:54:19,332][65383] Fps is (10 sec: 49152.3, 60 sec: 46148.3, 300 sec: 46819.4). Total num frames: 155484160. Throughput: 0: 46365.5. Samples: 155500020. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-12 13:54:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:54:23,780][65616] Updated weights for policy 0, policy_version 9500 (0.0035) [2024-06-12 13:54:24,332][65383] Fps is (10 sec: 45875.2, 60 sec: 46421.3, 300 sec: 46597.2). Total num frames: 155680768. Throughput: 0: 46407.0. Samples: 155781940. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-12 13:54:24,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:54:26,810][65616] Updated weights for policy 0, policy_version 9510 (0.0024) [2024-06-12 13:54:29,332][65383] Fps is (10 sec: 40959.8, 60 sec: 46148.3, 300 sec: 46430.6). Total num frames: 155893760. Throughput: 0: 46136.5. Samples: 156055560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 13:54:29,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:54:31,034][65616] Updated weights for policy 0, policy_version 9520 (0.0028) [2024-06-12 13:54:33,938][65616] Updated weights for policy 0, policy_version 9530 (0.0025) [2024-06-12 13:54:34,332][65383] Fps is (10 sec: 45875.6, 60 sec: 45876.9, 300 sec: 46597.2). Total num frames: 156139520. Throughput: 0: 45936.5. Samples: 156183000. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-12 13:54:34,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:54:38,335][65616] Updated weights for policy 0, policy_version 9540 (0.0027) [2024-06-12 13:54:39,332][65383] Fps is (10 sec: 45875.2, 60 sec: 45875.2, 300 sec: 46486.1). Total num frames: 156352512. Throughput: 0: 46166.7. Samples: 156467280. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-12 13:54:39,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:54:41,201][65616] Updated weights for policy 0, policy_version 9550 (0.0030) [2024-06-12 13:54:44,332][65383] Fps is (10 sec: 42598.2, 60 sec: 45875.2, 300 sec: 46375.0). Total num frames: 156565504. Throughput: 0: 46033.3. Samples: 156737580. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 30.0) [2024-06-12 13:54:44,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:54:45,623][65616] Updated weights for policy 0, policy_version 9560 (0.0034) [2024-06-12 13:54:48,083][65616] Updated weights for policy 0, policy_version 9570 (0.0031) [2024-06-12 13:54:49,332][65383] Fps is (10 sec: 49151.9, 60 sec: 46148.3, 300 sec: 46652.7). Total num frames: 156844032. Throughput: 0: 45572.4. Samples: 156865800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 13:54:49,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:54:52,755][65616] Updated weights for policy 0, policy_version 9580 (0.0034) [2024-06-12 13:54:54,332][65383] Fps is (10 sec: 47513.7, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 157040640. Throughput: 0: 45914.6. Samples: 157148200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 13:54:54,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:54:55,330][65616] Updated weights for policy 0, policy_version 9590 (0.0023) [2024-06-12 13:54:59,332][65383] Fps is (10 sec: 40960.0, 60 sec: 45329.1, 300 sec: 46264.0). Total num frames: 157253632. Throughput: 0: 45653.4. Samples: 157420420. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-12 13:54:59,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:54:59,947][65616] Updated weights for policy 0, policy_version 9600 (0.0028) [2024-06-12 13:55:02,275][65616] Updated weights for policy 0, policy_version 9610 (0.0025) [2024-06-12 13:55:04,333][65383] Fps is (10 sec: 45874.3, 60 sec: 45601.9, 300 sec: 46375.0). Total num frames: 157499392. Throughput: 0: 45673.0. Samples: 157555320. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-12 13:55:04,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:55:07,400][65616] Updated weights for policy 0, policy_version 9620 (0.0027) [2024-06-12 13:55:09,332][65383] Fps is (10 sec: 47513.7, 60 sec: 45602.1, 300 sec: 46430.6). 
Total num frames: 157728768. Throughput: 0: 45636.1. Samples: 157835560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-12 13:55:09,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:55:09,840][65616] Updated weights for policy 0, policy_version 9630 (0.0037) [2024-06-12 13:55:14,267][65616] Updated weights for policy 0, policy_version 9640 (0.0026) [2024-06-12 13:55:14,333][65383] Fps is (10 sec: 44233.1, 60 sec: 45328.3, 300 sec: 46208.3). Total num frames: 157941760. Throughput: 0: 45698.0. Samples: 158112020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 13:55:14,334][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:55:14,710][65595] Signal inference workers to stop experience collection... (2300 times) [2024-06-12 13:55:14,710][65595] Signal inference workers to resume experience collection... (2300 times) [2024-06-12 13:55:14,723][65616] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-06-12 13:55:14,744][65616] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-06-12 13:55:16,955][65616] Updated weights for policy 0, policy_version 9650 (0.0029) [2024-06-12 13:55:19,332][65383] Fps is (10 sec: 47513.2, 60 sec: 45329.0, 300 sec: 46319.5). Total num frames: 158203904. Throughput: 0: 45777.7. Samples: 158243000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 13:55:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:55:21,039][65616] Updated weights for policy 0, policy_version 9660 (0.0030) [2024-06-12 13:55:24,332][65383] Fps is (10 sec: 47519.1, 60 sec: 45602.3, 300 sec: 46319.5). Total num frames: 158416896. Throughput: 0: 45841.4. Samples: 158530140. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 13:55:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 13:55:24,400][65616] Updated weights for policy 0, policy_version 9670 (0.0029) [2024-06-12 13:55:27,767][65616] Updated weights for policy 0, policy_version 9680 (0.0035) [2024-06-12 13:55:29,332][65383] Fps is (10 sec: 44236.7, 60 sec: 45875.1, 300 sec: 46208.4). Total num frames: 158646272. Throughput: 0: 45682.2. Samples: 158793280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 13:55:29,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:55:31,502][65616] Updated weights for policy 0, policy_version 9690 (0.0026) [2024-06-12 13:55:34,332][65383] Fps is (10 sec: 47512.9, 60 sec: 45875.2, 300 sec: 46208.4). Total num frames: 158892032. Throughput: 0: 45911.9. Samples: 158931840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 13:55:34,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:55:35,255][65616] Updated weights for policy 0, policy_version 9700 (0.0030) [2024-06-12 13:55:38,994][65616] Updated weights for policy 0, policy_version 9710 (0.0026) [2024-06-12 13:55:39,332][65383] Fps is (10 sec: 45875.4, 60 sec: 45875.1, 300 sec: 46152.9). Total num frames: 159105024. Throughput: 0: 45875.1. Samples: 159212580. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-12 13:55:39,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:55:42,130][65616] Updated weights for policy 0, policy_version 9720 (0.0033) [2024-06-12 13:55:44,332][65383] Fps is (10 sec: 40960.3, 60 sec: 45602.2, 300 sec: 46097.3). Total num frames: 159301632. Throughput: 0: 45739.6. Samples: 159478700. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-12 13:55:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 13:55:46,288][65616] Updated weights for policy 0, policy_version 9730 (0.0030) [2024-06-12 13:55:49,162][65616] Updated weights for policy 0, policy_version 9740 (0.0024) [2024-06-12 13:55:49,333][65383] Fps is (10 sec: 47513.2, 60 sec: 45602.0, 300 sec: 46319.5). Total num frames: 159580160. Throughput: 0: 45645.0. Samples: 159609340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 23.0) [2024-06-12 13:55:49,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:55:53,303][65616] Updated weights for policy 0, policy_version 9750 (0.0027) [2024-06-12 13:55:54,332][65383] Fps is (10 sec: 49152.1, 60 sec: 45875.3, 300 sec: 46152.9). Total num frames: 159793152. Throughput: 0: 45659.6. Samples: 159890240. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-12 13:55:54,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:55:54,341][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000009753_159793152.pth... [2024-06-12 13:55:54,410][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000009079_148750336.pth [2024-06-12 13:55:56,330][65616] Updated weights for policy 0, policy_version 9760 (0.0032) [2024-06-12 13:55:59,332][65383] Fps is (10 sec: 42598.8, 60 sec: 45875.2, 300 sec: 46152.9). Total num frames: 160006144. Throughput: 0: 45785.0. Samples: 160172300. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-12 13:55:59,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 13:55:59,334][65595] Saving new best policy, reward=0.093! [2024-06-12 13:56:00,896][65616] Updated weights for policy 0, policy_version 9770 (0.0023) [2024-06-12 13:56:03,286][65616] Updated weights for policy 0, policy_version 9780 (0.0024) [2024-06-12 13:56:04,332][65383] Fps is (10 sec: 49151.7, 60 sec: 46421.5, 300 sec: 46264.0). Total num frames: 160284672. Throughput: 0: 45822.7. 
Samples: 160305020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-12 13:56:04,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:56:07,642][65616] Updated weights for policy 0, policy_version 9790 (0.0021) [2024-06-12 13:56:09,332][65383] Fps is (10 sec: 45875.9, 60 sec: 45602.2, 300 sec: 46097.4). Total num frames: 160464896. Throughput: 0: 45665.4. Samples: 160585080. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-12 13:56:09,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 13:56:10,234][65616] Updated weights for policy 0, policy_version 9800 (0.0035) [2024-06-12 13:56:14,332][65383] Fps is (10 sec: 40960.5, 60 sec: 45876.1, 300 sec: 45986.3). Total num frames: 160694272. Throughput: 0: 45933.1. Samples: 160860260. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-12 13:56:14,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:56:15,210][65616] Updated weights for policy 0, policy_version 9810 (0.0027) [2024-06-12 13:56:17,199][65595] Signal inference workers to stop experience collection... (2350 times) [2024-06-12 13:56:17,246][65595] Signal inference workers to resume experience collection... (2350 times) [2024-06-12 13:56:17,248][65616] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-06-12 13:56:17,251][65616] Updated weights for policy 0, policy_version 9820 (0.0022) [2024-06-12 13:56:17,264][65616] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-06-12 13:56:19,332][65383] Fps is (10 sec: 47513.4, 60 sec: 45602.2, 300 sec: 46208.5). Total num frames: 160940032. Throughput: 0: 45759.7. Samples: 160991020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 13:56:19,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:56:22,227][65616] Updated weights for policy 0, policy_version 9830 (0.0028) [2024-06-12 13:56:24,332][65383] Fps is (10 sec: 49151.6, 60 sec: 46148.2, 300 sec: 46208.4). Total num frames: 161185792. Throughput: 0: 45849.9. 
Samples: 161275820. Policy #0 lag: (min: 0.0, avg: 12.5, max: 27.0) [2024-06-12 13:56:24,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:56:24,872][65616] Updated weights for policy 0, policy_version 9840 (0.0028) [2024-06-12 13:56:29,332][65383] Fps is (10 sec: 42598.6, 60 sec: 45329.2, 300 sec: 45986.3). Total num frames: 161366016. Throughput: 0: 45929.0. Samples: 161545500. Policy #0 lag: (min: 0.0, avg: 12.5, max: 27.0) [2024-06-12 13:56:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 13:56:29,435][65616] Updated weights for policy 0, policy_version 9850 (0.0022) [2024-06-12 13:56:31,924][65616] Updated weights for policy 0, policy_version 9860 (0.0031) [2024-06-12 13:56:34,332][65383] Fps is (10 sec: 44236.6, 60 sec: 45602.2, 300 sec: 46097.4). Total num frames: 161628160. Throughput: 0: 45771.7. Samples: 161669060. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-12 13:56:34,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:56:36,621][65616] Updated weights for policy 0, policy_version 9870 (0.0022) [2024-06-12 13:56:39,332][65383] Fps is (10 sec: 49151.7, 60 sec: 45875.3, 300 sec: 46097.4). Total num frames: 161857536. Throughput: 0: 45862.7. Samples: 161954060. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-12 13:56:39,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:56:39,346][65616] Updated weights for policy 0, policy_version 9880 (0.0025) [2024-06-12 13:56:43,552][65616] Updated weights for policy 0, policy_version 9890 (0.0023) [2024-06-12 13:56:44,332][65383] Fps is (10 sec: 45875.5, 60 sec: 46421.3, 300 sec: 46041.8). Total num frames: 162086912. Throughput: 0: 45913.4. Samples: 162238400. 
Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-06-12 13:56:44,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:56:46,338][65616] Updated weights for policy 0, policy_version 9900 (0.0029) [2024-06-12 13:56:49,334][65383] Fps is (10 sec: 44231.0, 60 sec: 45328.2, 300 sec: 45986.1). Total num frames: 162299904. Throughput: 0: 46032.1. Samples: 162376520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 13:56:49,334][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:56:50,661][65616] Updated weights for policy 0, policy_version 9910 (0.0026) [2024-06-12 13:56:53,811][65616] Updated weights for policy 0, policy_version 9920 (0.0029) [2024-06-12 13:56:54,332][65383] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 46041.8). Total num frames: 162545664. Throughput: 0: 45859.1. Samples: 162648740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 13:56:54,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:56:57,370][65616] Updated weights for policy 0, policy_version 9930 (0.0025) [2024-06-12 13:56:59,332][65383] Fps is (10 sec: 47519.6, 60 sec: 46148.3, 300 sec: 46097.4). Total num frames: 162775040. Throughput: 0: 45807.9. Samples: 162921620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 13:56:59,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:57:01,101][65616] Updated weights for policy 0, policy_version 9940 (0.0026) [2024-06-12 13:57:04,262][65616] Updated weights for policy 0, policy_version 9950 (0.0029) [2024-06-12 13:57:04,332][65383] Fps is (10 sec: 47513.8, 60 sec: 45602.2, 300 sec: 46041.8). Total num frames: 163020800. Throughput: 0: 46066.7. Samples: 163064020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 13:57:04,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:57:08,274][65616] Updated weights for policy 0, policy_version 9960 (0.0022) [2024-06-12 13:57:09,333][65383] Fps is (10 sec: 42597.8, 60 sec: 45601.9, 300 sec: 45875.2). 
Total num frames: 163201024. Throughput: 0: 45587.0. Samples: 163327240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 13:57:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 13:57:12,088][65616] Updated weights for policy 0, policy_version 9970 (0.0029) [2024-06-12 13:57:14,334][65383] Fps is (10 sec: 42592.0, 60 sec: 45874.0, 300 sec: 45986.1). Total num frames: 163446784. Throughput: 0: 45637.1. Samples: 163599240. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-12 13:57:14,334][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:57:15,965][65616] Updated weights for policy 0, policy_version 9980 (0.0027) [2024-06-12 13:57:19,021][65616] Updated weights for policy 0, policy_version 9990 (0.0026) [2024-06-12 13:57:19,332][65383] Fps is (10 sec: 49152.8, 60 sec: 45875.2, 300 sec: 46097.4). Total num frames: 163692544. Throughput: 0: 45837.8. Samples: 163731760. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-12 13:57:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:57:23,215][65616] Updated weights for policy 0, policy_version 10000 (0.0026) [2024-06-12 13:57:24,332][65383] Fps is (10 sec: 44243.0, 60 sec: 45056.0, 300 sec: 45819.7). Total num frames: 163889152. Throughput: 0: 45630.6. Samples: 164007440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 13:57:24,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:57:26,313][65616] Updated weights for policy 0, policy_version 10010 (0.0029) [2024-06-12 13:57:29,332][65383] Fps is (10 sec: 44236.7, 60 sec: 46148.2, 300 sec: 45930.7). Total num frames: 164134912. Throughput: 0: 45371.1. Samples: 164280100. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 13:57:29,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:57:30,468][65616] Updated weights for policy 0, policy_version 10020 (0.0031) [2024-06-12 13:57:33,369][65616] Updated weights for policy 0, policy_version 10030 (0.0027) [2024-06-12 13:57:34,332][65383] Fps is (10 sec: 50790.7, 60 sec: 46148.3, 300 sec: 46041.8). Total num frames: 164397056. Throughput: 0: 45379.1. Samples: 164418520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 13:57:34,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:57:37,568][65616] Updated weights for policy 0, policy_version 10040 (0.0030) [2024-06-12 13:57:39,027][65595] Signal inference workers to stop experience collection... (2400 times) [2024-06-12 13:57:39,065][65616] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-06-12 13:57:39,073][65595] Signal inference workers to resume experience collection... (2400 times) [2024-06-12 13:57:39,082][65616] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-06-12 13:57:39,332][65383] Fps is (10 sec: 44237.3, 60 sec: 45329.1, 300 sec: 45930.8). Total num frames: 164577280. Throughput: 0: 45296.5. Samples: 164687080. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-12 13:57:39,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:57:40,242][65616] Updated weights for policy 0, policy_version 10050 (0.0026) [2024-06-12 13:57:44,332][65383] Fps is (10 sec: 39321.1, 60 sec: 45055.9, 300 sec: 45708.6). Total num frames: 164790272. Throughput: 0: 45317.7. Samples: 164960920. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-12 13:57:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 13:57:44,580][65616] Updated weights for policy 0, policy_version 10060 (0.0028) [2024-06-12 13:57:47,521][65616] Updated weights for policy 0, policy_version 10070 (0.0027) [2024-06-12 13:57:49,332][65383] Fps is (10 sec: 47513.1, 60 sec: 45876.2, 300 sec: 46041.8). Total num frames: 165052416. Throughput: 0: 45246.1. Samples: 165100100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 13:57:49,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:57:52,131][65616] Updated weights for policy 0, policy_version 10080 (0.0028) [2024-06-12 13:57:54,332][65383] Fps is (10 sec: 49152.8, 60 sec: 45602.1, 300 sec: 45930.7). Total num frames: 165281792. Throughput: 0: 45460.7. Samples: 165372960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 13:57:54,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:57:54,417][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000010089_165298176.pth... [2024-06-12 13:57:54,460][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000009418_154304512.pth [2024-06-12 13:57:54,741][65616] Updated weights for policy 0, policy_version 10090 (0.0027) [2024-06-12 13:57:59,118][65616] Updated weights for policy 0, policy_version 10100 (0.0028) [2024-06-12 13:57:59,332][65383] Fps is (10 sec: 42598.5, 60 sec: 45056.0, 300 sec: 45764.1). Total num frames: 165478400. Throughput: 0: 45436.1. Samples: 165643800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) [2024-06-12 13:57:59,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:58:01,980][65616] Updated weights for policy 0, policy_version 10110 (0.0032) [2024-06-12 13:58:04,332][65383] Fps is (10 sec: 42598.5, 60 sec: 44782.9, 300 sec: 45875.2). Total num frames: 165707776. Throughput: 0: 45190.3. Samples: 165765320. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-12 13:58:04,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:58:06,691][65616] Updated weights for policy 0, policy_version 10120 (0.0031) [2024-06-12 13:58:09,260][65616] Updated weights for policy 0, policy_version 10130 (0.0023) [2024-06-12 13:58:09,333][65383] Fps is (10 sec: 49151.3, 60 sec: 46148.3, 300 sec: 45930.7). Total num frames: 165969920. Throughput: 0: 45205.2. Samples: 166041680. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-12 13:58:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 13:58:13,447][65616] Updated weights for policy 0, policy_version 10140 (0.0028) [2024-06-12 13:58:14,333][65383] Fps is (10 sec: 44235.8, 60 sec: 45057.0, 300 sec: 45541.9). Total num frames: 166150144. Throughput: 0: 45210.1. Samples: 166314560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 13:58:14,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:58:16,720][65616] Updated weights for policy 0, policy_version 10150 (0.0031) [2024-06-12 13:58:19,332][65383] Fps is (10 sec: 40960.7, 60 sec: 44782.9, 300 sec: 45708.6). Total num frames: 166379520. Throughput: 0: 44992.0. Samples: 166443160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 13:58:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:58:20,956][65616] Updated weights for policy 0, policy_version 10160 (0.0031) [2024-06-12 13:58:23,995][65616] Updated weights for policy 0, policy_version 10170 (0.0027) [2024-06-12 13:58:24,332][65383] Fps is (10 sec: 49152.6, 60 sec: 45875.2, 300 sec: 45819.7). Total num frames: 166641664. Throughput: 0: 45174.1. Samples: 166719920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 13:58:24,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:58:28,019][65616] Updated weights for policy 0, policy_version 10180 (0.0025) [2024-06-12 13:58:29,332][65383] Fps is (10 sec: 47513.0, 60 sec: 45329.0, 300 sec: 45653.4). 
Total num frames: 166854656. Throughput: 0: 45001.3. Samples: 166985980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 13:58:29,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:58:31,603][65616] Updated weights for policy 0, policy_version 10190 (0.0033) [2024-06-12 13:58:34,333][65383] Fps is (10 sec: 40959.5, 60 sec: 44236.7, 300 sec: 45597.5). Total num frames: 167051264. Throughput: 0: 45123.4. Samples: 167130660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 13:58:34,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:58:35,150][65616] Updated weights for policy 0, policy_version 10200 (0.0027) [2024-06-12 13:58:39,283][65616] Updated weights for policy 0, policy_version 10210 (0.0023) [2024-06-12 13:58:39,332][65383] Fps is (10 sec: 42599.2, 60 sec: 45056.0, 300 sec: 45653.1). Total num frames: 167280640. Throughput: 0: 44874.7. Samples: 167392320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 13:58:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 13:58:42,313][65616] Updated weights for policy 0, policy_version 10220 (0.0025) [2024-06-12 13:58:44,332][65383] Fps is (10 sec: 49152.5, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 167542784. Throughput: 0: 45012.4. Samples: 167669360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 13:58:44,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:58:46,418][65616] Updated weights for policy 0, policy_version 10230 (0.0027) [2024-06-12 13:58:49,333][65383] Fps is (10 sec: 47512.7, 60 sec: 45055.9, 300 sec: 45653.0). Total num frames: 167755776. Throughput: 0: 45474.5. Samples: 167811680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 13:58:49,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:58:49,424][65616] Updated weights for policy 0, policy_version 10240 (0.0030) [2024-06-12 13:58:50,534][65595] Signal inference workers to stop experience collection... 
(2450 times) [2024-06-12 13:58:50,534][65595] Signal inference workers to resume experience collection... (2450 times) [2024-06-12 13:58:50,564][65616] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-06-12 13:58:50,564][65616] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-06-12 13:58:53,901][65616] Updated weights for policy 0, policy_version 10250 (0.0027) [2024-06-12 13:58:54,332][65383] Fps is (10 sec: 40960.5, 60 sec: 44509.9, 300 sec: 45486.4). Total num frames: 167952384. Throughput: 0: 45414.0. Samples: 168085300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 13:58:54,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:58:56,634][65616] Updated weights for policy 0, policy_version 10260 (0.0033) [2024-06-12 13:58:59,333][65383] Fps is (10 sec: 45875.1, 60 sec: 45602.0, 300 sec: 45597.5). Total num frames: 168214528. Throughput: 0: 45168.0. Samples: 168347120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 13:58:59,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:59:01,234][65616] Updated weights for policy 0, policy_version 10270 (0.0031) [2024-06-12 13:59:04,082][65616] Updated weights for policy 0, policy_version 10280 (0.0031) [2024-06-12 13:59:04,333][65383] Fps is (10 sec: 47512.6, 60 sec: 45328.9, 300 sec: 45541.9). Total num frames: 168427520. Throughput: 0: 45453.6. Samples: 168488580. Policy #0 lag: (min: 0.0, avg: 6.2, max: 19.0) [2024-06-12 13:59:04,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:59:07,716][65616] Updated weights for policy 0, policy_version 10290 (0.0021) [2024-06-12 13:59:09,332][65383] Fps is (10 sec: 42599.0, 60 sec: 44510.0, 300 sec: 45486.4). Total num frames: 168640512. Throughput: 0: 45252.0. Samples: 168756260. 
Policy #0 lag: (min: 0.0, avg: 6.2, max: 19.0) [2024-06-12 13:59:09,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:59:11,258][65616] Updated weights for policy 0, policy_version 10300 (0.0023) [2024-06-12 13:59:14,332][65383] Fps is (10 sec: 45876.0, 60 sec: 45602.3, 300 sec: 45430.9). Total num frames: 168886272. Throughput: 0: 45254.4. Samples: 169022420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-12 13:59:14,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:59:14,992][65616] Updated weights for policy 0, policy_version 10310 (0.0023) [2024-06-12 13:59:18,594][65616] Updated weights for policy 0, policy_version 10320 (0.0024) [2024-06-12 13:59:19,332][65383] Fps is (10 sec: 49151.5, 60 sec: 45875.1, 300 sec: 45597.5). Total num frames: 169132032. Throughput: 0: 45156.0. Samples: 169162680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 13:59:19,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:59:22,295][65616] Updated weights for policy 0, policy_version 10330 (0.0026) [2024-06-12 13:59:24,332][65383] Fps is (10 sec: 42598.0, 60 sec: 44509.8, 300 sec: 45486.4). Total num frames: 169312256. Throughput: 0: 45294.5. Samples: 169430580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 13:59:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 13:59:25,927][65616] Updated weights for policy 0, policy_version 10340 (0.0026) [2024-06-12 13:59:29,010][65616] Updated weights for policy 0, policy_version 10350 (0.0026) [2024-06-12 13:59:29,332][65383] Fps is (10 sec: 44237.1, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 169574400. Throughput: 0: 45085.4. Samples: 169698200. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-12 13:59:29,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:59:32,784][65616] Updated weights for policy 0, policy_version 10360 (0.0038) [2024-06-12 13:59:34,332][65383] Fps is (10 sec: 47513.5, 60 sec: 45602.2, 300 sec: 45542.0). 
Total num frames: 169787392. Throughput: 0: 45240.9. Samples: 169847520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-12 13:59:34,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:59:36,533][65616] Updated weights for policy 0, policy_version 10370 (0.0024) [2024-06-12 13:59:39,332][65383] Fps is (10 sec: 44237.0, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 170016768. Throughput: 0: 45179.1. Samples: 170118360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 13:59:39,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 13:59:39,923][65616] Updated weights for policy 0, policy_version 10380 (0.0026) [2024-06-12 13:59:44,007][65616] Updated weights for policy 0, policy_version 10390 (0.0034) [2024-06-12 13:59:44,332][65383] Fps is (10 sec: 45875.6, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 170246144. Throughput: 0: 45266.8. Samples: 170384120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 13:59:44,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 13:59:47,071][65616] Updated weights for policy 0, policy_version 10400 (0.0023) [2024-06-12 13:59:49,332][65383] Fps is (10 sec: 44236.8, 60 sec: 45056.1, 300 sec: 45486.4). Total num frames: 170459136. Throughput: 0: 45263.7. Samples: 170525440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 13:59:49,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 13:59:51,127][65616] Updated weights for policy 0, policy_version 10410 (0.0031) [2024-06-12 13:59:54,333][65383] Fps is (10 sec: 44236.1, 60 sec: 45602.0, 300 sec: 45541.9). Total num frames: 170688512. Throughput: 0: 45277.6. Samples: 170793760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 13:59:54,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 13:59:54,454][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000010419_170704896.pth... 
[2024-06-12 13:59:54,498][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000009753_159793152.pth [2024-06-12 13:59:54,966][65616] Updated weights for policy 0, policy_version 10420 (0.0019) [2024-06-12 13:59:58,374][65616] Updated weights for policy 0, policy_version 10430 (0.0029) [2024-06-12 13:59:59,333][65383] Fps is (10 sec: 47512.6, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 170934272. Throughput: 0: 45402.9. Samples: 171065560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 13:59:59,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:00:02,124][65616] Updated weights for policy 0, policy_version 10440 (0.0028) [2024-06-12 14:00:04,332][65383] Fps is (10 sec: 45875.7, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 171147264. Throughput: 0: 45324.5. Samples: 171202280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 14:00:04,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:00:05,765][65616] Updated weights for policy 0, policy_version 10450 (0.0025) [2024-06-12 14:00:09,293][65616] Updated weights for policy 0, policy_version 10460 (0.0027) [2024-06-12 14:00:09,332][65383] Fps is (10 sec: 44237.7, 60 sec: 45602.2, 300 sec: 45542.1). Total num frames: 171376640. Throughput: 0: 45321.0. Samples: 171470020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 14:00:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:00:12,680][65616] Updated weights for policy 0, policy_version 10470 (0.0023) [2024-06-12 14:00:14,332][65383] Fps is (10 sec: 45875.0, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 171606016. Throughput: 0: 45657.7. Samples: 171752800. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 14:00:14,335][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:00:16,452][65616] Updated weights for policy 0, policy_version 10480 (0.0026) [2024-06-12 14:00:17,194][65595] Signal inference workers to stop experience collection... (2500 times) [2024-06-12 14:00:17,194][65595] Signal inference workers to resume experience collection... (2500 times) [2024-06-12 14:00:17,214][65616] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-06-12 14:00:17,214][65616] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-06-12 14:00:19,332][65383] Fps is (10 sec: 42598.4, 60 sec: 44509.9, 300 sec: 45375.3). Total num frames: 171802624. Throughput: 0: 45368.5. Samples: 171889100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 14:00:19,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:00:19,828][65616] Updated weights for policy 0, policy_version 10490 (0.0023) [2024-06-12 14:00:23,542][65616] Updated weights for policy 0, policy_version 10500 (0.0031) [2024-06-12 14:00:24,332][65383] Fps is (10 sec: 44237.1, 60 sec: 45602.2, 300 sec: 45430.9). Total num frames: 172048384. Throughput: 0: 45324.9. Samples: 172157980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 14:00:24,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:00:27,102][65616] Updated weights for policy 0, policy_version 10510 (0.0025) [2024-06-12 14:00:29,332][65383] Fps is (10 sec: 49151.6, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 172294144. Throughput: 0: 45423.9. Samples: 172428200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 14:00:29,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:00:30,781][65616] Updated weights for policy 0, policy_version 10520 (0.0029) [2024-06-12 14:00:34,332][65383] Fps is (10 sec: 45875.6, 60 sec: 45329.2, 300 sec: 45430.9). Total num frames: 172507136. Throughput: 0: 45542.7. 
Samples: 172574860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 14:00:34,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 14:00:34,371][65616] Updated weights for policy 0, policy_version 10530 (0.0035) [2024-06-12 14:00:37,555][65616] Updated weights for policy 0, policy_version 10540 (0.0035) [2024-06-12 14:00:39,333][65383] Fps is (10 sec: 45871.7, 60 sec: 45601.5, 300 sec: 45597.4). Total num frames: 172752896. Throughput: 0: 45449.5. Samples: 172839020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 14:00:39,334][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:00:41,561][65616] Updated weights for policy 0, policy_version 10550 (0.0032) [2024-06-12 14:00:44,332][65383] Fps is (10 sec: 47513.0, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 172982272. Throughput: 0: 45460.6. Samples: 173111280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 14:00:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:00:44,970][65616] Updated weights for policy 0, policy_version 10560 (0.0027) [2024-06-12 14:00:48,580][65616] Updated weights for policy 0, policy_version 10570 (0.0027) [2024-06-12 14:00:49,333][65383] Fps is (10 sec: 45878.4, 60 sec: 45875.1, 300 sec: 45486.4). Total num frames: 173211648. Throughput: 0: 45743.9. Samples: 173260760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 14:00:49,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:00:51,711][65616] Updated weights for policy 0, policy_version 10580 (0.0026) [2024-06-12 14:00:54,333][65383] Fps is (10 sec: 42597.9, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 173408256. Throughput: 0: 45617.1. Samples: 173522800. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 14:00:54,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:00:55,584][65616] Updated weights for policy 0, policy_version 10590 (0.0029) [2024-06-12 14:00:59,332][65383] Fps is (10 sec: 44237.7, 60 sec: 45329.3, 300 sec: 45319.8). Total num frames: 173654016. Throughput: 0: 45549.0. Samples: 173802500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-12 14:00:59,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:00:59,427][65616] Updated weights for policy 0, policy_version 10600 (0.0023) [2024-06-12 14:01:02,774][65616] Updated weights for policy 0, policy_version 10610 (0.0026) [2024-06-12 14:01:04,332][65383] Fps is (10 sec: 47514.3, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 173883392. Throughput: 0: 45506.6. Samples: 173936900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-12 14:01:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:01:06,870][65616] Updated weights for policy 0, policy_version 10620 (0.0032) [2024-06-12 14:01:09,332][65383] Fps is (10 sec: 45874.7, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 174112768. Throughput: 0: 45554.2. Samples: 174207920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 14:01:09,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:01:10,438][65616] Updated weights for policy 0, policy_version 10630 (0.0031) [2024-06-12 14:01:14,332][65383] Fps is (10 sec: 42598.3, 60 sec: 45056.0, 300 sec: 45319.8). Total num frames: 174309376. Throughput: 0: 45467.6. Samples: 174474240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 14:01:14,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:01:14,393][65616] Updated weights for policy 0, policy_version 10640 (0.0035) [2024-06-12 14:01:17,734][65616] Updated weights for policy 0, policy_version 10650 (0.0027) [2024-06-12 14:01:19,332][65383] Fps is (10 sec: 42598.9, 60 sec: 45602.2, 300 sec: 45264.3). 
Total num frames: 174538752. Throughput: 0: 45185.3. Samples: 174608200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 14:01:19,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:01:21,410][65616] Updated weights for policy 0, policy_version 10660 (0.0025) [2024-06-12 14:01:24,333][65383] Fps is (10 sec: 47513.2, 60 sec: 45602.0, 300 sec: 45486.4). Total num frames: 174784512. Throughput: 0: 45354.9. Samples: 174879960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-12 14:01:24,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:01:24,754][65616] Updated weights for policy 0, policy_version 10670 (0.0035) [2024-06-12 14:01:28,770][65616] Updated weights for policy 0, policy_version 10680 (0.0027) [2024-06-12 14:01:29,332][65383] Fps is (10 sec: 47513.2, 60 sec: 45329.1, 300 sec: 45375.4). Total num frames: 175013888. Throughput: 0: 45507.1. Samples: 175159100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-12 14:01:29,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:01:32,274][65616] Updated weights for policy 0, policy_version 10690 (0.0030) [2024-06-12 14:01:34,332][65383] Fps is (10 sec: 42598.6, 60 sec: 45055.9, 300 sec: 45264.3). Total num frames: 175210496. Throughput: 0: 45109.8. Samples: 175290700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 14:01:34,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:01:36,139][65616] Updated weights for policy 0, policy_version 10700 (0.0030) [2024-06-12 14:01:39,202][65616] Updated weights for policy 0, policy_version 10710 (0.0029) [2024-06-12 14:01:39,333][65383] Fps is (10 sec: 45874.1, 60 sec: 45329.5, 300 sec: 45375.3). Total num frames: 175472640. Throughput: 0: 44950.1. Samples: 175545560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 14:01:39,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:01:43,494][65616] Updated weights for policy 0, policy_version 10720 (0.0029) [2024-06-12 14:01:43,665][65595] Signal inference workers to stop experience collection... (2550 times) [2024-06-12 14:01:43,694][65616] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-06-12 14:01:43,720][65595] Signal inference workers to resume experience collection... (2550 times) [2024-06-12 14:01:43,721][65616] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-06-12 14:01:44,332][65383] Fps is (10 sec: 50790.7, 60 sec: 45602.2, 300 sec: 45486.6). Total num frames: 175718400. Throughput: 0: 44993.2. Samples: 175827200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 14:01:44,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:01:46,799][65616] Updated weights for policy 0, policy_version 10730 (0.0027) [2024-06-12 14:01:49,332][65383] Fps is (10 sec: 40961.2, 60 sec: 44510.0, 300 sec: 45208.7). Total num frames: 175882240. Throughput: 0: 44920.9. Samples: 175958340. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-12 14:01:49,333][65383] Avg episode reward: [(0, '0.086')] [2024-06-12 14:01:50,825][65616] Updated weights for policy 0, policy_version 10740 (0.0029) [2024-06-12 14:01:54,189][65616] Updated weights for policy 0, policy_version 10750 (0.0030) [2024-06-12 14:01:54,332][65383] Fps is (10 sec: 40960.4, 60 sec: 45329.3, 300 sec: 45264.3). Total num frames: 176128000. Throughput: 0: 44737.9. Samples: 176221120. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-12 14:01:54,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:01:54,451][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000010751_176144384.pth... 
[2024-06-12 14:01:54,505][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000010089_165298176.pth [2024-06-12 14:01:57,867][65616] Updated weights for policy 0, policy_version 10760 (0.0027) [2024-06-12 14:01:59,332][65383] Fps is (10 sec: 47513.6, 60 sec: 45056.0, 300 sec: 45208.7). Total num frames: 176357376. Throughput: 0: 44884.5. Samples: 176494040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 14:01:59,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:02:01,041][65616] Updated weights for policy 0, policy_version 10770 (0.0028) [2024-06-12 14:02:04,332][65383] Fps is (10 sec: 44236.4, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 176570368. Throughput: 0: 44830.1. Samples: 176625560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 14:02:04,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:02:05,033][65616] Updated weights for policy 0, policy_version 10780 (0.0027) [2024-06-12 14:02:08,152][65616] Updated weights for policy 0, policy_version 10790 (0.0029) [2024-06-12 14:02:09,336][65383] Fps is (10 sec: 44221.3, 60 sec: 44780.4, 300 sec: 45264.0). Total num frames: 176799744. Throughput: 0: 44901.1. Samples: 176900660. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-12 14:02:09,336][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 14:02:11,863][65616] Updated weights for policy 0, policy_version 10800 (0.0032) [2024-06-12 14:02:14,332][65383] Fps is (10 sec: 47513.6, 60 sec: 45602.2, 300 sec: 45264.3). Total num frames: 177045504. Throughput: 0: 44884.5. Samples: 177178900. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-12 14:02:14,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:02:15,727][65616] Updated weights for policy 0, policy_version 10810 (0.0025) [2024-06-12 14:02:19,333][65383] Fps is (10 sec: 45890.6, 60 sec: 45328.9, 300 sec: 45319.8). Total num frames: 177258496. Throughput: 0: 44992.4. Samples: 177315360. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-12 14:02:19,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:02:19,568][65616] Updated weights for policy 0, policy_version 10820 (0.0035) [2024-06-12 14:02:22,724][65616] Updated weights for policy 0, policy_version 10830 (0.0027) [2024-06-12 14:02:24,332][65383] Fps is (10 sec: 40960.1, 60 sec: 44510.0, 300 sec: 45153.2). Total num frames: 177455104. Throughput: 0: 45246.0. Samples: 177581620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 14:02:24,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:02:26,909][65616] Updated weights for policy 0, policy_version 10840 (0.0031) [2024-06-12 14:02:29,332][65383] Fps is (10 sec: 47514.3, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 177733632. Throughput: 0: 44863.6. Samples: 177846060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 14:02:29,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:02:30,919][65616] Updated weights for policy 0, policy_version 10850 (0.0034) [2024-06-12 14:02:34,062][65616] Updated weights for policy 0, policy_version 10860 (0.0027) [2024-06-12 14:02:34,333][65383] Fps is (10 sec: 47512.9, 60 sec: 45329.0, 300 sec: 45264.2). Total num frames: 177930240. Throughput: 0: 44992.3. Samples: 177983000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 14:02:34,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:02:37,468][65616] Updated weights for policy 0, policy_version 10870 (0.0030) [2024-06-12 14:02:39,332][65383] Fps is (10 sec: 45875.5, 60 sec: 45329.3, 300 sec: 45430.9). Total num frames: 178192384. Throughput: 0: 45349.8. Samples: 178261860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 14:02:39,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 14:02:41,166][65616] Updated weights for policy 0, policy_version 10880 (0.0032) [2024-06-12 14:02:44,332][65383] Fps is (10 sec: 44237.2, 60 sec: 44236.8, 300 sec: 45153.2). 
Total num frames: 178372608. Throughput: 0: 45309.3. Samples: 178532960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 14:02:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:02:44,973][65616] Updated weights for policy 0, policy_version 10890 (0.0023) [2024-06-12 14:02:48,484][65616] Updated weights for policy 0, policy_version 10900 (0.0020) [2024-06-12 14:02:49,333][65383] Fps is (10 sec: 42597.4, 60 sec: 45602.0, 300 sec: 45208.7). Total num frames: 178618368. Throughput: 0: 45328.3. Samples: 178665340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 14:02:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:02:52,154][65616] Updated weights for policy 0, policy_version 10910 (0.0039) [2024-06-12 14:02:54,332][65383] Fps is (10 sec: 49151.7, 60 sec: 45602.0, 300 sec: 45375.3). Total num frames: 178864128. Throughput: 0: 45473.2. Samples: 178946800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 14:02:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:02:55,575][65616] Updated weights for policy 0, policy_version 10920 (0.0028) [2024-06-12 14:02:59,118][65616] Updated weights for policy 0, policy_version 10930 (0.0024) [2024-06-12 14:02:59,332][65383] Fps is (10 sec: 45875.5, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 179077120. Throughput: 0: 45433.7. Samples: 179223420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-12 14:02:59,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:03:02,711][65616] Updated weights for policy 0, policy_version 10940 (0.0035) [2024-06-12 14:03:04,332][65383] Fps is (10 sec: 42599.0, 60 sec: 45329.1, 300 sec: 45153.2). Total num frames: 179290112. Throughput: 0: 45260.1. Samples: 179352060. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-12 14:03:04,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:03:06,567][65616] Updated weights for policy 0, policy_version 10950 (0.0035) [2024-06-12 14:03:09,332][65383] Fps is (10 sec: 47514.2, 60 sec: 45877.9, 300 sec: 45430.9). Total num frames: 179552256. Throughput: 0: 45499.6. Samples: 179629100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-12 14:03:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:03:09,431][65616] Updated weights for policy 0, policy_version 10960 (0.0023) [2024-06-12 14:03:12,020][65595] Signal inference workers to stop experience collection... (2600 times) [2024-06-12 14:03:12,066][65616] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-06-12 14:03:12,122][65595] Signal inference workers to resume experience collection... (2600 times) [2024-06-12 14:03:12,122][65616] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-06-12 14:03:13,571][65616] Updated weights for policy 0, policy_version 10970 (0.0030) [2024-06-12 14:03:14,332][65383] Fps is (10 sec: 45874.8, 60 sec: 45056.0, 300 sec: 45319.8). Total num frames: 179748864. Throughput: 0: 45626.1. Samples: 179899240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-12 14:03:14,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:03:16,889][65616] Updated weights for policy 0, policy_version 10980 (0.0034) [2024-06-12 14:03:19,332][65383] Fps is (10 sec: 40959.5, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 179961856. Throughput: 0: 45397.0. Samples: 180025860. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 14:03:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:03:20,882][65616] Updated weights for policy 0, policy_version 10990 (0.0032) [2024-06-12 14:03:23,919][65616] Updated weights for policy 0, policy_version 11000 (0.0027) [2024-06-12 14:03:24,332][65383] Fps is (10 sec: 47513.9, 60 sec: 46148.3, 300 sec: 45319.8). Total num frames: 180224000. Throughput: 0: 45564.8. Samples: 180312280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 14:03:24,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:03:28,334][65616] Updated weights for policy 0, policy_version 11010 (0.0028) [2024-06-12 14:03:29,332][65383] Fps is (10 sec: 49152.5, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 180453376. Throughput: 0: 45304.5. Samples: 180571660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 14:03:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:03:31,722][65616] Updated weights for policy 0, policy_version 11020 (0.0030) [2024-06-12 14:03:34,332][65383] Fps is (10 sec: 42598.1, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 180649984. Throughput: 0: 45359.2. Samples: 180706500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 14:03:34,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:03:35,582][65616] Updated weights for policy 0, policy_version 11030 (0.0028) [2024-06-12 14:03:39,301][65616] Updated weights for policy 0, policy_version 11040 (0.0021) [2024-06-12 14:03:39,332][65383] Fps is (10 sec: 42598.0, 60 sec: 44782.8, 300 sec: 45208.7). Total num frames: 180879360. Throughput: 0: 45269.8. Samples: 180983940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 14:03:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:03:42,511][65616] Updated weights for policy 0, policy_version 11050 (0.0026) [2024-06-12 14:03:44,332][65383] Fps is (10 sec: 45875.8, 60 sec: 45602.2, 300 sec: 45264.3). 
Total num frames: 181108736. Throughput: 0: 45156.1. Samples: 181255440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 14:03:44,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:03:46,790][65616] Updated weights for policy 0, policy_version 11060 (0.0026) [2024-06-12 14:03:49,333][65383] Fps is (10 sec: 47513.2, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 181354496. Throughput: 0: 45359.0. Samples: 181393220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 14:03:49,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:03:49,525][65616] Updated weights for policy 0, policy_version 11070 (0.0026) [2024-06-12 14:03:53,913][65616] Updated weights for policy 0, policy_version 11080 (0.0023) [2024-06-12 14:03:54,332][65383] Fps is (10 sec: 42598.1, 60 sec: 44509.9, 300 sec: 45153.2). Total num frames: 181534720. Throughput: 0: 45047.5. Samples: 181656240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 14:03:54,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:03:54,374][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000011081_181551104.pth... [2024-06-12 14:03:54,417][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000010419_170704896.pth [2024-06-12 14:03:56,936][65616] Updated weights for policy 0, policy_version 11090 (0.0030) [2024-06-12 14:03:59,332][65383] Fps is (10 sec: 40960.1, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 181764096. Throughput: 0: 44877.8. Samples: 181918740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 14:03:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:04:01,639][65616] Updated weights for policy 0, policy_version 11100 (0.0028) [2024-06-12 14:04:04,258][65616] Updated weights for policy 0, policy_version 11110 (0.0020) [2024-06-12 14:04:04,332][65383] Fps is (10 sec: 49151.8, 60 sec: 45602.1, 300 sec: 45375.3). Total num frames: 182026240. Throughput: 0: 45109.3. 
Samples: 182055780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 14:04:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:04:09,004][65616] Updated weights for policy 0, policy_version 11120 (0.0023) [2024-06-12 14:04:09,332][65383] Fps is (10 sec: 42599.0, 60 sec: 43963.7, 300 sec: 45097.7). Total num frames: 182190080. Throughput: 0: 44702.3. Samples: 182323880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 14:04:09,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:04:11,475][65616] Updated weights for policy 0, policy_version 11130 (0.0025) [2024-06-12 14:04:14,333][65383] Fps is (10 sec: 42598.2, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 182452224. Throughput: 0: 44962.5. Samples: 182594980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 14:04:14,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:04:15,947][65616] Updated weights for policy 0, policy_version 11140 (0.0035) [2024-06-12 14:04:18,866][65616] Updated weights for policy 0, policy_version 11150 (0.0026) [2024-06-12 14:04:19,332][65383] Fps is (10 sec: 50790.1, 60 sec: 45602.2, 300 sec: 45375.4). Total num frames: 182697984. Throughput: 0: 45029.0. Samples: 182732800. Policy #0 lag: (min: 1.0, avg: 8.6, max: 23.0) [2024-06-12 14:04:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:04:23,093][65616] Updated weights for policy 0, policy_version 11160 (0.0027) [2024-06-12 14:04:24,332][65383] Fps is (10 sec: 44237.2, 60 sec: 44509.9, 300 sec: 45153.2). Total num frames: 182894592. Throughput: 0: 45022.7. Samples: 183009960. Policy #0 lag: (min: 1.0, avg: 8.6, max: 23.0) [2024-06-12 14:04:24,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:04:26,122][65616] Updated weights for policy 0, policy_version 11170 (0.0027) [2024-06-12 14:04:29,332][65383] Fps is (10 sec: 40959.8, 60 sec: 44236.7, 300 sec: 45153.2). Total num frames: 183107584. Throughput: 0: 44752.8. Samples: 183269320. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 14:04:29,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:04:30,102][65616] Updated weights for policy 0, policy_version 11180 (0.0026) [2024-06-12 14:04:33,425][65595] Signal inference workers to stop experience collection... (2650 times) [2024-06-12 14:04:33,425][65595] Signal inference workers to resume experience collection... (2650 times) [2024-06-12 14:04:33,465][65616] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-06-12 14:04:33,466][65616] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-06-12 14:04:33,554][65616] Updated weights for policy 0, policy_version 11190 (0.0037) [2024-06-12 14:04:34,332][65383] Fps is (10 sec: 45875.1, 60 sec: 45056.0, 300 sec: 45208.7). Total num frames: 183353344. Throughput: 0: 44663.2. Samples: 183403060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 14:04:34,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:04:37,342][65616] Updated weights for policy 0, policy_version 11200 (0.0032) [2024-06-12 14:04:39,332][65383] Fps is (10 sec: 45875.8, 60 sec: 44783.0, 300 sec: 45153.2). Total num frames: 183566336. Throughput: 0: 44883.2. Samples: 183675980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 14:04:39,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 14:04:41,181][65616] Updated weights for policy 0, policy_version 11210 (0.0027) [2024-06-12 14:04:44,332][65383] Fps is (10 sec: 44236.8, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 183795712. Throughput: 0: 45051.2. Samples: 183946040. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 14:04:44,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:04:44,840][65616] Updated weights for policy 0, policy_version 11220 (0.0019) [2024-06-12 14:04:48,588][65616] Updated weights for policy 0, policy_version 11230 (0.0027) [2024-06-12 14:04:49,332][65383] Fps is (10 sec: 44236.9, 60 sec: 44237.0, 300 sec: 45153.2). Total num frames: 184008704. Throughput: 0: 44996.1. Samples: 184080600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 14:04:49,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:04:51,704][65616] Updated weights for policy 0, policy_version 11240 (0.0029) [2024-06-12 14:04:54,332][65383] Fps is (10 sec: 44237.3, 60 sec: 45056.1, 300 sec: 45097.7). Total num frames: 184238080. Throughput: 0: 44787.2. Samples: 184339300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 14:04:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:04:56,154][65616] Updated weights for policy 0, policy_version 11250 (0.0027) [2024-06-12 14:04:59,332][65383] Fps is (10 sec: 45874.9, 60 sec: 45056.1, 300 sec: 45153.2). Total num frames: 184467456. Throughput: 0: 44648.1. Samples: 184604140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 14:04:59,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:04:59,361][65616] Updated weights for policy 0, policy_version 11260 (0.0031) [2024-06-12 14:05:03,726][65616] Updated weights for policy 0, policy_version 11270 (0.0028) [2024-06-12 14:05:04,332][65383] Fps is (10 sec: 42598.2, 60 sec: 43963.8, 300 sec: 45042.1). Total num frames: 184664064. Throughput: 0: 44626.3. Samples: 184740980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 14:05:04,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:05:06,654][65616] Updated weights for policy 0, policy_version 11280 (0.0022) [2024-06-12 14:05:09,333][65383] Fps is (10 sec: 42597.7, 60 sec: 45055.9, 300 sec: 45042.1). 
Total num frames: 184893440. Throughput: 0: 44399.5. Samples: 185007940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 14:05:09,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:05:10,979][65616] Updated weights for policy 0, policy_version 11290 (0.0026) [2024-06-12 14:05:13,775][65616] Updated weights for policy 0, policy_version 11300 (0.0022) [2024-06-12 14:05:14,333][65383] Fps is (10 sec: 47512.8, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 185139200. Throughput: 0: 44511.5. Samples: 185272340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 14:05:14,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:05:18,750][65616] Updated weights for policy 0, policy_version 11310 (0.0032) [2024-06-12 14:05:19,333][65383] Fps is (10 sec: 45875.3, 60 sec: 44236.7, 300 sec: 45097.6). Total num frames: 185352192. Throughput: 0: 44607.9. Samples: 185410420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-12 14:05:19,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:05:21,066][65616] Updated weights for policy 0, policy_version 11320 (0.0021) [2024-06-12 14:05:24,332][65383] Fps is (10 sec: 42598.8, 60 sec: 44509.9, 300 sec: 44986.6). Total num frames: 185565184. Throughput: 0: 44352.8. Samples: 185671860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-12 14:05:24,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:05:25,908][65616] Updated weights for policy 0, policy_version 11330 (0.0029) [2024-06-12 14:05:28,466][65616] Updated weights for policy 0, policy_version 11340 (0.0024) [2024-06-12 14:05:29,335][65383] Fps is (10 sec: 47500.7, 60 sec: 45327.0, 300 sec: 45152.7). Total num frames: 185827328. Throughput: 0: 44159.9. Samples: 185933360. 
Policy #0 lag: (min: 2.0, avg: 10.4, max: 21.0) [2024-06-12 14:05:29,336][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:05:32,946][65616] Updated weights for policy 0, policy_version 11350 (0.0028) [2024-06-12 14:05:34,332][65383] Fps is (10 sec: 44237.1, 60 sec: 44236.8, 300 sec: 44931.2). Total num frames: 186007552. Throughput: 0: 44414.6. Samples: 186079260. Policy #0 lag: (min: 2.0, avg: 10.4, max: 21.0) [2024-06-12 14:05:34,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:05:36,047][65616] Updated weights for policy 0, policy_version 11360 (0.0023) [2024-06-12 14:05:37,529][65595] Signal inference workers to stop experience collection... (2700 times) [2024-06-12 14:05:37,529][65595] Signal inference workers to resume experience collection... (2700 times) [2024-06-12 14:05:37,538][65616] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-06-12 14:05:37,538][65616] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-06-12 14:05:39,332][65383] Fps is (10 sec: 42610.5, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 186253312. Throughput: 0: 44498.6. Samples: 186341740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-12 14:05:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:05:40,325][65616] Updated weights for policy 0, policy_version 11370 (0.0031) [2024-06-12 14:05:43,498][65616] Updated weights for policy 0, policy_version 11380 (0.0029) [2024-06-12 14:05:44,332][65383] Fps is (10 sec: 47513.5, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 186482688. Throughput: 0: 44615.5. Samples: 186611840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-12 14:05:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:05:47,052][65616] Updated weights for policy 0, policy_version 11390 (0.0022) [2024-06-12 14:05:49,332][65383] Fps is (10 sec: 44236.7, 60 sec: 44782.9, 300 sec: 45042.1). Total num frames: 186695680. Throughput: 0: 44495.5. 
Samples: 186743280. Policy #0 lag: (min: 1.0, avg: 11.9, max: 23.0) [2024-06-12 14:05:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:05:51,004][65616] Updated weights for policy 0, policy_version 11400 (0.0025) [2024-06-12 14:05:54,332][65383] Fps is (10 sec: 44236.6, 60 sec: 44782.8, 300 sec: 44986.6). Total num frames: 186925056. Throughput: 0: 44589.9. Samples: 187014480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 14:05:54,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:05:54,347][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000011409_186925056.pth... [2024-06-12 14:05:54,406][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000010751_176144384.pth [2024-06-12 14:05:54,550][65616] Updated weights for policy 0, policy_version 11410 (0.0025) [2024-06-12 14:05:58,494][65616] Updated weights for policy 0, policy_version 11420 (0.0028) [2024-06-12 14:05:59,332][65383] Fps is (10 sec: 44236.5, 60 sec: 44509.8, 300 sec: 44931.0). Total num frames: 187138048. Throughput: 0: 44648.9. Samples: 187281540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 14:05:59,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:06:01,690][65616] Updated weights for policy 0, policy_version 11430 (0.0029) [2024-06-12 14:06:04,332][65383] Fps is (10 sec: 44236.7, 60 sec: 45055.9, 300 sec: 44931.0). Total num frames: 187367424. Throughput: 0: 44551.1. Samples: 187415220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 14:06:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:06:06,080][65616] Updated weights for policy 0, policy_version 11440 (0.0037) [2024-06-12 14:06:09,333][65383] Fps is (10 sec: 44236.2, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 187580416. Throughput: 0: 44696.7. Samples: 187683220. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 14:06:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:06:09,378][65616] Updated weights for policy 0, policy_version 11450 (0.0029) [2024-06-12 14:06:13,310][65616] Updated weights for policy 0, policy_version 11460 (0.0029) [2024-06-12 14:06:14,332][65383] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 44931.0). Total num frames: 187793408. Throughput: 0: 44944.1. Samples: 187955720. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-12 14:06:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:06:16,667][65616] Updated weights for policy 0, policy_version 11470 (0.0026) [2024-06-12 14:06:19,332][65383] Fps is (10 sec: 42598.9, 60 sec: 44236.8, 300 sec: 44820.0). Total num frames: 188006400. Throughput: 0: 44384.3. Samples: 188076560. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-12 14:06:19,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:06:20,244][65616] Updated weights for policy 0, policy_version 11480 (0.0032) [2024-06-12 14:06:23,748][65616] Updated weights for policy 0, policy_version 11490 (0.0024) [2024-06-12 14:06:24,332][65383] Fps is (10 sec: 47513.7, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 188268544. Throughput: 0: 44773.7. Samples: 188356560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 14:06:24,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:06:27,702][65616] Updated weights for policy 0, policy_version 11500 (0.0024) [2024-06-12 14:06:29,332][65383] Fps is (10 sec: 45875.6, 60 sec: 43965.8, 300 sec: 44931.0). Total num frames: 188465152. Throughput: 0: 44552.4. Samples: 188616700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 14:06:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:06:31,469][65616] Updated weights for policy 0, policy_version 11510 (0.0033) [2024-06-12 14:06:34,332][65383] Fps is (10 sec: 40960.3, 60 sec: 44509.8, 300 sec: 44764.5). 
Total num frames: 188678144. Throughput: 0: 44497.8. Samples: 188745680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 14:06:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:06:35,405][65616] Updated weights for policy 0, policy_version 11520 (0.0032) [2024-06-12 14:06:39,234][65616] Updated weights for policy 0, policy_version 11530 (0.0025) [2024-06-12 14:06:39,332][65383] Fps is (10 sec: 44237.2, 60 sec: 44236.9, 300 sec: 44708.9). Total num frames: 188907520. Throughput: 0: 44593.9. Samples: 189021200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 14:06:39,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:06:42,746][65616] Updated weights for policy 0, policy_version 11540 (0.0033) [2024-06-12 14:06:44,332][65383] Fps is (10 sec: 44236.5, 60 sec: 43963.7, 300 sec: 44875.5). Total num frames: 189120512. Throughput: 0: 44235.6. Samples: 189272140. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-12 14:06:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:06:46,345][65616] Updated weights for policy 0, policy_version 11550 (0.0026) [2024-06-12 14:06:49,332][65383] Fps is (10 sec: 45874.9, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 189366272. Throughput: 0: 44254.3. Samples: 189406660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 14:06:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:06:50,334][65616] Updated weights for policy 0, policy_version 11560 (0.0026) [2024-06-12 14:06:53,698][65616] Updated weights for policy 0, policy_version 11570 (0.0025) [2024-06-12 14:06:54,332][65383] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 44819.9). Total num frames: 189579264. Throughput: 0: 44437.0. Samples: 189682880. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 14:06:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:06:57,378][65616] Updated weights for policy 0, policy_version 11580 (0.0027) [2024-06-12 14:06:59,332][65383] Fps is (10 sec: 44236.5, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 189808640. Throughput: 0: 44389.4. Samples: 189953240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 14:06:59,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:07:00,962][65616] Updated weights for policy 0, policy_version 11590 (0.0028) [2024-06-12 14:07:04,332][65383] Fps is (10 sec: 45875.3, 60 sec: 44509.9, 300 sec: 44876.0). Total num frames: 190038016. Throughput: 0: 44730.2. Samples: 190089420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 14:07:04,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:07:04,465][65616] Updated weights for policy 0, policy_version 11600 (0.0025) [2024-06-12 14:07:08,118][65616] Updated weights for policy 0, policy_version 11610 (0.0031) [2024-06-12 14:07:09,332][65383] Fps is (10 sec: 45875.6, 60 sec: 44783.1, 300 sec: 44820.0). Total num frames: 190267392. Throughput: 0: 44475.2. Samples: 190357940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 14:07:09,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:07:11,525][65616] Updated weights for policy 0, policy_version 11620 (0.0031) [2024-06-12 14:07:14,333][65383] Fps is (10 sec: 42598.0, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 190464000. Throughput: 0: 44854.1. Samples: 190635140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 14:07:14,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:07:15,521][65616] Updated weights for policy 0, policy_version 11630 (0.0028) [2024-06-12 14:07:19,196][65616] Updated weights for policy 0, policy_version 11640 (0.0031) [2024-06-12 14:07:19,332][65383] Fps is (10 sec: 44236.7, 60 sec: 45056.1, 300 sec: 44931.0). 
Total num frames: 190709760. Throughput: 0: 44596.0. Samples: 190752500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 14:07:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:07:22,695][65616] Updated weights for policy 0, policy_version 11650 (0.0032) [2024-06-12 14:07:24,332][65383] Fps is (10 sec: 45875.8, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 190922752. Throughput: 0: 44495.0. Samples: 191023480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 14:07:24,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:07:26,962][65616] Updated weights for policy 0, policy_version 11660 (0.0025) [2024-06-12 14:07:27,925][65595] Signal inference workers to stop experience collection... (2750 times) [2024-06-12 14:07:27,925][65595] Signal inference workers to resume experience collection... (2750 times) [2024-06-12 14:07:27,951][65616] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-06-12 14:07:27,951][65616] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-06-12 14:07:29,333][65383] Fps is (10 sec: 44235.7, 60 sec: 44782.8, 300 sec: 44819.9). Total num frames: 191152128. Throughput: 0: 44859.4. Samples: 191290820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 14:07:29,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:07:30,262][65616] Updated weights for policy 0, policy_version 11670 (0.0026) [2024-06-12 14:07:34,217][65616] Updated weights for policy 0, policy_version 11680 (0.0037) [2024-06-12 14:07:34,332][65383] Fps is (10 sec: 44236.5, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 191365120. Throughput: 0: 44819.9. Samples: 191423560. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-12 14:07:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:07:37,423][65616] Updated weights for policy 0, policy_version 11690 (0.0027) [2024-06-12 14:07:39,332][65383] Fps is (10 sec: 45876.7, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 191610880. Throughput: 0: 44536.2. Samples: 191687000. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-12 14:07:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:07:41,558][65616] Updated weights for policy 0, policy_version 11700 (0.0033) [2024-06-12 14:07:44,332][65383] Fps is (10 sec: 44237.4, 60 sec: 44783.0, 300 sec: 44708.9). Total num frames: 191807488. Throughput: 0: 44705.0. Samples: 191964960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 14:07:44,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:07:44,742][65616] Updated weights for policy 0, policy_version 11710 (0.0034) [2024-06-12 14:07:49,147][65616] Updated weights for policy 0, policy_version 11720 (0.0034) [2024-06-12 14:07:49,332][65383] Fps is (10 sec: 40959.5, 60 sec: 44236.8, 300 sec: 44597.8). Total num frames: 192020480. Throughput: 0: 44331.6. Samples: 192084340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 14:07:49,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:07:52,256][65616] Updated weights for policy 0, policy_version 11730 (0.0029) [2024-06-12 14:07:54,332][65383] Fps is (10 sec: 45874.8, 60 sec: 44783.0, 300 sec: 44708.9). Total num frames: 192266240. Throughput: 0: 44372.8. Samples: 192354720. Policy #0 lag: (min: 1.0, avg: 12.9, max: 24.0) [2024-06-12 14:07:54,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:07:54,348][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000011735_192266240.pth... 
[2024-06-12 14:07:54,395][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000011081_181551104.pth [2024-06-12 14:07:56,340][65616] Updated weights for policy 0, policy_version 11740 (0.0035) [2024-06-12 14:07:59,325][65616] Updated weights for policy 0, policy_version 11750 (0.0027) [2024-06-12 14:07:59,332][65383] Fps is (10 sec: 49152.1, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 192512000. Throughput: 0: 44284.2. Samples: 192627920. Policy #0 lag: (min: 1.0, avg: 12.9, max: 24.0) [2024-06-12 14:07:59,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:08:03,559][65616] Updated weights for policy 0, policy_version 11760 (0.0030) [2024-06-12 14:08:04,332][65383] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44542.2). Total num frames: 192692224. Throughput: 0: 44481.7. Samples: 192754180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 14:08:04,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:08:07,069][65616] Updated weights for policy 0, policy_version 11770 (0.0028) [2024-06-12 14:08:09,333][65383] Fps is (10 sec: 40959.6, 60 sec: 44236.7, 300 sec: 44653.3). Total num frames: 192921600. Throughput: 0: 44378.6. Samples: 193020520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 14:08:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:08:11,025][65616] Updated weights for policy 0, policy_version 11780 (0.0026) [2024-06-12 14:08:14,332][65383] Fps is (10 sec: 45875.5, 60 sec: 44783.1, 300 sec: 44708.9). Total num frames: 193150976. Throughput: 0: 44402.9. Samples: 193288940. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 14:08:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:08:14,464][65616] Updated weights for policy 0, policy_version 11790 (0.0031) [2024-06-12 14:08:18,005][65616] Updated weights for policy 0, policy_version 11800 (0.0025) [2024-06-12 14:08:19,332][65383] Fps is (10 sec: 45875.8, 60 sec: 44509.9, 300 sec: 44597.8). Total num frames: 193380352. Throughput: 0: 44576.1. Samples: 193429480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 14:08:19,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:08:21,818][65616] Updated weights for policy 0, policy_version 11810 (0.0027) [2024-06-12 14:08:24,334][65383] Fps is (10 sec: 44229.7, 60 sec: 44508.7, 300 sec: 44542.0). Total num frames: 193593344. Throughput: 0: 44657.4. Samples: 193696660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-12 14:08:24,335][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:08:25,455][65616] Updated weights for policy 0, policy_version 11820 (0.0037) [2024-06-12 14:08:29,337][65616] Updated weights for policy 0, policy_version 11830 (0.0027) [2024-06-12 14:08:29,340][65383] Fps is (10 sec: 44203.0, 60 sec: 44504.4, 300 sec: 44652.2). Total num frames: 193822720. Throughput: 0: 44507.5. Samples: 193968140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 14:08:29,340][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:08:32,771][65616] Updated weights for policy 0, policy_version 11840 (0.0031) [2024-06-12 14:08:34,332][65383] Fps is (10 sec: 45882.5, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 194052096. Throughput: 0: 44611.1. Samples: 194091840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 14:08:34,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:08:36,712][65616] Updated weights for policy 0, policy_version 11850 (0.0026) [2024-06-12 14:08:39,333][65383] Fps is (10 sec: 42629.9, 60 sec: 43963.5, 300 sec: 44542.2). 
Total num frames: 194248704. Throughput: 0: 44662.0. Samples: 194364520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-12 14:08:39,340][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:08:40,362][65616] Updated weights for policy 0, policy_version 11860 (0.0027) [2024-06-12 14:08:44,042][65616] Updated weights for policy 0, policy_version 11870 (0.0027) [2024-06-12 14:08:44,332][65383] Fps is (10 sec: 42598.6, 60 sec: 44509.8, 300 sec: 44486.7). Total num frames: 194478080. Throughput: 0: 44436.5. Samples: 194627560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-12 14:08:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:08:47,682][65616] Updated weights for policy 0, policy_version 11880 (0.0033) [2024-06-12 14:08:49,332][65383] Fps is (10 sec: 45875.9, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 194707456. Throughput: 0: 44621.8. Samples: 194762160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-12 14:08:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:08:51,218][65616] Updated weights for policy 0, policy_version 11890 (0.0034) [2024-06-12 14:08:54,332][65383] Fps is (10 sec: 42598.3, 60 sec: 43963.8, 300 sec: 44542.3). Total num frames: 194904064. Throughput: 0: 44560.1. Samples: 195025720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-12 14:08:54,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:08:55,304][65616] Updated weights for policy 0, policy_version 11900 (0.0032) [2024-06-12 14:08:58,752][65616] Updated weights for policy 0, policy_version 11910 (0.0029) [2024-06-12 14:08:59,332][65383] Fps is (10 sec: 44236.9, 60 sec: 43963.7, 300 sec: 44486.7). Total num frames: 195149824. Throughput: 0: 44541.7. Samples: 195293320. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 14:08:59,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:09:01,904][65616] Updated weights for policy 0, policy_version 11920 (0.0028) [2024-06-12 14:09:04,332][65383] Fps is (10 sec: 47514.0, 60 sec: 44783.0, 300 sec: 44708.9). Total num frames: 195379200. Throughput: 0: 44428.5. Samples: 195428760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 14:09:04,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:09:06,366][65616] Updated weights for policy 0, policy_version 11930 (0.0029) [2024-06-12 14:09:09,332][65383] Fps is (10 sec: 45875.4, 60 sec: 44783.0, 300 sec: 44597.8). Total num frames: 195608576. Throughput: 0: 44391.3. Samples: 195694200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 14:09:09,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:09:10,114][65616] Updated weights for policy 0, policy_version 11940 (0.0027) [2024-06-12 14:09:13,422][65616] Updated weights for policy 0, policy_version 11950 (0.0031) [2024-06-12 14:09:14,333][65383] Fps is (10 sec: 44236.0, 60 sec: 44509.8, 300 sec: 44486.7). Total num frames: 195821568. Throughput: 0: 44086.9. Samples: 195951720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 14:09:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:09:17,328][65595] Signal inference workers to stop experience collection... (2800 times) [2024-06-12 14:09:17,332][65595] Signal inference workers to resume experience collection... (2800 times) [2024-06-12 14:09:17,372][65616] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-06-12 14:09:17,372][65616] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-06-12 14:09:17,454][65616] Updated weights for policy 0, policy_version 11960 (0.0032) [2024-06-12 14:09:19,332][65383] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 196034560. Throughput: 0: 44331.1. 
Samples: 196086740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 14:09:19,336][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:09:20,942][65616] Updated weights for policy 0, policy_version 11970 (0.0036) [2024-06-12 14:09:24,332][65383] Fps is (10 sec: 40960.9, 60 sec: 43965.0, 300 sec: 44486.8). Total num frames: 196231168. Throughput: 0: 44123.5. Samples: 196350060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 14:09:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:09:25,174][65616] Updated weights for policy 0, policy_version 11980 (0.0030) [2024-06-12 14:09:28,232][65616] Updated weights for policy 0, policy_version 11990 (0.0029) [2024-06-12 14:09:29,332][65383] Fps is (10 sec: 45875.8, 60 sec: 44515.6, 300 sec: 44542.3). Total num frames: 196493312. Throughput: 0: 44008.1. Samples: 196607920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 14:09:29,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:09:32,716][65616] Updated weights for policy 0, policy_version 12000 (0.0024) [2024-06-12 14:09:34,332][65383] Fps is (10 sec: 45874.7, 60 sec: 43963.7, 300 sec: 44486.7). Total num frames: 196689920. Throughput: 0: 44157.4. Samples: 196749240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-12 14:09:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:09:35,521][65616] Updated weights for policy 0, policy_version 12010 (0.0031) [2024-06-12 14:09:39,333][65383] Fps is (10 sec: 39320.6, 60 sec: 43963.8, 300 sec: 44375.6). Total num frames: 196886528. Throughput: 0: 43948.8. Samples: 197003420. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-12 14:09:39,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:09:40,434][65616] Updated weights for policy 0, policy_version 12020 (0.0030) [2024-06-12 14:09:43,200][65616] Updated weights for policy 0, policy_version 12030 (0.0020) [2024-06-12 14:09:44,333][65383] Fps is (10 sec: 45874.8, 60 sec: 44509.8, 300 sec: 44542.2). Total num frames: 197148672. Throughput: 0: 43897.3. Samples: 197268700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-12 14:09:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:09:47,651][65616] Updated weights for policy 0, policy_version 12040 (0.0034) [2024-06-12 14:09:49,332][65383] Fps is (10 sec: 47514.4, 60 sec: 44236.9, 300 sec: 44486.7). Total num frames: 197361664. Throughput: 0: 43848.4. Samples: 197401940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-12 14:09:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:09:50,970][65616] Updated weights for policy 0, policy_version 12050 (0.0026) [2024-06-12 14:09:54,332][65383] Fps is (10 sec: 39322.2, 60 sec: 43963.8, 300 sec: 44320.1). Total num frames: 197541888. Throughput: 0: 43657.9. Samples: 197658800. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-12 14:09:54,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:09:54,359][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000012058_197558272.pth... [2024-06-12 14:09:54,399][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000011409_186925056.pth [2024-06-12 14:09:55,277][65616] Updated weights for policy 0, policy_version 12060 (0.0030) [2024-06-12 14:09:58,676][65616] Updated weights for policy 0, policy_version 12070 (0.0023) [2024-06-12 14:09:59,332][65383] Fps is (10 sec: 40960.1, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 197771264. Throughput: 0: 43786.0. Samples: 197922080. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-12 14:09:59,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:10:02,084][65616] Updated weights for policy 0, policy_version 12080 (0.0035) [2024-06-12 14:10:04,332][65383] Fps is (10 sec: 45874.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 198000640. Throughput: 0: 43911.1. Samples: 198062740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 14:10:04,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:10:06,270][65616] Updated weights for policy 0, policy_version 12090 (0.0025) [2024-06-12 14:10:09,332][65383] Fps is (10 sec: 45875.0, 60 sec: 43690.7, 300 sec: 44375.7). Total num frames: 198230016. Throughput: 0: 43814.1. Samples: 198321700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 14:10:09,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:10:09,354][65616] Updated weights for policy 0, policy_version 12100 (0.0027) [2024-06-12 14:10:13,466][65616] Updated weights for policy 0, policy_version 12110 (0.0025) [2024-06-12 14:10:14,333][65383] Fps is (10 sec: 45874.7, 60 sec: 43963.7, 300 sec: 44431.2). Total num frames: 198459392. Throughput: 0: 44141.5. Samples: 198594300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 14:10:14,342][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:10:16,420][65616] Updated weights for policy 0, policy_version 12120 (0.0033) [2024-06-12 14:10:19,333][65383] Fps is (10 sec: 42597.7, 60 sec: 43690.6, 300 sec: 44375.6). Total num frames: 198656000. Throughput: 0: 43763.9. Samples: 198718620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 14:10:19,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:10:20,742][65616] Updated weights for policy 0, policy_version 12130 (0.0025) [2024-06-12 14:10:24,035][65616] Updated weights for policy 0, policy_version 12140 (0.0029) [2024-06-12 14:10:24,332][65383] Fps is (10 sec: 44237.3, 60 sec: 44509.8, 300 sec: 44320.5). 
Total num frames: 198901760. Throughput: 0: 44157.4. Samples: 198990500. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 14:10:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:10:28,641][65616] Updated weights for policy 0, policy_version 12150 (0.0027) [2024-06-12 14:10:29,332][65383] Fps is (10 sec: 45875.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 199114752. Throughput: 0: 44189.0. Samples: 199257200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 14:10:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:10:29,385][65595] Signal inference workers to stop experience collection... (2850 times) [2024-06-12 14:10:29,429][65616] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-06-12 14:10:29,443][65595] Signal inference workers to resume experience collection... (2850 times) [2024-06-12 14:10:29,443][65616] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-06-12 14:10:31,642][65616] Updated weights for policy 0, policy_version 12160 (0.0030) [2024-06-12 14:10:34,332][65383] Fps is (10 sec: 40960.2, 60 sec: 43690.7, 300 sec: 44264.6). Total num frames: 199311360. Throughput: 0: 44230.6. Samples: 199392320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 14:10:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:10:35,994][65616] Updated weights for policy 0, policy_version 12170 (0.0028) [2024-06-12 14:10:39,119][65616] Updated weights for policy 0, policy_version 12180 (0.0030) [2024-06-12 14:10:39,332][65383] Fps is (10 sec: 44236.9, 60 sec: 44510.0, 300 sec: 44320.1). Total num frames: 199557120. Throughput: 0: 44236.4. Samples: 199649440. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 14:10:39,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:10:43,049][65616] Updated weights for policy 0, policy_version 12190 (0.0029) [2024-06-12 14:10:44,332][65383] Fps is (10 sec: 45874.9, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 199770112. Throughput: 0: 44350.6. Samples: 199917860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 14:10:44,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:10:46,687][65616] Updated weights for policy 0, policy_version 12200 (0.0032) [2024-06-12 14:10:49,332][65383] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 44209.0). Total num frames: 199966720. Throughput: 0: 44057.7. Samples: 200045340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-12 14:10:49,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:10:50,715][65616] Updated weights for policy 0, policy_version 12210 (0.0027) [2024-06-12 14:10:54,040][65616] Updated weights for policy 0, policy_version 12220 (0.0035) [2024-06-12 14:10:54,332][65383] Fps is (10 sec: 45875.3, 60 sec: 44782.9, 300 sec: 44375.7). Total num frames: 200228864. Throughput: 0: 44263.5. Samples: 200313560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-12 14:10:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:10:57,663][65616] Updated weights for policy 0, policy_version 12230 (0.0035) [2024-06-12 14:10:59,332][65383] Fps is (10 sec: 45875.1, 60 sec: 44236.7, 300 sec: 44264.6). Total num frames: 200425472. Throughput: 0: 44340.9. Samples: 200589640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 14:10:59,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:11:01,653][65616] Updated weights for policy 0, policy_version 12240 (0.0021) [2024-06-12 14:11:04,332][65383] Fps is (10 sec: 44237.1, 60 sec: 44509.9, 300 sec: 44375.7). Total num frames: 200671232. Throughput: 0: 44363.3. Samples: 200714960. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 14:11:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:11:04,623][65616] Updated weights for policy 0, policy_version 12250 (0.0027) [2024-06-12 14:11:08,962][65616] Updated weights for policy 0, policy_version 12260 (0.0030) [2024-06-12 14:11:09,332][65383] Fps is (10 sec: 44237.4, 60 sec: 43963.7, 300 sec: 44320.1). Total num frames: 200867840. Throughput: 0: 44365.4. Samples: 200986940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 14:11:09,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:11:12,130][65616] Updated weights for policy 0, policy_version 12270 (0.0028) [2024-06-12 14:11:14,333][65383] Fps is (10 sec: 44236.1, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 201113600. Throughput: 0: 44492.3. Samples: 201259360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 14:11:14,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:11:16,387][65616] Updated weights for policy 0, policy_version 12280 (0.0020) [2024-06-12 14:11:19,332][65383] Fps is (10 sec: 45875.0, 60 sec: 44510.0, 300 sec: 44264.6). Total num frames: 201326592. Throughput: 0: 44525.3. Samples: 201395960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 24.0) [2024-06-12 14:11:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:11:19,506][65616] Updated weights for policy 0, policy_version 12290 (0.0027) [2024-06-12 14:11:23,391][65616] Updated weights for policy 0, policy_version 12300 (0.0026) [2024-06-12 14:11:24,332][65383] Fps is (10 sec: 42598.7, 60 sec: 43963.7, 300 sec: 44320.1). Total num frames: 201539584. Throughput: 0: 44593.7. Samples: 201656160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 24.0) [2024-06-12 14:11:24,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:11:26,457][65616] Updated weights for policy 0, policy_version 12310 (0.0026) [2024-06-12 14:11:29,332][65383] Fps is (10 sec: 44236.6, 60 sec: 44236.8, 300 sec: 44375.6). 
Total num frames: 201768960. Throughput: 0: 44690.7. Samples: 201928940. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 14:11:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:11:30,803][65616] Updated weights for policy 0, policy_version 12320 (0.0035) [2024-06-12 14:11:34,176][65616] Updated weights for policy 0, policy_version 12330 (0.0027) [2024-06-12 14:11:34,332][65383] Fps is (10 sec: 47513.4, 60 sec: 45055.9, 300 sec: 44431.2). Total num frames: 202014720. Throughput: 0: 44736.0. Samples: 202058460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 14:11:34,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:11:38,018][65616] Updated weights for policy 0, policy_version 12340 (0.0028) [2024-06-12 14:11:39,332][65383] Fps is (10 sec: 45875.4, 60 sec: 44509.8, 300 sec: 44431.2). Total num frames: 202227712. Throughput: 0: 44704.9. Samples: 202325280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 14:11:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:11:41,309][65616] Updated weights for policy 0, policy_version 12350 (0.0027) [2024-06-12 14:11:44,333][65383] Fps is (10 sec: 42598.1, 60 sec: 44509.8, 300 sec: 44320.1). Total num frames: 202440704. Throughput: 0: 44763.5. Samples: 202604000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 14:11:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:11:45,114][65616] Updated weights for policy 0, policy_version 12360 (0.0027) [2024-06-12 14:11:45,992][65595] Signal inference workers to stop experience collection... (2900 times) [2024-06-12 14:11:46,033][65616] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-06-12 14:11:46,037][65595] Signal inference workers to resume experience collection... 
(2900 times) [2024-06-12 14:11:46,041][65616] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-06-12 14:11:48,451][65616] Updated weights for policy 0, policy_version 12370 (0.0025) [2024-06-12 14:11:49,332][65383] Fps is (10 sec: 44236.9, 60 sec: 45056.1, 300 sec: 44375.7). Total num frames: 202670080. Throughput: 0: 44800.9. Samples: 202731000. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-12 14:11:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:11:52,504][65616] Updated weights for policy 0, policy_version 12380 (0.0030) [2024-06-12 14:11:54,332][65383] Fps is (10 sec: 45875.5, 60 sec: 44509.8, 300 sec: 44375.6). Total num frames: 202899456. Throughput: 0: 44633.2. Samples: 202995440. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-12 14:11:54,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:11:54,349][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000012384_202899456.pth... [2024-06-12 14:11:54,398][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000011735_192266240.pth [2024-06-12 14:11:56,040][65616] Updated weights for policy 0, policy_version 12390 (0.0030) [2024-06-12 14:11:59,332][65383] Fps is (10 sec: 42598.5, 60 sec: 44510.0, 300 sec: 44264.6). Total num frames: 203096064. Throughput: 0: 44472.2. Samples: 203260600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-12 14:11:59,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:11:59,988][65616] Updated weights for policy 0, policy_version 12400 (0.0027) [2024-06-12 14:12:03,330][65616] Updated weights for policy 0, policy_version 12410 (0.0034) [2024-06-12 14:12:04,332][65383] Fps is (10 sec: 42598.9, 60 sec: 44236.8, 300 sec: 44264.6). Total num frames: 203325440. Throughput: 0: 44606.7. Samples: 203403260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-12 14:12:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:12:07,489][65616] Updated weights for policy 0, policy_version 12420 (0.0034) [2024-06-12 14:12:09,332][65383] Fps is (10 sec: 49151.5, 60 sec: 45329.0, 300 sec: 44486.7). Total num frames: 203587584. Throughput: 0: 44548.5. Samples: 203660840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 14:12:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:12:11,206][65616] Updated weights for policy 0, policy_version 12430 (0.0025) [2024-06-12 14:12:14,332][65383] Fps is (10 sec: 45875.0, 60 sec: 44510.0, 300 sec: 44320.1). Total num frames: 203784192. Throughput: 0: 44478.3. Samples: 203930460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 14:12:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:12:14,581][65616] Updated weights for policy 0, policy_version 12440 (0.0034) [2024-06-12 14:12:18,391][65616] Updated weights for policy 0, policy_version 12450 (0.0025) [2024-06-12 14:12:19,333][65383] Fps is (10 sec: 40959.6, 60 sec: 44509.8, 300 sec: 44320.1). Total num frames: 203997184. Throughput: 0: 44431.1. Samples: 204057860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 14:12:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:12:22,254][65616] Updated weights for policy 0, policy_version 12460 (0.0034) [2024-06-12 14:12:24,332][65383] Fps is (10 sec: 44236.8, 60 sec: 44783.0, 300 sec: 44320.1). Total num frames: 204226560. Throughput: 0: 44425.3. Samples: 204324420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-12 14:12:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:12:26,379][65616] Updated weights for policy 0, policy_version 12470 (0.0030) [2024-06-12 14:12:29,332][65383] Fps is (10 sec: 44237.0, 60 sec: 44509.8, 300 sec: 44320.1). Total num frames: 204439552. Throughput: 0: 43939.6. Samples: 204581280. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-12 14:12:29,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:12:30,019][65616] Updated weights for policy 0, policy_version 12480 (0.0030) [2024-06-12 14:12:33,862][65616] Updated weights for policy 0, policy_version 12490 (0.0036) [2024-06-12 14:12:34,332][65383] Fps is (10 sec: 40960.1, 60 sec: 43690.8, 300 sec: 44153.5). Total num frames: 204636160. Throughput: 0: 44124.9. Samples: 204716620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 14:12:34,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:12:37,041][65616] Updated weights for policy 0, policy_version 12500 (0.0025) [2024-06-12 14:12:39,333][65383] Fps is (10 sec: 44236.6, 60 sec: 44236.7, 300 sec: 44320.1). Total num frames: 204881920. Throughput: 0: 44253.7. Samples: 204986860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 14:12:39,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 14:12:41,299][65616] Updated weights for policy 0, policy_version 12510 (0.0029) [2024-06-12 14:12:44,319][65616] Updated weights for policy 0, policy_version 12520 (0.0027) [2024-06-12 14:12:44,332][65383] Fps is (10 sec: 49151.7, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 205127680. Throughput: 0: 44164.8. Samples: 205248020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 14:12:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:12:48,873][65616] Updated weights for policy 0, policy_version 12530 (0.0026) [2024-06-12 14:12:49,332][65383] Fps is (10 sec: 42598.7, 60 sec: 43963.7, 300 sec: 44209.0). Total num frames: 205307904. Throughput: 0: 43979.0. Samples: 205382320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 14:12:49,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:12:51,518][65616] Updated weights for policy 0, policy_version 12540 (0.0029) [2024-06-12 14:12:54,332][65383] Fps is (10 sec: 42598.7, 60 sec: 44236.9, 300 sec: 44209.0). 
Total num frames: 205553664. Throughput: 0: 44069.4. Samples: 205643960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 14:12:54,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:12:55,872][65616] Updated weights for policy 0, policy_version 12550 (0.0039) [2024-06-12 14:12:59,287][65616] Updated weights for policy 0, policy_version 12560 (0.0028) [2024-06-12 14:12:59,332][65383] Fps is (10 sec: 47513.6, 60 sec: 44782.9, 300 sec: 44375.7). Total num frames: 205783040. Throughput: 0: 44017.3. Samples: 205911240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 14:12:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:13:03,599][65616] Updated weights for policy 0, policy_version 12570 (0.0027) [2024-06-12 14:13:04,332][65383] Fps is (10 sec: 44236.4, 60 sec: 44509.8, 300 sec: 44320.1). Total num frames: 205996032. Throughput: 0: 44029.0. Samples: 206039160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 14:13:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:13:06,782][65616] Updated weights for policy 0, policy_version 12580 (0.0023) [2024-06-12 14:13:09,332][65383] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 44264.6). Total num frames: 206209024. Throughput: 0: 44054.3. Samples: 206306860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 14:13:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:13:11,328][65616] Updated weights for policy 0, policy_version 12590 (0.0023) [2024-06-12 14:13:14,307][65616] Updated weights for policy 0, policy_version 12600 (0.0030) [2024-06-12 14:13:14,332][65383] Fps is (10 sec: 44236.6, 60 sec: 44236.7, 300 sec: 44264.6). Total num frames: 206438400. Throughput: 0: 44176.0. Samples: 206569200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 14:13:14,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:13:18,478][65616] Updated weights for policy 0, policy_version 12610 (0.0025) [2024-06-12 14:13:19,333][65383] Fps is (10 sec: 42597.7, 60 sec: 43963.7, 300 sec: 44209.3). Total num frames: 206635008. Throughput: 0: 43896.7. Samples: 206691980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 14:13:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:13:21,550][65616] Updated weights for policy 0, policy_version 12620 (0.0034) [2024-06-12 14:13:24,333][65383] Fps is (10 sec: 42598.2, 60 sec: 43963.6, 300 sec: 44210.2). Total num frames: 206864384. Throughput: 0: 43854.2. Samples: 206960300. Policy #0 lag: (min: 1.0, avg: 10.2, max: 19.0) [2024-06-12 14:13:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:13:25,277][65595] Signal inference workers to stop experience collection... (2950 times) [2024-06-12 14:13:25,278][65595] Signal inference workers to resume experience collection... (2950 times) [2024-06-12 14:13:25,299][65616] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-06-12 14:13:25,299][65616] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-06-12 14:13:25,926][65616] Updated weights for policy 0, policy_version 12630 (0.0032) [2024-06-12 14:13:29,213][65616] Updated weights for policy 0, policy_version 12640 (0.0024) [2024-06-12 14:13:29,332][65383] Fps is (10 sec: 45875.4, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 207093760. Throughput: 0: 43879.1. Samples: 207222580. Policy #0 lag: (min: 1.0, avg: 10.2, max: 19.0) [2024-06-12 14:13:29,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:13:33,821][65616] Updated weights for policy 0, policy_version 12650 (0.0031) [2024-06-12 14:13:34,332][65383] Fps is (10 sec: 40960.2, 60 sec: 43963.6, 300 sec: 44153.5). Total num frames: 207273984. Throughput: 0: 43801.3. 
Samples: 207353380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 14:13:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:13:36,832][65616] Updated weights for policy 0, policy_version 12660 (0.0023) [2024-06-12 14:13:39,332][65383] Fps is (10 sec: 40960.6, 60 sec: 43690.8, 300 sec: 44153.5). Total num frames: 207503360. Throughput: 0: 43702.2. Samples: 207610560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 14:13:39,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:13:41,234][65616] Updated weights for policy 0, policy_version 12670 (0.0025) [2024-06-12 14:13:43,998][65616] Updated weights for policy 0, policy_version 12680 (0.0030) [2024-06-12 14:13:44,332][65383] Fps is (10 sec: 47513.8, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 207749120. Throughput: 0: 43845.8. Samples: 207884300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-12 14:13:44,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:13:48,209][65616] Updated weights for policy 0, policy_version 12690 (0.0028) [2024-06-12 14:13:49,332][65383] Fps is (10 sec: 44236.9, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 207945728. Throughput: 0: 44011.7. Samples: 208019680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-12 14:13:49,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:13:51,551][65616] Updated weights for policy 0, policy_version 12700 (0.0024) [2024-06-12 14:13:54,332][65383] Fps is (10 sec: 42598.6, 60 sec: 43690.6, 300 sec: 44153.5). Total num frames: 208175104. Throughput: 0: 43800.4. Samples: 208277880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 14:13:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:13:54,343][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000012706_208175104.pth... 
[2024-06-12 14:13:54,391][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000012058_197558272.pth [2024-06-12 14:13:55,462][65616] Updated weights for policy 0, policy_version 12710 (0.0027) [2024-06-12 14:13:58,791][65616] Updated weights for policy 0, policy_version 12720 (0.0030) [2024-06-12 14:13:59,332][65383] Fps is (10 sec: 47513.3, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 208420864. Throughput: 0: 44065.9. Samples: 208552160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 14:13:59,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:14:02,893][65616] Updated weights for policy 0, policy_version 12730 (0.0029) [2024-06-12 14:14:04,332][65383] Fps is (10 sec: 44237.1, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 208617472. Throughput: 0: 44335.8. Samples: 208687080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 14:14:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:14:06,074][65616] Updated weights for policy 0, policy_version 12740 (0.0029) [2024-06-12 14:14:09,332][65383] Fps is (10 sec: 40959.9, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 208830464. Throughput: 0: 44169.5. Samples: 208947920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 14:14:09,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:14:10,343][65616] Updated weights for policy 0, policy_version 12750 (0.0027) [2024-06-12 14:14:13,723][65616] Updated weights for policy 0, policy_version 12760 (0.0031) [2024-06-12 14:14:14,332][65383] Fps is (10 sec: 44236.5, 60 sec: 43690.7, 300 sec: 44153.5). Total num frames: 209059840. Throughput: 0: 44155.6. Samples: 209209580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 14:14:14,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:14:17,629][65616] Updated weights for policy 0, policy_version 12770 (0.0034) [2024-06-12 14:14:19,332][65383] Fps is (10 sec: 47513.6, 60 sec: 44510.0, 300 sec: 44320.1). Total num frames: 209305600. Throughput: 0: 44359.2. Samples: 209349540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 14:14:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:14:21,294][65616] Updated weights for policy 0, policy_version 12780 (0.0028) [2024-06-12 14:14:24,332][65383] Fps is (10 sec: 42598.1, 60 sec: 43690.7, 300 sec: 44042.4). Total num frames: 209485824. Throughput: 0: 44529.2. Samples: 209614380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 14:14:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:14:25,273][65616] Updated weights for policy 0, policy_version 12790 (0.0034) [2024-06-12 14:14:28,847][65616] Updated weights for policy 0, policy_version 12800 (0.0027) [2024-06-12 14:14:29,332][65383] Fps is (10 sec: 44236.8, 60 sec: 44236.9, 300 sec: 44264.6). Total num frames: 209747968. Throughput: 0: 44401.4. Samples: 209882360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 14:14:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:14:32,115][65616] Updated weights for policy 0, policy_version 12810 (0.0027) [2024-06-12 14:14:34,332][65383] Fps is (10 sec: 47513.8, 60 sec: 44783.0, 300 sec: 44320.1). Total num frames: 209960960. Throughput: 0: 44331.0. Samples: 210014580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 14:14:34,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:14:36,040][65616] Updated weights for policy 0, policy_version 12820 (0.0029) [2024-06-12 14:14:39,332][65383] Fps is (10 sec: 42598.5, 60 sec: 44509.8, 300 sec: 44153.5). Total num frames: 210173952. Throughput: 0: 44509.4. Samples: 210280800. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 14:14:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:14:39,555][65616] Updated weights for policy 0, policy_version 12830 (0.0034) [2024-06-12 14:14:44,080][65616] Updated weights for policy 0, policy_version 12840 (0.0028) [2024-06-12 14:14:44,332][65383] Fps is (10 sec: 40960.1, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 210370560. Throughput: 0: 44110.6. Samples: 210537140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 14:14:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:14:47,307][65616] Updated weights for policy 0, policy_version 12850 (0.0023) [2024-06-12 14:14:49,332][65383] Fps is (10 sec: 42598.3, 60 sec: 44236.7, 300 sec: 44264.6). Total num frames: 210599936. Throughput: 0: 43916.8. Samples: 210663340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 14:14:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:14:51,815][65616] Updated weights for policy 0, policy_version 12860 (0.0025) [2024-06-12 14:14:54,332][65383] Fps is (10 sec: 47513.3, 60 sec: 44509.8, 300 sec: 44320.1). Total num frames: 210845696. Throughput: 0: 43900.8. Samples: 210923460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 14:14:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:14:54,554][65616] Updated weights for policy 0, policy_version 12870 (0.0035) [2024-06-12 14:14:59,018][65616] Updated weights for policy 0, policy_version 12880 (0.0033) [2024-06-12 14:14:59,337][65383] Fps is (10 sec: 44215.1, 60 sec: 43687.1, 300 sec: 44208.3). Total num frames: 211042304. Throughput: 0: 44050.8. Samples: 211192080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-12 14:14:59,338][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:15:02,208][65616] Updated weights for policy 0, policy_version 12890 (0.0035) [2024-06-12 14:15:04,332][65383] Fps is (10 sec: 40960.0, 60 sec: 43963.6, 300 sec: 44153.5). 
Total num frames: 211255296. Throughput: 0: 43802.1. Samples: 211320640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-12 14:15:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:15:06,339][65616] Updated weights for policy 0, policy_version 12900 (0.0030) [2024-06-12 14:15:07,120][65595] Signal inference workers to stop experience collection... (3000 times) [2024-06-12 14:15:07,120][65595] Signal inference workers to resume experience collection... (3000 times) [2024-06-12 14:15:07,163][65616] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-06-12 14:15:07,164][65616] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-06-12 14:15:09,332][65383] Fps is (10 sec: 40980.1, 60 sec: 43690.7, 300 sec: 44042.4). Total num frames: 211451904. Throughput: 0: 43840.1. Samples: 211587180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-12 14:15:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:15:09,856][65616] Updated weights for policy 0, policy_version 12910 (0.0037) [2024-06-12 14:15:13,331][65616] Updated weights for policy 0, policy_version 12920 (0.0028) [2024-06-12 14:15:14,332][65383] Fps is (10 sec: 44237.4, 60 sec: 43963.8, 300 sec: 44209.1). Total num frames: 211697664. Throughput: 0: 43618.7. Samples: 211845200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-12 14:15:14,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:15:17,226][65616] Updated weights for policy 0, policy_version 12930 (0.0025) [2024-06-12 14:15:19,332][65383] Fps is (10 sec: 49152.1, 60 sec: 43963.7, 300 sec: 44209.0). Total num frames: 211943424. Throughput: 0: 43632.5. Samples: 211978040. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-12 14:15:19,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:15:21,202][65616] Updated weights for policy 0, policy_version 12940 (0.0025) [2024-06-12 14:15:24,332][65383] Fps is (10 sec: 42597.9, 60 sec: 43963.8, 300 sec: 44097.9). Total num frames: 212123648. Throughput: 0: 43493.3. Samples: 212238000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-12 14:15:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:15:24,790][65616] Updated weights for policy 0, policy_version 12950 (0.0029) [2024-06-12 14:15:29,076][65616] Updated weights for policy 0, policy_version 12960 (0.0032) [2024-06-12 14:15:29,332][65383] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 44153.5). Total num frames: 212336640. Throughput: 0: 43696.9. Samples: 212503500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 14:15:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:15:32,193][65616] Updated weights for policy 0, policy_version 12970 (0.0023) [2024-06-12 14:15:34,332][65383] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 44153.5). Total num frames: 212582400. Throughput: 0: 43891.9. Samples: 212638480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 14:15:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:15:36,187][65616] Updated weights for policy 0, policy_version 12980 (0.0025) [2024-06-12 14:15:39,332][65383] Fps is (10 sec: 47513.5, 60 sec: 43963.7, 300 sec: 44209.0). Total num frames: 212811776. Throughput: 0: 43936.1. Samples: 212900580. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-12 14:15:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:15:39,462][65616] Updated weights for policy 0, policy_version 12990 (0.0025) [2024-06-12 14:15:43,538][65616] Updated weights for policy 0, policy_version 13000 (0.0028) [2024-06-12 14:15:44,332][65383] Fps is (10 sec: 42599.2, 60 sec: 43963.8, 300 sec: 44209.1). 
Total num frames: 213008384. Throughput: 0: 43736.9. Samples: 213160020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-12 14:15:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:15:46,911][65616] Updated weights for policy 0, policy_version 13010 (0.0025) [2024-06-12 14:15:49,332][65383] Fps is (10 sec: 42598.3, 60 sec: 43963.7, 300 sec: 44098.0). Total num frames: 213237760. Throughput: 0: 44076.5. Samples: 213304080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 14:15:49,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:15:50,786][65616] Updated weights for policy 0, policy_version 13020 (0.0026) [2024-06-12 14:15:54,332][65383] Fps is (10 sec: 44236.4, 60 sec: 43417.7, 300 sec: 44153.5). Total num frames: 213450752. Throughput: 0: 43897.3. Samples: 213562560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 14:15:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:15:54,376][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000013029_213467136.pth... [2024-06-12 14:15:54,428][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000012384_202899456.pth [2024-06-12 14:15:54,604][65616] Updated weights for policy 0, policy_version 13030 (0.0031) [2024-06-12 14:15:58,201][65616] Updated weights for policy 0, policy_version 13040 (0.0028) [2024-06-12 14:15:59,332][65383] Fps is (10 sec: 44237.2, 60 sec: 43967.4, 300 sec: 44098.0). Total num frames: 213680128. Throughput: 0: 44009.4. Samples: 213825620. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-12 14:15:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:16:02,018][65616] Updated weights for policy 0, policy_version 13050 (0.0035) [2024-06-12 14:16:04,332][65383] Fps is (10 sec: 45874.9, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 213909504. Throughput: 0: 44164.4. Samples: 213965440. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-12 14:16:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:16:05,336][65616] Updated weights for policy 0, policy_version 13060 (0.0028) [2024-06-12 14:16:09,270][65616] Updated weights for policy 0, policy_version 13070 (0.0022) [2024-06-12 14:16:09,333][65383] Fps is (10 sec: 45874.2, 60 sec: 44782.8, 300 sec: 44153.5). Total num frames: 214138880. Throughput: 0: 44471.0. Samples: 214239200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 14:16:09,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:16:12,543][65616] Updated weights for policy 0, policy_version 13080 (0.0032) [2024-06-12 14:16:14,332][65383] Fps is (10 sec: 45875.5, 60 sec: 44509.8, 300 sec: 44209.0). Total num frames: 214368256. Throughput: 0: 44400.9. Samples: 214501540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 14:16:14,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:16:16,314][65616] Updated weights for policy 0, policy_version 13090 (0.0025) [2024-06-12 14:16:19,332][65383] Fps is (10 sec: 44237.5, 60 sec: 43963.7, 300 sec: 44209.0). Total num frames: 214581248. Throughput: 0: 44507.2. Samples: 214641300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 14:16:19,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:16:19,935][65616] Updated weights for policy 0, policy_version 13100 (0.0022) [2024-06-12 14:16:22,601][65595] Signal inference workers to stop experience collection... (3050 times) [2024-06-12 14:16:22,620][65616] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-06-12 14:16:22,658][65595] Signal inference workers to resume experience collection... 
(3050 times) [2024-06-12 14:16:22,658][65616] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-06-12 14:16:24,184][65616] Updated weights for policy 0, policy_version 13110 (0.0026) [2024-06-12 14:16:24,332][65383] Fps is (10 sec: 42598.2, 60 sec: 44509.9, 300 sec: 44153.5). Total num frames: 214794240. Throughput: 0: 44505.7. Samples: 214903340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 14:16:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:16:27,235][65616] Updated weights for policy 0, policy_version 13120 (0.0029) [2024-06-12 14:16:29,332][65383] Fps is (10 sec: 44236.9, 60 sec: 44782.9, 300 sec: 44098.0). Total num frames: 215023616. Throughput: 0: 44771.9. Samples: 215174760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-12 14:16:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:16:30,871][65616] Updated weights for policy 0, policy_version 13130 (0.0027) [2024-06-12 14:16:34,332][65383] Fps is (10 sec: 47513.2, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 215269376. Throughput: 0: 44627.5. Samples: 215312320. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-12 14:16:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:16:34,369][65616] Updated weights for policy 0, policy_version 13140 (0.0028) [2024-06-12 14:16:38,403][65616] Updated weights for policy 0, policy_version 13150 (0.0030) [2024-06-12 14:16:39,332][65383] Fps is (10 sec: 44236.7, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 215465984. Throughput: 0: 44656.9. Samples: 215572120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 25.0) [2024-06-12 14:16:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:16:42,137][65616] Updated weights for policy 0, policy_version 13160 (0.0023) [2024-06-12 14:16:44,332][65383] Fps is (10 sec: 42598.7, 60 sec: 44782.8, 300 sec: 44153.5). Total num frames: 215695360. Throughput: 0: 44523.9. Samples: 215829200. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 25.0) [2024-06-12 14:16:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:16:46,634][65616] Updated weights for policy 0, policy_version 13170 (0.0030) [2024-06-12 14:16:49,333][65383] Fps is (10 sec: 45874.6, 60 sec: 44782.8, 300 sec: 44153.5). Total num frames: 215924736. Throughput: 0: 44354.2. Samples: 215961380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 14:16:49,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:16:49,538][65616] Updated weights for policy 0, policy_version 13180 (0.0022) [2024-06-12 14:16:53,739][65616] Updated weights for policy 0, policy_version 13190 (0.0026) [2024-06-12 14:16:54,332][65383] Fps is (10 sec: 40960.2, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 216104960. Throughput: 0: 44137.0. Samples: 216225360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 14:16:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:16:57,278][65616] Updated weights for policy 0, policy_version 13200 (0.0030) [2024-06-12 14:16:59,332][65383] Fps is (10 sec: 42598.5, 60 sec: 44509.7, 300 sec: 44153.5). Total num frames: 216350720. Throughput: 0: 44095.5. Samples: 216485840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 14:16:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:17:01,377][65616] Updated weights for policy 0, policy_version 13210 (0.0025) [2024-06-12 14:17:04,332][65383] Fps is (10 sec: 47513.5, 60 sec: 44509.9, 300 sec: 44042.4). Total num frames: 216580096. Throughput: 0: 43917.3. Samples: 216617580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 14:17:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:17:04,461][65616] Updated weights for policy 0, policy_version 13220 (0.0022) [2024-06-12 14:17:08,942][65616] Updated weights for policy 0, policy_version 13230 (0.0029) [2024-06-12 14:17:09,333][65383] Fps is (10 sec: 42597.2, 60 sec: 43963.6, 300 sec: 44042.4). 
Total num frames: 216776704. Throughput: 0: 44120.1. Samples: 216888760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-12 14:17:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:17:12,143][65616] Updated weights for policy 0, policy_version 13240 (0.0037) [2024-06-12 14:17:14,332][65383] Fps is (10 sec: 42598.1, 60 sec: 43963.7, 300 sec: 44098.0). Total num frames: 217006080. Throughput: 0: 43893.7. Samples: 217149980. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-12 14:17:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:17:16,053][65616] Updated weights for policy 0, policy_version 13250 (0.0034) [2024-06-12 14:17:19,332][65383] Fps is (10 sec: 44238.5, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 217219072. Throughput: 0: 43713.9. Samples: 217279440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 14:17:19,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:17:20,051][65616] Updated weights for policy 0, policy_version 13260 (0.0032) [2024-06-12 14:17:23,497][65616] Updated weights for policy 0, policy_version 13270 (0.0036) [2024-06-12 14:17:24,333][65383] Fps is (10 sec: 44236.5, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 217448448. Throughput: 0: 43841.6. Samples: 217545000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 14:17:24,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:17:27,545][65616] Updated weights for policy 0, policy_version 13280 (0.0033) [2024-06-12 14:17:29,333][65383] Fps is (10 sec: 42597.8, 60 sec: 43690.5, 300 sec: 44097.9). Total num frames: 217645056. Throughput: 0: 43834.1. Samples: 217801740. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 14:17:29,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:17:30,771][65616] Updated weights for policy 0, policy_version 13290 (0.0031) [2024-06-12 14:17:34,333][65383] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 217858048. 
Throughput: 0: 43856.4. Samples: 217934920. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 14:17:34,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:17:34,974][65616] Updated weights for policy 0, policy_version 13300 (0.0024) [2024-06-12 14:17:38,185][65616] Updated weights for policy 0, policy_version 13310 (0.0027) [2024-06-12 14:17:39,332][65383] Fps is (10 sec: 45875.8, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 218103808. Throughput: 0: 43982.7. Samples: 218204580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 14:17:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:17:42,651][65616] Updated weights for policy 0, policy_version 13320 (0.0026) [2024-06-12 14:17:44,332][65383] Fps is (10 sec: 47514.4, 60 sec: 43963.8, 300 sec: 44153.5). Total num frames: 218333184. Throughput: 0: 44136.1. Samples: 218471960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 14:17:44,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:17:45,774][65616] Updated weights for policy 0, policy_version 13330 (0.0031) [2024-06-12 14:17:49,332][65383] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 43931.3). Total num frames: 218513408. Throughput: 0: 43988.0. Samples: 218597040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 14:17:49,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:17:50,052][65616] Updated weights for policy 0, policy_version 13340 (0.0033) [2024-06-12 14:17:52,903][65616] Updated weights for policy 0, policy_version 13350 (0.0025) [2024-06-12 14:17:54,332][65383] Fps is (10 sec: 44236.3, 60 sec: 44509.8, 300 sec: 44042.4). Total num frames: 218775552. Throughput: 0: 43837.2. Samples: 218861420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 14:17:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:17:54,412][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000013354_218791936.pth... 
[2024-06-12 14:17:54,455][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000012706_208175104.pth [2024-06-12 14:17:57,155][65616] Updated weights for policy 0, policy_version 13360 (0.0032) [2024-06-12 14:17:58,484][65595] Signal inference workers to stop experience collection... (3100 times) [2024-06-12 14:17:58,485][65595] Signal inference workers to resume experience collection... (3100 times) [2024-06-12 14:17:58,512][65616] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-06-12 14:17:58,512][65616] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-06-12 14:17:59,332][65383] Fps is (10 sec: 50790.6, 60 sec: 44509.9, 300 sec: 44153.5). Total num frames: 219021312. Throughput: 0: 44151.2. Samples: 219136780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-12 14:17:59,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:18:00,192][65616] Updated weights for policy 0, policy_version 13370 (0.0026) [2024-06-12 14:18:04,332][65383] Fps is (10 sec: 42598.6, 60 sec: 43690.6, 300 sec: 44042.4). Total num frames: 219201536. Throughput: 0: 44291.5. Samples: 219272560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-12 14:18:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:18:04,389][65616] Updated weights for policy 0, policy_version 13380 (0.0023) [2024-06-12 14:18:07,127][65616] Updated weights for policy 0, policy_version 13390 (0.0019) [2024-06-12 14:18:09,332][65383] Fps is (10 sec: 39321.3, 60 sec: 43964.0, 300 sec: 43986.9). Total num frames: 219414528. Throughput: 0: 44421.9. Samples: 219543980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 14:18:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:18:11,613][65616] Updated weights for policy 0, policy_version 13400 (0.0034) [2024-06-12 14:18:14,332][65383] Fps is (10 sec: 49152.1, 60 sec: 44783.0, 300 sec: 44264.6). Total num frames: 219693056. Throughput: 0: 44572.6. 
Samples: 219807500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 14:18:14,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:18:14,452][65616] Updated weights for policy 0, policy_version 13410 (0.0024) [2024-06-12 14:18:18,957][65616] Updated weights for policy 0, policy_version 13420 (0.0025) [2024-06-12 14:18:19,332][65383] Fps is (10 sec: 47513.7, 60 sec: 44509.8, 300 sec: 44153.5). Total num frames: 219889664. Throughput: 0: 44742.8. Samples: 219948340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 14:18:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:18:22,440][65616] Updated weights for policy 0, policy_version 13430 (0.0031) [2024-06-12 14:18:24,332][65383] Fps is (10 sec: 39321.5, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 220086272. Throughput: 0: 44340.9. Samples: 220199920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 14:18:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:18:26,308][65616] Updated weights for policy 0, policy_version 13440 (0.0040) [2024-06-12 14:18:29,333][65383] Fps is (10 sec: 44236.3, 60 sec: 44782.9, 300 sec: 44264.6). Total num frames: 220332032. Throughput: 0: 44319.8. Samples: 220466360. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-12 14:18:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:18:30,220][65616] Updated weights for policy 0, policy_version 13450 (0.0028) [2024-06-12 14:18:33,846][65616] Updated weights for policy 0, policy_version 13460 (0.0031) [2024-06-12 14:18:34,333][65383] Fps is (10 sec: 47513.3, 60 sec: 45056.0, 300 sec: 44264.5). Total num frames: 220561408. Throughput: 0: 44552.8. Samples: 220601920. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-12 14:18:34,333][65383] Avg episode reward: [(0, '0.087')] [2024-06-12 14:18:37,718][65616] Updated weights for policy 0, policy_version 13470 (0.0026) [2024-06-12 14:18:39,332][65383] Fps is (10 sec: 40960.8, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 220741632. Throughput: 0: 44459.7. Samples: 220862100. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-12 14:18:39,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:18:41,280][65616] Updated weights for policy 0, policy_version 13480 (0.0022) [2024-06-12 14:18:44,332][65383] Fps is (10 sec: 40960.5, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 220971008. Throughput: 0: 44025.8. Samples: 221117940. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-12 14:18:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:18:45,339][65616] Updated weights for policy 0, policy_version 13490 (0.0030) [2024-06-12 14:18:48,614][65616] Updated weights for policy 0, policy_version 13500 (0.0030) [2024-06-12 14:18:49,332][65383] Fps is (10 sec: 45874.8, 60 sec: 44782.9, 300 sec: 44153.5). Total num frames: 221200384. Throughput: 0: 44061.3. Samples: 221255320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 14:18:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:18:52,542][65616] Updated weights for policy 0, policy_version 13510 (0.0026) [2024-06-12 14:18:54,332][65383] Fps is (10 sec: 44237.1, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 221413376. Throughput: 0: 43837.0. Samples: 221516640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 14:18:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:18:56,058][65616] Updated weights for policy 0, policy_version 13520 (0.0029) [2024-06-12 14:18:59,332][65383] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 44042.4). Total num frames: 221609984. Throughput: 0: 43787.1. Samples: 221777920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 14:18:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:19:00,402][65616] Updated weights for policy 0, policy_version 13530 (0.0025) [2024-06-12 14:19:04,075][65616] Updated weights for policy 0, policy_version 13540 (0.0025) [2024-06-12 14:19:04,332][65383] Fps is (10 sec: 42597.9, 60 sec: 43963.7, 300 sec: 44097.9). Total num frames: 221839360. Throughput: 0: 43557.8. Samples: 221908440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 14:19:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:19:07,711][65616] Updated weights for policy 0, policy_version 13550 (0.0031) [2024-06-12 14:19:09,332][65383] Fps is (10 sec: 45875.5, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 222068736. Throughput: 0: 43964.0. Samples: 222178300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 14:19:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:19:11,147][65616] Updated weights for policy 0, policy_version 13560 (0.0025) [2024-06-12 14:19:14,332][65383] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 222281728. Throughput: 0: 43822.3. Samples: 222438360. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-12 14:19:14,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:19:15,037][65616] Updated weights for policy 0, policy_version 13570 (0.0030) [2024-06-12 14:19:18,391][65616] Updated weights for policy 0, policy_version 13580 (0.0028) [2024-06-12 14:19:19,332][65383] Fps is (10 sec: 44236.8, 60 sec: 43690.7, 300 sec: 44153.5). Total num frames: 222511104. Throughput: 0: 43787.7. Samples: 222572360. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-12 14:19:19,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:19:22,241][65616] Updated weights for policy 0, policy_version 13590 (0.0029) [2024-06-12 14:19:24,333][65383] Fps is (10 sec: 44236.1, 60 sec: 43963.6, 300 sec: 43986.8). 
Total num frames: 222724096. Throughput: 0: 43566.0. Samples: 222822580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 14:19:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:19:26,063][65616] Updated weights for policy 0, policy_version 13600 (0.0037) [2024-06-12 14:19:29,332][65383] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 43986.9). Total num frames: 222937088. Throughput: 0: 43644.8. Samples: 223081960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 14:19:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:19:30,203][65616] Updated weights for policy 0, policy_version 13610 (0.0023) [2024-06-12 14:19:33,923][65616] Updated weights for policy 0, policy_version 13620 (0.0029) [2024-06-12 14:19:34,332][65383] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 44042.4). Total num frames: 223166464. Throughput: 0: 43460.4. Samples: 223211040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 14:19:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:19:36,494][65595] Signal inference workers to stop experience collection... (3150 times) [2024-06-12 14:19:36,494][65595] Signal inference workers to resume experience collection... (3150 times) [2024-06-12 14:19:36,510][65616] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-06-12 14:19:36,511][65616] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-06-12 14:19:37,801][65616] Updated weights for policy 0, policy_version 13630 (0.0027) [2024-06-12 14:19:39,332][65383] Fps is (10 sec: 44237.6, 60 sec: 43963.8, 300 sec: 44098.0). Total num frames: 223379456. Throughput: 0: 43539.1. Samples: 223475900. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 14:19:39,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:19:41,277][65616] Updated weights for policy 0, policy_version 13640 (0.0025) [2024-06-12 14:19:44,332][65383] Fps is (10 sec: 42599.0, 60 sec: 43690.7, 300 sec: 44042.4). Total num frames: 223592448. Throughput: 0: 43536.9. Samples: 223737080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 14:19:44,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:19:45,289][65616] Updated weights for policy 0, policy_version 13650 (0.0044) [2024-06-12 14:19:48,846][65616] Updated weights for policy 0, policy_version 13660 (0.0030) [2024-06-12 14:19:49,332][65383] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 43931.3). Total num frames: 223805440. Throughput: 0: 43574.3. Samples: 223869280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 14:19:49,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:19:52,700][65616] Updated weights for policy 0, policy_version 13670 (0.0035) [2024-06-12 14:19:54,332][65383] Fps is (10 sec: 44237.1, 60 sec: 43690.7, 300 sec: 44043.2). Total num frames: 224034816. Throughput: 0: 43581.8. Samples: 224139480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-12 14:19:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:19:54,355][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000013675_224051200.pth... [2024-06-12 14:19:54,396][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000013029_213467136.pth [2024-06-12 14:19:56,599][65616] Updated weights for policy 0, policy_version 13680 (0.0022) [2024-06-12 14:19:59,333][65383] Fps is (10 sec: 42597.8, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 224231424. Throughput: 0: 43445.7. Samples: 224393420. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-12 14:19:59,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:19:59,997][65616] Updated weights for policy 0, policy_version 13690 (0.0029) [2024-06-12 14:20:04,332][65383] Fps is (10 sec: 40959.3, 60 sec: 43417.6, 300 sec: 44042.4). Total num frames: 224444416. Throughput: 0: 43310.6. Samples: 224521340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-12 14:20:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:20:04,482][65616] Updated weights for policy 0, policy_version 13700 (0.0032) [2024-06-12 14:20:07,966][65616] Updated weights for policy 0, policy_version 13710 (0.0029) [2024-06-12 14:20:09,332][65383] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 43986.9). Total num frames: 224673792. Throughput: 0: 43592.7. Samples: 224784240. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-12 14:20:09,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:20:11,655][65616] Updated weights for policy 0, policy_version 13720 (0.0021) [2024-06-12 14:20:14,332][65383] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 43875.8). Total num frames: 224886784. Throughput: 0: 43670.8. Samples: 225047140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-12 14:20:14,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:20:15,086][65616] Updated weights for policy 0, policy_version 13730 (0.0027) [2024-06-12 14:20:18,992][65616] Updated weights for policy 0, policy_version 13740 (0.0029) [2024-06-12 14:20:19,332][65383] Fps is (10 sec: 45874.8, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 225132544. Throughput: 0: 43700.5. Samples: 225177560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-12 14:20:19,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:20:22,771][65616] Updated weights for policy 0, policy_version 13750 (0.0031) [2024-06-12 14:20:24,333][65383] Fps is (10 sec: 44236.1, 60 sec: 43417.7, 300 sec: 44042.4). 
Total num frames: 225329152. Throughput: 0: 43401.6. Samples: 225428980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 14:20:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:20:26,610][65616] Updated weights for policy 0, policy_version 13760 (0.0030) [2024-06-12 14:20:29,332][65383] Fps is (10 sec: 40960.4, 60 sec: 43417.7, 300 sec: 43931.4). Total num frames: 225542144. Throughput: 0: 43372.4. Samples: 225688840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 14:20:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:20:30,670][65616] Updated weights for policy 0, policy_version 13770 (0.0026) [2024-06-12 14:20:34,332][65383] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 43875.8). Total num frames: 225755136. Throughput: 0: 43297.3. Samples: 225817660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-12 14:20:34,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:20:34,491][65616] Updated weights for policy 0, policy_version 13780 (0.0030) [2024-06-12 14:20:37,946][65616] Updated weights for policy 0, policy_version 13790 (0.0022) [2024-06-12 14:20:39,332][65383] Fps is (10 sec: 44236.8, 60 sec: 43417.5, 300 sec: 43986.9). Total num frames: 225984512. Throughput: 0: 43239.9. Samples: 226085280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-12 14:20:39,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:20:41,642][65616] Updated weights for policy 0, policy_version 13800 (0.0030) [2024-06-12 14:20:44,332][65383] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 43931.3). Total num frames: 226197504. Throughput: 0: 43346.8. Samples: 226344020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 14:20:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:20:45,826][65616] Updated weights for policy 0, policy_version 13810 (0.0034) [2024-06-12 14:20:49,274][65616] Updated weights for policy 0, policy_version 13820 (0.0024) [2024-06-12 14:20:49,332][65383] Fps is (10 sec: 44236.6, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 226426880. Throughput: 0: 43278.3. Samples: 226468860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 14:20:49,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:20:52,783][65595] Signal inference workers to stop experience collection... (3200 times) [2024-06-12 14:20:52,790][65595] Signal inference workers to resume experience collection... (3200 times) [2024-06-12 14:20:52,805][65616] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-06-12 14:20:52,836][65616] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-06-12 14:20:52,922][65616] Updated weights for policy 0, policy_version 13830 (0.0022) [2024-06-12 14:20:54,333][65383] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 43875.8). Total num frames: 226623488. Throughput: 0: 43320.7. Samples: 226733680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 14:20:54,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:20:56,971][65616] Updated weights for policy 0, policy_version 13840 (0.0038) [2024-06-12 14:20:59,332][65383] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 226852864. Throughput: 0: 43251.0. Samples: 226993440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 14:20:59,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:21:00,770][65616] Updated weights for policy 0, policy_version 13850 (0.0042) [2024-06-12 14:21:04,333][65383] Fps is (10 sec: 44236.9, 60 sec: 43690.6, 300 sec: 43820.3). Total num frames: 227065856. Throughput: 0: 43244.9. 
Samples: 227123580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 14:21:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:21:04,822][65616] Updated weights for policy 0, policy_version 13860 (0.0024) [2024-06-12 14:21:08,399][65616] Updated weights for policy 0, policy_version 13870 (0.0026) [2024-06-12 14:21:09,332][65383] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 43709.2). Total num frames: 227262464. Throughput: 0: 43425.9. Samples: 227383140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 14:21:09,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:21:12,146][65616] Updated weights for policy 0, policy_version 13880 (0.0036) [2024-06-12 14:21:14,333][65383] Fps is (10 sec: 44236.5, 60 sec: 43690.5, 300 sec: 43820.2). Total num frames: 227508224. Throughput: 0: 43503.4. Samples: 227646500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 14:21:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:21:16,144][65616] Updated weights for policy 0, policy_version 13890 (0.0028) [2024-06-12 14:21:19,332][65383] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 43764.7). Total num frames: 227704832. Throughput: 0: 43685.8. Samples: 227783520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 14:21:19,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:21:19,592][65616] Updated weights for policy 0, policy_version 13900 (0.0029) [2024-06-12 14:21:23,538][65616] Updated weights for policy 0, policy_version 13910 (0.0039) [2024-06-12 14:21:24,332][65383] Fps is (10 sec: 40961.0, 60 sec: 43144.7, 300 sec: 43709.2). Total num frames: 227917824. Throughput: 0: 43555.6. Samples: 228045280. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 14:21:24,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:21:26,746][65616] Updated weights for policy 0, policy_version 13920 (0.0027) [2024-06-12 14:21:29,332][65383] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 43709.2). Total num frames: 228163584. Throughput: 0: 43545.3. Samples: 228303560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 14:21:29,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:21:30,747][65616] Updated weights for policy 0, policy_version 13930 (0.0033) [2024-06-12 14:21:34,332][65383] Fps is (10 sec: 45875.2, 60 sec: 43690.8, 300 sec: 43764.7). Total num frames: 228376576. Throughput: 0: 43898.3. Samples: 228444280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-12 14:21:34,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:21:34,419][65616] Updated weights for policy 0, policy_version 13940 (0.0036) [2024-06-12 14:21:38,235][65616] Updated weights for policy 0, policy_version 13950 (0.0024) [2024-06-12 14:21:39,332][65383] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 43653.6). Total num frames: 228573184. Throughput: 0: 43775.7. Samples: 228703580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-12 14:21:39,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:21:41,669][65616] Updated weights for policy 0, policy_version 13960 (0.0031) [2024-06-12 14:21:44,332][65383] Fps is (10 sec: 45875.1, 60 sec: 43963.7, 300 sec: 43764.7). Total num frames: 228835328. Throughput: 0: 43777.4. Samples: 228963420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 14:21:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:21:45,736][65616] Updated weights for policy 0, policy_version 13970 (0.0035) [2024-06-12 14:21:49,023][65616] Updated weights for policy 0, policy_version 13980 (0.0035) [2024-06-12 14:21:49,332][65383] Fps is (10 sec: 47513.9, 60 sec: 43690.7, 300 sec: 43875.8). 
Total num frames: 229048320. Throughput: 0: 43889.5. Samples: 229098600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 14:21:49,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:21:53,130][65616] Updated weights for policy 0, policy_version 13990 (0.0025) [2024-06-12 14:21:54,332][65383] Fps is (10 sec: 42598.2, 60 sec: 43963.8, 300 sec: 43764.7). Total num frames: 229261312. Throughput: 0: 44044.4. Samples: 229365140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 14:21:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:21:54,353][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000013993_229261312.pth... [2024-06-12 14:21:54,416][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000013354_218791936.pth [2024-06-12 14:21:56,627][65616] Updated weights for policy 0, policy_version 14000 (0.0028) [2024-06-12 14:21:59,332][65383] Fps is (10 sec: 42597.7, 60 sec: 43690.7, 300 sec: 43709.2). Total num frames: 229474304. Throughput: 0: 43937.0. Samples: 229623660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 14:21:59,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:22:00,437][65616] Updated weights for policy 0, policy_version 14010 (0.0028) [2024-06-12 14:22:04,332][65383] Fps is (10 sec: 42598.7, 60 sec: 43690.8, 300 sec: 43764.8). Total num frames: 229687296. Throughput: 0: 43824.1. Samples: 229755600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 14:22:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:22:04,432][65616] Updated weights for policy 0, policy_version 14020 (0.0027) [2024-06-12 14:22:07,478][65595] Signal inference workers to stop experience collection... (3250 times) [2024-06-12 14:22:07,478][65595] Signal inference workers to resume experience collection... 
(3250 times) [2024-06-12 14:22:07,493][65616] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-06-12 14:22:07,493][65616] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-06-12 14:22:07,951][65616] Updated weights for policy 0, policy_version 14030 (0.0028) [2024-06-12 14:22:09,332][65383] Fps is (10 sec: 44237.6, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 229916672. Throughput: 0: 43782.2. Samples: 230015480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 14:22:09,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:22:11,590][65616] Updated weights for policy 0, policy_version 14040 (0.0025) [2024-06-12 14:22:14,332][65383] Fps is (10 sec: 42598.3, 60 sec: 43417.7, 300 sec: 43709.2). Total num frames: 230113280. Throughput: 0: 43838.3. Samples: 230276280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 14:22:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:22:15,767][65616] Updated weights for policy 0, policy_version 14050 (0.0019) [2024-06-12 14:22:19,148][65616] Updated weights for policy 0, policy_version 14060 (0.0030) [2024-06-12 14:22:19,332][65383] Fps is (10 sec: 44236.3, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 230359040. Throughput: 0: 43593.7. Samples: 230406000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 14:22:19,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:22:23,247][65616] Updated weights for policy 0, policy_version 14070 (0.0035) [2024-06-12 14:22:24,332][65383] Fps is (10 sec: 44236.7, 60 sec: 43963.7, 300 sec: 43764.7). Total num frames: 230555648. Throughput: 0: 43598.7. Samples: 230665520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 14:22:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:22:26,626][65616] Updated weights for policy 0, policy_version 14080 (0.0026) [2024-06-12 14:22:29,332][65383] Fps is (10 sec: 40960.2, 60 sec: 43417.6, 300 sec: 43764.7). 
Total num frames: 230768640. Throughput: 0: 43352.9. Samples: 230914300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 14:22:29,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:22:31,305][65616] Updated weights for policy 0, policy_version 14090 (0.0024) [2024-06-12 14:22:34,120][65616] Updated weights for policy 0, policy_version 14100 (0.0025) [2024-06-12 14:22:34,333][65383] Fps is (10 sec: 45874.4, 60 sec: 43963.6, 300 sec: 43764.7). Total num frames: 231014400. Throughput: 0: 43399.3. Samples: 231051580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 14:22:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:22:38,518][65616] Updated weights for policy 0, policy_version 14110 (0.0029) [2024-06-12 14:22:39,332][65383] Fps is (10 sec: 44237.0, 60 sec: 43963.8, 300 sec: 43653.6). Total num frames: 231211008. Throughput: 0: 43244.5. Samples: 231311140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 14:22:39,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:22:41,678][65616] Updated weights for policy 0, policy_version 14120 (0.0026) [2024-06-12 14:22:44,332][65383] Fps is (10 sec: 39322.3, 60 sec: 42871.4, 300 sec: 43709.2). Total num frames: 231407616. Throughput: 0: 43425.4. Samples: 231577800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 14:22:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:22:45,862][65616] Updated weights for policy 0, policy_version 14130 (0.0026) [2024-06-12 14:22:49,197][65616] Updated weights for policy 0, policy_version 14140 (0.0024) [2024-06-12 14:22:49,332][65383] Fps is (10 sec: 45874.5, 60 sec: 43690.6, 300 sec: 43709.2). Total num frames: 231669760. Throughput: 0: 43307.0. Samples: 231704420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-12 14:22:49,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:22:53,518][65616] Updated weights for policy 0, policy_version 14150 (0.0027) [2024-06-12 14:22:54,333][65383] Fps is (10 sec: 45874.6, 60 sec: 43417.5, 300 sec: 43542.5). Total num frames: 231866368. Throughput: 0: 43430.9. Samples: 231969880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-12 14:22:54,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:22:57,454][65616] Updated weights for policy 0, policy_version 14160 (0.0033) [2024-06-12 14:22:59,333][65383] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 43653.6). Total num frames: 232079360. Throughput: 0: 43216.7. Samples: 232221040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 14:22:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:23:01,180][65616] Updated weights for policy 0, policy_version 14170 (0.0023) [2024-06-12 14:23:04,332][65383] Fps is (10 sec: 40960.2, 60 sec: 43144.4, 300 sec: 43598.1). Total num frames: 232275968. Throughput: 0: 43116.4. Samples: 232346240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 14:23:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:23:05,098][65616] Updated weights for policy 0, policy_version 14180 (0.0027) [2024-06-12 14:23:08,739][65616] Updated weights for policy 0, policy_version 14190 (0.0029) [2024-06-12 14:23:09,332][65383] Fps is (10 sec: 44237.1, 60 sec: 43417.5, 300 sec: 43487.0). Total num frames: 232521728. Throughput: 0: 43221.3. Samples: 232610480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 14:23:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:23:13,022][65616] Updated weights for policy 0, policy_version 14200 (0.0029) [2024-06-12 14:23:14,332][65383] Fps is (10 sec: 44237.6, 60 sec: 43417.7, 300 sec: 43487.0). Total num frames: 232718336. Throughput: 0: 43515.6. Samples: 232872500. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 14:23:14,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:23:16,069][65616] Updated weights for policy 0, policy_version 14210 (0.0025) [2024-06-12 14:23:19,332][65383] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 43487.0). Total num frames: 232914944. Throughput: 0: 43141.9. Samples: 232992960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 14:23:19,338][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:23:20,596][65616] Updated weights for policy 0, policy_version 14220 (0.0028) [2024-06-12 14:23:23,506][65616] Updated weights for policy 0, policy_version 14230 (0.0028) [2024-06-12 14:23:24,332][65383] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 43487.0). Total num frames: 233160704. Throughput: 0: 43152.4. Samples: 233253000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 14:23:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:23:28,116][65616] Updated weights for policy 0, policy_version 14240 (0.0032) [2024-06-12 14:23:29,332][65383] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 43376.0). Total num frames: 233357312. Throughput: 0: 42975.1. Samples: 233511680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 14:23:29,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:23:30,775][65616] Updated weights for policy 0, policy_version 14250 (0.0033) [2024-06-12 14:23:34,332][65383] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 43487.0). Total num frames: 233570304. Throughput: 0: 42932.1. Samples: 233636360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 14:23:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:23:34,550][65595] Signal inference workers to stop experience collection... (3300 times) [2024-06-12 14:23:34,551][65595] Signal inference workers to resume experience collection... 
(3300 times) [2024-06-12 14:23:34,569][65616] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-06-12 14:23:34,570][65616] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-06-12 14:23:35,537][65616] Updated weights for policy 0, policy_version 14260 (0.0031) [2024-06-12 14:23:38,854][65616] Updated weights for policy 0, policy_version 14270 (0.0029) [2024-06-12 14:23:39,332][65383] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 43487.0). Total num frames: 233799680. Throughput: 0: 42785.5. Samples: 233895220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 14:23:39,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:23:43,352][65616] Updated weights for policy 0, policy_version 14280 (0.0027) [2024-06-12 14:23:44,332][65383] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 43320.4). Total num frames: 233979904. Throughput: 0: 43054.0. Samples: 234158460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 14:23:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:23:46,476][65616] Updated weights for policy 0, policy_version 14290 (0.0034) [2024-06-12 14:23:49,333][65383] Fps is (10 sec: 40955.9, 60 sec: 42324.7, 300 sec: 43375.8). Total num frames: 234209280. Throughput: 0: 43052.0. Samples: 234283620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 14:23:49,334][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:23:50,660][65616] Updated weights for policy 0, policy_version 14300 (0.0034) [2024-06-12 14:23:54,184][65616] Updated weights for policy 0, policy_version 14310 (0.0042) [2024-06-12 14:23:54,332][65383] Fps is (10 sec: 47513.1, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 234455040. Throughput: 0: 43008.5. Samples: 234545860. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-12 14:23:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:23:54,347][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000014310_234455040.pth... [2024-06-12 14:23:54,398][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000013675_224051200.pth [2024-06-12 14:23:58,317][65616] Updated weights for policy 0, policy_version 14320 (0.0023) [2024-06-12 14:23:59,333][65383] Fps is (10 sec: 45879.1, 60 sec: 43144.6, 300 sec: 43487.0). Total num frames: 234668032. Throughput: 0: 42977.1. Samples: 234806480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-12 14:23:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:24:01,970][65616] Updated weights for policy 0, policy_version 14330 (0.0032) [2024-06-12 14:24:04,332][65383] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 43320.4). Total num frames: 234848256. Throughput: 0: 43077.4. Samples: 234931440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-12 14:24:04,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:24:05,931][65616] Updated weights for policy 0, policy_version 14340 (0.0028) [2024-06-12 14:24:09,332][65383] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 43375.9). Total num frames: 235077632. Throughput: 0: 43066.1. Samples: 235190980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-12 14:24:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:24:09,769][65616] Updated weights for policy 0, policy_version 14350 (0.0026) [2024-06-12 14:24:13,252][65616] Updated weights for policy 0, policy_version 14360 (0.0026) [2024-06-12 14:24:14,332][65383] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 43375.9). Total num frames: 235307008. Throughput: 0: 43123.1. Samples: 235452220. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 14:24:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:24:17,509][65616] Updated weights for policy 0, policy_version 14370 (0.0038) [2024-06-12 14:24:19,332][65383] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 235536384. Throughput: 0: 43342.2. Samples: 235586760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 14:24:19,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:24:20,640][65616] Updated weights for policy 0, policy_version 14380 (0.0024) [2024-06-12 14:24:24,335][65383] Fps is (10 sec: 40949.7, 60 sec: 42596.6, 300 sec: 43320.0). Total num frames: 235716608. Throughput: 0: 43344.6. Samples: 235845840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-12 14:24:24,336][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:24:25,033][65616] Updated weights for policy 0, policy_version 14390 (0.0021) [2024-06-12 14:24:28,154][65616] Updated weights for policy 0, policy_version 14400 (0.0026) [2024-06-12 14:24:29,332][65383] Fps is (10 sec: 44236.8, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 235978752. Throughput: 0: 43539.5. Samples: 236117740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-12 14:24:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:24:32,827][65616] Updated weights for policy 0, policy_version 14410 (0.0034) [2024-06-12 14:24:34,332][65383] Fps is (10 sec: 47525.8, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 236191744. Throughput: 0: 43725.9. Samples: 236251240. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-12 14:24:34,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:24:35,585][65616] Updated weights for policy 0, policy_version 14420 (0.0029) [2024-06-12 14:24:39,332][65383] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 43375.9). Total num frames: 236388352. Throughput: 0: 43465.8. Samples: 236501820. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-12 14:24:39,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:24:40,050][65616] Updated weights for policy 0, policy_version 14430 (0.0024) [2024-06-12 14:24:42,943][65616] Updated weights for policy 0, policy_version 14440 (0.0026) [2024-06-12 14:24:44,332][65383] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 43487.0). Total num frames: 236634112. Throughput: 0: 43405.0. Samples: 236759700. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-12 14:24:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:24:47,581][65616] Updated weights for policy 0, policy_version 14450 (0.0024) [2024-06-12 14:24:49,332][65383] Fps is (10 sec: 44236.7, 60 sec: 43691.3, 300 sec: 43375.9). Total num frames: 236830720. Throughput: 0: 43791.0. Samples: 236902040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 14:24:49,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:24:50,564][65616] Updated weights for policy 0, policy_version 14460 (0.0028) [2024-06-12 14:24:54,332][65383] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 43375.9). Total num frames: 237027328. Throughput: 0: 43506.2. Samples: 237148760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 14:24:54,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:24:54,664][65595] Signal inference workers to stop experience collection... (3350 times) [2024-06-12 14:24:54,708][65616] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-06-12 14:24:54,716][65595] Signal inference workers to resume experience collection... 
(3350 times) [2024-06-12 14:24:54,721][65616] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-06-12 14:24:54,851][65616] Updated weights for policy 0, policy_version 14470 (0.0034) [2024-06-12 14:24:58,354][65616] Updated weights for policy 0, policy_version 14480 (0.0030) [2024-06-12 14:24:59,332][65383] Fps is (10 sec: 45875.0, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 237289472. Throughput: 0: 43626.6. Samples: 237415420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 14:24:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:25:02,151][65616] Updated weights for policy 0, policy_version 14490 (0.0025) [2024-06-12 14:25:04,332][65383] Fps is (10 sec: 47514.1, 60 sec: 44236.8, 300 sec: 43487.0). Total num frames: 237502464. Throughput: 0: 43701.8. Samples: 237553340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 14:25:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:25:05,881][65616] Updated weights for policy 0, policy_version 14500 (0.0033) [2024-06-12 14:25:09,332][65383] Fps is (10 sec: 42598.8, 60 sec: 43963.8, 300 sec: 43487.0). Total num frames: 237715456. Throughput: 0: 43595.8. Samples: 237807540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-12 14:25:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:25:09,776][65616] Updated weights for policy 0, policy_version 14510 (0.0024) [2024-06-12 14:25:13,852][65616] Updated weights for policy 0, policy_version 14520 (0.0031) [2024-06-12 14:25:14,333][65383] Fps is (10 sec: 40959.6, 60 sec: 43417.6, 300 sec: 43320.4). Total num frames: 237912064. Throughput: 0: 43226.1. Samples: 238062920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-12 14:25:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:25:17,094][65616] Updated weights for policy 0, policy_version 14530 (0.0026) [2024-06-12 14:25:19,332][65383] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 43431.5). 
Total num frames: 238141440. Throughput: 0: 43111.9. Samples: 238191280. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-12 14:25:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:25:21,264][65616] Updated weights for policy 0, policy_version 14540 (0.0034) [2024-06-12 14:25:24,332][65383] Fps is (10 sec: 45876.0, 60 sec: 44238.7, 300 sec: 43487.0). Total num frames: 238370816. Throughput: 0: 43546.8. Samples: 238461420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-12 14:25:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:25:24,839][65616] Updated weights for policy 0, policy_version 14550 (0.0029) [2024-06-12 14:25:29,105][65616] Updated weights for policy 0, policy_version 14560 (0.0033) [2024-06-12 14:25:29,332][65383] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 43376.0). Total num frames: 238551040. Throughput: 0: 43366.2. Samples: 238711180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 14:25:29,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:25:32,290][65616] Updated weights for policy 0, policy_version 14570 (0.0026) [2024-06-12 14:25:34,332][65383] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 43376.0). Total num frames: 238780416. Throughput: 0: 42808.1. Samples: 238828400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 14:25:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:25:37,171][65616] Updated weights for policy 0, policy_version 14580 (0.0034) [2024-06-12 14:25:39,332][65383] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 43320.4). Total num frames: 238977024. Throughput: 0: 43237.9. Samples: 239094460. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-12 14:25:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:25:39,984][65616] Updated weights for policy 0, policy_version 14590 (0.0024) [2024-06-12 14:25:44,332][65383] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 43264.9). 
Total num frames: 239190016. Throughput: 0: 43125.1. Samples: 239356040. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-12 14:25:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:25:44,642][65616] Updated weights for policy 0, policy_version 14600 (0.0036) [2024-06-12 14:25:47,921][65616] Updated weights for policy 0, policy_version 14610 (0.0021) [2024-06-12 14:25:49,333][65383] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 43375.9). Total num frames: 239419392. Throughput: 0: 42703.0. Samples: 239474980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-12 14:25:49,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:25:52,198][65616] Updated weights for policy 0, policy_version 14620 (0.0030) [2024-06-12 14:25:54,333][65383] Fps is (10 sec: 45874.3, 60 sec: 43690.7, 300 sec: 43375.9). Total num frames: 239648768. Throughput: 0: 42787.0. Samples: 239732960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-12 14:25:54,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:25:54,347][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000014627_239648768.pth... [2024-06-12 14:25:54,388][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000013993_229261312.pth [2024-06-12 14:25:55,686][65616] Updated weights for policy 0, policy_version 14630 (0.0040) [2024-06-12 14:25:59,332][65383] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 43264.9). Total num frames: 239828992. Throughput: 0: 42845.4. Samples: 239990960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 14:25:59,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:26:00,073][65616] Updated weights for policy 0, policy_version 14640 (0.0022) [2024-06-12 14:26:03,313][65616] Updated weights for policy 0, policy_version 14650 (0.0033) [2024-06-12 14:26:04,332][65383] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 43431.5). Total num frames: 240074752. Throughput: 0: 42751.1. 
Samples: 240115080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 14:26:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:26:07,682][65616] Updated weights for policy 0, policy_version 14660 (0.0034) [2024-06-12 14:26:09,332][65383] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 43264.9). Total num frames: 240271360. Throughput: 0: 42691.5. Samples: 240382540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 14:26:09,333][65383] Avg episode reward: [(0, '0.084')] [2024-06-12 14:26:11,017][65616] Updated weights for policy 0, policy_version 14670 (0.0035) [2024-06-12 14:26:14,332][65383] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 43264.9). Total num frames: 240467968. Throughput: 0: 42911.6. Samples: 240642200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 14:26:14,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:26:14,896][65616] Updated weights for policy 0, policy_version 14680 (0.0027) [2024-06-12 14:26:18,000][65616] Updated weights for policy 0, policy_version 14690 (0.0028) [2024-06-12 14:26:19,222][65595] Signal inference workers to stop experience collection... (3400 times) [2024-06-12 14:26:19,274][65616] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-06-12 14:26:19,329][65595] Signal inference workers to resume experience collection... (3400 times) [2024-06-12 14:26:19,330][65616] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-06-12 14:26:19,332][65383] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 43376.0). Total num frames: 240713728. Throughput: 0: 43140.9. Samples: 240769740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 14:26:19,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:26:22,481][65616] Updated weights for policy 0, policy_version 14700 (0.0025) [2024-06-12 14:26:24,332][65383] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 43264.9). Total num frames: 240926720. 
Throughput: 0: 43048.9. Samples: 241031660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 14:26:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:26:25,652][65616] Updated weights for policy 0, policy_version 14710 (0.0024) [2024-06-12 14:26:29,333][65383] Fps is (10 sec: 42597.5, 60 sec: 43144.5, 300 sec: 43264.8). Total num frames: 241139712. Throughput: 0: 43030.9. Samples: 241292440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 14:26:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:26:29,883][65616] Updated weights for policy 0, policy_version 14720 (0.0028) [2024-06-12 14:26:33,462][65616] Updated weights for policy 0, policy_version 14730 (0.0026) [2024-06-12 14:26:34,332][65383] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 43320.4). Total num frames: 241352704. Throughput: 0: 43293.8. Samples: 241423200. Policy #0 lag: (min: 0.0, avg: 8.0, max: 23.0) [2024-06-12 14:26:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:26:37,771][65616] Updated weights for policy 0, policy_version 14740 (0.0029) [2024-06-12 14:26:39,332][65383] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 43098.3). Total num frames: 241549312. Throughput: 0: 43259.3. Samples: 241679620. Policy #0 lag: (min: 0.0, avg: 8.0, max: 23.0) [2024-06-12 14:26:39,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:26:41,357][65616] Updated weights for policy 0, policy_version 14750 (0.0027) [2024-06-12 14:26:44,332][65383] Fps is (10 sec: 44237.5, 60 sec: 43417.6, 300 sec: 43209.3). Total num frames: 241795072. Throughput: 0: 43136.2. Samples: 241932080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 14:26:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:26:44,783][65616] Updated weights for policy 0, policy_version 14760 (0.0027) [2024-06-12 14:26:48,675][65616] Updated weights for policy 0, policy_version 14770 (0.0029) [2024-06-12 14:26:49,332][65383] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 43209.3). Total num frames: 242008064. Throughput: 0: 43405.4. Samples: 242068320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 14:26:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:26:52,599][65616] Updated weights for policy 0, policy_version 14780 (0.0032) [2024-06-12 14:26:54,332][65383] Fps is (10 sec: 44236.1, 60 sec: 43144.6, 300 sec: 43264.9). Total num frames: 242237440. Throughput: 0: 43201.2. Samples: 242326600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 14:26:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:26:56,314][65616] Updated weights for policy 0, policy_version 14790 (0.0028) [2024-06-12 14:26:59,332][65383] Fps is (10 sec: 44236.8, 60 sec: 43690.7, 300 sec: 43264.9). Total num frames: 242450432. Throughput: 0: 43181.3. Samples: 242585360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 14:26:59,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:27:00,034][65616] Updated weights for policy 0, policy_version 14800 (0.0030) [2024-06-12 14:27:03,890][65616] Updated weights for policy 0, policy_version 14810 (0.0023) [2024-06-12 14:27:04,335][65383] Fps is (10 sec: 42587.6, 60 sec: 43142.7, 300 sec: 43208.9). Total num frames: 242663424. Throughput: 0: 43307.2. Samples: 242718680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 14:27:04,335][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:27:07,335][65616] Updated weights for policy 0, policy_version 14820 (0.0024) [2024-06-12 14:27:09,332][65383] Fps is (10 sec: 44236.9, 60 sec: 43690.6, 300 sec: 43320.4). 
Total num frames: 242892800. Throughput: 0: 43268.4. Samples: 242978740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 14:27:09,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:27:11,197][65616] Updated weights for policy 0, policy_version 14830 (0.0023) [2024-06-12 14:27:14,332][65383] Fps is (10 sec: 44248.7, 60 sec: 43963.8, 300 sec: 43209.3). Total num frames: 243105792. Throughput: 0: 43357.1. Samples: 243243500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 14:27:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:27:14,623][65616] Updated weights for policy 0, policy_version 14840 (0.0025) [2024-06-12 14:27:18,851][65616] Updated weights for policy 0, policy_version 14850 (0.0030) [2024-06-12 14:27:19,332][65383] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 43209.3). Total num frames: 243302400. Throughput: 0: 43222.7. Samples: 243368220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 14:27:19,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:27:22,589][65616] Updated weights for policy 0, policy_version 14860 (0.0029) [2024-06-12 14:27:24,332][65383] Fps is (10 sec: 42597.5, 60 sec: 43417.5, 300 sec: 43264.8). Total num frames: 243531776. Throughput: 0: 43414.1. Samples: 243633260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 14:27:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:27:26,611][65616] Updated weights for policy 0, policy_version 14870 (0.0028) [2024-06-12 14:27:29,332][65383] Fps is (10 sec: 45875.2, 60 sec: 43690.8, 300 sec: 43209.4). Total num frames: 243761152. Throughput: 0: 43757.3. Samples: 243901160. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-12 14:27:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:27:29,937][65616] Updated weights for policy 0, policy_version 14880 (0.0029) [2024-06-12 14:27:34,047][65616] Updated weights for policy 0, policy_version 14890 (0.0033) [2024-06-12 14:27:34,332][65383] Fps is (10 sec: 42599.1, 60 sec: 43417.7, 300 sec: 43209.3). Total num frames: 243957760. Throughput: 0: 43453.4. Samples: 244023720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-12 14:27:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:27:37,875][65616] Updated weights for policy 0, policy_version 14900 (0.0029) [2024-06-12 14:27:39,332][65383] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 43375.9). Total num frames: 244203520. Throughput: 0: 43541.8. Samples: 244285980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 14:27:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:27:42,332][65616] Updated weights for policy 0, policy_version 14910 (0.0030) [2024-06-12 14:27:44,332][65383] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 244383744. Throughput: 0: 43372.1. Samples: 244537100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 14:27:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:27:44,763][65595] Signal inference workers to stop experience collection... (3450 times) [2024-06-12 14:27:44,763][65595] Signal inference workers to resume experience collection... (3450 times) [2024-06-12 14:27:44,776][65616] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-06-12 14:27:44,776][65616] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-06-12 14:27:45,327][65616] Updated weights for policy 0, policy_version 14920 (0.0024) [2024-06-12 14:27:49,332][65383] Fps is (10 sec: 37683.3, 60 sec: 42871.5, 300 sec: 43098.3). Total num frames: 244580352. Throughput: 0: 43136.7. 
Samples: 244659720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 14:27:49,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:27:49,642][65616] Updated weights for policy 0, policy_version 14930 (0.0030) [2024-06-12 14:27:52,997][65616] Updated weights for policy 0, policy_version 14940 (0.0034) [2024-06-12 14:27:54,333][65383] Fps is (10 sec: 47512.8, 60 sec: 43690.6, 300 sec: 43320.4). Total num frames: 244858880. Throughput: 0: 43254.5. Samples: 244925200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 14:27:54,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:27:54,349][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000014945_244858880.pth... [2024-06-12 14:27:54,395][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000014310_234455040.pth [2024-06-12 14:27:57,511][65616] Updated weights for policy 0, policy_version 14950 (0.0026) [2024-06-12 14:27:59,332][65383] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 43209.3). Total num frames: 245022720. Throughput: 0: 43087.0. Samples: 245182420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 14:27:59,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:28:00,477][65616] Updated weights for policy 0, policy_version 14960 (0.0029) [2024-06-12 14:28:04,332][65383] Fps is (10 sec: 37683.3, 60 sec: 42873.3, 300 sec: 43098.3). Total num frames: 245235712. Throughput: 0: 43110.1. Samples: 245308180. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-12 14:28:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:28:05,293][65616] Updated weights for policy 0, policy_version 14970 (0.0030) [2024-06-12 14:28:08,125][65616] Updated weights for policy 0, policy_version 14980 (0.0034) [2024-06-12 14:28:09,333][65383] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 43264.8). Total num frames: 245481472. Throughput: 0: 43107.5. Samples: 245573100. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-12 14:28:09,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:28:11,928][65616] Updated weights for policy 0, policy_version 14990 (0.0026) [2024-06-12 14:28:14,332][65383] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 43209.3). Total num frames: 245661696. Throughput: 0: 43034.2. Samples: 245837700. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-12 14:28:14,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:28:15,634][65616] Updated weights for policy 0, policy_version 15000 (0.0025) [2024-06-12 14:28:19,332][65383] Fps is (10 sec: 40960.7, 60 sec: 43144.5, 300 sec: 43153.8). Total num frames: 245891072. Throughput: 0: 43026.7. Samples: 245959920. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-12 14:28:19,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:28:19,473][65616] Updated weights for policy 0, policy_version 15010 (0.0031) [2024-06-12 14:28:23,041][65616] Updated weights for policy 0, policy_version 15020 (0.0027) [2024-06-12 14:28:24,332][65383] Fps is (10 sec: 47514.2, 60 sec: 43417.7, 300 sec: 43320.4). Total num frames: 246136832. Throughput: 0: 42977.8. Samples: 246219980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-12 14:28:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:28:27,460][65616] Updated weights for policy 0, policy_version 15030 (0.0032) [2024-06-12 14:28:29,332][65383] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 43264.9). Total num frames: 246333440. Throughput: 0: 43387.9. Samples: 246489560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-12 14:28:29,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:28:30,367][65616] Updated weights for policy 0, policy_version 15040 (0.0030) [2024-06-12 14:28:34,332][65383] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 43209.3). Total num frames: 246546432. Throughput: 0: 43533.3. Samples: 246618720. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 14:28:34,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:28:35,214][65616] Updated weights for policy 0, policy_version 15050 (0.0031) [2024-06-12 14:28:38,226][65616] Updated weights for policy 0, policy_version 15060 (0.0029) [2024-06-12 14:28:39,332][65383] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 43375.9). Total num frames: 246775808. Throughput: 0: 43280.1. Samples: 246872800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 14:28:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:28:42,108][65616] Updated weights for policy 0, policy_version 15070 (0.0026) [2024-06-12 14:28:44,333][65383] Fps is (10 sec: 45874.6, 60 sec: 43690.5, 300 sec: 43376.1). Total num frames: 247005184. Throughput: 0: 43507.4. Samples: 247140260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 26.0) [2024-06-12 14:28:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:28:45,883][65616] Updated weights for policy 0, policy_version 15080 (0.0025) [2024-06-12 14:28:49,332][65383] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 43209.3). Total num frames: 247201792. Throughput: 0: 43644.5. Samples: 247272180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 26.0) [2024-06-12 14:28:49,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:28:49,770][65616] Updated weights for policy 0, policy_version 15090 (0.0027) [2024-06-12 14:28:53,495][65616] Updated weights for policy 0, policy_version 15100 (0.0029) [2024-06-12 14:28:54,332][65383] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 43209.3). Total num frames: 247414784. Throughput: 0: 43543.6. Samples: 247532560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 26.0) [2024-06-12 14:28:54,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:28:57,471][65616] Updated weights for policy 0, policy_version 15110 (0.0027) [2024-06-12 14:28:59,332][65383] Fps is (10 sec: 45875.3, 60 sec: 43963.8, 300 sec: 43431.5). 
Total num frames: 247660544. Throughput: 0: 43368.5. Samples: 247789280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 14:28:59,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:29:01,300][65616] Updated weights for policy 0, policy_version 15120 (0.0031) [2024-06-12 14:29:04,332][65383] Fps is (10 sec: 44237.4, 60 sec: 43690.8, 300 sec: 43320.4). Total num frames: 247857152. Throughput: 0: 43752.9. Samples: 247928800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 14:29:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:29:04,863][65616] Updated weights for policy 0, policy_version 15130 (0.0026) [2024-06-12 14:29:08,966][65616] Updated weights for policy 0, policy_version 15140 (0.0026) [2024-06-12 14:29:09,332][65383] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 43209.3). Total num frames: 248053760. Throughput: 0: 43571.9. Samples: 248180720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 18.0) [2024-06-12 14:29:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:29:12,122][65616] Updated weights for policy 0, policy_version 15150 (0.0028) [2024-06-12 14:29:14,332][65383] Fps is (10 sec: 44236.6, 60 sec: 43963.8, 300 sec: 43264.9). Total num frames: 248299520. Throughput: 0: 43298.3. Samples: 248437980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 18.0) [2024-06-12 14:29:14,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:29:16,278][65616] Updated weights for policy 0, policy_version 15160 (0.0033) [2024-06-12 14:29:17,520][65595] Signal inference workers to stop experience collection... (3500 times) [2024-06-12 14:29:17,563][65616] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-06-12 14:29:17,568][65595] Signal inference workers to resume experience collection... 
(3500 times) [2024-06-12 14:29:17,579][65616] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-06-12 14:29:19,332][65383] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 43320.8). Total num frames: 248496128. Throughput: 0: 43351.1. Samples: 248569520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 14:29:19,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:29:19,837][65616] Updated weights for policy 0, policy_version 15170 (0.0034) [2024-06-12 14:29:24,152][65616] Updated weights for policy 0, policy_version 15180 (0.0030) [2024-06-12 14:29:24,332][65383] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 43153.8). Total num frames: 248709120. Throughput: 0: 43311.1. Samples: 248821800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 14:29:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:29:27,618][65616] Updated weights for policy 0, policy_version 15190 (0.0034) [2024-06-12 14:29:29,332][65383] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 43209.3). Total num frames: 248938496. Throughput: 0: 43008.5. Samples: 249075640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-12 14:29:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:29:31,358][65616] Updated weights for policy 0, policy_version 15200 (0.0025) [2024-06-12 14:29:34,332][65383] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 43209.3). Total num frames: 249135104. Throughput: 0: 43019.6. Samples: 249208060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-12 14:29:34,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:29:35,159][65616] Updated weights for policy 0, policy_version 15210 (0.0020) [2024-06-12 14:29:38,870][65616] Updated weights for policy 0, policy_version 15220 (0.0035) [2024-06-12 14:29:39,332][65383] Fps is (10 sec: 45876.1, 60 sec: 43690.8, 300 sec: 43264.9). Total num frames: 249397248. Throughput: 0: 43061.9. Samples: 249470340. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-06-12 14:29:39,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:29:42,972][65616] Updated weights for policy 0, policy_version 15230 (0.0032) [2024-06-12 14:29:44,332][65383] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 43209.3). Total num frames: 249577472. Throughput: 0: 43062.1. Samples: 249727080. Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-06-12 14:29:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:29:46,526][65616] Updated weights for policy 0, policy_version 15240 (0.0027) [2024-06-12 14:29:49,332][65383] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 43320.4). Total num frames: 249806848. Throughput: 0: 42846.6. Samples: 249856900. Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-06-12 14:29:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:29:50,675][65616] Updated weights for policy 0, policy_version 15250 (0.0036) [2024-06-12 14:29:54,222][65616] Updated weights for policy 0, policy_version 15260 (0.0024) [2024-06-12 14:29:54,332][65383] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 43153.8). Total num frames: 250019840. Throughput: 0: 42903.6. Samples: 250111380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 14:29:54,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:29:54,343][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000015260_250019840.pth... [2024-06-12 14:29:54,384][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000014627_239648768.pth [2024-06-12 14:29:54,387][65595] Saving new best policy, reward=0.094! [2024-06-12 14:29:58,205][65616] Updated weights for policy 0, policy_version 15270 (0.0031) [2024-06-12 14:29:59,332][65383] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 43042.7). Total num frames: 250200064. Throughput: 0: 42801.2. Samples: 250364040. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 14:29:59,339][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:30:01,690][65616] Updated weights for policy 0, policy_version 15280 (0.0021) [2024-06-12 14:30:04,332][65383] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 43153.8). Total num frames: 250445824. Throughput: 0: 42769.7. Samples: 250494160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 14:30:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:30:05,234][65616] Updated weights for policy 0, policy_version 15290 (0.0024) [2024-06-12 14:30:09,332][65383] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 43153.8). Total num frames: 250642432. Throughput: 0: 43127.6. Samples: 250762540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 14:30:09,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:30:09,541][65616] Updated weights for policy 0, policy_version 15300 (0.0029) [2024-06-12 14:30:13,070][65616] Updated weights for policy 0, policy_version 15310 (0.0027) [2024-06-12 14:30:14,332][65383] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 43098.3). Total num frames: 250855424. Throughput: 0: 43138.4. Samples: 251016860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 14:30:14,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 14:30:17,581][65616] Updated weights for policy 0, policy_version 15320 (0.0024) [2024-06-12 14:30:19,333][65383] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 43098.2). Total num frames: 251084800. Throughput: 0: 43122.5. Samples: 251148580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 14:30:19,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:30:21,112][65616] Updated weights for policy 0, policy_version 15330 (0.0029) [2024-06-12 14:30:24,332][65383] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 43098.3). Total num frames: 251265024. Throughput: 0: 42892.3. Samples: 251400500. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 14:30:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:30:25,227][65616] Updated weights for policy 0, policy_version 15340 (0.0034) [2024-06-12 14:30:28,432][65616] Updated weights for policy 0, policy_version 15350 (0.0031) [2024-06-12 14:30:29,332][65383] Fps is (10 sec: 45876.1, 60 sec: 43417.7, 300 sec: 43264.9). Total num frames: 251543552. Throughput: 0: 42840.5. Samples: 251654900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 14:30:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:30:33,097][65616] Updated weights for policy 0, policy_version 15360 (0.0029) [2024-06-12 14:30:34,332][65383] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 43153.8). Total num frames: 251707392. Throughput: 0: 42886.2. Samples: 251786780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 14:30:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:30:36,252][65616] Updated weights for policy 0, policy_version 15370 (0.0025) [2024-06-12 14:30:39,332][65383] Fps is (10 sec: 36045.0, 60 sec: 41779.2, 300 sec: 43098.3). Total num frames: 251904000. Throughput: 0: 42895.2. Samples: 252041660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-12 14:30:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:30:40,731][65616] Updated weights for policy 0, policy_version 15380 (0.0031) [2024-06-12 14:30:43,440][65616] Updated weights for policy 0, policy_version 15390 (0.0028) [2024-06-12 14:30:44,332][65383] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 43209.4). Total num frames: 252166144. Throughput: 0: 42923.7. Samples: 252295600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-12 14:30:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:30:48,305][65616] Updated weights for policy 0, policy_version 15400 (0.0036) [2024-06-12 14:30:49,035][65595] Signal inference workers to stop experience collection... 
(3550 times) [2024-06-12 14:30:49,040][65595] Signal inference workers to resume experience collection... (3550 times) [2024-06-12 14:30:49,054][65616] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-06-12 14:30:49,054][65616] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-06-12 14:30:49,333][65383] Fps is (10 sec: 45870.5, 60 sec: 42597.7, 300 sec: 43098.1). Total num frames: 252362752. Throughput: 0: 42988.0. Samples: 252428660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 14:30:49,334][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:30:51,291][65616] Updated weights for policy 0, policy_version 15410 (0.0027) [2024-06-12 14:30:54,332][65383] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 43264.9). Total num frames: 252592128. Throughput: 0: 42884.0. Samples: 252692320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 14:30:54,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:30:55,499][65616] Updated weights for policy 0, policy_version 15420 (0.0030) [2024-06-12 14:30:59,107][65616] Updated weights for policy 0, policy_version 15430 (0.0027) [2024-06-12 14:30:59,332][65383] Fps is (10 sec: 44241.2, 60 sec: 43417.7, 300 sec: 43153.8). Total num frames: 252805120. Throughput: 0: 42876.0. Samples: 252946280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-12 14:30:59,341][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:31:02,858][65616] Updated weights for policy 0, policy_version 15440 (0.0024) [2024-06-12 14:31:04,332][65383] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 43209.3). Total num frames: 253018112. Throughput: 0: 42910.4. Samples: 253079540. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-12 14:31:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:31:06,260][65616] Updated weights for policy 0, policy_version 15450 (0.0032) [2024-06-12 14:31:09,332][65383] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 43264.9). Total num frames: 253231104. Throughput: 0: 43080.0. Samples: 253339100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 14:31:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:31:10,549][65616] Updated weights for policy 0, policy_version 15460 (0.0025) [2024-06-12 14:31:14,030][65616] Updated weights for policy 0, policy_version 15470 (0.0030) [2024-06-12 14:31:14,332][65383] Fps is (10 sec: 44236.8, 60 sec: 43417.5, 300 sec: 43209.3). Total num frames: 253460480. Throughput: 0: 43056.4. Samples: 253592440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 14:31:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:31:17,930][65616] Updated weights for policy 0, policy_version 15480 (0.0027) [2024-06-12 14:31:19,332][65383] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 43098.2). Total num frames: 253640704. Throughput: 0: 43144.0. Samples: 253728260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 14:31:19,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:31:21,759][65616] Updated weights for policy 0, policy_version 15490 (0.0028) [2024-06-12 14:31:24,333][65383] Fps is (10 sec: 44236.3, 60 sec: 43963.6, 300 sec: 43264.9). Total num frames: 253902848. Throughput: 0: 43082.9. Samples: 253980400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 14:31:24,342][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:31:25,342][65616] Updated weights for policy 0, policy_version 15500 (0.0034) [2024-06-12 14:31:29,332][65383] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 43153.8). Total num frames: 254083072. Throughput: 0: 43172.0. Samples: 254238340. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 14:31:29,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:31:29,583][65616] Updated weights for policy 0, policy_version 15510 (0.0030) [2024-06-12 14:31:32,947][65616] Updated weights for policy 0, policy_version 15520 (0.0025) [2024-06-12 14:31:34,332][65383] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 43209.3). Total num frames: 254296064. Throughput: 0: 43019.5. Samples: 254364500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 14:31:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:31:37,417][65616] Updated weights for policy 0, policy_version 15530 (0.0021) [2024-06-12 14:31:39,332][65383] Fps is (10 sec: 45874.6, 60 sec: 43963.6, 300 sec: 43209.3). Total num frames: 254541824. Throughput: 0: 42930.2. Samples: 254624180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 14:31:39,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:31:40,890][65616] Updated weights for policy 0, policy_version 15540 (0.0029) [2024-06-12 14:31:44,332][65383] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 43042.7). Total num frames: 254705664. Throughput: 0: 43024.9. Samples: 254882400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-12 14:31:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:31:45,484][65616] Updated weights for policy 0, policy_version 15550 (0.0026) [2024-06-12 14:31:49,215][65616] Updated weights for policy 0, policy_version 15560 (0.0030) [2024-06-12 14:31:49,332][65383] Fps is (10 sec: 39321.9, 60 sec: 42872.1, 300 sec: 43042.7). Total num frames: 254935040. Throughput: 0: 42693.4. Samples: 255000740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-12 14:31:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:31:52,836][65616] Updated weights for policy 0, policy_version 15570 (0.0019) [2024-06-12 14:31:54,332][65383] Fps is (10 sec: 49151.3, 60 sec: 43417.6, 300 sec: 43209.3). 
Total num frames: 255197184. Throughput: 0: 42912.8. Samples: 255270180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 14:31:54,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:31:54,348][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000015576_255197184.pth... [2024-06-12 14:31:54,396][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000014945_244858880.pth [2024-06-12 14:31:56,050][65616] Updated weights for policy 0, policy_version 15580 (0.0030) [2024-06-12 14:31:59,332][65383] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42987.5). Total num frames: 255344640. Throughput: 0: 43038.2. Samples: 255529160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 14:31:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:32:00,369][65616] Updated weights for policy 0, policy_version 15590 (0.0038) [2024-06-12 14:32:03,351][65616] Updated weights for policy 0, policy_version 15600 (0.0025) [2024-06-12 14:32:04,332][65383] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 255606784. Throughput: 0: 42792.0. Samples: 255653900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 14:32:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:32:07,742][65616] Updated weights for policy 0, policy_version 15610 (0.0035) [2024-06-12 14:32:09,332][65383] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 255803392. Throughput: 0: 42946.3. Samples: 255912980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 14:32:09,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:32:10,882][65616] Updated weights for policy 0, policy_version 15620 (0.0030) [2024-06-12 14:32:14,332][65383] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42987.2). Total num frames: 255983616. Throughput: 0: 43293.3. Samples: 256186540. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 14:32:14,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:32:15,232][65616] Updated weights for policy 0, policy_version 15630 (0.0028) [2024-06-12 14:32:18,023][65595] Signal inference workers to stop experience collection... (3600 times) [2024-06-12 14:32:18,057][65616] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-06-12 14:32:18,080][65595] Signal inference workers to resume experience collection... (3600 times) [2024-06-12 14:32:18,080][65616] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-06-12 14:32:18,526][65616] Updated weights for policy 0, policy_version 15640 (0.0028) [2024-06-12 14:32:19,332][65383] Fps is (10 sec: 47513.8, 60 sec: 43963.8, 300 sec: 43209.4). Total num frames: 256278528. Throughput: 0: 43205.4. Samples: 256308740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 26.0) [2024-06-12 14:32:19,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:32:22,900][65616] Updated weights for policy 0, policy_version 15650 (0.0033) [2024-06-12 14:32:24,332][65383] Fps is (10 sec: 47512.9, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 256458752. Throughput: 0: 43264.0. Samples: 256571060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 26.0) [2024-06-12 14:32:24,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:32:26,291][65616] Updated weights for policy 0, policy_version 15660 (0.0027) [2024-06-12 14:32:29,332][65383] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 256671744. Throughput: 0: 43392.0. Samples: 256835040. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-12 14:32:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:32:30,218][65616] Updated weights for policy 0, policy_version 15670 (0.0033) [2024-06-12 14:32:33,834][65616] Updated weights for policy 0, policy_version 15680 (0.0031) [2024-06-12 14:32:34,332][65383] Fps is (10 sec: 45875.4, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 256917504. Throughput: 0: 43415.1. Samples: 256954420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-12 14:32:34,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:32:38,015][65616] Updated weights for policy 0, policy_version 15690 (0.0034) [2024-06-12 14:32:39,332][65383] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 43153.8). Total num frames: 257114112. Throughput: 0: 43427.7. Samples: 257224420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-12 14:32:39,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:32:41,529][65616] Updated weights for policy 0, policy_version 15700 (0.0028) [2024-06-12 14:32:44,332][65383] Fps is (10 sec: 40960.4, 60 sec: 43690.7, 300 sec: 43209.3). Total num frames: 257327104. Throughput: 0: 43301.4. Samples: 257477720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-12 14:32:44,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:32:45,007][65616] Updated weights for policy 0, policy_version 15710 (0.0027) [2024-06-12 14:32:49,219][65616] Updated weights for policy 0, policy_version 15720 (0.0023) [2024-06-12 14:32:49,332][65383] Fps is (10 sec: 44236.9, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 257556480. Throughput: 0: 43427.6. Samples: 257608140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-12 14:32:49,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:32:52,622][65616] Updated weights for policy 0, policy_version 15730 (0.0027) [2024-06-12 14:32:54,332][65383] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 43209.3). 
Total num frames: 257769472. Throughput: 0: 43474.6. Samples: 257869340. Policy #0 lag: (min: 2.0, avg: 10.2, max: 23.0) [2024-06-12 14:32:54,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:32:57,525][65616] Updated weights for policy 0, policy_version 15740 (0.0026) [2024-06-12 14:32:59,332][65383] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 43264.9). Total num frames: 257998848. Throughput: 0: 43024.9. Samples: 258122660. Policy #0 lag: (min: 2.0, avg: 10.2, max: 23.0) [2024-06-12 14:32:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:33:00,303][65616] Updated weights for policy 0, policy_version 15750 (0.0034) [2024-06-12 14:33:04,333][65383] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 258179072. Throughput: 0: 43361.6. Samples: 258260020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 14:33:04,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:33:04,907][65616] Updated weights for policy 0, policy_version 15760 (0.0029) [2024-06-12 14:33:07,904][65616] Updated weights for policy 0, policy_version 15770 (0.0028) [2024-06-12 14:33:09,332][65383] Fps is (10 sec: 40960.2, 60 sec: 43417.7, 300 sec: 43209.3). Total num frames: 258408448. Throughput: 0: 43022.9. Samples: 258507080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 14:33:09,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:33:12,646][65616] Updated weights for policy 0, policy_version 15780 (0.0028) [2024-06-12 14:33:14,332][65383] Fps is (10 sec: 45875.7, 60 sec: 44236.8, 300 sec: 43209.3). Total num frames: 258637824. Throughput: 0: 42955.5. Samples: 258768040. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 14:33:14,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:33:15,417][65616] Updated weights for policy 0, policy_version 15790 (0.0031) [2024-06-12 14:33:19,332][65383] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 258834432. 
Throughput: 0: 43337.3. Samples: 258904600. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 14:33:19,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:33:20,164][65616] Updated weights for policy 0, policy_version 15800 (0.0032) [2024-06-12 14:33:22,757][65616] Updated weights for policy 0, policy_version 15810 (0.0039) [2024-06-12 14:33:24,332][65383] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 43153.8). Total num frames: 259063808. Throughput: 0: 42958.7. Samples: 259157560. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 14:33:24,333][65383] Avg episode reward: [(0, '0.090')] [2024-06-12 14:33:27,485][65616] Updated weights for policy 0, policy_version 15820 (0.0028) [2024-06-12 14:33:29,332][65383] Fps is (10 sec: 45875.0, 60 sec: 43690.6, 300 sec: 43209.3). Total num frames: 259293184. Throughput: 0: 43130.1. Samples: 259418580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 14:33:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:33:30,662][65616] Updated weights for policy 0, policy_version 15830 (0.0027) [2024-06-12 14:33:34,332][65383] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 43042.7). Total num frames: 259473408. Throughput: 0: 43086.7. Samples: 259547040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 14:33:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:33:34,845][65616] Updated weights for policy 0, policy_version 15840 (0.0030) [2024-06-12 14:33:38,687][65616] Updated weights for policy 0, policy_version 15850 (0.0029) [2024-06-12 14:33:39,332][65383] Fps is (10 sec: 42598.6, 60 sec: 43417.5, 300 sec: 43098.3). Total num frames: 259719168. Throughput: 0: 42888.4. Samples: 259799320. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 14:33:39,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:33:42,413][65616] Updated weights for policy 0, policy_version 15860 (0.0032) [2024-06-12 14:33:44,332][65383] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 259899392. Throughput: 0: 43138.6. Samples: 260063900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 14:33:44,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:33:46,351][65616] Updated weights for policy 0, policy_version 15870 (0.0030) [2024-06-12 14:33:49,332][65383] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 43153.8). Total num frames: 260145152. Throughput: 0: 42962.8. Samples: 260193340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 14:33:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:33:50,037][65616] Updated weights for policy 0, policy_version 15880 (0.0029) [2024-06-12 14:33:52,326][65595] Signal inference workers to stop experience collection... (3650 times) [2024-06-12 14:33:52,326][65595] Signal inference workers to resume experience collection... (3650 times) [2024-06-12 14:33:52,355][65616] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-06-12 14:33:52,355][65616] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-06-12 14:33:54,332][65383] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 260325376. Throughput: 0: 43042.5. Samples: 260444000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 14:33:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:33:54,509][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000015890_260341760.pth... 
[2024-06-12 14:33:54,510][65616] Updated weights for policy 0, policy_version 15890 (0.0024) [2024-06-12 14:33:54,566][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000015260_250019840.pth [2024-06-12 14:33:57,460][65616] Updated weights for policy 0, policy_version 15900 (0.0026) [2024-06-12 14:33:59,332][65383] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 43098.2). Total num frames: 260571136. Throughput: 0: 43107.2. Samples: 260707860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 14:33:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:34:01,601][65616] Updated weights for policy 0, policy_version 15910 (0.0028) [2024-06-12 14:34:04,332][65383] Fps is (10 sec: 45875.4, 60 sec: 43417.7, 300 sec: 43153.8). Total num frames: 260784128. Throughput: 0: 43120.5. Samples: 260845020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 14:34:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:34:05,286][65616] Updated weights for policy 0, policy_version 15920 (0.0032) [2024-06-12 14:34:09,035][65616] Updated weights for policy 0, policy_version 15930 (0.0024) [2024-06-12 14:34:09,332][65383] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 260997120. Throughput: 0: 43217.8. Samples: 261102360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 14:34:09,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:34:12,798][65616] Updated weights for policy 0, policy_version 15940 (0.0029) [2024-06-12 14:34:14,332][65383] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 43153.8). Total num frames: 261226496. Throughput: 0: 43064.5. Samples: 261356480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 14:34:14,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:34:16,974][65616] Updated weights for policy 0, policy_version 15950 (0.0024) [2024-06-12 14:34:19,332][65383] Fps is (10 sec: 44236.0, 60 sec: 43417.6, 300 sec: 43153.8). 
Total num frames: 261439488. Throughput: 0: 43253.7. Samples: 261493460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 14:34:19,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:34:20,214][65616] Updated weights for policy 0, policy_version 15960 (0.0034) [2024-06-12 14:34:24,332][65383] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 261619712. Throughput: 0: 43310.4. Samples: 261748280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 14:34:24,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:34:24,870][65616] Updated weights for policy 0, policy_version 15970 (0.0030) [2024-06-12 14:34:27,227][65616] Updated weights for policy 0, policy_version 15980 (0.0026) [2024-06-12 14:34:29,332][65383] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 43153.8). Total num frames: 261865472. Throughput: 0: 43096.4. Samples: 262003240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 14:34:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:34:32,289][65616] Updated weights for policy 0, policy_version 15990 (0.0029) [2024-06-12 14:34:34,332][65383] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 262078464. Throughput: 0: 43366.7. Samples: 262144840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 14:34:34,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:34:34,911][65616] Updated weights for policy 0, policy_version 16000 (0.0030) [2024-06-12 14:34:39,332][65383] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 262258688. Throughput: 0: 43544.1. Samples: 262403480. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 14:34:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:34:39,882][65616] Updated weights for policy 0, policy_version 16010 (0.0028) [2024-06-12 14:34:42,766][65616] Updated weights for policy 0, policy_version 16020 (0.0031) [2024-06-12 14:34:44,332][65383] Fps is (10 sec: 44236.5, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 262520832. Throughput: 0: 43113.2. Samples: 262647960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 14:34:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:34:47,672][65616] Updated weights for policy 0, policy_version 16030 (0.0026) [2024-06-12 14:34:49,332][65383] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 262733824. Throughput: 0: 43178.6. Samples: 262788060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 14:34:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:34:50,297][65616] Updated weights for policy 0, policy_version 16040 (0.0027) [2024-06-12 14:34:54,332][65383] Fps is (10 sec: 40960.3, 60 sec: 43417.7, 300 sec: 43153.8). Total num frames: 262930432. Throughput: 0: 43063.9. Samples: 263040240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 14:34:54,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:34:54,849][65616] Updated weights for policy 0, policy_version 16050 (0.0035) [2024-06-12 14:34:58,026][65616] Updated weights for policy 0, policy_version 16060 (0.0026) [2024-06-12 14:34:59,332][65383] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 43153.8). Total num frames: 263176192. Throughput: 0: 43149.5. Samples: 263298200. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-12 14:34:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:35:02,438][65616] Updated weights for policy 0, policy_version 16070 (0.0033) [2024-06-12 14:35:04,332][65383] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 43153.8). 
Total num frames: 263372800. Throughput: 0: 43090.3. Samples: 263432520. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-12 14:35:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:35:05,540][65616] Updated weights for policy 0, policy_version 16080 (0.0031) [2024-06-12 14:35:09,332][65383] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 43153.8). Total num frames: 263585792. Throughput: 0: 43072.8. Samples: 263686560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 14:35:09,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:35:10,317][65616] Updated weights for policy 0, policy_version 16090 (0.0027) [2024-06-12 14:35:13,185][65616] Updated weights for policy 0, policy_version 16100 (0.0029) [2024-06-12 14:35:14,336][65383] Fps is (10 sec: 42583.7, 60 sec: 42869.0, 300 sec: 43097.8). Total num frames: 263798784. Throughput: 0: 43118.9. Samples: 263943740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 14:35:14,337][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:35:17,345][65616] Updated weights for policy 0, policy_version 16110 (0.0029) [2024-06-12 14:35:19,332][65383] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 43264.9). Total num frames: 264028160. Throughput: 0: 42989.8. Samples: 264079380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 14:35:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:35:20,848][65616] Updated weights for policy 0, policy_version 16120 (0.0026) [2024-06-12 14:35:24,332][65383] Fps is (10 sec: 40974.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 264208384. Throughput: 0: 43052.9. Samples: 264340860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 14:35:24,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:35:24,405][65595] Signal inference workers to stop experience collection... (3700 times) [2024-06-12 14:35:24,405][65595] Signal inference workers to resume experience collection... 
(3700 times) [2024-06-12 14:35:24,424][65616] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-06-12 14:35:24,424][65616] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-06-12 14:35:24,843][65616] Updated weights for policy 0, policy_version 16130 (0.0028) [2024-06-12 14:35:28,834][65616] Updated weights for policy 0, policy_version 16140 (0.0033) [2024-06-12 14:35:29,332][65383] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 43209.3). Total num frames: 264454144. Throughput: 0: 43150.2. Samples: 264589720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 14:35:29,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:35:32,548][65616] Updated weights for policy 0, policy_version 16150 (0.0032) [2024-06-12 14:35:34,332][65383] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 43264.8). Total num frames: 264667136. Throughput: 0: 42816.5. Samples: 264714800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 14:35:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:35:36,683][65616] Updated weights for policy 0, policy_version 16160 (0.0032) [2024-06-12 14:35:39,333][65383] Fps is (10 sec: 39320.9, 60 sec: 43144.4, 300 sec: 42987.1). Total num frames: 264847360. Throughput: 0: 42961.1. Samples: 264973500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 14:35:39,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:35:40,003][65616] Updated weights for policy 0, policy_version 16170 (0.0029) [2024-06-12 14:35:44,332][65383] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 43098.4). Total num frames: 265076736. Throughput: 0: 43003.4. Samples: 265233360. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 14:35:44,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:35:44,661][65616] Updated weights for policy 0, policy_version 16180 (0.0025) [2024-06-12 14:35:47,605][65616] Updated weights for policy 0, policy_version 16190 (0.0024) [2024-06-12 14:35:49,332][65383] Fps is (10 sec: 45876.5, 60 sec: 42871.5, 300 sec: 43098.3). Total num frames: 265306112. Throughput: 0: 42877.0. Samples: 265361980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 14:35:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:35:52,254][65616] Updated weights for policy 0, policy_version 16200 (0.0026) [2024-06-12 14:35:54,332][65383] Fps is (10 sec: 45875.4, 60 sec: 43417.5, 300 sec: 43153.8). Total num frames: 265535488. Throughput: 0: 43019.5. Samples: 265622440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 14:35:54,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:35:54,338][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000016207_265535488.pth... [2024-06-12 14:35:54,381][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000015576_255197184.pth [2024-06-12 14:35:55,393][65616] Updated weights for policy 0, policy_version 16210 (0.0030) [2024-06-12 14:35:59,332][65383] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 43042.7). Total num frames: 265715712. Throughput: 0: 43241.5. Samples: 265889460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-12 14:35:59,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:35:59,897][65616] Updated weights for policy 0, policy_version 16220 (0.0031) [2024-06-12 14:36:02,751][65616] Updated weights for policy 0, policy_version 16230 (0.0033) [2024-06-12 14:36:04,332][65383] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 43153.8). Total num frames: 265961472. Throughput: 0: 42981.2. Samples: 266013540. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-12 14:36:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:36:07,564][65616] Updated weights for policy 0, policy_version 16240 (0.0026) [2024-06-12 14:36:09,332][65383] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 43153.8). Total num frames: 266190848. Throughput: 0: 43308.0. Samples: 266289720. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-06-12 14:36:09,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:36:09,922][65616] Updated weights for policy 0, policy_version 16250 (0.0027) [2024-06-12 14:36:14,332][65383] Fps is (10 sec: 40960.5, 60 sec: 42874.0, 300 sec: 43153.8). Total num frames: 266371072. Throughput: 0: 43568.1. Samples: 266550280. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-06-12 14:36:14,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:36:15,054][65616] Updated weights for policy 0, policy_version 16260 (0.0022) [2024-06-12 14:36:17,187][65616] Updated weights for policy 0, policy_version 16270 (0.0024) [2024-06-12 14:36:19,332][65383] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 43209.3). Total num frames: 266649600. Throughput: 0: 43491.2. Samples: 266671900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 14:36:19,333][65383] Avg episode reward: [(0, '0.088')] [2024-06-12 14:36:22,292][65616] Updated weights for policy 0, policy_version 16280 (0.0025) [2024-06-12 14:36:24,332][65383] Fps is (10 sec: 47513.5, 60 sec: 43963.7, 300 sec: 43264.9). Total num frames: 266846208. Throughput: 0: 43878.9. Samples: 266948040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 14:36:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:36:25,121][65616] Updated weights for policy 0, policy_version 16290 (0.0030) [2024-06-12 14:36:29,332][65383] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 43209.3). Total num frames: 267042816. Throughput: 0: 44031.3. Samples: 267214760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 14:36:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:36:29,374][65616] Updated weights for policy 0, policy_version 16300 (0.0033) [2024-06-12 14:36:31,595][65595] Signal inference workers to stop experience collection... (3750 times) [2024-06-12 14:36:31,606][65616] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-06-12 14:36:31,704][65595] Signal inference workers to resume experience collection... (3750 times) [2024-06-12 14:36:31,704][65616] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-06-12 14:36:33,027][65616] Updated weights for policy 0, policy_version 16310 (0.0029) [2024-06-12 14:36:34,332][65383] Fps is (10 sec: 42597.8, 60 sec: 43417.6, 300 sec: 43153.8). Total num frames: 267272192. Throughput: 0: 43769.2. Samples: 267331600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 14:36:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:36:36,882][65616] Updated weights for policy 0, policy_version 16320 (0.0032) [2024-06-12 14:36:39,332][65383] Fps is (10 sec: 44236.0, 60 sec: 43963.8, 300 sec: 43320.4). Total num frames: 267485184. Throughput: 0: 43806.6. Samples: 267593740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 14:36:39,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:36:40,526][65616] Updated weights for policy 0, policy_version 16330 (0.0028) [2024-06-12 14:36:44,332][65383] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 43264.9). Total num frames: 267698176. Throughput: 0: 43469.3. Samples: 267845580. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 25.0) [2024-06-12 14:36:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:36:44,466][65616] Updated weights for policy 0, policy_version 16340 (0.0025) [2024-06-12 14:36:48,281][65616] Updated weights for policy 0, policy_version 16350 (0.0030) [2024-06-12 14:36:49,332][65383] Fps is (10 sec: 42598.7, 60 sec: 43417.5, 300 sec: 43098.3). Total num frames: 267911168. Throughput: 0: 43476.5. Samples: 267969980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 25.0) [2024-06-12 14:36:49,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:36:52,224][65616] Updated weights for policy 0, policy_version 16360 (0.0029) [2024-06-12 14:36:54,332][65383] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 43264.9). Total num frames: 268107776. Throughput: 0: 43024.4. Samples: 268225820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 25.0) [2024-06-12 14:36:54,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:36:55,573][65616] Updated weights for policy 0, policy_version 16370 (0.0026) [2024-06-12 14:36:59,332][65383] Fps is (10 sec: 44236.9, 60 sec: 43963.8, 300 sec: 43209.3). Total num frames: 268353536. Throughput: 0: 42927.5. Samples: 268482020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 14:36:59,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:36:59,412][65616] Updated weights for policy 0, policy_version 16380 (0.0025) [2024-06-12 14:37:03,878][65616] Updated weights for policy 0, policy_version 16390 (0.0032) [2024-06-12 14:37:04,332][65383] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 43153.8). Total num frames: 268533760. Throughput: 0: 43242.7. Samples: 268617820. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 14:37:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:37:07,075][65616] Updated weights for policy 0, policy_version 16400 (0.0024) [2024-06-12 14:37:09,332][65383] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 43264.8). 
Total num frames: 268746752. Throughput: 0: 42582.1. Samples: 268864240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-12 14:37:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:37:11,571][65616] Updated weights for policy 0, policy_version 16410 (0.0031) [2024-06-12 14:37:14,333][65383] Fps is (10 sec: 45874.3, 60 sec: 43690.5, 300 sec: 43098.2). Total num frames: 268992512. Throughput: 0: 42408.2. Samples: 269123140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-12 14:37:14,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:37:14,781][65616] Updated weights for policy 0, policy_version 16420 (0.0030) [2024-06-12 14:37:19,216][65616] Updated weights for policy 0, policy_version 16430 (0.0025) [2024-06-12 14:37:19,332][65383] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 43153.8). Total num frames: 269189120. Throughput: 0: 42808.1. Samples: 269257960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 14:37:19,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:37:22,412][65616] Updated weights for policy 0, policy_version 16440 (0.0023) [2024-06-12 14:37:24,332][65383] Fps is (10 sec: 42599.6, 60 sec: 42871.5, 300 sec: 43209.3). Total num frames: 269418496. Throughput: 0: 42606.4. Samples: 269511020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 14:37:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:37:27,120][65616] Updated weights for policy 0, policy_version 16450 (0.0031) [2024-06-12 14:37:29,333][65383] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 43098.2). Total num frames: 269631488. Throughput: 0: 42716.8. Samples: 269767840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 14:37:29,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:37:30,468][65616] Updated weights for policy 0, policy_version 16460 (0.0032) [2024-06-12 14:37:34,332][65383] Fps is (10 sec: 39320.9, 60 sec: 42325.4, 300 sec: 43042.7). Total num frames: 269811712. 
Throughput: 0: 42757.7. Samples: 269894080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 14:37:34,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:37:34,650][65616] Updated weights for policy 0, policy_version 16470 (0.0031) [2024-06-12 14:37:37,534][65616] Updated weights for policy 0, policy_version 16480 (0.0029) [2024-06-12 14:37:39,332][65383] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 43153.8). Total num frames: 270057472. Throughput: 0: 42881.4. Samples: 270155480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 14:37:39,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:37:42,425][65616] Updated weights for policy 0, policy_version 16490 (0.0026) [2024-06-12 14:37:44,332][65383] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 43098.2). Total num frames: 270270464. Throughput: 0: 42719.1. Samples: 270404380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 14:37:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:37:45,417][65616] Updated weights for policy 0, policy_version 16500 (0.0024) [2024-06-12 14:37:49,332][65383] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42931.7). Total num frames: 270434304. Throughput: 0: 42713.4. Samples: 270539920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 14:37:49,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:37:49,939][65616] Updated weights for policy 0, policy_version 16510 (0.0030) [2024-06-12 14:37:52,825][65616] Updated weights for policy 0, policy_version 16520 (0.0026) [2024-06-12 14:37:54,336][65383] Fps is (10 sec: 42583.5, 60 sec: 43142.0, 300 sec: 43042.2). Total num frames: 270696448. Throughput: 0: 42953.2. Samples: 270797280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 14:37:54,336][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:37:54,348][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000016522_270696448.pth... 
[2024-06-12 14:37:54,414][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000015890_260341760.pth [2024-06-12 14:37:57,295][65616] Updated weights for policy 0, policy_version 16530 (0.0028) [2024-06-12 14:37:58,089][65595] Signal inference workers to stop experience collection... (3800 times) [2024-06-12 14:37:58,089][65595] Signal inference workers to resume experience collection... (3800 times) [2024-06-12 14:37:58,116][65616] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-06-12 14:37:58,116][65616] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-06-12 14:37:59,332][65383] Fps is (10 sec: 49151.5, 60 sec: 42871.5, 300 sec: 43209.3). Total num frames: 270925824. Throughput: 0: 42833.0. Samples: 271050620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 14:37:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:38:00,301][65616] Updated weights for policy 0, policy_version 16540 (0.0026) [2024-06-12 14:38:04,332][65383] Fps is (10 sec: 44252.1, 60 sec: 43417.6, 300 sec: 43153.8). Total num frames: 271138816. Throughput: 0: 43084.4. Samples: 271196760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 14:38:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:38:04,893][65616] Updated weights for policy 0, policy_version 16550 (0.0033) [2024-06-12 14:38:08,210][65616] Updated weights for policy 0, policy_version 16560 (0.0036) [2024-06-12 14:38:09,332][65383] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 271335424. Throughput: 0: 43068.4. Samples: 271449100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 14:38:09,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:38:12,338][65616] Updated weights for policy 0, policy_version 16570 (0.0022) [2024-06-12 14:38:14,332][65383] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 43209.3). Total num frames: 271581184. Throughput: 0: 43130.7. 
Samples: 271708720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 14:38:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:38:16,108][65616] Updated weights for policy 0, policy_version 16580 (0.0039) [2024-06-12 14:38:19,332][65383] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 271777792. Throughput: 0: 43307.2. Samples: 271842900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 14:38:19,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:38:19,682][65616] Updated weights for policy 0, policy_version 16590 (0.0022) [2024-06-12 14:38:23,621][65616] Updated weights for policy 0, policy_version 16600 (0.0025) [2024-06-12 14:38:24,333][65383] Fps is (10 sec: 39321.5, 60 sec: 42598.2, 300 sec: 42987.2). Total num frames: 271974400. Throughput: 0: 43210.9. Samples: 272099980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 14:38:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:38:27,127][65616] Updated weights for policy 0, policy_version 16610 (0.0028) [2024-06-12 14:38:29,332][65383] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 43209.3). Total num frames: 272220160. Throughput: 0: 43549.3. Samples: 272364100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 14:38:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:38:31,483][65616] Updated weights for policy 0, policy_version 16620 (0.0034) [2024-06-12 14:38:34,332][65383] Fps is (10 sec: 44237.6, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 272416768. Throughput: 0: 43259.1. Samples: 272486580. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 14:38:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:38:34,806][65616] Updated weights for policy 0, policy_version 16630 (0.0038) [2024-06-12 14:38:38,979][65616] Updated weights for policy 0, policy_version 16640 (0.0021) [2024-06-12 14:38:39,332][65383] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 43209.3). Total num frames: 272646144. Throughput: 0: 43174.5. Samples: 272739980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 14:38:39,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:38:42,790][65616] Updated weights for policy 0, policy_version 16650 (0.0030) [2024-06-12 14:38:44,332][65383] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 272842752. Throughput: 0: 43001.3. Samples: 272985680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 14:38:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:38:46,858][65616] Updated weights for policy 0, policy_version 16660 (0.0023) [2024-06-12 14:38:49,332][65383] Fps is (10 sec: 40960.1, 60 sec: 43690.6, 300 sec: 43153.8). Total num frames: 273055744. Throughput: 0: 42717.9. Samples: 273119060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 14:38:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:38:50,536][65616] Updated weights for policy 0, policy_version 16670 (0.0032) [2024-06-12 14:38:54,090][65616] Updated weights for policy 0, policy_version 16680 (0.0030) [2024-06-12 14:38:54,332][65383] Fps is (10 sec: 44237.1, 60 sec: 43147.1, 300 sec: 43098.2). Total num frames: 273285120. Throughput: 0: 42819.5. Samples: 273375980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-12 14:38:54,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:38:58,269][65616] Updated weights for policy 0, policy_version 16690 (0.0021) [2024-06-12 14:38:59,332][65383] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42987.2). 
Total num frames: 273465344. Throughput: 0: 42803.2. Samples: 273634860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-12 14:38:59,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:39:01,772][65616] Updated weights for policy 0, policy_version 16700 (0.0024) [2024-06-12 14:39:04,332][65383] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 43098.2). Total num frames: 273711104. Throughput: 0: 42660.4. Samples: 273762620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-12 14:39:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:39:05,945][65616] Updated weights for policy 0, policy_version 16710 (0.0027) [2024-06-12 14:39:09,170][65616] Updated weights for policy 0, policy_version 16720 (0.0034) [2024-06-12 14:39:09,332][65383] Fps is (10 sec: 47513.7, 60 sec: 43417.6, 300 sec: 43098.3). Total num frames: 273940480. Throughput: 0: 42941.0. Samples: 274032320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 14:39:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:39:13,003][65616] Updated weights for policy 0, policy_version 16730 (0.0032) [2024-06-12 14:39:14,332][65383] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 274120704. Throughput: 0: 42631.5. Samples: 274282520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 14:39:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:39:17,036][65616] Updated weights for policy 0, policy_version 16740 (0.0029) [2024-06-12 14:39:19,332][65383] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 43209.3). Total num frames: 274366464. Throughput: 0: 42923.4. Samples: 274418140. Policy #0 lag: (min: 1.0, avg: 12.6, max: 26.0) [2024-06-12 14:39:19,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:39:20,674][65616] Updated weights for policy 0, policy_version 16750 (0.0026) [2024-06-12 14:39:24,282][65595] Signal inference workers to stop experience collection... 
(3850 times) [2024-06-12 14:39:24,327][65616] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-06-12 14:39:24,332][65383] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 274546688. Throughput: 0: 42945.3. Samples: 274672520. Policy #0 lag: (min: 1.0, avg: 12.6, max: 26.0) [2024-06-12 14:39:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:39:24,389][65595] Signal inference workers to resume experience collection... (3850 times) [2024-06-12 14:39:24,389][65616] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-06-12 14:39:24,702][65616] Updated weights for policy 0, policy_version 16760 (0.0031) [2024-06-12 14:39:28,780][65616] Updated weights for policy 0, policy_version 16770 (0.0027) [2024-06-12 14:39:29,332][65383] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 274776064. Throughput: 0: 43249.0. Samples: 274931880. Policy #0 lag: (min: 1.0, avg: 12.6, max: 26.0) [2024-06-12 14:39:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:39:32,155][65616] Updated weights for policy 0, policy_version 16780 (0.0024) [2024-06-12 14:39:34,332][65383] Fps is (10 sec: 47513.4, 60 sec: 43417.5, 300 sec: 43264.9). Total num frames: 275021824. Throughput: 0: 43304.7. Samples: 275067780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 14:39:34,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:39:36,209][65616] Updated weights for policy 0, policy_version 16790 (0.0030) [2024-06-12 14:39:39,332][65383] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 275185664. Throughput: 0: 43242.7. Samples: 275321900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 14:39:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:39:39,858][65616] Updated weights for policy 0, policy_version 16800 (0.0030) [2024-06-12 14:39:44,058][65616] Updated weights for policy 0, policy_version 16810 (0.0031) [2024-06-12 14:39:44,332][65383] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 275415040. Throughput: 0: 43206.7. Samples: 275579160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 14:39:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:39:47,719][65616] Updated weights for policy 0, policy_version 16820 (0.0033) [2024-06-12 14:39:49,332][65383] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 275644416. Throughput: 0: 43232.4. Samples: 275708080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 14:39:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:39:51,569][65616] Updated weights for policy 0, policy_version 16830 (0.0038) [2024-06-12 14:39:54,332][65383] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 275873792. Throughput: 0: 42758.7. Samples: 275956460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 14:39:54,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:39:54,345][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000016838_275873792.pth... [2024-06-12 14:39:54,391][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000016207_265535488.pth [2024-06-12 14:39:54,834][65616] Updated weights for policy 0, policy_version 16840 (0.0032) [2024-06-12 14:39:59,031][65616] Updated weights for policy 0, policy_version 16850 (0.0032) [2024-06-12 14:39:59,333][65383] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 276070400. Throughput: 0: 43121.7. Samples: 276223000. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 14:39:59,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:40:02,567][65616] Updated weights for policy 0, policy_version 16860 (0.0031) [2024-06-12 14:40:04,333][65383] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 276283392. Throughput: 0: 42927.5. Samples: 276349880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 14:40:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:40:06,645][65616] Updated weights for policy 0, policy_version 16870 (0.0030) [2024-06-12 14:40:09,333][65383] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 43154.3). Total num frames: 276529152. Throughput: 0: 43084.4. Samples: 276611320. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-06-12 14:40:09,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:40:10,366][65616] Updated weights for policy 0, policy_version 16880 (0.0027) [2024-06-12 14:40:14,175][65616] Updated weights for policy 0, policy_version 16890 (0.0028) [2024-06-12 14:40:14,332][65383] Fps is (10 sec: 44237.4, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 276725760. Throughput: 0: 43000.9. Samples: 276866920. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-06-12 14:40:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:40:18,320][65616] Updated weights for policy 0, policy_version 16900 (0.0034) [2024-06-12 14:40:19,332][65383] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 43153.8). Total num frames: 276938752. Throughput: 0: 42789.0. Samples: 276993280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 14:40:19,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:40:22,182][65616] Updated weights for policy 0, policy_version 16910 (0.0030) [2024-06-12 14:40:24,332][65383] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 277151744. Throughput: 0: 42767.5. Samples: 277246440. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 14:40:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:40:26,046][65616] Updated weights for policy 0, policy_version 16920 (0.0026) [2024-06-12 14:40:29,332][65383] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 277364736. Throughput: 0: 42721.3. Samples: 277501620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 14:40:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:40:29,649][65616] Updated weights for policy 0, policy_version 16930 (0.0028) [2024-06-12 14:40:33,674][65616] Updated weights for policy 0, policy_version 16940 (0.0027) [2024-06-12 14:40:34,333][65383] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 43098.3). Total num frames: 277561344. Throughput: 0: 42731.5. Samples: 277631000. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-12 14:40:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:40:36,022][65595] Signal inference workers to stop experience collection... (3900 times) [2024-06-12 14:40:36,056][65616] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-06-12 14:40:36,067][65595] Signal inference workers to resume experience collection... (3900 times) [2024-06-12 14:40:36,088][65616] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-06-12 14:40:37,278][65616] Updated weights for policy 0, policy_version 16950 (0.0024) [2024-06-12 14:40:39,332][65383] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 43098.3). Total num frames: 277790720. Throughput: 0: 42939.1. Samples: 277888720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-12 14:40:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:40:40,998][65616] Updated weights for policy 0, policy_version 16960 (0.0026) [2024-06-12 14:40:44,332][65383] Fps is (10 sec: 45875.9, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 278020096. Throughput: 0: 42879.2. 
Samples: 278152560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 14:40:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:40:45,156][65616] Updated weights for policy 0, policy_version 16970 (0.0034) [2024-06-12 14:40:48,649][65616] Updated weights for policy 0, policy_version 16980 (0.0032) [2024-06-12 14:40:49,332][65383] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 278216704. Throughput: 0: 42749.0. Samples: 278273580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 14:40:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:40:52,748][65616] Updated weights for policy 0, policy_version 16990 (0.0029) [2024-06-12 14:40:54,332][65383] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 43153.8). Total num frames: 278446080. Throughput: 0: 42776.1. Samples: 278536240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 14:40:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:40:56,637][65616] Updated weights for policy 0, policy_version 17000 (0.0035) [2024-06-12 14:40:59,332][65383] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 278626304. Throughput: 0: 42712.4. Samples: 278788980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 14:40:59,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:41:00,377][65616] Updated weights for policy 0, policy_version 17010 (0.0038) [2024-06-12 14:41:04,332][65383] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 278839296. Throughput: 0: 42608.4. Samples: 278910660. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 14:41:04,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:41:04,449][65616] Updated weights for policy 0, policy_version 17020 (0.0030) [2024-06-12 14:41:07,750][65616] Updated weights for policy 0, policy_version 17030 (0.0030) [2024-06-12 14:41:09,333][65383] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 43042.7). Total num frames: 279068672. Throughput: 0: 42690.1. Samples: 279167500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 14:41:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:41:12,107][65616] Updated weights for policy 0, policy_version 17040 (0.0033) [2024-06-12 14:41:14,332][65383] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 279281664. Throughput: 0: 42763.1. Samples: 279425960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 14:41:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:41:15,107][65616] Updated weights for policy 0, policy_version 17050 (0.0025) [2024-06-12 14:41:19,332][65383] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 279478272. Throughput: 0: 42572.1. Samples: 279546740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 14:41:19,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:41:19,837][65616] Updated weights for policy 0, policy_version 17060 (0.0030) [2024-06-12 14:41:23,030][65616] Updated weights for policy 0, policy_version 17070 (0.0023) [2024-06-12 14:41:24,332][65383] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 279724032. Throughput: 0: 42569.4. Samples: 279804340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 14:41:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:41:27,646][65616] Updated weights for policy 0, policy_version 17080 (0.0028) [2024-06-12 14:41:29,332][65383] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). 
Total num frames: 279920640. Throughput: 0: 42393.4. Samples: 280060260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 14:41:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:41:31,072][65616] Updated weights for policy 0, policy_version 17090 (0.0033) [2024-06-12 14:41:34,332][65383] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 280133632. Throughput: 0: 42569.4. Samples: 280189200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 14:41:34,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:41:34,829][65616] Updated weights for policy 0, policy_version 17100 (0.0023) [2024-06-12 14:41:38,571][65616] Updated weights for policy 0, policy_version 17110 (0.0032) [2024-06-12 14:41:39,332][65383] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 280346624. Throughput: 0: 42475.7. Samples: 280447640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 14:41:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:41:42,654][65616] Updated weights for policy 0, policy_version 17120 (0.0026) [2024-06-12 14:41:44,332][65383] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 280559616. Throughput: 0: 42577.8. Samples: 280704980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 14:41:44,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:41:46,615][65616] Updated weights for policy 0, policy_version 17130 (0.0037) [2024-06-12 14:41:49,332][65383] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 280772608. Throughput: 0: 42760.9. Samples: 280834900. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 14:41:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:41:50,346][65616] Updated weights for policy 0, policy_version 17140 (0.0024) [2024-06-12 14:41:54,310][65616] Updated weights for policy 0, policy_version 17150 (0.0034) [2024-06-12 14:41:54,332][65383] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 280985600. Throughput: 0: 42767.2. Samples: 281092020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 14:41:54,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:41:54,480][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000017151_281001984.pth... [2024-06-12 14:41:54,524][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000016522_270696448.pth [2024-06-12 14:41:58,026][65616] Updated weights for policy 0, policy_version 17160 (0.0036) [2024-06-12 14:41:59,332][65383] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 281198592. Throughput: 0: 42492.0. Samples: 281338100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 14:41:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:42:01,922][65616] Updated weights for policy 0, policy_version 17170 (0.0026) [2024-06-12 14:42:04,332][65383] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 281411584. Throughput: 0: 42943.6. Samples: 281479200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 14:42:04,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:42:05,525][65616] Updated weights for policy 0, policy_version 17180 (0.0041) [2024-06-12 14:42:09,332][65383] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 281624576. Throughput: 0: 42674.2. Samples: 281724680. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 14:42:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:42:09,398][65616] Updated weights for policy 0, policy_version 17190 (0.0030) [2024-06-12 14:42:13,300][65616] Updated weights for policy 0, policy_version 17200 (0.0036) [2024-06-12 14:42:14,332][65383] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 281853952. Throughput: 0: 42736.0. Samples: 281983380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 14:42:14,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:42:17,291][65616] Updated weights for policy 0, policy_version 17210 (0.0031) [2024-06-12 14:42:19,333][65383] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 282066944. Throughput: 0: 42920.3. Samples: 282120620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 14:42:19,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:42:20,394][65616] Updated weights for policy 0, policy_version 17220 (0.0024) [2024-06-12 14:42:24,332][65383] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 282263552. Throughput: 0: 42939.4. Samples: 282379920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 14:42:24,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:42:24,646][65616] Updated weights for policy 0, policy_version 17230 (0.0025) [2024-06-12 14:42:25,402][65595] Signal inference workers to stop experience collection... (3950 times) [2024-06-12 14:42:25,423][65616] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-06-12 14:42:25,456][65595] Signal inference workers to resume experience collection... 
(3950 times) [2024-06-12 14:42:25,456][65616] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-06-12 14:42:28,003][65616] Updated weights for policy 0, policy_version 17240 (0.0034) [2024-06-12 14:42:29,335][65383] Fps is (10 sec: 44227.2, 60 sec: 43142.8, 300 sec: 43042.4). Total num frames: 282509312. Throughput: 0: 43040.9. Samples: 282641920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 14:42:29,335][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:42:32,205][65616] Updated weights for policy 0, policy_version 17250 (0.0026) [2024-06-12 14:42:34,333][65383] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 282722304. Throughput: 0: 43115.8. Samples: 282775120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 14:42:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:42:35,935][65616] Updated weights for policy 0, policy_version 17260 (0.0032) [2024-06-12 14:42:39,332][65383] Fps is (10 sec: 40969.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 282918912. Throughput: 0: 43191.7. Samples: 283035640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 14:42:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:42:40,211][65616] Updated weights for policy 0, policy_version 17270 (0.0025) [2024-06-12 14:42:43,410][65616] Updated weights for policy 0, policy_version 17280 (0.0032) [2024-06-12 14:42:44,332][65383] Fps is (10 sec: 44237.5, 60 sec: 43417.6, 300 sec: 43153.8). Total num frames: 283164672. Throughput: 0: 43420.8. Samples: 283292040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 14:42:44,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:42:47,607][65616] Updated weights for policy 0, policy_version 17290 (0.0033) [2024-06-12 14:42:49,332][65383] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42932.1). Total num frames: 283361280. Throughput: 0: 43103.0. Samples: 283418840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 14:42:49,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:42:51,275][65616] Updated weights for policy 0, policy_version 17300 (0.0030) [2024-06-12 14:42:54,332][65383] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 283557888. Throughput: 0: 43171.1. Samples: 283667380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 14:42:54,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:42:55,420][65616] Updated weights for policy 0, policy_version 17310 (0.0027) [2024-06-12 14:42:58,979][65616] Updated weights for policy 0, policy_version 17320 (0.0035) [2024-06-12 14:42:59,333][65383] Fps is (10 sec: 40959.3, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 283770880. Throughput: 0: 43065.6. Samples: 283921340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 14:42:59,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:43:03,138][65616] Updated weights for policy 0, policy_version 17330 (0.0034) [2024-06-12 14:43:04,332][65383] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 283983872. Throughput: 0: 42868.5. Samples: 284049700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 14:43:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:43:06,670][65616] Updated weights for policy 0, policy_version 17340 (0.0025) [2024-06-12 14:43:09,332][65383] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 284196864. Throughput: 0: 42709.4. Samples: 284301840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 14:43:09,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:43:10,957][65616] Updated weights for policy 0, policy_version 17350 (0.0030) [2024-06-12 14:43:13,680][65616] Updated weights for policy 0, policy_version 17360 (0.0035) [2024-06-12 14:43:14,332][65383] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42876.1). 
Total num frames: 284426240. Throughput: 0: 42619.5. Samples: 284559700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 14:43:14,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:43:18,185][65616] Updated weights for policy 0, policy_version 17370 (0.0031) [2024-06-12 14:43:19,332][65383] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 284622848. Throughput: 0: 42519.3. Samples: 284688480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 14:43:19,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:43:21,582][65616] Updated weights for policy 0, policy_version 17380 (0.0029) [2024-06-12 14:43:24,333][65383] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 284835840. Throughput: 0: 42330.5. Samples: 284940520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 14:43:24,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:43:25,727][65616] Updated weights for policy 0, policy_version 17390 (0.0025) [2024-06-12 14:43:29,332][65383] Fps is (10 sec: 42598.6, 60 sec: 42327.0, 300 sec: 42820.5). Total num frames: 285048832. Throughput: 0: 42484.0. Samples: 285203820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 14:43:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:43:29,812][65616] Updated weights for policy 0, policy_version 17400 (0.0029) [2024-06-12 14:43:33,818][65616] Updated weights for policy 0, policy_version 17410 (0.0022) [2024-06-12 14:43:34,332][65383] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 285261824. Throughput: 0: 42348.3. Samples: 285324520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 14:43:34,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:43:37,793][65616] Updated weights for policy 0, policy_version 17420 (0.0026) [2024-06-12 14:43:39,332][65383] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42765.0). 
Total num frames: 285458432. Throughput: 0: 42367.5. Samples: 285573920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) [2024-06-12 14:43:39,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:43:40,988][65616] Updated weights for policy 0, policy_version 17430 (0.0023) [2024-06-12 14:43:44,332][65383] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 285704192. Throughput: 0: 42496.0. Samples: 285833660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) [2024-06-12 14:43:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:43:45,447][65616] Updated weights for policy 0, policy_version 17440 (0.0023) [2024-06-12 14:43:48,668][65616] Updated weights for policy 0, policy_version 17450 (0.0030) [2024-06-12 14:43:49,332][65383] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 285900800. Throughput: 0: 42637.8. Samples: 285968400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-12 14:43:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:43:52,417][65616] Updated weights for policy 0, policy_version 17460 (0.0031) [2024-06-12 14:43:54,332][65383] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 286081024. Throughput: 0: 42521.9. Samples: 286215320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-12 14:43:54,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:43:54,339][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000017461_286081024.pth... [2024-06-12 14:43:54,383][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000016838_275873792.pth [2024-06-12 14:43:56,871][65616] Updated weights for policy 0, policy_version 17470 (0.0024) [2024-06-12 14:43:59,332][65383] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 286343168. Throughput: 0: 42308.5. Samples: 286463580. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-12 14:43:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:44:00,537][65616] Updated weights for policy 0, policy_version 17480 (0.0023) [2024-06-12 14:44:01,777][65595] Signal inference workers to stop experience collection... (4000 times) [2024-06-12 14:44:01,778][65595] Signal inference workers to resume experience collection... (4000 times) [2024-06-12 14:44:01,795][65616] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-06-12 14:44:01,795][65616] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-06-12 14:44:04,332][65383] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 286523392. Throughput: 0: 42512.5. Samples: 286601540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-12 14:44:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:44:04,841][65616] Updated weights for policy 0, policy_version 17490 (0.0031) [2024-06-12 14:44:08,889][65616] Updated weights for policy 0, policy_version 17500 (0.0029) [2024-06-12 14:44:09,332][65383] Fps is (10 sec: 37683.1, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 286720000. Throughput: 0: 42241.9. Samples: 286841400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-12 14:44:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:44:12,830][65616] Updated weights for policy 0, policy_version 17510 (0.0027) [2024-06-12 14:44:14,332][65383] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 286965760. Throughput: 0: 42139.0. Samples: 287100080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 14:44:14,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:44:16,666][65616] Updated weights for policy 0, policy_version 17520 (0.0028) [2024-06-12 14:44:19,332][65383] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 287145984. Throughput: 0: 42235.7. 
Samples: 287225120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 14:44:19,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:44:20,191][65616] Updated weights for policy 0, policy_version 17530 (0.0027) [2024-06-12 14:44:24,006][65616] Updated weights for policy 0, policy_version 17540 (0.0029) [2024-06-12 14:44:24,332][65383] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 287375360. Throughput: 0: 42436.5. Samples: 287483560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 14:44:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:44:27,913][65616] Updated weights for policy 0, policy_version 17550 (0.0034) [2024-06-12 14:44:29,332][65383] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 287604736. Throughput: 0: 42388.9. Samples: 287741160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-12 14:44:29,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:44:31,301][65616] Updated weights for policy 0, policy_version 17560 (0.0033) [2024-06-12 14:44:34,332][65383] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 287801344. Throughput: 0: 42231.5. Samples: 287868820. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-12 14:44:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:44:35,658][65616] Updated weights for policy 0, policy_version 17570 (0.0030) [2024-06-12 14:44:39,156][65616] Updated weights for policy 0, policy_version 17580 (0.0033) [2024-06-12 14:44:39,332][65383] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 288030720. Throughput: 0: 42352.8. Samples: 288121200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 14:44:39,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:44:43,023][65616] Updated weights for policy 0, policy_version 17590 (0.0034) [2024-06-12 14:44:44,332][65383] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 288243712. Throughput: 0: 42776.4. Samples: 288388520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 14:44:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:44:46,941][65616] Updated weights for policy 0, policy_version 17600 (0.0031) [2024-06-12 14:44:49,332][65383] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 288440320. Throughput: 0: 42495.9. Samples: 288513860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 14:44:49,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:44:51,197][65616] Updated weights for policy 0, policy_version 17610 (0.0033) [2024-06-12 14:44:54,280][65616] Updated weights for policy 0, policy_version 17620 (0.0030) [2024-06-12 14:44:54,332][65383] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 288686080. Throughput: 0: 42684.8. Samples: 288762220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 14:44:54,341][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:44:58,790][65616] Updated weights for policy 0, policy_version 17630 (0.0026) [2024-06-12 14:44:59,332][65383] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 288866304. Throughput: 0: 42696.1. Samples: 289021400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 14:44:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:45:02,119][65616] Updated weights for policy 0, policy_version 17640 (0.0035) [2024-06-12 14:45:04,332][65383] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 289079296. Throughput: 0: 42750.6. Samples: 289148900. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 14:45:04,333][65383] Avg episode reward: [(0, '0.091')] [2024-06-12 14:45:06,071][65616] Updated weights for policy 0, policy_version 17650 (0.0024) [2024-06-12 14:45:09,332][65383] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 289308672. Throughput: 0: 42824.9. Samples: 289410680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 14:45:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:45:09,918][65616] Updated weights for policy 0, policy_version 17660 (0.0026) [2024-06-12 14:45:13,583][65616] Updated weights for policy 0, policy_version 17670 (0.0029) [2024-06-12 14:45:14,332][65383] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 289521664. Throughput: 0: 42723.6. Samples: 289663720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 14:45:14,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:45:17,495][65616] Updated weights for policy 0, policy_version 17680 (0.0029) [2024-06-12 14:45:19,332][65383] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 289751040. Throughput: 0: 42923.6. Samples: 289800380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 14:45:19,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:45:21,251][65616] Updated weights for policy 0, policy_version 17690 (0.0026) [2024-06-12 14:45:24,332][65383] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 289931264. Throughput: 0: 43154.7. Samples: 290063160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 14:45:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:45:25,137][65616] Updated weights for policy 0, policy_version 17700 (0.0024) [2024-06-12 14:45:28,591][65616] Updated weights for policy 0, policy_version 17710 (0.0039) [2024-06-12 14:45:29,332][65383] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). 
Total num frames: 290160640. Throughput: 0: 42703.5. Samples: 290310180. Policy #0 lag: (min: 0.0, avg: 13.3, max: 25.0) [2024-06-12 14:45:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:45:32,763][65616] Updated weights for policy 0, policy_version 17720 (0.0026) [2024-06-12 14:45:34,333][65383] Fps is (10 sec: 47513.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 290406400. Throughput: 0: 42970.6. Samples: 290447540. Policy #0 lag: (min: 0.0, avg: 13.3, max: 25.0) [2024-06-12 14:45:34,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:45:36,295][65616] Updated weights for policy 0, policy_version 17730 (0.0024) [2024-06-12 14:45:39,332][65383] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 290586624. Throughput: 0: 43177.0. Samples: 290705180. Policy #0 lag: (min: 0.0, avg: 13.3, max: 25.0) [2024-06-12 14:45:39,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:45:40,717][65616] Updated weights for policy 0, policy_version 17740 (0.0030) [2024-06-12 14:45:43,614][65616] Updated weights for policy 0, policy_version 17750 (0.0030) [2024-06-12 14:45:44,332][65383] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 290832384. Throughput: 0: 42750.3. Samples: 290945160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-12 14:45:44,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:45:48,787][65616] Updated weights for policy 0, policy_version 17760 (0.0028) [2024-06-12 14:45:49,332][65383] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 290996224. Throughput: 0: 42969.8. Samples: 291082540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-12 14:45:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:45:51,357][65616] Updated weights for policy 0, policy_version 17770 (0.0027) [2024-06-12 14:45:54,332][65383] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 291225600. 
Throughput: 0: 42728.0. Samples: 291333440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 14:45:54,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:45:54,345][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000017775_291225600.pth... [2024-06-12 14:45:54,391][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000017151_281001984.pth [2024-06-12 14:45:56,528][65616] Updated weights for policy 0, policy_version 17780 (0.0033) [2024-06-12 14:45:57,797][65595] Signal inference workers to stop experience collection... (4050 times) [2024-06-12 14:45:57,797][65595] Signal inference workers to resume experience collection... (4050 times) [2024-06-12 14:45:57,811][65616] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-06-12 14:45:57,811][65616] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-06-12 14:45:58,924][65616] Updated weights for policy 0, policy_version 17790 (0.0027) [2024-06-12 14:45:59,332][65383] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 291471360. Throughput: 0: 42663.6. Samples: 291583580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 14:45:59,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:46:04,332][65383] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 291618816. Throughput: 0: 42327.1. Samples: 291705100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 14:46:04,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:46:04,360][65616] Updated weights for policy 0, policy_version 17800 (0.0029) [2024-06-12 14:46:06,896][65616] Updated weights for policy 0, policy_version 17810 (0.0028) [2024-06-12 14:46:09,332][65383] Fps is (10 sec: 37683.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 291848192. Throughput: 0: 41872.5. Samples: 291947420. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 14:46:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:46:12,527][65616] Updated weights for policy 0, policy_version 17820 (0.0033) [2024-06-12 14:46:14,332][65383] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 292077568. Throughput: 0: 42212.0. Samples: 292209720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 14:46:14,333][65383] Avg episode reward: [(0, '0.089')] [2024-06-12 14:46:14,890][65616] Updated weights for policy 0, policy_version 17830 (0.0031) [2024-06-12 14:46:19,332][65383] Fps is (10 sec: 37682.7, 60 sec: 41233.1, 300 sec: 42376.2). Total num frames: 292225024. Throughput: 0: 41973.8. Samples: 292336360. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 14:46:19,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:46:19,973][65616] Updated weights for policy 0, policy_version 17840 (0.0032) [2024-06-12 14:46:22,707][65616] Updated weights for policy 0, policy_version 17850 (0.0025) [2024-06-12 14:46:24,332][65383] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 292503552. Throughput: 0: 42022.2. Samples: 292596180. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 14:46:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:46:27,629][65616] Updated weights for policy 0, policy_version 17860 (0.0027) [2024-06-12 14:46:29,332][65383] Fps is (10 sec: 47514.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 292700160. Throughput: 0: 42192.0. Samples: 292843800. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 14:46:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:46:30,893][65616] Updated weights for policy 0, policy_version 17870 (0.0030) [2024-06-12 14:46:34,332][65383] Fps is (10 sec: 37683.5, 60 sec: 41233.2, 300 sec: 42487.3). Total num frames: 292880384. Throughput: 0: 41903.2. Samples: 292968180. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 14:46:34,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:46:35,099][65616] Updated weights for policy 0, policy_version 17880 (0.0030) [2024-06-12 14:46:38,471][65616] Updated weights for policy 0, policy_version 17890 (0.0022) [2024-06-12 14:46:39,332][65383] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 293142528. Throughput: 0: 42132.0. Samples: 293229380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 14:46:39,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:46:42,863][65616] Updated weights for policy 0, policy_version 17900 (0.0022) [2024-06-12 14:46:44,332][65383] Fps is (10 sec: 47513.1, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 293355520. Throughput: 0: 42249.3. Samples: 293484800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 23.0) [2024-06-12 14:46:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:46:46,349][65616] Updated weights for policy 0, policy_version 17910 (0.0036) [2024-06-12 14:46:49,332][65383] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 293552128. Throughput: 0: 42485.8. Samples: 293616960. Policy #0 lag: (min: 1.0, avg: 9.4, max: 23.0) [2024-06-12 14:46:49,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:46:50,207][65616] Updated weights for policy 0, policy_version 17920 (0.0035) [2024-06-12 14:46:53,955][65616] Updated weights for policy 0, policy_version 17930 (0.0035) [2024-06-12 14:46:54,332][65383] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 293765120. Throughput: 0: 42745.7. Samples: 293870980. Policy #0 lag: (min: 1.0, avg: 9.4, max: 23.0) [2024-06-12 14:46:54,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:46:57,669][65616] Updated weights for policy 0, policy_version 17940 (0.0025) [2024-06-12 14:46:59,332][65383] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 42542.9). 
Total num frames: 293961728. Throughput: 0: 42667.6. Samples: 294129760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 14:46:59,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:47:01,762][65616] Updated weights for policy 0, policy_version 17950 (0.0033) [2024-06-12 14:47:04,332][65383] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 294191104. Throughput: 0: 42626.4. Samples: 294254540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 14:47:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:47:05,425][65616] Updated weights for policy 0, policy_version 17960 (0.0038) [2024-06-12 14:47:09,332][65383] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 294387712. Throughput: 0: 42262.3. Samples: 294497980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 14:47:09,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:47:09,794][65616] Updated weights for policy 0, policy_version 17970 (0.0030) [2024-06-12 14:47:13,586][65616] Updated weights for policy 0, policy_version 17980 (0.0028) [2024-06-12 14:47:14,332][65383] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 294600704. Throughput: 0: 42382.1. Samples: 294751000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-12 14:47:14,342][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:47:15,196][65595] Signal inference workers to stop experience collection... (4100 times) [2024-06-12 14:47:15,235][65616] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-06-12 14:47:15,243][65595] Signal inference workers to resume experience collection... 
(4100 times) [2024-06-12 14:47:15,253][65616] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-06-12 14:47:17,402][65616] Updated weights for policy 0, policy_version 17990 (0.0031) [2024-06-12 14:47:19,332][65383] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 294813696. Throughput: 0: 42528.4. Samples: 294881960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-12 14:47:19,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:47:21,155][65616] Updated weights for policy 0, policy_version 18000 (0.0031) [2024-06-12 14:47:24,332][65383] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 295043072. Throughput: 0: 42323.2. Samples: 295133920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-12 14:47:24,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:47:24,624][65616] Updated weights for policy 0, policy_version 18010 (0.0025) [2024-06-12 14:47:29,072][65616] Updated weights for policy 0, policy_version 18020 (0.0034) [2024-06-12 14:47:29,332][65383] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 295239680. Throughput: 0: 42191.1. Samples: 295383400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-12 14:47:29,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:47:32,576][65616] Updated weights for policy 0, policy_version 18030 (0.0023) [2024-06-12 14:47:34,333][65383] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 295469056. Throughput: 0: 42055.9. Samples: 295509480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-12 14:47:34,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:47:37,043][65616] Updated weights for policy 0, policy_version 18040 (0.0031) [2024-06-12 14:47:39,333][65383] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 295665664. Throughput: 0: 42058.1. Samples: 295763600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 14:47:39,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:47:40,616][65616] Updated weights for policy 0, policy_version 18050 (0.0027) [2024-06-12 14:47:44,332][65383] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 295862272. Throughput: 0: 41816.4. Samples: 296011500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 14:47:44,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:47:44,892][65616] Updated weights for policy 0, policy_version 18060 (0.0027) [2024-06-12 14:47:48,606][65616] Updated weights for policy 0, policy_version 18070 (0.0028) [2024-06-12 14:47:49,332][65383] Fps is (10 sec: 44238.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 296108032. Throughput: 0: 41845.8. Samples: 296137600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 14:47:49,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:47:52,717][65616] Updated weights for policy 0, policy_version 18080 (0.0031) [2024-06-12 14:47:54,336][65383] Fps is (10 sec: 40945.7, 60 sec: 41776.8, 300 sec: 42375.8). Total num frames: 296271872. Throughput: 0: 41941.2. Samples: 296385480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 14:47:54,336][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:47:54,358][65595] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000018083_296271872.pth... [2024-06-12 14:47:54,404][65595] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000017461_286081024.pth [2024-06-12 14:47:56,221][65616] Updated weights for policy 0, policy_version 18090 (0.0025) [2024-06-12 14:47:59,333][65383] Fps is (10 sec: 37682.4, 60 sec: 42052.1, 300 sec: 42376.2). Total num frames: 296484864. Throughput: 0: 41944.8. Samples: 296638520. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 14:47:59,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:48:00,594][65616] Updated weights for policy 0, policy_version 18100 (0.0028) [2024-06-12 14:48:04,274][65616] Updated weights for policy 0, policy_version 18110 (0.0027) [2024-06-12 14:48:04,333][65383] Fps is (10 sec: 44251.8, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 296714240. Throughput: 0: 41902.1. Samples: 296767560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 14:48:04,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:48:08,179][65616] Updated weights for policy 0, policy_version 18120 (0.0025) [2024-06-12 14:48:09,332][65383] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 296894464. Throughput: 0: 42076.9. Samples: 297027380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 14:48:09,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:48:12,367][65616] Updated weights for policy 0, policy_version 18130 (0.0029) [2024-06-12 14:48:14,336][65383] Fps is (10 sec: 42584.1, 60 sec: 42322.9, 300 sec: 42431.3). Total num frames: 297140224. Throughput: 0: 41787.9. Samples: 297264000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 14:48:14,336][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:48:16,354][65616] Updated weights for policy 0, policy_version 18140 (0.0033) [2024-06-12 14:48:19,332][65383] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 297320448. Throughput: 0: 41918.8. Samples: 297395820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 14:48:19,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:48:20,188][65616] Updated weights for policy 0, policy_version 18150 (0.0029) [2024-06-12 14:48:24,332][65383] Fps is (10 sec: 37696.2, 60 sec: 41233.1, 300 sec: 42265.2). Total num frames: 297517056. Throughput: 0: 41798.8. Samples: 297644540. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 14:48:24,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:48:24,591][65616] Updated weights for policy 0, policy_version 18160 (0.0024) [2024-06-12 14:48:27,935][65616] Updated weights for policy 0, policy_version 18170 (0.0023) [2024-06-12 14:48:29,332][65383] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 297762816. Throughput: 0: 41808.4. Samples: 297892880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 14:48:29,333][65383] Avg episode reward: [(0, '0.092')] [2024-06-12 14:48:31,911][65616] Updated weights for policy 0, policy_version 18180 (0.0027) [2024-06-12 14:48:34,332][65383] Fps is (10 sec: 42598.8, 60 sec: 41233.2, 300 sec: 42320.7). Total num frames: 297943040. Throughput: 0: 41952.0. Samples: 298025440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 14:48:34,333][65383] Avg episode reward: [(0, '0.093')] [2024-06-12 14:48:34,614][65595] Signal inference workers to stop experience collection... (4150 times) [2024-06-12 14:48:34,615][65595] Signal inference workers to resume experience collection... (4150 times) [2024-06-12 14:48:34,624][65616] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-06-12 14:48:34,624][65616] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-06-12 14:48:35,836][65616] Updated weights for policy 0, policy_version 18190 (0.0024) [2024-06-12 14:48:39,332][65383] Fps is (10 sec: 40960.6, 60 sec: 41779.4, 300 sec: 42265.2). Total num frames: 298172416. Throughput: 0: 41871.3. Samples: 298269540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 14:48:39,333][65383] Avg episode reward: [(0, '0.094')] [2024-06-12 14:49:19,273][67877] Saving configuration to /workspace/metta/train_dir/p2.death/config.json... 
[2024-06-12 14:49:19,289][67877] Rollout worker 0 uses device cpu [2024-06-12 14:49:19,290][67877] Rollout worker 1 uses device cpu [2024-06-12 14:49:19,290][67877] Rollout worker 2 uses device cpu [2024-06-12 14:49:19,290][67877] Rollout worker 3 uses device cpu [2024-06-12 14:49:19,290][67877] Rollout worker 4 uses device cpu [2024-06-12 14:49:19,290][67877] Rollout worker 5 uses device cpu [2024-06-12 14:49:19,290][67877] Rollout worker 6 uses device cpu [2024-06-12 14:49:19,290][67877] Rollout worker 7 uses device cpu [2024-06-12 14:49:19,290][67877] Rollout worker 8 uses device cpu [2024-06-12 14:49:19,291][67877] Rollout worker 9 uses device cpu [2024-06-12 14:49:19,291][67877] Rollout worker 10 uses device cpu [2024-06-12 14:49:19,291][67877] Rollout worker 11 uses device cpu [2024-06-12 14:49:19,291][67877] Rollout worker 12 uses device cpu [2024-06-12 14:49:19,291][67877] Rollout worker 13 uses device cpu [2024-06-12 14:49:19,291][67877] Rollout worker 14 uses device cpu [2024-06-12 14:49:19,291][67877] Rollout worker 15 uses device cpu [2024-06-12 14:49:19,291][67877] Rollout worker 16 uses device cpu [2024-06-12 14:49:19,292][67877] Rollout worker 17 uses device cpu [2024-06-12 14:49:19,292][67877] Rollout worker 18 uses device cpu [2024-06-12 14:49:19,292][67877] Rollout worker 19 uses device cpu [2024-06-12 14:49:19,292][67877] Rollout worker 20 uses device cpu [2024-06-12 14:49:19,292][67877] Rollout worker 21 uses device cpu [2024-06-12 14:49:19,292][67877] Rollout worker 22 uses device cpu [2024-06-12 14:49:19,292][67877] Rollout worker 23 uses device cpu [2024-06-12 14:49:19,292][67877] Rollout worker 24 uses device cpu [2024-06-12 14:49:19,292][67877] Rollout worker 25 uses device cpu [2024-06-12 14:49:19,292][67877] Rollout worker 26 uses device cpu [2024-06-12 14:49:19,292][67877] Rollout worker 27 uses device cpu [2024-06-12 14:49:19,293][67877] Rollout worker 28 uses device cpu [2024-06-12 14:49:19,293][67877] Rollout worker 29 uses device cpu 
[2024-06-12 14:49:19,293][67877] Rollout worker 30 uses device cpu [2024-06-12 14:49:19,293][67877] Rollout worker 31 uses device cpu [2024-06-12 14:49:19,849][67877] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 14:49:19,849][67877] InferenceWorker_p0-w0: min num requests: 10 [2024-06-12 14:49:19,891][67877] Starting all processes... [2024-06-12 14:49:19,891][67877] Starting process learner_proc0 [2024-06-12 14:49:20,154][67877] Starting all processes... [2024-06-12 14:49:20,157][67877] Starting process inference_proc0-0 [2024-06-12 14:49:20,157][67877] Starting process rollout_proc0 [2024-06-12 14:49:20,157][67877] Starting process rollout_proc1 [2024-06-12 14:49:20,157][67877] Starting process rollout_proc2 [2024-06-12 14:49:20,158][67877] Starting process rollout_proc3 [2024-06-12 14:49:20,159][67877] Starting process rollout_proc4 [2024-06-12 14:49:20,161][67877] Starting process rollout_proc5 [2024-06-12 14:49:20,162][67877] Starting process rollout_proc6 [2024-06-12 14:49:20,162][67877] Starting process rollout_proc7 [2024-06-12 14:49:20,162][67877] Starting process rollout_proc8 [2024-06-12 14:49:20,163][67877] Starting process rollout_proc9 [2024-06-12 14:49:20,163][67877] Starting process rollout_proc10 [2024-06-12 14:49:20,163][67877] Starting process rollout_proc11 [2024-06-12 14:49:20,163][67877] Starting process rollout_proc12 [2024-06-12 14:49:20,164][67877] Starting process rollout_proc13 [2024-06-12 14:49:20,164][67877] Starting process rollout_proc14 [2024-06-12 14:49:20,165][67877] Starting process rollout_proc15 [2024-06-12 14:49:20,165][67877] Starting process rollout_proc16 [2024-06-12 14:49:20,165][67877] Starting process rollout_proc17 [2024-06-12 14:49:20,165][67877] Starting process rollout_proc18 [2024-06-12 14:49:20,166][67877] Starting process rollout_proc19 [2024-06-12 14:49:20,168][67877] Starting process rollout_proc20 [2024-06-12 14:49:20,168][67877] Starting process rollout_proc21 [2024-06-12 
14:49:20,171][67877] Starting process rollout_proc22 [2024-06-12 14:49:20,171][67877] Starting process rollout_proc23 [2024-06-12 14:49:20,174][67877] Starting process rollout_proc24 [2024-06-12 14:49:20,174][67877] Starting process rollout_proc25 [2024-06-12 14:49:20,176][67877] Starting process rollout_proc26 [2024-06-12 14:49:20,176][67877] Starting process rollout_proc27 [2024-06-12 14:49:20,177][67877] Starting process rollout_proc28 [2024-06-12 14:49:20,179][67877] Starting process rollout_proc29 [2024-06-12 14:49:20,180][67877] Starting process rollout_proc30 [2024-06-12 14:49:20,183][67877] Starting process rollout_proc31 [2024-06-12 14:49:22,251][68114] Worker 4 uses CPU cores [4] [2024-06-12 14:49:22,253][68109] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 14:49:22,253][68109] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-06-12 14:49:22,264][68109] Num visible devices: 1 [2024-06-12 14:49:22,282][68124] Worker 13 uses CPU cores [13] [2024-06-12 14:49:22,324][68113] Worker 3 uses CPU cores [3] [2024-06-12 14:49:22,331][68121] Worker 11 uses CPU cores [11] [2024-06-12 14:49:22,347][68135] Worker 25 uses CPU cores [25] [2024-06-12 14:49:22,376][68136] Worker 26 uses CPU cores [26] [2024-06-12 14:49:22,378][68122] Worker 10 uses CPU cores [10] [2024-06-12 14:49:22,408][68133] Worker 23 uses CPU cores [23] [2024-06-12 14:49:22,419][68119] Worker 9 uses CPU cores [9] [2024-06-12 14:49:22,424][68118] Worker 8 uses CPU cores [8] [2024-06-12 14:49:22,432][68131] Worker 22 uses CPU cores [22] [2024-06-12 14:49:22,447][68140] Worker 30 uses CPU cores [30] [2024-06-12 14:49:22,495][68129] Worker 19 uses CPU cores [19] [2024-06-12 14:49:22,499][68116] Worker 6 uses CPU cores [6] [2024-06-12 14:49:22,500][68123] Worker 14 uses CPU cores [14] [2024-06-12 14:49:22,507][68110] Worker 0 uses CPU cores [0] [2024-06-12 14:49:22,515][68128] Worker 17 uses CPU cores [17] [2024-06-12 14:49:22,516][68132] 
Worker 21 uses CPU cores [21] [2024-06-12 14:49:22,517][68120] Worker 12 uses CPU cores [12] [2024-06-12 14:49:22,536][68115] Worker 5 uses CPU cores [5] [2024-06-12 14:49:22,554][68089] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 14:49:22,554][68089] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-06-12 14:49:22,563][68089] Num visible devices: 1 [2024-06-12 14:49:22,572][68117] Worker 7 uses CPU cores [7] [2024-06-12 14:49:22,584][68089] Setting fixed seed 0 [2024-06-12 14:49:22,585][68089] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 14:49:22,585][68089] Initializing actor-critic model on device cuda:0 [2024-06-12 14:49:22,588][68112] Worker 2 uses CPU cores [2] [2024-06-12 14:49:22,589][68130] Worker 20 uses CPU cores [20] [2024-06-12 14:49:22,595][68134] Worker 24 uses CPU cores [24] [2024-06-12 14:49:22,645][68138] Worker 28 uses CPU cores [28] [2024-06-12 14:49:22,648][68137] Worker 29 uses CPU cores [29] [2024-06-12 14:49:22,656][68141] Worker 31 uses CPU cores [31] [2024-06-12 14:49:22,669][68111] Worker 1 uses CPU cores [1] [2024-06-12 14:49:22,682][68126] Worker 15 uses CPU cores [15] [2024-06-12 14:49:22,695][68125] Worker 16 uses CPU cores [16] [2024-06-12 14:49:22,726][68139] Worker 27 uses CPU cores [27] [2024-06-12 14:49:22,737][68127] Worker 18 uses CPU cores [18] [2024-06-12 14:49:23,267][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,267][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,267][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,267][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,267][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,267][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,267][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,267][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,268][68089] 
RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,268][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,268][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,268][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,268][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,268][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,268][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,268][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,268][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,268][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,268][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,268][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,268][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,268][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,271][68089] RunningMeanStd input shape: (1,) [2024-06-12 14:49:23,272][68089] RunningMeanStd input shape: (1,) [2024-06-12 14:49:23,272][68089] RunningMeanStd input shape: (1,) [2024-06-12 14:49:23,272][68089] RunningMeanStd input shape: (1,) [2024-06-12 14:49:23,272][68089] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:23,312][68089] RunningMeanStd input shape: (1,) [2024-06-12 14:49:23,316][68089] Created Actor Critic model with architecture: [2024-06-12 14:49:23,316][68089] SampleFactoryAgentWrapper( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (agent): MettaAgent( (_encoder): MultiFeatureSetEncoder( (feature_set_encoders): ModuleDict( (grid_obs): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (converter): RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): 
RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:hp): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv_r1): RunningMeanStdInPlace() (agent:inv_r2): RunningMeanStdInPlace() (agent:inv_r3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (altar:hp): RunningMeanStdInPlace() (altar:state): RunningMeanStdInPlace() (converter:hp): RunningMeanStdInPlace() (converter:state): RunningMeanStdInPlace() (generator:amount): RunningMeanStdInPlace() (generator:hp): RunningMeanStdInPlace() (generator:state): RunningMeanStdInPlace() (wall:hp): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (global_vars): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (_steps): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_action): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_reward): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_reward): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): 
ELU(alpha=1.0) ) ) (kinship): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (kinship): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) ) (merged_encoder): Sequential( (0): Linear(in_features=544, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (_core): ModelCoreRNN( (core): GRU(512, 512) ) (_decoder): Decoder( (mlp): Identity() ) (_critic_linear): Linear(in_features=512, out_features=1, bias=True) (_action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) ) [2024-06-12 14:49:23,389][68089] Using optimizer [2024-06-12 14:49:23,571][68089] Loading state from checkpoint /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000018083_296271872.pth... [2024-06-12 14:49:23,586][68089] Loading model from checkpoint [2024-06-12 14:49:23,588][68089] Loaded experiment state at self.train_step=18083, self.env_steps=296271872 [2024-06-12 14:49:23,588][68089] Initialized policy 0 weights for model version 18083 [2024-06-12 14:49:23,590][68089] LearnerWorker_p0 finished initialization! 
[2024-06-12 14:49:23,590][68089] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 14:49:24,295][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,295][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,295][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,295][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,295][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,295][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,295][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,296][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,299][68109] RunningMeanStd input shape: (1,) [2024-06-12 14:49:24,300][68109] RunningMeanStd input shape: (1,) [2024-06-12 14:49:24,300][68109] RunningMeanStd input shape: (1,) [2024-06-12 14:49:24,300][68109] RunningMeanStd input shape: (1,) [2024-06-12 14:49:24,300][68109] RunningMeanStd input shape: (11, 11) [2024-06-12 14:49:24,340][68109] 
RunningMeanStd input shape: (1,) [2024-06-12 14:49:24,361][67877] Inference worker 0-0 is ready! [2024-06-12 14:49:24,361][67877] All inference workers are ready! Signal rollout workers to start! [2024-06-12 14:49:26,685][68132] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,689][68130] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,689][68129] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,693][68133] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,694][68135] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,695][68131] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,696][68127] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,700][68138] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,710][68125] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,710][68134] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,711][68136] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,712][68141] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,713][68140] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,718][68137] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,731][68128] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,740][68139] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,773][68117] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,776][68115] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,779][68126] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,781][68111] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,785][68123] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,786][68121] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,789][68118] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,791][68124] Decorrelating experience for 0 frames... 
[2024-06-12 14:49:26,794][68119] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,796][68114] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,801][68110] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,801][68116] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,804][68120] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,814][68113] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,815][68112] Decorrelating experience for 0 frames... [2024-06-12 14:49:26,824][68122] Decorrelating experience for 0 frames... [2024-06-12 14:49:27,027][67877] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 296271872. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-12 14:49:28,004][68130] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,007][68135] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,009][68129] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,009][68127] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,013][68132] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,015][68133] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,030][68131] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,039][68125] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,044][68134] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,044][68136] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,052][68137] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,053][68128] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,056][68140] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,058][68138] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,060][68141] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,080][68117] Decorrelating experience for 256 frames... 
[2024-06-12 14:49:28,085][68115] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,095][68126] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,101][68121] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,103][68139] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,107][68111] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,112][68124] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,113][68123] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,113][68118] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,117][68116] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,125][68114] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,129][68119] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,131][68110] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,132][68120] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,144][68112] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,145][68122] Decorrelating experience for 256 frames... [2024-06-12 14:49:28,148][68113] Decorrelating experience for 256 frames... [2024-06-12 14:49:32,026][67877] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 296271872. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-12 14:49:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:49:37,027][67877] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 296271872. Throughput: 0: 22684.0. Samples: 226840. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-12 14:49:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:49:39,343][68111] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-06-12 14:49:39,352][68123] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-06-12 14:49:39,416][68126] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-06-12 14:49:39,460][68114] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-06-12 14:49:39,478][68116] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-06-12 14:49:39,514][68124] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-06-12 14:49:39,519][68122] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-06-12 14:49:39,522][68120] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-06-12 14:49:39,572][68121] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-06-12 14:49:39,575][68112] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-06-12 14:49:39,585][68119] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-06-12 14:49:39,590][68118] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-06-12 14:49:39,598][68115] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-06-12 14:49:39,602][68089] Signal inference workers to stop experience collection... 
[2024-06-12 14:49:39,615][68109] InferenceWorker_p0-w0: stopping experience collection [2024-06-12 14:49:39,846][67877] Heartbeat connected on Batcher_0 [2024-06-12 14:49:39,849][67877] Heartbeat connected on InferenceWorker_p0-w0 [2024-06-12 14:49:39,851][67877] Heartbeat connected on RolloutWorker_w0 [2024-06-12 14:49:39,855][67877] Heartbeat connected on RolloutWorker_w3 [2024-06-12 14:49:39,860][67877] Heartbeat connected on RolloutWorker_w7 [2024-06-12 14:49:39,872][67877] Heartbeat connected on RolloutWorker_w16 [2024-06-12 14:49:39,873][67877] Heartbeat connected on RolloutWorker_w17 [2024-06-12 14:49:39,874][67877] Heartbeat connected on RolloutWorker_w18 [2024-06-12 14:49:39,875][67877] Heartbeat connected on RolloutWorker_w19 [2024-06-12 14:49:39,877][67877] Heartbeat connected on RolloutWorker_w20 [2024-06-12 14:49:39,879][67877] Heartbeat connected on RolloutWorker_w22 [2024-06-12 14:49:39,880][67877] Heartbeat connected on RolloutWorker_w23 [2024-06-12 14:49:39,882][67877] Heartbeat connected on RolloutWorker_w24 [2024-06-12 14:49:39,883][67877] Heartbeat connected on RolloutWorker_w25 [2024-06-12 14:49:39,884][67877] Heartbeat connected on RolloutWorker_w26 [2024-06-12 14:49:39,886][67877] Heartbeat connected on RolloutWorker_w27 [2024-06-12 14:49:39,887][67877] Heartbeat connected on RolloutWorker_w28 [2024-06-12 14:49:39,888][67877] Heartbeat connected on RolloutWorker_w29 [2024-06-12 14:49:39,889][67877] Heartbeat connected on RolloutWorker_w30 [2024-06-12 14:49:39,914][67877] Heartbeat connected on RolloutWorker_w31 [2024-06-12 14:49:39,962][67877] Heartbeat connected on RolloutWorker_w21 [2024-06-12 14:49:40,141][68089] Signal inference workers to resume experience collection... 
[2024-06-12 14:49:40,141][68109] InferenceWorker_p0-w0: resuming experience collection [2024-06-12 14:49:40,157][68113] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-06-12 14:49:40,163][68117] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-06-12 14:49:40,352][67877] Heartbeat connected on LearnerWorker_p0 [2024-06-12 14:49:40,606][68129] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-06-12 14:49:40,691][68133] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-06-12 14:49:40,702][68135] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-06-12 14:49:40,708][68130] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-06-12 14:49:40,708][68132] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-06-12 14:49:40,708][68127] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-06-12 14:49:40,714][68141] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-06-12 14:49:40,725][68128] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-06-12 14:49:40,768][68131] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-06-12 14:49:40,852][68125] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-06-12 14:49:40,856][68136] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-06-12 14:49:40,884][68137] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-06-12 14:49:40,885][68140] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-06-12 14:49:40,889][68138] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-06-12 14:49:40,957][68134] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-06-12 14:49:41,118][68139] Worker 27, sleep for 126.562 sec to decorrelate experience 
collection [2024-06-12 14:49:41,288][68109] Updated weights for policy 0, policy_version 18093 (0.0013) [2024-06-12 14:49:42,027][67877] Fps is (10 sec: 16383.6, 60 sec: 10922.6, 300 sec: 10922.6). Total num frames: 296435712. Throughput: 0: 20697.2. Samples: 310460. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-12 14:49:42,028][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:49:44,054][68111] Worker 1 awakens! [2024-06-12 14:49:44,067][67877] Heartbeat connected on RolloutWorker_w1 [2024-06-12 14:49:47,026][67877] Fps is (10 sec: 16384.4, 60 sec: 8192.1, 300 sec: 8192.1). Total num frames: 296435712. Throughput: 0: 16741.2. Samples: 334820. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-12 14:49:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:49:48,996][68112] Worker 2 awakens! [2024-06-12 14:49:49,000][67877] Heartbeat connected on RolloutWorker_w2 [2024-06-12 14:49:52,026][67877] Fps is (10 sec: 1638.4, 60 sec: 7209.0, 300 sec: 7209.0). Total num frames: 296452096. Throughput: 0: 13935.3. Samples: 348380. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-12 14:49:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:49:54,290][68113] Worker 3 awakens! [2024-06-12 14:49:57,026][67877] Fps is (10 sec: 3276.8, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 296468480. Throughput: 0: 11906.7. Samples: 357200. Policy #0 lag: (min: 0.0, avg: 4.6, max: 11.0) [2024-06-12 14:49:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:49:58,304][68114] Worker 4 awakens! [2024-06-12 14:49:58,308][67877] Heartbeat connected on RolloutWorker_w4 [2024-06-12 14:50:02,026][67877] Fps is (10 sec: 4915.2, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 296501248. Throughput: 0: 11068.6. Samples: 387400. Policy #0 lag: (min: 0.0, avg: 4.6, max: 11.0) [2024-06-12 14:50:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:50:03,113][68115] Worker 5 awakens! 
[2024-06-12 14:50:03,119][67877] Heartbeat connected on RolloutWorker_w5 [2024-06-12 14:50:07,026][67877] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 296534016. Throughput: 0: 10924.0. Samples: 436960. Policy #0 lag: (min: 0.0, avg: 4.6, max: 11.0) [2024-06-12 14:50:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:50:07,684][68116] Worker 6 awakens! [2024-06-12 14:50:07,689][67877] Heartbeat connected on RolloutWorker_w6 [2024-06-12 14:50:12,026][67877] Fps is (10 sec: 8192.0, 60 sec: 6917.7, 300 sec: 6917.7). Total num frames: 296583168. Throughput: 0: 10328.1. Samples: 464760. Policy #0 lag: (min: 0.0, avg: 6.5, max: 17.0) [2024-06-12 14:50:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:50:12,300][68109] Updated weights for policy 0, policy_version 18103 (0.0012) [2024-06-12 14:50:12,986][68117] Worker 7 awakens! [2024-06-12 14:50:17,026][67877] Fps is (10 sec: 13107.3, 60 sec: 7864.4, 300 sec: 7864.4). Total num frames: 296665088. Throughput: 0: 11914.2. Samples: 536140. Policy #0 lag: (min: 0.0, avg: 6.5, max: 17.0) [2024-06-12 14:50:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:50:17,191][68118] Worker 8 awakens! [2024-06-12 14:50:17,195][67877] Heartbeat connected on RolloutWorker_w8 [2024-06-12 14:50:21,872][68119] Worker 9 awakens! [2024-06-12 14:50:21,879][67877] Heartbeat connected on RolloutWorker_w9 [2024-06-12 14:50:22,026][67877] Fps is (10 sec: 14745.7, 60 sec: 8341.0, 300 sec: 8341.0). Total num frames: 296730624. Throughput: 0: 8868.9. Samples: 625940. Policy #0 lag: (min: 0.0, avg: 6.5, max: 17.0) [2024-06-12 14:50:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:50:22,889][68109] Updated weights for policy 0, policy_version 18113 (0.0012) [2024-06-12 14:50:26,495][68122] Worker 10 awakens! 
[2024-06-12 14:50:26,500][67877] Heartbeat connected on RolloutWorker_w10 [2024-06-12 14:50:27,026][67877] Fps is (10 sec: 16383.9, 60 sec: 9284.3, 300 sec: 9284.3). Total num frames: 296828928. Throughput: 0: 8070.7. Samples: 673640. Policy #0 lag: (min: 0.0, avg: 5.6, max: 28.0) [2024-06-12 14:50:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:50:31,232][68121] Worker 11 awakens! [2024-06-12 14:50:31,237][67877] Heartbeat connected on RolloutWorker_w11 [2024-06-12 14:50:31,615][68109] Updated weights for policy 0, policy_version 18123 (0.0014) [2024-06-12 14:50:32,026][67877] Fps is (10 sec: 19660.6, 60 sec: 10922.7, 300 sec: 10082.5). Total num frames: 296927232. Throughput: 0: 9920.4. Samples: 781240. Policy #0 lag: (min: 0.0, avg: 5.6, max: 28.0) [2024-06-12 14:50:32,033][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:50:35,872][68120] Worker 12 awakens! [2024-06-12 14:50:35,878][67877] Heartbeat connected on RolloutWorker_w12 [2024-06-12 14:50:37,026][67877] Fps is (10 sec: 18022.4, 60 sec: 12288.0, 300 sec: 10532.6). Total num frames: 297009152. Throughput: 0: 12268.0. Samples: 900440. Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-06-12 14:50:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:50:40,477][68109] Updated weights for policy 0, policy_version 18133 (0.0017) [2024-06-12 14:50:40,551][68124] Worker 13 awakens! [2024-06-12 14:50:40,560][67877] Heartbeat connected on RolloutWorker_w13 [2024-06-12 14:50:42,026][67877] Fps is (10 sec: 19660.7, 60 sec: 11468.8, 300 sec: 11359.6). Total num frames: 297123840. Throughput: 0: 13482.7. Samples: 963920. Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-06-12 14:50:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:50:45,076][68123] Worker 14 awakens! [2024-06-12 14:50:45,082][67877] Heartbeat connected on RolloutWorker_w14 [2024-06-12 14:50:47,027][67877] Fps is (10 sec: 21299.0, 60 sec: 13107.2, 300 sec: 11878.4). Total num frames: 297222144. 
Throughput: 0: 15890.6. Samples: 1102480. Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-06-12 14:50:47,034][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:50:47,364][68109] Updated weights for policy 0, policy_version 18143 (0.0022) [2024-06-12 14:50:49,828][68126] Worker 15 awakens! [2024-06-12 14:50:49,836][67877] Heartbeat connected on RolloutWorker_w15 [2024-06-12 14:50:52,026][67877] Fps is (10 sec: 22937.9, 60 sec: 15018.7, 300 sec: 12721.7). Total num frames: 297353216. Throughput: 0: 17842.7. Samples: 1239880. Policy #0 lag: (min: 0.0, avg: 5.2, max: 11.0) [2024-06-12 14:50:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:50:54,485][68109] Updated weights for policy 0, policy_version 18153 (0.0023) [2024-06-12 14:50:55,952][68125] Worker 16 awakens! [2024-06-12 14:50:57,026][67877] Fps is (10 sec: 26214.7, 60 sec: 16930.1, 300 sec: 13471.3). Total num frames: 297484288. Throughput: 0: 18780.8. Samples: 1309900. Policy #0 lag: (min: 0.0, avg: 5.2, max: 11.0) [2024-06-12 14:50:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:51:00,515][68128] Worker 17 awakens! [2024-06-12 14:51:01,165][68109] Updated weights for policy 0, policy_version 18163 (0.0027) [2024-06-12 14:51:02,026][67877] Fps is (10 sec: 24575.6, 60 sec: 18295.4, 300 sec: 13969.5). Total num frames: 297598976. Throughput: 0: 20388.4. Samples: 1453620. Policy #0 lag: (min: 0.0, avg: 13.6, max: 78.0) [2024-06-12 14:51:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:51:05,184][68127] Worker 18 awakens! [2024-06-12 14:51:07,027][67877] Fps is (10 sec: 24575.7, 60 sec: 19933.8, 300 sec: 14581.8). Total num frames: 297730048. Throughput: 0: 21773.2. Samples: 1605740. Policy #0 lag: (min: 0.0, avg: 13.6, max: 78.0) [2024-06-12 14:51:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:51:08,224][68109] Updated weights for policy 0, policy_version 18173 (0.0025) [2024-06-12 14:51:09,769][68129] Worker 19 awakens! 
[2024-06-12 14:51:12,026][67877] Fps is (10 sec: 26214.6, 60 sec: 21299.2, 300 sec: 15135.7). Total num frames: 297861120. Throughput: 0: 22588.9. Samples: 1690140. Policy #0 lag: (min: 0.0, avg: 13.6, max: 78.0) [2024-06-12 14:51:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:51:12,992][68109] Updated weights for policy 0, policy_version 18183 (0.0026) [2024-06-12 14:51:14,556][68130] Worker 20 awakens! [2024-06-12 14:51:17,026][67877] Fps is (10 sec: 26214.7, 60 sec: 22118.4, 300 sec: 15639.3). Total num frames: 297992192. Throughput: 0: 23951.1. Samples: 1859040. Policy #0 lag: (min: 0.0, avg: 35.5, max: 101.0) [2024-06-12 14:51:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:51:17,037][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000018188_297992192.pth... [2024-06-12 14:51:17,097][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000017775_291225600.pth [2024-06-12 14:51:19,244][68132] Worker 21 awakens! [2024-06-12 14:51:19,417][68109] Updated weights for policy 0, policy_version 18193 (0.0033) [2024-06-12 14:51:22,026][67877] Fps is (10 sec: 27852.5, 60 sec: 23483.6, 300 sec: 16241.5). Total num frames: 298139648. Throughput: 0: 25218.6. Samples: 2035280. Policy #0 lag: (min: 0.0, avg: 35.5, max: 101.0) [2024-06-12 14:51:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:51:23,995][68131] Worker 22 awakens! [2024-06-12 14:51:24,445][68109] Updated weights for policy 0, policy_version 18203 (0.0034) [2024-06-12 14:51:27,026][67877] Fps is (10 sec: 29491.0, 60 sec: 24302.9, 300 sec: 16793.6). Total num frames: 298287104. Throughput: 0: 25755.5. Samples: 2122920. Policy #0 lag: (min: 0.0, avg: 35.5, max: 101.0) [2024-06-12 14:51:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:51:28,604][68133] Worker 23 awakens! 
[2024-06-12 14:51:30,762][68109] Updated weights for policy 0, policy_version 18213 (0.0027) [2024-06-12 14:51:32,026][67877] Fps is (10 sec: 31129.8, 60 sec: 25395.2, 300 sec: 17432.6). Total num frames: 298450944. Throughput: 0: 26649.8. Samples: 2301720. Policy #0 lag: (min: 0.0, avg: 30.2, max: 127.0) [2024-06-12 14:51:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:51:33,560][68134] Worker 24 awakens! [2024-06-12 14:51:35,572][68109] Updated weights for policy 0, policy_version 18223 (0.0043) [2024-06-12 14:51:37,026][67877] Fps is (10 sec: 32768.2, 60 sec: 26760.5, 300 sec: 18022.4). Total num frames: 298614784. Throughput: 0: 27863.9. Samples: 2493760. Policy #0 lag: (min: 0.0, avg: 30.2, max: 127.0) [2024-06-12 14:51:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:51:37,952][68135] Worker 25 awakens! [2024-06-12 14:51:40,379][68109] Updated weights for policy 0, policy_version 18233 (0.0029) [2024-06-12 14:51:42,026][67877] Fps is (10 sec: 32768.0, 60 sec: 27579.7, 300 sec: 18568.5). Total num frames: 298778624. Throughput: 0: 28440.9. Samples: 2589740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 18.0) [2024-06-12 14:51:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:51:42,832][68136] Worker 26 awakens! [2024-06-12 14:51:45,985][68109] Updated weights for policy 0, policy_version 18243 (0.0031) [2024-06-12 14:51:47,026][67877] Fps is (10 sec: 31129.7, 60 sec: 28399.0, 300 sec: 18958.6). Total num frames: 298926080. Throughput: 0: 29644.5. Samples: 2787620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 18.0) [2024-06-12 14:51:47,034][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:51:47,784][68139] Worker 27 awakens! [2024-06-12 14:51:50,294][68109] Updated weights for policy 0, policy_version 18253 (0.0032) [2024-06-12 14:51:52,026][67877] Fps is (10 sec: 32767.9, 60 sec: 29218.1, 300 sec: 19547.8). Total num frames: 299106304. Throughput: 0: 30550.3. Samples: 2980500. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 18.0) [2024-06-12 14:51:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:51:52,236][68138] Worker 28 awakens! [2024-06-12 14:51:55,729][68109] Updated weights for policy 0, policy_version 18263 (0.0026) [2024-06-12 14:51:56,920][68137] Worker 29 awakens! [2024-06-12 14:51:57,026][67877] Fps is (10 sec: 36044.7, 60 sec: 30037.3, 300 sec: 20097.7). Total num frames: 299286528. Throughput: 0: 31064.0. Samples: 3088020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 18.0) [2024-06-12 14:51:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:51:59,991][68109] Updated weights for policy 0, policy_version 18273 (0.0035) [2024-06-12 14:52:01,612][68140] Worker 30 awakens! [2024-06-12 14:52:02,026][67877] Fps is (10 sec: 34406.9, 60 sec: 30856.6, 300 sec: 20506.5). Total num frames: 299450368. Throughput: 0: 31841.0. Samples: 3291880. Policy #0 lag: (min: 0.0, avg: 8.0, max: 18.0) [2024-06-12 14:52:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:52:04,748][68109] Updated weights for policy 0, policy_version 18283 (0.0036) [2024-06-12 14:52:06,124][68141] Worker 31 awakens! [2024-06-12 14:52:07,026][67877] Fps is (10 sec: 34406.5, 60 sec: 31675.8, 300 sec: 20992.0). Total num frames: 299630592. Throughput: 0: 32501.8. Samples: 3497860. Policy #0 lag: (min: 0.0, avg: 8.0, max: 18.0) [2024-06-12 14:52:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:52:09,271][68109] Updated weights for policy 0, policy_version 18293 (0.0045) [2024-06-12 14:52:12,026][67877] Fps is (10 sec: 32767.6, 60 sec: 31948.8, 300 sec: 21249.6). Total num frames: 299778048. Throughput: 0: 32941.8. Samples: 3605300. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 14:52:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:52:13,883][68109] Updated weights for policy 0, policy_version 18303 (0.0029) [2024-06-12 14:52:17,028][67877] Fps is (10 sec: 34401.2, 60 sec: 33040.3, 300 sec: 21780.9). Total num frames: 299974656. Throughput: 0: 33665.5. Samples: 3816720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 14:52:17,029][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:52:18,914][68109] Updated weights for policy 0, policy_version 18313 (0.0034) [2024-06-12 14:52:22,027][67877] Fps is (10 sec: 36044.6, 60 sec: 33314.1, 300 sec: 22095.0). Total num frames: 300138496. Throughput: 0: 34126.2. Samples: 4029440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 14:52:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:52:23,356][68109] Updated weights for policy 0, policy_version 18323 (0.0034) [2024-06-12 14:52:27,026][67877] Fps is (10 sec: 36050.1, 60 sec: 34133.4, 300 sec: 22573.5). Total num frames: 300335104. Throughput: 0: 34330.6. Samples: 4134620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 14:52:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:52:28,202][68109] Updated weights for policy 0, policy_version 18333 (0.0035) [2024-06-12 14:52:31,970][68109] Updated weights for policy 0, policy_version 18343 (0.0030) [2024-06-12 14:52:32,026][67877] Fps is (10 sec: 39322.3, 60 sec: 34679.5, 300 sec: 23026.2). Total num frames: 300531712. Throughput: 0: 34701.0. Samples: 4349160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 14:52:32,034][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:52:37,026][67877] Fps is (10 sec: 32768.3, 60 sec: 34133.4, 300 sec: 23110.1). Total num frames: 300662784. Throughput: 0: 35043.2. Samples: 4557440. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 14:52:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:52:37,949][68109] Updated weights for policy 0, policy_version 18353 (0.0040) [2024-06-12 14:52:41,349][68109] Updated weights for policy 0, policy_version 18363 (0.0029) [2024-06-12 14:52:42,026][67877] Fps is (10 sec: 34406.0, 60 sec: 34952.5, 300 sec: 23609.8). Total num frames: 300875776. Throughput: 0: 35071.6. Samples: 4666240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 14:52:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:52:46,619][68109] Updated weights for policy 0, policy_version 18373 (0.0033) [2024-06-12 14:52:47,026][67877] Fps is (10 sec: 36045.0, 60 sec: 34952.6, 300 sec: 23756.8). Total num frames: 301023232. Throughput: 0: 35188.9. Samples: 4875380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 14:52:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:52:50,511][68109] Updated weights for policy 0, policy_version 18383 (0.0040) [2024-06-12 14:52:52,026][67877] Fps is (10 sec: 34406.5, 60 sec: 35225.6, 300 sec: 24136.4). Total num frames: 301219840. Throughput: 0: 35379.5. Samples: 5089940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 14:52:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:52:56,524][68109] Updated weights for policy 0, policy_version 18393 (0.0042) [2024-06-12 14:52:57,032][67877] Fps is (10 sec: 32749.6, 60 sec: 34403.2, 300 sec: 24185.3). Total num frames: 301350912. Throughput: 0: 35496.1. Samples: 5202820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 14:52:57,033][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:52:57,677][68089] Signal inference workers to stop experience collection... (50 times) [2024-06-12 14:52:57,724][68109] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-06-12 14:52:57,729][68089] Signal inference workers to resume experience collection... 
(50 times) [2024-06-12 14:52:57,742][68109] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-06-12 14:52:59,686][68109] Updated weights for policy 0, policy_version 18403 (0.0028) [2024-06-12 14:53:02,026][67877] Fps is (10 sec: 36044.6, 60 sec: 35498.6, 300 sec: 24690.3). Total num frames: 301580288. Throughput: 0: 35379.4. Samples: 5408740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 14:53:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:53:05,537][68109] Updated weights for policy 0, policy_version 18413 (0.0045) [2024-06-12 14:53:07,026][67877] Fps is (10 sec: 36064.9, 60 sec: 34679.5, 300 sec: 24725.0). Total num frames: 301711360. Throughput: 0: 35438.3. Samples: 5624160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 14:53:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:53:08,914][68109] Updated weights for policy 0, policy_version 18423 (0.0030) [2024-06-12 14:53:12,026][67877] Fps is (10 sec: 32768.4, 60 sec: 35498.7, 300 sec: 25049.3). Total num frames: 301907968. Throughput: 0: 35329.0. Samples: 5724420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 14:53:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:53:14,683][68109] Updated weights for policy 0, policy_version 18433 (0.0031) [2024-06-12 14:53:17,026][67877] Fps is (10 sec: 39321.8, 60 sec: 35499.6, 300 sec: 25359.6). Total num frames: 302104576. Throughput: 0: 35536.4. Samples: 5948300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 14:53:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:53:17,115][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000018440_302120960.pth... 
[2024-06-12 14:53:17,170][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000018083_296271872.pth [2024-06-12 14:53:18,631][68109] Updated weights for policy 0, policy_version 18443 (0.0040) [2024-06-12 14:53:22,026][67877] Fps is (10 sec: 36044.8, 60 sec: 35498.7, 300 sec: 25517.2). Total num frames: 302268416. Throughput: 0: 35550.7. Samples: 6157220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 14:53:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:53:23,731][68109] Updated weights for policy 0, policy_version 18453 (0.0032) [2024-06-12 14:53:27,032][67877] Fps is (10 sec: 34387.1, 60 sec: 35222.4, 300 sec: 25736.0). Total num frames: 302448640. Throughput: 0: 35425.9. Samples: 6260600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 14:53:27,033][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:53:28,096][68109] Updated weights for policy 0, policy_version 18463 (0.0037) [2024-06-12 14:53:32,026][67877] Fps is (10 sec: 36044.7, 60 sec: 34952.5, 300 sec: 25946.9). Total num frames: 302628864. Throughput: 0: 35387.5. Samples: 6467820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 14:53:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:53:32,757][68109] Updated weights for policy 0, policy_version 18473 (0.0037) [2024-06-12 14:53:37,027][67877] Fps is (10 sec: 36064.1, 60 sec: 35771.6, 300 sec: 26148.9). Total num frames: 302809088. Throughput: 0: 35379.0. Samples: 6682000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 14:53:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:53:37,294][68109] Updated weights for policy 0, policy_version 18483 (0.0021) [2024-06-12 14:53:41,973][68109] Updated weights for policy 0, policy_version 18493 (0.0024) [2024-06-12 14:53:42,028][67877] Fps is (10 sec: 36039.4, 60 sec: 35224.8, 300 sec: 26342.8). Total num frames: 302989312. Throughput: 0: 35427.7. Samples: 6796920. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-12 14:53:42,029][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:53:46,521][68109] Updated weights for policy 0, policy_version 18503 (0.0037) [2024-06-12 14:53:47,026][67877] Fps is (10 sec: 36045.1, 60 sec: 35771.6, 300 sec: 26529.5). Total num frames: 303169536. Throughput: 0: 35597.8. Samples: 7010640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-12 14:53:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:53:51,023][68109] Updated weights for policy 0, policy_version 18513 (0.0036) [2024-06-12 14:53:52,026][67877] Fps is (10 sec: 37688.8, 60 sec: 35771.8, 300 sec: 26770.9). Total num frames: 303366144. Throughput: 0: 35456.0. Samples: 7219680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 14:53:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:53:55,927][68109] Updated weights for policy 0, policy_version 18523 (0.0032) [2024-06-12 14:53:57,031][67877] Fps is (10 sec: 32754.4, 60 sec: 35772.5, 300 sec: 26760.1). Total num frames: 303497216. Throughput: 0: 35658.4. Samples: 7329200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 14:53:57,031][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:54:00,333][68109] Updated weights for policy 0, policy_version 18533 (0.0037) [2024-06-12 14:54:02,026][67877] Fps is (10 sec: 36044.8, 60 sec: 35771.8, 300 sec: 27108.1). Total num frames: 303726592. Throughput: 0: 35449.7. Samples: 7543540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 14:54:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:54:05,147][68109] Updated weights for policy 0, policy_version 18543 (0.0029) [2024-06-12 14:54:07,026][67877] Fps is (10 sec: 36059.8, 60 sec: 35771.7, 300 sec: 27092.1). Total num frames: 303857664. Throughput: 0: 35773.7. Samples: 7767040. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 14:54:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:54:09,444][68109] Updated weights for policy 0, policy_version 18553 (0.0038) [2024-06-12 14:54:12,027][67877] Fps is (10 sec: 34406.1, 60 sec: 36044.7, 300 sec: 27364.2). Total num frames: 304070656. Throughput: 0: 35611.9. Samples: 7862940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 14:54:12,031][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:54:14,262][68109] Updated weights for policy 0, policy_version 18563 (0.0037) [2024-06-12 14:54:17,027][67877] Fps is (10 sec: 36044.4, 60 sec: 35225.4, 300 sec: 27400.8). Total num frames: 304218112. Throughput: 0: 35933.6. Samples: 8084840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 14:54:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:54:18,271][68109] Updated weights for policy 0, policy_version 18573 (0.0027) [2024-06-12 14:54:22,026][67877] Fps is (10 sec: 34406.7, 60 sec: 35771.7, 300 sec: 27602.9). Total num frames: 304414720. Throughput: 0: 35872.1. Samples: 8296240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-12 14:54:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:54:23,334][68109] Updated weights for policy 0, policy_version 18583 (0.0036) [2024-06-12 14:54:27,026][67877] Fps is (10 sec: 37684.1, 60 sec: 35775.1, 300 sec: 28213.8). Total num frames: 304594944. Throughput: 0: 35649.6. Samples: 8401100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-12 14:54:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:54:27,405][68109] Updated weights for policy 0, policy_version 18593 (0.0030) [2024-06-12 14:54:32,026][67877] Fps is (10 sec: 34406.3, 60 sec: 35498.6, 300 sec: 28769.2). Total num frames: 304758784. Throughput: 0: 35528.9. Samples: 8609440. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 14:54:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:54:32,733][68109] Updated weights for policy 0, policy_version 18603 (0.0039) [2024-06-12 14:54:36,452][68109] Updated weights for policy 0, policy_version 18613 (0.0029) [2024-06-12 14:54:37,026][67877] Fps is (10 sec: 37683.2, 60 sec: 36044.9, 300 sec: 28935.8). Total num frames: 304971776. Throughput: 0: 35758.7. Samples: 8828820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 14:54:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:54:41,797][68109] Updated weights for policy 0, policy_version 18623 (0.0028) [2024-06-12 14:54:42,026][67877] Fps is (10 sec: 36044.7, 60 sec: 35499.5, 300 sec: 29435.6). Total num frames: 305119232. Throughput: 0: 35727.3. Samples: 8936780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 14:54:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:54:45,888][68109] Updated weights for policy 0, policy_version 18633 (0.0033) [2024-06-12 14:54:47,031][67877] Fps is (10 sec: 34389.0, 60 sec: 35768.8, 300 sec: 30046.1). Total num frames: 305315840. Throughput: 0: 35830.7. Samples: 9156100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 14:54:47,032][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:54:51,080][68109] Updated weights for policy 0, policy_version 18643 (0.0044) [2024-06-12 14:54:52,026][67877] Fps is (10 sec: 36045.1, 60 sec: 35225.6, 300 sec: 30546.4). Total num frames: 305479680. Throughput: 0: 35493.9. Samples: 9364260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 14:54:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:54:54,982][68109] Updated weights for policy 0, policy_version 18653 (0.0034) [2024-06-12 14:54:57,026][67877] Fps is (10 sec: 34423.4, 60 sec: 36047.3, 300 sec: 31046.3). Total num frames: 305659904. Throughput: 0: 35729.8. Samples: 9470780. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 14:54:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:54:59,567][68109] Updated weights for policy 0, policy_version 18663 (0.0040) [2024-06-12 14:55:02,026][67877] Fps is (10 sec: 37683.0, 60 sec: 35498.6, 300 sec: 31601.7). Total num frames: 305856512. Throughput: 0: 35702.4. Samples: 9691440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 14:55:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:55:04,312][68109] Updated weights for policy 0, policy_version 18673 (0.0030) [2024-06-12 14:55:07,026][67877] Fps is (10 sec: 34406.4, 60 sec: 35771.7, 300 sec: 31934.9). Total num frames: 306003968. Throughput: 0: 35813.7. Samples: 9907860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 14:55:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:55:08,940][68109] Updated weights for policy 0, policy_version 18683 (0.0035) [2024-06-12 14:55:12,026][67877] Fps is (10 sec: 36044.9, 60 sec: 35771.8, 300 sec: 32379.2). Total num frames: 306216960. Throughput: 0: 35819.1. Samples: 10012960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 14:55:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:55:12,505][68089] Signal inference workers to stop experience collection... (100 times) [2024-06-12 14:55:12,529][68109] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-06-12 14:55:12,552][68089] Signal inference workers to resume experience collection... (100 times) [2024-06-12 14:55:12,556][68109] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-06-12 14:55:12,842][68109] Updated weights for policy 0, policy_version 18693 (0.0039) [2024-06-12 14:55:17,026][67877] Fps is (10 sec: 36045.1, 60 sec: 35771.9, 300 sec: 32656.9). Total num frames: 306364416. Throughput: 0: 36017.8. Samples: 10230240. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 14:55:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:55:17,035][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000018699_306364416.pth... [2024-06-12 14:55:17,108][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000018188_297992192.pth [2024-06-12 14:55:18,007][68109] Updated weights for policy 0, policy_version 18703 (0.0033) [2024-06-12 14:55:21,831][68109] Updated weights for policy 0, policy_version 18713 (0.0025) [2024-06-12 14:55:22,026][67877] Fps is (10 sec: 37683.1, 60 sec: 36317.9, 300 sec: 33101.2). Total num frames: 306593792. Throughput: 0: 35767.0. Samples: 10438340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 14:55:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:55:27,026][67877] Fps is (10 sec: 37683.3, 60 sec: 35771.7, 300 sec: 33267.9). Total num frames: 306741248. Throughput: 0: 35907.6. Samples: 10552620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 14:55:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:55:27,101][68109] Updated weights for policy 0, policy_version 18723 (0.0047) [2024-06-12 14:55:31,732][68109] Updated weights for policy 0, policy_version 18733 (0.0030) [2024-06-12 14:55:32,026][67877] Fps is (10 sec: 32768.0, 60 sec: 36044.8, 300 sec: 33601.1). Total num frames: 306921472. Throughput: 0: 35808.4. Samples: 10767300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 14:55:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:55:36,250][68109] Updated weights for policy 0, policy_version 18743 (0.0037) [2024-06-12 14:55:37,029][67877] Fps is (10 sec: 36035.2, 60 sec: 35497.1, 300 sec: 33822.9). Total num frames: 307101696. Throughput: 0: 35913.9. Samples: 10980480. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 14:55:37,030][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:55:41,182][68109] Updated weights for policy 0, policy_version 18753 (0.0029) [2024-06-12 14:55:42,026][67877] Fps is (10 sec: 32768.2, 60 sec: 35498.7, 300 sec: 33989.9). Total num frames: 307249152. Throughput: 0: 35865.4. Samples: 11084720. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-12 14:55:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:55:45,663][68109] Updated weights for policy 0, policy_version 18763 (0.0030) [2024-06-12 14:55:47,026][67877] Fps is (10 sec: 34415.6, 60 sec: 35501.7, 300 sec: 34212.0). Total num frames: 307445760. Throughput: 0: 35799.2. Samples: 11302400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-12 14:55:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:55:50,768][68109] Updated weights for policy 0, policy_version 18773 (0.0038) [2024-06-12 14:55:52,026][67877] Fps is (10 sec: 37683.3, 60 sec: 35771.8, 300 sec: 34378.6). Total num frames: 307625984. Throughput: 0: 35721.0. Samples: 11515300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 14:55:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:55:54,931][68109] Updated weights for policy 0, policy_version 18783 (0.0029) [2024-06-12 14:55:57,026][67877] Fps is (10 sec: 37683.1, 60 sec: 36044.9, 300 sec: 34656.3). Total num frames: 307822592. Throughput: 0: 35507.1. Samples: 11610780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 14:55:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:56:00,022][68109] Updated weights for policy 0, policy_version 18793 (0.0036) [2024-06-12 14:56:02,026][67877] Fps is (10 sec: 34406.2, 60 sec: 35225.6, 300 sec: 34711.9). Total num frames: 307970048. Throughput: 0: 35548.4. Samples: 11829920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 14:56:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:56:03,983][68109] Updated weights for policy 0, policy_version 18803 (0.0036) [2024-06-12 14:56:07,026][67877] Fps is (10 sec: 34406.3, 60 sec: 36044.8, 300 sec: 34934.0). Total num frames: 308166656. Throughput: 0: 35612.0. Samples: 12040880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 14:56:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:56:08,882][68109] Updated weights for policy 0, policy_version 18813 (0.0032) [2024-06-12 14:56:12,026][67877] Fps is (10 sec: 34406.5, 60 sec: 34952.6, 300 sec: 34989.6). Total num frames: 308314112. Throughput: 0: 35538.7. Samples: 12151860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 14:56:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:56:13,042][68109] Updated weights for policy 0, policy_version 18823 (0.0036) [2024-06-12 14:56:17,026][67877] Fps is (10 sec: 34406.6, 60 sec: 35771.8, 300 sec: 35156.2). Total num frames: 308510720. Throughput: 0: 35289.8. Samples: 12355340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 14:56:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:56:17,958][68109] Updated weights for policy 0, policy_version 18833 (0.0041) [2024-06-12 14:56:22,027][67877] Fps is (10 sec: 39320.9, 60 sec: 35225.5, 300 sec: 35322.8). Total num frames: 308707328. Throughput: 0: 35349.0. Samples: 12571100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 14:56:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:56:22,186][68109] Updated weights for policy 0, policy_version 18843 (0.0031) [2024-06-12 14:56:27,029][67877] Fps is (10 sec: 36034.2, 60 sec: 35496.9, 300 sec: 35322.4). Total num frames: 308871168. Throughput: 0: 35411.9. Samples: 12678360. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 14:56:27,030][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:56:27,535][68109] Updated weights for policy 0, policy_version 18853 (0.0035) [2024-06-12 14:56:31,261][68109] Updated weights for policy 0, policy_version 18863 (0.0033) [2024-06-12 14:56:32,026][67877] Fps is (10 sec: 36045.5, 60 sec: 35771.8, 300 sec: 35433.9). Total num frames: 309067776. Throughput: 0: 35387.6. Samples: 12894840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 14:56:32,032][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:56:36,855][68109] Updated weights for policy 0, policy_version 18873 (0.0034) [2024-06-12 14:56:37,026][67877] Fps is (10 sec: 34416.6, 60 sec: 35227.2, 300 sec: 35378.3). Total num frames: 309215232. Throughput: 0: 35387.1. Samples: 13107720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 14:56:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:56:40,500][68109] Updated weights for policy 0, policy_version 18883 (0.0024) [2024-06-12 14:56:42,026][67877] Fps is (10 sec: 36044.5, 60 sec: 36317.8, 300 sec: 35600.5). Total num frames: 309428224. Throughput: 0: 35727.1. Samples: 13218500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 14:56:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:56:46,035][68109] Updated weights for policy 0, policy_version 18893 (0.0038) [2024-06-12 14:56:47,027][67877] Fps is (10 sec: 34405.8, 60 sec: 35225.5, 300 sec: 35433.9). Total num frames: 309559296. Throughput: 0: 35490.1. Samples: 13426980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 14:56:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:56:49,779][68109] Updated weights for policy 0, policy_version 18903 (0.0040) [2024-06-12 14:56:52,026][67877] Fps is (10 sec: 32768.0, 60 sec: 35498.6, 300 sec: 35489.4). Total num frames: 309755904. Throughput: 0: 35438.2. Samples: 13635600. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 14:56:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:56:55,171][68109] Updated weights for policy 0, policy_version 18913 (0.0036) [2024-06-12 14:56:57,027][67877] Fps is (10 sec: 39321.7, 60 sec: 35498.6, 300 sec: 35600.5). Total num frames: 309952512. Throughput: 0: 35430.1. Samples: 13746220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 14:56:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:56:58,990][68109] Updated weights for policy 0, policy_version 18923 (0.0032) [2024-06-12 14:57:02,031][67877] Fps is (10 sec: 34392.0, 60 sec: 35496.2, 300 sec: 35488.9). Total num frames: 310099968. Throughput: 0: 35564.6. Samples: 13955900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 14:57:02,031][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:57:04,692][68109] Updated weights for policy 0, policy_version 18933 (0.0031) [2024-06-12 14:57:07,026][67877] Fps is (10 sec: 34406.8, 60 sec: 35498.7, 300 sec: 35656.0). Total num frames: 310296576. Throughput: 0: 35507.7. Samples: 14168940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 14:57:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:57:08,390][68109] Updated weights for policy 0, policy_version 18943 (0.0029) [2024-06-12 14:57:12,027][67877] Fps is (10 sec: 36059.5, 60 sec: 35771.6, 300 sec: 35545.1). Total num frames: 310460416. Throughput: 0: 35644.0. Samples: 14282240. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-12 14:57:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:57:13,818][68109] Updated weights for policy 0, policy_version 18953 (0.0039) [2024-06-12 14:57:17,026][67877] Fps is (10 sec: 37683.2, 60 sec: 36044.8, 300 sec: 35711.6). Total num frames: 310673408. Throughput: 0: 35505.8. Samples: 14492600. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-12 14:57:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:57:17,047][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000018962_310673408.pth... [2024-06-12 14:57:17,101][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000018440_302120960.pth [2024-06-12 14:57:18,173][68109] Updated weights for policy 0, policy_version 18963 (0.0032) [2024-06-12 14:57:22,026][67877] Fps is (10 sec: 32768.5, 60 sec: 34679.6, 300 sec: 35433.9). Total num frames: 310788096. Throughput: 0: 35560.9. Samples: 14707960. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-12 14:57:22,036][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:57:23,435][68109] Updated weights for policy 0, policy_version 18973 (0.0041) [2024-06-12 14:57:27,026][67877] Fps is (10 sec: 32767.7, 60 sec: 35500.3, 300 sec: 35489.4). Total num frames: 311001088. Throughput: 0: 35415.5. Samples: 14812200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 14:57:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:57:27,371][68109] Updated weights for policy 0, policy_version 18983 (0.0028) [2024-06-12 14:57:31,369][68089] Signal inference workers to stop experience collection... (150 times) [2024-06-12 14:57:31,374][68089] Signal inference workers to resume experience collection... (150 times) [2024-06-12 14:57:31,386][68109] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-06-12 14:57:31,386][68109] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-06-12 14:57:32,026][67877] Fps is (10 sec: 36044.7, 60 sec: 34679.4, 300 sec: 35544.9). Total num frames: 311148544. Throughput: 0: 35471.6. Samples: 15023200. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 14:57:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:57:32,201][68109] Updated weights for policy 0, policy_version 18993 (0.0040) [2024-06-12 14:57:36,700][68109] Updated weights for policy 0, policy_version 19003 (0.0031) [2024-06-12 14:57:37,026][67877] Fps is (10 sec: 34406.7, 60 sec: 35498.6, 300 sec: 35489.4). Total num frames: 311345152. Throughput: 0: 35626.3. Samples: 15238780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 14:57:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:57:41,078][68109] Updated weights for policy 0, policy_version 19013 (0.0047) [2024-06-12 14:57:42,026][67877] Fps is (10 sec: 37683.3, 60 sec: 34952.6, 300 sec: 35600.5). Total num frames: 311525376. Throughput: 0: 35500.1. Samples: 15343720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 14:57:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:57:45,687][68109] Updated weights for policy 0, policy_version 19023 (0.0043) [2024-06-12 14:57:47,026][67877] Fps is (10 sec: 39321.8, 60 sec: 36318.0, 300 sec: 35656.0). Total num frames: 311738368. Throughput: 0: 35735.4. Samples: 15563840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 14:57:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:57:50,354][68109] Updated weights for policy 0, policy_version 19033 (0.0023) [2024-06-12 14:57:52,028][67877] Fps is (10 sec: 37677.5, 60 sec: 35770.9, 300 sec: 35767.6). Total num frames: 311902208. Throughput: 0: 35709.5. Samples: 15775920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 14:57:52,029][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:57:54,891][68109] Updated weights for policy 0, policy_version 19043 (0.0031) [2024-06-12 14:57:57,028][67877] Fps is (10 sec: 34401.0, 60 sec: 35497.8, 300 sec: 35600.3). Total num frames: 312082432. Throughput: 0: 35572.7. Samples: 15883060. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 14:57:57,028][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:57:59,172][68109] Updated weights for policy 0, policy_version 19053 (0.0033) [2024-06-12 14:58:02,026][67877] Fps is (10 sec: 32773.1, 60 sec: 35501.2, 300 sec: 35656.0). Total num frames: 312229888. Throughput: 0: 35502.7. Samples: 16090220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 14:58:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:58:03,853][68109] Updated weights for policy 0, policy_version 19063 (0.0038) [2024-06-12 14:58:07,026][67877] Fps is (10 sec: 34411.3, 60 sec: 35498.6, 300 sec: 35656.0). Total num frames: 312426496. Throughput: 0: 35550.6. Samples: 16307740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 14:58:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:58:08,527][68109] Updated weights for policy 0, policy_version 19073 (0.0035) [2024-06-12 14:58:12,026][67877] Fps is (10 sec: 36044.9, 60 sec: 35498.8, 300 sec: 35545.0). Total num frames: 312590336. Throughput: 0: 35704.1. Samples: 16418880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 14:58:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:58:13,264][68109] Updated weights for policy 0, policy_version 19083 (0.0034) [2024-06-12 14:58:17,027][67877] Fps is (10 sec: 37683.1, 60 sec: 35498.6, 300 sec: 35711.5). Total num frames: 312803328. Throughput: 0: 35731.5. Samples: 16631120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 14:58:17,040][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:58:17,195][68109] Updated weights for policy 0, policy_version 19093 (0.0033) [2024-06-12 14:58:22,026][67877] Fps is (10 sec: 37682.6, 60 sec: 36317.8, 300 sec: 35656.7). Total num frames: 312967168. Throughput: 0: 35780.4. Samples: 16848900. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 14:58:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:58:22,670][68109] Updated weights for policy 0, policy_version 19103 (0.0029) [2024-06-12 14:58:26,525][68109] Updated weights for policy 0, policy_version 19113 (0.0043) [2024-06-12 14:58:27,026][67877] Fps is (10 sec: 37683.6, 60 sec: 36317.9, 300 sec: 35767.1). Total num frames: 313180160. Throughput: 0: 35812.9. Samples: 16955300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 14:58:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:58:31,382][68109] Updated weights for policy 0, policy_version 19123 (0.0036) [2024-06-12 14:58:32,026][67877] Fps is (10 sec: 34406.5, 60 sec: 36044.8, 300 sec: 35600.5). Total num frames: 313311232. Throughput: 0: 35800.4. Samples: 17174860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 14:58:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:58:35,006][68109] Updated weights for policy 0, policy_version 19133 (0.0032) [2024-06-12 14:58:37,026][67877] Fps is (10 sec: 34406.5, 60 sec: 36317.9, 300 sec: 35711.7). Total num frames: 313524224. Throughput: 0: 35887.0. Samples: 17390780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 14:58:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:58:40,638][68109] Updated weights for policy 0, policy_version 19143 (0.0031) [2024-06-12 14:58:42,026][67877] Fps is (10 sec: 36045.0, 60 sec: 35771.7, 300 sec: 35600.5). Total num frames: 313671680. Throughput: 0: 36083.0. Samples: 17506740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 14:58:42,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 14:58:43,997][68109] Updated weights for policy 0, policy_version 19153 (0.0038) [2024-06-12 14:58:47,026][67877] Fps is (10 sec: 36044.8, 60 sec: 35771.7, 300 sec: 35656.0). Total num frames: 313884672. Throughput: 0: 36259.1. Samples: 17721880. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-12 14:58:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:58:49,658][68109] Updated weights for policy 0, policy_version 19163 (0.0030) [2024-06-12 14:58:52,026][67877] Fps is (10 sec: 36044.7, 60 sec: 35499.5, 300 sec: 35712.1). Total num frames: 314032128. Throughput: 0: 36216.5. Samples: 17937480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-12 14:58:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:58:53,499][68109] Updated weights for policy 0, policy_version 19173 (0.0029) [2024-06-12 14:58:57,027][67877] Fps is (10 sec: 36044.2, 60 sec: 36045.6, 300 sec: 35656.0). Total num frames: 314245120. Throughput: 0: 36023.8. Samples: 18039960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-12 14:58:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:58:58,573][68109] Updated weights for policy 0, policy_version 19183 (0.0043) [2024-06-12 14:59:02,026][67877] Fps is (10 sec: 37683.3, 60 sec: 36317.8, 300 sec: 35767.1). Total num frames: 314408960. Throughput: 0: 36041.9. Samples: 18253000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 14:59:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:59:02,576][68109] Updated weights for policy 0, policy_version 19193 (0.0044) [2024-06-12 14:59:07,027][67877] Fps is (10 sec: 34406.6, 60 sec: 36044.8, 300 sec: 35656.0). Total num frames: 314589184. Throughput: 0: 36068.9. Samples: 18472000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 14:59:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:59:07,833][68109] Updated weights for policy 0, policy_version 19203 (0.0041) [2024-06-12 14:59:12,026][67877] Fps is (10 sec: 36044.7, 60 sec: 36317.8, 300 sec: 35767.1). Total num frames: 314769408. Throughput: 0: 36177.8. Samples: 18583300. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-12 14:59:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:59:12,174][68109] Updated weights for policy 0, policy_version 19213 (0.0034) [2024-06-12 14:59:15,974][68109] Updated weights for policy 0, policy_version 19223 (0.0034) [2024-06-12 14:59:17,026][67877] Fps is (10 sec: 37683.4, 60 sec: 36044.8, 300 sec: 35767.1). Total num frames: 314966016. Throughput: 0: 36124.4. Samples: 18800460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-12 14:59:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:59:17,038][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000019224_314966016.pth... [2024-06-12 14:59:17,091][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000018699_306364416.pth [2024-06-12 14:59:21,571][68109] Updated weights for policy 0, policy_version 19233 (0.0026) [2024-06-12 14:59:22,027][67877] Fps is (10 sec: 34406.1, 60 sec: 35771.7, 300 sec: 35656.0). Total num frames: 315113472. Throughput: 0: 36104.3. Samples: 19015480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-12 14:59:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:59:22,730][68089] Signal inference workers to stop experience collection... (200 times) [2024-06-12 14:59:22,730][68089] Signal inference workers to resume experience collection... (200 times) [2024-06-12 14:59:22,751][68109] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-06-12 14:59:22,751][68109] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-06-12 14:59:24,375][68109] Updated weights for policy 0, policy_version 19243 (0.0032) [2024-06-12 14:59:27,026][67877] Fps is (10 sec: 36044.9, 60 sec: 35771.7, 300 sec: 35822.6). Total num frames: 315326464. Throughput: 0: 35827.1. Samples: 19118960. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-12 14:59:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:59:31,463][68109] Updated weights for policy 0, policy_version 19253 (0.0038) [2024-06-12 14:59:32,027][67877] Fps is (10 sec: 37683.3, 60 sec: 36317.8, 300 sec: 35656.0). Total num frames: 315490304. Throughput: 0: 36102.1. Samples: 19346480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-12 14:59:32,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 14:59:33,567][68109] Updated weights for policy 0, policy_version 19263 (0.0030) [2024-06-12 14:59:37,032][67877] Fps is (10 sec: 34387.7, 60 sec: 35768.4, 300 sec: 35766.4). Total num frames: 315670528. Throughput: 0: 36006.3. Samples: 19557960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-12 14:59:37,032][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:59:40,465][68109] Updated weights for policy 0, policy_version 19273 (0.0034) [2024-06-12 14:59:42,027][67877] Fps is (10 sec: 37683.0, 60 sec: 36590.8, 300 sec: 35767.7). Total num frames: 315867136. Throughput: 0: 36192.9. Samples: 19668640. Policy #0 lag: (min: 0.0, avg: 7.3, max: 21.0) [2024-06-12 14:59:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:59:43,142][68109] Updated weights for policy 0, policy_version 19283 (0.0031) [2024-06-12 14:59:47,026][67877] Fps is (10 sec: 34425.4, 60 sec: 35498.7, 300 sec: 35711.6). Total num frames: 316014592. Throughput: 0: 36144.5. Samples: 19879500. Policy #0 lag: (min: 0.0, avg: 7.3, max: 21.0) [2024-06-12 14:59:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:59:49,000][68109] Updated weights for policy 0, policy_version 19293 (0.0043) [2024-06-12 14:59:52,026][67877] Fps is (10 sec: 37684.0, 60 sec: 36864.0, 300 sec: 35878.2). Total num frames: 316243968. Throughput: 0: 36097.0. Samples: 20096360. 
Policy #0 lag: (min: 0.0, avg: 7.3, max: 21.0) [2024-06-12 14:59:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:59:52,480][68109] Updated weights for policy 0, policy_version 19303 (0.0034) [2024-06-12 14:59:57,026][67877] Fps is (10 sec: 36044.6, 60 sec: 35498.7, 300 sec: 35656.0). Total num frames: 316375040. Throughput: 0: 36065.8. Samples: 20206260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 14:59:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 14:59:58,089][68109] Updated weights for policy 0, policy_version 19313 (0.0032) [2024-06-12 15:00:00,931][68109] Updated weights for policy 0, policy_version 19323 (0.0041) [2024-06-12 15:00:02,026][67877] Fps is (10 sec: 36044.2, 60 sec: 36590.9, 300 sec: 35933.7). Total num frames: 316604416. Throughput: 0: 36204.4. Samples: 20429660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 15:00:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:00:06,798][68109] Updated weights for policy 0, policy_version 19333 (0.0033) [2024-06-12 15:00:07,026][67877] Fps is (10 sec: 37683.0, 60 sec: 36044.8, 300 sec: 35711.6). Total num frames: 316751872. Throughput: 0: 36452.0. Samples: 20655820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 15:00:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:00:10,375][68109] Updated weights for policy 0, policy_version 19343 (0.0030) [2024-06-12 15:00:12,026][67877] Fps is (10 sec: 36045.1, 60 sec: 36590.9, 300 sec: 35933.7). Total num frames: 316964864. Throughput: 0: 36357.8. Samples: 20755060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 15:00:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:00:15,360][68109] Updated weights for policy 0, policy_version 19353 (0.0029) [2024-06-12 15:00:17,026][67877] Fps is (10 sec: 37683.3, 60 sec: 36044.8, 300 sec: 35711.6). Total num frames: 317128704. Throughput: 0: 36311.6. Samples: 20980500. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 15:00:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:00:19,567][68109] Updated weights for policy 0, policy_version 19363 (0.0037) [2024-06-12 15:00:22,026][67877] Fps is (10 sec: 34406.3, 60 sec: 36591.0, 300 sec: 35822.6). Total num frames: 317308928. Throughput: 0: 36411.1. Samples: 21196260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 15:00:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:00:24,132][68109] Updated weights for policy 0, policy_version 19373 (0.0024) [2024-06-12 15:00:27,026][67877] Fps is (10 sec: 36045.1, 60 sec: 36044.8, 300 sec: 35822.7). Total num frames: 317489152. Throughput: 0: 36383.3. Samples: 21305880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 15:00:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:00:28,631][68109] Updated weights for policy 0, policy_version 19383 (0.0035) [2024-06-12 15:00:32,027][67877] Fps is (10 sec: 36044.5, 60 sec: 36317.8, 300 sec: 35822.9). Total num frames: 317669376. Throughput: 0: 36593.2. Samples: 21526200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 15:00:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:00:33,225][68109] Updated weights for policy 0, policy_version 19393 (0.0039) [2024-06-12 15:00:37,026][67877] Fps is (10 sec: 36044.9, 60 sec: 36321.2, 300 sec: 35933.7). Total num frames: 317849600. Throughput: 0: 36505.3. Samples: 21739100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 15:00:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:00:38,343][68109] Updated weights for policy 0, policy_version 19403 (0.0044) [2024-06-12 15:00:41,534][68109] Updated weights for policy 0, policy_version 19413 (0.0040) [2024-06-12 15:00:42,026][67877] Fps is (10 sec: 39322.0, 60 sec: 36591.0, 300 sec: 35989.3). Total num frames: 318062592. Throughput: 0: 36426.2. Samples: 21845440. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 15:00:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:00:46,927][68109] Updated weights for policy 0, policy_version 19423 (0.0035) [2024-06-12 15:00:47,026][67877] Fps is (10 sec: 37683.2, 60 sec: 36864.0, 300 sec: 35933.7). Total num frames: 318226432. Throughput: 0: 36505.9. Samples: 22072420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 15:00:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:00:50,766][68109] Updated weights for policy 0, policy_version 19433 (0.0042) [2024-06-12 15:00:52,026][67877] Fps is (10 sec: 36044.9, 60 sec: 36317.8, 300 sec: 35933.7). Total num frames: 318423040. Throughput: 0: 36365.4. Samples: 22292260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 15:00:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:00:55,858][68109] Updated weights for policy 0, policy_version 19443 (0.0028) [2024-06-12 15:00:57,026][67877] Fps is (10 sec: 36044.3, 60 sec: 36864.0, 300 sec: 35989.3). Total num frames: 318586880. Throughput: 0: 36704.9. Samples: 22406780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 15:00:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:00:59,803][68109] Updated weights for policy 0, policy_version 19453 (0.0047) [2024-06-12 15:01:02,026][67877] Fps is (10 sec: 34406.5, 60 sec: 36044.9, 300 sec: 35933.7). Total num frames: 318767104. Throughput: 0: 36408.1. Samples: 22618860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 15:01:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:01:04,763][68109] Updated weights for policy 0, policy_version 19463 (0.0042) [2024-06-12 15:01:07,026][67877] Fps is (10 sec: 36044.9, 60 sec: 36591.0, 300 sec: 36044.8). Total num frames: 318947328. Throughput: 0: 36552.4. Samples: 22841120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 15:01:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:01:08,802][68109] Updated weights for policy 0, policy_version 19473 (0.0030) [2024-06-12 15:01:12,026][67877] Fps is (10 sec: 36044.5, 60 sec: 36044.8, 300 sec: 35989.3). Total num frames: 319127552. Throughput: 0: 36477.3. Samples: 22947360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 15:01:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:01:13,869][68089] Signal inference workers to stop experience collection... (250 times) [2024-06-12 15:01:13,870][68089] Signal inference workers to resume experience collection... (250 times) [2024-06-12 15:01:13,912][68109] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-06-12 15:01:13,913][68109] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-06-12 15:01:14,018][68109] Updated weights for policy 0, policy_version 19483 (0.0036) [2024-06-12 15:01:17,026][67877] Fps is (10 sec: 37683.0, 60 sec: 36590.9, 300 sec: 35989.3). Total num frames: 319324160. Throughput: 0: 36432.0. Samples: 23165640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 15:01:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:01:17,109][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000019491_319340544.pth... [2024-06-12 15:01:17,163][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000018962_310673408.pth [2024-06-12 15:01:17,579][68109] Updated weights for policy 0, policy_version 19493 (0.0031) [2024-06-12 15:01:22,026][67877] Fps is (10 sec: 34406.6, 60 sec: 36044.8, 300 sec: 35934.1). Total num frames: 319471616. Throughput: 0: 36775.1. Samples: 23393980. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 15:01:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:01:22,939][68109] Updated weights for policy 0, policy_version 19503 (0.0039) [2024-06-12 15:01:26,974][68109] Updated weights for policy 0, policy_version 19513 (0.0034) [2024-06-12 15:01:27,026][67877] Fps is (10 sec: 37683.3, 60 sec: 36863.9, 300 sec: 36044.8). Total num frames: 319700992. Throughput: 0: 36756.4. Samples: 23499480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 15:01:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:01:32,026][67877] Fps is (10 sec: 37683.3, 60 sec: 36318.0, 300 sec: 36044.8). Total num frames: 319848448. Throughput: 0: 36593.7. Samples: 23719140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 15:01:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:01:32,154][68109] Updated weights for policy 0, policy_version 19523 (0.0028) [2024-06-12 15:01:35,815][68109] Updated weights for policy 0, policy_version 19533 (0.0028) [2024-06-12 15:01:37,026][67877] Fps is (10 sec: 39321.8, 60 sec: 37410.1, 300 sec: 36155.9). Total num frames: 320094208. Throughput: 0: 36421.8. Samples: 23931240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 15:01:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:01:40,648][68109] Updated weights for policy 0, policy_version 19543 (0.0024) [2024-06-12 15:01:42,027][67877] Fps is (10 sec: 37682.6, 60 sec: 36044.7, 300 sec: 36155.9). Total num frames: 320225280. Throughput: 0: 36430.2. Samples: 24046140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 15:01:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:01:44,690][68109] Updated weights for policy 0, policy_version 19553 (0.0034) [2024-06-12 15:01:47,027][67877] Fps is (10 sec: 32767.6, 60 sec: 36590.8, 300 sec: 36155.9). Total num frames: 320421888. Throughput: 0: 36618.5. Samples: 24266700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-12 15:01:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:01:49,606][68109] Updated weights for policy 0, policy_version 19563 (0.0027) [2024-06-12 15:01:52,026][67877] Fps is (10 sec: 37683.7, 60 sec: 36317.9, 300 sec: 36100.4). Total num frames: 320602112. Throughput: 0: 36507.6. Samples: 24483960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-12 15:01:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:01:53,401][68109] Updated weights for policy 0, policy_version 19573 (0.0029) [2024-06-12 15:01:57,026][67877] Fps is (10 sec: 37684.1, 60 sec: 36864.1, 300 sec: 36267.5). Total num frames: 320798720. Throughput: 0: 36625.4. Samples: 24595500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 15:01:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:01:58,432][68109] Updated weights for policy 0, policy_version 19583 (0.0032) [2024-06-12 15:02:02,028][67877] Fps is (10 sec: 37677.4, 60 sec: 36863.0, 300 sec: 36211.2). Total num frames: 320978944. Throughput: 0: 36667.7. Samples: 24815740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 15:02:02,028][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:02:02,512][68109] Updated weights for policy 0, policy_version 19593 (0.0040) [2024-06-12 15:02:06,969][68109] Updated weights for policy 0, policy_version 19603 (0.0034) [2024-06-12 15:02:07,026][67877] Fps is (10 sec: 37683.0, 60 sec: 37137.1, 300 sec: 36322.5). Total num frames: 321175552. Throughput: 0: 36512.5. Samples: 25037040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 15:02:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:02:11,360][68109] Updated weights for policy 0, policy_version 19613 (0.0029) [2024-06-12 15:02:12,027][67877] Fps is (10 sec: 36049.8, 60 sec: 36863.9, 300 sec: 36155.9). Total num frames: 321339392. Throughput: 0: 36692.8. Samples: 25150660. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 15:02:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:02:15,165][68109] Updated weights for policy 0, policy_version 19623 (0.0034) [2024-06-12 15:02:17,027][67877] Fps is (10 sec: 34405.7, 60 sec: 36590.9, 300 sec: 36378.0). Total num frames: 321519616. Throughput: 0: 36830.5. Samples: 25376520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 15:02:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:02:20,610][68109] Updated weights for policy 0, policy_version 19633 (0.0035) [2024-06-12 15:02:22,026][67877] Fps is (10 sec: 39322.2, 60 sec: 37683.2, 300 sec: 36378.0). Total num frames: 321732608. Throughput: 0: 37132.5. Samples: 25602200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 15:02:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:02:24,116][68109] Updated weights for policy 0, policy_version 19643 (0.0049) [2024-06-12 15:02:27,026][67877] Fps is (10 sec: 37683.7, 60 sec: 36591.0, 300 sec: 36433.6). Total num frames: 321896448. Throughput: 0: 37004.1. Samples: 25711320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 15:02:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:02:29,452][68109] Updated weights for policy 0, policy_version 19653 (0.0037) [2024-06-12 15:02:32,026][67877] Fps is (10 sec: 37683.1, 60 sec: 37683.2, 300 sec: 36489.1). Total num frames: 322109440. Throughput: 0: 37033.5. Samples: 25933200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 15:02:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:02:33,614][68109] Updated weights for policy 0, policy_version 19663 (0.0030) [2024-06-12 15:02:37,026][67877] Fps is (10 sec: 36044.6, 60 sec: 36044.8, 300 sec: 36378.0). Total num frames: 322256896. Throughput: 0: 37159.1. Samples: 26156120. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 15:02:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:02:38,320][68109] Updated weights for policy 0, policy_version 19673 (0.0026) [2024-06-12 15:02:42,026][67877] Fps is (10 sec: 36044.6, 60 sec: 37410.2, 300 sec: 36378.0). Total num frames: 322469888. Throughput: 0: 37087.4. Samples: 26264440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-12 15:02:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:02:42,275][68109] Updated weights for policy 0, policy_version 19683 (0.0033) [2024-06-12 15:02:46,932][68109] Updated weights for policy 0, policy_version 19693 (0.0039) [2024-06-12 15:02:47,026][67877] Fps is (10 sec: 39321.8, 60 sec: 37137.2, 300 sec: 36433.8). Total num frames: 322650112. Throughput: 0: 37257.3. Samples: 26492260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-12 15:02:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:02:50,870][68109] Updated weights for policy 0, policy_version 19703 (0.0036) [2024-06-12 15:02:52,026][67877] Fps is (10 sec: 36045.2, 60 sec: 37137.1, 300 sec: 36433.8). Total num frames: 322830336. Throughput: 0: 37104.0. Samples: 26706720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-12 15:02:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:02:56,263][68109] Updated weights for policy 0, policy_version 19713 (0.0040) [2024-06-12 15:02:57,026][67877] Fps is (10 sec: 36045.2, 60 sec: 36864.0, 300 sec: 36544.7). Total num frames: 323010560. Throughput: 0: 37118.9. Samples: 26821000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 15:02:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:03:00,105][68109] Updated weights for policy 0, policy_version 19723 (0.0037) [2024-06-12 15:03:02,026][67877] Fps is (10 sec: 39321.2, 60 sec: 37411.1, 300 sec: 36600.2). Total num frames: 323223552. Throughput: 0: 36943.2. Samples: 27038960. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 15:03:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:03:04,781][68109] Updated weights for policy 0, policy_version 19733 (0.0039) [2024-06-12 15:03:07,027][67877] Fps is (10 sec: 34405.4, 60 sec: 36317.7, 300 sec: 36489.1). Total num frames: 323354624. Throughput: 0: 36789.2. Samples: 27257720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-12 15:03:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:03:09,169][68109] Updated weights for policy 0, policy_version 19743 (0.0032) [2024-06-12 15:03:12,026][67877] Fps is (10 sec: 36045.1, 60 sec: 37410.2, 300 sec: 36544.7). Total num frames: 323584000. Throughput: 0: 36767.1. Samples: 27365840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-12 15:03:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:03:13,705][68109] Updated weights for policy 0, policy_version 19753 (0.0030) [2024-06-12 15:03:17,027][67877] Fps is (10 sec: 37683.4, 60 sec: 36864.0, 300 sec: 36489.1). Total num frames: 323731456. Throughput: 0: 36914.6. Samples: 27594360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-12 15:03:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:03:17,042][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000019759_323731456.pth... [2024-06-12 15:03:17,105][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000019224_314966016.pth [2024-06-12 15:03:18,162][68109] Updated weights for policy 0, policy_version 19763 (0.0032) [2024-06-12 15:03:20,841][68089] Signal inference workers to stop experience collection... (300 times) [2024-06-12 15:03:20,842][68089] Signal inference workers to resume experience collection... 
(300 times) [2024-06-12 15:03:20,859][68109] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-06-12 15:03:20,864][68109] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-06-12 15:03:22,026][67877] Fps is (10 sec: 36044.6, 60 sec: 36864.0, 300 sec: 36489.1). Total num frames: 323944448. Throughput: 0: 36849.8. Samples: 27814360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-12 15:03:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:03:22,094][68109] Updated weights for policy 0, policy_version 19773 (0.0044) [2024-06-12 15:03:27,026][67877] Fps is (10 sec: 36045.3, 60 sec: 36590.9, 300 sec: 36544.7). Total num frames: 324091904. Throughput: 0: 36917.8. Samples: 27925740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-12 15:03:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:03:27,267][68109] Updated weights for policy 0, policy_version 19783 (0.0029) [2024-06-12 15:03:30,610][68109] Updated weights for policy 0, policy_version 19793 (0.0038) [2024-06-12 15:03:32,026][67877] Fps is (10 sec: 36045.0, 60 sec: 36591.0, 300 sec: 36544.7). Total num frames: 324304896. Throughput: 0: 36744.5. Samples: 28145760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-12 15:03:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:03:36,151][68109] Updated weights for policy 0, policy_version 19803 (0.0029) [2024-06-12 15:03:37,026][67877] Fps is (10 sec: 37683.0, 60 sec: 36864.0, 300 sec: 36600.2). Total num frames: 324468736. Throughput: 0: 36867.9. Samples: 28365780. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-12 15:03:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:03:39,739][68109] Updated weights for policy 0, policy_version 19813 (0.0033) [2024-06-12 15:03:42,028][67877] Fps is (10 sec: 36039.2, 60 sec: 36590.1, 300 sec: 36544.5). Total num frames: 324665344. Throughput: 0: 36575.1. Samples: 28466940. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-12 15:03:42,028][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:03:45,351][68109] Updated weights for policy 0, policy_version 19823 (0.0028) [2024-06-12 15:03:47,026][67877] Fps is (10 sec: 39321.7, 60 sec: 36864.0, 300 sec: 36711.3). Total num frames: 324861952. Throughput: 0: 36756.5. Samples: 28693000. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-12 15:03:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:03:48,766][68109] Updated weights for policy 0, policy_version 19833 (0.0033) [2024-06-12 15:03:52,026][67877] Fps is (10 sec: 36050.2, 60 sec: 36590.9, 300 sec: 36544.7). Total num frames: 325025792. Throughput: 0: 36533.9. Samples: 28901740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 15:03:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:03:54,543][68109] Updated weights for policy 0, policy_version 19843 (0.0037) [2024-06-12 15:03:57,026][67877] Fps is (10 sec: 37682.9, 60 sec: 37136.9, 300 sec: 36711.3). Total num frames: 325238784. Throughput: 0: 36772.8. Samples: 29020620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 15:03:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:03:57,671][68109] Updated weights for policy 0, policy_version 19853 (0.0039) [2024-06-12 15:04:02,026][67877] Fps is (10 sec: 34406.3, 60 sec: 35771.7, 300 sec: 36544.7). Total num frames: 325369856. Throughput: 0: 36559.2. Samples: 29239520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 15:04:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:04:03,097][68109] Updated weights for policy 0, policy_version 19863 (0.0044) [2024-06-12 15:04:07,026][67877] Fps is (10 sec: 34406.7, 60 sec: 37137.2, 300 sec: 36655.7). Total num frames: 325582848. Throughput: 0: 36596.9. Samples: 29461220. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 15:04:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:04:07,141][68109] Updated weights for policy 0, policy_version 19873 (0.0034) [2024-06-12 15:04:11,769][68109] Updated weights for policy 0, policy_version 19883 (0.0033) [2024-06-12 15:04:12,026][67877] Fps is (10 sec: 39321.6, 60 sec: 36317.8, 300 sec: 36600.2). Total num frames: 325763072. Throughput: 0: 36607.5. Samples: 29573080. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 15:04:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:04:15,797][68109] Updated weights for policy 0, policy_version 19893 (0.0039) [2024-06-12 15:04:17,026][67877] Fps is (10 sec: 36044.8, 60 sec: 36864.1, 300 sec: 36711.3). Total num frames: 325943296. Throughput: 0: 36543.5. Samples: 29790220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 15:04:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:04:20,851][68109] Updated weights for policy 0, policy_version 19903 (0.0038) [2024-06-12 15:04:22,026][67877] Fps is (10 sec: 34406.4, 60 sec: 36044.8, 300 sec: 36544.7). Total num frames: 326107136. Throughput: 0: 36664.0. Samples: 30015660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 15:04:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:04:24,747][68109] Updated weights for policy 0, policy_version 19913 (0.0042) [2024-06-12 15:04:27,026][67877] Fps is (10 sec: 37683.1, 60 sec: 37137.1, 300 sec: 36711.3). Total num frames: 326320128. Throughput: 0: 36800.3. Samples: 30122900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 15:04:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:04:29,175][68109] Updated weights for policy 0, policy_version 19923 (0.0032) [2024-06-12 15:04:32,026][67877] Fps is (10 sec: 36045.3, 60 sec: 36044.8, 300 sec: 36600.9). Total num frames: 326467584. Throughput: 0: 36813.4. Samples: 30349600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 15:04:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:04:33,843][68109] Updated weights for policy 0, policy_version 19933 (0.0028) [2024-06-12 15:04:37,026][67877] Fps is (10 sec: 37683.4, 60 sec: 37137.1, 300 sec: 36711.3). Total num frames: 326696960. Throughput: 0: 36860.5. Samples: 30560460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 15:04:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:04:38,646][68109] Updated weights for policy 0, policy_version 19943 (0.0030) [2024-06-12 15:04:42,028][67877] Fps is (10 sec: 37677.1, 60 sec: 36317.9, 300 sec: 36711.1). Total num frames: 326844416. Throughput: 0: 36819.7. Samples: 30677560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 15:04:42,029][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:04:43,049][68109] Updated weights for policy 0, policy_version 19953 (0.0033) [2024-06-12 15:04:47,026][67877] Fps is (10 sec: 36045.0, 60 sec: 36591.0, 300 sec: 36655.7). Total num frames: 327057408. Throughput: 0: 36713.0. Samples: 30891600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 15:04:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:04:47,094][68109] Updated weights for policy 0, policy_version 19963 (0.0034) [2024-06-12 15:04:52,028][67877] Fps is (10 sec: 36046.2, 60 sec: 36317.2, 300 sec: 36711.1). Total num frames: 327204864. Throughput: 0: 36834.6. Samples: 31118820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 15:04:52,028][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:04:52,233][68109] Updated weights for policy 0, policy_version 19973 (0.0039) [2024-06-12 15:04:55,871][68109] Updated weights for policy 0, policy_version 19983 (0.0035) [2024-06-12 15:04:57,027][67877] Fps is (10 sec: 34405.5, 60 sec: 36044.8, 300 sec: 36600.2). Total num frames: 327401472. Throughput: 0: 36598.6. Samples: 31220020. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 15:04:57,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:05:00,582][68109] Updated weights for policy 0, policy_version 19993 (0.0030) [2024-06-12 15:05:02,027][67877] Fps is (10 sec: 37687.0, 60 sec: 36864.0, 300 sec: 36711.3). Total num frames: 327581696. Throughput: 0: 36829.2. Samples: 31447540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 15:05:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:05:05,016][68109] Updated weights for policy 0, policy_version 20003 (0.0026) [2024-06-12 15:05:07,026][67877] Fps is (10 sec: 37684.1, 60 sec: 36591.0, 300 sec: 36655.7). Total num frames: 327778304. Throughput: 0: 36681.4. Samples: 31666320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 15:05:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:05:09,294][68109] Updated weights for policy 0, policy_version 20013 (0.0031) [2024-06-12 15:05:12,026][67877] Fps is (10 sec: 40960.2, 60 sec: 37137.0, 300 sec: 36822.3). Total num frames: 327991296. Throughput: 0: 36705.7. Samples: 31774660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 15:05:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:05:13,941][68109] Updated weights for policy 0, policy_version 20023 (0.0039) [2024-06-12 15:05:17,026][67877] Fps is (10 sec: 37682.9, 60 sec: 36864.0, 300 sec: 36766.8). Total num frames: 328155136. Throughput: 0: 36733.7. Samples: 32002620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-12 15:05:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:05:17,050][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000020030_328171520.pth... 
[2024-06-12 15:05:17,096][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000019491_319340544.pth [2024-06-12 15:05:17,876][68109] Updated weights for policy 0, policy_version 20033 (0.0031) [2024-06-12 15:05:22,026][67877] Fps is (10 sec: 36045.3, 60 sec: 37410.2, 300 sec: 36822.4). Total num frames: 328351744. Throughput: 0: 36772.5. Samples: 32215220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-12 15:05:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:05:22,726][68109] Updated weights for policy 0, policy_version 20043 (0.0038) [2024-06-12 15:05:27,026][67877] Fps is (10 sec: 36044.6, 60 sec: 36590.9, 300 sec: 36766.8). Total num frames: 328515584. Throughput: 0: 36688.7. Samples: 32328500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-12 15:05:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:05:28,198][68109] Updated weights for policy 0, policy_version 20053 (0.0037) [2024-06-12 15:05:28,610][68089] Signal inference workers to stop experience collection... (350 times) [2024-06-12 15:05:28,610][68089] Signal inference workers to resume experience collection... (350 times) [2024-06-12 15:05:28,640][68109] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-06-12 15:05:28,641][68109] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-06-12 15:05:31,866][68109] Updated weights for policy 0, policy_version 20063 (0.0033) [2024-06-12 15:05:32,026][67877] Fps is (10 sec: 36044.2, 60 sec: 37410.0, 300 sec: 36822.3). Total num frames: 328712192. Throughput: 0: 36871.8. Samples: 32550840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 15:05:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:05:36,675][68109] Updated weights for policy 0, policy_version 20073 (0.0028) [2024-06-12 15:05:37,026][67877] Fps is (10 sec: 36045.0, 60 sec: 36317.8, 300 sec: 36655.7). Total num frames: 328876032. Throughput: 0: 36782.2. 
Samples: 32773980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 15:05:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:05:40,624][68109] Updated weights for policy 0, policy_version 20083 (0.0035) [2024-06-12 15:05:42,026][67877] Fps is (10 sec: 36045.2, 60 sec: 37138.0, 300 sec: 36766.8). Total num frames: 329072640. Throughput: 0: 36830.4. Samples: 32877380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 15:05:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:05:44,966][68109] Updated weights for policy 0, policy_version 20093 (0.0026) [2024-06-12 15:05:47,026][67877] Fps is (10 sec: 34406.4, 60 sec: 36044.7, 300 sec: 36600.2). Total num frames: 329220096. Throughput: 0: 36684.5. Samples: 33098340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 15:05:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:05:49,761][68109] Updated weights for policy 0, policy_version 20103 (0.0043) [2024-06-12 15:05:52,026][67877] Fps is (10 sec: 37683.3, 60 sec: 37410.9, 300 sec: 36822.4). Total num frames: 329449472. Throughput: 0: 36720.4. Samples: 33318740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 15:05:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:05:53,854][68109] Updated weights for policy 0, policy_version 20113 (0.0033) [2024-06-12 15:05:57,026][67877] Fps is (10 sec: 36045.2, 60 sec: 36318.0, 300 sec: 36655.7). Total num frames: 329580544. Throughput: 0: 36695.2. Samples: 33425940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 15:05:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:05:58,572][68109] Updated weights for policy 0, policy_version 20123 (0.0029) [2024-06-12 15:06:02,026][67877] Fps is (10 sec: 37683.0, 60 sec: 37410.2, 300 sec: 36877.9). Total num frames: 329826304. Throughput: 0: 36617.8. Samples: 33650420. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 15:06:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:06:03,356][68109] Updated weights for policy 0, policy_version 20133 (0.0029) [2024-06-12 15:06:07,026][67877] Fps is (10 sec: 40960.0, 60 sec: 36864.0, 300 sec: 36822.4). Total num frames: 329990144. Throughput: 0: 36866.7. Samples: 33874220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 15:06:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:06:07,504][68109] Updated weights for policy 0, policy_version 20143 (0.0029) [2024-06-12 15:06:12,026][67877] Fps is (10 sec: 34406.1, 60 sec: 36317.9, 300 sec: 36766.8). Total num frames: 330170368. Throughput: 0: 36644.0. Samples: 33977480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 15:06:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:06:12,106][68109] Updated weights for policy 0, policy_version 20153 (0.0037) [2024-06-12 15:06:16,291][68109] Updated weights for policy 0, policy_version 20163 (0.0029) [2024-06-12 15:06:17,026][67877] Fps is (10 sec: 40959.6, 60 sec: 37410.1, 300 sec: 37044.5). Total num frames: 330399744. Throughput: 0: 36737.0. Samples: 34204000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 15:06:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:06:21,099][68109] Updated weights for policy 0, policy_version 20173 (0.0038) [2024-06-12 15:06:22,026][67877] Fps is (10 sec: 36045.1, 60 sec: 36317.8, 300 sec: 36711.3). Total num frames: 330530816. Throughput: 0: 36845.8. Samples: 34432040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 15:06:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:06:24,928][68109] Updated weights for policy 0, policy_version 20183 (0.0041) [2024-06-12 15:06:27,026][67877] Fps is (10 sec: 36044.7, 60 sec: 37410.2, 300 sec: 36989.0). Total num frames: 330760192. Throughput: 0: 36888.4. Samples: 34537360. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-12 15:06:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:06:30,495][68109] Updated weights for policy 0, policy_version 20193 (0.0031) [2024-06-12 15:06:32,026][67877] Fps is (10 sec: 36044.9, 60 sec: 36317.9, 300 sec: 36600.2). Total num frames: 330891264. Throughput: 0: 36987.6. Samples: 34762780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-12 15:06:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:06:33,930][68109] Updated weights for policy 0, policy_version 20203 (0.0033) [2024-06-12 15:06:37,026][67877] Fps is (10 sec: 34406.2, 60 sec: 37137.0, 300 sec: 36877.9). Total num frames: 331104256. Throughput: 0: 36601.7. Samples: 34965820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-12 15:06:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:06:39,637][68109] Updated weights for policy 0, policy_version 20213 (0.0033) [2024-06-12 15:06:42,026][67877] Fps is (10 sec: 39321.6, 60 sec: 36864.0, 300 sec: 36822.4). Total num frames: 331284480. Throughput: 0: 36878.6. Samples: 35085480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 15:06:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:06:42,484][68109] Updated weights for policy 0, policy_version 20223 (0.0037) [2024-06-12 15:06:47,026][67877] Fps is (10 sec: 34406.6, 60 sec: 37137.1, 300 sec: 36766.8). Total num frames: 331448320. Throughput: 0: 36782.6. Samples: 35305640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 15:06:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:06:47,891][68109] Updated weights for policy 0, policy_version 20233 (0.0041) [2024-06-12 15:06:51,717][68109] Updated weights for policy 0, policy_version 20243 (0.0033) [2024-06-12 15:06:52,026][67877] Fps is (10 sec: 39321.2, 60 sec: 37137.0, 300 sec: 36877.9). Total num frames: 331677696. Throughput: 0: 36683.0. Samples: 35524960. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 15:06:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:06:56,634][68109] Updated weights for policy 0, policy_version 20253 (0.0030) [2024-06-12 15:06:57,026][67877] Fps is (10 sec: 39321.8, 60 sec: 37683.2, 300 sec: 36822.5). Total num frames: 331841536. Throughput: 0: 36964.5. Samples: 35640880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 15:06:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:07:00,642][68109] Updated weights for policy 0, policy_version 20263 (0.0032) [2024-06-12 15:07:02,026][67877] Fps is (10 sec: 34406.4, 60 sec: 36590.9, 300 sec: 36766.8). Total num frames: 332021760. Throughput: 0: 36796.4. Samples: 35859840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 15:07:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:07:05,638][68109] Updated weights for policy 0, policy_version 20273 (0.0029) [2024-06-12 15:07:07,027][67877] Fps is (10 sec: 36043.9, 60 sec: 36863.8, 300 sec: 36822.3). Total num frames: 332201984. Throughput: 0: 36637.1. Samples: 36080720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-12 15:07:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:07:09,199][68109] Updated weights for policy 0, policy_version 20283 (0.0034) [2024-06-12 15:07:12,026][67877] Fps is (10 sec: 36045.0, 60 sec: 36864.0, 300 sec: 36822.4). Total num frames: 332382208. Throughput: 0: 36750.2. Samples: 36191120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-12 15:07:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:07:14,438][68109] Updated weights for policy 0, policy_version 20293 (0.0034) [2024-06-12 15:07:17,027][67877] Fps is (10 sec: 37681.7, 60 sec: 36317.5, 300 sec: 36766.7). Total num frames: 332578816. Throughput: 0: 36625.3. Samples: 36410940. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-12 15:07:17,028][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:07:17,038][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000020299_332578816.pth... [2024-06-12 15:07:17,104][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000019759_323731456.pth [2024-06-12 15:07:18,606][68109] Updated weights for policy 0, policy_version 20303 (0.0027) [2024-06-12 15:07:22,027][67877] Fps is (10 sec: 34405.8, 60 sec: 36590.8, 300 sec: 36711.2). Total num frames: 332726272. Throughput: 0: 36876.8. Samples: 36625280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:07:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:07:23,461][68109] Updated weights for policy 0, policy_version 20313 (0.0034) [2024-06-12 15:07:26,598][68089] Signal inference workers to stop experience collection... (400 times) [2024-06-12 15:07:26,598][68089] Signal inference workers to resume experience collection... (400 times) [2024-06-12 15:07:26,621][68109] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-06-12 15:07:26,621][68109] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-06-12 15:07:27,026][67877] Fps is (10 sec: 36046.8, 60 sec: 36317.8, 300 sec: 36711.3). Total num frames: 332939264. Throughput: 0: 36692.8. Samples: 36736660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:07:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:07:27,805][68109] Updated weights for policy 0, policy_version 20323 (0.0037) [2024-06-12 15:07:32,027][67877] Fps is (10 sec: 37683.4, 60 sec: 36863.9, 300 sec: 36766.8). Total num frames: 333103104. Throughput: 0: 36660.3. Samples: 36955360. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:07:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:07:32,741][68109] Updated weights for policy 0, policy_version 20333 (0.0038) [2024-06-12 15:07:37,027][67877] Fps is (10 sec: 32766.8, 60 sec: 36044.6, 300 sec: 36600.1). Total num frames: 333266944. Throughput: 0: 36694.8. Samples: 37176240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 15:07:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:07:37,443][68109] Updated weights for policy 0, policy_version 20343 (0.0029) [2024-06-12 15:07:41,391][68109] Updated weights for policy 0, policy_version 20353 (0.0029) [2024-06-12 15:07:42,026][67877] Fps is (10 sec: 37683.5, 60 sec: 36590.9, 300 sec: 36711.3). Total num frames: 333479936. Throughput: 0: 36526.1. Samples: 37284560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 15:07:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:07:46,288][68109] Updated weights for policy 0, policy_version 20363 (0.0029) [2024-06-12 15:07:47,026][67877] Fps is (10 sec: 37684.5, 60 sec: 36590.9, 300 sec: 36655.7). Total num frames: 333643776. Throughput: 0: 36503.1. Samples: 37502480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 15:07:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:07:50,576][68109] Updated weights for policy 0, policy_version 20373 (0.0034) [2024-06-12 15:07:52,026][67877] Fps is (10 sec: 37683.8, 60 sec: 36318.0, 300 sec: 36766.8). Total num frames: 333856768. Throughput: 0: 36332.7. Samples: 37715680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-12 15:07:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:07:54,812][68109] Updated weights for policy 0, policy_version 20383 (0.0039) [2024-06-12 15:07:57,026][67877] Fps is (10 sec: 36044.8, 60 sec: 36044.7, 300 sec: 36544.6). Total num frames: 334004224. Throughput: 0: 36420.4. Samples: 37830040. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-12 15:07:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:07:59,615][68109] Updated weights for policy 0, policy_version 20393 (0.0038) [2024-06-12 15:08:02,026][67877] Fps is (10 sec: 36044.2, 60 sec: 36590.9, 300 sec: 36822.4). Total num frames: 334217216. Throughput: 0: 36472.0. Samples: 38052160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-12 15:08:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:08:03,378][68109] Updated weights for policy 0, policy_version 20403 (0.0035) [2024-06-12 15:08:07,026][67877] Fps is (10 sec: 36044.7, 60 sec: 36044.9, 300 sec: 36544.6). Total num frames: 334364672. Throughput: 0: 36440.1. Samples: 38265080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 15:08:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:08:08,042][68109] Updated weights for policy 0, policy_version 20413 (0.0035) [2024-06-12 15:08:12,026][67877] Fps is (10 sec: 36045.0, 60 sec: 36590.9, 300 sec: 36766.8). Total num frames: 334577664. Throughput: 0: 36480.0. Samples: 38378260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 15:08:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:08:12,593][68109] Updated weights for policy 0, policy_version 20423 (0.0030) [2024-06-12 15:08:17,026][67877] Fps is (10 sec: 39322.1, 60 sec: 36318.2, 300 sec: 36655.7). Total num frames: 334757888. Throughput: 0: 36605.0. Samples: 38602580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 15:08:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:08:17,136][68109] Updated weights for policy 0, policy_version 20433 (0.0037) [2024-06-12 15:08:21,646][68109] Updated weights for policy 0, policy_version 20443 (0.0032) [2024-06-12 15:08:22,026][67877] Fps is (10 sec: 36045.1, 60 sec: 36864.2, 300 sec: 36766.8). Total num frames: 334938112. Throughput: 0: 36656.0. Samples: 38825740. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 15:08:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:08:25,685][68109] Updated weights for policy 0, policy_version 20453 (0.0028) [2024-06-12 15:08:27,026][67877] Fps is (10 sec: 36044.6, 60 sec: 36317.9, 300 sec: 36655.7). Total num frames: 335118336. Throughput: 0: 36640.5. Samples: 38933380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 15:08:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:08:30,110][68109] Updated weights for policy 0, policy_version 20463 (0.0027) [2024-06-12 15:08:32,026][67877] Fps is (10 sec: 34406.2, 60 sec: 36318.0, 300 sec: 36655.7). Total num frames: 335282176. Throughput: 0: 36677.8. Samples: 39152980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 15:08:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:08:34,798][68109] Updated weights for policy 0, policy_version 20473 (0.0028) [2024-06-12 15:08:37,027][67877] Fps is (10 sec: 39321.3, 60 sec: 37410.3, 300 sec: 36767.0). Total num frames: 335511552. Throughput: 0: 36816.3. Samples: 39372420. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-12 15:08:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:08:39,888][68109] Updated weights for policy 0, policy_version 20483 (0.0034) [2024-06-12 15:08:42,026][67877] Fps is (10 sec: 40959.7, 60 sec: 36864.0, 300 sec: 36711.3). Total num frames: 335691776. Throughput: 0: 36886.2. Samples: 39489920. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-12 15:08:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:08:43,381][68109] Updated weights for policy 0, policy_version 20493 (0.0030) [2024-06-12 15:08:47,026][67877] Fps is (10 sec: 34406.5, 60 sec: 36864.0, 300 sec: 36711.3). Total num frames: 335855616. Throughput: 0: 36804.4. Samples: 39708360. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-12 15:08:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:08:48,601][68109] Updated weights for policy 0, policy_version 20503 (0.0035) [2024-06-12 15:08:52,026][67877] Fps is (10 sec: 37683.3, 60 sec: 36863.9, 300 sec: 36711.3). Total num frames: 336068608. Throughput: 0: 36992.1. Samples: 39929720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-12 15:08:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:08:52,224][68109] Updated weights for policy 0, policy_version 20513 (0.0042) [2024-06-12 15:08:57,024][68109] Updated weights for policy 0, policy_version 20523 (0.0031) [2024-06-12 15:08:57,026][67877] Fps is (10 sec: 37683.3, 60 sec: 37137.1, 300 sec: 36822.3). Total num frames: 336232448. Throughput: 0: 37040.8. Samples: 40045100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-12 15:08:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:09:01,369][68109] Updated weights for policy 0, policy_version 20533 (0.0036) [2024-06-12 15:09:02,026][67877] Fps is (10 sec: 36044.8, 60 sec: 36864.0, 300 sec: 36766.8). Total num frames: 336429056. Throughput: 0: 36910.2. Samples: 40263540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-12 15:09:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:09:06,036][68109] Updated weights for policy 0, policy_version 20543 (0.0034) [2024-06-12 15:09:07,026][67877] Fps is (10 sec: 36045.2, 60 sec: 37137.2, 300 sec: 36711.3). Total num frames: 336592896. Throughput: 0: 36775.5. Samples: 40480640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-12 15:09:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:09:10,951][68109] Updated weights for policy 0, policy_version 20553 (0.0028) [2024-06-12 15:09:12,026][67877] Fps is (10 sec: 36044.9, 60 sec: 36864.0, 300 sec: 36766.8). Total num frames: 336789504. Throughput: 0: 36986.2. Samples: 40597760. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-12 15:09:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:09:14,746][68109] Updated weights for policy 0, policy_version 20563 (0.0026) [2024-06-12 15:09:17,026][67877] Fps is (10 sec: 39321.3, 60 sec: 37137.0, 300 sec: 36877.9). Total num frames: 336986112. Throughput: 0: 37047.5. Samples: 40820120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 15:09:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:09:17,154][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000020569_337002496.pth... [2024-06-12 15:09:17,210][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000020030_328171520.pth [2024-06-12 15:09:19,818][68109] Updated weights for policy 0, policy_version 20573 (0.0029) [2024-06-12 15:09:22,026][67877] Fps is (10 sec: 37683.0, 60 sec: 37137.0, 300 sec: 36766.8). Total num frames: 337166336. Throughput: 0: 37154.7. Samples: 41044380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 15:09:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:09:23,478][68109] Updated weights for policy 0, policy_version 20583 (0.0034) [2024-06-12 15:09:27,027][67877] Fps is (10 sec: 36044.4, 60 sec: 37137.0, 300 sec: 36877.9). Total num frames: 337346560. Throughput: 0: 36964.4. Samples: 41153320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 15:09:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:09:28,999][68109] Updated weights for policy 0, policy_version 20593 (0.0033) [2024-06-12 15:09:31,970][68109] Updated weights for policy 0, policy_version 20603 (0.0027) [2024-06-12 15:09:32,026][67877] Fps is (10 sec: 39322.1, 60 sec: 37956.3, 300 sec: 36822.3). Total num frames: 337559552. Throughput: 0: 37069.0. Samples: 41376460. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-12 15:09:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:09:37,027][67877] Fps is (10 sec: 34406.4, 60 sec: 36317.9, 300 sec: 36767.0). Total num frames: 337690624. Throughput: 0: 37015.5. Samples: 41595420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-12 15:09:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:09:37,778][68109] Updated weights for policy 0, policy_version 20613 (0.0035) [2024-06-12 15:09:38,634][68089] Signal inference workers to stop experience collection... (450 times) [2024-06-12 15:09:38,634][68089] Signal inference workers to resume experience collection... (450 times) [2024-06-12 15:09:38,655][68109] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-06-12 15:09:38,656][68109] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-06-12 15:09:41,086][68109] Updated weights for policy 0, policy_version 20623 (0.0031) [2024-06-12 15:09:42,027][67877] Fps is (10 sec: 37682.5, 60 sec: 37410.1, 300 sec: 36877.9). Total num frames: 337936384. Throughput: 0: 37019.1. Samples: 41710960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-12 15:09:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:09:46,637][68109] Updated weights for policy 0, policy_version 20633 (0.0031) [2024-06-12 15:09:47,027][67877] Fps is (10 sec: 37683.2, 60 sec: 36864.0, 300 sec: 36822.5). Total num frames: 338067456. Throughput: 0: 37094.6. Samples: 41932800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 15:09:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:09:49,662][68109] Updated weights for policy 0, policy_version 20643 (0.0039) [2024-06-12 15:09:52,026][67877] Fps is (10 sec: 34406.6, 60 sec: 36864.0, 300 sec: 36877.9). Total num frames: 338280448. Throughput: 0: 37101.2. Samples: 42150200. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 15:09:52,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:09:54,723][68109] Updated weights for policy 0, policy_version 20653 (0.0033) [2024-06-12 15:09:57,026][67877] Fps is (10 sec: 36045.1, 60 sec: 36591.0, 300 sec: 36766.8). Total num frames: 338427904. Throughput: 0: 37037.3. Samples: 42264440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 15:09:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:09:58,862][68109] Updated weights for policy 0, policy_version 20663 (0.0036) [2024-06-12 15:10:02,026][67877] Fps is (10 sec: 36045.2, 60 sec: 36864.1, 300 sec: 36822.3). Total num frames: 338640896. Throughput: 0: 37006.3. Samples: 42485400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-12 15:10:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:10:03,629][68109] Updated weights for policy 0, policy_version 20673 (0.0030) [2024-06-12 15:10:07,026][67877] Fps is (10 sec: 40960.2, 60 sec: 37410.1, 300 sec: 36766.8). Total num frames: 338837504. Throughput: 0: 37105.8. Samples: 42714140. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-12 15:10:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:10:07,570][68109] Updated weights for policy 0, policy_version 20683 (0.0030) [2024-06-12 15:10:12,026][67877] Fps is (10 sec: 37683.3, 60 sec: 37137.1, 300 sec: 36822.4). Total num frames: 339017728. Throughput: 0: 37120.6. Samples: 42823740. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-12 15:10:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:10:12,093][68109] Updated weights for policy 0, policy_version 20693 (0.0033) [2024-06-12 15:10:16,793][68109] Updated weights for policy 0, policy_version 20703 (0.0029) [2024-06-12 15:10:17,026][67877] Fps is (10 sec: 36044.8, 60 sec: 36864.0, 300 sec: 36766.8). Total num frames: 339197952. Throughput: 0: 36911.5. Samples: 43037480. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-12 15:10:17,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:10:21,178][68109] Updated weights for policy 0, policy_version 20713 (0.0032) [2024-06-12 15:10:22,027][67877] Fps is (10 sec: 36044.0, 60 sec: 36864.0, 300 sec: 36822.3). Total num frames: 339378176. Throughput: 0: 36998.7. Samples: 43260360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-12 15:10:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:10:26,118][68109] Updated weights for policy 0, policy_version 20723 (0.0033) [2024-06-12 15:10:27,026][67877] Fps is (10 sec: 37683.2, 60 sec: 37137.2, 300 sec: 36822.4). Total num frames: 339574784. Throughput: 0: 37085.0. Samples: 43379780. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-12 15:10:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:10:29,919][68109] Updated weights for policy 0, policy_version 20733 (0.0027) [2024-06-12 15:10:32,026][67877] Fps is (10 sec: 39321.9, 60 sec: 36863.9, 300 sec: 36933.4). Total num frames: 339771392. Throughput: 0: 36933.0. Samples: 43594780. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-12 15:10:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:10:34,963][68109] Updated weights for policy 0, policy_version 20743 (0.0039) [2024-06-12 15:10:37,026][67877] Fps is (10 sec: 37683.0, 60 sec: 37683.3, 300 sec: 36877.9). Total num frames: 339951616. Throughput: 0: 37017.4. Samples: 43815980. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-12 15:10:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:10:38,711][68109] Updated weights for policy 0, policy_version 20753 (0.0036) [2024-06-12 15:10:42,026][67877] Fps is (10 sec: 34406.5, 60 sec: 36317.9, 300 sec: 36933.4). Total num frames: 340115456. Throughput: 0: 36869.3. Samples: 43923560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 15:10:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:10:43,885][68109] Updated weights for policy 0, policy_version 20763 (0.0029) [2024-06-12 15:10:47,026][67877] Fps is (10 sec: 34406.4, 60 sec: 37137.1, 300 sec: 36766.8). Total num frames: 340295680. Throughput: 0: 37031.5. Samples: 44151820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 15:10:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:10:47,759][68109] Updated weights for policy 0, policy_version 20773 (0.0034) [2024-06-12 15:10:52,026][67877] Fps is (10 sec: 32768.5, 60 sec: 36044.9, 300 sec: 36822.4). Total num frames: 340443136. Throughput: 0: 36877.0. Samples: 44373600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 15:10:52,026][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:10:52,630][68109] Updated weights for policy 0, policy_version 20783 (0.0031) [2024-06-12 15:10:56,978][68109] Updated weights for policy 0, policy_version 20793 (0.0027) [2024-06-12 15:10:57,026][67877] Fps is (10 sec: 37683.2, 60 sec: 37410.1, 300 sec: 36766.8). Total num frames: 340672512. Throughput: 0: 36859.5. Samples: 44482420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 15:10:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:11:01,502][68109] Updated weights for policy 0, policy_version 20803 (0.0038) [2024-06-12 15:11:02,026][67877] Fps is (10 sec: 40959.4, 60 sec: 36864.0, 300 sec: 36822.3). Total num frames: 340852736. Throughput: 0: 36850.2. Samples: 44695740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 15:11:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:11:06,047][68109] Updated weights for policy 0, policy_version 20813 (0.0036) [2024-06-12 15:11:07,026][67877] Fps is (10 sec: 37683.2, 60 sec: 36864.0, 300 sec: 36877.9). Total num frames: 341049344. Throughput: 0: 36818.7. Samples: 44917200. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 15:11:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:11:10,009][68109] Updated weights for policy 0, policy_version 20823 (0.0028) [2024-06-12 15:11:12,026][67877] Fps is (10 sec: 36044.8, 60 sec: 36590.9, 300 sec: 36655.7). Total num frames: 341213184. Throughput: 0: 36629.3. Samples: 45028100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 15:11:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:11:14,729][68109] Updated weights for policy 0, policy_version 20833 (0.0035) [2024-06-12 15:11:17,026][67877] Fps is (10 sec: 39321.7, 60 sec: 37410.1, 300 sec: 36989.0). Total num frames: 341442560. Throughput: 0: 36850.7. Samples: 45253060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 15:11:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:11:17,039][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000020840_341442560.pth... [2024-06-12 15:11:17,092][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000020299_332578816.pth [2024-06-12 15:11:19,080][68109] Updated weights for policy 0, policy_version 20843 (0.0026) [2024-06-12 15:11:22,027][67877] Fps is (10 sec: 36043.7, 60 sec: 36590.8, 300 sec: 36655.7). Total num frames: 341573632. Throughput: 0: 36899.8. Samples: 45476480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 15:11:22,037][68089] Signal inference workers to stop experience collection... (500 times) [2024-06-12 15:11:22,038][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:11:22,090][68109] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-06-12 15:11:22,148][68089] Signal inference workers to resume experience collection... 
(500 times) [2024-06-12 15:11:22,148][68109] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-06-12 15:11:23,760][68109] Updated weights for policy 0, policy_version 20853 (0.0039) [2024-06-12 15:11:27,026][67877] Fps is (10 sec: 34406.3, 60 sec: 36864.0, 300 sec: 36933.4). Total num frames: 341786624. Throughput: 0: 36966.7. Samples: 45587060. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-12 15:11:27,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:11:27,697][68109] Updated weights for policy 0, policy_version 20863 (0.0034) [2024-06-12 15:11:32,027][67877] Fps is (10 sec: 39322.4, 60 sec: 36590.9, 300 sec: 36822.3). Total num frames: 341966848. Throughput: 0: 36906.6. Samples: 45812620. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-12 15:11:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:11:32,830][68109] Updated weights for policy 0, policy_version 20873 (0.0040) [2024-06-12 15:11:36,095][68109] Updated weights for policy 0, policy_version 20883 (0.0031) [2024-06-12 15:11:37,026][67877] Fps is (10 sec: 37683.1, 60 sec: 36864.0, 300 sec: 36877.9). Total num frames: 342163456. Throughput: 0: 36789.6. Samples: 46029140. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-12 15:11:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:11:41,526][68109] Updated weights for policy 0, policy_version 20893 (0.0035) [2024-06-12 15:11:42,026][67877] Fps is (10 sec: 37683.9, 60 sec: 37137.1, 300 sec: 36933.4). Total num frames: 342343680. Throughput: 0: 36900.5. Samples: 46142940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 15:11:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:11:45,145][68109] Updated weights for policy 0, policy_version 20903 (0.0034) [2024-06-12 15:11:47,026][67877] Fps is (10 sec: 36044.8, 60 sec: 37137.0, 300 sec: 36766.8). Total num frames: 342523904. Throughput: 0: 37050.2. Samples: 46363000. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 15:11:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:11:50,430][68109] Updated weights for policy 0, policy_version 20913 (0.0028) [2024-06-12 15:11:52,026][67877] Fps is (10 sec: 37682.7, 60 sec: 37956.1, 300 sec: 36877.9). Total num frames: 342720512. Throughput: 0: 37071.1. Samples: 46585400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 15:11:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:11:54,109][68109] Updated weights for policy 0, policy_version 20923 (0.0042) [2024-06-12 15:11:57,026][67877] Fps is (10 sec: 36045.2, 60 sec: 36864.0, 300 sec: 36822.4). Total num frames: 342884352. Throughput: 0: 37013.4. Samples: 46693700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 15:11:57,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:11:58,882][68109] Updated weights for policy 0, policy_version 20933 (0.0032) [2024-06-12 15:12:02,026][67877] Fps is (10 sec: 37683.4, 60 sec: 37410.1, 300 sec: 36933.4). Total num frames: 343097344. Throughput: 0: 37253.8. Samples: 46929480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 15:12:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:12:02,407][68109] Updated weights for policy 0, policy_version 20943 (0.0030) [2024-06-12 15:12:07,027][67877] Fps is (10 sec: 36044.2, 60 sec: 36590.9, 300 sec: 36822.3). Total num frames: 343244800. Throughput: 0: 37258.4. Samples: 47153100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 15:12:07,030][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:12:07,853][68109] Updated weights for policy 0, policy_version 20953 (0.0030) [2024-06-12 15:12:11,434][68109] Updated weights for policy 0, policy_version 20963 (0.0037) [2024-06-12 15:12:12,026][67877] Fps is (10 sec: 37683.5, 60 sec: 37683.2, 300 sec: 36933.5). Total num frames: 343474176. Throughput: 0: 37201.4. Samples: 47261120. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-12 15:12:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:12:16,562][68109] Updated weights for policy 0, policy_version 20973 (0.0036) [2024-06-12 15:12:17,026][67877] Fps is (10 sec: 39321.7, 60 sec: 36590.9, 300 sec: 36989.0). Total num frames: 343638016. Throughput: 0: 37536.5. Samples: 47501760. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-12 15:12:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:12:19,730][68109] Updated weights for policy 0, policy_version 20983 (0.0031) [2024-06-12 15:12:22,026][67877] Fps is (10 sec: 37682.8, 60 sec: 37956.4, 300 sec: 36989.0). Total num frames: 343851008. Throughput: 0: 37588.9. Samples: 47720640. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-12 15:12:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:12:24,974][68109] Updated weights for policy 0, policy_version 20993 (0.0025) [2024-06-12 15:12:27,026][67877] Fps is (10 sec: 39322.2, 60 sec: 37410.2, 300 sec: 37044.5). Total num frames: 344031232. Throughput: 0: 37610.7. Samples: 47835420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 15:12:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:12:28,333][68109] Updated weights for policy 0, policy_version 21003 (0.0027) [2024-06-12 15:12:32,026][67877] Fps is (10 sec: 34406.7, 60 sec: 37137.2, 300 sec: 37044.6). Total num frames: 344195072. Throughput: 0: 37664.1. Samples: 48057880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 15:12:32,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:12:33,715][68109] Updated weights for policy 0, policy_version 21013 (0.0029) [2024-06-12 15:12:37,021][68109] Updated weights for policy 0, policy_version 21023 (0.0028) [2024-06-12 15:12:37,026][67877] Fps is (10 sec: 40959.9, 60 sec: 37956.3, 300 sec: 37155.6). Total num frames: 344440832. Throughput: 0: 37708.1. Samples: 48282260. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 15:12:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:12:42,026][67877] Fps is (10 sec: 39321.4, 60 sec: 37410.1, 300 sec: 37100.0). Total num frames: 344588288. Throughput: 0: 37824.8. Samples: 48395820. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-12 15:12:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:12:42,158][68109] Updated weights for policy 0, policy_version 21033 (0.0028) [2024-06-12 15:12:45,914][68109] Updated weights for policy 0, policy_version 21043 (0.0032) [2024-06-12 15:12:47,026][67877] Fps is (10 sec: 34406.2, 60 sec: 37683.2, 300 sec: 37044.5). Total num frames: 344784896. Throughput: 0: 37519.1. Samples: 48617840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-12 15:12:47,036][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:12:50,882][68109] Updated weights for policy 0, policy_version 21053 (0.0026) [2024-06-12 15:12:52,026][67877] Fps is (10 sec: 39321.5, 60 sec: 37683.2, 300 sec: 37211.1). Total num frames: 344981504. Throughput: 0: 37809.4. Samples: 48854520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-12 15:12:52,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:12:54,178][68109] Updated weights for policy 0, policy_version 21063 (0.0036) [2024-06-12 15:12:57,026][67877] Fps is (10 sec: 37682.8, 60 sec: 37956.2, 300 sec: 37100.0). Total num frames: 345161728. Throughput: 0: 37815.9. Samples: 48962840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-12 15:12:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:12:59,288][68109] Updated weights for policy 0, policy_version 21073 (0.0034) [2024-06-12 15:13:02,026][67877] Fps is (10 sec: 37683.6, 60 sec: 37683.3, 300 sec: 37266.7). Total num frames: 345358336. Throughput: 0: 37772.1. Samples: 49201500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-12 15:13:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:13:03,338][68109] Updated weights for policy 0, policy_version 21083 (0.0026) [2024-06-12 15:13:07,026][67877] Fps is (10 sec: 39322.3, 60 sec: 38502.5, 300 sec: 37211.1). Total num frames: 345554944. Throughput: 0: 37850.3. Samples: 49423900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-12 15:13:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:13:07,206][68109] Updated weights for policy 0, policy_version 21093 (0.0029) [2024-06-12 15:13:09,989][68089] Signal inference workers to stop experience collection... (550 times) [2024-06-12 15:13:09,989][68089] Signal inference workers to resume experience collection... (550 times) [2024-06-12 15:13:10,000][68109] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-06-12 15:13:10,000][68109] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-06-12 15:13:12,026][67877] Fps is (10 sec: 37683.0, 60 sec: 37683.2, 300 sec: 37211.1). Total num frames: 345735168. Throughput: 0: 37881.7. Samples: 49540100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 15:13:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:13:12,093][68109] Updated weights for policy 0, policy_version 21103 (0.0029) [2024-06-12 15:13:15,679][68109] Updated weights for policy 0, policy_version 21113 (0.0035) [2024-06-12 15:13:17,026][67877] Fps is (10 sec: 37682.8, 60 sec: 38229.4, 300 sec: 37266.6). Total num frames: 345931776. Throughput: 0: 38038.2. Samples: 49769600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 15:13:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:13:17,045][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000021114_345931776.pth... 
[2024-06-12 15:13:17,095][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000020569_337002496.pth [2024-06-12 15:13:20,843][68109] Updated weights for policy 0, policy_version 21123 (0.0030) [2024-06-12 15:13:22,027][67877] Fps is (10 sec: 39319.6, 60 sec: 37956.0, 300 sec: 37322.1). Total num frames: 346128384. Throughput: 0: 38253.8. Samples: 50003700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 15:13:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:13:24,127][68109] Updated weights for policy 0, policy_version 21133 (0.0031) [2024-06-12 15:13:27,026][67877] Fps is (10 sec: 37683.4, 60 sec: 37956.2, 300 sec: 37377.7). Total num frames: 346308608. Throughput: 0: 38179.1. Samples: 50113880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 15:13:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:13:29,290][68109] Updated weights for policy 0, policy_version 21143 (0.0031) [2024-06-12 15:13:32,026][67877] Fps is (10 sec: 37684.7, 60 sec: 38502.3, 300 sec: 37266.7). Total num frames: 346505216. Throughput: 0: 38520.4. Samples: 50351260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 15:13:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:13:32,975][68109] Updated weights for policy 0, policy_version 21153 (0.0032) [2024-06-12 15:13:37,026][67877] Fps is (10 sec: 39321.7, 60 sec: 37683.2, 300 sec: 37322.2). Total num frames: 346701824. Throughput: 0: 38479.2. Samples: 50586080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 15:13:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:13:37,521][68109] Updated weights for policy 0, policy_version 21163 (0.0032) [2024-06-12 15:13:42,026][67877] Fps is (10 sec: 37683.6, 60 sec: 38229.4, 300 sec: 37377.8). Total num frames: 346882048. Throughput: 0: 38681.0. Samples: 50703480. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 15:13:42,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:13:42,120][68109] Updated weights for policy 0, policy_version 21173 (0.0036) [2024-06-12 15:13:46,316][68109] Updated weights for policy 0, policy_version 21183 (0.0024) [2024-06-12 15:13:47,026][67877] Fps is (10 sec: 37682.9, 60 sec: 38229.3, 300 sec: 37322.2). Total num frames: 347078656. Throughput: 0: 38436.4. Samples: 50931140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 15:13:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:13:50,707][68109] Updated weights for policy 0, policy_version 21193 (0.0040) [2024-06-12 15:13:52,026][67877] Fps is (10 sec: 40959.8, 60 sec: 38502.4, 300 sec: 37488.8). Total num frames: 347291648. Throughput: 0: 38626.6. Samples: 51162100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 15:13:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:13:54,439][68109] Updated weights for policy 0, policy_version 21203 (0.0027) [2024-06-12 15:13:57,027][67877] Fps is (10 sec: 39319.6, 60 sec: 38502.1, 300 sec: 37433.2). Total num frames: 347471872. Throughput: 0: 38658.2. Samples: 51279740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 15:13:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:13:59,135][68109] Updated weights for policy 0, policy_version 21213 (0.0030) [2024-06-12 15:14:02,026][67877] Fps is (10 sec: 37683.7, 60 sec: 38502.4, 300 sec: 37544.4). Total num frames: 347668480. Throughput: 0: 38761.9. Samples: 51513880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 15:14:02,026][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:14:03,095][68109] Updated weights for policy 0, policy_version 21223 (0.0035) [2024-06-12 15:14:07,027][67877] Fps is (10 sec: 39323.2, 60 sec: 38502.3, 300 sec: 37544.3). Total num frames: 347865088. Throughput: 0: 38634.1. Samples: 51742220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 15:14:07,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:14:07,113][68109] Updated weights for policy 0, policy_version 21233 (0.0037) [2024-06-12 15:14:11,291][68109] Updated weights for policy 0, policy_version 21243 (0.0034) [2024-06-12 15:14:12,026][67877] Fps is (10 sec: 39320.9, 60 sec: 38775.4, 300 sec: 37544.3). Total num frames: 348061696. Throughput: 0: 38701.7. Samples: 51855460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 15:14:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:14:16,443][68109] Updated weights for policy 0, policy_version 21253 (0.0027) [2024-06-12 15:14:17,026][67877] Fps is (10 sec: 37683.8, 60 sec: 38502.4, 300 sec: 37544.4). Total num frames: 348241920. Throughput: 0: 38413.9. Samples: 52079880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 15:14:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:14:20,394][68109] Updated weights for policy 0, policy_version 21263 (0.0038) [2024-06-12 15:14:22,026][67877] Fps is (10 sec: 37683.7, 60 sec: 38502.8, 300 sec: 37599.9). Total num frames: 348438528. Throughput: 0: 38411.1. Samples: 52314580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-12 15:14:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:14:24,680][68109] Updated weights for policy 0, policy_version 21273 (0.0038) [2024-06-12 15:14:27,026][67877] Fps is (10 sec: 39321.3, 60 sec: 38775.4, 300 sec: 37544.3). Total num frames: 348635136. Throughput: 0: 38345.3. Samples: 52429020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-12 15:14:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:14:28,564][68109] Updated weights for policy 0, policy_version 21283 (0.0038) [2024-06-12 15:14:32,026][67877] Fps is (10 sec: 36044.4, 60 sec: 38229.4, 300 sec: 37655.4). Total num frames: 348798976. Throughput: 0: 38448.4. Samples: 52661320. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-12 15:14:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:14:32,965][68109] Updated weights for policy 0, policy_version 21293 (0.0037) [2024-06-12 15:14:36,822][68109] Updated weights for policy 0, policy_version 21303 (0.0030) [2024-06-12 15:14:37,026][67877] Fps is (10 sec: 39321.9, 60 sec: 38775.5, 300 sec: 37599.9). Total num frames: 349028352. Throughput: 0: 38508.0. Samples: 52894960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 15:14:37,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:14:41,802][68109] Updated weights for policy 0, policy_version 21313 (0.0039) [2024-06-12 15:14:42,029][67877] Fps is (10 sec: 39311.8, 60 sec: 38500.7, 300 sec: 37710.7). Total num frames: 349192192. Throughput: 0: 38426.7. Samples: 53009020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 15:14:42,029][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:14:45,280][68109] Updated weights for policy 0, policy_version 21323 (0.0036) [2024-06-12 15:14:47,026][67877] Fps is (10 sec: 37682.8, 60 sec: 38775.4, 300 sec: 37711.0). Total num frames: 349405184. Throughput: 0: 38273.6. Samples: 53236200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 15:14:47,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:14:50,428][68109] Updated weights for policy 0, policy_version 21333 (0.0024) [2024-06-12 15:14:52,026][67877] Fps is (10 sec: 37692.4, 60 sec: 37956.2, 300 sec: 37766.5). Total num frames: 349569024. Throughput: 0: 38320.9. Samples: 53466660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-12 15:14:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:14:53,880][68109] Updated weights for policy 0, policy_version 21343 (0.0027) [2024-06-12 15:14:57,026][67877] Fps is (10 sec: 37683.2, 60 sec: 38502.7, 300 sec: 37766.5). Total num frames: 349782016. Throughput: 0: 38340.0. Samples: 53580760. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-12 15:14:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:14:58,407][68109] Updated weights for policy 0, policy_version 21353 (0.0032) [2024-06-12 15:15:02,026][67877] Fps is (10 sec: 40960.2, 60 sec: 38502.3, 300 sec: 37766.5). Total num frames: 349978624. Throughput: 0: 38659.5. Samples: 53819560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-12 15:15:02,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:15:02,673][68109] Updated weights for policy 0, policy_version 21363 (0.0039) [2024-06-12 15:15:07,025][68109] Updated weights for policy 0, policy_version 21373 (0.0039) [2024-06-12 15:15:07,026][67877] Fps is (10 sec: 39321.9, 60 sec: 38502.5, 300 sec: 37822.0). Total num frames: 350175232. Throughput: 0: 38483.5. Samples: 54046340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 15:15:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:15:11,065][68109] Updated weights for policy 0, policy_version 21383 (0.0028) [2024-06-12 15:15:12,026][67877] Fps is (10 sec: 37683.4, 60 sec: 38229.4, 300 sec: 37822.0). Total num frames: 350355456. Throughput: 0: 38471.6. Samples: 54160240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 15:15:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:15:15,076][68109] Updated weights for policy 0, policy_version 21393 (0.0041) [2024-06-12 15:15:17,026][67877] Fps is (10 sec: 37683.1, 60 sec: 38502.4, 300 sec: 37877.6). Total num frames: 350552064. Throughput: 0: 38353.8. Samples: 54387240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 15:15:17,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:15:17,040][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000021396_350552064.pth... 
[2024-06-12 15:15:17,091][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000020840_341442560.pth [2024-06-12 15:15:19,338][68089] Signal inference workers to stop experience collection... (600 times) [2024-06-12 15:15:19,384][68109] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-06-12 15:15:19,449][68089] Signal inference workers to resume experience collection... (600 times) [2024-06-12 15:15:19,449][68109] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-06-12 15:15:19,847][68109] Updated weights for policy 0, policy_version 21403 (0.0029) [2024-06-12 15:15:22,026][67877] Fps is (10 sec: 36044.7, 60 sec: 37956.2, 300 sec: 37766.5). Total num frames: 350715904. Throughput: 0: 38144.0. Samples: 54611440. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-12 15:15:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:15:24,222][68109] Updated weights for policy 0, policy_version 21413 (0.0038) [2024-06-12 15:15:27,026][67877] Fps is (10 sec: 37683.2, 60 sec: 38229.3, 300 sec: 37822.0). Total num frames: 350928896. Throughput: 0: 38327.9. Samples: 54733680. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-12 15:15:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:15:28,253][68109] Updated weights for policy 0, policy_version 21423 (0.0025) [2024-06-12 15:15:32,026][67877] Fps is (10 sec: 40960.2, 60 sec: 38775.5, 300 sec: 37877.6). Total num frames: 351125504. Throughput: 0: 38379.7. Samples: 54963280. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-12 15:15:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:15:33,113][68109] Updated weights for policy 0, policy_version 21433 (0.0033) [2024-06-12 15:15:36,649][68109] Updated weights for policy 0, policy_version 21443 (0.0027) [2024-06-12 15:15:37,026][67877] Fps is (10 sec: 39321.5, 60 sec: 38229.3, 300 sec: 37988.7). Total num frames: 351322112. Throughput: 0: 38415.6. 
Samples: 55195360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-12 15:15:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:15:41,379][68109] Updated weights for policy 0, policy_version 21453 (0.0027) [2024-06-12 15:15:42,026][67877] Fps is (10 sec: 36044.6, 60 sec: 38230.9, 300 sec: 37933.1). Total num frames: 351485952. Throughput: 0: 38297.8. Samples: 55304160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-12 15:15:42,030][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:15:45,464][68109] Updated weights for policy 0, policy_version 21463 (0.0038) [2024-06-12 15:15:47,026][67877] Fps is (10 sec: 34406.4, 60 sec: 37683.2, 300 sec: 38044.2). Total num frames: 351666176. Throughput: 0: 38002.2. Samples: 55529660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-12 15:15:47,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:15:50,227][68109] Updated weights for policy 0, policy_version 21473 (0.0028) [2024-06-12 15:15:52,026][67877] Fps is (10 sec: 39321.3, 60 sec: 38502.4, 300 sec: 37988.7). Total num frames: 351879168. Throughput: 0: 38076.4. Samples: 55759780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 24.0) [2024-06-12 15:15:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:15:54,495][68109] Updated weights for policy 0, policy_version 21483 (0.0027) [2024-06-12 15:15:57,026][67877] Fps is (10 sec: 39321.8, 60 sec: 37956.3, 300 sec: 37988.7). Total num frames: 352059392. Throughput: 0: 38176.9. Samples: 55878200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 24.0) [2024-06-12 15:15:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:15:58,340][68109] Updated weights for policy 0, policy_version 21493 (0.0036) [2024-06-12 15:16:02,026][67877] Fps is (10 sec: 39321.8, 60 sec: 38229.3, 300 sec: 38044.2). Total num frames: 352272384. Throughput: 0: 38281.3. Samples: 56109900. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 24.0) [2024-06-12 15:16:02,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:16:02,717][68109] Updated weights for policy 0, policy_version 21503 (0.0038) [2024-06-12 15:16:06,878][68109] Updated weights for policy 0, policy_version 21513 (0.0024) [2024-06-12 15:16:07,026][67877] Fps is (10 sec: 40959.9, 60 sec: 38229.3, 300 sec: 38155.3). Total num frames: 352468992. Throughput: 0: 38494.6. Samples: 56343700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 15:16:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:16:11,787][68109] Updated weights for policy 0, policy_version 21523 (0.0027) [2024-06-12 15:16:12,026][67877] Fps is (10 sec: 37683.3, 60 sec: 38229.3, 300 sec: 37988.7). Total num frames: 352649216. Throughput: 0: 38294.2. Samples: 56456920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 15:16:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:16:15,444][68109] Updated weights for policy 0, policy_version 21533 (0.0029) [2024-06-12 15:16:17,027][67877] Fps is (10 sec: 39321.2, 60 sec: 38502.3, 300 sec: 38266.4). Total num frames: 352862208. Throughput: 0: 38298.5. Samples: 56686720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 15:16:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:16:20,374][68109] Updated weights for policy 0, policy_version 21543 (0.0043) [2024-06-12 15:16:22,027][67877] Fps is (10 sec: 39321.1, 60 sec: 38775.4, 300 sec: 38155.3). Total num frames: 353042432. Throughput: 0: 38216.4. Samples: 56915100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 15:16:22,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:16:24,284][68109] Updated weights for policy 0, policy_version 21553 (0.0037) [2024-06-12 15:16:27,026][67877] Fps is (10 sec: 36045.3, 60 sec: 38229.4, 300 sec: 38155.3). Total num frames: 353222656. Throughput: 0: 38402.7. Samples: 57032280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 15:16:27,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:16:28,764][68109] Updated weights for policy 0, policy_version 21563 (0.0028) [2024-06-12 15:16:32,026][67877] Fps is (10 sec: 39321.9, 60 sec: 38502.3, 300 sec: 38210.8). Total num frames: 353435648. Throughput: 0: 38753.3. Samples: 57273560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 15:16:32,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:16:32,485][68109] Updated weights for policy 0, policy_version 21573 (0.0042) [2024-06-12 15:16:37,026][67877] Fps is (10 sec: 37682.9, 60 sec: 37956.3, 300 sec: 38155.3). Total num frames: 353599488. Throughput: 0: 38724.9. Samples: 57502400. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-12 15:16:37,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:16:37,158][68109] Updated weights for policy 0, policy_version 21583 (0.0031) [2024-06-12 15:16:40,480][68109] Updated weights for policy 0, policy_version 21593 (0.0024) [2024-06-12 15:16:42,026][67877] Fps is (10 sec: 37683.3, 60 sec: 38775.4, 300 sec: 38266.4). Total num frames: 353812480. Throughput: 0: 38650.2. Samples: 57617460. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-12 15:16:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:16:45,216][68109] Updated weights for policy 0, policy_version 21603 (0.0037) [2024-06-12 15:16:47,026][67877] Fps is (10 sec: 42598.8, 60 sec: 39321.7, 300 sec: 38321.9). Total num frames: 354025472. Throughput: 0: 39053.0. Samples: 57867280. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-12 15:16:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:16:48,503][68109] Updated weights for policy 0, policy_version 21613 (0.0033) [2024-06-12 15:16:52,027][67877] Fps is (10 sec: 40959.7, 60 sec: 39048.5, 300 sec: 38433.0). Total num frames: 354222080. Throughput: 0: 39156.8. Samples: 58105760. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 15:16:52,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:16:52,841][68109] Updated weights for policy 0, policy_version 21623 (0.0043) [2024-06-12 15:16:56,615][68109] Updated weights for policy 0, policy_version 21633 (0.0036) [2024-06-12 15:16:57,026][67877] Fps is (10 sec: 42597.9, 60 sec: 39867.7, 300 sec: 38488.5). Total num frames: 354451456. Throughput: 0: 39240.8. Samples: 58222760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 15:16:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:17:01,043][68109] Updated weights for policy 0, policy_version 21643 (0.0035) [2024-06-12 15:17:02,027][67877] Fps is (10 sec: 39321.5, 60 sec: 39048.5, 300 sec: 38544.0). Total num frames: 354615296. Throughput: 0: 39694.6. Samples: 58472980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 15:17:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:17:05,175][68109] Updated weights for policy 0, policy_version 21653 (0.0036) [2024-06-12 15:17:07,028][67877] Fps is (10 sec: 39316.1, 60 sec: 39593.7, 300 sec: 38543.9). Total num frames: 354844672. Throughput: 0: 39952.2. Samples: 58713000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 15:17:07,028][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:17:09,293][68109] Updated weights for policy 0, policy_version 21663 (0.0032) [2024-06-12 15:17:12,026][67877] Fps is (10 sec: 42598.5, 60 sec: 39867.7, 300 sec: 38655.1). Total num frames: 355041280. Throughput: 0: 39999.9. Samples: 58832280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 15:17:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:17:13,679][68109] Updated weights for policy 0, policy_version 21673 (0.0030) [2024-06-12 15:17:17,026][67877] Fps is (10 sec: 39327.3, 60 sec: 39594.7, 300 sec: 38599.6). Total num frames: 355237888. Throughput: 0: 39939.6. Samples: 59070840. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 15:17:17,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:17:17,055][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000021683_355254272.pth... [2024-06-12 15:17:17,059][68109] Updated weights for policy 0, policy_version 21683 (0.0030) [2024-06-12 15:17:17,098][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000021114_345931776.pth [2024-06-12 15:17:21,885][68109] Updated weights for policy 0, policy_version 21693 (0.0037) [2024-06-12 15:17:22,026][67877] Fps is (10 sec: 37683.7, 60 sec: 39594.8, 300 sec: 38599.6). Total num frames: 355418112. Throughput: 0: 40078.3. Samples: 59305920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 15:17:22,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:17:25,969][68109] Updated weights for policy 0, policy_version 21703 (0.0033) [2024-06-12 15:17:27,026][67877] Fps is (10 sec: 36044.5, 60 sec: 39594.6, 300 sec: 38655.1). Total num frames: 355598336. Throughput: 0: 40082.6. Samples: 59421180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 15:17:27,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:17:27,059][68089] Signal inference workers to stop experience collection... (650 times) [2024-06-12 15:17:27,061][68089] Signal inference workers to resume experience collection... (650 times) [2024-06-12 15:17:27,071][68109] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-06-12 15:17:27,083][68109] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-06-12 15:17:30,295][68109] Updated weights for policy 0, policy_version 21713 (0.0025) [2024-06-12 15:17:32,026][67877] Fps is (10 sec: 40959.9, 60 sec: 39867.8, 300 sec: 38599.6). Total num frames: 355827712. Throughput: 0: 39771.5. Samples: 59657000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 15:17:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:17:33,909][68109] Updated weights for policy 0, policy_version 21723 (0.0029) [2024-06-12 15:17:37,027][67877] Fps is (10 sec: 40959.4, 60 sec: 40140.7, 300 sec: 38710.6). Total num frames: 356007936. Throughput: 0: 39723.9. Samples: 59893340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 15:17:37,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:17:38,324][68109] Updated weights for policy 0, policy_version 21733 (0.0031) [2024-06-12 15:17:42,026][67877] Fps is (10 sec: 36045.0, 60 sec: 39594.7, 300 sec: 38655.1). Total num frames: 356188160. Throughput: 0: 39737.0. Samples: 60010920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 15:17:42,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:17:42,437][68109] Updated weights for policy 0, policy_version 21743 (0.0031) [2024-06-12 15:17:46,854][68109] Updated weights for policy 0, policy_version 21753 (0.0032) [2024-06-12 15:17:47,026][67877] Fps is (10 sec: 40961.0, 60 sec: 39867.7, 300 sec: 38766.2). Total num frames: 356417536. Throughput: 0: 39277.5. Samples: 60240460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 15:17:47,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:17:51,261][68109] Updated weights for policy 0, policy_version 21763 (0.0037) [2024-06-12 15:17:52,027][67877] Fps is (10 sec: 40958.3, 60 sec: 39594.5, 300 sec: 38766.2). Total num frames: 356597760. Throughput: 0: 39064.5. Samples: 60470860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 15:17:52,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:17:55,114][68109] Updated weights for policy 0, policy_version 21773 (0.0030) [2024-06-12 15:17:57,026][67877] Fps is (10 sec: 39321.2, 60 sec: 39321.6, 300 sec: 38821.7). Total num frames: 356810752. Throughput: 0: 39209.8. Samples: 60596720. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 15:17:57,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:17:59,162][68109] Updated weights for policy 0, policy_version 21783 (0.0031) [2024-06-12 15:18:02,027][67877] Fps is (10 sec: 37684.1, 60 sec: 39321.6, 300 sec: 38710.6). Total num frames: 356974592. Throughput: 0: 39173.2. Samples: 60833640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 15:18:02,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:18:03,565][68109] Updated weights for policy 0, policy_version 21793 (0.0034) [2024-06-12 15:18:06,705][68109] Updated weights for policy 0, policy_version 21803 (0.0027) [2024-06-12 15:18:07,026][67877] Fps is (10 sec: 40960.6, 60 sec: 39595.7, 300 sec: 38932.8). Total num frames: 357220352. Throughput: 0: 39307.1. Samples: 61074740. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-12 15:18:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:18:11,705][68109] Updated weights for policy 0, policy_version 21813 (0.0029) [2024-06-12 15:18:12,026][67877] Fps is (10 sec: 42599.1, 60 sec: 39321.7, 300 sec: 38877.3). Total num frames: 357400576. Throughput: 0: 39622.3. Samples: 61204180. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-12 15:18:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:18:15,075][68109] Updated weights for policy 0, policy_version 21823 (0.0032) [2024-06-12 15:18:17,026][67877] Fps is (10 sec: 37683.1, 60 sec: 39321.6, 300 sec: 38877.4). Total num frames: 357597184. Throughput: 0: 39533.8. Samples: 61436020. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-12 15:18:17,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:18:19,764][68109] Updated weights for policy 0, policy_version 21833 (0.0035) [2024-06-12 15:18:22,026][67877] Fps is (10 sec: 42598.2, 60 sec: 40140.8, 300 sec: 39043.9). Total num frames: 357826560. Throughput: 0: 39689.1. Samples: 61679340. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-12 15:18:22,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:18:22,982][68109] Updated weights for policy 0, policy_version 21843 (0.0023) [2024-06-12 15:18:27,026][67877] Fps is (10 sec: 40959.9, 60 sec: 40140.8, 300 sec: 38988.4). Total num frames: 358006784. Throughput: 0: 39928.8. Samples: 61807720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-12 15:18:27,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:18:27,580][68109] Updated weights for policy 0, policy_version 21853 (0.0027) [2024-06-12 15:18:31,833][68109] Updated weights for policy 0, policy_version 21863 (0.0027) [2024-06-12 15:18:32,026][67877] Fps is (10 sec: 37683.4, 60 sec: 39594.7, 300 sec: 38988.4). Total num frames: 358203392. Throughput: 0: 40314.7. Samples: 62054620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-12 15:18:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:18:35,716][68109] Updated weights for policy 0, policy_version 21873 (0.0035) [2024-06-12 15:18:37,026][67877] Fps is (10 sec: 42598.5, 60 sec: 40414.0, 300 sec: 39155.0). Total num frames: 358432768. Throughput: 0: 40527.9. Samples: 62294600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 15:18:37,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:18:39,724][68109] Updated weights for policy 0, policy_version 21883 (0.0029) [2024-06-12 15:18:42,026][67877] Fps is (10 sec: 40960.0, 60 sec: 40413.9, 300 sec: 39099.5). Total num frames: 358612992. Throughput: 0: 40499.2. Samples: 62419180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 15:18:42,032][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:18:43,539][68109] Updated weights for policy 0, policy_version 21893 (0.0034) [2024-06-12 15:18:47,026][67877] Fps is (10 sec: 39321.6, 60 sec: 40140.8, 300 sec: 39099.4). Total num frames: 358825984. Throughput: 0: 40629.9. Samples: 62661980. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 15:18:47,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:18:48,203][68109] Updated weights for policy 0, policy_version 21903 (0.0035) [2024-06-12 15:18:51,362][68109] Updated weights for policy 0, policy_version 21913 (0.0030) [2024-06-12 15:18:52,026][67877] Fps is (10 sec: 40960.1, 60 sec: 40414.1, 300 sec: 39155.1). Total num frames: 359022592. Throughput: 0: 40536.4. Samples: 62898880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 15:18:52,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:18:56,651][68109] Updated weights for policy 0, policy_version 21923 (0.0034) [2024-06-12 15:18:57,027][67877] Fps is (10 sec: 37682.4, 60 sec: 39867.7, 300 sec: 39099.4). Total num frames: 359202816. Throughput: 0: 40399.8. Samples: 63022180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 15:18:57,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:18:59,340][68109] Updated weights for policy 0, policy_version 21933 (0.0030) [2024-06-12 15:19:02,026][67877] Fps is (10 sec: 40959.7, 60 sec: 40960.1, 300 sec: 39210.5). Total num frames: 359432192. Throughput: 0: 40507.1. Samples: 63258840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 15:19:02,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:19:04,611][68109] Updated weights for policy 0, policy_version 21943 (0.0031) [2024-06-12 15:19:07,026][67877] Fps is (10 sec: 44237.3, 60 sec: 40413.8, 300 sec: 39266.1). Total num frames: 359645184. Throughput: 0: 40383.9. Samples: 63496620. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-12 15:19:07,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:19:07,482][68109] Updated weights for policy 0, policy_version 21953 (0.0027) [2024-06-12 15:19:12,026][67877] Fps is (10 sec: 37682.9, 60 sec: 40140.7, 300 sec: 39210.5). Total num frames: 359809024. Throughput: 0: 40414.1. Samples: 63626360. 
Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-12 15:19:12,032][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:19:12,503][68109] Updated weights for policy 0, policy_version 21963 (0.0031) [2024-06-12 15:19:15,345][68109] Updated weights for policy 0, policy_version 21973 (0.0022) [2024-06-12 15:19:17,026][67877] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 39377.1). Total num frames: 360054784. Throughput: 0: 40231.5. Samples: 63865040. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-12 15:19:17,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:19:17,037][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000021976_360054784.pth... [2024-06-12 15:19:17,088][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000021396_350552064.pth [2024-06-12 15:19:20,713][68109] Updated weights for policy 0, policy_version 21983 (0.0036) [2024-06-12 15:19:22,026][67877] Fps is (10 sec: 40960.4, 60 sec: 39867.7, 300 sec: 39266.1). Total num frames: 360218624. Throughput: 0: 40469.3. Samples: 64115720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 15:19:22,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:19:23,253][68109] Updated weights for policy 0, policy_version 21993 (0.0026) [2024-06-12 15:19:27,027][67877] Fps is (10 sec: 36044.4, 60 sec: 40140.7, 300 sec: 39377.1). Total num frames: 360415232. Throughput: 0: 40351.4. Samples: 64235000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 15:19:27,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:19:27,700][68089] Signal inference workers to stop experience collection... (700 times) [2024-06-12 15:19:27,701][68089] Signal inference workers to resume experience collection... 
(700 times) [2024-06-12 15:19:27,738][68109] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-06-12 15:19:27,738][68109] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-06-12 15:19:28,579][68109] Updated weights for policy 0, policy_version 22003 (0.0028) [2024-06-12 15:19:31,423][68109] Updated weights for policy 0, policy_version 22013 (0.0028) [2024-06-12 15:19:32,026][67877] Fps is (10 sec: 45875.0, 60 sec: 41233.0, 300 sec: 39488.2). Total num frames: 360677376. Throughput: 0: 40423.0. Samples: 64481020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 15:19:32,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:19:36,886][68109] Updated weights for policy 0, policy_version 22023 (0.0025) [2024-06-12 15:19:37,026][67877] Fps is (10 sec: 40960.8, 60 sec: 39867.7, 300 sec: 39433.0). Total num frames: 360824832. Throughput: 0: 40655.1. Samples: 64728360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 15:19:37,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:19:39,544][68109] Updated weights for policy 0, policy_version 22033 (0.0028) [2024-06-12 15:19:42,026][67877] Fps is (10 sec: 36044.8, 60 sec: 40413.8, 300 sec: 39432.7). Total num frames: 361037824. Throughput: 0: 40562.8. Samples: 64847500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 15:19:42,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:19:44,914][68109] Updated weights for policy 0, policy_version 22043 (0.0029) [2024-06-12 15:19:47,026][67877] Fps is (10 sec: 42597.9, 60 sec: 40413.8, 300 sec: 39599.3). Total num frames: 361250816. Throughput: 0: 40547.1. Samples: 65083460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 15:19:47,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:19:47,931][68109] Updated weights for policy 0, policy_version 22053 (0.0043) [2024-06-12 15:19:52,026][67877] Fps is (10 sec: 40959.9, 60 sec: 40413.8, 300 sec: 39543.8). 
Total num frames: 361447424. Throughput: 0: 40752.0. Samples: 65330460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 15:19:52,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:19:53,172][68109] Updated weights for policy 0, policy_version 22063 (0.0031) [2024-06-12 15:19:56,102][68109] Updated weights for policy 0, policy_version 22073 (0.0034) [2024-06-12 15:19:57,027][67877] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 39599.3). Total num frames: 361660416. Throughput: 0: 40615.5. Samples: 65454060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 15:19:57,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:20:01,199][68109] Updated weights for policy 0, policy_version 22083 (0.0030) [2024-06-12 15:20:02,026][67877] Fps is (10 sec: 37683.7, 60 sec: 39867.8, 300 sec: 39488.2). Total num frames: 361824256. Throughput: 0: 40574.3. Samples: 65690880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 15:20:02,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:20:04,338][68109] Updated weights for policy 0, policy_version 22093 (0.0030) [2024-06-12 15:20:07,026][67877] Fps is (10 sec: 39322.1, 60 sec: 40140.8, 300 sec: 39654.8). Total num frames: 362053632. Throughput: 0: 40084.4. Samples: 65919520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-12 15:20:07,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:20:09,223][68109] Updated weights for policy 0, policy_version 22103 (0.0029) [2024-06-12 15:20:12,026][67877] Fps is (10 sec: 42598.5, 60 sec: 40687.0, 300 sec: 39654.8). Total num frames: 362250240. Throughput: 0: 40243.8. Samples: 66045960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-12 15:20:12,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:20:12,990][68109] Updated weights for policy 0, policy_version 22113 (0.0029) [2024-06-12 15:20:17,026][67877] Fps is (10 sec: 37683.2, 60 sec: 39594.7, 300 sec: 39710.4). Total num frames: 362430464. 
Throughput: 0: 40054.3. Samples: 66283460. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-12 15:20:17,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:20:17,400][68109] Updated weights for policy 0, policy_version 22123 (0.0031) [2024-06-12 15:20:21,057][68109] Updated weights for policy 0, policy_version 22133 (0.0029) [2024-06-12 15:20:22,026][67877] Fps is (10 sec: 39321.9, 60 sec: 40414.0, 300 sec: 39710.4). Total num frames: 362643456. Throughput: 0: 39894.3. Samples: 66523600. Policy #0 lag: (min: 1.0, avg: 9.6, max: 24.0) [2024-06-12 15:20:22,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:20:25,536][68109] Updated weights for policy 0, policy_version 22143 (0.0041) [2024-06-12 15:20:27,027][67877] Fps is (10 sec: 39320.7, 60 sec: 40140.8, 300 sec: 39654.8). Total num frames: 362823680. Throughput: 0: 39799.4. Samples: 66638480. Policy #0 lag: (min: 1.0, avg: 9.6, max: 24.0) [2024-06-12 15:20:27,028][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:20:29,524][68109] Updated weights for policy 0, policy_version 22153 (0.0028) [2024-06-12 15:20:32,026][67877] Fps is (10 sec: 40959.2, 60 sec: 39594.7, 300 sec: 39765.9). Total num frames: 363053056. Throughput: 0: 39957.3. Samples: 66881540. Policy #0 lag: (min: 1.0, avg: 9.6, max: 24.0) [2024-06-12 15:20:32,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:20:34,159][68109] Updated weights for policy 0, policy_version 22163 (0.0036) [2024-06-12 15:20:37,026][67877] Fps is (10 sec: 42599.5, 60 sec: 40413.9, 300 sec: 39877.0). Total num frames: 363249664. Throughput: 0: 40040.1. Samples: 67132260. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 15:20:37,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:20:37,308][68109] Updated weights for policy 0, policy_version 22173 (0.0025) [2024-06-12 15:20:41,519][68109] Updated weights for policy 0, policy_version 22183 (0.0027) [2024-06-12 15:20:42,026][67877] Fps is (10 sec: 39321.7, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 363446272. Throughput: 0: 39882.3. Samples: 67248760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 15:20:42,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:20:45,661][68109] Updated weights for policy 0, policy_version 22193 (0.0029) [2024-06-12 15:20:47,027][67877] Fps is (10 sec: 40959.1, 60 sec: 40140.7, 300 sec: 39932.5). Total num frames: 363659264. Throughput: 0: 40197.1. Samples: 67499760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 15:20:47,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:20:49,534][68109] Updated weights for policy 0, policy_version 22203 (0.0033) [2024-06-12 15:20:52,026][67877] Fps is (10 sec: 40960.4, 60 sec: 40140.9, 300 sec: 39988.1). Total num frames: 363855872. Throughput: 0: 40597.4. Samples: 67746400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 15:20:52,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:20:53,603][68109] Updated weights for policy 0, policy_version 22213 (0.0028) [2024-06-12 15:20:56,963][68109] Updated weights for policy 0, policy_version 22223 (0.0028) [2024-06-12 15:20:57,026][67877] Fps is (10 sec: 44237.3, 60 sec: 40687.0, 300 sec: 40099.1). Total num frames: 364101632. Throughput: 0: 40561.6. Samples: 67871240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 15:20:57,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:21:01,589][68109] Updated weights for policy 0, policy_version 22233 (0.0033) [2024-06-12 15:21:02,027][67877] Fps is (10 sec: 44233.5, 60 sec: 41232.6, 300 sec: 40099.1). 
Total num frames: 364298240. Throughput: 0: 41093.6. Samples: 68132700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 15:21:02,028][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:21:04,544][68109] Updated weights for policy 0, policy_version 22243 (0.0028) [2024-06-12 15:21:07,026][67877] Fps is (10 sec: 40960.6, 60 sec: 40960.0, 300 sec: 40210.2). Total num frames: 364511232. Throughput: 0: 41434.6. Samples: 68388160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 15:21:07,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:21:08,812][68109] Updated weights for policy 0, policy_version 22253 (0.0035) [2024-06-12 15:21:12,026][67877] Fps is (10 sec: 40962.8, 60 sec: 40960.0, 300 sec: 40154.7). Total num frames: 364707840. Throughput: 0: 41819.8. Samples: 68520360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 15:21:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:21:12,603][68109] Updated weights for policy 0, policy_version 22263 (0.0030) [2024-06-12 15:21:16,494][68109] Updated weights for policy 0, policy_version 22273 (0.0033) [2024-06-12 15:21:17,026][67877] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 40265.8). Total num frames: 364920832. Throughput: 0: 41808.5. Samples: 68762920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 15:21:17,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:21:17,038][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000022273_364920832.pth... [2024-06-12 15:21:17,084][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000021683_355254272.pth [2024-06-12 15:21:20,200][68089] Signal inference workers to stop experience collection... (750 times) [2024-06-12 15:21:20,233][68109] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-06-12 15:21:20,309][68089] Signal inference workers to resume experience collection... 
(750 times) [2024-06-12 15:21:20,309][68109] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-06-12 15:21:20,441][68109] Updated weights for policy 0, policy_version 22283 (0.0034) [2024-06-12 15:21:22,026][67877] Fps is (10 sec: 44236.8, 60 sec: 41779.1, 300 sec: 40432.4). Total num frames: 365150208. Throughput: 0: 41794.2. Samples: 69013000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 15:21:22,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:21:24,642][68109] Updated weights for policy 0, policy_version 22293 (0.0032) [2024-06-12 15:21:27,027][67877] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 40376.8). Total num frames: 365346816. Throughput: 0: 41947.4. Samples: 69136400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 15:21:27,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:21:28,282][68109] Updated weights for policy 0, policy_version 22303 (0.0037) [2024-06-12 15:21:32,026][67877] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 40543.5). Total num frames: 365559808. Throughput: 0: 41954.3. Samples: 69387700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 15:21:32,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:21:32,264][68109] Updated weights for policy 0, policy_version 22313 (0.0028) [2024-06-12 15:21:36,253][68109] Updated weights for policy 0, policy_version 22323 (0.0023) [2024-06-12 15:21:37,026][67877] Fps is (10 sec: 40960.9, 60 sec: 41779.2, 300 sec: 40487.9). Total num frames: 365756416. Throughput: 0: 42021.8. Samples: 69637380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 15:21:37,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:21:40,133][68109] Updated weights for policy 0, policy_version 22333 (0.0027) [2024-06-12 15:21:42,026][67877] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 40432.4). Total num frames: 365953024. Throughput: 0: 42168.9. Samples: 69768840. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 15:21:42,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:21:43,632][68109] Updated weights for policy 0, policy_version 22343 (0.0029) [2024-06-12 15:21:47,027][67877] Fps is (10 sec: 44235.8, 60 sec: 42325.3, 300 sec: 40599.0). Total num frames: 366198784. Throughput: 0: 41976.9. Samples: 70021640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 15:21:47,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:21:47,684][68109] Updated weights for policy 0, policy_version 22353 (0.0026) [2024-06-12 15:21:51,585][68109] Updated weights for policy 0, policy_version 22363 (0.0031) [2024-06-12 15:21:52,026][67877] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 40543.5). Total num frames: 366411776. Throughput: 0: 42039.1. Samples: 70279920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 15:21:52,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:21:55,557][68109] Updated weights for policy 0, policy_version 22373 (0.0030) [2024-06-12 15:21:57,026][67877] Fps is (10 sec: 39322.5, 60 sec: 41506.2, 300 sec: 40599.0). Total num frames: 366592000. Throughput: 0: 42009.8. Samples: 70410800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 15:21:57,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:21:59,358][68109] Updated weights for policy 0, policy_version 22383 (0.0036) [2024-06-12 15:22:02,026][67877] Fps is (10 sec: 42598.0, 60 sec: 42325.8, 300 sec: 40654.7). Total num frames: 366837760. Throughput: 0: 42118.6. Samples: 70658260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 15:22:02,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:22:03,063][68109] Updated weights for policy 0, policy_version 22393 (0.0026) [2024-06-12 15:22:07,026][67877] Fps is (10 sec: 44236.0, 60 sec: 42052.1, 300 sec: 40654.5). Total num frames: 367034368. Throughput: 0: 42343.4. Samples: 70918460. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-12 15:22:07,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:22:07,379][68109] Updated weights for policy 0, policy_version 22403 (0.0026) [2024-06-12 15:22:10,850][68109] Updated weights for policy 0, policy_version 22413 (0.0036) [2024-06-12 15:22:12,027][67877] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 40654.5). Total num frames: 367230976. Throughput: 0: 42420.1. Samples: 71045300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-12 15:22:12,027][67877] Avg episode reward: [(0, '0.000')] [2024-06-12 15:22:15,452][68109] Updated weights for policy 0, policy_version 22423 (0.0030) [2024-06-12 15:22:17,026][67877] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 40821.1). Total num frames: 367460352. Throughput: 0: 42358.7. Samples: 71293840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-12 15:22:17,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:22:18,833][68109] Updated weights for policy 0, policy_version 22433 (0.0033) [2024-06-12 15:22:22,026][67877] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 40821.2). Total num frames: 367640576. Throughput: 0: 42398.6. Samples: 71545320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-12 15:22:22,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:22:23,257][68109] Updated weights for policy 0, policy_version 22443 (0.0043) [2024-06-12 15:22:26,556][68109] Updated weights for policy 0, policy_version 22453 (0.0022) [2024-06-12 15:22:27,026][67877] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 40821.1). Total num frames: 367869952. Throughput: 0: 42250.2. Samples: 71670100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 15:22:27,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:22:30,739][68109] Updated weights for policy 0, policy_version 22463 (0.0034) [2024-06-12 15:22:32,026][67877] Fps is (10 sec: 47514.3, 60 sec: 42598.5, 300 sec: 41043.4). 
Total num frames: 368115712. Throughput: 0: 42391.8. Samples: 71929260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 15:22:32,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:22:34,659][68109] Updated weights for policy 0, policy_version 22474 (0.0031) [2024-06-12 15:22:37,026][67877] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 40987.8). Total num frames: 368279552. Throughput: 0: 42287.0. Samples: 72182840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 15:22:37,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:22:38,548][68109] Updated weights for policy 0, policy_version 22484 (0.0027) [2024-06-12 15:22:42,026][67877] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 41043.3). Total num frames: 368525312. Throughput: 0: 42087.5. Samples: 72304740. Policy #0 lag: (min: 0.0, avg: 7.5, max: 18.0) [2024-06-12 15:22:42,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:22:42,334][68109] Updated weights for policy 0, policy_version 22494 (0.0022) [2024-06-12 15:22:46,709][68109] Updated weights for policy 0, policy_version 22504 (0.0039) [2024-06-12 15:22:47,026][67877] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 41154.4). Total num frames: 368738304. Throughput: 0: 42313.8. Samples: 72562380. Policy #0 lag: (min: 0.0, avg: 7.5, max: 18.0) [2024-06-12 15:22:47,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:22:49,966][68109] Updated weights for policy 0, policy_version 22514 (0.0028) [2024-06-12 15:22:52,026][67877] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 41098.9). Total num frames: 368934912. Throughput: 0: 42208.6. Samples: 72817840. Policy #0 lag: (min: 0.0, avg: 7.5, max: 18.0) [2024-06-12 15:22:52,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:22:54,394][68109] Updated weights for policy 0, policy_version 22524 (0.0029) [2024-06-12 15:22:57,026][67877] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 41376.6). Total num frames: 369180672. 
Throughput: 0: 42234.7. Samples: 72945860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 15:22:57,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:22:57,504][68109] Updated weights for policy 0, policy_version 22534 (0.0032) [2024-06-12 15:23:02,026][67877] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41098.8). Total num frames: 369344512. Throughput: 0: 42346.8. Samples: 73199440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 15:23:02,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:23:02,226][68109] Updated weights for policy 0, policy_version 22544 (0.0029) [2024-06-12 15:23:05,115][68109] Updated weights for policy 0, policy_version 22554 (0.0028) [2024-06-12 15:23:06,581][68089] Signal inference workers to stop experience collection... (800 times) [2024-06-12 15:23:06,581][68089] Signal inference workers to resume experience collection... (800 times) [2024-06-12 15:23:06,597][68109] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-06-12 15:23:06,626][68109] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-06-12 15:23:07,027][67877] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 41321.0). Total num frames: 369590272. Throughput: 0: 42402.2. Samples: 73453420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 15:23:07,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:23:09,675][68109] Updated weights for policy 0, policy_version 22564 (0.0030) [2024-06-12 15:23:12,026][67877] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 41265.5). Total num frames: 369770496. Throughput: 0: 42409.5. Samples: 73578520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 25.0) [2024-06-12 15:23:12,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:23:13,053][68109] Updated weights for policy 0, policy_version 22574 (0.0034) [2024-06-12 15:23:17,026][67877] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 41265.5). Total num frames: 369999872. 
Throughput: 0: 42359.5. Samples: 73835440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 25.0) [2024-06-12 15:23:17,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:23:17,036][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000022583_369999872.pth... [2024-06-12 15:23:17,087][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000021976_360054784.pth [2024-06-12 15:23:17,305][68109] Updated weights for policy 0, policy_version 22584 (0.0032) [2024-06-12 15:23:20,985][68109] Updated weights for policy 0, policy_version 22594 (0.0031) [2024-06-12 15:23:22,026][67877] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 41376.5). Total num frames: 370212864. Throughput: 0: 42375.1. Samples: 74089720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 25.0) [2024-06-12 15:23:22,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:23:24,908][68109] Updated weights for policy 0, policy_version 22604 (0.0032) [2024-06-12 15:23:27,026][67877] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 41432.1). Total num frames: 370425856. Throughput: 0: 42560.4. Samples: 74219960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 15:23:27,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:23:29,020][68109] Updated weights for policy 0, policy_version 22614 (0.0023) [2024-06-12 15:23:32,027][67877] Fps is (10 sec: 44236.8, 60 sec: 42325.2, 300 sec: 41432.1). Total num frames: 370655232. Throughput: 0: 42283.1. Samples: 74465120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 15:23:32,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:23:32,649][68109] Updated weights for policy 0, policy_version 22624 (0.0037) [2024-06-12 15:23:36,482][68109] Updated weights for policy 0, policy_version 22634 (0.0029) [2024-06-12 15:23:37,026][67877] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 41487.6). Total num frames: 370851840. Throughput: 0: 42642.6. Samples: 74736760. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 15:23:37,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:23:40,111][68109] Updated weights for policy 0, policy_version 22644 (0.0031) [2024-06-12 15:23:42,026][67877] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 41543.2). Total num frames: 371081216. Throughput: 0: 42589.9. Samples: 74862400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-12 15:23:42,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:23:44,233][68109] Updated weights for policy 0, policy_version 22654 (0.0034) [2024-06-12 15:23:47,026][67877] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 41543.2). Total num frames: 371277824. Throughput: 0: 42808.9. Samples: 75125840. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-12 15:23:47,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:23:47,679][68109] Updated weights for policy 0, policy_version 22664 (0.0036) [2024-06-12 15:23:51,930][68109] Updated weights for policy 0, policy_version 22674 (0.0032) [2024-06-12 15:23:52,027][67877] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 41654.2). Total num frames: 371490816. Throughput: 0: 42889.3. Samples: 75383440. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-12 15:23:52,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:23:55,236][68109] Updated weights for policy 0, policy_version 22684 (0.0026) [2024-06-12 15:23:57,026][67877] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 41709.8). Total num frames: 371736576. Throughput: 0: 42969.6. Samples: 75512160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 15:23:57,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:23:59,138][68109] Updated weights for policy 0, policy_version 22694 (0.0028) [2024-06-12 15:24:02,026][67877] Fps is (10 sec: 45876.1, 60 sec: 43417.6, 300 sec: 41709.8). Total num frames: 371949568. Throughput: 0: 43219.6. Samples: 75780320. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 15:24:02,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:24:02,469][68109] Updated weights for policy 0, policy_version 22704 (0.0030) [2024-06-12 15:24:06,836][68109] Updated weights for policy 0, policy_version 22714 (0.0022) [2024-06-12 15:24:07,026][67877] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 41820.9). Total num frames: 372146176. Throughput: 0: 43353.0. Samples: 76040600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-12 15:24:07,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:24:10,080][68109] Updated weights for policy 0, policy_version 22724 (0.0030) [2024-06-12 15:24:12,026][67877] Fps is (10 sec: 42598.1, 60 sec: 43417.5, 300 sec: 41765.3). Total num frames: 372375552. Throughput: 0: 43457.8. Samples: 76175560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 15:24:12,027][67877] Avg episode reward: [(0, '0.004')] [2024-06-12 15:24:14,340][68109] Updated weights for policy 0, policy_version 22734 (0.0033) [2024-06-12 15:24:17,026][67877] Fps is (10 sec: 47513.6, 60 sec: 43690.7, 300 sec: 42043.0). Total num frames: 372621312. Throughput: 0: 43670.3. Samples: 76430280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 15:24:17,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:24:17,142][68109] Updated weights for policy 0, policy_version 22744 (0.0031) [2024-06-12 15:24:21,881][68109] Updated weights for policy 0, policy_version 22754 (0.0032) [2024-06-12 15:24:22,026][67877] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 41987.5). Total num frames: 372801536. Throughput: 0: 43419.1. Samples: 76690620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 15:24:22,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:24:25,241][68109] Updated weights for policy 0, policy_version 22764 (0.0026) [2024-06-12 15:24:27,026][67877] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 41876.4). 
Total num frames: 373030912. Throughput: 0: 43536.8. Samples: 76821560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 15:24:27,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:24:29,111][68109] Updated weights for policy 0, policy_version 22774 (0.0027) [2024-06-12 15:24:32,031][67877] Fps is (10 sec: 44214.6, 60 sec: 43141.0, 300 sec: 42097.8). Total num frames: 373243904. Throughput: 0: 43514.6. Samples: 77084220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 15:24:32,032][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:24:33,168][68109] Updated weights for policy 0, policy_version 22784 (0.0027) [2024-06-12 15:24:36,474][68109] Updated weights for policy 0, policy_version 22794 (0.0026) [2024-06-12 15:24:37,026][67877] Fps is (10 sec: 44236.6, 60 sec: 43690.7, 300 sec: 42154.1). Total num frames: 373473280. Throughput: 0: 43753.0. Samples: 77352320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 15:24:37,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:24:40,565][68109] Updated weights for policy 0, policy_version 22804 (0.0026) [2024-06-12 15:24:42,026][67877] Fps is (10 sec: 44259.0, 60 sec: 43417.5, 300 sec: 42154.1). Total num frames: 373686272. Throughput: 0: 43849.8. Samples: 77485400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 15:24:42,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:24:43,720][68109] Updated weights for policy 0, policy_version 22814 (0.0032) [2024-06-12 15:24:44,295][68089] Signal inference workers to stop experience collection... (850 times) [2024-06-12 15:24:44,339][68109] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-06-12 15:24:44,407][68089] Signal inference workers to resume experience collection... (850 times) [2024-06-12 15:24:44,408][68109] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-06-12 15:24:47,026][67877] Fps is (10 sec: 42599.2, 60 sec: 43690.7, 300 sec: 42209.7). 
Total num frames: 373899264. Throughput: 0: 43537.4. Samples: 77739500. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-12 15:24:47,026][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:24:47,893][68109] Updated weights for policy 0, policy_version 22824 (0.0027) [2024-06-12 15:24:51,500][68109] Updated weights for policy 0, policy_version 22834 (0.0024) [2024-06-12 15:24:52,026][67877] Fps is (10 sec: 44237.3, 60 sec: 43963.9, 300 sec: 42265.2). Total num frames: 374128640. Throughput: 0: 43605.4. Samples: 78002840. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-12 15:24:52,027][67877] Avg episode reward: [(0, '0.004')] [2024-06-12 15:24:55,133][68109] Updated weights for policy 0, policy_version 22844 (0.0026) [2024-06-12 15:24:57,026][67877] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42376.2). Total num frames: 374325248. Throughput: 0: 43661.9. Samples: 78140340. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-12 15:24:57,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:24:58,695][68109] Updated weights for policy 0, policy_version 22854 (0.0032) [2024-06-12 15:25:02,026][67877] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42376.3). Total num frames: 374554624. Throughput: 0: 43821.4. Samples: 78402240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 15:25:02,027][67877] Avg episode reward: [(0, '0.004')] [2024-06-12 15:25:02,704][68109] Updated weights for policy 0, policy_version 22864 (0.0028) [2024-06-12 15:25:06,072][68109] Updated weights for policy 0, policy_version 22874 (0.0034) [2024-06-12 15:25:07,026][67877] Fps is (10 sec: 49152.0, 60 sec: 44509.9, 300 sec: 42598.4). Total num frames: 374816768. Throughput: 0: 43886.7. Samples: 78665520. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 15:25:07,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:25:10,074][68109] Updated weights for policy 0, policy_version 22884 (0.0024) [2024-06-12 15:25:12,027][67877] Fps is (10 sec: 44235.9, 60 sec: 43690.6, 300 sec: 42598.4). Total num frames: 374996992. Throughput: 0: 44025.2. Samples: 78802700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 15:25:12,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:25:13,463][68109] Updated weights for policy 0, policy_version 22894 (0.0033) [2024-06-12 15:25:17,027][67877] Fps is (10 sec: 39321.1, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 375209984. Throughput: 0: 43763.0. Samples: 79053340. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-12 15:25:17,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:25:17,157][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000022902_375226368.pth... [2024-06-12 15:25:17,213][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000022273_364920832.pth [2024-06-12 15:25:17,838][68109] Updated weights for policy 0, policy_version 22904 (0.0026) [2024-06-12 15:25:21,298][68109] Updated weights for policy 0, policy_version 22914 (0.0031) [2024-06-12 15:25:22,026][67877] Fps is (10 sec: 45876.0, 60 sec: 44236.9, 300 sec: 42820.6). Total num frames: 375455744. Throughput: 0: 43678.8. Samples: 79317860. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-12 15:25:22,027][67877] Avg episode reward: [(0, '0.001')] [2024-06-12 15:25:25,741][68109] Updated weights for policy 0, policy_version 22924 (0.0023) [2024-06-12 15:25:27,027][67877] Fps is (10 sec: 45875.3, 60 sec: 43963.7, 300 sec: 42765.0). Total num frames: 375668736. Throughput: 0: 43761.3. Samples: 79454660. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-12 15:25:27,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:25:28,422][68109] Updated weights for policy 0, policy_version 22934 (0.0024) [2024-06-12 15:25:32,026][67877] Fps is (10 sec: 42598.1, 60 sec: 43967.4, 300 sec: 42820.5). Total num frames: 375881728. Throughput: 0: 44109.2. Samples: 79724420. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-12 15:25:32,027][67877] Avg episode reward: [(0, '0.004')] [2024-06-12 15:25:33,301][68109] Updated weights for policy 0, policy_version 22944 (0.0035) [2024-06-12 15:25:35,911][68109] Updated weights for policy 0, policy_version 22954 (0.0030) [2024-06-12 15:25:37,026][67877] Fps is (10 sec: 45875.6, 60 sec: 44236.8, 300 sec: 42987.2). Total num frames: 376127488. Throughput: 0: 43885.3. Samples: 79977680. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-12 15:25:37,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:25:40,394][68109] Updated weights for policy 0, policy_version 22964 (0.0022) [2024-06-12 15:25:42,026][67877] Fps is (10 sec: 44236.8, 60 sec: 43963.8, 300 sec: 42931.7). Total num frames: 376324096. Throughput: 0: 43920.9. Samples: 80116780. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-12 15:25:42,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:25:43,226][68109] Updated weights for policy 0, policy_version 22974 (0.0032) [2024-06-12 15:25:47,026][67877] Fps is (10 sec: 40960.1, 60 sec: 43963.7, 300 sec: 42987.2). Total num frames: 376537088. Throughput: 0: 44271.5. Samples: 80394460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-12 15:25:47,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:25:47,492][68109] Updated weights for policy 0, policy_version 22984 (0.0026) [2024-06-12 15:25:50,293][68109] Updated weights for policy 0, policy_version 22994 (0.0028) [2024-06-12 15:25:52,026][67877] Fps is (10 sec: 45875.5, 60 sec: 44236.8, 300 sec: 42987.2). 
Total num frames: 376782848. Throughput: 0: 44320.0. Samples: 80659920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-12 15:25:52,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:25:54,789][68109] Updated weights for policy 0, policy_version 23004 (0.0028) [2024-06-12 15:25:57,026][67877] Fps is (10 sec: 49152.1, 60 sec: 45056.0, 300 sec: 43153.9). Total num frames: 377028608. Throughput: 0: 44349.5. Samples: 80798420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-12 15:25:57,027][67877] Avg episode reward: [(0, '0.004')] [2024-06-12 15:25:57,767][68109] Updated weights for policy 0, policy_version 23014 (0.0023) [2024-06-12 15:26:01,895][68109] Updated weights for policy 0, policy_version 23024 (0.0027) [2024-06-12 15:26:02,026][67877] Fps is (10 sec: 44236.6, 60 sec: 44509.8, 300 sec: 43098.2). Total num frames: 377225216. Throughput: 0: 44929.5. Samples: 81075160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 15:26:02,027][67877] Avg episode reward: [(0, '0.004')] [2024-06-12 15:26:04,729][68109] Updated weights for policy 0, policy_version 23034 (0.0030) [2024-06-12 15:26:07,026][67877] Fps is (10 sec: 44237.2, 60 sec: 44236.9, 300 sec: 43264.9). Total num frames: 377470976. Throughput: 0: 44964.1. Samples: 81341240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 15:26:07,027][67877] Avg episode reward: [(0, '0.005')] [2024-06-12 15:26:08,971][68109] Updated weights for policy 0, policy_version 23044 (0.0027) [2024-06-12 15:26:10,135][68089] Signal inference workers to stop experience collection... (900 times) [2024-06-12 15:26:10,135][68089] Signal inference workers to resume experience collection... (900 times) [2024-06-12 15:26:10,157][68109] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-06-12 15:26:10,158][68109] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-06-12 15:26:12,026][67877] Fps is (10 sec: 47513.9, 60 sec: 45056.1, 300 sec: 43320.4). 
Total num frames: 377700352. Throughput: 0: 45057.5. Samples: 81482240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 15:26:12,027][67877] Avg episode reward: [(0, '0.005')] [2024-06-12 15:26:12,058][68109] Updated weights for policy 0, policy_version 23054 (0.0026) [2024-06-12 15:26:16,200][68109] Updated weights for policy 0, policy_version 23064 (0.0026) [2024-06-12 15:26:17,026][67877] Fps is (10 sec: 42597.9, 60 sec: 44783.0, 300 sec: 43209.3). Total num frames: 377896960. Throughput: 0: 45044.9. Samples: 81751440. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-12 15:26:17,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:26:19,329][68109] Updated weights for policy 0, policy_version 23074 (0.0032) [2024-06-12 15:26:22,026][67877] Fps is (10 sec: 42598.2, 60 sec: 44509.8, 300 sec: 43320.4). Total num frames: 378126336. Throughput: 0: 45352.9. Samples: 82018560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-12 15:26:22,027][67877] Avg episode reward: [(0, '0.005')] [2024-06-12 15:26:23,424][68109] Updated weights for policy 0, policy_version 23084 (0.0024) [2024-06-12 15:26:26,609][68109] Updated weights for policy 0, policy_version 23094 (0.0023) [2024-06-12 15:26:27,026][67877] Fps is (10 sec: 49151.9, 60 sec: 45329.1, 300 sec: 43487.0). Total num frames: 378388480. Throughput: 0: 45388.9. Samples: 82159280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-12 15:26:27,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:26:30,527][68109] Updated weights for policy 0, policy_version 23104 (0.0022) [2024-06-12 15:26:32,026][67877] Fps is (10 sec: 49151.4, 60 sec: 45602.1, 300 sec: 43598.1). Total num frames: 378617856. Throughput: 0: 45349.2. Samples: 82435180. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 15:26:32,027][67877] Avg episode reward: [(0, '0.004')] [2024-06-12 15:26:34,132][68109] Updated weights for policy 0, policy_version 23114 (0.0019) [2024-06-12 15:26:37,026][67877] Fps is (10 sec: 42598.2, 60 sec: 44782.9, 300 sec: 43598.1). Total num frames: 378814464. Throughput: 0: 45515.0. Samples: 82708100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 15:26:37,027][67877] Avg episode reward: [(0, '0.004')] [2024-06-12 15:26:37,401][68109] Updated weights for policy 0, policy_version 23124 (0.0032) [2024-06-12 15:26:41,301][68109] Updated weights for policy 0, policy_version 23134 (0.0034) [2024-06-12 15:26:42,031][67877] Fps is (10 sec: 42578.9, 60 sec: 45325.5, 300 sec: 43541.9). Total num frames: 379043840. Throughput: 0: 45427.2. Samples: 82842860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 15:26:42,031][67877] Avg episode reward: [(0, '0.005')] [2024-06-12 15:26:44,504][68109] Updated weights for policy 0, policy_version 23144 (0.0033) [2024-06-12 15:26:47,026][67877] Fps is (10 sec: 47514.0, 60 sec: 45875.2, 300 sec: 43653.6). Total num frames: 379289600. Throughput: 0: 45296.5. Samples: 83113500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 15:26:47,027][67877] Avg episode reward: [(0, '0.003')] [2024-06-12 15:26:48,287][68109] Updated weights for policy 0, policy_version 23154 (0.0030) [2024-06-12 15:26:51,866][68109] Updated weights for policy 0, policy_version 23164 (0.0029) [2024-06-12 15:26:52,027][67877] Fps is (10 sec: 47535.2, 60 sec: 45602.0, 300 sec: 43820.2). Total num frames: 379518976. Throughput: 0: 45592.2. Samples: 83392900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 15:26:52,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:26:56,036][68109] Updated weights for policy 0, policy_version 23174 (0.0032) [2024-06-12 15:26:57,027][67877] Fps is (10 sec: 44236.2, 60 sec: 45055.9, 300 sec: 43709.2). 
Total num frames: 379731968. Throughput: 0: 45416.2. Samples: 83525980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 15:26:57,027][67877] Avg episode reward: [(0, '0.005')] [2024-06-12 15:26:58,992][68109] Updated weights for policy 0, policy_version 23184 (0.0029) [2024-06-12 15:27:02,026][67877] Fps is (10 sec: 44237.5, 60 sec: 45602.2, 300 sec: 43820.3). Total num frames: 379961344. Throughput: 0: 45570.7. Samples: 83802120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 15:27:02,027][67877] Avg episode reward: [(0, '0.004')] [2024-06-12 15:27:02,978][68109] Updated weights for policy 0, policy_version 23194 (0.0025) [2024-06-12 15:27:05,682][68109] Updated weights for policy 0, policy_version 23204 (0.0029) [2024-06-12 15:27:07,026][67877] Fps is (10 sec: 47514.1, 60 sec: 45602.0, 300 sec: 43986.9). Total num frames: 380207104. Throughput: 0: 45650.6. Samples: 84072840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-12 15:27:07,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:27:10,380][68109] Updated weights for policy 0, policy_version 23214 (0.0034) [2024-06-12 15:27:12,026][67877] Fps is (10 sec: 44236.3, 60 sec: 45055.9, 300 sec: 43875.8). Total num frames: 380403712. Throughput: 0: 45436.8. Samples: 84203940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-12 15:27:12,027][67877] Avg episode reward: [(0, '0.004')] [2024-06-12 15:27:13,243][68109] Updated weights for policy 0, policy_version 23224 (0.0026) [2024-06-12 15:27:17,026][67877] Fps is (10 sec: 42598.5, 60 sec: 45602.1, 300 sec: 44042.4). Total num frames: 380633088. Throughput: 0: 45340.5. Samples: 84475500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-12 15:27:17,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:27:17,043][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000023233_380649472.pth... 
[2024-06-12 15:27:17,094][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000022583_369999872.pth [2024-06-12 15:27:17,410][68109] Updated weights for policy 0, policy_version 23234 (0.0027) [2024-06-12 15:27:20,757][68109] Updated weights for policy 0, policy_version 23244 (0.0029) [2024-06-12 15:27:22,026][67877] Fps is (10 sec: 49152.3, 60 sec: 46148.2, 300 sec: 44153.5). Total num frames: 380895232. Throughput: 0: 45129.0. Samples: 84738900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 15:27:22,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:27:24,945][68109] Updated weights for policy 0, policy_version 23254 (0.0027) [2024-06-12 15:27:27,026][67877] Fps is (10 sec: 47513.9, 60 sec: 45329.1, 300 sec: 44042.4). Total num frames: 381108224. Throughput: 0: 45259.9. Samples: 84879340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 15:27:27,027][67877] Avg episode reward: [(0, '0.002')] [2024-06-12 15:27:28,027][68109] Updated weights for policy 0, policy_version 23264 (0.0029) [2024-06-12 15:27:32,026][67877] Fps is (10 sec: 40960.2, 60 sec: 44783.0, 300 sec: 44153.5). Total num frames: 381304832. Throughput: 0: 45484.0. Samples: 85160280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 15:27:32,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:27:32,100][68109] Updated weights for policy 0, policy_version 23274 (0.0026) [2024-06-12 15:27:33,293][68089] Signal inference workers to stop experience collection... (950 times) [2024-06-12 15:27:33,293][68089] Signal inference workers to resume experience collection... 
(950 times) [2024-06-12 15:27:33,333][68109] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-06-12 15:27:33,333][68109] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-06-12 15:27:35,125][68109] Updated weights for policy 0, policy_version 23284 (0.0028) [2024-06-12 15:27:37,026][67877] Fps is (10 sec: 44236.1, 60 sec: 45602.1, 300 sec: 44153.5). Total num frames: 381550592. Throughput: 0: 45330.3. Samples: 85432760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 15:27:37,027][67877] Avg episode reward: [(0, '0.007')] [2024-06-12 15:27:39,374][68109] Updated weights for policy 0, policy_version 23294 (0.0026) [2024-06-12 15:27:42,026][67877] Fps is (10 sec: 49151.9, 60 sec: 45878.8, 300 sec: 44264.6). Total num frames: 381796352. Throughput: 0: 45437.0. Samples: 85570640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 15:27:42,027][67877] Avg episode reward: [(0, '0.007')] [2024-06-12 15:27:42,134][68109] Updated weights for policy 0, policy_version 23304 (0.0022) [2024-06-12 15:27:45,964][68109] Updated weights for policy 0, policy_version 23314 (0.0030) [2024-06-12 15:27:47,026][67877] Fps is (10 sec: 49152.7, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 382042112. Throughput: 0: 45796.0. Samples: 85862940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 15:27:47,027][67877] Avg episode reward: [(0, '0.008')] [2024-06-12 15:27:49,345][68109] Updated weights for policy 0, policy_version 23324 (0.0023) [2024-06-12 15:27:52,026][67877] Fps is (10 sec: 42598.3, 60 sec: 45056.1, 300 sec: 44209.0). Total num frames: 382222336. Throughput: 0: 45757.8. Samples: 86131940. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 15:27:52,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:27:53,191][68109] Updated weights for policy 0, policy_version 23334 (0.0028) [2024-06-12 15:27:56,180][68109] Updated weights for policy 0, policy_version 23344 (0.0035) [2024-06-12 15:27:57,026][67877] Fps is (10 sec: 42598.5, 60 sec: 45602.3, 300 sec: 44486.7). Total num frames: 382468096. Throughput: 0: 45715.3. Samples: 86261120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 15:27:57,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:28:00,138][68109] Updated weights for policy 0, policy_version 23354 (0.0025) [2024-06-12 15:28:02,026][67877] Fps is (10 sec: 50790.0, 60 sec: 46148.2, 300 sec: 44542.3). Total num frames: 382730240. Throughput: 0: 45966.6. Samples: 86544000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 15:28:02,027][67877] Avg episode reward: [(0, '0.004')] [2024-06-12 15:28:03,833][68109] Updated weights for policy 0, policy_version 23364 (0.0025) [2024-06-12 15:28:07,027][67877] Fps is (10 sec: 47512.2, 60 sec: 45602.0, 300 sec: 44653.3). Total num frames: 382943232. Throughput: 0: 46354.4. Samples: 86824860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 15:28:07,027][67877] Avg episode reward: [(0, '0.005')] [2024-06-12 15:28:07,273][68109] Updated weights for policy 0, policy_version 23374 (0.0029) [2024-06-12 15:28:10,881][68109] Updated weights for policy 0, policy_version 23384 (0.0032) [2024-06-12 15:28:12,026][67877] Fps is (10 sec: 44237.0, 60 sec: 46148.3, 300 sec: 44653.3). Total num frames: 383172608. Throughput: 0: 46098.6. Samples: 86953780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 15:28:12,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:28:14,183][68109] Updated weights for policy 0, policy_version 23394 (0.0032) [2024-06-12 15:28:17,026][67877] Fps is (10 sec: 45876.5, 60 sec: 46148.3, 300 sec: 44708.9). 
Total num frames: 383401984. Throughput: 0: 46182.2. Samples: 87238480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 15:28:17,027][67877] Avg episode reward: [(0, '0.004')] [2024-06-12 15:28:17,866][68109] Updated weights for policy 0, policy_version 23404 (0.0029) [2024-06-12 15:28:21,293][68109] Updated weights for policy 0, policy_version 23414 (0.0036) [2024-06-12 15:28:22,026][67877] Fps is (10 sec: 47513.6, 60 sec: 45875.2, 300 sec: 44820.0). Total num frames: 383647744. Throughput: 0: 46304.5. Samples: 87516460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 15:28:22,027][67877] Avg episode reward: [(0, '0.007')] [2024-06-12 15:28:25,099][68109] Updated weights for policy 0, policy_version 23424 (0.0030) [2024-06-12 15:28:27,027][67877] Fps is (10 sec: 45874.0, 60 sec: 45875.0, 300 sec: 44764.4). Total num frames: 383860736. Throughput: 0: 46281.5. Samples: 87653320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-12 15:28:27,027][67877] Avg episode reward: [(0, '0.005')] [2024-06-12 15:28:28,367][68109] Updated weights for policy 0, policy_version 23434 (0.0027) [2024-06-12 15:28:32,026][67877] Fps is (10 sec: 44237.3, 60 sec: 46421.4, 300 sec: 44875.5). Total num frames: 384090112. Throughput: 0: 46050.7. Samples: 87935220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-12 15:28:32,027][67877] Avg episode reward: [(0, '0.008')] [2024-06-12 15:28:32,073][68109] Updated weights for policy 0, policy_version 23444 (0.0035) [2024-06-12 15:28:35,267][68109] Updated weights for policy 0, policy_version 23454 (0.0029) [2024-06-12 15:28:37,026][67877] Fps is (10 sec: 49152.5, 60 sec: 46694.4, 300 sec: 44986.6). Total num frames: 384352256. Throughput: 0: 46102.6. Samples: 88206560. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-12 15:28:37,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:28:39,651][68109] Updated weights for policy 0, policy_version 23464 (0.0034) [2024-06-12 15:28:42,027][67877] Fps is (10 sec: 45874.1, 60 sec: 45875.1, 300 sec: 44986.6). Total num frames: 384548864. Throughput: 0: 46524.2. Samples: 88354720. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-12 15:28:42,027][67877] Avg episode reward: [(0, '0.005')] [2024-06-12 15:28:42,367][68109] Updated weights for policy 0, policy_version 23474 (0.0025) [2024-06-12 15:28:46,612][68109] Updated weights for policy 0, policy_version 23484 (0.0030) [2024-06-12 15:28:47,026][67877] Fps is (10 sec: 40960.1, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 384761856. Throughput: 0: 46256.9. Samples: 88625560. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-12 15:28:47,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:28:49,512][68109] Updated weights for policy 0, policy_version 23494 (0.0032) [2024-06-12 15:28:52,026][67877] Fps is (10 sec: 45875.5, 60 sec: 46421.3, 300 sec: 44986.6). Total num frames: 385007616. Throughput: 0: 45832.6. Samples: 88887320. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-12 15:28:52,027][67877] Avg episode reward: [(0, '0.005')] [2024-06-12 15:28:53,738][68109] Updated weights for policy 0, policy_version 23504 (0.0025) [2024-06-12 15:28:54,640][68089] Signal inference workers to stop experience collection... (1000 times) [2024-06-12 15:28:54,686][68109] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-06-12 15:28:54,693][68089] Signal inference workers to resume experience collection... 
(1000 times) [2024-06-12 15:28:54,705][68109] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-06-12 15:28:56,596][68109] Updated weights for policy 0, policy_version 23514 (0.0029) [2024-06-12 15:28:57,026][67877] Fps is (10 sec: 49152.2, 60 sec: 46421.3, 300 sec: 45097.6). Total num frames: 385253376. Throughput: 0: 46300.0. Samples: 89037280. Policy #0 lag: (min: 2.0, avg: 11.0, max: 24.0) [2024-06-12 15:28:57,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:29:01,122][68109] Updated weights for policy 0, policy_version 23524 (0.0025) [2024-06-12 15:29:02,026][67877] Fps is (10 sec: 45875.4, 60 sec: 45602.2, 300 sec: 45153.2). Total num frames: 385466368. Throughput: 0: 46060.8. Samples: 89311220. Policy #0 lag: (min: 2.0, avg: 11.0, max: 24.0) [2024-06-12 15:29:02,027][67877] Avg episode reward: [(0, '0.007')] [2024-06-12 15:29:03,745][68109] Updated weights for policy 0, policy_version 23534 (0.0027) [2024-06-12 15:29:07,027][67877] Fps is (10 sec: 45874.3, 60 sec: 46148.3, 300 sec: 45208.7). Total num frames: 385712128. Throughput: 0: 46114.0. Samples: 89591600. Policy #0 lag: (min: 2.0, avg: 11.0, max: 24.0) [2024-06-12 15:29:07,027][67877] Avg episode reward: [(0, '0.009')] [2024-06-12 15:29:07,798][68109] Updated weights for policy 0, policy_version 23544 (0.0025) [2024-06-12 15:29:10,692][68109] Updated weights for policy 0, policy_version 23554 (0.0025) [2024-06-12 15:29:12,026][67877] Fps is (10 sec: 47513.5, 60 sec: 46148.2, 300 sec: 45153.2). Total num frames: 385941504. Throughput: 0: 46048.6. Samples: 89725500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 15:29:12,027][67877] Avg episode reward: [(0, '0.007')] [2024-06-12 15:29:14,639][68109] Updated weights for policy 0, policy_version 23564 (0.0029) [2024-06-12 15:29:17,026][67877] Fps is (10 sec: 47514.8, 60 sec: 46421.3, 300 sec: 45375.4). Total num frames: 386187264. Throughput: 0: 45991.9. Samples: 90004860. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 15:29:17,027][67877] Avg episode reward: [(0, '0.007')] [2024-06-12 15:29:17,102][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000023572_386203648.pth... [2024-06-12 15:29:17,149][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000022902_375226368.pth [2024-06-12 15:29:18,161][68109] Updated weights for policy 0, policy_version 23574 (0.0023) [2024-06-12 15:29:22,026][67877] Fps is (10 sec: 44236.8, 60 sec: 45602.1, 300 sec: 45264.3). Total num frames: 386383872. Throughput: 0: 46052.0. Samples: 90278900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 15:29:22,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:29:22,196][68109] Updated weights for policy 0, policy_version 23584 (0.0030) [2024-06-12 15:29:25,231][68109] Updated weights for policy 0, policy_version 23594 (0.0030) [2024-06-12 15:29:27,026][67877] Fps is (10 sec: 44236.9, 60 sec: 46148.5, 300 sec: 45376.1). Total num frames: 386629632. Throughput: 0: 45707.8. Samples: 90411560. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 15:29:27,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:29:29,414][68109] Updated weights for policy 0, policy_version 23604 (0.0034) [2024-06-12 15:29:32,026][67877] Fps is (10 sec: 49152.3, 60 sec: 46421.3, 300 sec: 45430.9). Total num frames: 386875392. Throughput: 0: 45817.0. Samples: 90687320. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 15:29:32,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:29:32,195][68109] Updated weights for policy 0, policy_version 23614 (0.0024) [2024-06-12 15:29:36,348][68109] Updated weights for policy 0, policy_version 23624 (0.0033) [2024-06-12 15:29:37,026][67877] Fps is (10 sec: 47513.5, 60 sec: 45875.3, 300 sec: 45486.4). Total num frames: 387104768. Throughput: 0: 46443.2. Samples: 90977260. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 15:29:37,027][67877] Avg episode reward: [(0, '0.007')] [2024-06-12 15:29:38,968][68109] Updated weights for policy 0, policy_version 23634 (0.0027) [2024-06-12 15:29:42,026][67877] Fps is (10 sec: 44236.8, 60 sec: 46148.4, 300 sec: 45486.4). Total num frames: 387317760. Throughput: 0: 46126.7. Samples: 91112980. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 15:29:42,027][67877] Avg episode reward: [(0, '0.009')] [2024-06-12 15:29:43,408][68109] Updated weights for policy 0, policy_version 23644 (0.0027) [2024-06-12 15:29:45,991][68109] Updated weights for policy 0, policy_version 23654 (0.0029) [2024-06-12 15:29:47,026][67877] Fps is (10 sec: 47513.6, 60 sec: 46967.5, 300 sec: 45597.5). Total num frames: 387579904. Throughput: 0: 46242.7. Samples: 91392140. Policy #0 lag: (min: 0.0, avg: 7.5, max: 23.0) [2024-06-12 15:29:47,027][67877] Avg episode reward: [(0, '0.010')] [2024-06-12 15:29:50,354][68109] Updated weights for policy 0, policy_version 23664 (0.0032) [2024-06-12 15:29:52,026][67877] Fps is (10 sec: 50790.4, 60 sec: 46967.5, 300 sec: 45764.1). Total num frames: 387825664. Throughput: 0: 46197.6. Samples: 91670480. Policy #0 lag: (min: 0.0, avg: 7.5, max: 23.0) [2024-06-12 15:29:52,027][67877] Avg episode reward: [(0, '0.009')] [2024-06-12 15:29:53,215][68109] Updated weights for policy 0, policy_version 23674 (0.0030) [2024-06-12 15:29:57,026][67877] Fps is (10 sec: 44236.7, 60 sec: 46148.3, 300 sec: 45653.0). Total num frames: 388022272. Throughput: 0: 46361.9. Samples: 91811780. Policy #0 lag: (min: 0.0, avg: 7.5, max: 23.0) [2024-06-12 15:29:57,027][67877] Avg episode reward: [(0, '0.008')] [2024-06-12 15:29:57,247][68109] Updated weights for policy 0, policy_version 23684 (0.0025) [2024-06-12 15:30:00,509][68109] Updated weights for policy 0, policy_version 23694 (0.0027) [2024-06-12 15:30:02,026][67877] Fps is (10 sec: 44236.7, 60 sec: 46694.4, 300 sec: 45597.5). 
Total num frames: 388268032. Throughput: 0: 46451.5. Samples: 92095180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 15:30:02,027][67877] Avg episode reward: [(0, '0.005')] [2024-06-12 15:30:04,347][68109] Updated weights for policy 0, policy_version 23704 (0.0027) [2024-06-12 15:30:04,601][68089] Signal inference workers to stop experience collection... (1050 times) [2024-06-12 15:30:04,601][68089] Signal inference workers to resume experience collection... (1050 times) [2024-06-12 15:30:04,631][68109] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-06-12 15:30:04,632][68109] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-06-12 15:30:07,026][67877] Fps is (10 sec: 49151.9, 60 sec: 46694.6, 300 sec: 45819.7). Total num frames: 388513792. Throughput: 0: 46511.6. Samples: 92371920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 15:30:07,027][67877] Avg episode reward: [(0, '0.007')] [2024-06-12 15:30:07,200][68109] Updated weights for policy 0, policy_version 23714 (0.0031) [2024-06-12 15:30:11,062][68109] Updated weights for policy 0, policy_version 23724 (0.0032) [2024-06-12 15:30:12,026][67877] Fps is (10 sec: 45875.7, 60 sec: 46421.5, 300 sec: 45819.7). Total num frames: 388726784. Throughput: 0: 47060.0. Samples: 92529260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 15:30:12,027][67877] Avg episode reward: [(0, '0.007')] [2024-06-12 15:30:14,317][68109] Updated weights for policy 0, policy_version 23734 (0.0028) [2024-06-12 15:30:17,026][67877] Fps is (10 sec: 45875.1, 60 sec: 46421.3, 300 sec: 45819.6). Total num frames: 388972544. Throughput: 0: 47107.5. Samples: 92807160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 15:30:17,027][67877] Avg episode reward: [(0, '0.011')] [2024-06-12 15:30:18,059][68109] Updated weights for policy 0, policy_version 23744 (0.0025) [2024-06-12 15:30:21,144][68109] Updated weights for policy 0, policy_version 23754 (0.0042) [2024-06-12 15:30:22,026][67877] Fps is (10 sec: 49151.5, 60 sec: 47240.6, 300 sec: 45930.8). Total num frames: 389218304. Throughput: 0: 46814.6. Samples: 93083920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 15:30:22,027][67877] Avg episode reward: [(0, '0.009')] [2024-06-12 15:30:25,211][68109] Updated weights for policy 0, policy_version 23764 (0.0031) [2024-06-12 15:30:27,026][67877] Fps is (10 sec: 47513.5, 60 sec: 46967.4, 300 sec: 45986.3). Total num frames: 389447680. Throughput: 0: 47039.9. Samples: 93229780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 15:30:27,027][67877] Avg episode reward: [(0, '0.008')] [2024-06-12 15:30:28,237][68109] Updated weights for policy 0, policy_version 23774 (0.0027) [2024-06-12 15:30:31,806][68109] Updated weights for policy 0, policy_version 23784 (0.0027) [2024-06-12 15:30:32,026][67877] Fps is (10 sec: 45875.5, 60 sec: 46694.4, 300 sec: 45930.7). Total num frames: 389677056. Throughput: 0: 46936.0. Samples: 93504260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 15:30:32,027][67877] Avg episode reward: [(0, '0.009')] [2024-06-12 15:30:35,312][68109] Updated weights for policy 0, policy_version 23794 (0.0024) [2024-06-12 15:30:37,026][67877] Fps is (10 sec: 47514.1, 60 sec: 46967.5, 300 sec: 46097.4). Total num frames: 389922816. Throughput: 0: 47204.9. Samples: 93794700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 15:30:37,027][67877] Avg episode reward: [(0, '0.012')] [2024-06-12 15:30:39,341][68109] Updated weights for policy 0, policy_version 23804 (0.0034) [2024-06-12 15:30:42,026][67877] Fps is (10 sec: 47513.2, 60 sec: 47240.5, 300 sec: 46152.9). 
Total num frames: 390152192. Throughput: 0: 47198.2. Samples: 93935700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 15:30:42,027][67877] Avg episode reward: [(0, '0.006')] [2024-06-12 15:30:42,269][68109] Updated weights for policy 0, policy_version 23814 (0.0033) [2024-06-12 15:30:46,047][68109] Updated weights for policy 0, policy_version 23824 (0.0032) [2024-06-12 15:30:47,026][67877] Fps is (10 sec: 44236.4, 60 sec: 46421.3, 300 sec: 46041.8). Total num frames: 390365184. Throughput: 0: 47034.2. Samples: 94211720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 15:30:47,027][67877] Avg episode reward: [(0, '0.014')] [2024-06-12 15:30:49,138][68109] Updated weights for policy 0, policy_version 23834 (0.0028) [2024-06-12 15:30:52,026][67877] Fps is (10 sec: 45875.2, 60 sec: 46421.3, 300 sec: 46041.8). Total num frames: 390610944. Throughput: 0: 47175.5. Samples: 94494820. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-12 15:30:52,027][67877] Avg episode reward: [(0, '0.009')] [2024-06-12 15:30:52,916][68109] Updated weights for policy 0, policy_version 23844 (0.0023) [2024-06-12 15:30:55,944][68109] Updated weights for policy 0, policy_version 23854 (0.0030) [2024-06-12 15:30:57,026][67877] Fps is (10 sec: 49152.2, 60 sec: 47240.5, 300 sec: 46208.4). Total num frames: 390856704. Throughput: 0: 46983.5. Samples: 94643520. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-12 15:30:57,027][67877] Avg episode reward: [(0, '0.013')] [2024-06-12 15:30:59,496][68109] Updated weights for policy 0, policy_version 23864 (0.0026) [2024-06-12 15:31:02,026][67877] Fps is (10 sec: 45875.1, 60 sec: 46694.4, 300 sec: 46097.3). Total num frames: 391069696. Throughput: 0: 46888.0. Samples: 94917120. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-12 15:31:02,027][67877] Avg episode reward: [(0, '0.008')] [2024-06-12 15:31:02,859][68109] Updated weights for policy 0, policy_version 23874 (0.0030) [2024-06-12 15:31:06,339][68109] Updated weights for policy 0, policy_version 23884 (0.0027) [2024-06-12 15:31:07,027][67877] Fps is (10 sec: 49151.4, 60 sec: 47240.4, 300 sec: 46263.9). Total num frames: 391348224. Throughput: 0: 47006.1. Samples: 95199200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 15:31:07,027][67877] Avg episode reward: [(0, '0.011')] [2024-06-12 15:31:09,967][68109] Updated weights for policy 0, policy_version 23894 (0.0029) [2024-06-12 15:31:12,029][67877] Fps is (10 sec: 49137.7, 60 sec: 47238.1, 300 sec: 46319.0). Total num frames: 391561216. Throughput: 0: 47021.4. Samples: 95345880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 15:31:12,030][67877] Avg episode reward: [(0, '0.010')] [2024-06-12 15:31:13,428][68109] Updated weights for policy 0, policy_version 23904 (0.0029) [2024-06-12 15:31:15,754][68089] Signal inference workers to stop experience collection... (1100 times) [2024-06-12 15:31:15,755][68089] Signal inference workers to resume experience collection... (1100 times) [2024-06-12 15:31:15,778][68109] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-06-12 15:31:15,779][68109] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-06-12 15:31:16,979][68109] Updated weights for policy 0, policy_version 23914 (0.0026) [2024-06-12 15:31:17,026][67877] Fps is (10 sec: 45875.7, 60 sec: 47240.5, 300 sec: 46375.0). Total num frames: 391806976. Throughput: 0: 47382.1. Samples: 95636460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 15:31:17,027][67877] Avg episode reward: [(0, '0.010')] [2024-06-12 15:31:17,034][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000023914_391806976.pth... 
[2024-06-12 15:31:17,079][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000023233_380649472.pth [2024-06-12 15:31:20,493][68109] Updated weights for policy 0, policy_version 23924 (0.0032) [2024-06-12 15:31:22,027][67877] Fps is (10 sec: 47526.9, 60 sec: 46967.4, 300 sec: 46264.0). Total num frames: 392036352. Throughput: 0: 46758.9. Samples: 95898860. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-12 15:31:22,027][67877] Avg episode reward: [(0, '0.011')] [2024-06-12 15:31:24,013][68109] Updated weights for policy 0, policy_version 23934 (0.0029) [2024-06-12 15:31:27,026][67877] Fps is (10 sec: 47513.3, 60 sec: 47240.5, 300 sec: 46319.5). Total num frames: 392282112. Throughput: 0: 47068.8. Samples: 96053800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-12 15:31:27,027][67877] Avg episode reward: [(0, '0.014')] [2024-06-12 15:31:27,440][68109] Updated weights for policy 0, policy_version 23944 (0.0027) [2024-06-12 15:31:31,606][68109] Updated weights for policy 0, policy_version 23954 (0.0023) [2024-06-12 15:31:32,026][67877] Fps is (10 sec: 45875.5, 60 sec: 46967.4, 300 sec: 46375.0). Total num frames: 392495104. Throughput: 0: 47232.0. Samples: 96337160. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-12 15:31:32,027][67877] Avg episode reward: [(0, '0.014')] [2024-06-12 15:31:34,478][68109] Updated weights for policy 0, policy_version 23964 (0.0029) [2024-06-12 15:31:37,027][67877] Fps is (10 sec: 45874.9, 60 sec: 46967.3, 300 sec: 46431.3). Total num frames: 392740864. Throughput: 0: 47046.5. Samples: 96611920. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-12 15:31:37,027][67877] Avg episode reward: [(0, '0.012')] [2024-06-12 15:31:38,312][68109] Updated weights for policy 0, policy_version 23974 (0.0031) [2024-06-12 15:31:41,394][68109] Updated weights for policy 0, policy_version 23984 (0.0033) [2024-06-12 15:31:42,026][67877] Fps is (10 sec: 49152.5, 60 sec: 47240.6, 300 sec: 46430.6). 
Total num frames: 392986624. Throughput: 0: 46861.8. Samples: 96752300. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-12 15:31:42,027][67877] Avg episode reward: [(0, '0.011')] [2024-06-12 15:31:45,345][68109] Updated weights for policy 0, policy_version 23994 (0.0030) [2024-06-12 15:31:47,027][67877] Fps is (10 sec: 45875.2, 60 sec: 47240.5, 300 sec: 46375.0). Total num frames: 393199616. Throughput: 0: 47248.3. Samples: 97043300. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-12 15:31:47,027][67877] Avg episode reward: [(0, '0.008')] [2024-06-12 15:31:48,284][68109] Updated weights for policy 0, policy_version 24004 (0.0026) [2024-06-12 15:31:51,963][68109] Updated weights for policy 0, policy_version 24014 (0.0023) [2024-06-12 15:31:52,026][67877] Fps is (10 sec: 45875.7, 60 sec: 47240.6, 300 sec: 46486.2). Total num frames: 393445376. Throughput: 0: 47423.8. Samples: 97333260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:31:52,027][67877] Avg episode reward: [(0, '0.014')] [2024-06-12 15:31:55,083][68109] Updated weights for policy 0, policy_version 24024 (0.0029) [2024-06-12 15:31:57,027][67877] Fps is (10 sec: 49152.2, 60 sec: 47240.5, 300 sec: 46541.6). Total num frames: 393691136. Throughput: 0: 47168.7. Samples: 97468340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:31:57,027][67877] Avg episode reward: [(0, '0.016')] [2024-06-12 15:31:58,746][68109] Updated weights for policy 0, policy_version 24034 (0.0028) [2024-06-12 15:32:02,005][68109] Updated weights for policy 0, policy_version 24044 (0.0022) [2024-06-12 15:32:02,026][67877] Fps is (10 sec: 49151.5, 60 sec: 47786.7, 300 sec: 46541.7). Total num frames: 393936896. Throughput: 0: 46991.6. Samples: 97751080. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:32:02,027][67877] Avg episode reward: [(0, '0.010')] [2024-06-12 15:32:05,527][68109] Updated weights for policy 0, policy_version 24054 (0.0026) [2024-06-12 15:32:07,026][67877] Fps is (10 sec: 47514.2, 60 sec: 46967.6, 300 sec: 46652.8). Total num frames: 394166272. Throughput: 0: 47631.7. Samples: 98042280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:32:07,027][67877] Avg episode reward: [(0, '0.012')] [2024-06-12 15:32:08,721][68109] Updated weights for policy 0, policy_version 24064 (0.0035) [2024-06-12 15:32:12,026][67877] Fps is (10 sec: 47513.8, 60 sec: 47516.0, 300 sec: 46708.3). Total num frames: 394412032. Throughput: 0: 47357.1. Samples: 98184860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 15:32:12,027][67877] Avg episode reward: [(0, '0.016')] [2024-06-12 15:32:12,279][68109] Updated weights for policy 0, policy_version 24074 (0.0024) [2024-06-12 15:32:15,105][68109] Updated weights for policy 0, policy_version 24084 (0.0028) [2024-06-12 15:32:17,026][67877] Fps is (10 sec: 49152.1, 60 sec: 47513.7, 300 sec: 46652.8). Total num frames: 394657792. Throughput: 0: 47735.2. Samples: 98485240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 15:32:17,027][67877] Avg episode reward: [(0, '0.012')] [2024-06-12 15:32:18,563][68109] Updated weights for policy 0, policy_version 24094 (0.0025) [2024-06-12 15:32:22,026][67877] Fps is (10 sec: 49151.9, 60 sec: 47786.8, 300 sec: 46763.8). Total num frames: 394903552. Throughput: 0: 47917.6. Samples: 98768200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 15:32:22,027][67877] Avg episode reward: [(0, '0.013')] [2024-06-12 15:32:22,135][68109] Updated weights for policy 0, policy_version 24104 (0.0027) [2024-06-12 15:32:25,660][68109] Updated weights for policy 0, policy_version 24114 (0.0032) [2024-06-12 15:32:27,026][67877] Fps is (10 sec: 47513.1, 60 sec: 47513.6, 300 sec: 46874.9). 
Total num frames: 395132928. Throughput: 0: 48288.4. Samples: 98925280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 15:32:27,027][67877] Avg episode reward: [(0, '0.013')] [2024-06-12 15:32:28,931][68109] Updated weights for policy 0, policy_version 24124 (0.0025) [2024-06-12 15:32:32,026][67877] Fps is (10 sec: 49152.0, 60 sec: 48332.9, 300 sec: 46930.5). Total num frames: 395395072. Throughput: 0: 48090.0. Samples: 99207340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 15:32:32,027][67877] Avg episode reward: [(0, '0.020')] [2024-06-12 15:32:32,366][68109] Updated weights for policy 0, policy_version 24134 (0.0034) [2024-06-12 15:32:36,242][68109] Updated weights for policy 0, policy_version 24144 (0.0020) [2024-06-12 15:32:37,026][67877] Fps is (10 sec: 47513.7, 60 sec: 47786.8, 300 sec: 46819.4). Total num frames: 395608064. Throughput: 0: 47903.8. Samples: 99488940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 15:32:37,027][67877] Avg episode reward: [(0, '0.013')] [2024-06-12 15:32:38,475][68089] Signal inference workers to stop experience collection... (1150 times) [2024-06-12 15:32:38,515][68109] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-06-12 15:32:38,529][68089] Signal inference workers to resume experience collection... (1150 times) [2024-06-12 15:32:38,538][68109] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-06-12 15:32:39,532][68109] Updated weights for policy 0, policy_version 24154 (0.0023) [2024-06-12 15:32:42,027][67877] Fps is (10 sec: 45874.6, 60 sec: 47786.6, 300 sec: 46819.3). Total num frames: 395853824. Throughput: 0: 48116.9. Samples: 99633600. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-12 15:32:42,027][67877] Avg episode reward: [(0, '0.013')] [2024-06-12 15:32:42,777][68109] Updated weights for policy 0, policy_version 24164 (0.0027) [2024-06-12 15:32:46,507][68109] Updated weights for policy 0, policy_version 24174 (0.0028) [2024-06-12 15:32:47,026][67877] Fps is (10 sec: 49152.2, 60 sec: 48332.9, 300 sec: 47041.5). Total num frames: 396099584. Throughput: 0: 48157.7. Samples: 99918180. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-12 15:32:47,027][67877] Avg episode reward: [(0, '0.016')] [2024-06-12 15:32:49,455][68109] Updated weights for policy 0, policy_version 24184 (0.0025) [2024-06-12 15:32:52,026][67877] Fps is (10 sec: 47513.7, 60 sec: 48059.6, 300 sec: 46986.0). Total num frames: 396328960. Throughput: 0: 48124.4. Samples: 100207880. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-12 15:32:52,027][67877] Avg episode reward: [(0, '0.014')] [2024-06-12 15:32:53,187][68109] Updated weights for policy 0, policy_version 24194 (0.0033) [2024-06-12 15:32:56,195][68109] Updated weights for policy 0, policy_version 24204 (0.0028) [2024-06-12 15:32:57,026][67877] Fps is (10 sec: 47513.6, 60 sec: 48059.8, 300 sec: 46930.5). Total num frames: 396574720. Throughput: 0: 48246.1. Samples: 100355940. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-12 15:32:57,027][67877] Avg episode reward: [(0, '0.012')] [2024-06-12 15:32:59,799][68109] Updated weights for policy 0, policy_version 24214 (0.0029) [2024-06-12 15:33:02,026][67877] Fps is (10 sec: 47514.0, 60 sec: 47786.7, 300 sec: 46986.0). Total num frames: 396804096. Throughput: 0: 48061.3. Samples: 100648000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 15:33:02,027][67877] Avg episode reward: [(0, '0.016')] [2024-06-12 15:33:03,294][68109] Updated weights for policy 0, policy_version 24224 (0.0023) [2024-06-12 15:33:07,026][67877] Fps is (10 sec: 45875.5, 60 sec: 47786.7, 300 sec: 46986.0). 
Total num frames: 397033472. Throughput: 0: 47953.3. Samples: 100926100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 15:33:07,027][67877] Avg episode reward: [(0, '0.015')] [2024-06-12 15:33:07,041][68109] Updated weights for policy 0, policy_version 24234 (0.0023) [2024-06-12 15:33:09,919][68109] Updated weights for policy 0, policy_version 24244 (0.0025) [2024-06-12 15:33:12,026][67877] Fps is (10 sec: 50790.3, 60 sec: 48332.8, 300 sec: 47152.6). Total num frames: 397312000. Throughput: 0: 47757.4. Samples: 101074360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 15:33:12,027][67877] Avg episode reward: [(0, '0.016')] [2024-06-12 15:33:13,922][68109] Updated weights for policy 0, policy_version 24254 (0.0029) [2024-06-12 15:33:16,762][68109] Updated weights for policy 0, policy_version 24264 (0.0031) [2024-06-12 15:33:17,026][67877] Fps is (10 sec: 50789.9, 60 sec: 48059.7, 300 sec: 47097.1). Total num frames: 397541376. Throughput: 0: 47899.9. Samples: 101362840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 15:33:17,027][67877] Avg episode reward: [(0, '0.019')] [2024-06-12 15:33:17,037][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000024264_397541376.pth... [2024-06-12 15:33:17,088][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000023572_386203648.pth [2024-06-12 15:33:20,691][68109] Updated weights for policy 0, policy_version 24274 (0.0031) [2024-06-12 15:33:22,026][67877] Fps is (10 sec: 47513.3, 60 sec: 48059.7, 300 sec: 47208.2). Total num frames: 397787136. Throughput: 0: 48262.2. Samples: 101660740. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 15:33:22,027][67877] Avg episode reward: [(0, '0.019')] [2024-06-12 15:33:23,227][68109] Updated weights for policy 0, policy_version 24284 (0.0029) [2024-06-12 15:33:26,583][68109] Updated weights for policy 0, policy_version 24294 (0.0028) [2024-06-12 15:33:27,026][67877] Fps is (10 sec: 49152.0, 60 sec: 48332.8, 300 sec: 47263.7). Total num frames: 398032896. Throughput: 0: 48197.4. Samples: 101802480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 15:33:27,027][67877] Avg episode reward: [(0, '0.020')] [2024-06-12 15:33:30,212][68109] Updated weights for policy 0, policy_version 24304 (0.0027) [2024-06-12 15:33:32,026][67877] Fps is (10 sec: 49152.4, 60 sec: 48059.7, 300 sec: 47208.2). Total num frames: 398278656. Throughput: 0: 48390.7. Samples: 102095760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 15:33:32,027][67877] Avg episode reward: [(0, '0.018')] [2024-06-12 15:33:33,464][68109] Updated weights for policy 0, policy_version 24314 (0.0020) [2024-06-12 15:33:36,990][68109] Updated weights for policy 0, policy_version 24324 (0.0031) [2024-06-12 15:33:37,026][67877] Fps is (10 sec: 49152.0, 60 sec: 48605.8, 300 sec: 47374.8). Total num frames: 398524416. Throughput: 0: 48465.3. Samples: 102388820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 15:33:37,027][67877] Avg episode reward: [(0, '0.020')] [2024-06-12 15:33:40,355][68109] Updated weights for policy 0, policy_version 24334 (0.0028) [2024-06-12 15:33:42,026][67877] Fps is (10 sec: 47513.5, 60 sec: 48332.9, 300 sec: 47430.3). Total num frames: 398753792. Throughput: 0: 48388.9. Samples: 102533440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 15:33:42,027][67877] Avg episode reward: [(0, '0.019')] [2024-06-12 15:33:43,654][68109] Updated weights for policy 0, policy_version 24344 (0.0027) [2024-06-12 15:33:47,026][67877] Fps is (10 sec: 47513.8, 60 sec: 48332.8, 300 sec: 47430.3). 
Total num frames: 398999552. Throughput: 0: 48410.2. Samples: 102826460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 15:33:47,027][67877] Avg episode reward: [(0, '0.021')] [2024-06-12 15:33:47,533][68109] Updated weights for policy 0, policy_version 24354 (0.0031) [2024-06-12 15:33:50,682][68109] Updated weights for policy 0, policy_version 24364 (0.0025) [2024-06-12 15:33:52,026][67877] Fps is (10 sec: 49151.9, 60 sec: 48605.9, 300 sec: 47430.3). Total num frames: 399245312. Throughput: 0: 48441.7. Samples: 103105980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 15:33:52,027][67877] Avg episode reward: [(0, '0.023')] [2024-06-12 15:33:54,529][68109] Updated weights for policy 0, policy_version 24374 (0.0032) [2024-06-12 15:33:57,026][67877] Fps is (10 sec: 47513.2, 60 sec: 48332.7, 300 sec: 47485.8). Total num frames: 399474688. Throughput: 0: 48522.1. Samples: 103257860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 15:33:57,027][67877] Avg episode reward: [(0, '0.021')] [2024-06-12 15:33:57,574][68109] Updated weights for policy 0, policy_version 24384 (0.0026) [2024-06-12 15:34:01,420][68109] Updated weights for policy 0, policy_version 24394 (0.0030) [2024-06-12 15:34:02,026][67877] Fps is (10 sec: 47513.7, 60 sec: 48605.9, 300 sec: 47485.9). Total num frames: 399720448. Throughput: 0: 48466.3. Samples: 103543820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 15:34:02,027][67877] Avg episode reward: [(0, '0.023')] [2024-06-12 15:34:04,420][68109] Updated weights for policy 0, policy_version 24404 (0.0031) [2024-06-12 15:34:05,647][68089] Signal inference workers to stop experience collection... (1200 times) [2024-06-12 15:34:05,693][68109] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-06-12 15:34:05,696][68089] Signal inference workers to resume experience collection... 
(1200 times) [2024-06-12 15:34:05,703][68109] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-06-12 15:34:07,026][67877] Fps is (10 sec: 47513.7, 60 sec: 48605.8, 300 sec: 47485.8). Total num frames: 399949824. Throughput: 0: 48252.0. Samples: 103832080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 15:34:07,027][67877] Avg episode reward: [(0, '0.022')] [2024-06-12 15:34:07,912][68109] Updated weights for policy 0, policy_version 24414 (0.0029) [2024-06-12 15:34:11,087][68109] Updated weights for policy 0, policy_version 24424 (0.0023) [2024-06-12 15:34:12,026][67877] Fps is (10 sec: 47514.0, 60 sec: 48059.8, 300 sec: 47485.8). Total num frames: 400195584. Throughput: 0: 48280.2. Samples: 103975080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 15:34:12,027][67877] Avg episode reward: [(0, '0.021')] [2024-06-12 15:34:14,528][68109] Updated weights for policy 0, policy_version 24434 (0.0027) [2024-06-12 15:34:17,026][67877] Fps is (10 sec: 49152.1, 60 sec: 48332.8, 300 sec: 47652.4). Total num frames: 400441344. Throughput: 0: 48395.9. Samples: 104273580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 15:34:17,027][67877] Avg episode reward: [(0, '0.016')] [2024-06-12 15:34:18,054][68109] Updated weights for policy 0, policy_version 24444 (0.0027) [2024-06-12 15:34:21,042][68109] Updated weights for policy 0, policy_version 24454 (0.0029) [2024-06-12 15:34:22,026][67877] Fps is (10 sec: 49151.4, 60 sec: 48332.8, 300 sec: 47652.4). Total num frames: 400687104. Throughput: 0: 48429.4. Samples: 104568140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 15:34:22,027][67877] Avg episode reward: [(0, '0.019')] [2024-06-12 15:34:24,619][68109] Updated weights for policy 0, policy_version 24464 (0.0028) [2024-06-12 15:34:27,027][67877] Fps is (10 sec: 49151.7, 60 sec: 48332.8, 300 sec: 47652.4). Total num frames: 400932864. Throughput: 0: 48364.7. Samples: 104709860. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 15:34:27,027][67877] Avg episode reward: [(0, '0.018')] [2024-06-12 15:34:27,823][68109] Updated weights for policy 0, policy_version 24474 (0.0026) [2024-06-12 15:34:31,508][68109] Updated weights for policy 0, policy_version 24484 (0.0039) [2024-06-12 15:34:32,026][67877] Fps is (10 sec: 49152.2, 60 sec: 48332.8, 300 sec: 47708.0). Total num frames: 401178624. Throughput: 0: 48427.2. Samples: 105005680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 15:34:32,027][67877] Avg episode reward: [(0, '0.020')] [2024-06-12 15:34:34,421][68109] Updated weights for policy 0, policy_version 24494 (0.0021) [2024-06-12 15:34:37,026][67877] Fps is (10 sec: 49152.6, 60 sec: 48332.8, 300 sec: 47819.1). Total num frames: 401424384. Throughput: 0: 48761.3. Samples: 105300240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 15:34:37,027][67877] Avg episode reward: [(0, '0.026')] [2024-06-12 15:34:38,011][68109] Updated weights for policy 0, policy_version 24504 (0.0025) [2024-06-12 15:34:41,295][68109] Updated weights for policy 0, policy_version 24514 (0.0025) [2024-06-12 15:34:42,026][67877] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 47763.5). Total num frames: 401670144. Throughput: 0: 48591.8. Samples: 105444480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 15:34:42,027][67877] Avg episode reward: [(0, '0.021')] [2024-06-12 15:34:44,559][68109] Updated weights for policy 0, policy_version 24524 (0.0031) [2024-06-12 15:34:47,026][67877] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 47819.1). Total num frames: 401932288. Throughput: 0: 48964.9. Samples: 105747240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 15:34:47,027][67877] Avg episode reward: [(0, '0.027')] [2024-06-12 15:34:47,477][68109] Updated weights for policy 0, policy_version 24534 (0.0028) [2024-06-12 15:34:51,070][68109] Updated weights for policy 0, policy_version 24544 (0.0024) [2024-06-12 15:34:52,026][67877] Fps is (10 sec: 50789.9, 60 sec: 48878.9, 300 sec: 47985.7). Total num frames: 402178048. Throughput: 0: 49206.3. Samples: 106046360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 15:34:52,027][67877] Avg episode reward: [(0, '0.029')] [2024-06-12 15:34:54,359][68109] Updated weights for policy 0, policy_version 24554 (0.0025) [2024-06-12 15:34:57,026][67877] Fps is (10 sec: 47513.9, 60 sec: 48879.1, 300 sec: 47930.2). Total num frames: 402407424. Throughput: 0: 49073.3. Samples: 106183380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 15:34:57,027][67877] Avg episode reward: [(0, '0.020')] [2024-06-12 15:34:57,835][68109] Updated weights for policy 0, policy_version 24564 (0.0034) [2024-06-12 15:35:00,852][68109] Updated weights for policy 0, policy_version 24574 (0.0024) [2024-06-12 15:35:02,027][67877] Fps is (10 sec: 47513.1, 60 sec: 48878.8, 300 sec: 47930.1). Total num frames: 402653184. Throughput: 0: 49158.6. Samples: 106485720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 15:35:02,027][67877] Avg episode reward: [(0, '0.020')] [2024-06-12 15:35:04,175][68109] Updated weights for policy 0, policy_version 24584 (0.0023) [2024-06-12 15:35:07,026][67877] Fps is (10 sec: 50790.2, 60 sec: 49425.2, 300 sec: 48096.7). Total num frames: 402915328. Throughput: 0: 49301.8. Samples: 106786720. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 15:35:07,027][67877] Avg episode reward: [(0, '0.023')] [2024-06-12 15:35:07,457][68109] Updated weights for policy 0, policy_version 24594 (0.0027) [2024-06-12 15:35:10,856][68109] Updated weights for policy 0, policy_version 24604 (0.0024) [2024-06-12 15:35:12,026][67877] Fps is (10 sec: 50791.1, 60 sec: 49425.0, 300 sec: 48096.8). Total num frames: 403161088. Throughput: 0: 49609.0. Samples: 106942260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 15:35:12,027][67877] Avg episode reward: [(0, '0.027')] [2024-06-12 15:35:14,150][68109] Updated weights for policy 0, policy_version 24614 (0.0029) [2024-06-12 15:35:17,026][67877] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 48096.8). Total num frames: 403406848. Throughput: 0: 49477.3. Samples: 107232160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 15:35:17,027][67877] Avg episode reward: [(0, '0.021')] [2024-06-12 15:35:17,036][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000024622_403406848.pth... [2024-06-12 15:35:17,078][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000023914_391806976.pth [2024-06-12 15:35:17,455][68109] Updated weights for policy 0, policy_version 24624 (0.0027) [2024-06-12 15:35:20,751][68109] Updated weights for policy 0, policy_version 24634 (0.0031) [2024-06-12 15:35:22,026][67877] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 48096.8). Total num frames: 403636224. Throughput: 0: 49508.4. Samples: 107528120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 15:35:22,027][67877] Avg episode reward: [(0, '0.033')] [2024-06-12 15:35:24,065][68109] Updated weights for policy 0, policy_version 24644 (0.0030) [2024-06-12 15:35:25,343][68089] Signal inference workers to stop experience collection... 
(1250 times) [2024-06-12 15:35:25,390][68109] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-06-12 15:35:25,456][68089] Signal inference workers to resume experience collection... (1250 times) [2024-06-12 15:35:25,456][68109] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-06-12 15:35:27,026][67877] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 48207.8). Total num frames: 403898368. Throughput: 0: 49652.8. Samples: 107678860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 15:35:27,027][67877] Avg episode reward: [(0, '0.027')] [2024-06-12 15:35:27,583][68109] Updated weights for policy 0, policy_version 24654 (0.0030) [2024-06-12 15:35:31,084][68109] Updated weights for policy 0, policy_version 24664 (0.0027) [2024-06-12 15:35:32,026][67877] Fps is (10 sec: 52428.9, 60 sec: 49698.1, 300 sec: 48263.4). Total num frames: 404160512. Throughput: 0: 49307.1. Samples: 107966060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 15:35:32,027][67877] Avg episode reward: [(0, '0.026')] [2024-06-12 15:35:33,897][68109] Updated weights for policy 0, policy_version 24674 (0.0028) [2024-06-12 15:35:37,026][67877] Fps is (10 sec: 50791.0, 60 sec: 49698.2, 300 sec: 48318.9). Total num frames: 404406272. Throughput: 0: 49458.8. Samples: 108272000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 15:35:37,027][67877] Avg episode reward: [(0, '0.022')] [2024-06-12 15:35:37,666][68109] Updated weights for policy 0, policy_version 24684 (0.0028) [2024-06-12 15:35:40,542][68109] Updated weights for policy 0, policy_version 24694 (0.0024) [2024-06-12 15:35:42,026][67877] Fps is (10 sec: 47513.9, 60 sec: 49425.0, 300 sec: 48374.5). Total num frames: 404635648. Throughput: 0: 49636.9. Samples: 108417040. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 15:35:42,027][67877] Avg episode reward: [(0, '0.023')] [2024-06-12 15:35:44,287][68109] Updated weights for policy 0, policy_version 24704 (0.0025) [2024-06-12 15:35:46,898][68109] Updated weights for policy 0, policy_version 24714 (0.0026) [2024-06-12 15:35:47,026][67877] Fps is (10 sec: 50789.8, 60 sec: 49698.1, 300 sec: 48485.5). Total num frames: 404914176. Throughput: 0: 49582.7. Samples: 108716940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 15:35:47,027][67877] Avg episode reward: [(0, '0.033')] [2024-06-12 15:35:50,961][68109] Updated weights for policy 0, policy_version 24724 (0.0022) [2024-06-12 15:35:52,026][67877] Fps is (10 sec: 52429.0, 60 sec: 49698.2, 300 sec: 48485.5). Total num frames: 405159936. Throughput: 0: 49597.0. Samples: 109018580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 15:35:52,027][67877] Avg episode reward: [(0, '0.027')] [2024-06-12 15:35:53,666][68109] Updated weights for policy 0, policy_version 24734 (0.0030) [2024-06-12 15:35:57,026][67877] Fps is (10 sec: 44237.1, 60 sec: 49152.0, 300 sec: 48430.0). Total num frames: 405356544. Throughput: 0: 49392.4. Samples: 109164920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 15:35:57,035][67877] Avg episode reward: [(0, '0.028')] [2024-06-12 15:35:57,583][68109] Updated weights for policy 0, policy_version 24744 (0.0028) [2024-06-12 15:36:00,245][68109] Updated weights for policy 0, policy_version 24754 (0.0031) [2024-06-12 15:36:02,027][67877] Fps is (10 sec: 47512.6, 60 sec: 49698.1, 300 sec: 48430.0). Total num frames: 405635072. Throughput: 0: 49558.5. Samples: 109462300. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 15:36:02,027][67877] Avg episode reward: [(0, '0.027')] [2024-06-12 15:36:04,379][68109] Updated weights for policy 0, policy_version 24764 (0.0030) [2024-06-12 15:36:07,006][68109] Updated weights for policy 0, policy_version 24774 (0.0029) [2024-06-12 15:36:07,026][67877] Fps is (10 sec: 54066.8, 60 sec: 49698.1, 300 sec: 48597.1). Total num frames: 405897216. Throughput: 0: 49294.6. Samples: 109746380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 15:36:07,027][67877] Avg episode reward: [(0, '0.033')] [2024-06-12 15:36:10,756][68109] Updated weights for policy 0, policy_version 24784 (0.0030) [2024-06-12 15:36:12,026][67877] Fps is (10 sec: 50791.4, 60 sec: 49698.2, 300 sec: 48596.6). Total num frames: 406142976. Throughput: 0: 49565.5. Samples: 109909300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 15:36:12,027][67877] Avg episode reward: [(0, '0.026')] [2024-06-12 15:36:14,228][68109] Updated weights for policy 0, policy_version 24794 (0.0031) [2024-06-12 15:36:16,737][68089] Signal inference workers to stop experience collection... (1300 times) [2024-06-12 15:36:16,766][68109] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-06-12 15:36:16,791][68089] Signal inference workers to resume experience collection... (1300 times) [2024-06-12 15:36:16,791][68109] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-06-12 15:36:17,026][67877] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 48596.6). Total num frames: 406372352. Throughput: 0: 49790.7. Samples: 110206640. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 15:36:17,027][67877] Avg episode reward: [(0, '0.033')] [2024-06-12 15:36:17,107][68109] Updated weights for policy 0, policy_version 24804 (0.0031) [2024-06-12 15:36:20,794][68109] Updated weights for policy 0, policy_version 24814 (0.0032) [2024-06-12 15:36:22,026][67877] Fps is (10 sec: 49151.9, 60 sec: 49971.3, 300 sec: 48652.2). Total num frames: 406634496. Throughput: 0: 49685.3. Samples: 110507840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 15:36:22,027][67877] Avg episode reward: [(0, '0.034')] [2024-06-12 15:36:23,774][68109] Updated weights for policy 0, policy_version 24824 (0.0029) [2024-06-12 15:36:27,026][67877] Fps is (10 sec: 49151.8, 60 sec: 49425.1, 300 sec: 48707.7). Total num frames: 406863872. Throughput: 0: 49545.2. Samples: 110646580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 15:36:27,027][67877] Avg episode reward: [(0, '0.031')] [2024-06-12 15:36:27,443][68109] Updated weights for policy 0, policy_version 24834 (0.0026) [2024-06-12 15:36:30,263][68109] Updated weights for policy 0, policy_version 24844 (0.0029) [2024-06-12 15:36:32,026][67877] Fps is (10 sec: 50790.5, 60 sec: 49698.2, 300 sec: 48818.8). Total num frames: 407142400. Throughput: 0: 49523.7. Samples: 110945500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 15:36:32,027][67877] Avg episode reward: [(0, '0.026')] [2024-06-12 15:36:33,903][68109] Updated weights for policy 0, policy_version 24854 (0.0024) [2024-06-12 15:36:36,743][68109] Updated weights for policy 0, policy_version 24864 (0.0025) [2024-06-12 15:36:37,026][67877] Fps is (10 sec: 50790.5, 60 sec: 49425.0, 300 sec: 48763.2). Total num frames: 407371776. Throughput: 0: 49485.2. Samples: 111245420. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-12 15:36:37,027][67877] Avg episode reward: [(0, '0.029')] [2024-06-12 15:36:40,454][68109] Updated weights for policy 0, policy_version 24874 (0.0034) [2024-06-12 15:36:42,026][67877] Fps is (10 sec: 47513.3, 60 sec: 49698.1, 300 sec: 48874.3). Total num frames: 407617536. Throughput: 0: 49637.8. Samples: 111398620. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-12 15:36:42,027][67877] Avg episode reward: [(0, '0.031')] [2024-06-12 15:36:43,324][68109] Updated weights for policy 0, policy_version 24884 (0.0024) [2024-06-12 15:36:46,729][68109] Updated weights for policy 0, policy_version 24894 (0.0031) [2024-06-12 15:36:47,026][67877] Fps is (10 sec: 49152.4, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 407863296. Throughput: 0: 49599.3. Samples: 111694260. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-12 15:36:47,027][67877] Avg episode reward: [(0, '0.027')] [2024-06-12 15:36:50,231][68109] Updated weights for policy 0, policy_version 24904 (0.0023) [2024-06-12 15:36:52,026][67877] Fps is (10 sec: 50790.5, 60 sec: 49425.0, 300 sec: 48929.9). Total num frames: 408125440. Throughput: 0: 49874.8. Samples: 111990740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 15:36:52,027][67877] Avg episode reward: [(0, '0.040')] [2024-06-12 15:36:53,394][68109] Updated weights for policy 0, policy_version 24914 (0.0030) [2024-06-12 15:36:56,223][68109] Updated weights for policy 0, policy_version 24924 (0.0026) [2024-06-12 15:36:57,026][67877] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 48985.4). Total num frames: 408387584. Throughput: 0: 49871.0. Samples: 112153500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 15:36:57,027][67877] Avg episode reward: [(0, '0.032')] [2024-06-12 15:36:59,843][68109] Updated weights for policy 0, policy_version 24934 (0.0021) [2024-06-12 15:37:02,026][67877] Fps is (10 sec: 50790.5, 60 sec: 49971.3, 300 sec: 49040.9). 
Total num frames: 408633344. Throughput: 0: 49934.7. Samples: 112453700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 15:37:02,027][67877] Avg episode reward: [(0, '0.029')] [2024-06-12 15:37:02,603][68109] Updated weights for policy 0, policy_version 24944 (0.0023) [2024-06-12 15:37:06,357][68109] Updated weights for policy 0, policy_version 24954 (0.0031) [2024-06-12 15:37:07,026][67877] Fps is (10 sec: 49152.3, 60 sec: 49698.2, 300 sec: 49040.9). Total num frames: 408879104. Throughput: 0: 49975.1. Samples: 112756720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 15:37:07,027][67877] Avg episode reward: [(0, '0.033')] [2024-06-12 15:37:09,219][68109] Updated weights for policy 0, policy_version 24964 (0.0026) [2024-06-12 15:37:12,026][67877] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 49040.9). Total num frames: 409124864. Throughput: 0: 50154.8. Samples: 112903540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 15:37:12,027][67877] Avg episode reward: [(0, '0.034')] [2024-06-12 15:37:12,795][68109] Updated weights for policy 0, policy_version 24974 (0.0031) [2024-06-12 15:37:15,591][68109] Updated weights for policy 0, policy_version 24984 (0.0022) [2024-06-12 15:37:17,026][67877] Fps is (10 sec: 49151.4, 60 sec: 49971.2, 300 sec: 49040.9). Total num frames: 409370624. Throughput: 0: 50181.7. Samples: 113203680. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 15:37:17,027][67877] Avg episode reward: [(0, '0.037')] [2024-06-12 15:37:17,036][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000024986_409370624.pth... [2024-06-12 15:37:17,103][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000024264_397541376.pth [2024-06-12 15:37:19,530][68109] Updated weights for policy 0, policy_version 24994 (0.0026) [2024-06-12 15:37:22,027][67877] Fps is (10 sec: 52427.7, 60 sec: 50244.1, 300 sec: 49207.5). Total num frames: 409649152. Throughput: 0: 50232.8. 
Samples: 113505900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 15:37:22,027][67877] Avg episode reward: [(0, '0.028')] [2024-06-12 15:37:22,227][68109] Updated weights for policy 0, policy_version 25004 (0.0026) [2024-06-12 15:37:25,859][68109] Updated weights for policy 0, policy_version 25014 (0.0028) [2024-06-12 15:37:26,720][68089] Signal inference workers to stop experience collection... (1350 times) [2024-06-12 15:37:26,765][68109] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-06-12 15:37:26,830][68089] Signal inference workers to resume experience collection... (1350 times) [2024-06-12 15:37:26,836][68109] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-06-12 15:37:27,026][67877] Fps is (10 sec: 50790.8, 60 sec: 50244.3, 300 sec: 49096.5). Total num frames: 409878528. Throughput: 0: 50366.3. Samples: 113665100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-12 15:37:27,027][67877] Avg episode reward: [(0, '0.040')] [2024-06-12 15:37:28,624][68109] Updated weights for policy 0, policy_version 25024 (0.0023) [2024-06-12 15:37:32,026][67877] Fps is (10 sec: 49153.0, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 410140672. Throughput: 0: 50408.5. Samples: 113962640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-12 15:37:32,027][67877] Avg episode reward: [(0, '0.038')] [2024-06-12 15:37:32,390][68109] Updated weights for policy 0, policy_version 25034 (0.0026) [2024-06-12 15:37:35,182][68109] Updated weights for policy 0, policy_version 25044 (0.0026) [2024-06-12 15:37:37,026][67877] Fps is (10 sec: 47513.9, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 410353664. Throughput: 0: 50381.8. Samples: 114257920. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-12 15:37:37,027][67877] Avg episode reward: [(0, '0.032')] [2024-06-12 15:37:39,112][68109] Updated weights for policy 0, policy_version 25054 (0.0035) [2024-06-12 15:37:42,026][67877] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 49263.1). Total num frames: 410632192. Throughput: 0: 50058.7. Samples: 114406140. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-12 15:37:42,027][67877] Avg episode reward: [(0, '0.031')] [2024-06-12 15:37:42,125][68109] Updated weights for policy 0, policy_version 25064 (0.0027) [2024-06-12 15:37:45,590][68109] Updated weights for policy 0, policy_version 25074 (0.0022) [2024-06-12 15:37:47,026][67877] Fps is (10 sec: 52428.1, 60 sec: 50244.2, 300 sec: 49318.6). Total num frames: 410877952. Throughput: 0: 50143.9. Samples: 114710180. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-12 15:37:47,027][67877] Avg episode reward: [(0, '0.035')] [2024-06-12 15:37:48,686][68109] Updated weights for policy 0, policy_version 25084 (0.0032) [2024-06-12 15:37:52,026][67877] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 49318.6). Total num frames: 411123712. Throughput: 0: 49931.9. Samples: 115003660. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-12 15:37:52,027][67877] Avg episode reward: [(0, '0.041')] [2024-06-12 15:37:52,066][68109] Updated weights for policy 0, policy_version 25094 (0.0022) [2024-06-12 15:37:55,148][68109] Updated weights for policy 0, policy_version 25104 (0.0029) [2024-06-12 15:37:57,026][67877] Fps is (10 sec: 52428.8, 60 sec: 50244.2, 300 sec: 49485.2). Total num frames: 411402240. Throughput: 0: 50119.9. Samples: 115158940. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 15:37:57,027][67877] Avg episode reward: [(0, '0.038')] [2024-06-12 15:37:58,614][68109] Updated weights for policy 0, policy_version 25114 (0.0025) [2024-06-12 15:38:01,791][68109] Updated weights for policy 0, policy_version 25124 (0.0032) [2024-06-12 15:38:02,026][67877] Fps is (10 sec: 50790.4, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 411631616. Throughput: 0: 50068.5. Samples: 115456760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 15:38:02,027][67877] Avg episode reward: [(0, '0.041')] [2024-06-12 15:38:05,178][68109] Updated weights for policy 0, policy_version 25134 (0.0029) [2024-06-12 15:38:07,026][67877] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 49429.7). Total num frames: 411893760. Throughput: 0: 50067.8. Samples: 115758940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 15:38:07,027][67877] Avg episode reward: [(0, '0.042')] [2024-06-12 15:38:08,287][68109] Updated weights for policy 0, policy_version 25144 (0.0029) [2024-06-12 15:38:11,390][68109] Updated weights for policy 0, policy_version 25154 (0.0024) [2024-06-12 15:38:12,026][67877] Fps is (10 sec: 50790.3, 60 sec: 50244.2, 300 sec: 49485.2). Total num frames: 412139520. Throughput: 0: 49986.6. Samples: 115914500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 15:38:12,027][67877] Avg episode reward: [(0, '0.035')] [2024-06-12 15:38:14,737][68109] Updated weights for policy 0, policy_version 25164 (0.0028) [2024-06-12 15:38:17,026][67877] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 49540.8). Total num frames: 412401664. Throughput: 0: 50365.8. Samples: 116229100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 15:38:17,027][67877] Avg episode reward: [(0, '0.034')] [2024-06-12 15:38:17,942][68109] Updated weights for policy 0, policy_version 25174 (0.0034) [2024-06-12 15:38:21,076][68109] Updated weights for policy 0, policy_version 25184 (0.0028) [2024-06-12 15:38:22,026][67877] Fps is (10 sec: 50790.1, 60 sec: 49971.3, 300 sec: 49540.8). Total num frames: 412647424. Throughput: 0: 50434.1. Samples: 116527460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 15:38:22,027][67877] Avg episode reward: [(0, '0.033')] [2024-06-12 15:38:24,276][68109] Updated weights for policy 0, policy_version 25194 (0.0033) [2024-06-12 15:38:27,026][67877] Fps is (10 sec: 49151.5, 60 sec: 50244.2, 300 sec: 49540.8). Total num frames: 412893184. Throughput: 0: 50339.9. Samples: 116671440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 15:38:27,027][67877] Avg episode reward: [(0, '0.036')] [2024-06-12 15:38:27,849][68109] Updated weights for policy 0, policy_version 25204 (0.0034) [2024-06-12 15:38:30,813][68109] Updated weights for policy 0, policy_version 25214 (0.0029) [2024-06-12 15:38:32,026][67877] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 413138944. Throughput: 0: 50231.6. Samples: 116970600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 15:38:32,027][67877] Avg episode reward: [(0, '0.037')] [2024-06-12 15:38:34,215][68109] Updated weights for policy 0, policy_version 25224 (0.0024) [2024-06-12 15:38:36,144][68089] Signal inference workers to stop experience collection... (1400 times) [2024-06-12 15:38:36,192][68109] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-06-12 15:38:36,200][68089] Signal inference workers to resume experience collection... 
(1400 times) [2024-06-12 15:38:36,210][68109] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-06-12 15:38:37,027][67877] Fps is (10 sec: 52428.6, 60 sec: 51063.3, 300 sec: 49707.4). Total num frames: 413417472. Throughput: 0: 50377.2. Samples: 117270640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 15:38:37,027][67877] Avg episode reward: [(0, '0.038')] [2024-06-12 15:38:37,039][68109] Updated weights for policy 0, policy_version 25234 (0.0027) [2024-06-12 15:38:40,925][68109] Updated weights for policy 0, policy_version 25244 (0.0032) [2024-06-12 15:38:42,026][67877] Fps is (10 sec: 52428.9, 60 sec: 50517.3, 300 sec: 49707.4). Total num frames: 413663232. Throughput: 0: 50618.3. Samples: 117436760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 15:38:42,027][67877] Avg episode reward: [(0, '0.052')] [2024-06-12 15:38:43,666][68109] Updated weights for policy 0, policy_version 25254 (0.0029) [2024-06-12 15:38:47,026][67877] Fps is (10 sec: 49152.9, 60 sec: 50517.4, 300 sec: 49707.4). Total num frames: 413908992. Throughput: 0: 50714.3. Samples: 117738900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 15:38:47,027][67877] Avg episode reward: [(0, '0.043')] [2024-06-12 15:38:47,074][68109] Updated weights for policy 0, policy_version 25264 (0.0024) [2024-06-12 15:38:49,775][68109] Updated weights for policy 0, policy_version 25274 (0.0026) [2024-06-12 15:38:52,026][67877] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 49762.9). Total num frames: 414154752. Throughput: 0: 50893.3. Samples: 118049140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 15:38:52,027][67877] Avg episode reward: [(0, '0.049')] [2024-06-12 15:38:53,774][68109] Updated weights for policy 0, policy_version 25284 (0.0026) [2024-06-12 15:38:56,365][68109] Updated weights for policy 0, policy_version 25294 (0.0020) [2024-06-12 15:38:57,026][67877] Fps is (10 sec: 52428.6, 60 sec: 50517.4, 300 sec: 49874.0). 
Total num frames: 414433280. Throughput: 0: 50612.5. Samples: 118192060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 15:38:57,027][67877] Avg episode reward: [(0, '0.039')] [2024-06-12 15:38:59,992][68109] Updated weights for policy 0, policy_version 25304 (0.0026) [2024-06-12 15:39:02,026][67877] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 49874.0). Total num frames: 414662656. Throughput: 0: 50283.5. Samples: 118491860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 15:39:02,027][67877] Avg episode reward: [(0, '0.041')] [2024-06-12 15:39:02,855][68109] Updated weights for policy 0, policy_version 25314 (0.0030) [2024-06-12 15:39:06,692][68109] Updated weights for policy 0, policy_version 25324 (0.0029) [2024-06-12 15:39:07,026][67877] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 49929.5). Total num frames: 414924800. Throughput: 0: 50523.3. Samples: 118801000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 15:39:07,027][67877] Avg episode reward: [(0, '0.036')] [2024-06-12 15:39:09,378][68109] Updated weights for policy 0, policy_version 25334 (0.0025) [2024-06-12 15:39:12,026][67877] Fps is (10 sec: 47513.9, 60 sec: 49971.2, 300 sec: 49818.5). Total num frames: 415137792. Throughput: 0: 50477.0. Samples: 118942900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 15:39:12,027][67877] Avg episode reward: [(0, '0.045')] [2024-06-12 15:39:13,310][68109] Updated weights for policy 0, policy_version 25344 (0.0029) [2024-06-12 15:39:15,788][68109] Updated weights for policy 0, policy_version 25354 (0.0035) [2024-06-12 15:39:17,026][67877] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 49985.1). Total num frames: 415432704. Throughput: 0: 50418.7. Samples: 119239440. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 15:39:17,027][67877] Avg episode reward: [(0, '0.035')] [2024-06-12 15:39:17,037][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000025356_415432704.pth... [2024-06-12 15:39:17,075][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000024622_403406848.pth [2024-06-12 15:39:19,620][68109] Updated weights for policy 0, policy_version 25364 (0.0034) [2024-06-12 15:39:22,026][67877] Fps is (10 sec: 54067.0, 60 sec: 50517.4, 300 sec: 49985.1). Total num frames: 415678464. Throughput: 0: 50548.1. Samples: 119545300. Policy #0 lag: (min: 1.0, avg: 12.4, max: 25.0) [2024-06-12 15:39:22,027][67877] Avg episode reward: [(0, '0.043')] [2024-06-12 15:39:22,749][68109] Updated weights for policy 0, policy_version 25374 (0.0035) [2024-06-12 15:39:26,621][68109] Updated weights for policy 0, policy_version 25384 (0.0029) [2024-06-12 15:39:27,026][67877] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 49985.1). Total num frames: 415924224. Throughput: 0: 50255.1. Samples: 119698240. Policy #0 lag: (min: 1.0, avg: 12.4, max: 25.0) [2024-06-12 15:39:27,027][67877] Avg episode reward: [(0, '0.042')] [2024-06-12 15:39:29,202][68109] Updated weights for policy 0, policy_version 25394 (0.0027) [2024-06-12 15:39:32,026][67877] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 49985.1). Total num frames: 416169984. Throughput: 0: 50457.7. Samples: 120009500. Policy #0 lag: (min: 1.0, avg: 12.4, max: 25.0) [2024-06-12 15:39:32,027][67877] Avg episode reward: [(0, '0.044')] [2024-06-12 15:39:32,847][68109] Updated weights for policy 0, policy_version 25404 (0.0025) [2024-06-12 15:39:35,679][68109] Updated weights for policy 0, policy_version 25414 (0.0028) [2024-06-12 15:39:37,026][67877] Fps is (10 sec: 50789.9, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 416432128. Throughput: 0: 50310.6. Samples: 120313120. 
Policy #0 lag: (min: 1.0, avg: 12.4, max: 25.0) [2024-06-12 15:39:37,027][67877] Avg episode reward: [(0, '0.044')] [2024-06-12 15:39:39,423][68109] Updated weights for policy 0, policy_version 25424 (0.0022) [2024-06-12 15:39:42,026][67877] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50040.6). Total num frames: 416694272. Throughput: 0: 50660.4. Samples: 120471780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 15:39:42,027][67877] Avg episode reward: [(0, '0.041')] [2024-06-12 15:39:42,185][68109] Updated weights for policy 0, policy_version 25434 (0.0027) [2024-06-12 15:39:42,775][68089] Signal inference workers to stop experience collection... (1450 times) [2024-06-12 15:39:42,813][68109] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-06-12 15:39:42,881][68089] Signal inference workers to resume experience collection... (1450 times) [2024-06-12 15:39:42,881][68109] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-06-12 15:39:45,679][68109] Updated weights for policy 0, policy_version 25444 (0.0033) [2024-06-12 15:39:47,026][67877] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50040.6). Total num frames: 416940032. Throughput: 0: 50568.4. Samples: 120767440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 15:39:47,027][67877] Avg episode reward: [(0, '0.045')] [2024-06-12 15:39:48,519][68109] Updated weights for policy 0, policy_version 25454 (0.0025) [2024-06-12 15:39:52,026][67877] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50096.2). Total num frames: 417185792. Throughput: 0: 50504.4. Samples: 121073700. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 15:39:52,027][67877] Avg episode reward: [(0, '0.042')] [2024-06-12 15:39:52,176][68109] Updated weights for policy 0, policy_version 25464 (0.0023) [2024-06-12 15:39:54,965][68109] Updated weights for policy 0, policy_version 25474 (0.0024) [2024-06-12 15:39:57,026][67877] Fps is (10 sec: 49152.5, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 417431552. Throughput: 0: 50808.9. Samples: 121229300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 15:39:57,027][67877] Avg episode reward: [(0, '0.045')] [2024-06-12 15:39:58,582][68109] Updated weights for policy 0, policy_version 25484 (0.0025) [2024-06-12 15:40:01,458][68109] Updated weights for policy 0, policy_version 25494 (0.0026) [2024-06-12 15:40:02,026][67877] Fps is (10 sec: 52428.9, 60 sec: 50790.5, 300 sec: 50151.7). Total num frames: 417710080. Throughput: 0: 50650.3. Samples: 121518700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 15:40:02,027][67877] Avg episode reward: [(0, '0.046')] [2024-06-12 15:40:04,791][68109] Updated weights for policy 0, policy_version 25504 (0.0020) [2024-06-12 15:40:07,026][67877] Fps is (10 sec: 54067.0, 60 sec: 50790.3, 300 sec: 50207.2). Total num frames: 417972224. Throughput: 0: 50780.4. Samples: 121830420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 15:40:07,027][67877] Avg episode reward: [(0, '0.046')] [2024-06-12 15:40:08,436][68109] Updated weights for policy 0, policy_version 25514 (0.0027) [2024-06-12 15:40:11,305][68109] Updated weights for policy 0, policy_version 25524 (0.0029) [2024-06-12 15:40:12,026][67877] Fps is (10 sec: 49151.8, 60 sec: 51063.5, 300 sec: 50151.7). Total num frames: 418201600. Throughput: 0: 50751.1. Samples: 121982040. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 15:40:12,027][67877] Avg episode reward: [(0, '0.040')] [2024-06-12 15:40:14,626][68109] Updated weights for policy 0, policy_version 25534 (0.0031) [2024-06-12 15:40:17,027][67877] Fps is (10 sec: 49147.2, 60 sec: 50516.5, 300 sec: 50262.6). Total num frames: 418463744. Throughput: 0: 50901.1. Samples: 122300100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-12 15:40:17,028][67877] Avg episode reward: [(0, '0.046')] [2024-06-12 15:40:17,504][68109] Updated weights for policy 0, policy_version 25544 (0.0022) [2024-06-12 15:40:21,193][68109] Updated weights for policy 0, policy_version 25554 (0.0032) [2024-06-12 15:40:22,027][67877] Fps is (10 sec: 50789.6, 60 sec: 50517.2, 300 sec: 50207.2). Total num frames: 418709504. Throughput: 0: 50865.3. Samples: 122602060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-12 15:40:22,027][67877] Avg episode reward: [(0, '0.035')] [2024-06-12 15:40:23,887][68109] Updated weights for policy 0, policy_version 25564 (0.0028) [2024-06-12 15:40:27,026][67877] Fps is (10 sec: 52434.0, 60 sec: 51063.5, 300 sec: 50262.8). Total num frames: 418988032. Throughput: 0: 50657.8. Samples: 122751380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-12 15:40:27,027][67877] Avg episode reward: [(0, '0.049')] [2024-06-12 15:40:27,083][68109] Updated weights for policy 0, policy_version 25574 (0.0030) [2024-06-12 15:40:30,350][68109] Updated weights for policy 0, policy_version 25584 (0.0029) [2024-06-12 15:40:32,026][67877] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50262.8). Total num frames: 419233792. Throughput: 0: 50721.3. Samples: 123049900. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 15:40:32,027][67877] Avg episode reward: [(0, '0.048')] [2024-06-12 15:40:33,841][68109] Updated weights for policy 0, policy_version 25594 (0.0030) [2024-06-12 15:40:36,437][68109] Updated weights for policy 0, policy_version 25604 (0.0032) [2024-06-12 15:40:37,026][67877] Fps is (10 sec: 50790.0, 60 sec: 51063.5, 300 sec: 50373.8). Total num frames: 419495936. Throughput: 0: 50645.6. Samples: 123352760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 15:40:37,027][67877] Avg episode reward: [(0, '0.047')] [2024-06-12 15:40:40,507][68109] Updated weights for policy 0, policy_version 25614 (0.0032) [2024-06-12 15:40:42,026][67877] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 419741696. Throughput: 0: 50443.6. Samples: 123499260. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 15:40:42,027][67877] Avg episode reward: [(0, '0.047')] [2024-06-12 15:40:43,312][68109] Updated weights for policy 0, policy_version 25624 (0.0029) [2024-06-12 15:40:46,786][68109] Updated weights for policy 0, policy_version 25634 (0.0031) [2024-06-12 15:40:47,026][67877] Fps is (10 sec: 49152.6, 60 sec: 50790.5, 300 sec: 50262.8). Total num frames: 419987456. Throughput: 0: 51122.6. Samples: 123819220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 15:40:47,027][67877] Avg episode reward: [(0, '0.048')] [2024-06-12 15:40:49,660][68109] Updated weights for policy 0, policy_version 25644 (0.0025) [2024-06-12 15:40:50,144][68089] Signal inference workers to stop experience collection... (1500 times) [2024-06-12 15:40:50,145][68089] Signal inference workers to resume experience collection... 
(1500 times) [2024-06-12 15:40:50,193][68109] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-06-12 15:40:50,194][68109] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-06-12 15:40:52,026][67877] Fps is (10 sec: 52428.0, 60 sec: 51336.4, 300 sec: 50540.5). Total num frames: 420265984. Throughput: 0: 51201.7. Samples: 124134500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 15:40:52,027][67877] Avg episode reward: [(0, '0.047')] [2024-06-12 15:40:53,094][68109] Updated weights for policy 0, policy_version 25654 (0.0026) [2024-06-12 15:40:56,219][68109] Updated weights for policy 0, policy_version 25664 (0.0024) [2024-06-12 15:40:57,026][67877] Fps is (10 sec: 52428.7, 60 sec: 51336.5, 300 sec: 50429.4). Total num frames: 420511744. Throughput: 0: 51152.9. Samples: 124283920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 15:40:57,027][67877] Avg episode reward: [(0, '0.044')] [2024-06-12 15:40:59,512][68109] Updated weights for policy 0, policy_version 25674 (0.0027) [2024-06-12 15:41:02,026][67877] Fps is (10 sec: 50790.9, 60 sec: 51063.4, 300 sec: 50429.4). Total num frames: 420773888. Throughput: 0: 50982.9. Samples: 124594280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 15:41:02,027][67877] Avg episode reward: [(0, '0.043')] [2024-06-12 15:41:02,425][68109] Updated weights for policy 0, policy_version 25684 (0.0022) [2024-06-12 15:41:05,595][68109] Updated weights for policy 0, policy_version 25694 (0.0023) [2024-06-12 15:41:07,027][67877] Fps is (10 sec: 52427.7, 60 sec: 51063.3, 300 sec: 50484.9). Total num frames: 421036032. Throughput: 0: 51132.4. Samples: 124903020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 15:41:07,027][67877] Avg episode reward: [(0, '0.048')] [2024-06-12 15:41:08,914][68109] Updated weights for policy 0, policy_version 25704 (0.0026) [2024-06-12 15:41:12,026][67877] Fps is (10 sec: 50790.7, 60 sec: 51336.6, 300 sec: 50540.5). 
Total num frames: 421281792. Throughput: 0: 51289.9. Samples: 125059420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 15:41:12,027][67877] Avg episode reward: [(0, '0.039')] [2024-06-12 15:41:12,067][68109] Updated weights for policy 0, policy_version 25714 (0.0025) [2024-06-12 15:41:15,350][68109] Updated weights for policy 0, policy_version 25724 (0.0028) [2024-06-12 15:41:17,026][67877] Fps is (10 sec: 49153.0, 60 sec: 51064.3, 300 sec: 50484.9). Total num frames: 421527552. Throughput: 0: 51434.8. Samples: 125364460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 15:41:17,027][67877] Avg episode reward: [(0, '0.043')] [2024-06-12 15:41:17,040][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000025729_421543936.pth... [2024-06-12 15:41:17,092][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000024986_409370624.pth [2024-06-12 15:41:18,651][68109] Updated weights for policy 0, policy_version 25734 (0.0024) [2024-06-12 15:41:22,026][67877] Fps is (10 sec: 49151.8, 60 sec: 51063.6, 300 sec: 50540.5). Total num frames: 421773312. Throughput: 0: 51275.7. Samples: 125660160. Policy #0 lag: (min: 2.0, avg: 10.9, max: 23.0) [2024-06-12 15:41:22,027][67877] Avg episode reward: [(0, '0.040')] [2024-06-12 15:41:22,115][68109] Updated weights for policy 0, policy_version 25744 (0.0023) [2024-06-12 15:41:24,967][68109] Updated weights for policy 0, policy_version 25754 (0.0022) [2024-06-12 15:41:27,027][67877] Fps is (10 sec: 52427.8, 60 sec: 51063.3, 300 sec: 50540.4). Total num frames: 422051840. Throughput: 0: 51551.3. Samples: 125819080. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 23.0) [2024-06-12 15:41:27,027][67877] Avg episode reward: [(0, '0.043')] [2024-06-12 15:41:28,390][68109] Updated weights for policy 0, policy_version 25764 (0.0022) [2024-06-12 15:41:31,407][68109] Updated weights for policy 0, policy_version 25774 (0.0025) [2024-06-12 15:41:32,026][67877] Fps is (10 sec: 54067.2, 60 sec: 51336.6, 300 sec: 50651.6). Total num frames: 422313984. Throughput: 0: 51419.1. Samples: 126133080. Policy #0 lag: (min: 2.0, avg: 10.9, max: 23.0) [2024-06-12 15:41:32,027][67877] Avg episode reward: [(0, '0.047')] [2024-06-12 15:41:35,047][68109] Updated weights for policy 0, policy_version 25784 (0.0028) [2024-06-12 15:41:37,026][67877] Fps is (10 sec: 47514.6, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 422526976. Throughput: 0: 50826.4. Samples: 126421680. Policy #0 lag: (min: 2.0, avg: 10.9, max: 23.0) [2024-06-12 15:41:37,027][67877] Avg episode reward: [(0, '0.045')] [2024-06-12 15:41:37,940][68109] Updated weights for policy 0, policy_version 25794 (0.0034) [2024-06-12 15:41:41,300][68109] Updated weights for policy 0, policy_version 25804 (0.0022) [2024-06-12 15:41:42,026][67877] Fps is (10 sec: 47513.3, 60 sec: 50790.3, 300 sec: 50596.0). Total num frames: 422789120. Throughput: 0: 50801.7. Samples: 126570000. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-12 15:41:42,027][67877] Avg episode reward: [(0, '0.052')] [2024-06-12 15:41:44,599][68109] Updated weights for policy 0, policy_version 25814 (0.0027) [2024-06-12 15:41:47,026][67877] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50596.0). Total num frames: 423051264. Throughput: 0: 50746.7. Samples: 126877880. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-12 15:41:47,027][67877] Avg episode reward: [(0, '0.046')] [2024-06-12 15:41:47,713][68109] Updated weights for policy 0, policy_version 25824 (0.0034) [2024-06-12 15:41:50,935][68109] Updated weights for policy 0, policy_version 25834 (0.0028) [2024-06-12 15:41:52,026][67877] Fps is (10 sec: 54067.0, 60 sec: 51063.5, 300 sec: 50651.5). Total num frames: 423329792. Throughput: 0: 50852.1. Samples: 127191360. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-12 15:41:52,027][67877] Avg episode reward: [(0, '0.044')] [2024-06-12 15:41:54,121][68109] Updated weights for policy 0, policy_version 25844 (0.0020) [2024-06-12 15:41:57,026][67877] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50651.5). Total num frames: 423575552. Throughput: 0: 50759.4. Samples: 127343600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 15:41:57,027][67877] Avg episode reward: [(0, '0.051')] [2024-06-12 15:41:57,261][68109] Updated weights for policy 0, policy_version 25854 (0.0027) [2024-06-12 15:42:01,032][68109] Updated weights for policy 0, policy_version 25864 (0.0030) [2024-06-12 15:42:02,026][67877] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 423821312. Throughput: 0: 50795.6. Samples: 127650260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 15:42:02,027][67877] Avg episode reward: [(0, '0.042')] [2024-06-12 15:42:03,452][68109] Updated weights for policy 0, policy_version 25874 (0.0023) [2024-06-12 15:42:06,880][68109] Updated weights for policy 0, policy_version 25884 (0.0024) [2024-06-12 15:42:07,027][67877] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 424083456. Throughput: 0: 51074.9. Samples: 127958540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 15:42:07,027][67877] Avg episode reward: [(0, '0.045')] [2024-06-12 15:42:07,853][68089] Signal inference workers to stop experience collection... 
(1550 times) [2024-06-12 15:42:07,854][68089] Signal inference workers to resume experience collection... (1550 times) [2024-06-12 15:42:07,899][68109] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-06-12 15:42:07,900][68109] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-06-12 15:42:09,852][68109] Updated weights for policy 0, policy_version 25894 (0.0025) [2024-06-12 15:42:12,026][67877] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 424329216. Throughput: 0: 50783.8. Samples: 128104340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 15:42:12,027][67877] Avg episode reward: [(0, '0.044')] [2024-06-12 15:42:13,308][68109] Updated weights for policy 0, policy_version 25904 (0.0023) [2024-06-12 15:42:16,261][68109] Updated weights for policy 0, policy_version 25914 (0.0026) [2024-06-12 15:42:17,026][67877] Fps is (10 sec: 52429.4, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 424607744. Throughput: 0: 50766.6. Samples: 128417580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 15:42:17,027][67877] Avg episode reward: [(0, '0.042')] [2024-06-12 15:42:19,480][68109] Updated weights for policy 0, policy_version 25924 (0.0023) [2024-06-12 15:42:22,026][67877] Fps is (10 sec: 50790.0, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 424837120. Throughput: 0: 51085.3. Samples: 128720520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 15:42:22,027][67877] Avg episode reward: [(0, '0.048')] [2024-06-12 15:42:23,081][68109] Updated weights for policy 0, policy_version 25934 (0.0034) [2024-06-12 15:42:26,054][68109] Updated weights for policy 0, policy_version 25944 (0.0028) [2024-06-12 15:42:27,027][67877] Fps is (10 sec: 49151.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 425099264. Throughput: 0: 51098.5. Samples: 128869440. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 15:42:27,027][67877] Avg episode reward: [(0, '0.043')] [2024-06-12 15:42:29,391][68109] Updated weights for policy 0, policy_version 25954 (0.0025) [2024-06-12 15:42:32,026][67877] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 425361408. Throughput: 0: 51192.8. Samples: 129181560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 15:42:32,027][67877] Avg episode reward: [(0, '0.042')] [2024-06-12 15:42:32,431][68109] Updated weights for policy 0, policy_version 25964 (0.0035) [2024-06-12 15:42:36,033][68109] Updated weights for policy 0, policy_version 25974 (0.0027) [2024-06-12 15:42:37,026][67877] Fps is (10 sec: 49152.9, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 425590784. Throughput: 0: 50920.1. Samples: 129482760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 15:42:37,027][67877] Avg episode reward: [(0, '0.046')] [2024-06-12 15:42:38,887][68109] Updated weights for policy 0, policy_version 25984 (0.0027) [2024-06-12 15:42:42,026][67877] Fps is (10 sec: 50790.3, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 425869312. Throughput: 0: 50717.8. Samples: 129625900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 15:42:42,027][67877] Avg episode reward: [(0, '0.042')] [2024-06-12 15:42:42,674][68109] Updated weights for policy 0, policy_version 25994 (0.0021) [2024-06-12 15:42:45,269][68109] Updated weights for policy 0, policy_version 26004 (0.0029) [2024-06-12 15:42:47,026][67877] Fps is (10 sec: 55705.2, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 426147840. Throughput: 0: 50865.3. Samples: 129939200. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 15:42:47,027][67877] Avg episode reward: [(0, '0.049')] [2024-06-12 15:42:48,620][68109] Updated weights for policy 0, policy_version 26014 (0.0030) [2024-06-12 15:42:51,459][68109] Updated weights for policy 0, policy_version 26024 (0.0025) [2024-06-12 15:42:52,026][67877] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 426393600. Throughput: 0: 50990.3. Samples: 130253100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 15:42:52,027][67877] Avg episode reward: [(0, '0.055')] [2024-06-12 15:42:54,973][68109] Updated weights for policy 0, policy_version 26034 (0.0033) [2024-06-12 15:42:57,026][67877] Fps is (10 sec: 49151.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 426639360. Throughput: 0: 51315.4. Samples: 130413540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 15:42:57,027][67877] Avg episode reward: [(0, '0.052')] [2024-06-12 15:42:57,976][68109] Updated weights for policy 0, policy_version 26044 (0.0024) [2024-06-12 15:43:01,804][68109] Updated weights for policy 0, policy_version 26054 (0.0022) [2024-06-12 15:43:02,026][67877] Fps is (10 sec: 47514.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 426868736. Throughput: 0: 51093.8. Samples: 130716800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 15:43:02,027][67877] Avg episode reward: [(0, '0.046')] [2024-06-12 15:43:04,437][68109] Updated weights for policy 0, policy_version 26064 (0.0025) [2024-06-12 15:43:07,026][67877] Fps is (10 sec: 52429.2, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 427163648. Throughput: 0: 51172.9. Samples: 131023300. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 15:43:07,027][67877] Avg episode reward: [(0, '0.055')] [2024-06-12 15:43:07,772][68109] Updated weights for policy 0, policy_version 26074 (0.0021) [2024-06-12 15:43:10,386][68109] Updated weights for policy 0, policy_version 26084 (0.0027) [2024-06-12 15:43:12,026][67877] Fps is (10 sec: 55705.2, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 427425792. Throughput: 0: 51553.5. Samples: 131189340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 15:43:12,027][67877] Avg episode reward: [(0, '0.048')] [2024-06-12 15:43:14,403][68109] Updated weights for policy 0, policy_version 26094 (0.0027) [2024-06-12 15:43:16,987][68089] Signal inference workers to stop experience collection... (1600 times) [2024-06-12 15:43:17,025][68109] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-06-12 15:43:17,026][67877] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 427671552. Throughput: 0: 51575.4. Samples: 131502460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 15:43:17,027][67877] Avg episode reward: [(0, '0.057')] [2024-06-12 15:43:17,092][68089] Signal inference workers to resume experience collection... (1600 times) [2024-06-12 15:43:17,093][68109] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-06-12 15:43:17,096][68109] Updated weights for policy 0, policy_version 26104 (0.0032) [2024-06-12 15:43:17,219][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000026105_427704320.pth... [2024-06-12 15:43:17,264][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000025356_415432704.pth [2024-06-12 15:43:20,585][68109] Updated weights for policy 0, policy_version 26114 (0.0021) [2024-06-12 15:43:22,027][67877] Fps is (10 sec: 50789.6, 60 sec: 51609.4, 300 sec: 50984.8). Total num frames: 427933696. Throughput: 0: 51726.4. Samples: 131810460. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 15:43:22,027][67877] Avg episode reward: [(0, '0.048')] [2024-06-12 15:43:23,485][68109] Updated weights for policy 0, policy_version 26124 (0.0026) [2024-06-12 15:43:27,003][68109] Updated weights for policy 0, policy_version 26134 (0.0025) [2024-06-12 15:43:27,026][67877] Fps is (10 sec: 50790.3, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 428179456. Throughput: 0: 51723.4. Samples: 131953460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 15:43:27,027][67877] Avg episode reward: [(0, '0.050')] [2024-06-12 15:43:30,143][68109] Updated weights for policy 0, policy_version 26144 (0.0027) [2024-06-12 15:43:32,026][67877] Fps is (10 sec: 50791.3, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 428441600. Throughput: 0: 51666.3. Samples: 132264180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 15:43:32,027][67877] Avg episode reward: [(0, '0.046')] [2024-06-12 15:43:33,310][68109] Updated weights for policy 0, policy_version 26154 (0.0021) [2024-06-12 15:43:36,475][68109] Updated weights for policy 0, policy_version 26164 (0.0031) [2024-06-12 15:43:37,027][67877] Fps is (10 sec: 52428.4, 60 sec: 51882.5, 300 sec: 50984.8). Total num frames: 428703744. Throughput: 0: 51442.1. Samples: 132568000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 15:43:37,027][67877] Avg episode reward: [(0, '0.052')] [2024-06-12 15:43:39,654][68109] Updated weights for policy 0, policy_version 26174 (0.0025) [2024-06-12 15:43:42,027][67877] Fps is (10 sec: 49151.2, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 428933120. Throughput: 0: 51152.8. Samples: 132715420. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 15:43:42,027][67877] Avg episode reward: [(0, '0.051')] [2024-06-12 15:43:42,923][68109] Updated weights for policy 0, policy_version 26184 (0.0029) [2024-06-12 15:43:46,198][68109] Updated weights for policy 0, policy_version 26194 (0.0022) [2024-06-12 15:43:47,027][67877] Fps is (10 sec: 50790.4, 60 sec: 51063.3, 300 sec: 51040.3). Total num frames: 429211648. Throughput: 0: 51452.2. Samples: 133032160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 15:43:47,035][67877] Avg episode reward: [(0, '0.046')] [2024-06-12 15:43:49,306][68109] Updated weights for policy 0, policy_version 26204 (0.0030) [2024-06-12 15:43:52,026][67877] Fps is (10 sec: 52429.9, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 429457408. Throughput: 0: 51473.4. Samples: 133339600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 15:43:52,027][67877] Avg episode reward: [(0, '0.057')] [2024-06-12 15:43:52,521][68109] Updated weights for policy 0, policy_version 26214 (0.0026) [2024-06-12 15:43:55,521][68109] Updated weights for policy 0, policy_version 26224 (0.0023) [2024-06-12 15:43:57,026][67877] Fps is (10 sec: 50791.6, 60 sec: 51336.7, 300 sec: 51040.3). Total num frames: 429719552. Throughput: 0: 51199.2. Samples: 133493300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 15:43:57,027][67877] Avg episode reward: [(0, '0.048')] [2024-06-12 15:43:58,785][68109] Updated weights for policy 0, policy_version 26234 (0.0024) [2024-06-12 15:44:01,953][68109] Updated weights for policy 0, policy_version 26244 (0.0031) [2024-06-12 15:44:02,026][67877] Fps is (10 sec: 52428.5, 60 sec: 51882.6, 300 sec: 51040.3). Total num frames: 429981696. Throughput: 0: 51086.3. Samples: 133801340. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-12 15:44:02,027][67877] Avg episode reward: [(0, '0.047')] [2024-06-12 15:44:05,096][68109] Updated weights for policy 0, policy_version 26254 (0.0024) [2024-06-12 15:44:07,026][67877] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 51151.4). Total num frames: 430227456. Throughput: 0: 51073.5. Samples: 134108760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-12 15:44:07,027][67877] Avg episode reward: [(0, '0.052')] [2024-06-12 15:44:08,294][68109] Updated weights for policy 0, policy_version 26264 (0.0023) [2024-06-12 15:44:11,414][68109] Updated weights for policy 0, policy_version 26274 (0.0024) [2024-06-12 15:44:12,026][67877] Fps is (10 sec: 52428.2, 60 sec: 51336.5, 300 sec: 51095.8). Total num frames: 430505984. Throughput: 0: 51176.0. Samples: 134256380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-12 15:44:12,027][67877] Avg episode reward: [(0, '0.057')] [2024-06-12 15:44:14,955][68109] Updated weights for policy 0, policy_version 26284 (0.0027) [2024-06-12 15:44:17,026][67877] Fps is (10 sec: 52428.3, 60 sec: 51336.5, 300 sec: 51095.8). Total num frames: 430751744. Throughput: 0: 51175.9. Samples: 134567100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 15:44:17,027][67877] Avg episode reward: [(0, '0.049')] [2024-06-12 15:44:17,888][68109] Updated weights for policy 0, policy_version 26294 (0.0030) [2024-06-12 15:44:21,219][68109] Updated weights for policy 0, policy_version 26304 (0.0033) [2024-06-12 15:44:22,026][67877] Fps is (10 sec: 49152.6, 60 sec: 51063.6, 300 sec: 51095.9). Total num frames: 430997504. Throughput: 0: 51266.4. Samples: 134874980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 15:44:22,027][67877] Avg episode reward: [(0, '0.059')] [2024-06-12 15:44:24,113][68109] Updated weights for policy 0, policy_version 26314 (0.0034) [2024-06-12 15:44:27,026][67877] Fps is (10 sec: 52429.4, 60 sec: 51609.7, 300 sec: 51206.9). 
Total num frames: 431276032. Throughput: 0: 51464.7. Samples: 135031320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 15:44:27,027][67877] Avg episode reward: [(0, '0.051')] [2024-06-12 15:44:27,538][68109] Updated weights for policy 0, policy_version 26324 (0.0025) [2024-06-12 15:44:27,846][68089] Signal inference workers to stop experience collection... (1650 times) [2024-06-12 15:44:27,846][68089] Signal inference workers to resume experience collection... (1650 times) [2024-06-12 15:44:27,855][68109] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-06-12 15:44:27,856][68109] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-06-12 15:44:30,575][68109] Updated weights for policy 0, policy_version 26334 (0.0032) [2024-06-12 15:44:32,026][67877] Fps is (10 sec: 52428.3, 60 sec: 51336.5, 300 sec: 51151.4). Total num frames: 431521792. Throughput: 0: 51157.4. Samples: 135334240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 15:44:32,027][67877] Avg episode reward: [(0, '0.051')] [2024-06-12 15:44:33,884][68109] Updated weights for policy 0, policy_version 26344 (0.0026) [2024-06-12 15:44:36,993][68109] Updated weights for policy 0, policy_version 26354 (0.0022) [2024-06-12 15:44:37,026][67877] Fps is (10 sec: 50790.0, 60 sec: 51336.6, 300 sec: 51151.4). Total num frames: 431783936. Throughput: 0: 51345.2. Samples: 135650140. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-12 15:44:37,027][67877] Avg episode reward: [(0, '0.053')] [2024-06-12 15:44:40,155][68109] Updated weights for policy 0, policy_version 26364 (0.0023) [2024-06-12 15:44:42,026][67877] Fps is (10 sec: 49152.7, 60 sec: 51336.7, 300 sec: 51095.9). Total num frames: 432013312. Throughput: 0: 51384.0. Samples: 135805580. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-12 15:44:42,027][67877] Avg episode reward: [(0, '0.057')] [2024-06-12 15:44:43,479][68109] Updated weights for policy 0, policy_version 26374 (0.0023) [2024-06-12 15:44:46,420][68109] Updated weights for policy 0, policy_version 26384 (0.0027) [2024-06-12 15:44:47,026][67877] Fps is (10 sec: 52428.8, 60 sec: 51609.7, 300 sec: 51262.5). Total num frames: 432308224. Throughput: 0: 51368.4. Samples: 136112920. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-12 15:44:47,027][67877] Avg episode reward: [(0, '0.048')] [2024-06-12 15:44:49,687][68109] Updated weights for policy 0, policy_version 26394 (0.0031) [2024-06-12 15:44:52,026][67877] Fps is (10 sec: 54066.6, 60 sec: 51609.5, 300 sec: 51262.5). Total num frames: 432553984. Throughput: 0: 51434.6. Samples: 136423320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-12 15:44:52,027][67877] Avg episode reward: [(0, '0.058')] [2024-06-12 15:44:52,794][68109] Updated weights for policy 0, policy_version 26404 (0.0020) [2024-06-12 15:44:56,090][68109] Updated weights for policy 0, policy_version 26414 (0.0027) [2024-06-12 15:44:57,026][67877] Fps is (10 sec: 50790.6, 60 sec: 51609.5, 300 sec: 51206.9). Total num frames: 432816128. Throughput: 0: 51660.1. Samples: 136581080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-12 15:44:57,027][67877] Avg episode reward: [(0, '0.051')] [2024-06-12 15:44:59,174][68109] Updated weights for policy 0, policy_version 26424 (0.0027) [2024-06-12 15:45:02,026][67877] Fps is (10 sec: 49151.8, 60 sec: 51063.4, 300 sec: 51095.9). Total num frames: 433045504. Throughput: 0: 51364.9. Samples: 136878520. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-12 15:45:02,027][67877] Avg episode reward: [(0, '0.049')] [2024-06-12 15:45:02,656][68109] Updated weights for policy 0, policy_version 26434 (0.0027) [2024-06-12 15:45:05,648][68109] Updated weights for policy 0, policy_version 26444 (0.0031) [2024-06-12 15:45:07,026][67877] Fps is (10 sec: 52428.8, 60 sec: 51882.7, 300 sec: 51318.0). Total num frames: 433340416. Throughput: 0: 51556.4. Samples: 137195020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-12 15:45:07,027][67877] Avg episode reward: [(0, '0.065')] [2024-06-12 15:45:08,807][68109] Updated weights for policy 0, policy_version 26454 (0.0028) [2024-06-12 15:45:11,679][68109] Updated weights for policy 0, policy_version 26464 (0.0029) [2024-06-12 15:45:12,026][67877] Fps is (10 sec: 55706.1, 60 sec: 51609.7, 300 sec: 51318.2). Total num frames: 433602560. Throughput: 0: 51524.9. Samples: 137349940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-12 15:45:12,027][67877] Avg episode reward: [(0, '0.065')] [2024-06-12 15:45:15,135][68109] Updated weights for policy 0, policy_version 26474 (0.0025) [2024-06-12 15:45:17,026][67877] Fps is (10 sec: 49151.4, 60 sec: 51336.5, 300 sec: 51262.5). Total num frames: 433831936. Throughput: 0: 51836.8. Samples: 137666900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-12 15:45:17,027][67877] Avg episode reward: [(0, '0.048')] [2024-06-12 15:45:17,037][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000026479_433831936.pth... [2024-06-12 15:45:17,099][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000025729_421543936.pth [2024-06-12 15:45:17,852][68109] Updated weights for policy 0, policy_version 26484 (0.0027) [2024-06-12 15:45:21,395][68109] Updated weights for policy 0, policy_version 26494 (0.0026) [2024-06-12 15:45:22,026][67877] Fps is (10 sec: 50790.4, 60 sec: 51882.7, 300 sec: 51262.5). Total num frames: 434110464. 
Throughput: 0: 51678.3. Samples: 137975660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-12 15:45:22,027][67877] Avg episode reward: [(0, '0.057')] [2024-06-12 15:45:24,190][68109] Updated weights for policy 0, policy_version 26504 (0.0032) [2024-06-12 15:45:27,026][67877] Fps is (10 sec: 49153.0, 60 sec: 50790.4, 300 sec: 51151.4). Total num frames: 434323456. Throughput: 0: 51599.1. Samples: 138127540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 15:45:27,027][67877] Avg episode reward: [(0, '0.063')] [2024-06-12 15:45:27,878][68109] Updated weights for policy 0, policy_version 26514 (0.0028) [2024-06-12 15:45:30,675][68109] Updated weights for policy 0, policy_version 26524 (0.0026) [2024-06-12 15:45:32,026][67877] Fps is (10 sec: 49151.6, 60 sec: 51336.5, 300 sec: 51206.9). Total num frames: 434601984. Throughput: 0: 51431.1. Samples: 138427320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 15:45:32,027][67877] Avg episode reward: [(0, '0.061')] [2024-06-12 15:45:34,129][68109] Updated weights for policy 0, policy_version 26534 (0.0026) [2024-06-12 15:45:36,707][68109] Updated weights for policy 0, policy_version 26544 (0.0025) [2024-06-12 15:45:37,026][67877] Fps is (10 sec: 57343.8, 60 sec: 51882.7, 300 sec: 51373.6). Total num frames: 434896896. Throughput: 0: 51555.7. Samples: 138743320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 15:45:37,027][67877] Avg episode reward: [(0, '0.054')] [2024-06-12 15:45:38,596][68089] Signal inference workers to stop experience collection... (1700 times) [2024-06-12 15:45:38,651][68109] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-06-12 15:45:38,651][68089] Signal inference workers to resume experience collection... 
(1700 times) [2024-06-12 15:45:38,662][68109] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-06-12 15:45:40,581][68109] Updated weights for policy 0, policy_version 26554 (0.0023) [2024-06-12 15:45:42,026][67877] Fps is (10 sec: 52429.2, 60 sec: 51882.6, 300 sec: 51318.0). Total num frames: 435126272. Throughput: 0: 51736.0. Samples: 138909200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 15:45:42,027][67877] Avg episode reward: [(0, '0.060')] [2024-06-12 15:45:43,350][68109] Updated weights for policy 0, policy_version 26564 (0.0024) [2024-06-12 15:45:47,026][67877] Fps is (10 sec: 47513.6, 60 sec: 51063.5, 300 sec: 51207.0). Total num frames: 435372032. Throughput: 0: 51976.1. Samples: 139217440. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-12 15:45:47,027][67877] Avg episode reward: [(0, '0.058')] [2024-06-12 15:45:47,072][68109] Updated weights for policy 0, policy_version 26574 (0.0025) [2024-06-12 15:45:49,443][68109] Updated weights for policy 0, policy_version 26584 (0.0019) [2024-06-12 15:45:52,026][67877] Fps is (10 sec: 50790.4, 60 sec: 51336.6, 300 sec: 51262.5). Total num frames: 435634176. Throughput: 0: 51876.0. Samples: 139529440. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-12 15:45:52,027][67877] Avg episode reward: [(0, '0.070')] [2024-06-12 15:45:53,435][68109] Updated weights for policy 0, policy_version 26594 (0.0029) [2024-06-12 15:45:55,901][68109] Updated weights for policy 0, policy_version 26604 (0.0026) [2024-06-12 15:45:57,026][67877] Fps is (10 sec: 52428.3, 60 sec: 51336.5, 300 sec: 51262.5). Total num frames: 435896320. Throughput: 0: 51618.2. Samples: 139672760. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-12 15:45:57,027][67877] Avg episode reward: [(0, '0.065')] [2024-06-12 15:45:59,567][68109] Updated weights for policy 0, policy_version 26614 (0.0025) [2024-06-12 15:46:02,026][67877] Fps is (10 sec: 54066.6, 60 sec: 52155.7, 300 sec: 51318.0). 
Total num frames: 436174848. Throughput: 0: 51457.8. Samples: 139982500. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-12 15:46:02,027][67877] Avg episode reward: [(0, '0.060')] [2024-06-12 15:46:02,589][68109] Updated weights for policy 0, policy_version 26624 (0.0025) [2024-06-12 15:46:05,852][68109] Updated weights for policy 0, policy_version 26634 (0.0028) [2024-06-12 15:46:07,026][67877] Fps is (10 sec: 52428.9, 60 sec: 51336.5, 300 sec: 51318.0). Total num frames: 436420608. Throughput: 0: 51582.6. Samples: 140296880. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-12 15:46:07,027][67877] Avg episode reward: [(0, '0.063')] [2024-06-12 15:46:09,345][68109] Updated weights for policy 0, policy_version 26644 (0.0033) [2024-06-12 15:46:12,026][67877] Fps is (10 sec: 50791.2, 60 sec: 51336.6, 300 sec: 51373.6). Total num frames: 436682752. Throughput: 0: 51629.3. Samples: 140450860. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-12 15:46:12,027][67877] Avg episode reward: [(0, '0.066')] [2024-06-12 15:46:12,340][68109] Updated weights for policy 0, policy_version 26654 (0.0022) [2024-06-12 15:46:15,293][68109] Updated weights for policy 0, policy_version 26664 (0.0028) [2024-06-12 15:46:17,026][67877] Fps is (10 sec: 49152.0, 60 sec: 51336.6, 300 sec: 51318.0). Total num frames: 436912128. Throughput: 0: 51819.6. Samples: 140759200. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-12 15:46:17,027][67877] Avg episode reward: [(0, '0.059')] [2024-06-12 15:46:18,789][68109] Updated weights for policy 0, policy_version 26674 (0.0018) [2024-06-12 15:46:21,780][68109] Updated weights for policy 0, policy_version 26684 (0.0026) [2024-06-12 15:46:22,026][67877] Fps is (10 sec: 50789.6, 60 sec: 51336.4, 300 sec: 51318.0). Total num frames: 437190656. Throughput: 0: 51572.3. Samples: 141064080. 
Policy #0 lag: (min: 2.0, avg: 10.7, max: 22.0) [2024-06-12 15:46:22,027][67877] Avg episode reward: [(0, '0.059')] [2024-06-12 15:46:25,139][68109] Updated weights for policy 0, policy_version 26694 (0.0024) [2024-06-12 15:46:27,026][67877] Fps is (10 sec: 57343.7, 60 sec: 52701.7, 300 sec: 51429.1). Total num frames: 437485568. Throughput: 0: 51624.8. Samples: 141232320. Policy #0 lag: (min: 2.0, avg: 10.7, max: 22.0) [2024-06-12 15:46:27,027][67877] Avg episode reward: [(0, '0.073')] [2024-06-12 15:46:28,115][68109] Updated weights for policy 0, policy_version 26704 (0.0025) [2024-06-12 15:46:31,375][68109] Updated weights for policy 0, policy_version 26714 (0.0028) [2024-06-12 15:46:32,026][67877] Fps is (10 sec: 52429.4, 60 sec: 51882.7, 300 sec: 51484.6). Total num frames: 437714944. Throughput: 0: 51532.0. Samples: 141536380. Policy #0 lag: (min: 2.0, avg: 10.7, max: 22.0) [2024-06-12 15:46:32,027][67877] Avg episode reward: [(0, '0.056')] [2024-06-12 15:46:34,483][68109] Updated weights for policy 0, policy_version 26724 (0.0026) [2024-06-12 15:46:37,026][67877] Fps is (10 sec: 47514.2, 60 sec: 51063.5, 300 sec: 51429.1). Total num frames: 437960704. Throughput: 0: 51514.7. Samples: 141847600. Policy #0 lag: (min: 2.0, avg: 10.7, max: 22.0) [2024-06-12 15:46:37,027][67877] Avg episode reward: [(0, '0.056')] [2024-06-12 15:46:37,617][68109] Updated weights for policy 0, policy_version 26734 (0.0023) [2024-06-12 15:46:41,008][68109] Updated weights for policy 0, policy_version 26744 (0.0022) [2024-06-12 15:46:42,026][67877] Fps is (10 sec: 49151.9, 60 sec: 51336.5, 300 sec: 51373.6). Total num frames: 438206464. Throughput: 0: 51585.8. Samples: 141994120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 15:46:42,027][67877] Avg episode reward: [(0, '0.063')] [2024-06-12 15:46:44,178][68109] Updated weights for policy 0, policy_version 26754 (0.0032) [2024-06-12 15:46:47,026][67877] Fps is (10 sec: 52428.8, 60 sec: 51882.7, 300 sec: 51373.6). 
Total num frames: 438484992. Throughput: 0: 51459.3. Samples: 142298160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 15:46:47,027][67877] Avg episode reward: [(0, '0.062')] [2024-06-12 15:46:47,281][68109] Updated weights for policy 0, policy_version 26764 (0.0028) [2024-06-12 15:46:48,362][68089] Signal inference workers to stop experience collection... (1750 times) [2024-06-12 15:46:48,410][68109] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-06-12 15:46:48,472][68089] Signal inference workers to resume experience collection... (1750 times) [2024-06-12 15:46:48,473][68109] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-06-12 15:46:50,313][68109] Updated weights for policy 0, policy_version 26774 (0.0026) [2024-06-12 15:46:52,026][67877] Fps is (10 sec: 52428.9, 60 sec: 51609.6, 300 sec: 51373.6). Total num frames: 438730752. Throughput: 0: 51465.8. Samples: 142612840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 15:46:52,027][67877] Avg episode reward: [(0, '0.055')] [2024-06-12 15:46:53,274][68109] Updated weights for policy 0, policy_version 26784 (0.0031) [2024-06-12 15:46:56,747][68109] Updated weights for policy 0, policy_version 26794 (0.0020) [2024-06-12 15:46:57,026][67877] Fps is (10 sec: 50790.5, 60 sec: 51609.7, 300 sec: 51429.1). Total num frames: 438992896. Throughput: 0: 51670.2. Samples: 142776020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:46:57,027][67877] Avg episode reward: [(0, '0.059')] [2024-06-12 15:46:59,719][68109] Updated weights for policy 0, policy_version 26804 (0.0025) [2024-06-12 15:47:02,026][67877] Fps is (10 sec: 54066.8, 60 sec: 51609.6, 300 sec: 51484.6). Total num frames: 439271424. Throughput: 0: 51768.4. Samples: 143088780. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:47:02,027][67877] Avg episode reward: [(0, '0.064')] [2024-06-12 15:47:02,949][68109] Updated weights for policy 0, policy_version 26814 (0.0030) [2024-06-12 15:47:05,967][68109] Updated weights for policy 0, policy_version 26824 (0.0029) [2024-06-12 15:47:07,027][67877] Fps is (10 sec: 55704.5, 60 sec: 52155.6, 300 sec: 51595.7). Total num frames: 439549952. Throughput: 0: 52038.6. Samples: 143405820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:47:07,027][67877] Avg episode reward: [(0, '0.076')] [2024-06-12 15:47:09,071][68109] Updated weights for policy 0, policy_version 26834 (0.0029) [2024-06-12 15:47:12,026][67877] Fps is (10 sec: 52428.9, 60 sec: 51882.6, 300 sec: 51484.6). Total num frames: 439795712. Throughput: 0: 51808.5. Samples: 143563700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:47:12,027][67877] Avg episode reward: [(0, '0.072')] [2024-06-12 15:47:12,458][68109] Updated weights for policy 0, policy_version 26844 (0.0019) [2024-06-12 15:47:15,467][68109] Updated weights for policy 0, policy_version 26854 (0.0025) [2024-06-12 15:47:17,026][67877] Fps is (10 sec: 47514.2, 60 sec: 51882.7, 300 sec: 51484.6). Total num frames: 440025088. Throughput: 0: 51840.4. Samples: 143869200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 15:47:17,027][67877] Avg episode reward: [(0, '0.058')] [2024-06-12 15:47:17,163][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000026858_440041472.pth... [2024-06-12 15:47:17,204][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000026105_427704320.pth [2024-06-12 15:47:18,956][68109] Updated weights for policy 0, policy_version 26864 (0.0027) [2024-06-12 15:47:21,634][68109] Updated weights for policy 0, policy_version 26874 (0.0022) [2024-06-12 15:47:22,026][67877] Fps is (10 sec: 54068.0, 60 sec: 52429.0, 300 sec: 51651.3). Total num frames: 440336384. 
Throughput: 0: 51713.9. Samples: 144174720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 15:47:22,027][67877] Avg episode reward: [(0, '0.054')] [2024-06-12 15:47:25,264][68109] Updated weights for policy 0, policy_version 26884 (0.0024) [2024-06-12 15:47:27,026][67877] Fps is (10 sec: 54067.5, 60 sec: 51336.6, 300 sec: 51540.2). Total num frames: 440565760. Throughput: 0: 52015.1. Samples: 144334800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 15:47:27,027][67877] Avg episode reward: [(0, '0.067')] [2024-06-12 15:47:27,860][68109] Updated weights for policy 0, policy_version 26894 (0.0024) [2024-06-12 15:47:31,787][68109] Updated weights for policy 0, policy_version 26904 (0.0027) [2024-06-12 15:47:32,026][67877] Fps is (10 sec: 45874.8, 60 sec: 51336.5, 300 sec: 51540.2). Total num frames: 440795136. Throughput: 0: 51920.4. Samples: 144634580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 15:47:32,027][67877] Avg episode reward: [(0, '0.067')] [2024-06-12 15:47:34,175][68109] Updated weights for policy 0, policy_version 26914 (0.0025) [2024-06-12 15:47:37,027][67877] Fps is (10 sec: 49151.2, 60 sec: 51609.4, 300 sec: 51484.6). Total num frames: 441057280. Throughput: 0: 51958.5. Samples: 144950980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 15:47:37,027][67877] Avg episode reward: [(0, '0.054')] [2024-06-12 15:47:37,975][68109] Updated weights for policy 0, policy_version 26924 (0.0032) [2024-06-12 15:47:40,678][68109] Updated weights for policy 0, policy_version 26934 (0.0026) [2024-06-12 15:47:42,026][67877] Fps is (10 sec: 50790.3, 60 sec: 51609.6, 300 sec: 51373.6). Total num frames: 441303040. Throughput: 0: 51393.3. Samples: 145088720. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 15:47:42,027][67877] Avg episode reward: [(0, '0.058')] [2024-06-12 15:47:44,455][68109] Updated weights for policy 0, policy_version 26944 (0.0026) [2024-06-12 15:47:46,792][68109] Updated weights for policy 0, policy_version 26954 (0.0024) [2024-06-12 15:47:47,026][67877] Fps is (10 sec: 55706.5, 60 sec: 52155.7, 300 sec: 51595.7). Total num frames: 441614336. Throughput: 0: 51235.6. Samples: 145394380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 15:47:47,027][67877] Avg episode reward: [(0, '0.060')] [2024-06-12 15:47:50,595][68109] Updated weights for policy 0, policy_version 26964 (0.0022) [2024-06-12 15:47:50,888][68089] Signal inference workers to stop experience collection... (1800 times) [2024-06-12 15:47:50,891][68089] Signal inference workers to resume experience collection... (1800 times) [2024-06-12 15:47:50,902][68109] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-06-12 15:47:50,902][68109] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-06-12 15:47:52,026][67877] Fps is (10 sec: 57344.4, 60 sec: 52428.8, 300 sec: 51651.3). Total num frames: 441876480. Throughput: 0: 51302.0. Samples: 145714400. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-06-12 15:47:52,027][67877] Avg episode reward: [(0, '0.059')] [2024-06-12 15:47:53,514][68109] Updated weights for policy 0, policy_version 26974 (0.0031) [2024-06-12 15:47:57,026][67877] Fps is (10 sec: 47513.8, 60 sec: 51609.6, 300 sec: 51595.7). Total num frames: 442089472. Throughput: 0: 51353.0. Samples: 145874580. 
Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-06-12 15:47:57,027][67877] Avg episode reward: [(0, '0.067')] [2024-06-12 15:47:57,060][68109] Updated weights for policy 0, policy_version 26984 (0.0026) [2024-06-12 15:47:59,701][68109] Updated weights for policy 0, policy_version 26994 (0.0030) [2024-06-12 15:48:02,026][67877] Fps is (10 sec: 42597.7, 60 sec: 50517.3, 300 sec: 51318.0). Total num frames: 442302464. Throughput: 0: 51207.0. Samples: 146173520. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-06-12 15:48:02,027][67877] Avg episode reward: [(0, '0.067')] [2024-06-12 15:48:03,541][68109] Updated weights for policy 0, policy_version 27004 (0.0021) [2024-06-12 15:48:05,902][68109] Updated weights for policy 0, policy_version 27014 (0.0025) [2024-06-12 15:48:07,026][67877] Fps is (10 sec: 54066.9, 60 sec: 51336.7, 300 sec: 51540.2). Total num frames: 442630144. Throughput: 0: 51367.9. Samples: 146486280. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-06-12 15:48:07,027][67877] Avg episode reward: [(0, '0.072')] [2024-06-12 15:48:09,909][68109] Updated weights for policy 0, policy_version 27024 (0.0028) [2024-06-12 15:48:12,026][67877] Fps is (10 sec: 60620.7, 60 sec: 51882.6, 300 sec: 51651.2). Total num frames: 442908672. Throughput: 0: 51486.1. Samples: 146651680. Policy #0 lag: (min: 0.0, avg: 6.5, max: 21.0) [2024-06-12 15:48:12,027][67877] Avg episode reward: [(0, '0.066')] [2024-06-12 15:48:12,414][68109] Updated weights for policy 0, policy_version 27034 (0.0025) [2024-06-12 15:48:16,248][68109] Updated weights for policy 0, policy_version 27044 (0.0019) [2024-06-12 15:48:17,026][67877] Fps is (10 sec: 50790.3, 60 sec: 51882.7, 300 sec: 51540.2). Total num frames: 443138048. Throughput: 0: 51634.6. Samples: 146958140. 
Policy #0 lag: (min: 0.0, avg: 6.5, max: 21.0) [2024-06-12 15:48:17,027][67877] Avg episode reward: [(0, '0.061')] [2024-06-12 15:48:18,487][68109] Updated weights for policy 0, policy_version 27054 (0.0017) [2024-06-12 15:48:22,026][67877] Fps is (10 sec: 44237.5, 60 sec: 50244.2, 300 sec: 51429.1). Total num frames: 443351040. Throughput: 0: 51448.2. Samples: 147266140. Policy #0 lag: (min: 0.0, avg: 6.5, max: 21.0) [2024-06-12 15:48:22,027][67877] Avg episode reward: [(0, '0.059')] [2024-06-12 15:48:23,013][68109] Updated weights for policy 0, policy_version 27064 (0.0026) [2024-06-12 15:48:25,171][68109] Updated weights for policy 0, policy_version 27074 (0.0030) [2024-06-12 15:48:27,026][67877] Fps is (10 sec: 47513.9, 60 sec: 50790.4, 300 sec: 51429.1). Total num frames: 443613184. Throughput: 0: 51271.6. Samples: 147395940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-12 15:48:27,027][67877] Avg episode reward: [(0, '0.063')] [2024-06-12 15:48:29,088][68109] Updated weights for policy 0, policy_version 27084 (0.0025) [2024-06-12 15:48:31,448][68109] Updated weights for policy 0, policy_version 27094 (0.0024) [2024-06-12 15:48:32,026][67877] Fps is (10 sec: 55705.4, 60 sec: 51882.6, 300 sec: 51540.2). Total num frames: 443908096. Throughput: 0: 51432.0. Samples: 147708820. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-12 15:48:32,027][67877] Avg episode reward: [(0, '0.062')] [2024-06-12 15:48:35,529][68109] Updated weights for policy 0, policy_version 27104 (0.0023) [2024-06-12 15:48:36,220][68089] Signal inference workers to stop experience collection... (1850 times) [2024-06-12 15:48:36,220][68089] Signal inference workers to resume experience collection... 
(1850 times) [2024-06-12 15:48:36,258][68109] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-06-12 15:48:36,258][68109] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-06-12 15:48:37,026][67877] Fps is (10 sec: 55705.1, 60 sec: 51882.8, 300 sec: 51651.3). Total num frames: 444170240. Throughput: 0: 51069.2. Samples: 148012520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-12 15:48:37,027][67877] Avg episode reward: [(0, '0.080')] [2024-06-12 15:48:38,249][68109] Updated weights for policy 0, policy_version 27114 (0.0018) [2024-06-12 15:48:41,979][68109] Updated weights for policy 0, policy_version 27124 (0.0023) [2024-06-12 15:48:42,026][67877] Fps is (10 sec: 49152.1, 60 sec: 51609.6, 300 sec: 51484.7). Total num frames: 444399616. Throughput: 0: 51163.1. Samples: 148176920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-12 15:48:42,027][67877] Avg episode reward: [(0, '0.064')] [2024-06-12 15:48:44,153][68109] Updated weights for policy 0, policy_version 27134 (0.0029) [2024-06-12 15:48:47,026][67877] Fps is (10 sec: 45875.6, 60 sec: 50244.3, 300 sec: 51429.1). Total num frames: 444628992. Throughput: 0: 51503.3. Samples: 148491160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 15:48:47,027][67877] Avg episode reward: [(0, '0.066')] [2024-06-12 15:48:48,195][68109] Updated weights for policy 0, policy_version 27144 (0.0025) [2024-06-12 15:48:51,099][68109] Updated weights for policy 0, policy_version 27154 (0.0023) [2024-06-12 15:48:52,026][67877] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 51484.6). Total num frames: 444907520. Throughput: 0: 51392.9. Samples: 148798960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 15:48:52,027][67877] Avg episode reward: [(0, '0.068')] [2024-06-12 15:48:54,482][68109] Updated weights for policy 0, policy_version 27164 (0.0034) [2024-06-12 15:48:57,026][67877] Fps is (10 sec: 57343.3, 60 sec: 51882.6, 300 sec: 51595.7). 
Total num frames: 445202432. Throughput: 0: 51111.6. Samples: 148951700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 15:48:57,027][67877] Avg episode reward: [(0, '0.075')] [2024-06-12 15:48:57,226][68109] Updated weights for policy 0, policy_version 27174 (0.0028) [2024-06-12 15:49:01,027][68109] Updated weights for policy 0, policy_version 27184 (0.0025) [2024-06-12 15:49:02,026][67877] Fps is (10 sec: 54067.1, 60 sec: 52428.9, 300 sec: 51595.7). Total num frames: 445448192. Throughput: 0: 51249.4. Samples: 149264360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 15:49:02,027][67877] Avg episode reward: [(0, '0.075')] [2024-06-12 15:49:03,815][68109] Updated weights for policy 0, policy_version 27194 (0.0022) [2024-06-12 15:49:07,026][67877] Fps is (10 sec: 49152.0, 60 sec: 51063.4, 300 sec: 51484.6). Total num frames: 445693952. Throughput: 0: 51306.6. Samples: 149574940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-12 15:49:07,027][67877] Avg episode reward: [(0, '0.076')] [2024-06-12 15:49:07,328][68109] Updated weights for policy 0, policy_version 27204 (0.0022) [2024-06-12 15:49:10,238][68109] Updated weights for policy 0, policy_version 27214 (0.0028) [2024-06-12 15:49:12,026][67877] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 51484.7). Total num frames: 445939712. Throughput: 0: 51645.3. Samples: 149719980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-12 15:49:12,027][67877] Avg episode reward: [(0, '0.060')] [2024-06-12 15:49:13,755][68109] Updated weights for policy 0, policy_version 27224 (0.0028) [2024-06-12 15:49:16,424][68109] Updated weights for policy 0, policy_version 27234 (0.0022) [2024-06-12 15:49:17,027][67877] Fps is (10 sec: 50789.9, 60 sec: 51063.3, 300 sec: 51540.1). Total num frames: 446201856. Throughput: 0: 51346.9. Samples: 150019440. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-12 15:49:17,027][67877] Avg episode reward: [(0, '0.073')] [2024-06-12 15:49:17,035][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000027234_446201856.pth... [2024-06-12 15:49:17,076][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000026479_433831936.pth [2024-06-12 15:49:20,117][68109] Updated weights for policy 0, policy_version 27244 (0.0026) [2024-06-12 15:49:22,026][67877] Fps is (10 sec: 54067.4, 60 sec: 52155.7, 300 sec: 51540.2). Total num frames: 446480384. Throughput: 0: 51604.5. Samples: 150334720. Policy #0 lag: (min: 0.0, avg: 12.8, max: 26.0) [2024-06-12 15:49:22,027][67877] Avg episode reward: [(0, '0.066')] [2024-06-12 15:49:23,011][68109] Updated weights for policy 0, policy_version 27254 (0.0025) [2024-06-12 15:49:26,309][68109] Updated weights for policy 0, policy_version 27264 (0.0026) [2024-06-12 15:49:27,027][67877] Fps is (10 sec: 52428.8, 60 sec: 51882.5, 300 sec: 51540.2). Total num frames: 446726144. Throughput: 0: 51580.7. Samples: 150498060. Policy #0 lag: (min: 0.0, avg: 12.8, max: 26.0) [2024-06-12 15:49:27,027][67877] Avg episode reward: [(0, '0.080')] [2024-06-12 15:49:28,039][68089] Signal inference workers to stop experience collection... (1900 times) [2024-06-12 15:49:28,040][68089] Signal inference workers to resume experience collection... (1900 times) [2024-06-12 15:49:28,049][68109] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-06-12 15:49:28,049][68109] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-06-12 15:49:29,599][68109] Updated weights for policy 0, policy_version 27274 (0.0031) [2024-06-12 15:49:32,027][67877] Fps is (10 sec: 50789.6, 60 sec: 51336.4, 300 sec: 51540.2). Total num frames: 446988288. Throughput: 0: 51329.6. Samples: 150801000. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 26.0) [2024-06-12 15:49:32,027][67877] Avg episode reward: [(0, '0.056')] [2024-06-12 15:49:32,827][68109] Updated weights for policy 0, policy_version 27284 (0.0028) [2024-06-12 15:49:35,850][68109] Updated weights for policy 0, policy_version 27294 (0.0024) [2024-06-12 15:49:37,027][67877] Fps is (10 sec: 50790.2, 60 sec: 51063.3, 300 sec: 51595.7). Total num frames: 447234048. Throughput: 0: 51340.2. Samples: 151109280. Policy #0 lag: (min: 0.0, avg: 12.8, max: 26.0) [2024-06-12 15:49:37,027][67877] Avg episode reward: [(0, '0.075')] [2024-06-12 15:49:39,088][68109] Updated weights for policy 0, policy_version 27304 (0.0029) [2024-06-12 15:49:41,787][68109] Updated weights for policy 0, policy_version 27314 (0.0023) [2024-06-12 15:49:42,026][67877] Fps is (10 sec: 52429.8, 60 sec: 51882.7, 300 sec: 51540.2). Total num frames: 447512576. Throughput: 0: 51541.0. Samples: 151271040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 15:49:42,027][67877] Avg episode reward: [(0, '0.057')] [2024-06-12 15:49:45,326][68109] Updated weights for policy 0, policy_version 27324 (0.0025) [2024-06-12 15:49:47,026][67877] Fps is (10 sec: 52430.2, 60 sec: 52155.7, 300 sec: 51540.2). Total num frames: 447758336. Throughput: 0: 51602.3. Samples: 151586460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 15:49:47,027][67877] Avg episode reward: [(0, '0.080')] [2024-06-12 15:49:48,108][68109] Updated weights for policy 0, policy_version 27334 (0.0024) [2024-06-12 15:49:51,727][68109] Updated weights for policy 0, policy_version 27344 (0.0025) [2024-06-12 15:49:52,026][67877] Fps is (10 sec: 49151.9, 60 sec: 51609.6, 300 sec: 51484.6). Total num frames: 448004096. Throughput: 0: 51433.4. Samples: 151889440. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 15:49:52,027][67877] Avg episode reward: [(0, '0.069')] [2024-06-12 15:49:54,598][68109] Updated weights for policy 0, policy_version 27354 (0.0030) [2024-06-12 15:49:57,026][67877] Fps is (10 sec: 54067.0, 60 sec: 51609.7, 300 sec: 51706.8). Total num frames: 448299008. Throughput: 0: 51608.5. Samples: 152042360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 15:49:57,027][67877] Avg episode reward: [(0, '0.062')] [2024-06-12 15:49:57,902][68109] Updated weights for policy 0, policy_version 27364 (0.0033) [2024-06-12 15:50:01,266][68109] Updated weights for policy 0, policy_version 27374 (0.0026) [2024-06-12 15:50:02,026][67877] Fps is (10 sec: 52428.9, 60 sec: 51336.6, 300 sec: 51484.6). Total num frames: 448528384. Throughput: 0: 51847.8. Samples: 152352580. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-12 15:50:02,027][67877] Avg episode reward: [(0, '0.056')] [2024-06-12 15:50:04,304][68109] Updated weights for policy 0, policy_version 27384 (0.0028) [2024-06-12 15:50:07,026][67877] Fps is (10 sec: 49152.2, 60 sec: 51609.7, 300 sec: 51484.6). Total num frames: 448790528. Throughput: 0: 51654.2. Samples: 152659160. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-12 15:50:07,027][67877] Avg episode reward: [(0, '0.066')] [2024-06-12 15:50:07,657][68109] Updated weights for policy 0, policy_version 27394 (0.0029) [2024-06-12 15:50:10,709][68109] Updated weights for policy 0, policy_version 27404 (0.0020) [2024-06-12 15:50:12,026][67877] Fps is (10 sec: 49151.8, 60 sec: 51336.5, 300 sec: 51484.7). Total num frames: 449019904. Throughput: 0: 51362.0. Samples: 152809340. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-12 15:50:12,027][67877] Avg episode reward: [(0, '0.078')] [2024-06-12 15:50:13,725][68109] Updated weights for policy 0, policy_version 27414 (0.0023) [2024-06-12 15:50:17,027][67877] Fps is (10 sec: 50789.5, 60 sec: 51609.6, 300 sec: 51484.6). 
Total num frames: 449298432. Throughput: 0: 51563.1. Samples: 153121340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 15:50:17,027][67877] Avg episode reward: [(0, '0.074')] [2024-06-12 15:50:17,087][68109] Updated weights for policy 0, policy_version 27424 (0.0030) [2024-06-12 15:50:20,268][68109] Updated weights for policy 0, policy_version 27434 (0.0024) [2024-06-12 15:50:22,026][67877] Fps is (10 sec: 55706.3, 60 sec: 51609.7, 300 sec: 51706.8). Total num frames: 449576960. Throughput: 0: 51553.3. Samples: 153429160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 15:50:22,026][67877] Avg episode reward: [(0, '0.062')] [2024-06-12 15:50:23,268][68109] Updated weights for policy 0, policy_version 27444 (0.0022) [2024-06-12 15:50:26,566][68109] Updated weights for policy 0, policy_version 27454 (0.0026) [2024-06-12 15:50:27,027][67877] Fps is (10 sec: 50790.3, 60 sec: 51336.6, 300 sec: 51540.2). Total num frames: 449806336. Throughput: 0: 51696.7. Samples: 153597400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 15:50:27,027][67877] Avg episode reward: [(0, '0.069')] [2024-06-12 15:50:29,616][68109] Updated weights for policy 0, policy_version 27464 (0.0027) [2024-06-12 15:50:31,805][68089] Signal inference workers to stop experience collection... (1950 times) [2024-06-12 15:50:31,805][68089] Signal inference workers to resume experience collection... (1950 times) [2024-06-12 15:50:31,820][68109] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-06-12 15:50:31,820][68109] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-06-12 15:50:32,026][67877] Fps is (10 sec: 52428.4, 60 sec: 51882.8, 300 sec: 51540.2). Total num frames: 450101248. Throughput: 0: 51680.9. Samples: 153912100. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 15:50:32,027][67877] Avg episode reward: [(0, '0.082')] [2024-06-12 15:50:32,720][68109] Updated weights for policy 0, policy_version 27474 (0.0024) [2024-06-12 15:50:35,993][68109] Updated weights for policy 0, policy_version 27484 (0.0022) [2024-06-12 15:50:37,026][67877] Fps is (10 sec: 52429.8, 60 sec: 51609.8, 300 sec: 51540.2). Total num frames: 450330624. Throughput: 0: 51829.8. Samples: 154221780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 15:50:37,027][67877] Avg episode reward: [(0, '0.063')] [2024-06-12 15:50:38,739][68109] Updated weights for policy 0, policy_version 27494 (0.0022) [2024-06-12 15:50:42,026][67877] Fps is (10 sec: 50790.4, 60 sec: 51609.6, 300 sec: 51651.3). Total num frames: 450609152. Throughput: 0: 51959.1. Samples: 154380520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 15:50:42,027][67877] Avg episode reward: [(0, '0.068')] [2024-06-12 15:50:42,385][68109] Updated weights for policy 0, policy_version 27504 (0.0024) [2024-06-12 15:50:44,976][68109] Updated weights for policy 0, policy_version 27514 (0.0022) [2024-06-12 15:50:47,026][67877] Fps is (10 sec: 55705.4, 60 sec: 52155.7, 300 sec: 51706.8). Total num frames: 450887680. Throughput: 0: 52086.2. Samples: 154696460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 15:50:47,027][67877] Avg episode reward: [(0, '0.068')] [2024-06-12 15:50:48,349][68109] Updated weights for policy 0, policy_version 27524 (0.0031) [2024-06-12 15:50:51,538][68109] Updated weights for policy 0, policy_version 27534 (0.0023) [2024-06-12 15:50:52,026][67877] Fps is (10 sec: 52429.1, 60 sec: 52155.8, 300 sec: 51651.3). Total num frames: 451133440. Throughput: 0: 52123.6. Samples: 155004720. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 15:50:52,027][67877] Avg episode reward: [(0, '0.083')] [2024-06-12 15:50:54,555][68109] Updated weights for policy 0, policy_version 27544 (0.0023) [2024-06-12 15:50:57,027][67877] Fps is (10 sec: 50789.7, 60 sec: 51609.5, 300 sec: 51595.7). Total num frames: 451395584. Throughput: 0: 52260.3. Samples: 155161060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 15:50:57,027][67877] Avg episode reward: [(0, '0.076')] [2024-06-12 15:50:57,841][68109] Updated weights for policy 0, policy_version 27554 (0.0022) [2024-06-12 15:51:00,600][68109] Updated weights for policy 0, policy_version 27564 (0.0026) [2024-06-12 15:51:02,027][67877] Fps is (10 sec: 52427.5, 60 sec: 52155.6, 300 sec: 51651.2). Total num frames: 451657728. Throughput: 0: 52378.2. Samples: 155478360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 15:51:02,027][67877] Avg episode reward: [(0, '0.076')] [2024-06-12 15:51:03,803][68109] Updated weights for policy 0, policy_version 27574 (0.0029) [2024-06-12 15:51:07,026][67877] Fps is (10 sec: 52429.5, 60 sec: 52155.7, 300 sec: 51651.2). Total num frames: 451919872. Throughput: 0: 52469.6. Samples: 155790300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 15:51:07,027][67877] Avg episode reward: [(0, '0.075')] [2024-06-12 15:51:07,046][68109] Updated weights for policy 0, policy_version 27584 (0.0025) [2024-06-12 15:51:10,294][68109] Updated weights for policy 0, policy_version 27594 (0.0028) [2024-06-12 15:51:12,026][67877] Fps is (10 sec: 54068.1, 60 sec: 52975.0, 300 sec: 51817.9). Total num frames: 452198400. Throughput: 0: 52317.5. Samples: 155951680. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 15:51:12,027][67877] Avg episode reward: [(0, '0.072')] [2024-06-12 15:51:13,442][68109] Updated weights for policy 0, policy_version 27604 (0.0025) [2024-06-12 15:51:16,727][68109] Updated weights for policy 0, policy_version 27614 (0.0027) [2024-06-12 15:51:17,026][67877] Fps is (10 sec: 50790.0, 60 sec: 52155.8, 300 sec: 51651.3). Total num frames: 452427776. Throughput: 0: 52322.1. Samples: 156266600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 15:51:17,027][67877] Avg episode reward: [(0, '0.072')] [2024-06-12 15:51:17,038][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000027614_452427776.pth... [2024-06-12 15:51:17,087][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000026858_440041472.pth [2024-06-12 15:51:19,883][68109] Updated weights for policy 0, policy_version 27624 (0.0025) [2024-06-12 15:51:22,026][67877] Fps is (10 sec: 50790.5, 60 sec: 52155.7, 300 sec: 51595.7). Total num frames: 452706304. Throughput: 0: 52391.6. Samples: 156579400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 15:51:22,027][67877] Avg episode reward: [(0, '0.064')] [2024-06-12 15:51:23,054][68109] Updated weights for policy 0, policy_version 27634 (0.0031) [2024-06-12 15:51:25,803][68109] Updated weights for policy 0, policy_version 27644 (0.0025) [2024-06-12 15:51:27,027][67877] Fps is (10 sec: 55705.1, 60 sec: 52974.9, 300 sec: 51762.3). Total num frames: 452984832. Throughput: 0: 52378.4. Samples: 156737560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 15:51:27,027][67877] Avg episode reward: [(0, '0.072')] [2024-06-12 15:51:29,175][68109] Updated weights for policy 0, policy_version 27654 (0.0028) [2024-06-12 15:51:32,026][67877] Fps is (10 sec: 52428.7, 60 sec: 52155.7, 300 sec: 51762.3). Total num frames: 453230592. Throughput: 0: 52358.2. Samples: 157052580. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:51:32,027][67877] Avg episode reward: [(0, '0.060')] [2024-06-12 15:51:32,126][68109] Updated weights for policy 0, policy_version 27664 (0.0034) [2024-06-12 15:51:35,447][68109] Updated weights for policy 0, policy_version 27674 (0.0030) [2024-06-12 15:51:37,026][67877] Fps is (10 sec: 50791.3, 60 sec: 52701.9, 300 sec: 51817.9). Total num frames: 453492736. Throughput: 0: 52364.8. Samples: 157361140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:51:37,027][67877] Avg episode reward: [(0, '0.080')] [2024-06-12 15:51:37,262][68089] Signal inference workers to stop experience collection... (2000 times) [2024-06-12 15:51:37,263][68089] Signal inference workers to resume experience collection... (2000 times) [2024-06-12 15:51:37,304][68109] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-06-12 15:51:37,304][68109] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-06-12 15:51:38,243][68109] Updated weights for policy 0, policy_version 27684 (0.0026) [2024-06-12 15:51:41,791][68109] Updated weights for policy 0, policy_version 27694 (0.0026) [2024-06-12 15:51:42,026][67877] Fps is (10 sec: 50790.4, 60 sec: 52155.7, 300 sec: 51706.8). Total num frames: 453738496. Throughput: 0: 52371.3. Samples: 157517760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:51:42,027][67877] Avg episode reward: [(0, '0.075')] [2024-06-12 15:51:44,661][68109] Updated weights for policy 0, policy_version 27704 (0.0030) [2024-06-12 15:51:47,027][67877] Fps is (10 sec: 52427.9, 60 sec: 52155.6, 300 sec: 51817.8). Total num frames: 454017024. Throughput: 0: 52145.8. Samples: 157824920. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:51:47,027][67877] Avg episode reward: [(0, '0.077')] [2024-06-12 15:51:48,021][68109] Updated weights for policy 0, policy_version 27714 (0.0024) [2024-06-12 15:51:50,852][68109] Updated weights for policy 0, policy_version 27724 (0.0026) [2024-06-12 15:51:52,026][67877] Fps is (10 sec: 54067.4, 60 sec: 52428.8, 300 sec: 51817.9). Total num frames: 454279168. Throughput: 0: 52084.5. Samples: 158134100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:51:52,027][67877] Avg episode reward: [(0, '0.069')] [2024-06-12 15:51:54,194][68109] Updated weights for policy 0, policy_version 27734 (0.0027) [2024-06-12 15:51:57,026][67877] Fps is (10 sec: 50791.4, 60 sec: 52155.9, 300 sec: 51706.8). Total num frames: 454524928. Throughput: 0: 52183.6. Samples: 158299940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:51:57,027][67877] Avg episode reward: [(0, '0.073')] [2024-06-12 15:51:57,216][68109] Updated weights for policy 0, policy_version 27744 (0.0020) [2024-06-12 15:52:00,507][68109] Updated weights for policy 0, policy_version 27754 (0.0021) [2024-06-12 15:52:02,027][67877] Fps is (10 sec: 50789.2, 60 sec: 52155.7, 300 sec: 51651.2). Total num frames: 454787072. Throughput: 0: 52256.7. Samples: 158618160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 15:52:02,027][67877] Avg episode reward: [(0, '0.065')] [2024-06-12 15:52:03,554][68109] Updated weights for policy 0, policy_version 27764 (0.0020) [2024-06-12 15:52:06,654][68109] Updated weights for policy 0, policy_version 27774 (0.0029) [2024-06-12 15:52:07,026][67877] Fps is (10 sec: 52428.4, 60 sec: 52155.7, 300 sec: 51706.8). Total num frames: 455049216. Throughput: 0: 52210.6. Samples: 158928880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 15:52:07,027][67877] Avg episode reward: [(0, '0.075')] [2024-06-12 15:52:09,859][68109] Updated weights for policy 0, policy_version 27784 (0.0026) [2024-06-12 15:52:12,026][67877] Fps is (10 sec: 55706.5, 60 sec: 52428.8, 300 sec: 51928.9). Total num frames: 455344128. Throughput: 0: 52389.9. Samples: 159095100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 15:52:12,027][67877] Avg episode reward: [(0, '0.079')] [2024-06-12 15:52:12,635][68109] Updated weights for policy 0, policy_version 27794 (0.0026) [2024-06-12 15:52:15,987][68109] Updated weights for policy 0, policy_version 27804 (0.0026) [2024-06-12 15:52:17,026][67877] Fps is (10 sec: 55705.0, 60 sec: 52974.9, 300 sec: 51762.3). Total num frames: 455606272. Throughput: 0: 52347.4. Samples: 159408220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 15:52:17,027][67877] Avg episode reward: [(0, '0.079')] [2024-06-12 15:52:19,110][68109] Updated weights for policy 0, policy_version 27814 (0.0023) [2024-06-12 15:52:22,026][67877] Fps is (10 sec: 49152.2, 60 sec: 52155.7, 300 sec: 51762.3). Total num frames: 455835648. Throughput: 0: 52476.4. Samples: 159722580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 15:52:22,027][67877] Avg episode reward: [(0, '0.074')] [2024-06-12 15:52:22,302][68109] Updated weights for policy 0, policy_version 27824 (0.0018) [2024-06-12 15:52:25,288][68109] Updated weights for policy 0, policy_version 27834 (0.0026) [2024-06-12 15:52:27,027][67877] Fps is (10 sec: 49151.8, 60 sec: 51882.7, 300 sec: 51873.4). Total num frames: 456097792. Throughput: 0: 52495.4. Samples: 159880060. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 15:52:27,027][67877] Avg episode reward: [(0, '0.075')] [2024-06-12 15:52:28,703][68109] Updated weights for policy 0, policy_version 27844 (0.0027) [2024-06-12 15:52:31,070][68109] Updated weights for policy 0, policy_version 27854 (0.0032) [2024-06-12 15:52:32,027][67877] Fps is (10 sec: 54066.1, 60 sec: 52428.6, 300 sec: 51928.9). Total num frames: 456376320. Throughput: 0: 52810.2. Samples: 160201380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 15:52:32,027][67877] Avg episode reward: [(0, '0.072')] [2024-06-12 15:52:34,850][68109] Updated weights for policy 0, policy_version 27864 (0.0021) [2024-06-12 15:52:37,026][67877] Fps is (10 sec: 57344.9, 60 sec: 52974.9, 300 sec: 52095.6). Total num frames: 456671232. Throughput: 0: 52907.1. Samples: 160514920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 15:52:37,027][67877] Avg episode reward: [(0, '0.069')] [2024-06-12 15:52:37,449][68109] Updated weights for policy 0, policy_version 27874 (0.0028) [2024-06-12 15:52:40,097][68089] Signal inference workers to stop experience collection... (2050 times) [2024-06-12 15:52:40,098][68089] Signal inference workers to resume experience collection... (2050 times) [2024-06-12 15:52:40,110][68109] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-06-12 15:52:40,111][68109] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-06-12 15:52:41,110][68109] Updated weights for policy 0, policy_version 27884 (0.0028) [2024-06-12 15:52:42,026][67877] Fps is (10 sec: 54068.3, 60 sec: 52974.9, 300 sec: 51873.4). Total num frames: 456916992. Throughput: 0: 52748.8. Samples: 160673640. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 15:52:42,027][67877] Avg episode reward: [(0, '0.082')] [2024-06-12 15:52:43,851][68109] Updated weights for policy 0, policy_version 27894 (0.0024) [2024-06-12 15:52:47,026][67877] Fps is (10 sec: 49151.7, 60 sec: 52428.9, 300 sec: 51817.9). Total num frames: 457162752. Throughput: 0: 52706.4. Samples: 160989940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-12 15:52:47,027][67877] Avg episode reward: [(0, '0.079')] [2024-06-12 15:52:47,238][68109] Updated weights for policy 0, policy_version 27904 (0.0024) [2024-06-12 15:52:50,132][68109] Updated weights for policy 0, policy_version 27914 (0.0026) [2024-06-12 15:52:52,026][67877] Fps is (10 sec: 49152.0, 60 sec: 52155.7, 300 sec: 51928.9). Total num frames: 457408512. Throughput: 0: 52644.9. Samples: 161297900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-12 15:52:52,027][67877] Avg episode reward: [(0, '0.068')] [2024-06-12 15:52:53,592][68109] Updated weights for policy 0, policy_version 27924 (0.0031) [2024-06-12 15:52:56,700][68109] Updated weights for policy 0, policy_version 27934 (0.0036) [2024-06-12 15:52:57,026][67877] Fps is (10 sec: 50790.2, 60 sec: 52428.7, 300 sec: 52095.6). Total num frames: 457670656. Throughput: 0: 52414.6. Samples: 161453760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-12 15:52:57,027][67877] Avg episode reward: [(0, '0.075')] [2024-06-12 15:52:59,681][68109] Updated weights for policy 0, policy_version 27944 (0.0027) [2024-06-12 15:53:02,026][67877] Fps is (10 sec: 55705.9, 60 sec: 52975.1, 300 sec: 51984.5). Total num frames: 457965568. Throughput: 0: 52490.9. Samples: 161770300. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-12 15:53:02,027][67877] Avg episode reward: [(0, '0.080')] [2024-06-12 15:53:02,607][68109] Updated weights for policy 0, policy_version 27954 (0.0026) [2024-06-12 15:53:05,986][68109] Updated weights for policy 0, policy_version 27964 (0.0024) [2024-06-12 15:53:07,026][67877] Fps is (10 sec: 55706.3, 60 sec: 52975.0, 300 sec: 51929.0). Total num frames: 458227712. Throughput: 0: 52526.7. Samples: 162086280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-12 15:53:07,027][67877] Avg episode reward: [(0, '0.074')] [2024-06-12 15:53:08,993][68109] Updated weights for policy 0, policy_version 27974 (0.0027) [2024-06-12 15:53:12,026][67877] Fps is (10 sec: 50790.2, 60 sec: 52155.8, 300 sec: 51984.5). Total num frames: 458473472. Throughput: 0: 52670.9. Samples: 162250240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-12 15:53:12,027][67877] Avg episode reward: [(0, '0.090')] [2024-06-12 15:53:12,135][68109] Updated weights for policy 0, policy_version 27984 (0.0026) [2024-06-12 15:53:15,162][68109] Updated weights for policy 0, policy_version 27994 (0.0028) [2024-06-12 15:53:17,026][67877] Fps is (10 sec: 49152.1, 60 sec: 51882.8, 300 sec: 52095.6). Total num frames: 458719232. Throughput: 0: 52253.6. Samples: 162552780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-12 15:53:17,027][67877] Avg episode reward: [(0, '0.086')] [2024-06-12 15:53:17,031][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000027998_458719232.pth... [2024-06-12 15:53:17,090][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000027234_446201856.pth [2024-06-12 15:53:18,557][68109] Updated weights for policy 0, policy_version 28004 (0.0024) [2024-06-12 15:53:21,325][68109] Updated weights for policy 0, policy_version 28014 (0.0021) [2024-06-12 15:53:22,026][67877] Fps is (10 sec: 50790.4, 60 sec: 52428.8, 300 sec: 52095.6). Total num frames: 458981376. 
Throughput: 0: 52163.1. Samples: 162862260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 15:53:22,027][67877] Avg episode reward: [(0, '0.074')] [2024-06-12 15:53:24,881][68109] Updated weights for policy 0, policy_version 28024 (0.0031) [2024-06-12 15:53:27,026][67877] Fps is (10 sec: 54067.0, 60 sec: 52702.0, 300 sec: 52040.0). Total num frames: 459259904. Throughput: 0: 52272.9. Samples: 163025920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 15:53:27,027][67877] Avg episode reward: [(0, '0.068')] [2024-06-12 15:53:27,679][68109] Updated weights for policy 0, policy_version 28034 (0.0025) [2024-06-12 15:53:30,831][68109] Updated weights for policy 0, policy_version 28044 (0.0025) [2024-06-12 15:53:32,026][67877] Fps is (10 sec: 52428.8, 60 sec: 52155.9, 300 sec: 51984.5). Total num frames: 459505664. Throughput: 0: 52232.1. Samples: 163340380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 15:53:32,027][67877] Avg episode reward: [(0, '0.078')] [2024-06-12 15:53:34,011][68109] Updated weights for policy 0, policy_version 28054 (0.0028) [2024-06-12 15:53:37,026][67877] Fps is (10 sec: 50790.3, 60 sec: 51609.6, 300 sec: 52095.6). Total num frames: 459767808. Throughput: 0: 52406.2. Samples: 163656180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 15:53:37,027][67877] Avg episode reward: [(0, '0.073')] [2024-06-12 15:53:37,591][68109] Updated weights for policy 0, policy_version 28064 (0.0023) [2024-06-12 15:53:40,122][68109] Updated weights for policy 0, policy_version 28074 (0.0027) [2024-06-12 15:53:42,026][67877] Fps is (10 sec: 52428.4, 60 sec: 51882.6, 300 sec: 52206.6). Total num frames: 460029952. Throughput: 0: 52147.2. Samples: 163800380. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 15:53:42,027][67877] Avg episode reward: [(0, '0.081')] [2024-06-12 15:53:43,581][68109] Updated weights for policy 0, policy_version 28084 (0.0023) [2024-06-12 15:53:46,318][68109] Updated weights for policy 0, policy_version 28094 (0.0027) [2024-06-12 15:53:47,026][67877] Fps is (10 sec: 55705.3, 60 sec: 52701.8, 300 sec: 52262.2). Total num frames: 460324864. Throughput: 0: 52149.2. Samples: 164117020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 15:53:47,027][67877] Avg episode reward: [(0, '0.078')] [2024-06-12 15:53:50,115][68109] Updated weights for policy 0, policy_version 28104 (0.0026) [2024-06-12 15:53:50,802][68089] Signal inference workers to stop experience collection... (2100 times) [2024-06-12 15:53:50,809][68089] Signal inference workers to resume experience collection... (2100 times) [2024-06-12 15:53:50,827][68109] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-06-12 15:53:50,828][68109] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-06-12 15:53:52,026][67877] Fps is (10 sec: 54067.5, 60 sec: 52701.9, 300 sec: 52095.6). Total num frames: 460570624. Throughput: 0: 52293.3. Samples: 164439480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 15:53:52,027][67877] Avg episode reward: [(0, '0.084')] [2024-06-12 15:53:52,715][68109] Updated weights for policy 0, policy_version 28114 (0.0024) [2024-06-12 15:53:56,090][68109] Updated weights for policy 0, policy_version 28124 (0.0027) [2024-06-12 15:53:57,026][67877] Fps is (10 sec: 49152.3, 60 sec: 52428.9, 300 sec: 52095.6). Total num frames: 460816384. Throughput: 0: 51962.6. Samples: 164588560. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 15:53:57,027][67877] Avg episode reward: [(0, '0.080')] [2024-06-12 15:53:59,177][68109] Updated weights for policy 0, policy_version 28134 (0.0032) [2024-06-12 15:54:02,028][67877] Fps is (10 sec: 52418.3, 60 sec: 52154.0, 300 sec: 52206.3). Total num frames: 461094912. Throughput: 0: 52118.1. Samples: 164898200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 15:54:02,029][67877] Avg episode reward: [(0, '0.060')] [2024-06-12 15:54:02,355][68109] Updated weights for policy 0, policy_version 28144 (0.0030) [2024-06-12 15:54:05,817][68109] Updated weights for policy 0, policy_version 28154 (0.0026) [2024-06-12 15:54:07,026][67877] Fps is (10 sec: 52429.1, 60 sec: 51882.7, 300 sec: 52206.7). Total num frames: 461340672. Throughput: 0: 52340.9. Samples: 165217600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 15:54:07,027][67877] Avg episode reward: [(0, '0.070')] [2024-06-12 15:54:08,434][68109] Updated weights for policy 0, policy_version 28164 (0.0026) [2024-06-12 15:54:11,892][68109] Updated weights for policy 0, policy_version 28174 (0.0026) [2024-06-12 15:54:12,027][67877] Fps is (10 sec: 50799.8, 60 sec: 52155.6, 300 sec: 52206.6). Total num frames: 461602816. Throughput: 0: 52245.6. Samples: 165376980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 15:54:12,027][67877] Avg episode reward: [(0, '0.080')] [2024-06-12 15:54:14,907][68109] Updated weights for policy 0, policy_version 28184 (0.0025) [2024-06-12 15:54:17,026][67877] Fps is (10 sec: 50789.7, 60 sec: 52155.6, 300 sec: 52095.5). Total num frames: 461848576. Throughput: 0: 52162.5. Samples: 165687700. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 15:54:17,027][67877] Avg episode reward: [(0, '0.077')] [2024-06-12 15:54:17,888][68109] Updated weights for policy 0, policy_version 28194 (0.0018) [2024-06-12 15:54:21,112][68109] Updated weights for policy 0, policy_version 28204 (0.0025) [2024-06-12 15:54:22,026][67877] Fps is (10 sec: 52430.0, 60 sec: 52428.9, 300 sec: 52206.7). Total num frames: 462127104. Throughput: 0: 52121.4. Samples: 166001640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 15:54:22,027][67877] Avg episode reward: [(0, '0.079')] [2024-06-12 15:54:24,761][68109] Updated weights for policy 0, policy_version 28214 (0.0026) [2024-06-12 15:54:27,026][67877] Fps is (10 sec: 55706.3, 60 sec: 52428.8, 300 sec: 52262.2). Total num frames: 462405632. Throughput: 0: 52459.6. Samples: 166161060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 15:54:27,027][67877] Avg episode reward: [(0, '0.079')] [2024-06-12 15:54:27,279][68109] Updated weights for policy 0, policy_version 28224 (0.0024) [2024-06-12 15:54:30,948][68109] Updated weights for policy 0, policy_version 28234 (0.0031) [2024-06-12 15:54:32,026][67877] Fps is (10 sec: 49151.7, 60 sec: 51882.7, 300 sec: 52151.1). Total num frames: 462618624. Throughput: 0: 52237.9. Samples: 166467720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 15:54:32,027][67877] Avg episode reward: [(0, '0.080')] [2024-06-12 15:54:33,553][68109] Updated weights for policy 0, policy_version 28244 (0.0026) [2024-06-12 15:54:37,026][67877] Fps is (10 sec: 49152.2, 60 sec: 52155.8, 300 sec: 52151.1). Total num frames: 462897152. Throughput: 0: 52048.9. Samples: 166781680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 15:54:37,027][67877] Avg episode reward: [(0, '0.073')] [2024-06-12 15:54:37,253][68109] Updated weights for policy 0, policy_version 28254 (0.0023) [2024-06-12 15:54:39,791][68109] Updated weights for policy 0, policy_version 28264 (0.0023) [2024-06-12 15:54:42,027][67877] Fps is (10 sec: 54066.5, 60 sec: 52155.7, 300 sec: 52206.6). Total num frames: 463159296. Throughput: 0: 52104.8. Samples: 166933280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 15:54:42,027][67877] Avg episode reward: [(0, '0.081')] [2024-06-12 15:54:43,267][68109] Updated weights for policy 0, policy_version 28274 (0.0019) [2024-06-12 15:54:45,796][68109] Updated weights for policy 0, policy_version 28284 (0.0024) [2024-06-12 15:54:47,026][67877] Fps is (10 sec: 55705.6, 60 sec: 52155.8, 300 sec: 52373.3). Total num frames: 463454208. Throughput: 0: 52255.7. Samples: 167249600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 15:54:47,027][67877] Avg episode reward: [(0, '0.075')] [2024-06-12 15:54:49,510][68109] Updated weights for policy 0, policy_version 28294 (0.0028) [2024-06-12 15:54:52,026][67877] Fps is (10 sec: 55706.1, 60 sec: 52428.8, 300 sec: 52262.2). Total num frames: 463716352. Throughput: 0: 52182.2. Samples: 167565800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 15:54:52,027][67877] Avg episode reward: [(0, '0.100')] [2024-06-12 15:54:52,087][68089] Saving new best policy, reward=0.100! [2024-06-12 15:54:52,091][68109] Updated weights for policy 0, policy_version 28304 (0.0027) [2024-06-12 15:54:55,147][68089] Signal inference workers to stop experience collection... (2150 times) [2024-06-12 15:54:55,188][68109] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-06-12 15:54:55,196][68089] Signal inference workers to resume experience collection... 
(2150 times) [2024-06-12 15:54:55,205][68109] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-06-12 15:54:55,882][68109] Updated weights for policy 0, policy_version 28314 (0.0037) [2024-06-12 15:54:57,026][67877] Fps is (10 sec: 52428.2, 60 sec: 52701.8, 300 sec: 52373.2). Total num frames: 463978496. Throughput: 0: 52356.1. Samples: 167733000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-12 15:54:57,027][67877] Avg episode reward: [(0, '0.082')] [2024-06-12 15:54:58,538][68109] Updated weights for policy 0, policy_version 28324 (0.0030) [2024-06-12 15:55:02,000][68109] Updated weights for policy 0, policy_version 28334 (0.0030) [2024-06-12 15:55:02,026][67877] Fps is (10 sec: 50790.5, 60 sec: 52157.5, 300 sec: 52317.7). Total num frames: 464224256. Throughput: 0: 52353.0. Samples: 168043580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-12 15:55:02,027][67877] Avg episode reward: [(0, '0.091')] [2024-06-12 15:55:05,043][68109] Updated weights for policy 0, policy_version 28344 (0.0028) [2024-06-12 15:55:07,026][67877] Fps is (10 sec: 49152.4, 60 sec: 52155.7, 300 sec: 52373.3). Total num frames: 464470016. Throughput: 0: 52300.4. Samples: 168355160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-12 15:55:07,027][67877] Avg episode reward: [(0, '0.080')] [2024-06-12 15:55:08,534][68109] Updated weights for policy 0, policy_version 28354 (0.0020) [2024-06-12 15:55:11,093][68109] Updated weights for policy 0, policy_version 28364 (0.0026) [2024-06-12 15:55:12,027][67877] Fps is (10 sec: 54066.4, 60 sec: 52701.9, 300 sec: 52428.8). Total num frames: 464764928. Throughput: 0: 52237.6. Samples: 168511760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-12 15:55:12,027][67877] Avg episode reward: [(0, '0.079')] [2024-06-12 15:55:14,542][68109] Updated weights for policy 0, policy_version 28374 (0.0025) [2024-06-12 15:55:17,026][67877] Fps is (10 sec: 54067.0, 60 sec: 52701.9, 300 sec: 52317.7). 
Total num frames: 465010688. Throughput: 0: 52408.4. Samples: 168826100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-12 15:55:17,027][67877] Avg episode reward: [(0, '0.084')] [2024-06-12 15:55:17,079][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000028383_465027072.pth... [2024-06-12 15:55:17,126][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000027614_452427776.pth [2024-06-12 15:55:17,283][68109] Updated weights for policy 0, policy_version 28384 (0.0031) [2024-06-12 15:55:20,970][68109] Updated weights for policy 0, policy_version 28394 (0.0026) [2024-06-12 15:55:22,027][67877] Fps is (10 sec: 50790.5, 60 sec: 52428.6, 300 sec: 52428.8). Total num frames: 465272832. Throughput: 0: 52390.0. Samples: 169139240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-12 15:55:22,027][67877] Avg episode reward: [(0, '0.080')] [2024-06-12 15:55:23,677][68109] Updated weights for policy 0, policy_version 28404 (0.0023) [2024-06-12 15:55:27,026][67877] Fps is (10 sec: 49152.6, 60 sec: 51609.7, 300 sec: 52206.7). Total num frames: 465502208. Throughput: 0: 52350.5. Samples: 169289040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-12 15:55:27,027][67877] Avg episode reward: [(0, '0.080')] [2024-06-12 15:55:27,465][68109] Updated weights for policy 0, policy_version 28414 (0.0027) [2024-06-12 15:55:30,538][68109] Updated weights for policy 0, policy_version 28424 (0.0028) [2024-06-12 15:55:32,026][67877] Fps is (10 sec: 50791.0, 60 sec: 52701.8, 300 sec: 52373.3). Total num frames: 465780736. Throughput: 0: 52223.0. Samples: 169599640. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 15:55:32,027][67877] Avg episode reward: [(0, '0.078')] [2024-06-12 15:55:33,799][68109] Updated weights for policy 0, policy_version 28434 (0.0029) [2024-06-12 15:55:36,499][68109] Updated weights for policy 0, policy_version 28444 (0.0028) [2024-06-12 15:55:37,026][67877] Fps is (10 sec: 55705.1, 60 sec: 52701.8, 300 sec: 52373.3). Total num frames: 466059264. Throughput: 0: 52351.6. Samples: 169921620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 15:55:37,027][67877] Avg episode reward: [(0, '0.079')] [2024-06-12 15:55:39,831][68109] Updated weights for policy 0, policy_version 28454 (0.0025) [2024-06-12 15:55:42,026][67877] Fps is (10 sec: 52429.0, 60 sec: 52428.9, 300 sec: 52262.2). Total num frames: 466305024. Throughput: 0: 52079.7. Samples: 170076580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 15:55:42,027][67877] Avg episode reward: [(0, '0.103')] [2024-06-12 15:55:42,097][68089] Saving new best policy, reward=0.103! [2024-06-12 15:55:42,718][68109] Updated weights for policy 0, policy_version 28464 (0.0026) [2024-06-12 15:55:46,305][68109] Updated weights for policy 0, policy_version 28474 (0.0026) [2024-06-12 15:55:47,026][67877] Fps is (10 sec: 50790.3, 60 sec: 51882.6, 300 sec: 52317.7). Total num frames: 466567168. Throughput: 0: 52229.3. Samples: 170393900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 15:55:47,027][67877] Avg episode reward: [(0, '0.074')] [2024-06-12 15:55:49,298][68109] Updated weights for policy 0, policy_version 28484 (0.0031) [2024-06-12 15:55:49,453][68089] Signal inference workers to stop experience collection... (2200 times) [2024-06-12 15:55:49,453][68089] Signal inference workers to resume experience collection... 
(2200 times) [2024-06-12 15:55:49,491][68109] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-06-12 15:55:49,492][68109] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-06-12 15:55:52,026][67877] Fps is (10 sec: 49151.9, 60 sec: 51336.6, 300 sec: 52206.7). Total num frames: 466796544. Throughput: 0: 52174.2. Samples: 170703000. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-12 15:55:52,027][67877] Avg episode reward: [(0, '0.071')] [2024-06-12 15:55:52,436][68109] Updated weights for policy 0, policy_version 28494 (0.0025) [2024-06-12 15:55:55,309][68109] Updated weights for policy 0, policy_version 28504 (0.0039) [2024-06-12 15:55:57,026][67877] Fps is (10 sec: 50790.4, 60 sec: 51609.7, 300 sec: 52262.2). Total num frames: 467075072. Throughput: 0: 51989.9. Samples: 170851300. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-12 15:55:57,027][67877] Avg episode reward: [(0, '0.097')] [2024-06-12 15:55:58,700][68109] Updated weights for policy 0, policy_version 28514 (0.0026) [2024-06-12 15:56:01,492][68109] Updated weights for policy 0, policy_version 28524 (0.0027) [2024-06-12 15:56:02,026][67877] Fps is (10 sec: 55705.8, 60 sec: 52155.8, 300 sec: 52317.7). Total num frames: 467353600. Throughput: 0: 52001.4. Samples: 171166160. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-12 15:56:02,027][67877] Avg episode reward: [(0, '0.083')] [2024-06-12 15:56:05,046][68109] Updated weights for policy 0, policy_version 28534 (0.0024) [2024-06-12 15:56:07,026][67877] Fps is (10 sec: 54066.8, 60 sec: 52428.7, 300 sec: 52262.2). Total num frames: 467615744. Throughput: 0: 51802.7. Samples: 171470360. 
Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-12 15:56:07,027][67877] Avg episode reward: [(0, '0.094')] [2024-06-12 15:56:08,083][68109] Updated weights for policy 0, policy_version 28544 (0.0028) [2024-06-12 15:56:11,351][68109] Updated weights for policy 0, policy_version 28554 (0.0022) [2024-06-12 15:56:12,026][67877] Fps is (10 sec: 50789.9, 60 sec: 51609.7, 300 sec: 52317.7). Total num frames: 467861504. Throughput: 0: 51963.4. Samples: 171627400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 15:56:12,027][67877] Avg episode reward: [(0, '0.092')] [2024-06-12 15:56:14,091][68109] Updated weights for policy 0, policy_version 28564 (0.0031) [2024-06-12 15:56:17,026][67877] Fps is (10 sec: 47513.9, 60 sec: 51336.5, 300 sec: 52151.1). Total num frames: 468090880. Throughput: 0: 51990.2. Samples: 171939200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 15:56:17,027][67877] Avg episode reward: [(0, '0.075')] [2024-06-12 15:56:17,627][68109] Updated weights for policy 0, policy_version 28574 (0.0031) [2024-06-12 15:56:20,515][68109] Updated weights for policy 0, policy_version 28584 (0.0022) [2024-06-12 15:56:22,026][67877] Fps is (10 sec: 50790.5, 60 sec: 51609.7, 300 sec: 52151.1). Total num frames: 468369408. Throughput: 0: 51730.6. Samples: 172249500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 15:56:22,027][67877] Avg episode reward: [(0, '0.088')] [2024-06-12 15:56:23,830][68109] Updated weights for policy 0, policy_version 28594 (0.0029) [2024-06-12 15:56:26,691][68109] Updated weights for policy 0, policy_version 28604 (0.0026) [2024-06-12 15:56:27,026][67877] Fps is (10 sec: 55705.5, 60 sec: 52428.7, 300 sec: 52262.2). Total num frames: 468647936. Throughput: 0: 51811.0. Samples: 172408080. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 15:56:27,027][67877] Avg episode reward: [(0, '0.098')] [2024-06-12 15:56:30,375][68109] Updated weights for policy 0, policy_version 28614 (0.0025) [2024-06-12 15:56:32,026][67877] Fps is (10 sec: 52428.7, 60 sec: 51882.6, 300 sec: 52206.6). Total num frames: 468893696. Throughput: 0: 51676.0. Samples: 172719320. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 15:56:32,027][67877] Avg episode reward: [(0, '0.092')] [2024-06-12 15:56:33,321][68109] Updated weights for policy 0, policy_version 28624 (0.0024) [2024-06-12 15:56:36,560][68109] Updated weights for policy 0, policy_version 28634 (0.0019) [2024-06-12 15:56:37,026][67877] Fps is (10 sec: 52429.2, 60 sec: 51882.7, 300 sec: 52317.7). Total num frames: 469172224. Throughput: 0: 51765.8. Samples: 173032460. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 15:56:37,027][67877] Avg episode reward: [(0, '0.088')] [2024-06-12 15:56:39,668][68109] Updated weights for policy 0, policy_version 28644 (0.0023) [2024-06-12 15:56:42,026][67877] Fps is (10 sec: 50790.3, 60 sec: 51609.5, 300 sec: 52151.1). Total num frames: 469401600. Throughput: 0: 51765.3. Samples: 173180740. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 15:56:42,027][67877] Avg episode reward: [(0, '0.076')] [2024-06-12 15:56:42,989][68109] Updated weights for policy 0, policy_version 28654 (0.0021) [2024-06-12 15:56:45,815][68109] Updated weights for policy 0, policy_version 28664 (0.0026) [2024-06-12 15:56:46,511][68089] Signal inference workers to stop experience collection... (2250 times) [2024-06-12 15:56:46,517][68109] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-06-12 15:56:46,564][68089] Signal inference workers to resume experience collection... 
(2250 times) [2024-06-12 15:56:46,564][68109] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-06-12 15:56:47,026][67877] Fps is (10 sec: 50790.5, 60 sec: 51882.7, 300 sec: 52206.6). Total num frames: 469680128. Throughput: 0: 51844.9. Samples: 173499180. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 15:56:47,027][67877] Avg episode reward: [(0, '0.073')] [2024-06-12 15:56:48,978][68109] Updated weights for policy 0, policy_version 28674 (0.0029) [2024-06-12 15:56:52,027][67877] Fps is (10 sec: 54066.8, 60 sec: 52428.7, 300 sec: 52262.2). Total num frames: 469942272. Throughput: 0: 51983.9. Samples: 173809640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 15:56:52,027][67877] Avg episode reward: [(0, '0.083')] [2024-06-12 15:56:52,315][68109] Updated weights for policy 0, policy_version 28684 (0.0025) [2024-06-12 15:56:55,277][68109] Updated weights for policy 0, policy_version 28694 (0.0031) [2024-06-12 15:56:57,026][67877] Fps is (10 sec: 52428.7, 60 sec: 52155.8, 300 sec: 52262.2). Total num frames: 470204416. Throughput: 0: 52091.2. Samples: 173971500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 15:56:57,027][67877] Avg episode reward: [(0, '0.084')] [2024-06-12 15:56:58,369][68109] Updated weights for policy 0, policy_version 28704 (0.0022) [2024-06-12 15:57:01,320][68109] Updated weights for policy 0, policy_version 28714 (0.0024) [2024-06-12 15:57:02,026][67877] Fps is (10 sec: 52429.0, 60 sec: 51882.5, 300 sec: 52262.2). Total num frames: 470466560. Throughput: 0: 52254.6. Samples: 174290660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 15:57:02,027][67877] Avg episode reward: [(0, '0.089')] [2024-06-12 15:57:04,693][68109] Updated weights for policy 0, policy_version 28724 (0.0021) [2024-06-12 15:57:07,026][67877] Fps is (10 sec: 52428.7, 60 sec: 51882.7, 300 sec: 52151.1). Total num frames: 470728704. Throughput: 0: 52386.7. Samples: 174606900. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-12 15:57:07,027][67877] Avg episode reward: [(0, '0.100')] [2024-06-12 15:57:07,899][68109] Updated weights for policy 0, policy_version 28734 (0.0030) [2024-06-12 15:57:11,033][68109] Updated weights for policy 0, policy_version 28744 (0.0022) [2024-06-12 15:57:12,026][67877] Fps is (10 sec: 52428.8, 60 sec: 52155.7, 300 sec: 52151.1). Total num frames: 470990848. Throughput: 0: 52162.6. Samples: 174755400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-12 15:57:12,027][67877] Avg episode reward: [(0, '0.079')] [2024-06-12 15:57:14,015][68109] Updated weights for policy 0, policy_version 28754 (0.0025) [2024-06-12 15:57:17,026][67877] Fps is (10 sec: 54066.7, 60 sec: 52974.9, 300 sec: 52317.7). Total num frames: 471269376. Throughput: 0: 52389.7. Samples: 175076860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-12 15:57:17,027][67877] Avg episode reward: [(0, '0.093')] [2024-06-12 15:57:17,028][68109] Updated weights for policy 0, policy_version 28764 (0.0025) [2024-06-12 15:57:17,034][68089] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000028764_471269376.pth... [2024-06-12 15:57:17,076][68089] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000027998_458719232.pth [2024-06-12 15:57:20,171][68109] Updated weights for policy 0, policy_version 28774 (0.0031) [2024-06-12 15:57:22,026][67877] Fps is (10 sec: 52429.3, 60 sec: 52428.8, 300 sec: 52262.2). Total num frames: 471515136. Throughput: 0: 52450.2. Samples: 175392720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-12 15:57:22,027][67877] Avg episode reward: [(0, '0.096')] [2024-06-12 15:57:23,440][68109] Updated weights for policy 0, policy_version 28784 (0.0032) [2024-06-12 16:17:43,176][70768] Saving configuration to /workspace/metta/train_dir/p2.death/config.json... 
[2024-06-12 16:17:43,192][70768] Rollout worker 0 uses device cpu [2024-06-12 16:17:43,192][70768] Rollout worker 1 uses device cpu [2024-06-12 16:17:43,192][70768] Rollout worker 2 uses device cpu [2024-06-12 16:17:43,192][70768] Rollout worker 3 uses device cpu [2024-06-12 16:17:43,193][70768] Rollout worker 4 uses device cpu [2024-06-12 16:17:43,193][70768] Rollout worker 5 uses device cpu [2024-06-12 16:17:43,193][70768] Rollout worker 6 uses device cpu [2024-06-12 16:17:43,193][70768] Rollout worker 7 uses device cpu [2024-06-12 16:17:43,193][70768] Rollout worker 8 uses device cpu [2024-06-12 16:17:43,193][70768] Rollout worker 9 uses device cpu [2024-06-12 16:17:43,193][70768] Rollout worker 10 uses device cpu [2024-06-12 16:17:43,193][70768] Rollout worker 11 uses device cpu [2024-06-12 16:17:43,193][70768] Rollout worker 12 uses device cpu [2024-06-12 16:17:43,193][70768] Rollout worker 13 uses device cpu [2024-06-12 16:17:43,193][70768] Rollout worker 14 uses device cpu [2024-06-12 16:17:43,193][70768] Rollout worker 15 uses device cpu [2024-06-12 16:17:43,194][70768] Rollout worker 16 uses device cpu [2024-06-12 16:17:43,194][70768] Rollout worker 17 uses device cpu [2024-06-12 16:17:43,194][70768] Rollout worker 18 uses device cpu [2024-06-12 16:17:43,194][70768] Rollout worker 19 uses device cpu [2024-06-12 16:17:43,194][70768] Rollout worker 20 uses device cpu [2024-06-12 16:17:43,194][70768] Rollout worker 21 uses device cpu [2024-06-12 16:17:43,194][70768] Rollout worker 22 uses device cpu [2024-06-12 16:17:43,194][70768] Rollout worker 23 uses device cpu [2024-06-12 16:17:43,194][70768] Rollout worker 24 uses device cpu [2024-06-12 16:17:43,194][70768] Rollout worker 25 uses device cpu [2024-06-12 16:17:43,194][70768] Rollout worker 26 uses device cpu [2024-06-12 16:17:43,194][70768] Rollout worker 27 uses device cpu [2024-06-12 16:17:43,195][70768] Rollout worker 28 uses device cpu [2024-06-12 16:17:43,195][70768] Rollout worker 29 uses device cpu 
[2024-06-12 16:17:43,195][70768] Rollout worker 30 uses device cpu [2024-06-12 16:17:43,195][70768] Rollout worker 31 uses device cpu [2024-06-12 16:17:43,755][70768] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 16:17:43,755][70768] InferenceWorker_p0-w0: min num requests: 10 [2024-06-12 16:17:43,834][70768] Starting all processes... [2024-06-12 16:17:43,834][70768] Starting process learner_proc0 [2024-06-12 16:17:44,062][70768] Starting all processes... [2024-06-12 16:17:44,065][70768] Starting process inference_proc0-0 [2024-06-12 16:17:44,065][70768] Starting process rollout_proc0 [2024-06-12 16:17:44,065][70768] Starting process rollout_proc1 [2024-06-12 16:17:44,065][70768] Starting process rollout_proc2 [2024-06-12 16:17:44,065][70768] Starting process rollout_proc3 [2024-06-12 16:17:44,066][70768] Starting process rollout_proc4 [2024-06-12 16:17:44,067][70768] Starting process rollout_proc5 [2024-06-12 16:17:44,068][70768] Starting process rollout_proc6 [2024-06-12 16:17:44,068][70768] Starting process rollout_proc7 [2024-06-12 16:17:44,068][70768] Starting process rollout_proc8 [2024-06-12 16:17:44,068][70768] Starting process rollout_proc9 [2024-06-12 16:17:44,069][70768] Starting process rollout_proc10 [2024-06-12 16:17:44,069][70768] Starting process rollout_proc11 [2024-06-12 16:17:44,071][70768] Starting process rollout_proc12 [2024-06-12 16:17:44,071][70768] Starting process rollout_proc13 [2024-06-12 16:17:44,071][70768] Starting process rollout_proc14 [2024-06-12 16:17:44,073][70768] Starting process rollout_proc15 [2024-06-12 16:17:44,074][70768] Starting process rollout_proc16 [2024-06-12 16:17:44,075][70768] Starting process rollout_proc17 [2024-06-12 16:17:44,075][70768] Starting process rollout_proc18 [2024-06-12 16:17:44,076][70768] Starting process rollout_proc19 [2024-06-12 16:17:44,079][70768] Starting process rollout_proc20 [2024-06-12 16:17:44,080][70768] Starting process rollout_proc21 [2024-06-12 
16:17:44,080][70768] Starting process rollout_proc22 [2024-06-12 16:17:44,080][70768] Starting process rollout_proc23 [2024-06-12 16:17:44,085][70768] Starting process rollout_proc24 [2024-06-12 16:17:44,085][70768] Starting process rollout_proc25 [2024-06-12 16:17:44,086][70768] Starting process rollout_proc26 [2024-06-12 16:17:44,088][70768] Starting process rollout_proc27 [2024-06-12 16:17:44,090][70768] Starting process rollout_proc28 [2024-06-12 16:17:44,090][70768] Starting process rollout_proc29 [2024-06-12 16:17:44,093][70768] Starting process rollout_proc30 [2024-06-12 16:17:44,095][70768] Starting process rollout_proc31 [2024-06-12 16:17:46,199][71000] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 16:17:46,199][71000] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-06-12 16:17:46,212][71000] Num visible devices: 1 [2024-06-12 16:17:46,240][71005] Worker 3 uses CPU cores [3] [2024-06-12 16:17:46,240][71020] Worker 19 uses CPU cores [19] [2024-06-12 16:17:46,241][70980] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 16:17:46,241][70980] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-06-12 16:17:46,251][70980] Num visible devices: 1 [2024-06-12 16:17:46,256][71012] Worker 10 uses CPU cores [10] [2024-06-12 16:17:46,264][70980] Setting fixed seed 0 [2024-06-12 16:17:46,265][70980] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 16:17:46,265][70980] Initializing actor-critic model on device cuda:0 [2024-06-12 16:17:46,287][71022] Worker 21 uses CPU cores [21] [2024-06-12 16:17:46,300][71017] Worker 16 uses CPU cores [16] [2024-06-12 16:17:46,320][71027] Worker 26 uses CPU cores [26] [2024-06-12 16:17:46,340][71025] Worker 25 uses CPU cores [25] [2024-06-12 16:17:46,344][71019] Worker 18 uses CPU cores [18] [2024-06-12 16:17:46,356][71011] Worker 11 uses CPU cores [11] [2024-06-12 16:17:46,356][71024] 
Worker 23 uses CPU cores [23] [2024-06-12 16:17:46,380][71021] Worker 20 uses CPU cores [20] [2024-06-12 16:17:46,388][71013] Worker 13 uses CPU cores [13] [2024-06-12 16:17:46,388][71014] Worker 12 uses CPU cores [12] [2024-06-12 16:17:46,396][71026] Worker 24 uses CPU cores [24] [2024-06-12 16:17:46,412][71029] Worker 28 uses CPU cores [28] [2024-06-12 16:17:46,428][71007] Worker 6 uses CPU cores [6] [2024-06-12 16:17:46,451][71028] Worker 27 uses CPU cores [27] [2024-06-12 16:17:46,456][71004] Worker 4 uses CPU cores [4] [2024-06-12 16:17:46,468][71032] Worker 30 uses CPU cores [30] [2024-06-12 16:17:46,475][71010] Worker 8 uses CPU cores [8] [2024-06-12 16:17:46,480][71018] Worker 17 uses CPU cores [17] [2024-06-12 16:17:46,490][71016] Worker 15 uses CPU cores [15] [2024-06-12 16:17:46,494][71008] Worker 7 uses CPU cores [7] [2024-06-12 16:17:46,504][71003] Worker 2 uses CPU cores [2] [2024-06-12 16:17:46,518][71001] Worker 1 uses CPU cores [1] [2024-06-12 16:17:46,547][71009] Worker 9 uses CPU cores [9] [2024-06-12 16:17:46,555][71030] Worker 29 uses CPU cores [29] [2024-06-12 16:17:46,578][71023] Worker 22 uses CPU cores [22] [2024-06-12 16:17:46,591][71006] Worker 5 uses CPU cores [5] [2024-06-12 16:17:46,645][71002] Worker 0 uses CPU cores [0] [2024-06-12 16:17:46,650][71015] Worker 14 uses CPU cores [14] [2024-06-12 16:17:46,693][71031] Worker 31 uses CPU cores [31] [2024-06-12 16:17:47,089][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,089][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,089][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,089][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,089][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,089][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,089][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] 
RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,090][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,093][70980] RunningMeanStd input shape: (1,) [2024-06-12 16:17:47,093][70980] RunningMeanStd input shape: (1,) [2024-06-12 16:17:47,093][70980] RunningMeanStd input shape: (1,) [2024-06-12 16:17:47,094][70980] RunningMeanStd input shape: (1,) [2024-06-12 16:17:47,094][70980] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:47,132][70980] RunningMeanStd input shape: (1,) [2024-06-12 16:17:47,137][70980] Created Actor Critic model with architecture: [2024-06-12 16:17:47,137][70980] SampleFactoryAgentWrapper( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (agent): MettaAgent( (_encoder): MultiFeatureSetEncoder( (feature_set_encoders): ModuleDict( (grid_obs): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (converter): RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): 
RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:hp): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv_r1): RunningMeanStdInPlace() (agent:inv_r2): RunningMeanStdInPlace() (agent:inv_r3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (altar:hp): RunningMeanStdInPlace() (altar:state): RunningMeanStdInPlace() (converter:hp): RunningMeanStdInPlace() (converter:state): RunningMeanStdInPlace() (generator:amount): RunningMeanStdInPlace() (generator:hp): RunningMeanStdInPlace() (generator:state): RunningMeanStdInPlace() (wall:hp): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (global_vars): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (_steps): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_action): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_reward): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_reward): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): 
ELU(alpha=1.0) ) ) (kinship): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (kinship): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) ) (merged_encoder): Sequential( (0): Linear(in_features=544, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (_core): ModelCoreRNN( (core): GRU(512, 512) ) (_decoder): Decoder( (mlp): Identity() ) (_critic_linear): Linear(in_features=512, out_features=1, bias=True) (_action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) ) [2024-06-12 16:17:47,209][70980] Using optimizer [2024-06-12 16:17:47,390][70980] Loading state from checkpoint /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000028764_471269376.pth... [2024-06-12 16:17:47,404][70980] Loading model from checkpoint [2024-06-12 16:17:47,406][70980] Loaded experiment state at self.train_step=28764, self.env_steps=471269376 [2024-06-12 16:17:47,406][70980] Initialized policy 0 weights for model version 28764 [2024-06-12 16:17:47,407][70980] LearnerWorker_p0 finished initialization! 
[2024-06-12 16:17:47,407][70980] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-12 16:17:48,102][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,103][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,104][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,104][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,104][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,104][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,107][71000] RunningMeanStd input shape: (1,) [2024-06-12 16:17:48,107][71000] RunningMeanStd input shape: (1,) [2024-06-12 16:17:48,107][71000] RunningMeanStd input shape: (1,) [2024-06-12 16:17:48,107][71000] RunningMeanStd input shape: (1,) [2024-06-12 16:17:48,108][71000] RunningMeanStd input shape: (11, 11) [2024-06-12 16:17:48,146][71000] 
RunningMeanStd input shape: (1,) [2024-06-12 16:17:48,167][70768] Inference worker 0-0 is ready! [2024-06-12 16:17:48,168][70768] All inference workers are ready! Signal rollout workers to start! [2024-06-12 16:17:50,743][71020] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,746][71019] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,750][71030] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,750][71023] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,754][71025] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,757][71024] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,757][71022] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,767][71027] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,770][71018] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,774][71021] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,777][71031] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,782][71032] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,796][71028] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,799][71006] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,799][71029] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,806][71007] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,807][71009] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,808][71013] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,810][71011] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,813][71017] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,813][71014] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,814][71008] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,815][71012] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,815][71004] Decorrelating experience for 0 frames... 
[2024-06-12 16:17:50,815][71016] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,816][71010] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,816][71005] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,817][71002] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,817][71001] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,817][71015] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,819][71003] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,826][71026] Decorrelating experience for 0 frames... [2024-06-12 16:17:50,940][70768] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 471269376. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-12 16:17:51,742][71020] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,745][71019] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,750][71030] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,756][71023] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,763][71025] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,766][71022] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,777][71027] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,779][71024] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,787][71018] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,792][71021] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,798][71032] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,799][71031] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,819][71029] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,827][71006] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,845][71007] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,847][71013] Decorrelating experience for 256 frames... 
[2024-06-12 16:17:51,849][71028] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,851][71009] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,855][71011] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,862][71004] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,863][71014] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,864][71008] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,865][71012] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,867][71026] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,867][71016] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,867][71010] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,870][71005] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,870][71017] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,874][71015] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,877][71002] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,878][71001] Decorrelating experience for 256 frames... [2024-06-12 16:17:51,880][71003] Decorrelating experience for 256 frames... [2024-06-12 16:17:55,940][70768] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 471269376. Throughput: 0: 196.0. Samples: 980. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-12 16:17:55,940][70768] Avg episode reward: [(0, '0.000')] [2024-06-12 16:17:59,265][71001] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-06-12 16:17:59,286][71012] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-06-12 16:17:59,294][71006] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-06-12 16:17:59,296][71004] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-06-12 16:17:59,304][71009] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-06-12 16:17:59,312][71030] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-06-12 16:17:59,331][71024] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-06-12 16:17:59,331][71027] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-06-12 16:17:59,332][71025] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-06-12 16:17:59,332][71031] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-06-12 16:17:59,338][71029] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-06-12 16:17:59,372][71028] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-06-12 16:17:59,383][71017] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-06-12 16:17:59,427][70980] Signal inference workers to stop experience collection... [2024-06-12 16:17:59,433][71000] InferenceWorker_p0-w0: stopping experience collection [2024-06-12 16:17:59,971][70980] Signal inference workers to resume experience collection... 
[2024-06-12 16:17:59,971][71000] InferenceWorker_p0-w0: resuming experience collection [2024-06-12 16:18:00,006][71003] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-06-12 16:18:00,557][71010] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-06-12 16:18:00,634][71005] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-06-12 16:18:00,649][71011] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-06-12 16:18:00,674][71008] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-06-12 16:18:00,693][71013] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-06-12 16:18:00,814][71007] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-06-12 16:18:00,849][71015] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-06-12 16:18:00,859][71014] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-06-12 16:18:00,864][71016] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-06-12 16:18:00,939][70768] Fps is (10 sec: 9830.4, 60 sec: 9830.4, 300 sec: 9830.4). Total num frames: 471367680. Throughput: 0: 30612.1. Samples: 306120. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-12 16:18:00,940][70768] Avg episode reward: [(0, '0.009')] [2024-06-12 16:18:01,022][71021] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-06-12 16:18:01,103][71019] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-06-12 16:18:01,177][71018] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-06-12 16:18:01,180][71023] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-06-12 16:18:01,187][71022] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-06-12 16:18:01,283][71032] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-06-12 16:18:01,315][71026] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-06-12 16:18:01,441][71000] Updated weights for policy 0, policy_version 28774 (0.0020) [2024-06-12 16:18:01,593][71020] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-06-12 16:18:03,752][70768] Heartbeat connected on Batcher_0 [2024-06-12 16:18:03,754][70768] Heartbeat connected on LearnerWorker_p0 [2024-06-12 16:18:03,758][70768] Heartbeat connected on RolloutWorker_w0 [2024-06-12 16:18:03,806][70768] Heartbeat connected on InferenceWorker_p0-w0 [2024-06-12 16:18:03,976][71001] Worker 1 awakens! [2024-06-12 16:18:03,987][70768] Heartbeat connected on RolloutWorker_w1 [2024-06-12 16:18:05,940][70768] Fps is (10 sec: 16383.6, 60 sec: 10922.4, 300 sec: 10922.4). Total num frames: 471433216. Throughput: 0: 22211.5. Samples: 333180. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-12 16:18:05,941][70768] Avg episode reward: [(0, '0.009')] [2024-06-12 16:18:09,389][71003] Worker 2 awakens! [2024-06-12 16:18:09,399][70768] Heartbeat connected on RolloutWorker_w2 [2024-06-12 16:18:10,940][70768] Fps is (10 sec: 8192.0, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 471449600. Throughput: 0: 17238.0. 
Samples: 344760. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-12 16:18:10,940][70768] Avg episode reward: [(0, '0.021')] [2024-06-12 16:18:14,767][71005] Worker 3 awakens! [2024-06-12 16:18:14,776][70768] Heartbeat connected on RolloutWorker_w3 [2024-06-12 16:18:15,940][70768] Fps is (10 sec: 3276.9, 60 sec: 7864.3, 300 sec: 7864.3). Total num frames: 471465984. Throughput: 0: 14575.2. Samples: 364380. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-12 16:18:15,940][70768] Avg episode reward: [(0, '0.038')] [2024-06-12 16:18:18,118][71004] Worker 4 awakens! [2024-06-12 16:18:18,127][70768] Heartbeat connected on RolloutWorker_w4 [2024-06-12 16:18:20,939][70768] Fps is (10 sec: 8192.0, 60 sec: 8738.1, 300 sec: 8738.1). Total num frames: 471531520. Throughput: 0: 12584.7. Samples: 377540. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2024-06-12 16:18:20,940][70768] Avg episode reward: [(0, '0.053')] [2024-06-12 16:18:22,764][71006] Worker 5 awakens! [2024-06-12 16:18:22,768][70768] Heartbeat connected on RolloutWorker_w5 [2024-06-12 16:18:23,892][71000] Updated weights for policy 0, policy_version 28784 (0.0015) [2024-06-12 16:18:25,939][70768] Fps is (10 sec: 16384.2, 60 sec: 10298.5, 300 sec: 10298.5). Total num frames: 471629824. Throughput: 0: 14064.6. Samples: 492260. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2024-06-12 16:18:25,940][70768] Avg episode reward: [(0, '0.097')] [2024-06-12 16:18:29,036][71007] Worker 6 awakens! [2024-06-12 16:18:29,040][70768] Heartbeat connected on RolloutWorker_w6 [2024-06-12 16:18:30,939][70768] Fps is (10 sec: 19660.7, 60 sec: 11468.8, 300 sec: 11468.8). Total num frames: 471728128. Throughput: 0: 15905.0. Samples: 636200. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2024-06-12 16:18:30,940][70768] Avg episode reward: [(0, '0.105')] [2024-06-12 16:18:31,029][70980] Saving new best policy, reward=0.105! 
[2024-06-12 16:18:31,288][71000] Updated weights for policy 0, policy_version 28794 (0.0012) [2024-06-12 16:18:33,514][71008] Worker 7 awakens! [2024-06-12 16:18:33,520][70768] Heartbeat connected on RolloutWorker_w7 [2024-06-12 16:18:35,940][70768] Fps is (10 sec: 26214.1, 60 sec: 13835.4, 300 sec: 13835.4). Total num frames: 471891968. Throughput: 0: 15830.6. Samples: 712380. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2024-06-12 16:18:35,948][70768] Avg episode reward: [(0, '0.099')] [2024-06-12 16:18:36,457][71000] Updated weights for policy 0, policy_version 28804 (0.0011) [2024-06-12 16:18:38,156][71010] Worker 8 awakens! [2024-06-12 16:18:38,161][70768] Heartbeat connected on RolloutWorker_w8 [2024-06-12 16:18:40,940][70768] Fps is (10 sec: 31129.5, 60 sec: 15401.0, 300 sec: 15401.0). Total num frames: 472039424. Throughput: 0: 19699.6. Samples: 887460. Policy #0 lag: (min: 0.0, avg: 2.1, max: 7.0) [2024-06-12 16:18:40,940][70768] Avg episode reward: [(0, '0.098')] [2024-06-12 16:18:41,592][71009] Worker 9 awakens! [2024-06-12 16:18:41,595][70768] Heartbeat connected on RolloutWorker_w9 [2024-06-12 16:18:41,736][71000] Updated weights for policy 0, policy_version 28814 (0.0013) [2024-06-12 16:18:45,939][70768] Fps is (10 sec: 32768.2, 60 sec: 17277.7, 300 sec: 17277.7). Total num frames: 472219648. Throughput: 0: 17420.9. Samples: 1090060. Policy #0 lag: (min: 0.0, avg: 2.1, max: 7.0) [2024-06-12 16:18:45,940][70768] Avg episode reward: [(0, '0.081')] [2024-06-12 16:18:46,261][71012] Worker 10 awakens! [2024-06-12 16:18:46,264][70768] Heartbeat connected on RolloutWorker_w10 [2024-06-12 16:18:46,836][71000] Updated weights for policy 0, policy_version 28824 (0.0014) [2024-06-12 16:18:50,343][71000] Updated weights for policy 0, policy_version 28834 (0.0013) [2024-06-12 16:18:50,940][70768] Fps is (10 sec: 39321.4, 60 sec: 19387.7, 300 sec: 19387.7). Total num frames: 472432640. Throughput: 0: 19321.9. Samples: 1202660. 
Policy #0 lag: (min: 0.0, avg: 2.1, max: 7.0) [2024-06-12 16:18:50,940][70768] Avg episode reward: [(0, '0.083')] [2024-06-12 16:18:52,312][71011] Worker 11 awakens! [2024-06-12 16:18:52,317][70768] Heartbeat connected on RolloutWorker_w11 [2024-06-12 16:18:54,950][71000] Updated weights for policy 0, policy_version 28844 (0.0013) [2024-06-12 16:18:55,940][70768] Fps is (10 sec: 40959.9, 60 sec: 22664.6, 300 sec: 20921.1). Total num frames: 472629248. Throughput: 0: 24644.4. Samples: 1453760. Policy #0 lag: (min: 0.0, avg: 2.1, max: 7.0) [2024-06-12 16:18:55,940][70768] Avg episode reward: [(0, '0.114')] [2024-06-12 16:18:55,940][70980] Saving new best policy, reward=0.114! [2024-06-12 16:18:57,210][71014] Worker 12 awakens! [2024-06-12 16:18:57,215][70768] Heartbeat connected on RolloutWorker_w12 [2024-06-12 16:18:58,682][71000] Updated weights for policy 0, policy_version 28854 (0.0014) [2024-06-12 16:19:00,940][70768] Fps is (10 sec: 40960.0, 60 sec: 24576.0, 300 sec: 22469.5). Total num frames: 472842240. Throughput: 0: 30136.4. Samples: 1720520. Policy #0 lag: (min: 0.0, avg: 30.1, max: 86.0) [2024-06-12 16:19:00,940][70768] Avg episode reward: [(0, '0.095')] [2024-06-12 16:19:01,664][71013] Worker 13 awakens! [2024-06-12 16:19:01,671][70768] Heartbeat connected on RolloutWorker_w13 [2024-06-12 16:19:01,980][71000] Updated weights for policy 0, policy_version 28864 (0.0015) [2024-06-12 16:19:05,536][71000] Updated weights for policy 0, policy_version 28874 (0.0019) [2024-06-12 16:19:05,940][70768] Fps is (10 sec: 45874.9, 60 sec: 27579.8, 300 sec: 24248.3). Total num frames: 473088000. Throughput: 0: 32911.5. Samples: 1858560. Policy #0 lag: (min: 0.0, avg: 30.1, max: 86.0) [2024-06-12 16:19:05,940][70768] Avg episode reward: [(0, '0.094')] [2024-06-12 16:19:06,572][71015] Worker 14 awakens! 
[2024-06-12 16:19:06,577][70768] Heartbeat connected on RolloutWorker_w14 [2024-06-12 16:19:09,030][71000] Updated weights for policy 0, policy_version 28884 (0.0018) [2024-06-12 16:19:10,940][70768] Fps is (10 sec: 44236.9, 60 sec: 30583.4, 300 sec: 25190.4). Total num frames: 473284608. Throughput: 0: 36219.9. Samples: 2122160. Policy #0 lag: (min: 0.0, avg: 30.1, max: 86.0) [2024-06-12 16:19:10,940][70768] Avg episode reward: [(0, '0.101')] [2024-06-12 16:19:11,276][71016] Worker 15 awakens! [2024-06-12 16:19:11,281][70768] Heartbeat connected on RolloutWorker_w15 [2024-06-12 16:19:12,786][71000] Updated weights for policy 0, policy_version 28894 (0.0019) [2024-06-12 16:19:14,484][71017] Worker 16 awakens! [2024-06-12 16:19:14,490][70768] Heartbeat connected on RolloutWorker_w16 [2024-06-12 16:19:15,940][70768] Fps is (10 sec: 42598.3, 60 sec: 34133.3, 300 sec: 26407.1). Total num frames: 473513984. Throughput: 0: 38726.6. Samples: 2378900. Policy #0 lag: (min: 0.0, avg: 30.1, max: 86.0) [2024-06-12 16:19:15,940][70768] Avg episode reward: [(0, '0.100')] [2024-06-12 16:19:16,933][71000] Updated weights for policy 0, policy_version 28904 (0.0022) [2024-06-12 16:19:20,626][71000] Updated weights for policy 0, policy_version 28914 (0.0018) [2024-06-12 16:19:20,916][71018] Worker 17 awakens! [2024-06-12 16:19:20,924][70768] Heartbeat connected on RolloutWorker_w17 [2024-06-12 16:19:20,940][70768] Fps is (10 sec: 44236.9, 60 sec: 36590.9, 300 sec: 27306.7). Total num frames: 473726976. Throughput: 0: 40008.0. Samples: 2512740. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-06-12 16:19:20,940][70768] Avg episode reward: [(0, '0.101')] [2024-06-12 16:19:24,550][71000] Updated weights for policy 0, policy_version 28924 (0.0023) [2024-06-12 16:19:25,514][71019] Worker 18 awakens! [2024-06-12 16:19:25,523][70768] Heartbeat connected on RolloutWorker_w18 [2024-06-12 16:19:25,940][70768] Fps is (10 sec: 42598.5, 60 sec: 38502.3, 300 sec: 28111.5). 
Total num frames: 473939968. Throughput: 0: 41864.8. Samples: 2771380. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-06-12 16:19:25,940][70768] Avg episode reward: [(0, '0.100')] [2024-06-12 16:19:28,388][71000] Updated weights for policy 0, policy_version 28934 (0.0021) [2024-06-12 16:19:30,756][71020] Worker 19 awakens! [2024-06-12 16:19:30,765][70768] Heartbeat connected on RolloutWorker_w19 [2024-06-12 16:19:30,940][70768] Fps is (10 sec: 44236.6, 60 sec: 40686.9, 300 sec: 28999.7). Total num frames: 474169344. Throughput: 0: 43318.1. Samples: 3039380. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-06-12 16:19:30,940][70768] Avg episode reward: [(0, '0.105')] [2024-06-12 16:19:31,747][71000] Updated weights for policy 0, policy_version 28944 (0.0020) [2024-06-12 16:19:34,800][71021] Worker 20 awakens! [2024-06-12 16:19:34,809][70768] Heartbeat connected on RolloutWorker_w20 [2024-06-12 16:19:35,101][71000] Updated weights for policy 0, policy_version 28954 (0.0018) [2024-06-12 16:19:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 42052.3, 300 sec: 29959.3). Total num frames: 474415104. Throughput: 0: 43959.6. Samples: 3180840. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-06-12 16:19:35,940][70768] Avg episode reward: [(0, '0.099')] [2024-06-12 16:19:38,953][71000] Updated weights for policy 0, policy_version 28964 (0.0017) [2024-06-12 16:19:39,695][71022] Worker 21 awakens! [2024-06-12 16:19:39,704][70768] Heartbeat connected on RolloutWorker_w21 [2024-06-12 16:19:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 43417.6, 300 sec: 30682.7). Total num frames: 474644480. Throughput: 0: 44520.0. Samples: 3457160. Policy #0 lag: (min: 0.0, avg: 7.3, max: 14.0) [2024-06-12 16:19:40,940][70768] Avg episode reward: [(0, '0.099')] [2024-06-12 16:19:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000028970_474644480.pth... 
[2024-06-12 16:19:40,992][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000028383_465027072.pth [2024-06-12 16:19:42,413][71000] Updated weights for policy 0, policy_version 28974 (0.0021) [2024-06-12 16:19:44,406][71023] Worker 22 awakens! [2024-06-12 16:19:44,417][70768] Heartbeat connected on RolloutWorker_w22 [2024-06-12 16:19:45,465][71000] Updated weights for policy 0, policy_version 28984 (0.0028) [2024-06-12 16:19:45,940][70768] Fps is (10 sec: 45875.4, 60 sec: 44236.8, 300 sec: 31343.3). Total num frames: 474873856. Throughput: 0: 44890.3. Samples: 3740580. Policy #0 lag: (min: 0.0, avg: 7.3, max: 14.0) [2024-06-12 16:19:45,940][70768] Avg episode reward: [(0, '0.100')] [2024-06-12 16:19:47,164][71024] Worker 23 awakens! [2024-06-12 16:19:47,173][70768] Heartbeat connected on RolloutWorker_w23 [2024-06-12 16:19:49,034][71000] Updated weights for policy 0, policy_version 28994 (0.0023) [2024-06-12 16:19:50,940][70768] Fps is (10 sec: 47513.2, 60 sec: 44782.9, 300 sec: 32085.3). Total num frames: 475119616. Throughput: 0: 45159.9. Samples: 3890760. Policy #0 lag: (min: 0.0, avg: 7.3, max: 14.0) [2024-06-12 16:19:50,940][70768] Avg episode reward: [(0, '0.102')] [2024-06-12 16:19:52,788][71000] Updated weights for policy 0, policy_version 29004 (0.0026) [2024-06-12 16:19:53,916][71026] Worker 24 awakens! [2024-06-12 16:19:53,927][70768] Heartbeat connected on RolloutWorker_w24 [2024-06-12 16:19:55,603][71000] Updated weights for policy 0, policy_version 29014 (0.0024) [2024-06-12 16:19:55,940][70768] Fps is (10 sec: 49151.2, 60 sec: 45602.0, 300 sec: 32767.9). Total num frames: 475365376. Throughput: 0: 45608.7. Samples: 4174560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-12 16:19:55,940][70768] Avg episode reward: [(0, '0.106')] [2024-06-12 16:19:56,620][71025] Worker 25 awakens! 
[2024-06-12 16:19:56,631][70768] Heartbeat connected on RolloutWorker_w25 [2024-06-12 16:19:59,454][71000] Updated weights for policy 0, policy_version 29024 (0.0031) [2024-06-12 16:20:00,940][70768] Fps is (10 sec: 50790.6, 60 sec: 46421.3, 300 sec: 33524.2). Total num frames: 475627520. Throughput: 0: 46459.5. Samples: 4469580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-12 16:20:00,940][70768] Avg episode reward: [(0, '0.093')] [2024-06-12 16:20:01,298][71027] Worker 26 awakens! [2024-06-12 16:20:01,308][70768] Heartbeat connected on RolloutWorker_w26 [2024-06-12 16:20:02,643][71000] Updated weights for policy 0, policy_version 29034 (0.0028) [2024-06-12 16:20:05,745][71000] Updated weights for policy 0, policy_version 29044 (0.0022) [2024-06-12 16:20:05,940][70768] Fps is (10 sec: 49152.6, 60 sec: 46148.3, 300 sec: 33981.6). Total num frames: 475856896. Throughput: 0: 46922.6. Samples: 4624260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-12 16:20:05,940][70768] Avg episode reward: [(0, '0.087')] [2024-06-12 16:20:06,032][71028] Worker 27 awakens! [2024-06-12 16:20:06,042][70768] Heartbeat connected on RolloutWorker_w27 [2024-06-12 16:20:09,077][71000] Updated weights for policy 0, policy_version 29054 (0.0026) [2024-06-12 16:20:10,688][71029] Worker 28 awakens! [2024-06-12 16:20:10,698][70768] Heartbeat connected on RolloutWorker_w28 [2024-06-12 16:20:10,940][70768] Fps is (10 sec: 47513.9, 60 sec: 46967.5, 300 sec: 34523.4). Total num frames: 476102656. Throughput: 0: 47908.5. Samples: 4927260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-12 16:20:10,940][70768] Avg episode reward: [(0, '0.097')] [2024-06-12 16:20:12,034][71000] Updated weights for policy 0, policy_version 29064 (0.0025) [2024-06-12 16:20:15,326][71030] Worker 29 awakens! [2024-06-12 16:20:15,337][70768] Heartbeat connected on RolloutWorker_w29 [2024-06-12 16:20:15,939][70768] Fps is (10 sec: 47514.1, 60 sec: 46967.6, 300 sec: 34914.9). 
Total num frames: 476332032. Throughput: 0: 48703.2. Samples: 5231020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-12 16:20:15,940][70768] Avg episode reward: [(0, '0.090')] [2024-06-12 16:20:15,992][71000] Updated weights for policy 0, policy_version 29074 (0.0029) [2024-06-12 16:20:18,716][71000] Updated weights for policy 0, policy_version 29084 (0.0028) [2024-06-12 16:20:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 47786.6, 300 sec: 35498.7). Total num frames: 476594176. Throughput: 0: 48676.9. Samples: 5371300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-12 16:20:20,940][70768] Avg episode reward: [(0, '0.104')] [2024-06-12 16:20:22,009][71032] Worker 30 awakens! [2024-06-12 16:20:22,020][70768] Heartbeat connected on RolloutWorker_w30 [2024-06-12 16:20:22,324][71000] Updated weights for policy 0, policy_version 29094 (0.0027) [2024-06-12 16:20:24,745][71031] Worker 31 awakens! [2024-06-12 16:20:24,755][70768] Heartbeat connected on RolloutWorker_w31 [2024-06-12 16:20:25,003][71000] Updated weights for policy 0, policy_version 29104 (0.0028) [2024-06-12 16:20:25,940][70768] Fps is (10 sec: 54066.3, 60 sec: 48878.9, 300 sec: 36150.5). Total num frames: 476872704. Throughput: 0: 49389.2. Samples: 5679680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-12 16:20:25,940][70768] Avg episode reward: [(0, '0.113')] [2024-06-12 16:20:28,671][71000] Updated weights for policy 0, policy_version 29114 (0.0027) [2024-06-12 16:20:30,424][70980] Signal inference workers to stop experience collection... (50 times) [2024-06-12 16:20:30,424][70980] Signal inference workers to resume experience collection... (50 times) [2024-06-12 16:20:30,445][71000] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-06-12 16:20:30,445][71000] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-06-12 16:20:30,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49425.1, 300 sec: 36659.2). Total num frames: 477134848. 
Throughput: 0: 50109.3. Samples: 5995500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-12 16:20:30,940][70768] Avg episode reward: [(0, '0.110')] [2024-06-12 16:20:31,220][71000] Updated weights for policy 0, policy_version 29124 (0.0027) [2024-06-12 16:20:34,974][71000] Updated weights for policy 0, policy_version 29134 (0.0023) [2024-06-12 16:20:35,939][70768] Fps is (10 sec: 47514.4, 60 sec: 48879.0, 300 sec: 36839.2). Total num frames: 477347840. Throughput: 0: 50094.4. Samples: 6145000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 16:20:35,940][70768] Avg episode reward: [(0, '0.111')] [2024-06-12 16:20:37,541][71000] Updated weights for policy 0, policy_version 29144 (0.0029) [2024-06-12 16:20:40,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.2, 300 sec: 37297.7). Total num frames: 477609984. Throughput: 0: 50578.1. Samples: 6450560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 16:20:40,940][70768] Avg episode reward: [(0, '0.104')] [2024-06-12 16:20:41,617][71000] Updated weights for policy 0, policy_version 29154 (0.0029) [2024-06-12 16:20:44,294][71000] Updated weights for policy 0, policy_version 29164 (0.0026) [2024-06-12 16:20:45,940][70768] Fps is (10 sec: 54066.5, 60 sec: 50244.2, 300 sec: 37823.6). Total num frames: 477888512. Throughput: 0: 50505.8. Samples: 6742340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 16:20:45,940][70768] Avg episode reward: [(0, '0.128')] [2024-06-12 16:20:46,046][70980] Saving new best policy, reward=0.128! [2024-06-12 16:20:48,170][71000] Updated weights for policy 0, policy_version 29174 (0.0034) [2024-06-12 16:20:50,651][71000] Updated weights for policy 0, policy_version 29184 (0.0023) [2024-06-12 16:20:50,940][70768] Fps is (10 sec: 55704.4, 60 sec: 50790.4, 300 sec: 38320.3). Total num frames: 478167040. Throughput: 0: 50866.6. Samples: 6913260. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 16:20:50,940][70768] Avg episode reward: [(0, '0.128')] [2024-06-12 16:20:54,313][71000] Updated weights for policy 0, policy_version 29194 (0.0031) [2024-06-12 16:20:55,940][70768] Fps is (10 sec: 49152.4, 60 sec: 50244.4, 300 sec: 38436.0). Total num frames: 478380032. Throughput: 0: 50934.7. Samples: 7219320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 16:20:55,940][70768] Avg episode reward: [(0, '0.123')] [2024-06-12 16:20:56,898][71000] Updated weights for policy 0, policy_version 29204 (0.0027) [2024-06-12 16:21:00,794][71000] Updated weights for policy 0, policy_version 29214 (0.0027) [2024-06-12 16:21:00,940][70768] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 38804.2). Total num frames: 478642176. Throughput: 0: 50928.3. Samples: 7522800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 16:21:00,940][70768] Avg episode reward: [(0, '0.112')] [2024-06-12 16:21:03,507][71000] Updated weights for policy 0, policy_version 29224 (0.0024) [2024-06-12 16:21:05,940][70768] Fps is (10 sec: 52427.8, 60 sec: 50790.3, 300 sec: 39153.5). Total num frames: 478904320. Throughput: 0: 51073.2. Samples: 7669600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 16:21:05,940][70768] Avg episode reward: [(0, '0.105')] [2024-06-12 16:21:07,356][71000] Updated weights for policy 0, policy_version 29234 (0.0029) [2024-06-12 16:21:10,014][71000] Updated weights for policy 0, policy_version 29244 (0.0025) [2024-06-12 16:21:10,940][70768] Fps is (10 sec: 54067.5, 60 sec: 51336.5, 300 sec: 39567.4). Total num frames: 479182848. Throughput: 0: 51119.7. Samples: 7980060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 16:21:10,940][70768] Avg episode reward: [(0, '0.108')] [2024-06-12 16:21:13,641][71000] Updated weights for policy 0, policy_version 29254 (0.0023) [2024-06-12 16:21:15,939][70768] Fps is (10 sec: 52429.9, 60 sec: 51609.6, 300 sec: 39801.1). 
Total num frames: 479428608. Throughput: 0: 51143.2. Samples: 8296940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 16:21:15,940][70768] Avg episode reward: [(0, '0.124')] [2024-06-12 16:21:16,196][71000] Updated weights for policy 0, policy_version 29264 (0.0029) [2024-06-12 16:21:20,313][71000] Updated weights for policy 0, policy_version 29274 (0.0023) [2024-06-12 16:21:20,940][70768] Fps is (10 sec: 47513.8, 60 sec: 51063.5, 300 sec: 39945.8). Total num frames: 479657984. Throughput: 0: 51171.5. Samples: 8447720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 16:21:20,940][70768] Avg episode reward: [(0, '0.115')] [2024-06-12 16:21:22,618][71000] Updated weights for policy 0, policy_version 29284 (0.0032) [2024-06-12 16:21:25,944][70768] Fps is (10 sec: 49130.8, 60 sec: 50786.9, 300 sec: 40235.3). Total num frames: 479920128. Throughput: 0: 51133.7. Samples: 8751800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 16:21:25,944][70768] Avg episode reward: [(0, '0.112')] [2024-06-12 16:21:26,494][71000] Updated weights for policy 0, policy_version 29294 (0.0033) [2024-06-12 16:21:28,831][71000] Updated weights for policy 0, policy_version 29304 (0.0027) [2024-06-12 16:21:30,940][70768] Fps is (10 sec: 54065.9, 60 sec: 51063.3, 300 sec: 40587.6). Total num frames: 480198656. Throughput: 0: 51596.7. Samples: 9064200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 16:21:30,940][70768] Avg episode reward: [(0, '0.116')] [2024-06-12 16:21:31,430][70980] Signal inference workers to stop experience collection... (100 times) [2024-06-12 16:21:31,430][70980] Signal inference workers to resume experience collection... 
(100 times) [2024-06-12 16:21:31,470][71000] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-06-12 16:21:31,471][71000] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-06-12 16:21:32,761][71000] Updated weights for policy 0, policy_version 29314 (0.0024) [2024-06-12 16:21:35,505][71000] Updated weights for policy 0, policy_version 29324 (0.0035) [2024-06-12 16:21:35,940][70768] Fps is (10 sec: 57368.6, 60 sec: 52428.7, 300 sec: 40996.4). Total num frames: 480493568. Throughput: 0: 51487.7. Samples: 9230200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-12 16:21:35,940][70768] Avg episode reward: [(0, '0.107')] [2024-06-12 16:21:39,230][71000] Updated weights for policy 0, policy_version 29334 (0.0024) [2024-06-12 16:21:40,940][70768] Fps is (10 sec: 49151.3, 60 sec: 51336.1, 300 sec: 40959.9). Total num frames: 480690176. Throughput: 0: 51471.6. Samples: 9535560. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-12 16:21:40,941][70768] Avg episode reward: [(0, '0.108')] [2024-06-12 16:21:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000029339_480690176.pth... [2024-06-12 16:21:40,993][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000028764_471269376.pth [2024-06-12 16:21:41,876][71000] Updated weights for policy 0, policy_version 29344 (0.0025) [2024-06-12 16:21:45,745][71000] Updated weights for policy 0, policy_version 29354 (0.0024) [2024-06-12 16:21:45,940][70768] Fps is (10 sec: 44236.4, 60 sec: 50790.4, 300 sec: 41134.3). Total num frames: 480935936. Throughput: 0: 51530.2. Samples: 9841660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-12 16:21:45,940][70768] Avg episode reward: [(0, '0.108')] [2024-06-12 16:21:48,215][71000] Updated weights for policy 0, policy_version 29364 (0.0032) [2024-06-12 16:21:50,940][70768] Fps is (10 sec: 52430.1, 60 sec: 50790.4, 300 sec: 41437.8). Total num frames: 481214464. 
Throughput: 0: 51478.7. Samples: 9986140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-12 16:21:50,940][70768] Avg episode reward: [(0, '0.107')] [2024-06-12 16:21:52,013][71000] Updated weights for policy 0, policy_version 29374 (0.0031) [2024-06-12 16:21:54,642][71000] Updated weights for policy 0, policy_version 29384 (0.0023) [2024-06-12 16:21:55,940][70768] Fps is (10 sec: 55705.4, 60 sec: 51882.5, 300 sec: 41729.0). Total num frames: 481492992. Throughput: 0: 51408.3. Samples: 10293440. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 16:21:55,940][70768] Avg episode reward: [(0, '0.110')] [2024-06-12 16:21:58,384][71000] Updated weights for policy 0, policy_version 29394 (0.0029) [2024-06-12 16:22:00,940][70768] Fps is (10 sec: 50790.8, 60 sec: 51336.6, 300 sec: 41812.0). Total num frames: 481722368. Throughput: 0: 51314.6. Samples: 10606100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 16:22:00,940][70768] Avg episode reward: [(0, '0.108')] [2024-06-12 16:22:01,183][71000] Updated weights for policy 0, policy_version 29404 (0.0024) [2024-06-12 16:22:05,040][71000] Updated weights for policy 0, policy_version 29414 (0.0020) [2024-06-12 16:22:05,940][70768] Fps is (10 sec: 45875.9, 60 sec: 50790.6, 300 sec: 41891.6). Total num frames: 481951744. Throughput: 0: 51118.7. Samples: 10748060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 16:22:05,940][70768] Avg episode reward: [(0, '0.119')] [2024-06-12 16:22:07,661][71000] Updated weights for policy 0, policy_version 29424 (0.0033) [2024-06-12 16:22:10,939][70768] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 42094.3). Total num frames: 482213888. Throughput: 0: 50989.4. Samples: 11046100. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 16:22:10,940][70768] Avg episode reward: [(0, '0.114')] [2024-06-12 16:22:11,476][71000] Updated weights for policy 0, policy_version 29434 (0.0029) [2024-06-12 16:22:13,712][71000] Updated weights for policy 0, policy_version 29444 (0.0027) [2024-06-12 16:22:15,940][70768] Fps is (10 sec: 55705.5, 60 sec: 51336.5, 300 sec: 42412.9). Total num frames: 482508800. Throughput: 0: 51026.5. Samples: 11360380. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-12 16:22:15,940][70768] Avg episode reward: [(0, '0.118')] [2024-06-12 16:22:17,614][71000] Updated weights for policy 0, policy_version 29454 (0.0026) [2024-06-12 16:22:20,338][71000] Updated weights for policy 0, policy_version 29464 (0.0027) [2024-06-12 16:22:20,939][70768] Fps is (10 sec: 55705.6, 60 sec: 51882.7, 300 sec: 42598.4). Total num frames: 482770944. Throughput: 0: 51075.6. Samples: 11528600. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-12 16:22:20,940][70768] Avg episode reward: [(0, '0.119')] [2024-06-12 16:22:22,342][70980] Signal inference workers to stop experience collection... (150 times) [2024-06-12 16:22:22,342][70980] Signal inference workers to resume experience collection... (150 times) [2024-06-12 16:22:22,363][71000] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-06-12 16:22:22,363][71000] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-06-12 16:22:24,342][71000] Updated weights for policy 0, policy_version 29474 (0.0021) [2024-06-12 16:22:25,940][70768] Fps is (10 sec: 47513.7, 60 sec: 51067.1, 300 sec: 42598.4). Total num frames: 482983936. Throughput: 0: 51093.8. Samples: 11834760. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-12 16:22:25,940][70768] Avg episode reward: [(0, '0.117')] [2024-06-12 16:22:26,943][71000] Updated weights for policy 0, policy_version 29484 (0.0033) [2024-06-12 16:22:30,445][71000] Updated weights for policy 0, policy_version 29494 (0.0023) [2024-06-12 16:22:30,940][70768] Fps is (10 sec: 47513.2, 60 sec: 50790.6, 300 sec: 42773.9). Total num frames: 483246080. Throughput: 0: 51187.2. Samples: 12145080. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-12 16:22:30,940][70768] Avg episode reward: [(0, '0.124')] [2024-06-12 16:22:33,246][71000] Updated weights for policy 0, policy_version 29504 (0.0026) [2024-06-12 16:22:35,939][70768] Fps is (10 sec: 54067.2, 60 sec: 50517.4, 300 sec: 43000.8). Total num frames: 483524608. Throughput: 0: 51056.6. Samples: 12283680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 16:22:35,940][70768] Avg episode reward: [(0, '0.122')] [2024-06-12 16:22:36,843][71000] Updated weights for policy 0, policy_version 29514 (0.0021) [2024-06-12 16:22:39,627][71000] Updated weights for policy 0, policy_version 29524 (0.0029) [2024-06-12 16:22:40,940][70768] Fps is (10 sec: 57343.9, 60 sec: 52156.0, 300 sec: 43276.3). Total num frames: 483819520. Throughput: 0: 51200.5. Samples: 12597460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 16:22:40,940][70768] Avg episode reward: [(0, '0.133')] [2024-06-12 16:22:40,954][70980] Saving new best policy, reward=0.133! [2024-06-12 16:22:43,243][71000] Updated weights for policy 0, policy_version 29534 (0.0024) [2024-06-12 16:22:45,940][70768] Fps is (10 sec: 50789.8, 60 sec: 51609.6, 300 sec: 43264.9). Total num frames: 484032512. Throughput: 0: 51099.0. Samples: 12905560. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 16:22:45,940][70768] Avg episode reward: [(0, '0.109')] [2024-06-12 16:22:46,066][71000] Updated weights for policy 0, policy_version 29544 (0.0030) [2024-06-12 16:22:49,563][71000] Updated weights for policy 0, policy_version 29554 (0.0024) [2024-06-12 16:22:50,940][70768] Fps is (10 sec: 45875.4, 60 sec: 51063.5, 300 sec: 44098.0). Total num frames: 484278272. Throughput: 0: 51149.3. Samples: 13049780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 16:22:50,940][70768] Avg episode reward: [(0, '0.117')] [2024-06-12 16:22:52,322][71000] Updated weights for policy 0, policy_version 29564 (0.0025) [2024-06-12 16:22:55,898][71000] Updated weights for policy 0, policy_version 29574 (0.0027) [2024-06-12 16:22:55,940][70768] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 44653.3). Total num frames: 484540416. Throughput: 0: 51663.5. Samples: 13370960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 16:22:55,940][70768] Avg episode reward: [(0, '0.147')] [2024-06-12 16:22:56,054][70980] Saving new best policy, reward=0.147! [2024-06-12 16:22:58,580][71000] Updated weights for policy 0, policy_version 29584 (0.0029) [2024-06-12 16:23:00,939][70768] Fps is (10 sec: 55705.9, 60 sec: 51882.7, 300 sec: 45430.9). Total num frames: 484835328. Throughput: 0: 51586.3. Samples: 13681760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 16:23:00,940][70768] Avg episode reward: [(0, '0.127')] [2024-06-12 16:23:01,960][71000] Updated weights for policy 0, policy_version 29594 (0.0029) [2024-06-12 16:23:05,147][71000] Updated weights for policy 0, policy_version 29604 (0.0028) [2024-06-12 16:23:05,940][70768] Fps is (10 sec: 52429.0, 60 sec: 51882.7, 300 sec: 46152.9). Total num frames: 485064704. Throughput: 0: 51472.9. Samples: 13844880. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 16:23:05,940][70768] Avg episode reward: [(0, '0.117')] [2024-06-12 16:23:08,480][71000] Updated weights for policy 0, policy_version 29614 (0.0026) [2024-06-12 16:23:10,940][70768] Fps is (10 sec: 47513.3, 60 sec: 51609.5, 300 sec: 46930.4). Total num frames: 485310464. Throughput: 0: 51400.4. Samples: 14147780. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 16:23:10,940][70768] Avg episode reward: [(0, '0.118')] [2024-06-12 16:23:11,569][71000] Updated weights for policy 0, policy_version 29624 (0.0028) [2024-06-12 16:23:14,737][71000] Updated weights for policy 0, policy_version 29634 (0.0024) [2024-06-12 16:23:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 47596.9). Total num frames: 485572608. Throughput: 0: 51170.3. Samples: 14447740. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 16:23:15,940][70768] Avg episode reward: [(0, '0.127')] [2024-06-12 16:23:18,146][71000] Updated weights for policy 0, policy_version 29644 (0.0030) [2024-06-12 16:23:20,940][70768] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 48152.3). Total num frames: 485834752. Throughput: 0: 51572.8. Samples: 14604460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 16:23:20,940][70768] Avg episode reward: [(0, '0.133')] [2024-06-12 16:23:21,004][71000] Updated weights for policy 0, policy_version 29654 (0.0021) [2024-06-12 16:23:23,554][70980] Signal inference workers to stop experience collection... (200 times) [2024-06-12 16:23:23,556][70980] Signal inference workers to resume experience collection... 
(200 times) [2024-06-12 16:23:23,595][71000] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-06-12 16:23:23,595][71000] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-06-12 16:23:24,433][71000] Updated weights for policy 0, policy_version 29664 (0.0029) [2024-06-12 16:23:25,941][70768] Fps is (10 sec: 52423.5, 60 sec: 51881.8, 300 sec: 48707.5). Total num frames: 486096896. Throughput: 0: 51397.6. Samples: 14910400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 16:23:25,941][70768] Avg episode reward: [(0, '0.124')] [2024-06-12 16:23:27,179][71000] Updated weights for policy 0, policy_version 29674 (0.0035) [2024-06-12 16:23:30,939][70768] Fps is (10 sec: 49152.5, 60 sec: 51336.6, 300 sec: 48929.9). Total num frames: 486326272. Throughput: 0: 51409.1. Samples: 15218960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 16:23:30,940][70768] Avg episode reward: [(0, '0.132')] [2024-06-12 16:23:31,009][71000] Updated weights for policy 0, policy_version 29684 (0.0023) [2024-06-12 16:23:33,782][71000] Updated weights for policy 0, policy_version 29694 (0.0027) [2024-06-12 16:23:35,940][70768] Fps is (10 sec: 49155.3, 60 sec: 51063.2, 300 sec: 49318.6). Total num frames: 486588416. Throughput: 0: 51435.7. Samples: 15364400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 16:23:35,941][70768] Avg episode reward: [(0, '0.126')] [2024-06-12 16:23:37,402][71000] Updated weights for policy 0, policy_version 29704 (0.0029) [2024-06-12 16:23:40,140][71000] Updated weights for policy 0, policy_version 29714 (0.0029) [2024-06-12 16:23:40,940][70768] Fps is (10 sec: 52427.9, 60 sec: 50517.3, 300 sec: 49596.3). Total num frames: 486850560. Throughput: 0: 50983.4. Samples: 15665220. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 16:23:40,940][70768] Avg episode reward: [(0, '0.143')] [2024-06-12 16:23:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000029715_486850560.pth... [2024-06-12 16:23:40,987][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000028970_474644480.pth [2024-06-12 16:23:43,545][71000] Updated weights for policy 0, policy_version 29724 (0.0027) [2024-06-12 16:23:45,940][70768] Fps is (10 sec: 54068.3, 60 sec: 51609.6, 300 sec: 49818.5). Total num frames: 487129088. Throughput: 0: 50978.0. Samples: 15975780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 16:23:45,940][70768] Avg episode reward: [(0, '0.124')] [2024-06-12 16:23:46,449][71000] Updated weights for policy 0, policy_version 29734 (0.0034) [2024-06-12 16:23:50,091][71000] Updated weights for policy 0, policy_version 29744 (0.0025) [2024-06-12 16:23:50,940][70768] Fps is (10 sec: 52429.5, 60 sec: 51609.6, 300 sec: 49985.1). Total num frames: 487374848. Throughput: 0: 50797.8. Samples: 16130780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 16:23:50,940][70768] Avg episode reward: [(0, '0.100')] [2024-06-12 16:23:52,851][71000] Updated weights for policy 0, policy_version 29754 (0.0030) [2024-06-12 16:23:55,940][70768] Fps is (10 sec: 49152.2, 60 sec: 51336.5, 300 sec: 50096.2). Total num frames: 487620608. Throughput: 0: 50984.8. Samples: 16442100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 16:23:55,940][70768] Avg episode reward: [(0, '0.140')] [2024-06-12 16:23:56,523][71000] Updated weights for policy 0, policy_version 29764 (0.0022) [2024-06-12 16:23:59,264][71000] Updated weights for policy 0, policy_version 29774 (0.0036) [2024-06-12 16:24:00,940][70768] Fps is (10 sec: 49150.9, 60 sec: 50517.1, 300 sec: 50096.1). Total num frames: 487866368. Throughput: 0: 51113.5. Samples: 16747860. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 16:24:00,940][70768] Avg episode reward: [(0, '0.130')] [2024-06-12 16:24:02,811][71000] Updated weights for policy 0, policy_version 29784 (0.0024) [2024-06-12 16:24:05,940][70768] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50318.3). Total num frames: 488128512. Throughput: 0: 50909.8. Samples: 16895400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 16:24:05,940][70768] Avg episode reward: [(0, '0.132')] [2024-06-12 16:24:06,010][71000] Updated weights for policy 0, policy_version 29794 (0.0027) [2024-06-12 16:24:09,544][71000] Updated weights for policy 0, policy_version 29804 (0.0026) [2024-06-12 16:24:10,940][70768] Fps is (10 sec: 52429.7, 60 sec: 51336.5, 300 sec: 50429.4). Total num frames: 488390656. Throughput: 0: 50860.2. Samples: 17199060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 16:24:10,940][70768] Avg episode reward: [(0, '0.125')] [2024-06-12 16:24:12,336][71000] Updated weights for policy 0, policy_version 29814 (0.0026) [2024-06-12 16:24:15,940][70768] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 488620032. Throughput: 0: 50855.9. Samples: 17507480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 16:24:15,940][70768] Avg episode reward: [(0, '0.122')] [2024-06-12 16:24:16,044][71000] Updated weights for policy 0, policy_version 29824 (0.0019) [2024-06-12 16:24:18,753][71000] Updated weights for policy 0, policy_version 29834 (0.0022) [2024-06-12 16:24:20,940][70768] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 488882176. Throughput: 0: 50910.6. Samples: 17655360. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 16:24:20,940][70768] Avg episode reward: [(0, '0.124')] [2024-06-12 16:24:22,390][71000] Updated weights for policy 0, policy_version 29844 (0.0022) [2024-06-12 16:24:25,031][71000] Updated weights for policy 0, policy_version 29854 (0.0031) [2024-06-12 16:24:25,940][70768] Fps is (10 sec: 54067.0, 60 sec: 51064.3, 300 sec: 50818.2). Total num frames: 489160704. Throughput: 0: 51031.6. Samples: 17961640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 16:24:25,940][70768] Avg episode reward: [(0, '0.112')] [2024-06-12 16:24:28,883][71000] Updated weights for policy 0, policy_version 29864 (0.0029) [2024-06-12 16:24:30,939][70768] Fps is (10 sec: 52429.1, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 489406464. Throughput: 0: 51090.0. Samples: 18274820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 16:24:30,940][70768] Avg episode reward: [(0, '0.126')] [2024-06-12 16:24:31,850][71000] Updated weights for policy 0, policy_version 29874 (0.0033) [2024-06-12 16:24:35,342][71000] Updated weights for policy 0, policy_version 29884 (0.0023) [2024-06-12 16:24:35,940][70768] Fps is (10 sec: 47513.8, 60 sec: 50790.7, 300 sec: 50818.2). Total num frames: 489635840. Throughput: 0: 50817.3. Samples: 18417560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 16:24:35,940][70768] Avg episode reward: [(0, '0.125')] [2024-06-12 16:24:38,317][71000] Updated weights for policy 0, policy_version 29894 (0.0024) [2024-06-12 16:24:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 489881600. Throughput: 0: 50599.2. Samples: 18719060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 16:24:40,940][70768] Avg episode reward: [(0, '0.123')] [2024-06-12 16:24:41,591][71000] Updated weights for policy 0, policy_version 29904 (0.0020) [2024-06-12 16:24:43,494][70980] Signal inference workers to stop experience collection... 
(250 times) [2024-06-12 16:24:43,537][71000] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-06-12 16:24:43,544][70980] Signal inference workers to resume experience collection... (250 times) [2024-06-12 16:24:43,549][71000] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-06-12 16:24:44,649][71000] Updated weights for policy 0, policy_version 29914 (0.0027) [2024-06-12 16:24:45,940][70768] Fps is (10 sec: 52428.9, 60 sec: 50517.4, 300 sec: 50984.8). Total num frames: 490160128. Throughput: 0: 50498.0. Samples: 19020260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 16:24:45,940][70768] Avg episode reward: [(0, '0.124')] [2024-06-12 16:24:48,284][71000] Updated weights for policy 0, policy_version 29924 (0.0028) [2024-06-12 16:24:50,939][70768] Fps is (10 sec: 54067.5, 60 sec: 50790.4, 300 sec: 51040.4). Total num frames: 490422272. Throughput: 0: 50726.8. Samples: 19178100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 16:24:50,940][70768] Avg episode reward: [(0, '0.101')] [2024-06-12 16:24:51,121][71000] Updated weights for policy 0, policy_version 29934 (0.0032) [2024-06-12 16:24:54,898][71000] Updated weights for policy 0, policy_version 29944 (0.0027) [2024-06-12 16:24:55,940][70768] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50929.3). Total num frames: 490651648. Throughput: 0: 50929.8. Samples: 19490900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 16:24:55,940][70768] Avg episode reward: [(0, '0.126')] [2024-06-12 16:24:57,625][71000] Updated weights for policy 0, policy_version 29954 (0.0029) [2024-06-12 16:25:00,940][70768] Fps is (10 sec: 47512.5, 60 sec: 50517.4, 300 sec: 50984.8). Total num frames: 490897408. Throughput: 0: 50822.0. Samples: 19794480. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 16:25:00,940][70768] Avg episode reward: [(0, '0.105')] [2024-06-12 16:25:01,206][71000] Updated weights for policy 0, policy_version 29964 (0.0027) [2024-06-12 16:25:03,794][71000] Updated weights for policy 0, policy_version 29974 (0.0024) [2024-06-12 16:25:05,940][70768] Fps is (10 sec: 54066.5, 60 sec: 51063.4, 300 sec: 51151.4). Total num frames: 491192320. Throughput: 0: 50900.7. Samples: 19945900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 16:25:05,940][70768] Avg episode reward: [(0, '0.123')] [2024-06-12 16:25:07,448][71000] Updated weights for policy 0, policy_version 29984 (0.0034) [2024-06-12 16:25:10,211][71000] Updated weights for policy 0, policy_version 29994 (0.0026) [2024-06-12 16:25:10,940][70768] Fps is (10 sec: 54068.2, 60 sec: 50790.4, 300 sec: 51206.9). Total num frames: 491438080. Throughput: 0: 50911.6. Samples: 20252660. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-12 16:25:10,940][70768] Avg episode reward: [(0, '0.124')] [2024-06-12 16:25:14,519][71000] Updated weights for policy 0, policy_version 30004 (0.0024) [2024-06-12 16:25:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 51063.3, 300 sec: 51151.4). Total num frames: 491683840. Throughput: 0: 50800.6. Samples: 20560860. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-12 16:25:15,940][70768] Avg episode reward: [(0, '0.126')] [2024-06-12 16:25:16,698][71000] Updated weights for policy 0, policy_version 30014 (0.0023) [2024-06-12 16:25:20,730][71000] Updated weights for policy 0, policy_version 30024 (0.0030) [2024-06-12 16:25:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 51040.3). Total num frames: 491929600. Throughput: 0: 50884.9. Samples: 20707380. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-12 16:25:20,940][70768] Avg episode reward: [(0, '0.126')] [2024-06-12 16:25:23,286][71000] Updated weights for policy 0, policy_version 30034 (0.0029) [2024-06-12 16:25:25,940][70768] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 51040.3). Total num frames: 492191744. Throughput: 0: 50962.7. Samples: 21012380. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-12 16:25:25,940][70768] Avg episode reward: [(0, '0.127')] [2024-06-12 16:25:26,970][71000] Updated weights for policy 0, policy_version 30044 (0.0030) [2024-06-12 16:25:29,409][71000] Updated weights for policy 0, policy_version 30054 (0.0024) [2024-06-12 16:25:30,940][70768] Fps is (10 sec: 54066.6, 60 sec: 51063.3, 300 sec: 51262.4). Total num frames: 492470272. Throughput: 0: 51215.4. Samples: 21324960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 16:25:30,940][70768] Avg episode reward: [(0, '0.120')] [2024-06-12 16:25:33,134][71000] Updated weights for policy 0, policy_version 30064 (0.0027) [2024-06-12 16:25:35,726][71000] Updated weights for policy 0, policy_version 30074 (0.0031) [2024-06-12 16:25:35,940][70768] Fps is (10 sec: 54066.2, 60 sec: 51609.4, 300 sec: 51262.4). Total num frames: 492732416. Throughput: 0: 51151.3. Samples: 21479920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 16:25:35,940][70768] Avg episode reward: [(0, '0.124')] [2024-06-12 16:25:39,966][71000] Updated weights for policy 0, policy_version 30084 (0.0027) [2024-06-12 16:25:40,940][70768] Fps is (10 sec: 49152.4, 60 sec: 51336.5, 300 sec: 51095.9). Total num frames: 492961792. Throughput: 0: 51099.5. Samples: 21790380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 16:25:40,940][70768] Avg episode reward: [(0, '0.139')] [2024-06-12 16:25:40,994][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000030089_492978176.pth... 
[2024-06-12 16:25:41,041][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000029339_480690176.pth [2024-06-12 16:25:42,192][71000] Updated weights for policy 0, policy_version 30094 (0.0026) [2024-06-12 16:25:45,940][70768] Fps is (10 sec: 45875.6, 60 sec: 50517.2, 300 sec: 50929.2). Total num frames: 493191168. Throughput: 0: 50845.4. Samples: 22082520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 16:25:45,940][70768] Avg episode reward: [(0, '0.114')] [2024-06-12 16:25:46,469][71000] Updated weights for policy 0, policy_version 30104 (0.0027) [2024-06-12 16:25:49,006][71000] Updated weights for policy 0, policy_version 30114 (0.0029) [2024-06-12 16:25:50,940][70768] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 51151.4). Total num frames: 493469696. Throughput: 0: 50836.9. Samples: 22233560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-12 16:25:50,940][70768] Avg episode reward: [(0, '0.125')] [2024-06-12 16:25:53,020][71000] Updated weights for policy 0, policy_version 30124 (0.0025) [2024-06-12 16:25:53,854][70980] Signal inference workers to stop experience collection... (300 times) [2024-06-12 16:25:53,854][70980] Signal inference workers to resume experience collection... (300 times) [2024-06-12 16:25:53,873][71000] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-06-12 16:25:53,873][71000] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-06-12 16:25:55,322][71000] Updated weights for policy 0, policy_version 30134 (0.0025) [2024-06-12 16:25:55,940][70768] Fps is (10 sec: 54067.6, 60 sec: 51336.5, 300 sec: 51151.4). Total num frames: 493731840. Throughput: 0: 50795.1. Samples: 22538440. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-12 16:25:55,940][70768] Avg episode reward: [(0, '0.134')] [2024-06-12 16:25:59,219][71000] Updated weights for policy 0, policy_version 30144 (0.0024) [2024-06-12 16:26:00,940][70768] Fps is (10 sec: 50791.1, 60 sec: 51336.7, 300 sec: 51095.9). Total num frames: 493977600. Throughput: 0: 50825.1. Samples: 22847980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-12 16:26:00,940][70768] Avg episode reward: [(0, '0.132')] [2024-06-12 16:26:01,915][71000] Updated weights for policy 0, policy_version 30154 (0.0024) [2024-06-12 16:26:05,940][70768] Fps is (10 sec: 47513.7, 60 sec: 50244.4, 300 sec: 50929.2). Total num frames: 494206976. Throughput: 0: 50692.9. Samples: 22988560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-12 16:26:05,940][70768] Avg episode reward: [(0, '0.139')] [2024-06-12 16:26:05,945][71000] Updated weights for policy 0, policy_version 30164 (0.0023) [2024-06-12 16:26:08,365][71000] Updated weights for policy 0, policy_version 30174 (0.0025) [2024-06-12 16:26:10,940][70768] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50984.8). Total num frames: 494469120. Throughput: 0: 50777.3. Samples: 23297360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-12 16:26:10,940][70768] Avg episode reward: [(0, '0.153')] [2024-06-12 16:26:10,949][70980] Saving new best policy, reward=0.153! [2024-06-12 16:26:12,271][71000] Updated weights for policy 0, policy_version 30184 (0.0027) [2024-06-12 16:26:14,729][71000] Updated weights for policy 0, policy_version 30194 (0.0026) [2024-06-12 16:26:15,939][70768] Fps is (10 sec: 54067.7, 60 sec: 51063.7, 300 sec: 51151.4). Total num frames: 494747648. Throughput: 0: 50670.0. Samples: 23605100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-12 16:26:15,940][70768] Avg episode reward: [(0, '0.136')] [2024-06-12 16:26:18,711][71000] Updated weights for policy 0, policy_version 30204 (0.0033) [2024-06-12 16:26:20,940][70768] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 51096.6). Total num frames: 494993408. Throughput: 0: 50827.8. Samples: 23767160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-12 16:26:20,940][70768] Avg episode reward: [(0, '0.140')] [2024-06-12 16:26:21,367][71000] Updated weights for policy 0, policy_version 30214 (0.0037) [2024-06-12 16:26:25,311][71000] Updated weights for policy 0, policy_version 30224 (0.0025) [2024-06-12 16:26:25,940][70768] Fps is (10 sec: 47512.6, 60 sec: 50517.2, 300 sec: 50929.3). Total num frames: 495222784. Throughput: 0: 50264.8. Samples: 24052300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-12 16:26:25,940][70768] Avg episode reward: [(0, '0.106')] [2024-06-12 16:26:27,903][71000] Updated weights for policy 0, policy_version 30234 (0.0027) [2024-06-12 16:26:30,940][70768] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 495484928. Throughput: 0: 50415.2. Samples: 24351200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 16:26:30,940][70768] Avg episode reward: [(0, '0.145')] [2024-06-12 16:26:31,714][71000] Updated weights for policy 0, policy_version 30244 (0.0027) [2024-06-12 16:26:34,357][71000] Updated weights for policy 0, policy_version 30254 (0.0025) [2024-06-12 16:26:35,940][70768] Fps is (10 sec: 54067.2, 60 sec: 50517.4, 300 sec: 51095.9). Total num frames: 495763456. Throughput: 0: 50624.4. Samples: 24511660. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 16:26:35,940][70768] Avg episode reward: [(0, '0.132')] [2024-06-12 16:26:37,832][71000] Updated weights for policy 0, policy_version 30264 (0.0026) [2024-06-12 16:26:40,837][71000] Updated weights for policy 0, policy_version 30274 (0.0030) [2024-06-12 16:26:40,940][70768] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 51095.9). Total num frames: 496009216. Throughput: 0: 50768.4. Samples: 24823020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 16:26:40,940][70768] Avg episode reward: [(0, '0.129')] [2024-06-12 16:26:44,636][71000] Updated weights for policy 0, policy_version 30284 (0.0023) [2024-06-12 16:26:45,940][70768] Fps is (10 sec: 49152.6, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 496254976. Throughput: 0: 50628.9. Samples: 25126280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 16:26:45,940][70768] Avg episode reward: [(0, '0.118')] [2024-06-12 16:26:47,258][71000] Updated weights for policy 0, policy_version 30294 (0.0028) [2024-06-12 16:26:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 496484352. Throughput: 0: 50651.9. Samples: 25267900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 16:26:50,940][70768] Avg episode reward: [(0, '0.131')] [2024-06-12 16:26:50,951][71000] Updated weights for policy 0, policy_version 30304 (0.0027) [2024-06-12 16:26:53,861][71000] Updated weights for policy 0, policy_version 30314 (0.0025) [2024-06-12 16:26:55,940][70768] Fps is (10 sec: 49151.5, 60 sec: 50244.2, 300 sec: 50929.2). Total num frames: 496746496. Throughput: 0: 50616.3. Samples: 25575100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 16:26:55,940][70768] Avg episode reward: [(0, '0.133')] [2024-06-12 16:26:57,281][71000] Updated weights for policy 0, policy_version 30324 (0.0030) [2024-06-12 16:27:00,205][71000] Updated weights for policy 0, policy_version 30334 (0.0028) [2024-06-12 16:27:00,940][70768] Fps is (10 sec: 54067.5, 60 sec: 50790.4, 300 sec: 51095.9). Total num frames: 497025024. Throughput: 0: 50524.3. Samples: 25878700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 16:27:00,940][70768] Avg episode reward: [(0, '0.133')] [2024-06-12 16:27:03,758][71000] Updated weights for policy 0, policy_version 30344 (0.0024) [2024-06-12 16:27:05,940][70768] Fps is (10 sec: 52428.5, 60 sec: 51063.3, 300 sec: 51040.3). Total num frames: 497270784. Throughput: 0: 50454.9. Samples: 26037640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 16:27:05,940][70768] Avg episode reward: [(0, '0.145')] [2024-06-12 16:27:06,324][70980] Signal inference workers to stop experience collection... (350 times) [2024-06-12 16:27:06,347][71000] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-06-12 16:27:06,384][70980] Signal inference workers to resume experience collection... (350 times) [2024-06-12 16:27:06,385][71000] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-06-12 16:27:06,682][71000] Updated weights for policy 0, policy_version 30354 (0.0027) [2024-06-12 16:27:10,061][71000] Updated weights for policy 0, policy_version 30364 (0.0023) [2024-06-12 16:27:10,939][70768] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 497532928. Throughput: 0: 51114.0. Samples: 26352420. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-12 16:27:10,940][70768] Avg episode reward: [(0, '0.130')] [2024-06-12 16:27:13,043][71000] Updated weights for policy 0, policy_version 30374 (0.0022) [2024-06-12 16:27:15,940][70768] Fps is (10 sec: 50790.6, 60 sec: 50517.1, 300 sec: 50873.7). Total num frames: 497778688. Throughput: 0: 51067.9. Samples: 26649260. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-12 16:27:15,940][70768] Avg episode reward: [(0, '0.137')] [2024-06-12 16:27:16,451][71000] Updated weights for policy 0, policy_version 30384 (0.0028) [2024-06-12 16:27:19,635][71000] Updated weights for policy 0, policy_version 30394 (0.0029) [2024-06-12 16:27:20,940][70768] Fps is (10 sec: 50788.9, 60 sec: 50790.2, 300 sec: 51040.3). Total num frames: 498040832. Throughput: 0: 50941.2. Samples: 26804020. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-12 16:27:20,940][70768] Avg episode reward: [(0, '0.153')] [2024-06-12 16:27:22,883][71000] Updated weights for policy 0, policy_version 30404 (0.0027) [2024-06-12 16:27:25,893][71000] Updated weights for policy 0, policy_version 30414 (0.0025) [2024-06-12 16:27:25,940][70768] Fps is (10 sec: 52429.0, 60 sec: 51336.6, 300 sec: 51040.3). Total num frames: 498302976. Throughput: 0: 50749.7. Samples: 27106760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-12 16:27:25,940][70768] Avg episode reward: [(0, '0.133')] [2024-06-12 16:27:29,355][71000] Updated weights for policy 0, policy_version 30424 (0.0027) [2024-06-12 16:27:30,940][70768] Fps is (10 sec: 47514.5, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 498515968. Throughput: 0: 50891.5. Samples: 27416400. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-12 16:27:30,940][70768] Avg episode reward: [(0, '0.123')] [2024-06-12 16:27:32,357][71000] Updated weights for policy 0, policy_version 30434 (0.0023) [2024-06-12 16:27:35,798][71000] Updated weights for policy 0, policy_version 30444 (0.0035) [2024-06-12 16:27:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 498794496. Throughput: 0: 50980.9. Samples: 27562040. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-12 16:27:35,940][70768] Avg episode reward: [(0, '0.137')] [2024-06-12 16:27:38,931][71000] Updated weights for policy 0, policy_version 30454 (0.0026) [2024-06-12 16:27:40,940][70768] Fps is (10 sec: 54066.8, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 499056640. Throughput: 0: 50979.6. Samples: 27869180. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-12 16:27:40,940][70768] Avg episode reward: [(0, '0.137')] [2024-06-12 16:27:40,945][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000030460_499056640.pth... [2024-06-12 16:27:40,983][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000029715_486850560.pth [2024-06-12 16:27:42,362][71000] Updated weights for policy 0, policy_version 30464 (0.0030) [2024-06-12 16:27:45,404][71000] Updated weights for policy 0, policy_version 30474 (0.0026) [2024-06-12 16:27:45,940][70768] Fps is (10 sec: 52428.1, 60 sec: 51063.3, 300 sec: 50984.8). Total num frames: 499318784. Throughput: 0: 51086.9. Samples: 28177620. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-12 16:27:45,940][70768] Avg episode reward: [(0, '0.152')] [2024-06-12 16:27:48,872][71000] Updated weights for policy 0, policy_version 30484 (0.0031) [2024-06-12 16:27:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 499548160. Throughput: 0: 50839.1. Samples: 28325400. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 16:27:50,940][70768] Avg episode reward: [(0, '0.124')] [2024-06-12 16:27:51,775][71000] Updated weights for policy 0, policy_version 30494 (0.0025) [2024-06-12 16:27:55,135][71000] Updated weights for policy 0, policy_version 30504 (0.0033) [2024-06-12 16:27:55,940][70768] Fps is (10 sec: 49151.6, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 499810304. Throughput: 0: 50483.2. Samples: 28624180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 16:27:55,940][70768] Avg episode reward: [(0, '0.143')] [2024-06-12 16:27:58,209][71000] Updated weights for policy 0, policy_version 30514 (0.0028) [2024-06-12 16:28:00,940][70768] Fps is (10 sec: 52429.8, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 500072448. Throughput: 0: 50794.9. Samples: 28935020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 16:28:00,940][70768] Avg episode reward: [(0, '0.149')] [2024-06-12 16:28:01,662][71000] Updated weights for policy 0, policy_version 30524 (0.0027) [2024-06-12 16:28:04,719][71000] Updated weights for policy 0, policy_version 30534 (0.0035) [2024-06-12 16:28:05,940][70768] Fps is (10 sec: 50791.4, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 500318208. Throughput: 0: 50892.2. Samples: 29094160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 16:28:05,940][70768] Avg episode reward: [(0, '0.167')] [2024-06-12 16:28:05,941][70980] Saving new best policy, reward=0.167! [2024-06-12 16:28:08,298][71000] Updated weights for policy 0, policy_version 30544 (0.0031) [2024-06-12 16:28:10,660][70980] Signal inference workers to stop experience collection... (400 times) [2024-06-12 16:28:10,660][70980] Signal inference workers to resume experience collection... 
(400 times) [2024-06-12 16:28:10,699][71000] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-06-12 16:28:10,699][71000] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-06-12 16:28:10,939][70768] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 500580352. Throughput: 0: 50827.8. Samples: 29394000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 16:28:10,940][70768] Avg episode reward: [(0, '0.131')] [2024-06-12 16:28:10,977][71000] Updated weights for policy 0, policy_version 30554 (0.0037) [2024-06-12 16:28:14,579][71000] Updated weights for policy 0, policy_version 30564 (0.0032) [2024-06-12 16:28:15,940][70768] Fps is (10 sec: 47513.8, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 500793344. Throughput: 0: 50649.8. Samples: 29695640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 16:28:15,940][70768] Avg episode reward: [(0, '0.154')] [2024-06-12 16:28:17,399][71000] Updated weights for policy 0, policy_version 30574 (0.0026) [2024-06-12 16:28:20,678][71000] Updated weights for policy 0, policy_version 30584 (0.0025) [2024-06-12 16:28:20,939][70768] Fps is (10 sec: 50790.2, 60 sec: 50790.6, 300 sec: 50818.3). Total num frames: 501088256. Throughput: 0: 51003.2. Samples: 29857180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 16:28:20,940][70768] Avg episode reward: [(0, '0.141')] [2024-06-12 16:28:23,613][71000] Updated weights for policy 0, policy_version 30594 (0.0023) [2024-06-12 16:28:25,940][70768] Fps is (10 sec: 55705.5, 60 sec: 50790.5, 300 sec: 50929.2). Total num frames: 501350400. Throughput: 0: 51085.0. Samples: 30168000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 16:28:25,940][70768] Avg episode reward: [(0, '0.139')] [2024-06-12 16:28:27,293][71000] Updated weights for policy 0, policy_version 30604 (0.0033) [2024-06-12 16:28:30,094][71000] Updated weights for policy 0, policy_version 30614 (0.0033) [2024-06-12 16:28:30,940][70768] Fps is (10 sec: 49151.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 501579776. Throughput: 0: 50749.5. Samples: 30461340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 16:28:30,940][70768] Avg episode reward: [(0, '0.157')] [2024-06-12 16:28:33,778][71000] Updated weights for policy 0, policy_version 30624 (0.0034) [2024-06-12 16:28:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 501841920. Throughput: 0: 50651.3. Samples: 30604700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 16:28:35,940][70768] Avg episode reward: [(0, '0.142')] [2024-06-12 16:28:36,758][71000] Updated weights for policy 0, policy_version 30634 (0.0028) [2024-06-12 16:28:40,095][71000] Updated weights for policy 0, policy_version 30644 (0.0028) [2024-06-12 16:28:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 502087680. Throughput: 0: 50741.6. Samples: 30907540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 16:28:40,940][70768] Avg episode reward: [(0, '0.144')] [2024-06-12 16:28:43,469][71000] Updated weights for policy 0, policy_version 30654 (0.0021) [2024-06-12 16:28:45,940][70768] Fps is (10 sec: 52428.0, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 502366208. Throughput: 0: 50846.5. Samples: 31223120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 16:28:45,940][70768] Avg episode reward: [(0, '0.136')] [2024-06-12 16:28:46,281][71000] Updated weights for policy 0, policy_version 30664 (0.0030) [2024-06-12 16:28:49,790][71000] Updated weights for policy 0, policy_version 30674 (0.0021) [2024-06-12 16:28:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 502595584. Throughput: 0: 50648.0. Samples: 31373320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-12 16:28:50,940][70768] Avg episode reward: [(0, '0.139')] [2024-06-12 16:28:53,053][71000] Updated weights for policy 0, policy_version 30684 (0.0029) [2024-06-12 16:28:55,940][70768] Fps is (10 sec: 49152.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 502857728. Throughput: 0: 50690.9. Samples: 31675100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-12 16:28:55,940][70768] Avg episode reward: [(0, '0.158')] [2024-06-12 16:28:56,360][71000] Updated weights for policy 0, policy_version 30694 (0.0034) [2024-06-12 16:28:59,646][71000] Updated weights for policy 0, policy_version 30704 (0.0037) [2024-06-12 16:29:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49971.1, 300 sec: 50651.6). Total num frames: 503070720. Throughput: 0: 50267.5. Samples: 31957680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-12 16:29:00,940][70768] Avg episode reward: [(0, '0.170')] [2024-06-12 16:29:00,950][70980] Saving new best policy, reward=0.170! [2024-06-12 16:29:03,153][71000] Updated weights for policy 0, policy_version 30714 (0.0027) [2024-06-12 16:29:04,481][70980] Signal inference workers to stop experience collection... (450 times) [2024-06-12 16:29:04,481][70980] Signal inference workers to resume experience collection... 
(450 times) [2024-06-12 16:29:04,500][71000] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-06-12 16:29:04,500][71000] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-06-12 16:29:05,939][70768] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 503365632. Throughput: 0: 50132.0. Samples: 32113120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-12 16:29:05,940][70768] Avg episode reward: [(0, '0.163')] [2024-06-12 16:29:05,980][71000] Updated weights for policy 0, policy_version 30724 (0.0027) [2024-06-12 16:29:09,691][71000] Updated weights for policy 0, policy_version 30734 (0.0030) [2024-06-12 16:29:10,939][70768] Fps is (10 sec: 52429.3, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 503595008. Throughput: 0: 49991.2. Samples: 32417600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 16:29:10,940][70768] Avg episode reward: [(0, '0.121')] [2024-06-12 16:29:12,610][71000] Updated weights for policy 0, policy_version 30744 (0.0024) [2024-06-12 16:29:15,939][70768] Fps is (10 sec: 47513.7, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 503840768. Throughput: 0: 50326.7. Samples: 32726040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 16:29:15,940][70768] Avg episode reward: [(0, '0.154')] [2024-06-12 16:29:16,176][71000] Updated weights for policy 0, policy_version 30754 (0.0028) [2024-06-12 16:29:19,230][71000] Updated weights for policy 0, policy_version 30764 (0.0033) [2024-06-12 16:29:20,940][70768] Fps is (10 sec: 52428.1, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 504119296. Throughput: 0: 50406.5. Samples: 32873000. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 16:29:20,940][70768] Avg episode reward: [(0, '0.130')] [2024-06-12 16:29:22,495][71000] Updated weights for policy 0, policy_version 30774 (0.0022) [2024-06-12 16:29:25,622][71000] Updated weights for policy 0, policy_version 30784 (0.0026) [2024-06-12 16:29:25,940][70768] Fps is (10 sec: 54066.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 504381440. Throughput: 0: 50429.7. Samples: 33176880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 16:29:25,940][70768] Avg episode reward: [(0, '0.155')] [2024-06-12 16:29:29,154][71000] Updated weights for policy 0, policy_version 30794 (0.0030) [2024-06-12 16:29:30,940][70768] Fps is (10 sec: 49152.0, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 504610816. Throughput: 0: 50251.6. Samples: 33484440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 16:29:30,940][70768] Avg episode reward: [(0, '0.156')] [2024-06-12 16:29:31,868][71000] Updated weights for policy 0, policy_version 30804 (0.0033) [2024-06-12 16:29:35,858][71000] Updated weights for policy 0, policy_version 30814 (0.0030) [2024-06-12 16:29:35,940][70768] Fps is (10 sec: 47513.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 504856576. Throughput: 0: 50264.4. Samples: 33635220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 16:29:35,940][70768] Avg episode reward: [(0, '0.132')] [2024-06-12 16:29:38,450][71000] Updated weights for policy 0, policy_version 30824 (0.0031) [2024-06-12 16:29:40,940][70768] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 505118720. Throughput: 0: 50234.7. Samples: 33935660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 16:29:40,940][70768] Avg episode reward: [(0, '0.156')] [2024-06-12 16:29:40,957][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000030830_505118720.pth... 
[2024-06-12 16:29:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000030089_492978176.pth [2024-06-12 16:29:42,310][71000] Updated weights for policy 0, policy_version 30834 (0.0027) [2024-06-12 16:29:45,442][71000] Updated weights for policy 0, policy_version 30844 (0.0027) [2024-06-12 16:29:45,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49971.4, 300 sec: 50651.6). Total num frames: 505364480. Throughput: 0: 50470.4. Samples: 34228840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 16:29:45,948][70768] Avg episode reward: [(0, '0.154')] [2024-06-12 16:29:49,000][71000] Updated weights for policy 0, policy_version 30854 (0.0031) [2024-06-12 16:29:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 505610240. Throughput: 0: 50356.9. Samples: 34379180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 16:29:50,940][70768] Avg episode reward: [(0, '0.138')] [2024-06-12 16:29:51,875][71000] Updated weights for policy 0, policy_version 30864 (0.0036) [2024-06-12 16:29:55,674][71000] Updated weights for policy 0, policy_version 30874 (0.0036) [2024-06-12 16:29:55,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 505856000. Throughput: 0: 50392.8. Samples: 34685280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 16:29:55,940][70768] Avg episode reward: [(0, '0.161')] [2024-06-12 16:29:58,184][71000] Updated weights for policy 0, policy_version 30884 (0.0022) [2024-06-12 16:30:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 506101760. Throughput: 0: 50411.5. Samples: 34994560. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 16:30:00,940][70768] Avg episode reward: [(0, '0.154')] [2024-06-12 16:30:01,990][71000] Updated weights for policy 0, policy_version 30894 (0.0034) [2024-06-12 16:30:04,973][71000] Updated weights for policy 0, policy_version 30904 (0.0025) [2024-06-12 16:30:05,940][70768] Fps is (10 sec: 52428.1, 60 sec: 50244.1, 300 sec: 50651.5). Total num frames: 506380288. Throughput: 0: 50336.8. Samples: 35138160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 16:30:05,940][70768] Avg episode reward: [(0, '0.152')] [2024-06-12 16:30:08,399][71000] Updated weights for policy 0, policy_version 30914 (0.0028) [2024-06-12 16:30:10,940][70768] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50596.0). Total num frames: 506609664. Throughput: 0: 50154.3. Samples: 35433820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 16:30:10,940][70768] Avg episode reward: [(0, '0.159')] [2024-06-12 16:30:11,544][71000] Updated weights for policy 0, policy_version 30924 (0.0027) [2024-06-12 16:30:14,981][71000] Updated weights for policy 0, policy_version 30934 (0.0032) [2024-06-12 16:30:15,939][70768] Fps is (10 sec: 47514.8, 60 sec: 50244.3, 300 sec: 50596.0). Total num frames: 506855424. Throughput: 0: 50034.9. Samples: 35736000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 16:30:15,940][70768] Avg episode reward: [(0, '0.150')] [2024-06-12 16:30:17,932][71000] Updated weights for policy 0, policy_version 30944 (0.0042) [2024-06-12 16:30:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.2, 300 sec: 50540.5). Total num frames: 507101184. Throughput: 0: 49968.1. Samples: 35883780. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 16:30:20,940][70768] Avg episode reward: [(0, '0.167')] [2024-06-12 16:30:21,546][71000] Updated weights for policy 0, policy_version 30954 (0.0024) [2024-06-12 16:30:24,274][71000] Updated weights for policy 0, policy_version 30964 (0.0024) [2024-06-12 16:30:25,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.1, 300 sec: 50485.0). Total num frames: 507363328. Throughput: 0: 50172.5. Samples: 36193420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 16:30:25,940][70768] Avg episode reward: [(0, '0.139')] [2024-06-12 16:30:28,161][71000] Updated weights for policy 0, policy_version 30974 (0.0031) [2024-06-12 16:30:30,940][70768] Fps is (10 sec: 52428.8, 60 sec: 50244.4, 300 sec: 50485.0). Total num frames: 507625472. Throughput: 0: 50263.0. Samples: 36490680. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-12 16:30:30,940][70768] Avg episode reward: [(0, '0.144')] [2024-06-12 16:30:30,991][71000] Updated weights for policy 0, policy_version 30984 (0.0025) [2024-06-12 16:30:34,455][71000] Updated weights for policy 0, policy_version 30994 (0.0025) [2024-06-12 16:30:35,939][70768] Fps is (10 sec: 50790.7, 60 sec: 50244.4, 300 sec: 50540.5). Total num frames: 507871232. Throughput: 0: 50356.5. Samples: 36645220. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-12 16:30:35,940][70768] Avg episode reward: [(0, '0.177')] [2024-06-12 16:30:35,940][70980] Saving new best policy, reward=0.177! [2024-06-12 16:30:37,952][71000] Updated weights for policy 0, policy_version 31004 (0.0025) [2024-06-12 16:30:40,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49698.2, 300 sec: 50540.5). Total num frames: 508100608. Throughput: 0: 49958.8. Samples: 36933420. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-12 16:30:40,940][70768] Avg episode reward: [(0, '0.163')] [2024-06-12 16:30:41,266][71000] Updated weights for policy 0, policy_version 31014 (0.0037) [2024-06-12 16:30:44,287][71000] Updated weights for policy 0, policy_version 31024 (0.0034) [2024-06-12 16:30:45,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49698.1, 300 sec: 50429.4). Total num frames: 508346368. Throughput: 0: 49517.8. Samples: 37222860. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-12 16:30:45,940][70768] Avg episode reward: [(0, '0.146')] [2024-06-12 16:30:46,475][70980] Signal inference workers to stop experience collection... (500 times) [2024-06-12 16:30:46,475][70980] Signal inference workers to resume experience collection... (500 times) [2024-06-12 16:30:46,493][71000] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-06-12 16:30:46,493][71000] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-06-12 16:30:47,889][71000] Updated weights for policy 0, policy_version 31034 (0.0027) [2024-06-12 16:30:50,747][71000] Updated weights for policy 0, policy_version 31044 (0.0028) [2024-06-12 16:30:50,940][70768] Fps is (10 sec: 54067.0, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 508641280. Throughput: 0: 49906.9. Samples: 37383960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-12 16:30:50,940][70768] Avg episode reward: [(0, '0.146')] [2024-06-12 16:30:54,295][71000] Updated weights for policy 0, policy_version 31054 (0.0021) [2024-06-12 16:30:55,940][70768] Fps is (10 sec: 52428.2, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 508870656. Throughput: 0: 50203.0. Samples: 37692960. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-12 16:30:55,940][70768] Avg episode reward: [(0, '0.154')] [2024-06-12 16:30:57,290][71000] Updated weights for policy 0, policy_version 31064 (0.0035) [2024-06-12 16:31:00,853][71000] Updated weights for policy 0, policy_version 31074 (0.0036) [2024-06-12 16:31:00,939][70768] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50540.5). Total num frames: 509116416. Throughput: 0: 50200.9. Samples: 37995040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-12 16:31:00,940][70768] Avg episode reward: [(0, '0.156')] [2024-06-12 16:31:03,923][71000] Updated weights for policy 0, policy_version 31084 (0.0022) [2024-06-12 16:31:05,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.2, 300 sec: 50484.9). Total num frames: 509362176. Throughput: 0: 50061.6. Samples: 38136560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-12 16:31:05,940][70768] Avg episode reward: [(0, '0.156')] [2024-06-12 16:31:07,304][71000] Updated weights for policy 0, policy_version 31094 (0.0032) [2024-06-12 16:31:10,272][71000] Updated weights for policy 0, policy_version 31104 (0.0022) [2024-06-12 16:31:10,940][70768] Fps is (10 sec: 52427.5, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 509640704. Throughput: 0: 49831.4. Samples: 38435840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 16:31:10,940][70768] Avg episode reward: [(0, '0.155')] [2024-06-12 16:31:13,810][71000] Updated weights for policy 0, policy_version 31114 (0.0031) [2024-06-12 16:31:15,940][70768] Fps is (10 sec: 50790.5, 60 sec: 50244.1, 300 sec: 50429.4). Total num frames: 509870080. Throughput: 0: 50122.1. Samples: 38746180. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 16:31:15,940][70768] Avg episode reward: [(0, '0.141')] [2024-06-12 16:31:16,693][71000] Updated weights for policy 0, policy_version 31124 (0.0028) [2024-06-12 16:31:20,377][71000] Updated weights for policy 0, policy_version 31134 (0.0020) [2024-06-12 16:31:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 50244.1, 300 sec: 50484.9). Total num frames: 510115840. Throughput: 0: 49990.0. Samples: 38894780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 16:31:20,940][70768] Avg episode reward: [(0, '0.135')] [2024-06-12 16:31:23,372][71000] Updated weights for policy 0, policy_version 31144 (0.0029) [2024-06-12 16:31:25,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49971.3, 300 sec: 50429.4). Total num frames: 510361600. Throughput: 0: 50212.9. Samples: 39193000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 16:31:25,940][70768] Avg episode reward: [(0, '0.164')] [2024-06-12 16:31:26,967][71000] Updated weights for policy 0, policy_version 31154 (0.0028) [2024-06-12 16:31:29,857][71000] Updated weights for policy 0, policy_version 31164 (0.0030) [2024-06-12 16:31:30,939][70768] Fps is (10 sec: 49153.3, 60 sec: 49698.2, 300 sec: 50318.4). Total num frames: 510607360. Throughput: 0: 50417.4. Samples: 39491640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-12 16:31:30,940][70768] Avg episode reward: [(0, '0.173')] [2024-06-12 16:31:33,408][71000] Updated weights for policy 0, policy_version 31174 (0.0023) [2024-06-12 16:31:35,940][70768] Fps is (10 sec: 54065.3, 60 sec: 50517.1, 300 sec: 50484.9). Total num frames: 510902272. Throughput: 0: 50294.8. Samples: 39647240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-12 16:31:35,941][70768] Avg episode reward: [(0, '0.155')] [2024-06-12 16:31:36,442][71000] Updated weights for policy 0, policy_version 31184 (0.0030) [2024-06-12 16:31:39,697][71000] Updated weights for policy 0, policy_version 31194 (0.0027) [2024-06-12 16:31:40,940][70768] Fps is (10 sec: 50789.8, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 511115264. Throughput: 0: 50248.0. Samples: 39954120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-12 16:31:40,940][70768] Avg episode reward: [(0, '0.161')] [2024-06-12 16:31:41,016][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000031197_511131648.pth... [2024-06-12 16:31:41,055][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000030460_499056640.pth [2024-06-12 16:31:42,727][71000] Updated weights for policy 0, policy_version 31204 (0.0027) [2024-06-12 16:31:45,940][70768] Fps is (10 sec: 47513.7, 60 sec: 50517.1, 300 sec: 50484.9). Total num frames: 511377408. Throughput: 0: 50194.3. Samples: 40253800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-12 16:31:45,940][70768] Avg episode reward: [(0, '0.171')] [2024-06-12 16:31:46,355][71000] Updated weights for policy 0, policy_version 31214 (0.0024) [2024-06-12 16:31:49,709][71000] Updated weights for policy 0, policy_version 31224 (0.0032) [2024-06-12 16:31:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.1, 300 sec: 50429.4). Total num frames: 511623168. Throughput: 0: 50277.9. Samples: 40399060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:31:50,940][70768] Avg episode reward: [(0, '0.154')] [2024-06-12 16:31:52,941][70980] Signal inference workers to stop experience collection... (550 times) [2024-06-12 16:31:52,941][70980] Signal inference workers to resume experience collection... 
(550 times) [2024-06-12 16:31:52,950][71000] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-06-12 16:31:52,963][71000] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-06-12 16:31:53,077][71000] Updated weights for policy 0, policy_version 31234 (0.0032) [2024-06-12 16:31:55,940][70768] Fps is (10 sec: 50791.4, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 511885312. Throughput: 0: 50424.2. Samples: 40704920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:31:55,940][70768] Avg episode reward: [(0, '0.163')] [2024-06-12 16:31:56,053][71000] Updated weights for policy 0, policy_version 31244 (0.0027) [2024-06-12 16:31:59,514][71000] Updated weights for policy 0, policy_version 31254 (0.0025) [2024-06-12 16:32:00,940][70768] Fps is (10 sec: 50790.6, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 512131072. Throughput: 0: 50300.6. Samples: 41009700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:32:00,940][70768] Avg episode reward: [(0, '0.154')] [2024-06-12 16:32:02,547][71000] Updated weights for policy 0, policy_version 31264 (0.0030) [2024-06-12 16:32:05,622][71000] Updated weights for policy 0, policy_version 31274 (0.0025) [2024-06-12 16:32:05,940][70768] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50373.8). Total num frames: 512393216. Throughput: 0: 50513.1. Samples: 41167860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:32:05,940][70768] Avg episode reward: [(0, '0.162')] [2024-06-12 16:32:08,715][71000] Updated weights for policy 0, policy_version 31284 (0.0031) [2024-06-12 16:32:10,940][70768] Fps is (10 sec: 52428.1, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 512655360. Throughput: 0: 50796.2. Samples: 41478840. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 16:32:10,940][70768] Avg episode reward: [(0, '0.140')] [2024-06-12 16:32:12,135][71000] Updated weights for policy 0, policy_version 31294 (0.0029) [2024-06-12 16:32:15,414][71000] Updated weights for policy 0, policy_version 31304 (0.0027) [2024-06-12 16:32:15,939][70768] Fps is (10 sec: 49152.1, 60 sec: 50244.4, 300 sec: 50318.4). Total num frames: 512884736. Throughput: 0: 50620.4. Samples: 41769560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 16:32:15,940][70768] Avg episode reward: [(0, '0.158')] [2024-06-12 16:32:18,966][71000] Updated weights for policy 0, policy_version 31314 (0.0032) [2024-06-12 16:32:20,944][70768] Fps is (10 sec: 50769.2, 60 sec: 50786.9, 300 sec: 50373.1). Total num frames: 513163264. Throughput: 0: 50494.6. Samples: 41919700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 16:32:20,944][70768] Avg episode reward: [(0, '0.176')] [2024-06-12 16:32:21,845][71000] Updated weights for policy 0, policy_version 31324 (0.0032) [2024-06-12 16:32:25,444][71000] Updated weights for policy 0, policy_version 31334 (0.0021) [2024-06-12 16:32:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 513376256. Throughput: 0: 50252.9. Samples: 42215500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 16:32:25,940][70768] Avg episode reward: [(0, '0.159')] [2024-06-12 16:32:28,450][71000] Updated weights for policy 0, policy_version 31344 (0.0023) [2024-06-12 16:32:30,940][70768] Fps is (10 sec: 50812.3, 60 sec: 51063.4, 300 sec: 50429.4). Total num frames: 513671168. Throughput: 0: 50472.7. Samples: 42525060. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 16:32:30,940][70768] Avg episode reward: [(0, '0.131')] [2024-06-12 16:32:31,589][71000] Updated weights for policy 0, policy_version 31354 (0.0033) [2024-06-12 16:32:34,871][71000] Updated weights for policy 0, policy_version 31364 (0.0022) [2024-06-12 16:32:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.3, 300 sec: 50262.8). Total num frames: 513884160. Throughput: 0: 50670.2. Samples: 42679220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 16:32:35,940][70768] Avg episode reward: [(0, '0.160')] [2024-06-12 16:32:38,557][71000] Updated weights for policy 0, policy_version 31374 (0.0026) [2024-06-12 16:32:40,940][70768] Fps is (10 sec: 49151.3, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 514162688. Throughput: 0: 50439.4. Samples: 42974700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 16:32:40,940][70768] Avg episode reward: [(0, '0.156')] [2024-06-12 16:32:41,503][71000] Updated weights for policy 0, policy_version 31384 (0.0027) [2024-06-12 16:32:45,313][71000] Updated weights for policy 0, policy_version 31394 (0.0035) [2024-06-12 16:32:45,940][70768] Fps is (10 sec: 52428.9, 60 sec: 50517.5, 300 sec: 50373.9). Total num frames: 514408448. Throughput: 0: 50326.2. Samples: 43274380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 16:32:45,940][70768] Avg episode reward: [(0, '0.140')] [2024-06-12 16:32:48,516][71000] Updated weights for policy 0, policy_version 31404 (0.0026) [2024-06-12 16:32:50,940][70768] Fps is (10 sec: 49152.6, 60 sec: 50517.3, 300 sec: 50318.4). Total num frames: 514654208. Throughput: 0: 50052.0. Samples: 43420200. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 16:32:50,940][70768] Avg episode reward: [(0, '0.176')] [2024-06-12 16:32:51,571][71000] Updated weights for policy 0, policy_version 31414 (0.0020) [2024-06-12 16:32:54,773][71000] Updated weights for policy 0, policy_version 31424 (0.0027) [2024-06-12 16:32:55,940][70768] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 514899968. Throughput: 0: 49849.0. Samples: 43722040. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 16:32:55,940][70768] Avg episode reward: [(0, '0.151')] [2024-06-12 16:32:58,000][71000] Updated weights for policy 0, policy_version 31434 (0.0022) [2024-06-12 16:33:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 515162112. Throughput: 0: 50104.0. Samples: 44024240. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 16:33:00,940][70768] Avg episode reward: [(0, '0.166')] [2024-06-12 16:33:01,372][71000] Updated weights for policy 0, policy_version 31444 (0.0033) [2024-06-12 16:33:04,545][70980] Signal inference workers to stop experience collection... (600 times) [2024-06-12 16:33:04,545][70980] Signal inference workers to resume experience collection... (600 times) [2024-06-12 16:33:04,588][71000] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-06-12 16:33:04,588][71000] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-06-12 16:33:04,674][71000] Updated weights for policy 0, policy_version 31454 (0.0036) [2024-06-12 16:33:05,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49971.3, 300 sec: 50207.2). Total num frames: 515391488. Throughput: 0: 49970.2. Samples: 44168140. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 16:33:05,940][70768] Avg episode reward: [(0, '0.166')] [2024-06-12 16:33:07,958][71000] Updated weights for policy 0, policy_version 31464 (0.0035) [2024-06-12 16:33:10,944][70768] Fps is (10 sec: 47494.1, 60 sec: 49694.8, 300 sec: 50317.6). Total num frames: 515637248. Throughput: 0: 50011.0. Samples: 44466200. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 16:33:10,944][70768] Avg episode reward: [(0, '0.153')] [2024-06-12 16:33:11,331][71000] Updated weights for policy 0, policy_version 31474 (0.0027) [2024-06-12 16:33:14,778][71000] Updated weights for policy 0, policy_version 31484 (0.0030) [2024-06-12 16:33:15,939][70768] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 515899392. Throughput: 0: 49965.4. Samples: 44773500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:33:15,940][70768] Avg episode reward: [(0, '0.176')] [2024-06-12 16:33:17,642][71000] Updated weights for policy 0, policy_version 31494 (0.0025) [2024-06-12 16:33:20,940][70768] Fps is (10 sec: 50811.3, 60 sec: 49701.7, 300 sec: 50151.7). Total num frames: 516145152. Throughput: 0: 49922.3. Samples: 44925720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:33:20,940][70768] Avg episode reward: [(0, '0.155')] [2024-06-12 16:33:20,987][71000] Updated weights for policy 0, policy_version 31504 (0.0027) [2024-06-12 16:33:24,082][71000] Updated weights for policy 0, policy_version 31514 (0.0035) [2024-06-12 16:33:25,940][70768] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 516407296. Throughput: 0: 50071.3. Samples: 45227900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:33:25,940][70768] Avg episode reward: [(0, '0.158')] [2024-06-12 16:33:27,483][71000] Updated weights for policy 0, policy_version 31524 (0.0025) [2024-06-12 16:33:30,658][71000] Updated weights for policy 0, policy_version 31534 (0.0025) [2024-06-12 16:33:30,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 516669440. Throughput: 0: 50068.5. Samples: 45527460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:33:30,940][70768] Avg episode reward: [(0, '0.169')] [2024-06-12 16:33:33,934][71000] Updated weights for policy 0, policy_version 31544 (0.0027) [2024-06-12 16:33:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 516915200. Throughput: 0: 50380.0. Samples: 45687300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 16:33:35,940][70768] Avg episode reward: [(0, '0.164')] [2024-06-12 16:33:36,951][71000] Updated weights for policy 0, policy_version 31554 (0.0032) [2024-06-12 16:33:40,687][71000] Updated weights for policy 0, policy_version 31564 (0.0029) [2024-06-12 16:33:40,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 517160960. Throughput: 0: 50424.2. Samples: 45991140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 16:33:40,940][70768] Avg episode reward: [(0, '0.144')] [2024-06-12 16:33:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000031565_517160960.pth... [2024-06-12 16:33:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000030830_505118720.pth [2024-06-12 16:33:43,366][71000] Updated weights for policy 0, policy_version 31574 (0.0026) [2024-06-12 16:33:45,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 517406720. Throughput: 0: 50541.4. Samples: 46298600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 16:33:45,940][70768] Avg episode reward: [(0, '0.157')] [2024-06-12 16:33:46,730][71000] Updated weights for policy 0, policy_version 31584 (0.0024) [2024-06-12 16:33:49,681][71000] Updated weights for policy 0, policy_version 31594 (0.0029) [2024-06-12 16:33:50,940][70768] Fps is (10 sec: 52429.7, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 517685248. Throughput: 0: 50602.6. Samples: 46445260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 16:33:50,940][70768] Avg episode reward: [(0, '0.178')] [2024-06-12 16:33:53,571][71000] Updated weights for policy 0, policy_version 31604 (0.0024) [2024-06-12 16:33:55,939][70768] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 517931008. Throughput: 0: 50709.6. Samples: 46747920. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 16:33:55,940][70768] Avg episode reward: [(0, '0.160')] [2024-06-12 16:33:56,476][71000] Updated weights for policy 0, policy_version 31614 (0.0034) [2024-06-12 16:34:00,008][71000] Updated weights for policy 0, policy_version 31624 (0.0029) [2024-06-12 16:34:00,939][70768] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 518176768. Throughput: 0: 50544.4. Samples: 47048000. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 16:34:00,940][70768] Avg episode reward: [(0, '0.159')] [2024-06-12 16:34:02,978][71000] Updated weights for policy 0, policy_version 31634 (0.0030) [2024-06-12 16:34:05,940][70768] Fps is (10 sec: 47513.2, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 518406144. Throughput: 0: 50363.5. Samples: 47192080. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 16:34:05,940][70768] Avg episode reward: [(0, '0.161')] [2024-06-12 16:34:06,550][71000] Updated weights for policy 0, policy_version 31644 (0.0029) [2024-06-12 16:34:06,994][70980] Signal inference workers to stop experience collection... 
(650 times) [2024-06-12 16:34:07,039][71000] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-06-12 16:34:07,048][70980] Signal inference workers to resume experience collection... (650 times) [2024-06-12 16:34:07,050][71000] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-06-12 16:34:09,200][71000] Updated weights for policy 0, policy_version 31654 (0.0026) [2024-06-12 16:34:10,940][70768] Fps is (10 sec: 50789.8, 60 sec: 50793.8, 300 sec: 50318.3). Total num frames: 518684672. Throughput: 0: 50405.3. Samples: 47496140. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 16:34:10,940][70768] Avg episode reward: [(0, '0.174')] [2024-06-12 16:34:13,011][71000] Updated weights for policy 0, policy_version 31664 (0.0028) [2024-06-12 16:34:15,939][70768] Fps is (10 sec: 52429.3, 60 sec: 50517.3, 300 sec: 50207.3). Total num frames: 518930432. Throughput: 0: 50501.8. Samples: 47800040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 16:34:15,940][70768] Avg episode reward: [(0, '0.162')] [2024-06-12 16:34:16,011][71000] Updated weights for policy 0, policy_version 31674 (0.0028) [2024-06-12 16:34:19,665][71000] Updated weights for policy 0, policy_version 31684 (0.0031) [2024-06-12 16:34:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50207.2). Total num frames: 519192576. Throughput: 0: 50431.5. Samples: 47956720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 16:34:20,940][70768] Avg episode reward: [(0, '0.155')] [2024-06-12 16:34:22,768][71000] Updated weights for policy 0, policy_version 31694 (0.0031) [2024-06-12 16:34:25,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 519405568. Throughput: 0: 50347.7. Samples: 48256780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 16:34:25,940][70768] Avg episode reward: [(0, '0.179')] [2024-06-12 16:34:25,946][70980] Saving new best policy, reward=0.179! 
[2024-06-12 16:34:26,151][71000] Updated weights for policy 0, policy_version 31704 (0.0024) [2024-06-12 16:34:29,223][71000] Updated weights for policy 0, policy_version 31714 (0.0025) [2024-06-12 16:34:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49971.1, 300 sec: 50207.2). Total num frames: 519667712. Throughput: 0: 50024.7. Samples: 48549720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 16:34:30,940][70768] Avg episode reward: [(0, '0.166')] [2024-06-12 16:34:32,649][71000] Updated weights for policy 0, policy_version 31724 (0.0022) [2024-06-12 16:34:35,384][71000] Updated weights for policy 0, policy_version 31734 (0.0023) [2024-06-12 16:34:35,940][70768] Fps is (10 sec: 55706.3, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 519962624. Throughput: 0: 50156.9. Samples: 48702320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-12 16:34:35,940][70768] Avg episode reward: [(0, '0.173')] [2024-06-12 16:34:39,131][71000] Updated weights for policy 0, policy_version 31744 (0.0031) [2024-06-12 16:34:40,940][70768] Fps is (10 sec: 50790.1, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 520175616. Throughput: 0: 50384.2. Samples: 49015220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-12 16:34:40,940][70768] Avg episode reward: [(0, '0.166')] [2024-06-12 16:34:41,840][71000] Updated weights for policy 0, policy_version 31754 (0.0031) [2024-06-12 16:34:45,781][71000] Updated weights for policy 0, policy_version 31764 (0.0028) [2024-06-12 16:34:45,939][70768] Fps is (10 sec: 47513.8, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 520437760. Throughput: 0: 50408.4. Samples: 49316380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-12 16:34:45,940][70768] Avg episode reward: [(0, '0.172')] [2024-06-12 16:34:48,513][71000] Updated weights for policy 0, policy_version 31774 (0.0026) [2024-06-12 16:34:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.0, 300 sec: 50207.2). Total num frames: 520667136. 
Throughput: 0: 50257.2. Samples: 49453660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-12 16:34:50,940][70768] Avg episode reward: [(0, '0.145')] [2024-06-12 16:34:52,306][71000] Updated weights for policy 0, policy_version 31784 (0.0023) [2024-06-12 16:34:52,722][70980] Signal inference workers to stop experience collection... (700 times) [2024-06-12 16:34:52,724][70980] Signal inference workers to resume experience collection... (700 times) [2024-06-12 16:34:52,736][71000] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-06-12 16:34:52,736][71000] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-06-12 16:34:55,028][71000] Updated weights for policy 0, policy_version 31794 (0.0024) [2024-06-12 16:34:55,940][70768] Fps is (10 sec: 52428.0, 60 sec: 50517.2, 300 sec: 50373.8). Total num frames: 520962048. Throughput: 0: 50222.2. Samples: 49756140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:34:55,940][70768] Avg episode reward: [(0, '0.177')] [2024-06-12 16:34:58,681][71000] Updated weights for policy 0, policy_version 31804 (0.0026) [2024-06-12 16:35:00,940][70768] Fps is (10 sec: 52429.5, 60 sec: 50244.2, 300 sec: 50207.3). Total num frames: 521191424. Throughput: 0: 50238.6. Samples: 50060780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:35:00,940][70768] Avg episode reward: [(0, '0.157')] [2024-06-12 16:35:01,480][71000] Updated weights for policy 0, policy_version 31814 (0.0031) [2024-06-12 16:35:05,479][71000] Updated weights for policy 0, policy_version 31824 (0.0032) [2024-06-12 16:35:05,940][70768] Fps is (10 sec: 47513.8, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 521437184. Throughput: 0: 50242.7. Samples: 50217640. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:35:05,940][70768] Avg episode reward: [(0, '0.148')] [2024-06-12 16:35:08,042][71000] Updated weights for policy 0, policy_version 31834 (0.0028) [2024-06-12 16:35:10,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 521666560. Throughput: 0: 49967.2. Samples: 50505300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:35:10,940][70768] Avg episode reward: [(0, '0.162')] [2024-06-12 16:35:11,792][71000] Updated weights for policy 0, policy_version 31844 (0.0039) [2024-06-12 16:35:14,823][71000] Updated weights for policy 0, policy_version 31854 (0.0031) [2024-06-12 16:35:15,939][70768] Fps is (10 sec: 50790.9, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 521945088. Throughput: 0: 50192.7. Samples: 50808380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 16:35:15,940][70768] Avg episode reward: [(0, '0.156')] [2024-06-12 16:35:18,315][71000] Updated weights for policy 0, policy_version 31864 (0.0023) [2024-06-12 16:35:20,940][70768] Fps is (10 sec: 54066.5, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 522207232. Throughput: 0: 50383.3. Samples: 50969580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 16:35:20,940][70768] Avg episode reward: [(0, '0.166')] [2024-06-12 16:35:21,289][71000] Updated weights for policy 0, policy_version 31874 (0.0039) [2024-06-12 16:35:24,450][71000] Updated weights for policy 0, policy_version 31884 (0.0025) [2024-06-12 16:35:25,940][70768] Fps is (10 sec: 52428.6, 60 sec: 51063.6, 300 sec: 50318.3). Total num frames: 522469376. Throughput: 0: 50304.2. Samples: 51278900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 16:35:25,940][70768] Avg episode reward: [(0, '0.172')] [2024-06-12 16:35:27,451][71000] Updated weights for policy 0, policy_version 31894 (0.0029) [2024-06-12 16:35:30,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49971.3, 300 sec: 50151.7). 
Total num frames: 522665984. Throughput: 0: 50326.6. Samples: 51581080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 16:35:30,940][70768] Avg episode reward: [(0, '0.169')] [2024-06-12 16:35:31,400][71000] Updated weights for policy 0, policy_version 31904 (0.0036) [2024-06-12 16:35:34,330][71000] Updated weights for policy 0, policy_version 31914 (0.0037) [2024-06-12 16:35:35,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 522960896. Throughput: 0: 50220.0. Samples: 51713560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 16:35:35,940][70768] Avg episode reward: [(0, '0.171')] [2024-06-12 16:35:37,993][71000] Updated weights for policy 0, policy_version 31924 (0.0041) [2024-06-12 16:35:40,939][70768] Fps is (10 sec: 52429.2, 60 sec: 50244.5, 300 sec: 50318.3). Total num frames: 523190272. Throughput: 0: 50001.9. Samples: 52006220. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-12 16:35:40,940][70768] Avg episode reward: [(0, '0.168')] [2024-06-12 16:35:41,013][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000031934_523206656.pth... [2024-06-12 16:35:41,013][71000] Updated weights for policy 0, policy_version 31934 (0.0032) [2024-06-12 16:35:41,053][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000031197_511131648.pth [2024-06-12 16:35:44,638][71000] Updated weights for policy 0, policy_version 31944 (0.0025) [2024-06-12 16:35:45,940][70768] Fps is (10 sec: 50790.0, 60 sec: 50517.1, 300 sec: 50262.7). Total num frames: 523468800. Throughput: 0: 50086.4. Samples: 52314680. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-12 16:35:45,940][70768] Avg episode reward: [(0, '0.185')] [2024-06-12 16:35:45,941][70980] Saving new best policy, reward=0.185! 
[2024-06-12 16:35:47,303][71000] Updated weights for policy 0, policy_version 31954 (0.0029) [2024-06-12 16:35:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 523681792. Throughput: 0: 50018.2. Samples: 52468460. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-12 16:35:50,940][70768] Avg episode reward: [(0, '0.182')] [2024-06-12 16:35:51,015][71000] Updated weights for policy 0, policy_version 31964 (0.0022) [2024-06-12 16:35:52,369][70980] Signal inference workers to stop experience collection... (750 times) [2024-06-12 16:35:52,390][71000] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-06-12 16:35:52,477][70980] Signal inference workers to resume experience collection... (750 times) [2024-06-12 16:35:52,477][71000] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-06-12 16:35:53,717][71000] Updated weights for policy 0, policy_version 31974 (0.0025) [2024-06-12 16:35:55,940][70768] Fps is (10 sec: 47514.6, 60 sec: 49698.2, 300 sec: 50262.8). Total num frames: 523943936. Throughput: 0: 50150.7. Samples: 52762080. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-12 16:35:55,940][70768] Avg episode reward: [(0, '0.181')] [2024-06-12 16:35:57,797][71000] Updated weights for policy 0, policy_version 31984 (0.0033) [2024-06-12 16:36:00,643][71000] Updated weights for policy 0, policy_version 31994 (0.0027) [2024-06-12 16:36:00,939][70768] Fps is (10 sec: 52429.3, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 524206080. Throughput: 0: 49991.1. Samples: 53057980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-12 16:36:00,940][70768] Avg episode reward: [(0, '0.166')] [2024-06-12 16:36:04,275][71000] Updated weights for policy 0, policy_version 32004 (0.0036) [2024-06-12 16:36:05,942][70768] Fps is (10 sec: 52415.1, 60 sec: 50515.2, 300 sec: 50262.4). Total num frames: 524468224. Throughput: 0: 50008.9. Samples: 53220100. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-12 16:36:05,943][70768] Avg episode reward: [(0, '0.179')] [2024-06-12 16:36:07,018][71000] Updated weights for policy 0, policy_version 32014 (0.0044) [2024-06-12 16:36:10,939][70768] Fps is (10 sec: 45875.2, 60 sec: 49971.3, 300 sec: 50151.7). Total num frames: 524664832. Throughput: 0: 49674.7. Samples: 53514260. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-12 16:36:10,940][70768] Avg episode reward: [(0, '0.170')] [2024-06-12 16:36:11,054][71000] Updated weights for policy 0, policy_version 32024 (0.0027) [2024-06-12 16:36:13,446][71000] Updated weights for policy 0, policy_version 32034 (0.0035) [2024-06-12 16:36:15,940][70768] Fps is (10 sec: 47525.1, 60 sec: 49971.0, 300 sec: 50262.8). Total num frames: 524943360. Throughput: 0: 49680.3. Samples: 53816700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-12 16:36:15,940][70768] Avg episode reward: [(0, '0.200')] [2024-06-12 16:36:16,085][70980] Saving new best policy, reward=0.200! [2024-06-12 16:36:17,541][71000] Updated weights for policy 0, policy_version 32044 (0.0033) [2024-06-12 16:36:19,885][71000] Updated weights for policy 0, policy_version 32054 (0.0021) [2024-06-12 16:36:20,939][70768] Fps is (10 sec: 54067.4, 60 sec: 49971.4, 300 sec: 50318.3). Total num frames: 525205504. Throughput: 0: 50175.3. Samples: 53971440. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-12 16:36:20,940][70768] Avg episode reward: [(0, '0.175')] [2024-06-12 16:36:24,158][71000] Updated weights for policy 0, policy_version 32064 (0.0028) [2024-06-12 16:36:25,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 525467648. Throughput: 0: 50325.1. Samples: 54270860. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-12 16:36:25,940][70768] Avg episode reward: [(0, '0.178')] [2024-06-12 16:36:26,542][71000] Updated weights for policy 0, policy_version 32074 (0.0030) [2024-06-12 16:36:30,561][71000] Updated weights for policy 0, policy_version 32084 (0.0036) [2024-06-12 16:36:30,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49971.2, 300 sec: 50040.7). Total num frames: 525664256. Throughput: 0: 50101.1. Samples: 54569220. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-12 16:36:30,940][70768] Avg episode reward: [(0, '0.195')] [2024-06-12 16:36:33,302][71000] Updated weights for policy 0, policy_version 32094 (0.0028) [2024-06-12 16:36:35,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49698.2, 300 sec: 50262.8). Total num frames: 525942784. Throughput: 0: 49928.4. Samples: 54715240. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-12 16:36:35,940][70768] Avg episode reward: [(0, '0.177')] [2024-06-12 16:36:36,831][71000] Updated weights for policy 0, policy_version 32104 (0.0025) [2024-06-12 16:36:39,599][71000] Updated weights for policy 0, policy_version 32114 (0.0026) [2024-06-12 16:36:40,940][70768] Fps is (10 sec: 54067.0, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 526204928. Throughput: 0: 50179.0. Samples: 55020140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 16:36:40,940][70768] Avg episode reward: [(0, '0.169')] [2024-06-12 16:36:43,569][71000] Updated weights for policy 0, policy_version 32124 (0.0031) [2024-06-12 16:36:44,203][70980] Signal inference workers to stop experience collection... (800 times) [2024-06-12 16:36:44,246][71000] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-06-12 16:36:44,255][70980] Signal inference workers to resume experience collection... 
(800 times) [2024-06-12 16:36:44,261][71000] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-06-12 16:36:45,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49971.4, 300 sec: 50318.3). Total num frames: 526467072. Throughput: 0: 50122.2. Samples: 55313480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 16:36:45,940][70768] Avg episode reward: [(0, '0.167')] [2024-06-12 16:36:46,171][71000] Updated weights for policy 0, policy_version 32134 (0.0029) [2024-06-12 16:36:50,260][71000] Updated weights for policy 0, policy_version 32144 (0.0036) [2024-06-12 16:36:50,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 526680064. Throughput: 0: 49885.5. Samples: 55464820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 16:36:50,940][70768] Avg episode reward: [(0, '0.154')] [2024-06-12 16:36:52,574][71000] Updated weights for policy 0, policy_version 32154 (0.0026) [2024-06-12 16:36:55,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 526942208. Throughput: 0: 50338.6. Samples: 55779500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 16:36:55,940][70768] Avg episode reward: [(0, '0.161')] [2024-06-12 16:36:56,557][71000] Updated weights for policy 0, policy_version 32164 (0.0031) [2024-06-12 16:36:59,414][71000] Updated weights for policy 0, policy_version 32174 (0.0026) [2024-06-12 16:37:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 50151.7). Total num frames: 527187968. Throughput: 0: 50131.3. Samples: 56072600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 16:37:00,940][70768] Avg episode reward: [(0, '0.195')] [2024-06-12 16:37:02,813][71000] Updated weights for policy 0, policy_version 32184 (0.0027) [2024-06-12 16:37:05,616][71000] Updated weights for policy 0, policy_version 32194 (0.0029) [2024-06-12 16:37:05,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49973.3, 300 sec: 50207.2). 
Total num frames: 527466496. Throughput: 0: 50170.5. Samples: 56229120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 16:37:05,940][70768] Avg episode reward: [(0, '0.179')] [2024-06-12 16:37:09,680][71000] Updated weights for policy 0, policy_version 32204 (0.0034) [2024-06-12 16:37:10,940][70768] Fps is (10 sec: 52429.0, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 527712256. Throughput: 0: 50326.8. Samples: 56535560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 16:37:10,940][70768] Avg episode reward: [(0, '0.165')] [2024-06-12 16:37:12,027][71000] Updated weights for policy 0, policy_version 32214 (0.0018) [2024-06-12 16:37:15,939][70768] Fps is (10 sec: 45875.7, 60 sec: 49698.3, 300 sec: 50041.4). Total num frames: 527925248. Throughput: 0: 50385.4. Samples: 56836560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 16:37:15,940][70768] Avg episode reward: [(0, '0.177')] [2024-06-12 16:37:16,180][71000] Updated weights for policy 0, policy_version 32224 (0.0025) [2024-06-12 16:37:18,787][71000] Updated weights for policy 0, policy_version 32234 (0.0031) [2024-06-12 16:37:20,939][70768] Fps is (10 sec: 50790.4, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 528220160. Throughput: 0: 50295.6. Samples: 56978540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 16:37:20,940][70768] Avg episode reward: [(0, '0.182')] [2024-06-12 16:37:22,337][71000] Updated weights for policy 0, policy_version 32244 (0.0027) [2024-06-12 16:37:25,264][71000] Updated weights for policy 0, policy_version 32254 (0.0046) [2024-06-12 16:37:25,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.2, 300 sec: 50096.2). Total num frames: 528449536. Throughput: 0: 50368.9. Samples: 57286740. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 16:37:25,940][70768] Avg episode reward: [(0, '0.181')] [2024-06-12 16:37:28,667][71000] Updated weights for policy 0, policy_version 32264 (0.0028) [2024-06-12 16:37:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50262.8). Total num frames: 528711680. Throughput: 0: 50579.0. Samples: 57589540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 16:37:30,940][70768] Avg episode reward: [(0, '0.169')] [2024-06-12 16:37:31,600][71000] Updated weights for policy 0, policy_version 32274 (0.0043) [2024-06-12 16:37:35,670][71000] Updated weights for policy 0, policy_version 32284 (0.0034) [2024-06-12 16:37:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 528957440. Throughput: 0: 50367.5. Samples: 57731360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 16:37:35,940][70768] Avg episode reward: [(0, '0.170')] [2024-06-12 16:37:38,607][71000] Updated weights for policy 0, policy_version 32294 (0.0027) [2024-06-12 16:37:40,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 529203200. Throughput: 0: 50126.6. Samples: 58035200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 16:37:40,940][70768] Avg episode reward: [(0, '0.179')] [2024-06-12 16:37:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000032300_529203200.pth... [2024-06-12 16:37:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000031565_517160960.pth [2024-06-12 16:37:42,130][71000] Updated weights for policy 0, policy_version 32304 (0.0027) [2024-06-12 16:37:43,282][70980] Signal inference workers to stop experience collection... (850 times) [2024-06-12 16:37:43,282][70980] Signal inference workers to resume experience collection... 
(850 times) [2024-06-12 16:37:43,303][71000] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-06-12 16:37:43,303][71000] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-06-12 16:37:44,909][71000] Updated weights for policy 0, policy_version 32314 (0.0025) [2024-06-12 16:37:45,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49971.1, 300 sec: 50207.2). Total num frames: 529465344. Throughput: 0: 50292.4. Samples: 58335760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 16:37:45,940][70768] Avg episode reward: [(0, '0.177')] [2024-06-12 16:37:48,366][71000] Updated weights for policy 0, policy_version 32324 (0.0027) [2024-06-12 16:37:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 529711104. Throughput: 0: 50189.9. Samples: 58487660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 16:37:50,940][70768] Avg episode reward: [(0, '0.174')] [2024-06-12 16:37:51,721][71000] Updated weights for policy 0, policy_version 32334 (0.0032) [2024-06-12 16:37:54,851][71000] Updated weights for policy 0, policy_version 32344 (0.0033) [2024-06-12 16:37:55,940][70768] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 529956864. Throughput: 0: 50102.2. Samples: 58790160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 16:37:55,940][70768] Avg episode reward: [(0, '0.162')] [2024-06-12 16:37:58,253][71000] Updated weights for policy 0, policy_version 32354 (0.0030) [2024-06-12 16:38:00,940][70768] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 530202624. Throughput: 0: 49739.5. Samples: 59074840. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 16:38:00,940][70768] Avg episode reward: [(0, '0.172')] [2024-06-12 16:38:01,720][71000] Updated weights for policy 0, policy_version 32364 (0.0032) [2024-06-12 16:38:04,930][71000] Updated weights for policy 0, policy_version 32374 (0.0030) [2024-06-12 16:38:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.1, 300 sec: 50207.9). Total num frames: 530448384. Throughput: 0: 50019.4. Samples: 59229420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 16:38:05,940][70768] Avg episode reward: [(0, '0.173')] [2024-06-12 16:38:08,172][71000] Updated weights for policy 0, policy_version 32384 (0.0037) [2024-06-12 16:38:10,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 530710528. Throughput: 0: 49680.9. Samples: 59522380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 16:38:10,940][70768] Avg episode reward: [(0, '0.171')] [2024-06-12 16:38:11,611][71000] Updated weights for policy 0, policy_version 32394 (0.0038) [2024-06-12 16:38:14,774][71000] Updated weights for policy 0, policy_version 32404 (0.0025) [2024-06-12 16:38:15,939][70768] Fps is (10 sec: 49152.9, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 530939904. Throughput: 0: 49441.5. Samples: 59814400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 16:38:15,940][70768] Avg episode reward: [(0, '0.168')] [2024-06-12 16:38:18,210][71000] Updated weights for policy 0, policy_version 32414 (0.0035) [2024-06-12 16:38:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 50151.7). Total num frames: 531202048. Throughput: 0: 49601.8. Samples: 59963440. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 16:38:20,940][70768] Avg episode reward: [(0, '0.157')] [2024-06-12 16:38:21,180][71000] Updated weights for policy 0, policy_version 32424 (0.0022) [2024-06-12 16:38:25,021][71000] Updated weights for policy 0, policy_version 32434 (0.0040) [2024-06-12 16:38:25,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 531447808. Throughput: 0: 49350.3. Samples: 60255960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-12 16:38:25,949][70768] Avg episode reward: [(0, '0.183')] [2024-06-12 16:38:28,252][71000] Updated weights for policy 0, policy_version 32444 (0.0035) [2024-06-12 16:38:30,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.2, 300 sec: 50040.6). Total num frames: 531677184. Throughput: 0: 49281.4. Samples: 60553420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-12 16:38:30,940][70768] Avg episode reward: [(0, '0.179')] [2024-06-12 16:38:31,499][71000] Updated weights for policy 0, policy_version 32454 (0.0029) [2024-06-12 16:38:34,617][71000] Updated weights for policy 0, policy_version 32464 (0.0024) [2024-06-12 16:38:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 50040.7). Total num frames: 531922944. Throughput: 0: 49268.0. Samples: 60704720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-12 16:38:35,940][70768] Avg episode reward: [(0, '0.174')] [2024-06-12 16:38:38,081][71000] Updated weights for policy 0, policy_version 32474 (0.0028) [2024-06-12 16:38:40,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.1, 300 sec: 50096.1). Total num frames: 532185088. Throughput: 0: 49118.2. Samples: 61000480. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-12 16:38:40,940][70768] Avg episode reward: [(0, '0.182')] [2024-06-12 16:38:41,098][71000] Updated weights for policy 0, policy_version 32484 (0.0028) [2024-06-12 16:38:44,788][71000] Updated weights for policy 0, policy_version 32494 (0.0029) [2024-06-12 16:38:45,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49424.9, 300 sec: 49985.0). Total num frames: 532430848. Throughput: 0: 49470.9. Samples: 61301040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 16:38:45,940][70768] Avg episode reward: [(0, '0.174')] [2024-06-12 16:38:47,211][70980] Signal inference workers to stop experience collection... (900 times) [2024-06-12 16:38:47,212][70980] Signal inference workers to resume experience collection... (900 times) [2024-06-12 16:38:47,239][71000] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-06-12 16:38:47,239][71000] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-06-12 16:38:47,674][71000] Updated weights for policy 0, policy_version 32504 (0.0033) [2024-06-12 16:38:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49985.1). Total num frames: 532676608. Throughput: 0: 49371.2. Samples: 61451120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 16:38:50,940][70768] Avg episode reward: [(0, '0.164')] [2024-06-12 16:38:51,367][71000] Updated weights for policy 0, policy_version 32514 (0.0037) [2024-06-12 16:38:54,345][71000] Updated weights for policy 0, policy_version 32524 (0.0033) [2024-06-12 16:38:55,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 532938752. Throughput: 0: 49388.4. Samples: 61744860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 16:38:55,940][70768] Avg episode reward: [(0, '0.173')] [2024-06-12 16:38:57,880][71000] Updated weights for policy 0, policy_version 32534 (0.0035) [2024-06-12 16:39:00,892][71000] Updated weights for policy 0, policy_version 32544 (0.0029) [2024-06-12 16:39:00,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 533200896. Throughput: 0: 49625.6. Samples: 62047560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 16:39:00,940][70768] Avg episode reward: [(0, '0.171')] [2024-06-12 16:39:04,587][71000] Updated weights for policy 0, policy_version 32554 (0.0033) [2024-06-12 16:39:05,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.2, 300 sec: 49929.6). Total num frames: 533413888. Throughput: 0: 49679.6. Samples: 62199020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 16:39:05,940][70768] Avg episode reward: [(0, '0.178')] [2024-06-12 16:39:07,583][71000] Updated weights for policy 0, policy_version 32564 (0.0028) [2024-06-12 16:39:10,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.1, 300 sec: 49985.1). Total num frames: 533676032. Throughput: 0: 49817.4. Samples: 62497740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 16:39:10,940][70768] Avg episode reward: [(0, '0.188')] [2024-06-12 16:39:10,967][71000] Updated weights for policy 0, policy_version 32574 (0.0028) [2024-06-12 16:39:14,125][71000] Updated weights for policy 0, policy_version 32584 (0.0037) [2024-06-12 16:39:15,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.1, 300 sec: 49929.6). Total num frames: 533921792. Throughput: 0: 49713.8. Samples: 62790540. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 16:39:15,940][70768] Avg episode reward: [(0, '0.166')] [2024-06-12 16:39:17,831][71000] Updated weights for policy 0, policy_version 32594 (0.0025) [2024-06-12 16:39:20,652][71000] Updated weights for policy 0, policy_version 32604 (0.0022) [2024-06-12 16:39:20,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.1, 300 sec: 50096.2). Total num frames: 534183936. Throughput: 0: 49795.5. Samples: 62945520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 16:39:20,940][70768] Avg episode reward: [(0, '0.183')] [2024-06-12 16:39:24,344][71000] Updated weights for policy 0, policy_version 32614 (0.0030) [2024-06-12 16:39:25,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 534429696. Throughput: 0: 49967.6. Samples: 63249020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 16:39:25,940][70768] Avg episode reward: [(0, '0.187')] [2024-06-12 16:39:27,207][71000] Updated weights for policy 0, policy_version 32624 (0.0023) [2024-06-12 16:39:30,629][71000] Updated weights for policy 0, policy_version 32634 (0.0023) [2024-06-12 16:39:30,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 49874.0). Total num frames: 534675456. Throughput: 0: 49826.9. Samples: 63543240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-12 16:39:30,940][70768] Avg episode reward: [(0, '0.188')] [2024-06-12 16:39:33,966][71000] Updated weights for policy 0, policy_version 32644 (0.0025) [2024-06-12 16:39:35,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 534921216. Throughput: 0: 50038.3. Samples: 63702840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-12 16:39:35,940][70768] Avg episode reward: [(0, '0.154')] [2024-06-12 16:39:37,235][71000] Updated weights for policy 0, policy_version 32654 (0.0031) [2024-06-12 16:39:40,304][71000] Updated weights for policy 0, policy_version 32664 (0.0022) [2024-06-12 16:39:40,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49971.1, 300 sec: 49985.0). Total num frames: 535183360. Throughput: 0: 50047.0. Samples: 63996980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-12 16:39:40,940][70768] Avg episode reward: [(0, '0.200')] [2024-06-12 16:39:40,954][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000032665_535183360.pth... [2024-06-12 16:39:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000031934_523206656.pth [2024-06-12 16:39:43,930][71000] Updated weights for policy 0, policy_version 32674 (0.0026) [2024-06-12 16:39:45,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49971.3, 300 sec: 50040.6). Total num frames: 535429120. Throughput: 0: 49913.7. Samples: 64293680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-12 16:39:45,940][70768] Avg episode reward: [(0, '0.198')] [2024-06-12 16:39:46,797][71000] Updated weights for policy 0, policy_version 32684 (0.0031) [2024-06-12 16:39:50,396][71000] Updated weights for policy 0, policy_version 32694 (0.0027) [2024-06-12 16:39:50,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49971.2, 300 sec: 49874.0). Total num frames: 535674880. Throughput: 0: 49808.5. Samples: 64440400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 16:39:50,940][70768] Avg episode reward: [(0, '0.169')] [2024-06-12 16:39:53,501][71000] Updated weights for policy 0, policy_version 32704 (0.0022) [2024-06-12 16:39:55,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.1, 300 sec: 49874.0). Total num frames: 535904256. Throughput: 0: 49773.7. Samples: 64737560. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 16:39:55,940][70768] Avg episode reward: [(0, '0.165')] [2024-06-12 16:39:56,974][71000] Updated weights for policy 0, policy_version 32714 (0.0029) [2024-06-12 16:40:00,412][71000] Updated weights for policy 0, policy_version 32724 (0.0029) [2024-06-12 16:40:00,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 536182784. Throughput: 0: 50165.6. Samples: 65048000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 16:40:00,940][70768] Avg episode reward: [(0, '0.196')] [2024-06-12 16:40:03,190][70980] Signal inference workers to stop experience collection... (950 times) [2024-06-12 16:40:03,190][70980] Signal inference workers to resume experience collection... (950 times) [2024-06-12 16:40:03,206][71000] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-06-12 16:40:03,206][71000] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-06-12 16:40:03,325][71000] Updated weights for policy 0, policy_version 32734 (0.0036) [2024-06-12 16:40:05,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 536412160. Throughput: 0: 49917.4. Samples: 65191800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 16:40:05,940][70768] Avg episode reward: [(0, '0.191')] [2024-06-12 16:40:06,803][71000] Updated weights for policy 0, policy_version 32744 (0.0025) [2024-06-12 16:40:10,390][71000] Updated weights for policy 0, policy_version 32754 (0.0040) [2024-06-12 16:40:10,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49698.1, 300 sec: 49874.0). Total num frames: 536657920. Throughput: 0: 49851.6. Samples: 65492340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 16:40:10,940][70768] Avg episode reward: [(0, '0.180')] [2024-06-12 16:40:13,610][71000] Updated weights for policy 0, policy_version 32764 (0.0025) [2024-06-12 16:40:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49818.5). Total num frames: 536903680. Throughput: 0: 49732.4. Samples: 65781200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 16:40:15,940][70768] Avg episode reward: [(0, '0.200')] [2024-06-12 16:40:16,970][71000] Updated weights for policy 0, policy_version 32774 (0.0042) [2024-06-12 16:40:20,198][71000] Updated weights for policy 0, policy_version 32784 (0.0035) [2024-06-12 16:40:20,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49818.5). Total num frames: 537165824. Throughput: 0: 49479.0. Samples: 65929400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 16:40:20,940][70768] Avg episode reward: [(0, '0.185')] [2024-06-12 16:40:23,299][71000] Updated weights for policy 0, policy_version 32794 (0.0025) [2024-06-12 16:40:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49929.6). Total num frames: 537395200. Throughput: 0: 49724.2. Samples: 66234560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 16:40:25,940][70768] Avg episode reward: [(0, '0.176')] [2024-06-12 16:40:26,939][71000] Updated weights for policy 0, policy_version 32804 (0.0027) [2024-06-12 16:40:29,958][71000] Updated weights for policy 0, policy_version 32814 (0.0025) [2024-06-12 16:40:30,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49425.0, 300 sec: 49762.9). Total num frames: 537640960. Throughput: 0: 49754.2. Samples: 66532620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 16:40:30,940][70768] Avg episode reward: [(0, '0.184')] [2024-06-12 16:40:33,367][71000] Updated weights for policy 0, policy_version 32824 (0.0023) [2024-06-12 16:40:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49818.5). 
Total num frames: 537886720. Throughput: 0: 49863.1. Samples: 66684240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 16:40:35,940][70768] Avg episode reward: [(0, '0.187')] [2024-06-12 16:40:36,642][71000] Updated weights for policy 0, policy_version 32834 (0.0037) [2024-06-12 16:40:39,764][71000] Updated weights for policy 0, policy_version 32844 (0.0021) [2024-06-12 16:40:40,939][70768] Fps is (10 sec: 55706.4, 60 sec: 50244.4, 300 sec: 49929.6). Total num frames: 538198016. Throughput: 0: 50013.3. Samples: 66988160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 16:40:40,940][70768] Avg episode reward: [(0, '0.183')] [2024-06-12 16:40:42,984][71000] Updated weights for policy 0, policy_version 32854 (0.0034) [2024-06-12 16:40:45,943][70768] Fps is (10 sec: 50772.9, 60 sec: 49422.3, 300 sec: 49873.4). Total num frames: 538394624. Throughput: 0: 49571.9. Samples: 67278900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 16:40:45,943][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 16:40:46,030][70980] Saving new best policy, reward=0.206! [2024-06-12 16:40:46,473][71000] Updated weights for policy 0, policy_version 32864 (0.0033) [2024-06-12 16:40:49,455][71000] Updated weights for policy 0, policy_version 32874 (0.0030) [2024-06-12 16:40:50,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49698.1, 300 sec: 49874.0). Total num frames: 538656768. Throughput: 0: 49697.7. Samples: 67428200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 16:40:50,940][70768] Avg episode reward: [(0, '0.184')] [2024-06-12 16:40:53,062][71000] Updated weights for policy 0, policy_version 32884 (0.0028) [2024-06-12 16:40:55,940][70768] Fps is (10 sec: 52445.9, 60 sec: 50244.1, 300 sec: 49874.0). Total num frames: 538918912. Throughput: 0: 49746.9. Samples: 67730960. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 16:40:55,940][70768] Avg episode reward: [(0, '0.199')] [2024-06-12 16:40:56,258][71000] Updated weights for policy 0, policy_version 32894 (0.0038) [2024-06-12 16:40:59,409][71000] Updated weights for policy 0, policy_version 32904 (0.0034) [2024-06-12 16:41:00,940][70768] Fps is (10 sec: 52427.5, 60 sec: 49971.1, 300 sec: 49874.4). Total num frames: 539181056. Throughput: 0: 49873.5. Samples: 68025520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 16:41:00,941][70768] Avg episode reward: [(0, '0.170')] [2024-06-12 16:41:03,168][71000] Updated weights for policy 0, policy_version 32914 (0.0025) [2024-06-12 16:41:05,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49698.1, 300 sec: 49929.5). Total num frames: 539394048. Throughput: 0: 49935.6. Samples: 68176500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 16:41:05,940][70768] Avg episode reward: [(0, '0.194')] [2024-06-12 16:41:06,379][71000] Updated weights for policy 0, policy_version 32924 (0.0032) [2024-06-12 16:41:09,468][71000] Updated weights for policy 0, policy_version 32934 (0.0023) [2024-06-12 16:41:10,940][70768] Fps is (10 sec: 45876.1, 60 sec: 49698.1, 300 sec: 49818.5). Total num frames: 539639808. Throughput: 0: 49855.9. Samples: 68478080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 16:41:10,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 16:41:13,191][71000] Updated weights for policy 0, policy_version 32944 (0.0032) [2024-06-12 16:41:13,719][70980] Signal inference workers to stop experience collection... (1000 times) [2024-06-12 16:41:13,769][71000] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-06-12 16:41:13,827][70980] Signal inference workers to resume experience collection... 
(1000 times) [2024-06-12 16:41:13,828][71000] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-06-12 16:41:15,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49971.2, 300 sec: 49818.4). Total num frames: 539901952. Throughput: 0: 49593.8. Samples: 68764340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 16:41:15,940][70768] Avg episode reward: [(0, '0.202')] [2024-06-12 16:41:16,311][71000] Updated weights for policy 0, policy_version 32954 (0.0026) [2024-06-12 16:41:19,801][71000] Updated weights for policy 0, policy_version 32964 (0.0025) [2024-06-12 16:41:20,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49971.2, 300 sec: 49818.5). Total num frames: 540164096. Throughput: 0: 49649.7. Samples: 68918480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 16:41:20,940][70768] Avg episode reward: [(0, '0.180')] [2024-06-12 16:41:23,009][71000] Updated weights for policy 0, policy_version 32974 (0.0027) [2024-06-12 16:41:25,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49698.0, 300 sec: 49874.0). Total num frames: 540377088. Throughput: 0: 49539.8. Samples: 69217460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 16:41:25,940][70768] Avg episode reward: [(0, '0.184')] [2024-06-12 16:41:26,366][71000] Updated weights for policy 0, policy_version 32984 (0.0028) [2024-06-12 16:41:29,991][71000] Updated weights for policy 0, policy_version 32994 (0.0026) [2024-06-12 16:41:30,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49698.2, 300 sec: 49762.9). Total num frames: 540622848. Throughput: 0: 49397.1. Samples: 69501600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 16:41:30,944][70768] Avg episode reward: [(0, '0.209')] [2024-06-12 16:41:30,952][70980] Saving new best policy, reward=0.209! [2024-06-12 16:41:33,168][71000] Updated weights for policy 0, policy_version 33004 (0.0025) [2024-06-12 16:41:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.0, 300 sec: 49707.4). Total num frames: 540868608. 
Throughput: 0: 49438.6. Samples: 69652940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 16:41:35,940][70768] Avg episode reward: [(0, '0.202')] [2024-06-12 16:41:36,326][71000] Updated weights for policy 0, policy_version 33014 (0.0031) [2024-06-12 16:41:39,851][71000] Updated weights for policy 0, policy_version 33024 (0.0027) [2024-06-12 16:41:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.8, 300 sec: 49651.8). Total num frames: 541114368. Throughput: 0: 49195.6. Samples: 69944760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 16:41:40,940][70768] Avg episode reward: [(0, '0.177')] [2024-06-12 16:41:41,017][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000033028_541130752.pth... [2024-06-12 16:41:41,075][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000032300_529203200.pth [2024-06-12 16:41:42,898][71000] Updated weights for policy 0, policy_version 33034 (0.0025) [2024-06-12 16:41:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49427.8, 300 sec: 49762.9). Total num frames: 541360128. Throughput: 0: 49229.5. Samples: 70240840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 16:41:45,940][70768] Avg episode reward: [(0, '0.187')] [2024-06-12 16:41:46,761][71000] Updated weights for policy 0, policy_version 33044 (0.0030) [2024-06-12 16:41:49,560][71000] Updated weights for policy 0, policy_version 33054 (0.0031) [2024-06-12 16:41:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49707.4). Total num frames: 541605888. Throughput: 0: 49008.9. Samples: 70381900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 16:41:50,940][70768] Avg episode reward: [(0, '0.189')] [2024-06-12 16:41:53,133][71000] Updated weights for policy 0, policy_version 33064 (0.0024) [2024-06-12 16:41:55,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.1, 300 sec: 49762.9). Total num frames: 541868032. Throughput: 0: 48828.5. Samples: 70675360. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 16:41:55,940][70768] Avg episode reward: [(0, '0.192')] [2024-06-12 16:41:56,246][71000] Updated weights for policy 0, policy_version 33074 (0.0026) [2024-06-12 16:41:59,623][71000] Updated weights for policy 0, policy_version 33084 (0.0026) [2024-06-12 16:42:00,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.2, 300 sec: 49707.4). Total num frames: 542130176. Throughput: 0: 49462.7. Samples: 70990160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 16:42:00,940][70768] Avg episode reward: [(0, '0.174')] [2024-06-12 16:42:02,821][71000] Updated weights for policy 0, policy_version 33094 (0.0033) [2024-06-12 16:42:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49651.8). Total num frames: 542359552. Throughput: 0: 49192.5. Samples: 71132140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 16:42:05,940][70768] Avg episode reward: [(0, '0.166')] [2024-06-12 16:42:06,033][71000] Updated weights for policy 0, policy_version 33104 (0.0031) [2024-06-12 16:42:09,254][71000] Updated weights for policy 0, policy_version 33114 (0.0024) [2024-06-12 16:42:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49762.9). Total num frames: 542605312. Throughput: 0: 49247.6. Samples: 71433600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 16:42:10,940][70768] Avg episode reward: [(0, '0.159')] [2024-06-12 16:42:12,739][71000] Updated weights for policy 0, policy_version 33124 (0.0024) [2024-06-12 16:42:15,845][71000] Updated weights for policy 0, policy_version 33134 (0.0033) [2024-06-12 16:42:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49651.8). Total num frames: 542867456. Throughput: 0: 49521.8. Samples: 71730080. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 16:42:15,940][70768] Avg episode reward: [(0, '0.194')] [2024-06-12 16:42:19,596][71000] Updated weights for policy 0, policy_version 33144 (0.0027) [2024-06-12 16:42:20,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.0, 300 sec: 49762.9). Total num frames: 543129600. Throughput: 0: 49427.1. Samples: 71877160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 16:42:20,940][70768] Avg episode reward: [(0, '0.177')] [2024-06-12 16:42:22,487][71000] Updated weights for policy 0, policy_version 33154 (0.0035) [2024-06-12 16:42:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 543342592. Throughput: 0: 49579.2. Samples: 72175820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 16:42:25,940][70768] Avg episode reward: [(0, '0.181')] [2024-06-12 16:42:26,023][71000] Updated weights for policy 0, policy_version 33164 (0.0027) [2024-06-12 16:42:27,348][70980] Signal inference workers to stop experience collection... (1050 times) [2024-06-12 16:42:27,381][71000] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-06-12 16:42:27,398][70980] Signal inference workers to resume experience collection... (1050 times) [2024-06-12 16:42:27,402][71000] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-06-12 16:42:29,390][71000] Updated weights for policy 0, policy_version 33174 (0.0030) [2024-06-12 16:42:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.0, 300 sec: 49651.8). Total num frames: 543604736. Throughput: 0: 49588.9. Samples: 72472340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 16:42:30,940][70768] Avg episode reward: [(0, '0.187')] [2024-06-12 16:42:32,479][71000] Updated weights for policy 0, policy_version 33184 (0.0028) [2024-06-12 16:42:35,683][71000] Updated weights for policy 0, policy_version 33194 (0.0030) [2024-06-12 16:42:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.3, 300 sec: 49651.9). Total num frames: 543850496. Throughput: 0: 49832.5. Samples: 72624360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 16:42:35,940][70768] Avg episode reward: [(0, '0.197')] [2024-06-12 16:42:39,149][71000] Updated weights for policy 0, policy_version 33204 (0.0037) [2024-06-12 16:42:40,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49971.3, 300 sec: 49651.9). Total num frames: 544112640. Throughput: 0: 49908.4. Samples: 72921240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 16:42:40,940][70768] Avg episode reward: [(0, '0.189')] [2024-06-12 16:42:42,258][71000] Updated weights for policy 0, policy_version 33214 (0.0041) [2024-06-12 16:42:45,755][71000] Updated weights for policy 0, policy_version 33224 (0.0031) [2024-06-12 16:42:45,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.3, 300 sec: 49596.3). Total num frames: 544342016. Throughput: 0: 49511.2. Samples: 73218160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 16:42:45,940][70768] Avg episode reward: [(0, '0.194')] [2024-06-12 16:42:48,793][71000] Updated weights for policy 0, policy_version 33234 (0.0037) [2024-06-12 16:42:50,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 49651.8). Total num frames: 544604160. Throughput: 0: 49479.5. Samples: 73358720. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 16:42:50,940][70768] Avg episode reward: [(0, '0.199')] [2024-06-12 16:42:52,429][71000] Updated weights for policy 0, policy_version 33244 (0.0031) [2024-06-12 16:42:55,592][71000] Updated weights for policy 0, policy_version 33254 (0.0028) [2024-06-12 16:42:55,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 544833536. Throughput: 0: 49452.6. Samples: 73658960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 16:42:55,940][70768] Avg episode reward: [(0, '0.194')] [2024-06-12 16:42:58,774][71000] Updated weights for policy 0, policy_version 33264 (0.0032) [2024-06-12 16:43:00,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 545079296. Throughput: 0: 49635.2. Samples: 73963660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 16:43:00,940][70768] Avg episode reward: [(0, '0.186')] [2024-06-12 16:43:02,049][71000] Updated weights for policy 0, policy_version 33274 (0.0029) [2024-06-12 16:43:05,421][71000] Updated weights for policy 0, policy_version 33284 (0.0026) [2024-06-12 16:43:05,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49698.0, 300 sec: 49596.3). Total num frames: 545341440. Throughput: 0: 49749.4. Samples: 74115880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 16:43:05,940][70768] Avg episode reward: [(0, '0.180')] [2024-06-12 16:43:08,701][71000] Updated weights for policy 0, policy_version 33294 (0.0028) [2024-06-12 16:43:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.2, 300 sec: 49651.8). Total num frames: 545587200. Throughput: 0: 49640.0. Samples: 74409620. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 16:43:10,940][70768] Avg episode reward: [(0, '0.189')] [2024-06-12 16:43:11,998][71000] Updated weights for policy 0, policy_version 33304 (0.0030) [2024-06-12 16:43:15,057][71000] Updated weights for policy 0, policy_version 33314 (0.0024) [2024-06-12 16:43:15,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49424.9, 300 sec: 49596.3). Total num frames: 545832960. Throughput: 0: 49688.4. Samples: 74708320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 16:43:15,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 16:43:18,702][71000] Updated weights for policy 0, policy_version 33324 (0.0032) [2024-06-12 16:43:20,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49425.0, 300 sec: 49651.8). Total num frames: 546095104. Throughput: 0: 49691.8. Samples: 74860500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 16:43:20,940][70768] Avg episode reward: [(0, '0.201')] [2024-06-12 16:43:21,606][71000] Updated weights for policy 0, policy_version 33334 (0.0027) [2024-06-12 16:43:25,321][71000] Updated weights for policy 0, policy_version 33344 (0.0033) [2024-06-12 16:43:25,942][70768] Fps is (10 sec: 49140.5, 60 sec: 49696.0, 300 sec: 49651.4). Total num frames: 546324480. Throughput: 0: 49522.5. Samples: 75149880. Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-12 16:43:25,943][70768] Avg episode reward: [(0, '0.203')] [2024-06-12 16:43:28,516][71000] Updated weights for policy 0, policy_version 33354 (0.0039) [2024-06-12 16:43:30,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49424.9, 300 sec: 49651.8). Total num frames: 546570240. Throughput: 0: 49424.0. Samples: 75442260. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-12 16:43:30,941][70768] Avg episode reward: [(0, '0.183')] [2024-06-12 16:43:31,979][71000] Updated weights for policy 0, policy_version 33364 (0.0031) [2024-06-12 16:43:35,191][71000] Updated weights for policy 0, policy_version 33374 (0.0043) [2024-06-12 16:43:35,942][70768] Fps is (10 sec: 49150.2, 60 sec: 49422.6, 300 sec: 49595.8). Total num frames: 546816000. Throughput: 0: 49495.9. Samples: 75586180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-12 16:43:35,943][70768] Avg episode reward: [(0, '0.204')] [2024-06-12 16:43:38,573][71000] Updated weights for policy 0, policy_version 33384 (0.0035) [2024-06-12 16:43:40,940][70768] Fps is (10 sec: 49153.8, 60 sec: 49152.0, 300 sec: 49596.4). Total num frames: 547061760. Throughput: 0: 49493.3. Samples: 75886160. Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-12 16:43:40,940][70768] Avg episode reward: [(0, '0.194')] [2024-06-12 16:43:41,138][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000033392_547094528.pth... [2024-06-12 16:43:41,179][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000032665_535183360.pth [2024-06-12 16:43:41,803][71000] Updated weights for policy 0, policy_version 33394 (0.0029) [2024-06-12 16:43:45,176][71000] Updated weights for policy 0, policy_version 33404 (0.0038) [2024-06-12 16:43:45,940][70768] Fps is (10 sec: 49165.7, 60 sec: 49424.9, 300 sec: 49596.3). Total num frames: 547307520. Throughput: 0: 49132.7. Samples: 76174640. Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-12 16:43:45,940][70768] Avg episode reward: [(0, '0.198')] [2024-06-12 16:43:48,439][71000] Updated weights for policy 0, policy_version 33414 (0.0027) [2024-06-12 16:43:50,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 547569664. Throughput: 0: 49250.4. Samples: 76332140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 16:43:50,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 16:43:51,004][70980] Saving new best policy, reward=0.216! [2024-06-12 16:43:51,982][71000] Updated weights for policy 0, policy_version 33424 (0.0042) [2024-06-12 16:43:52,472][70980] Signal inference workers to stop experience collection... (1100 times) [2024-06-12 16:43:52,500][71000] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-06-12 16:43:52,524][70980] Signal inference workers to resume experience collection... (1100 times) [2024-06-12 16:43:52,525][71000] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-06-12 16:43:55,257][71000] Updated weights for policy 0, policy_version 33434 (0.0034) [2024-06-12 16:43:55,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 547799040. Throughput: 0: 49336.0. Samples: 76629740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 16:43:55,940][70768] Avg episode reward: [(0, '0.204')] [2024-06-12 16:43:58,661][71000] Updated weights for policy 0, policy_version 33444 (0.0030) [2024-06-12 16:44:00,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 548028416. Throughput: 0: 49019.8. Samples: 76914200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 16:44:00,940][70768] Avg episode reward: [(0, '0.203')] [2024-06-12 16:44:02,022][71000] Updated weights for policy 0, policy_version 33454 (0.0030) [2024-06-12 16:44:05,311][71000] Updated weights for policy 0, policy_version 33464 (0.0031) [2024-06-12 16:44:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49152.0, 300 sec: 49540.7). Total num frames: 548290560. Throughput: 0: 48847.2. Samples: 77058620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 16:44:05,940][70768] Avg episode reward: [(0, '0.186')] [2024-06-12 16:44:08,834][71000] Updated weights for policy 0, policy_version 33474 (0.0030) [2024-06-12 16:44:10,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.0, 300 sec: 49485.2). Total num frames: 548519936. Throughput: 0: 49033.5. Samples: 77356260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 16:44:10,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 16:44:11,850][71000] Updated weights for policy 0, policy_version 33484 (0.0028) [2024-06-12 16:44:15,190][71000] Updated weights for policy 0, policy_version 33494 (0.0028) [2024-06-12 16:44:15,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.2, 300 sec: 49485.2). Total num frames: 548782080. Throughput: 0: 49056.4. Samples: 77649780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 16:44:15,940][70768] Avg episode reward: [(0, '0.198')] [2024-06-12 16:44:18,645][71000] Updated weights for policy 0, policy_version 33504 (0.0023) [2024-06-12 16:44:20,940][70768] Fps is (10 sec: 49150.8, 60 sec: 48605.9, 300 sec: 49429.7). Total num frames: 549011456. Throughput: 0: 49226.1. Samples: 77801220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 16:44:20,940][70768] Avg episode reward: [(0, '0.193')] [2024-06-12 16:44:22,079][71000] Updated weights for policy 0, policy_version 33514 (0.0034) [2024-06-12 16:44:25,147][71000] Updated weights for policy 0, policy_version 33524 (0.0032) [2024-06-12 16:44:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49154.1, 300 sec: 49485.2). Total num frames: 549273600. Throughput: 0: 49034.6. Samples: 78092720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 16:44:25,940][70768] Avg episode reward: [(0, '0.190')] [2024-06-12 16:44:28,659][71000] Updated weights for policy 0, policy_version 33534 (0.0025) [2024-06-12 16:44:30,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49152.3, 300 sec: 49485.2). 
Total num frames: 549519360. Throughput: 0: 49360.6. Samples: 78395860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 16:44:30,940][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 16:44:31,818][71000] Updated weights for policy 0, policy_version 33544 (0.0032) [2024-06-12 16:44:35,469][71000] Updated weights for policy 0, policy_version 33554 (0.0032) [2024-06-12 16:44:35,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49154.4, 300 sec: 49429.7). Total num frames: 549765120. Throughput: 0: 49070.6. Samples: 78540320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 16:44:35,940][70768] Avg episode reward: [(0, '0.214')] [2024-06-12 16:44:38,499][71000] Updated weights for policy 0, policy_version 33564 (0.0031) [2024-06-12 16:44:40,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 549994496. Throughput: 0: 49009.9. Samples: 78835180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 16:44:40,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 16:44:41,830][71000] Updated weights for policy 0, policy_version 33574 (0.0030) [2024-06-12 16:44:45,134][71000] Updated weights for policy 0, policy_version 33584 (0.0030) [2024-06-12 16:44:45,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.3, 300 sec: 49540.8). Total num frames: 550289408. Throughput: 0: 49212.9. Samples: 79128780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 16:44:45,940][70768] Avg episode reward: [(0, '0.205')] [2024-06-12 16:44:49,002][71000] Updated weights for policy 0, policy_version 33594 (0.0024) [2024-06-12 16:44:50,939][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 550502400. Throughput: 0: 49415.7. Samples: 79282320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 16:44:50,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 16:44:51,100][70980] Saving new best policy, reward=0.222! 
[2024-06-12 16:44:51,867][71000] Updated weights for policy 0, policy_version 33604 (0.0027) [2024-06-12 16:44:55,334][71000] Updated weights for policy 0, policy_version 33614 (0.0027) [2024-06-12 16:44:55,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 550764544. Throughput: 0: 49471.8. Samples: 79582500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:44:55,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 16:44:58,446][71000] Updated weights for policy 0, policy_version 33624 (0.0024) [2024-06-12 16:45:00,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 550993920. Throughput: 0: 49470.1. Samples: 79875940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:45:00,940][70768] Avg episode reward: [(0, '0.215')] [2024-06-12 16:45:02,055][71000] Updated weights for policy 0, policy_version 33634 (0.0029) [2024-06-12 16:45:05,036][71000] Updated weights for policy 0, policy_version 33644 (0.0032) [2024-06-12 16:45:05,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 551272448. Throughput: 0: 49388.6. Samples: 80023700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:45:05,940][70768] Avg episode reward: [(0, '0.190')] [2024-06-12 16:45:08,683][71000] Updated weights for policy 0, policy_version 33654 (0.0029) [2024-06-12 16:45:09,417][70980] Signal inference workers to stop experience collection... (1150 times) [2024-06-12 16:45:09,437][71000] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-06-12 16:45:09,523][70980] Signal inference workers to resume experience collection... (1150 times) [2024-06-12 16:45:09,523][71000] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-06-12 16:45:10,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49971.1, 300 sec: 49540.8). Total num frames: 551518208. Throughput: 0: 49784.0. Samples: 80333000. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 16:45:10,940][70768] Avg episode reward: [(0, '0.196')] [2024-06-12 16:45:11,365][71000] Updated weights for policy 0, policy_version 33664 (0.0033) [2024-06-12 16:45:15,473][71000] Updated weights for policy 0, policy_version 33674 (0.0035) [2024-06-12 16:45:15,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 551747584. Throughput: 0: 49573.8. Samples: 80626680. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-12 16:45:15,940][70768] Avg episode reward: [(0, '0.191')] [2024-06-12 16:45:17,909][71000] Updated weights for policy 0, policy_version 33684 (0.0037) [2024-06-12 16:45:20,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 551976960. Throughput: 0: 49450.0. Samples: 80765580. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-12 16:45:20,940][70768] Avg episode reward: [(0, '0.199')] [2024-06-12 16:45:21,912][71000] Updated weights for policy 0, policy_version 33694 (0.0025) [2024-06-12 16:45:24,785][71000] Updated weights for policy 0, policy_version 33704 (0.0034) [2024-06-12 16:45:25,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 552255488. Throughput: 0: 49752.4. Samples: 81074040. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-12 16:45:25,940][70768] Avg episode reward: [(0, '0.199')] [2024-06-12 16:45:28,643][71000] Updated weights for policy 0, policy_version 33714 (0.0029) [2024-06-12 16:45:30,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.0, 300 sec: 49540.7). Total num frames: 552501248. Throughput: 0: 49744.2. Samples: 81367280. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-12 16:45:30,940][70768] Avg episode reward: [(0, '0.184')] [2024-06-12 16:45:31,244][71000] Updated weights for policy 0, policy_version 33724 (0.0026) [2024-06-12 16:45:35,328][71000] Updated weights for policy 0, policy_version 33734 (0.0026) [2024-06-12 16:45:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 552747008. Throughput: 0: 49629.7. Samples: 81515660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 16:45:35,940][70768] Avg episode reward: [(0, '0.186')] [2024-06-12 16:45:37,634][71000] Updated weights for policy 0, policy_version 33744 (0.0030) [2024-06-12 16:45:40,939][70768] Fps is (10 sec: 45876.3, 60 sec: 49425.1, 300 sec: 49374.7). Total num frames: 552960000. Throughput: 0: 49510.4. Samples: 81810460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 16:45:40,940][70768] Avg episode reward: [(0, '0.195')] [2024-06-12 16:45:41,033][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000033751_552976384.pth... [2024-06-12 16:45:41,083][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000033028_541130752.pth [2024-06-12 16:45:41,828][71000] Updated weights for policy 0, policy_version 33754 (0.0029) [2024-06-12 16:45:44,699][71000] Updated weights for policy 0, policy_version 33764 (0.0026) [2024-06-12 16:45:45,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 553238528. Throughput: 0: 49331.1. Samples: 82095840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 16:45:45,940][70768] Avg episode reward: [(0, '0.201')] [2024-06-12 16:45:48,431][71000] Updated weights for policy 0, policy_version 33774 (0.0026) [2024-06-12 16:45:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 553467904. Throughput: 0: 49513.4. Samples: 82251800. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 16:45:50,941][70768] Avg episode reward: [(0, '0.195')] [2024-06-12 16:45:51,534][71000] Updated weights for policy 0, policy_version 33784 (0.0027) [2024-06-12 16:45:55,251][71000] Updated weights for policy 0, policy_version 33794 (0.0032) [2024-06-12 16:45:55,939][70768] Fps is (10 sec: 47514.5, 60 sec: 49152.2, 300 sec: 49263.1). Total num frames: 553713664. Throughput: 0: 49315.2. Samples: 82552180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 16:45:55,940][70768] Avg episode reward: [(0, '0.197')] [2024-06-12 16:45:57,786][71000] Updated weights for policy 0, policy_version 33804 (0.0028) [2024-06-12 16:46:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 553943040. Throughput: 0: 49288.3. Samples: 82844660. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-12 16:46:00,940][70768] Avg episode reward: [(0, '0.203')] [2024-06-12 16:46:02,034][71000] Updated weights for policy 0, policy_version 33814 (0.0024) [2024-06-12 16:46:04,189][71000] Updated weights for policy 0, policy_version 33824 (0.0030) [2024-06-12 16:46:05,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 554221568. Throughput: 0: 49511.7. Samples: 82993600. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-12 16:46:05,940][70768] Avg episode reward: [(0, '0.169')] [2024-06-12 16:46:08,542][71000] Updated weights for policy 0, policy_version 33834 (0.0030) [2024-06-12 16:46:10,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 554467328. Throughput: 0: 49235.1. Samples: 83289620. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-12 16:46:10,940][70768] Avg episode reward: [(0, '0.215')] [2024-06-12 16:46:11,107][70980] Signal inference workers to stop experience collection... (1200 times) [2024-06-12 16:46:11,108][70980] Signal inference workers to resume experience collection... 
(1200 times) [2024-06-12 16:46:11,135][71000] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-06-12 16:46:11,135][71000] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-06-12 16:46:11,254][71000] Updated weights for policy 0, policy_version 33844 (0.0042) [2024-06-12 16:46:15,065][71000] Updated weights for policy 0, policy_version 33854 (0.0023) [2024-06-12 16:46:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 554696704. Throughput: 0: 49332.5. Samples: 83587240. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-12 16:46:15,949][70768] Avg episode reward: [(0, '0.188')] [2024-06-12 16:46:17,639][71000] Updated weights for policy 0, policy_version 33864 (0.0025) [2024-06-12 16:46:20,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 554926080. Throughput: 0: 49332.0. Samples: 83735600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 16:46:20,940][70768] Avg episode reward: [(0, '0.212')] [2024-06-12 16:46:21,826][71000] Updated weights for policy 0, policy_version 33874 (0.0025) [2024-06-12 16:46:23,868][71000] Updated weights for policy 0, policy_version 33884 (0.0036) [2024-06-12 16:46:25,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 555204608. Throughput: 0: 49269.3. Samples: 84027580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 16:46:25,940][70768] Avg episode reward: [(0, '0.199')] [2024-06-12 16:46:28,389][71000] Updated weights for policy 0, policy_version 33894 (0.0030) [2024-06-12 16:46:30,678][71000] Updated weights for policy 0, policy_version 33904 (0.0024) [2024-06-12 16:46:30,940][70768] Fps is (10 sec: 55705.4, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 555483136. Throughput: 0: 49496.0. Samples: 84323160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 16:46:30,940][70768] Avg episode reward: [(0, '0.195')] [2024-06-12 16:46:35,079][71000] Updated weights for policy 0, policy_version 33914 (0.0028) [2024-06-12 16:46:35,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 555696128. Throughput: 0: 49380.5. Samples: 84473920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 16:46:35,940][70768] Avg episode reward: [(0, '0.187')] [2024-06-12 16:46:37,753][71000] Updated weights for policy 0, policy_version 33924 (0.0042) [2024-06-12 16:46:40,939][70768] Fps is (10 sec: 45875.7, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 555941888. Throughput: 0: 49290.2. Samples: 84770240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 16:46:40,940][70768] Avg episode reward: [(0, '0.200')] [2024-06-12 16:46:41,670][71000] Updated weights for policy 0, policy_version 33934 (0.0024) [2024-06-12 16:46:44,180][71000] Updated weights for policy 0, policy_version 33944 (0.0029) [2024-06-12 16:46:45,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 556220416. Throughput: 0: 49450.6. Samples: 85069940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 16:46:45,940][70768] Avg episode reward: [(0, '0.205')] [2024-06-12 16:46:48,057][71000] Updated weights for policy 0, policy_version 33954 (0.0029) [2024-06-12 16:46:50,472][71000] Updated weights for policy 0, policy_version 33964 (0.0026) [2024-06-12 16:46:50,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 556466176. Throughput: 0: 49581.4. Samples: 85224760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 16:46:50,940][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 16:46:55,132][71000] Updated weights for policy 0, policy_version 33974 (0.0025) [2024-06-12 16:46:55,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49697.9, 300 sec: 49374.1). 
Total num frames: 556695552. Throughput: 0: 49384.7. Samples: 85511940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 16:46:55,940][70768] Avg episode reward: [(0, '0.180')] [2024-06-12 16:46:57,490][71000] Updated weights for policy 0, policy_version 33984 (0.0030) [2024-06-12 16:47:00,940][70768] Fps is (10 sec: 44236.5, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 556908544. Throughput: 0: 49373.4. Samples: 85809040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 16:47:00,940][70768] Avg episode reward: [(0, '0.188')] [2024-06-12 16:47:01,690][71000] Updated weights for policy 0, policy_version 33994 (0.0030) [2024-06-12 16:47:04,241][70980] Signal inference workers to stop experience collection... (1250 times) [2024-06-12 16:47:04,241][70980] Signal inference workers to resume experience collection... (1250 times) [2024-06-12 16:47:04,283][71000] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-06-12 16:47:04,283][71000] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-06-12 16:47:04,379][71000] Updated weights for policy 0, policy_version 34004 (0.0031) [2024-06-12 16:47:05,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 557170688. Throughput: 0: 49272.8. Samples: 85952880. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 16:47:05,940][70768] Avg episode reward: [(0, '0.187')] [2024-06-12 16:47:08,462][71000] Updated weights for policy 0, policy_version 34014 (0.0039) [2024-06-12 16:47:10,839][71000] Updated weights for policy 0, policy_version 34024 (0.0033) [2024-06-12 16:47:10,940][70768] Fps is (10 sec: 54066.4, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 557449216. Throughput: 0: 49311.7. Samples: 86246620. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 16:47:10,940][70768] Avg episode reward: [(0, '0.186')] [2024-06-12 16:47:15,224][71000] Updated weights for policy 0, policy_version 34034 (0.0028) [2024-06-12 16:47:15,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 557662208. Throughput: 0: 49461.5. Samples: 86548920. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 16:47:15,940][70768] Avg episode reward: [(0, '0.207')] [2024-06-12 16:47:17,447][71000] Updated weights for policy 0, policy_version 34044 (0.0026) [2024-06-12 16:47:20,940][70768] Fps is (10 sec: 44237.4, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 557891584. Throughput: 0: 48976.8. Samples: 86677880. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 16:47:20,940][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 16:47:21,737][71000] Updated weights for policy 0, policy_version 34054 (0.0031) [2024-06-12 16:47:24,480][71000] Updated weights for policy 0, policy_version 34064 (0.0028) [2024-06-12 16:47:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 558153728. Throughput: 0: 48852.4. Samples: 86968600. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 16:47:25,940][70768] Avg episode reward: [(0, '0.215')] [2024-06-12 16:47:28,498][71000] Updated weights for policy 0, policy_version 34074 (0.0030) [2024-06-12 16:47:30,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48878.9, 300 sec: 49374.1). Total num frames: 558415872. Throughput: 0: 48857.3. Samples: 87268520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 16:47:30,940][70768] Avg episode reward: [(0, '0.201')] [2024-06-12 16:47:30,972][71000] Updated weights for policy 0, policy_version 34084 (0.0028) [2024-06-12 16:47:35,004][71000] Updated weights for policy 0, policy_version 34094 (0.0034) [2024-06-12 16:47:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49263.1). 
Total num frames: 558645248. Throughput: 0: 48695.1. Samples: 87416040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 16:47:35,940][70768] Avg episode reward: [(0, '0.209')] [2024-06-12 16:47:37,682][71000] Updated weights for policy 0, policy_version 34104 (0.0035) [2024-06-12 16:47:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 558891008. Throughput: 0: 48990.2. Samples: 87716500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 16:47:40,940][70768] Avg episode reward: [(0, '0.196')] [2024-06-12 16:47:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000034112_558891008.pth... [2024-06-12 16:47:40,993][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000033392_547094528.pth [2024-06-12 16:47:41,596][71000] Updated weights for policy 0, policy_version 34114 (0.0035) [2024-06-12 16:47:44,464][71000] Updated weights for policy 0, policy_version 34124 (0.0030) [2024-06-12 16:47:45,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.8, 300 sec: 49263.1). Total num frames: 559136768. Throughput: 0: 48908.8. Samples: 88009940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 16:47:45,940][70768] Avg episode reward: [(0, '0.190')] [2024-06-12 16:47:48,106][71000] Updated weights for policy 0, policy_version 34134 (0.0034) [2024-06-12 16:47:50,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48605.8, 300 sec: 49318.6). Total num frames: 559382528. Throughput: 0: 48994.7. Samples: 88157640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 16:47:50,940][70768] Avg episode reward: [(0, '0.201')] [2024-06-12 16:47:51,134][71000] Updated weights for policy 0, policy_version 34144 (0.0027) [2024-06-12 16:47:54,916][71000] Updated weights for policy 0, policy_version 34154 (0.0026) [2024-06-12 16:47:55,942][70768] Fps is (10 sec: 47502.0, 60 sec: 48603.9, 300 sec: 49262.6). Total num frames: 559611904. Throughput: 0: 49040.1. 
Samples: 88453540. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 16:47:55,943][70768] Avg episode reward: [(0, '0.214')] [2024-06-12 16:47:57,735][71000] Updated weights for policy 0, policy_version 34164 (0.0035) [2024-06-12 16:48:00,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49424.9, 300 sec: 49263.1). Total num frames: 559874048. Throughput: 0: 48865.1. Samples: 88747860. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 16:48:00,940][70768] Avg episode reward: [(0, '0.203')] [2024-06-12 16:48:01,672][71000] Updated weights for policy 0, policy_version 34174 (0.0037) [2024-06-12 16:48:04,239][71000] Updated weights for policy 0, policy_version 34184 (0.0037) [2024-06-12 16:48:05,940][70768] Fps is (10 sec: 50802.6, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 560119808. Throughput: 0: 49264.3. Samples: 88894780. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 16:48:05,940][70768] Avg episode reward: [(0, '0.200')] [2024-06-12 16:48:08,104][71000] Updated weights for policy 0, policy_version 34194 (0.0030) [2024-06-12 16:48:10,939][70768] Fps is (10 sec: 49153.6, 60 sec: 48606.1, 300 sec: 49263.1). Total num frames: 560365568. Throughput: 0: 49368.2. Samples: 89190160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-12 16:48:10,940][70768] Avg episode reward: [(0, '0.196')] [2024-06-12 16:48:11,058][70980] Signal inference workers to stop experience collection... (1300 times) [2024-06-12 16:48:11,058][70980] Signal inference workers to resume experience collection... 
(1300 times) [2024-06-12 16:48:11,100][71000] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-06-12 16:48:11,100][71000] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-06-12 16:48:11,190][71000] Updated weights for policy 0, policy_version 34204 (0.0030) [2024-06-12 16:48:14,523][71000] Updated weights for policy 0, policy_version 34214 (0.0026) [2024-06-12 16:48:15,939][70768] Fps is (10 sec: 47514.8, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 560594944. Throughput: 0: 49222.4. Samples: 89483520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-12 16:48:15,940][70768] Avg episode reward: [(0, '0.205')] [2024-06-12 16:48:17,776][71000] Updated weights for policy 0, policy_version 34224 (0.0027) [2024-06-12 16:48:20,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49425.0, 300 sec: 49263.5). Total num frames: 560857088. Throughput: 0: 49154.5. Samples: 89628000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-12 16:48:20,940][70768] Avg episode reward: [(0, '0.196')] [2024-06-12 16:48:21,443][71000] Updated weights for policy 0, policy_version 34234 (0.0032) [2024-06-12 16:48:24,423][71000] Updated weights for policy 0, policy_version 34244 (0.0034) [2024-06-12 16:48:25,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 561102848. Throughput: 0: 49150.3. Samples: 89928260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-12 16:48:25,940][70768] Avg episode reward: [(0, '0.202')] [2024-06-12 16:48:28,107][71000] Updated weights for policy 0, policy_version 34254 (0.0041) [2024-06-12 16:48:30,939][70768] Fps is (10 sec: 50791.5, 60 sec: 49152.1, 300 sec: 49319.1). Total num frames: 561364992. Throughput: 0: 49092.7. Samples: 90219100. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-12 16:48:30,940][70768] Avg episode reward: [(0, '0.188')] [2024-06-12 16:48:31,075][71000] Updated weights for policy 0, policy_version 34264 (0.0031) [2024-06-12 16:48:34,535][71000] Updated weights for policy 0, policy_version 34274 (0.0029) [2024-06-12 16:48:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 561594368. Throughput: 0: 49158.2. Samples: 90369760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 16:48:35,940][70768] Avg episode reward: [(0, '0.203')] [2024-06-12 16:48:37,966][71000] Updated weights for policy 0, policy_version 34284 (0.0032) [2024-06-12 16:48:40,939][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.2, 300 sec: 49263.1). Total num frames: 561840128. Throughput: 0: 49081.1. Samples: 90662060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 16:48:40,940][70768] Avg episode reward: [(0, '0.182')] [2024-06-12 16:48:41,359][71000] Updated weights for policy 0, policy_version 34294 (0.0030) [2024-06-12 16:48:44,632][71000] Updated weights for policy 0, policy_version 34304 (0.0028) [2024-06-12 16:48:45,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 562102272. Throughput: 0: 49036.7. Samples: 90954500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 16:48:45,940][70768] Avg episode reward: [(0, '0.197')] [2024-06-12 16:48:48,051][71000] Updated weights for policy 0, policy_version 34314 (0.0044) [2024-06-12 16:48:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 562315264. Throughput: 0: 48949.1. Samples: 91097480. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 16:48:50,940][70768] Avg episode reward: [(0, '0.197')] [2024-06-12 16:48:51,570][71000] Updated weights for policy 0, policy_version 34324 (0.0022) [2024-06-12 16:48:54,775][71000] Updated weights for policy 0, policy_version 34334 (0.0030) [2024-06-12 16:48:55,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49154.1, 300 sec: 49263.1). Total num frames: 562561024. Throughput: 0: 48891.4. Samples: 91390280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 16:48:55,940][70768] Avg episode reward: [(0, '0.201')] [2024-06-12 16:48:58,114][71000] Updated weights for policy 0, policy_version 34344 (0.0027) [2024-06-12 16:49:00,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.2, 300 sec: 49263.1). Total num frames: 562823168. Throughput: 0: 48852.9. Samples: 91681900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 16:49:00,940][70768] Avg episode reward: [(0, '0.210')] [2024-06-12 16:49:01,632][71000] Updated weights for policy 0, policy_version 34354 (0.0031) [2024-06-12 16:49:04,894][71000] Updated weights for policy 0, policy_version 34364 (0.0030) [2024-06-12 16:49:05,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.2, 300 sec: 49318.6). Total num frames: 563068928. Throughput: 0: 49104.6. Samples: 91837700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 16:49:05,940][70768] Avg episode reward: [(0, '0.213')] [2024-06-12 16:49:08,154][71000] Updated weights for policy 0, policy_version 34374 (0.0028) [2024-06-12 16:49:10,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 563281920. Throughput: 0: 48895.6. Samples: 92128560. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 16:49:10,940][70768] Avg episode reward: [(0, '0.212')] [2024-06-12 16:49:11,562][71000] Updated weights for policy 0, policy_version 34384 (0.0027) [2024-06-12 16:49:14,987][71000] Updated weights for policy 0, policy_version 34394 (0.0028) [2024-06-12 16:49:15,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 563544064. Throughput: 0: 48826.2. Samples: 92416280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 16:49:15,940][70768] Avg episode reward: [(0, '0.205')] [2024-06-12 16:49:18,340][71000] Updated weights for policy 0, policy_version 34404 (0.0025) [2024-06-12 16:49:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 563789824. Throughput: 0: 48736.9. Samples: 92562920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 16:49:20,940][70768] Avg episode reward: [(0, '0.204')] [2024-06-12 16:49:21,462][71000] Updated weights for policy 0, policy_version 34414 (0.0030) [2024-06-12 16:49:24,882][71000] Updated weights for policy 0, policy_version 34424 (0.0026) [2024-06-12 16:49:25,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 564051968. Throughput: 0: 48979.5. Samples: 92866140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 16:49:25,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 16:49:28,199][71000] Updated weights for policy 0, policy_version 34434 (0.0027) [2024-06-12 16:49:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48332.7, 300 sec: 49152.0). Total num frames: 564264960. Throughput: 0: 49052.4. Samples: 93161860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 16:49:30,940][70768] Avg episode reward: [(0, '0.213')] [2024-06-12 16:49:31,720][71000] Updated weights for policy 0, policy_version 34444 (0.0028) [2024-06-12 16:49:34,796][70980] Signal inference workers to stop experience collection... 
(1350 times) [2024-06-12 16:49:34,797][70980] Signal inference workers to resume experience collection... (1350 times) [2024-06-12 16:49:34,805][71000] Updated weights for policy 0, policy_version 34454 (0.0026) [2024-06-12 16:49:34,813][71000] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-06-12 16:49:34,813][71000] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-06-12 16:49:35,941][70768] Fps is (10 sec: 49143.7, 60 sec: 49150.6, 300 sec: 49318.3). Total num frames: 564543488. Throughput: 0: 48898.5. Samples: 93298000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 16:49:35,942][70768] Avg episode reward: [(0, '0.213')] [2024-06-12 16:49:38,407][71000] Updated weights for policy 0, policy_version 34464 (0.0030) [2024-06-12 16:49:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 564772864. Throughput: 0: 49055.1. Samples: 93597760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 16:49:40,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 16:49:41,050][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000034472_564789248.pth... [2024-06-12 16:49:41,090][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000033751_552976384.pth [2024-06-12 16:49:41,495][71000] Updated weights for policy 0, policy_version 34474 (0.0037) [2024-06-12 16:49:45,133][71000] Updated weights for policy 0, policy_version 34484 (0.0028) [2024-06-12 16:49:45,940][70768] Fps is (10 sec: 47521.7, 60 sec: 48605.9, 300 sec: 49207.5). Total num frames: 565018624. Throughput: 0: 49038.6. Samples: 93888640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 16:49:45,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 16:49:48,133][71000] Updated weights for policy 0, policy_version 34494 (0.0023) [2024-06-12 16:49:50,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48605.9, 300 sec: 49041.0). 
Total num frames: 565231616. Throughput: 0: 48755.6. Samples: 94031700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 16:49:50,940][70768] Avg episode reward: [(0, '0.196')] [2024-06-12 16:49:52,021][71000] Updated weights for policy 0, policy_version 34504 (0.0029) [2024-06-12 16:49:54,641][71000] Updated weights for policy 0, policy_version 34514 (0.0032) [2024-06-12 16:49:55,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 565510144. Throughput: 0: 48958.3. Samples: 94331680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 16:49:55,940][70768] Avg episode reward: [(0, '0.193')] [2024-06-12 16:49:58,713][71000] Updated weights for policy 0, policy_version 34524 (0.0032) [2024-06-12 16:50:00,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 565772288. Throughput: 0: 49165.3. Samples: 94628720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 16:50:00,940][70768] Avg episode reward: [(0, '0.207')] [2024-06-12 16:50:01,325][71000] Updated weights for policy 0, policy_version 34534 (0.0023) [2024-06-12 16:50:05,151][71000] Updated weights for policy 0, policy_version 34544 (0.0033) [2024-06-12 16:50:05,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 566018048. Throughput: 0: 49477.5. Samples: 94789400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-12 16:50:05,940][70768] Avg episode reward: [(0, '0.197')] [2024-06-12 16:50:07,976][71000] Updated weights for policy 0, policy_version 34554 (0.0031) [2024-06-12 16:50:10,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 566231040. Throughput: 0: 48937.3. Samples: 95068320. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-12 16:50:10,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 16:50:11,830][71000] Updated weights for policy 0, policy_version 34564 (0.0029) [2024-06-12 16:50:14,781][71000] Updated weights for policy 0, policy_version 34574 (0.0025) [2024-06-12 16:50:15,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 566493184. Throughput: 0: 48782.7. Samples: 95357080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-12 16:50:15,949][70768] Avg episode reward: [(0, '0.202')] [2024-06-12 16:50:18,608][71000] Updated weights for policy 0, policy_version 34584 (0.0029) [2024-06-12 16:50:20,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 566755328. Throughput: 0: 49388.9. Samples: 95520420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-12 16:50:20,940][70768] Avg episode reward: [(0, '0.215')] [2024-06-12 16:50:21,307][71000] Updated weights for policy 0, policy_version 34594 (0.0033) [2024-06-12 16:50:25,124][71000] Updated weights for policy 0, policy_version 34604 (0.0025) [2024-06-12 16:50:25,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.1, 300 sec: 49152.1). Total num frames: 567001088. Throughput: 0: 49399.7. Samples: 95820740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 16:50:25,940][70768] Avg episode reward: [(0, '0.195')] [2024-06-12 16:50:27,727][71000] Updated weights for policy 0, policy_version 34614 (0.0026) [2024-06-12 16:50:30,939][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 567214080. Throughput: 0: 49509.0. Samples: 96116540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 16:50:30,940][70768] Avg episode reward: [(0, '0.204')] [2024-06-12 16:50:32,017][71000] Updated weights for policy 0, policy_version 34624 (0.0029) [2024-06-12 16:50:32,203][70980] Signal inference workers to stop experience collection... 
(1400 times) [2024-06-12 16:50:32,204][70980] Signal inference workers to resume experience collection... (1400 times) [2024-06-12 16:50:32,227][71000] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-06-12 16:50:32,228][71000] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-06-12 16:50:34,759][71000] Updated weights for policy 0, policy_version 34634 (0.0028) [2024-06-12 16:50:35,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49153.3, 300 sec: 49263.0). Total num frames: 567492608. Throughput: 0: 49354.5. Samples: 96252660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 16:50:35,940][70768] Avg episode reward: [(0, '0.214')] [2024-06-12 16:50:38,555][71000] Updated weights for policy 0, policy_version 34644 (0.0027) [2024-06-12 16:50:40,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 567738368. Throughput: 0: 49371.0. Samples: 96553380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 16:50:40,940][70768] Avg episode reward: [(0, '0.188')] [2024-06-12 16:50:41,192][71000] Updated weights for policy 0, policy_version 34654 (0.0028) [2024-06-12 16:50:45,074][71000] Updated weights for policy 0, policy_version 34664 (0.0031) [2024-06-12 16:50:45,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 567984128. Throughput: 0: 49452.0. Samples: 96854060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 16:50:45,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 16:50:47,864][71000] Updated weights for policy 0, policy_version 34674 (0.0029) [2024-06-12 16:50:50,942][70768] Fps is (10 sec: 45865.9, 60 sec: 49423.3, 300 sec: 49096.1). Total num frames: 568197120. Throughput: 0: 49038.6. Samples: 96996240. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-12 16:50:50,942][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 16:50:51,979][71000] Updated weights for policy 0, policy_version 34684 (0.0040) [2024-06-12 16:50:54,595][71000] Updated weights for policy 0, policy_version 34694 (0.0024) [2024-06-12 16:50:55,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49424.9, 300 sec: 49263.1). Total num frames: 568475648. Throughput: 0: 49234.1. Samples: 97283860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-12 16:50:55,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 16:50:58,789][71000] Updated weights for policy 0, policy_version 34704 (0.0034) [2024-06-12 16:51:00,940][70768] Fps is (10 sec: 52438.5, 60 sec: 49151.8, 300 sec: 49152.0). Total num frames: 568721408. Throughput: 0: 49206.4. Samples: 97571380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-12 16:51:00,940][70768] Avg episode reward: [(0, '0.200')] [2024-06-12 16:51:01,501][71000] Updated weights for policy 0, policy_version 34714 (0.0025) [2024-06-12 16:51:05,603][71000] Updated weights for policy 0, policy_version 34724 (0.0033) [2024-06-12 16:51:05,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 568950784. Throughput: 0: 48978.7. Samples: 97724460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-12 16:51:05,940][70768] Avg episode reward: [(0, '0.219')] [2024-06-12 16:51:07,979][71000] Updated weights for policy 0, policy_version 34734 (0.0030) [2024-06-12 16:51:10,940][70768] Fps is (10 sec: 44237.6, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 569163776. Throughput: 0: 48752.3. Samples: 98014600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 16:51:10,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 16:51:12,179][71000] Updated weights for policy 0, policy_version 34744 (0.0039) [2024-06-12 16:51:15,058][71000] Updated weights for policy 0, policy_version 34754 (0.0028) [2024-06-12 16:51:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 569425920. Throughput: 0: 48597.2. Samples: 98303420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 16:51:15,940][70768] Avg episode reward: [(0, '0.205')] [2024-06-12 16:51:18,903][71000] Updated weights for policy 0, policy_version 34764 (0.0023) [2024-06-12 16:51:20,939][70768] Fps is (10 sec: 52429.4, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 569688064. Throughput: 0: 49042.0. Samples: 98459540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 16:51:20,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 16:51:21,543][71000] Updated weights for policy 0, policy_version 34774 (0.0027) [2024-06-12 16:51:25,501][71000] Updated weights for policy 0, policy_version 34784 (0.0024) [2024-06-12 16:51:25,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48332.7, 300 sec: 48874.3). Total num frames: 569901056. Throughput: 0: 48641.4. Samples: 98742240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 16:51:25,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 16:51:28,045][71000] Updated weights for policy 0, policy_version 34794 (0.0024) [2024-06-12 16:51:30,939][70768] Fps is (10 sec: 45875.0, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 570146816. Throughput: 0: 48549.4. Samples: 99038780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 16:51:30,940][70768] Avg episode reward: [(0, '0.198')] [2024-06-12 16:51:31,908][70980] Signal inference workers to stop experience collection... (1450 times) [2024-06-12 16:51:31,911][70980] Signal inference workers to resume experience collection... 
(1450 times) [2024-06-12 16:51:31,926][71000] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-06-12 16:51:31,926][71000] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-06-12 16:51:32,179][71000] Updated weights for policy 0, policy_version 34804 (0.0026) [2024-06-12 16:51:34,784][71000] Updated weights for policy 0, policy_version 34814 (0.0028) [2024-06-12 16:51:35,939][70768] Fps is (10 sec: 50790.4, 60 sec: 48606.0, 300 sec: 49040.9). Total num frames: 570408960. Throughput: 0: 48527.1. Samples: 99179860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-12 16:51:35,940][70768] Avg episode reward: [(0, '0.210')] [2024-06-12 16:51:39,061][71000] Updated weights for policy 0, policy_version 34824 (0.0031) [2024-06-12 16:51:40,940][70768] Fps is (10 sec: 52428.3, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 570671104. Throughput: 0: 48827.2. Samples: 99481080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-12 16:51:40,940][70768] Avg episode reward: [(0, '0.205')] [2024-06-12 16:51:41,072][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000034832_570687488.pth... [2024-06-12 16:51:41,106][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000034112_558891008.pth [2024-06-12 16:51:41,547][71000] Updated weights for policy 0, policy_version 34834 (0.0035) [2024-06-12 16:51:45,611][71000] Updated weights for policy 0, policy_version 34844 (0.0030) [2024-06-12 16:51:45,943][70768] Fps is (10 sec: 47497.6, 60 sec: 48330.1, 300 sec: 48873.7). Total num frames: 570884096. Throughput: 0: 48934.4. Samples: 99773580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-12 16:51:45,943][70768] Avg episode reward: [(0, '0.207')] [2024-06-12 16:51:48,601][71000] Updated weights for policy 0, policy_version 34854 (0.0032) [2024-06-12 16:51:50,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48880.5, 300 sec: 48929.8). Total num frames: 571129856. 
Throughput: 0: 48690.0. Samples: 99915520. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-12 16:51:50,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 16:51:52,461][71000] Updated weights for policy 0, policy_version 34864 (0.0031) [2024-06-12 16:51:54,987][71000] Updated weights for policy 0, policy_version 34874 (0.0027) [2024-06-12 16:51:55,940][70768] Fps is (10 sec: 50807.4, 60 sec: 48606.0, 300 sec: 49096.5). Total num frames: 571392000. Throughput: 0: 48820.9. Samples: 100211540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 16:51:55,940][70768] Avg episode reward: [(0, '0.212')] [2024-06-12 16:51:58,802][71000] Updated weights for policy 0, policy_version 34884 (0.0033) [2024-06-12 16:52:00,940][70768] Fps is (10 sec: 52429.6, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 571654144. Throughput: 0: 49126.3. Samples: 100514100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 16:52:00,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 16:52:01,346][71000] Updated weights for policy 0, policy_version 34894 (0.0030) [2024-06-12 16:52:05,582][71000] Updated weights for policy 0, policy_version 34904 (0.0032) [2024-06-12 16:52:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 571883520. Throughput: 0: 48897.3. Samples: 100659920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 16:52:05,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 16:52:05,941][70980] Saving new best policy, reward=0.229! [2024-06-12 16:52:08,414][71000] Updated weights for policy 0, policy_version 34914 (0.0029) [2024-06-12 16:52:10,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 572129280. Throughput: 0: 49194.1. Samples: 100955980. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 16:52:10,940][70768] Avg episode reward: [(0, '0.197')] [2024-06-12 16:52:12,503][71000] Updated weights for policy 0, policy_version 34924 (0.0025) [2024-06-12 16:52:15,398][71000] Updated weights for policy 0, policy_version 34934 (0.0027) [2024-06-12 16:52:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 572375040. Throughput: 0: 49000.3. Samples: 101243800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 16:52:15,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 16:52:19,044][71000] Updated weights for policy 0, policy_version 34944 (0.0026) [2024-06-12 16:52:20,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.7, 300 sec: 49040.9). Total num frames: 572620800. Throughput: 0: 49070.8. Samples: 101388060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 16:52:20,940][70768] Avg episode reward: [(0, '0.213')] [2024-06-12 16:52:22,124][71000] Updated weights for policy 0, policy_version 34954 (0.0025) [2024-06-12 16:52:25,697][71000] Updated weights for policy 0, policy_version 34964 (0.0033) [2024-06-12 16:52:25,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 572850176. Throughput: 0: 48772.5. Samples: 101675840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 16:52:25,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 16:52:28,810][71000] Updated weights for policy 0, policy_version 34974 (0.0030) [2024-06-12 16:52:30,940][70768] Fps is (10 sec: 49153.1, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 573112320. Throughput: 0: 48915.2. Samples: 101974600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 16:52:30,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 16:52:32,658][71000] Updated weights for policy 0, policy_version 34984 (0.0038) [2024-06-12 16:52:35,826][71000] Updated weights for policy 0, policy_version 34994 (0.0037) [2024-06-12 16:52:35,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 573341696. Throughput: 0: 48896.9. Samples: 102115880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 16:52:35,940][70768] Avg episode reward: [(0, '0.220')] [2024-06-12 16:52:39,203][71000] Updated weights for policy 0, policy_version 35004 (0.0031) [2024-06-12 16:52:40,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48332.8, 300 sec: 48929.9). Total num frames: 573571072. Throughput: 0: 48751.5. Samples: 102405360. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-12 16:52:40,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 16:52:43,007][71000] Updated weights for policy 0, policy_version 35014 (0.0032) [2024-06-12 16:52:45,783][71000] Updated weights for policy 0, policy_version 35024 (0.0024) [2024-06-12 16:52:45,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49154.7, 300 sec: 48985.4). Total num frames: 573833216. Throughput: 0: 48404.4. Samples: 102692300. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-12 16:52:45,940][70768] Avg episode reward: [(0, '0.203')] [2024-06-12 16:52:47,460][70980] Signal inference workers to stop experience collection... (1500 times) [2024-06-12 16:52:47,461][70980] Signal inference workers to resume experience collection... 
(1500 times) [2024-06-12 16:52:47,500][71000] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-06-12 16:52:47,500][71000] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-06-12 16:52:49,466][71000] Updated weights for policy 0, policy_version 35034 (0.0025) [2024-06-12 16:52:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.1, 300 sec: 48985.8). Total num frames: 574062592. Throughput: 0: 48448.9. Samples: 102840120. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-12 16:52:50,940][70768] Avg episode reward: [(0, '0.223')] [2024-06-12 16:52:52,232][71000] Updated weights for policy 0, policy_version 35044 (0.0031) [2024-06-12 16:52:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.8, 300 sec: 48929.9). Total num frames: 574308352. Throughput: 0: 48490.7. Samples: 103138060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-12 16:52:55,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 16:52:55,972][71000] Updated weights for policy 0, policy_version 35054 (0.0043) [2024-06-12 16:52:59,354][71000] Updated weights for policy 0, policy_version 35064 (0.0031) [2024-06-12 16:53:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48059.7, 300 sec: 48874.3). Total num frames: 574537728. Throughput: 0: 48524.5. Samples: 103427400. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-12 16:53:00,940][70768] Avg episode reward: [(0, '0.205')] [2024-06-12 16:53:02,762][71000] Updated weights for policy 0, policy_version 35074 (0.0022) [2024-06-12 16:53:05,833][71000] Updated weights for policy 0, policy_version 35084 (0.0029) [2024-06-12 16:53:05,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 574816256. Throughput: 0: 48613.1. Samples: 103575640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 16:53:05,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 16:53:05,940][70980] Saving new best policy, reward=0.234! 
[2024-06-12 16:53:09,777][71000] Updated weights for policy 0, policy_version 35094 (0.0031) [2024-06-12 16:53:10,940][70768] Fps is (10 sec: 52429.0, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 575062016. Throughput: 0: 48841.4. Samples: 103873700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 16:53:10,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 16:53:12,291][71000] Updated weights for policy 0, policy_version 35104 (0.0023) [2024-06-12 16:53:15,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 575275008. Throughput: 0: 48742.3. Samples: 104168000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 16:53:15,940][70768] Avg episode reward: [(0, '0.214')] [2024-06-12 16:53:16,306][71000] Updated weights for policy 0, policy_version 35114 (0.0033) [2024-06-12 16:53:18,918][71000] Updated weights for policy 0, policy_version 35124 (0.0026) [2024-06-12 16:53:20,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 575520768. Throughput: 0: 48717.4. Samples: 104308160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 16:53:20,940][70768] Avg episode reward: [(0, '0.219')] [2024-06-12 16:53:22,874][71000] Updated weights for policy 0, policy_version 35134 (0.0040) [2024-06-12 16:53:25,940][70768] Fps is (10 sec: 50789.5, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 575782912. Throughput: 0: 48718.1. Samples: 104597680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 16:53:25,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 16:53:26,520][71000] Updated weights for policy 0, policy_version 35144 (0.0027) [2024-06-12 16:53:29,819][71000] Updated weights for policy 0, policy_version 35154 (0.0038) [2024-06-12 16:53:30,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 576045056. Throughput: 0: 48870.2. Samples: 104891460. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 16:53:30,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 16:53:33,129][71000] Updated weights for policy 0, policy_version 35164 (0.0035) [2024-06-12 16:53:35,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 576241664. Throughput: 0: 48717.7. Samples: 105032420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 16:53:35,940][70768] Avg episode reward: [(0, '0.200')] [2024-06-12 16:53:36,670][71000] Updated weights for policy 0, policy_version 35174 (0.0035) [2024-06-12 16:53:39,606][71000] Updated weights for policy 0, policy_version 35184 (0.0033) [2024-06-12 16:53:40,939][70768] Fps is (10 sec: 44237.2, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 576487424. Throughput: 0: 48495.7. Samples: 105320360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 16:53:40,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 16:53:40,968][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000035187_576503808.pth... [2024-06-12 16:53:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000034472_564789248.pth [2024-06-12 16:53:43,302][71000] Updated weights for policy 0, policy_version 35194 (0.0024) [2024-06-12 16:53:45,940][70768] Fps is (10 sec: 52428.3, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 576765952. Throughput: 0: 48525.2. Samples: 105611040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 16:53:45,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 16:53:46,268][71000] Updated weights for policy 0, policy_version 35204 (0.0030) [2024-06-12 16:53:49,918][71000] Updated weights for policy 0, policy_version 35214 (0.0027) [2024-06-12 16:53:50,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 577011712. Throughput: 0: 48821.2. Samples: 105772600. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 16:53:50,940][70768] Avg episode reward: [(0, '0.205')] [2024-06-12 16:53:53,234][71000] Updated weights for policy 0, policy_version 35224 (0.0033) [2024-06-12 16:53:55,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.8, 300 sec: 48818.7). Total num frames: 577224704. Throughput: 0: 48644.3. Samples: 106062700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 16:53:55,940][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 16:53:56,634][71000] Updated weights for policy 0, policy_version 35234 (0.0033) [2024-06-12 16:53:59,746][71000] Updated weights for policy 0, policy_version 35244 (0.0027) [2024-06-12 16:54:00,942][70768] Fps is (10 sec: 45863.5, 60 sec: 48876.8, 300 sec: 48818.3). Total num frames: 577470464. Throughput: 0: 48566.8. Samples: 106353640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 16:54:00,943][70768] Avg episode reward: [(0, '0.201')] [2024-06-12 16:54:03,479][71000] Updated weights for policy 0, policy_version 35254 (0.0024) [2024-06-12 16:54:04,537][70980] Signal inference workers to stop experience collection... (1550 times) [2024-06-12 16:54:04,538][70980] Signal inference workers to resume experience collection... (1550 times) [2024-06-12 16:54:04,553][71000] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-06-12 16:54:04,574][71000] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-06-12 16:54:05,939][70768] Fps is (10 sec: 50791.4, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 577732608. Throughput: 0: 48905.0. Samples: 106508880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 16:54:05,940][70768] Avg episode reward: [(0, '0.209')] [2024-06-12 16:54:06,179][71000] Updated weights for policy 0, policy_version 35264 (0.0030) [2024-06-12 16:54:10,077][71000] Updated weights for policy 0, policy_version 35274 (0.0031) [2024-06-12 16:54:10,940][70768] Fps is (10 sec: 50804.2, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 577978368. Throughput: 0: 49043.3. Samples: 106804620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 16:54:10,940][70768] Avg episode reward: [(0, '0.219')] [2024-06-12 16:54:12,564][71000] Updated weights for policy 0, policy_version 35284 (0.0031) [2024-06-12 16:54:15,940][70768] Fps is (10 sec: 45874.4, 60 sec: 48605.7, 300 sec: 48818.8). Total num frames: 578191360. Throughput: 0: 49045.7. Samples: 107098520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 16:54:15,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 16:54:16,718][71000] Updated weights for policy 0, policy_version 35294 (0.0032) [2024-06-12 16:54:19,645][71000] Updated weights for policy 0, policy_version 35304 (0.0029) [2024-06-12 16:54:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 578453504. Throughput: 0: 48896.9. Samples: 107232780. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 16:54:20,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 16:54:23,334][71000] Updated weights for policy 0, policy_version 35314 (0.0042) [2024-06-12 16:54:25,940][70768] Fps is (10 sec: 54067.8, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 578732032. Throughput: 0: 49191.5. Samples: 107533980. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 16:54:25,940][70768] Avg episode reward: [(0, '0.220')] [2024-06-12 16:54:26,169][71000] Updated weights for policy 0, policy_version 35324 (0.0027) [2024-06-12 16:54:30,194][71000] Updated weights for policy 0, policy_version 35334 (0.0029) [2024-06-12 16:54:30,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48332.9, 300 sec: 48819.1). Total num frames: 578945024. Throughput: 0: 49142.8. Samples: 107822460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 16:54:30,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 16:54:32,946][71000] Updated weights for policy 0, policy_version 35344 (0.0028) [2024-06-12 16:54:35,939][70768] Fps is (10 sec: 42598.6, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 579158016. Throughput: 0: 48678.4. Samples: 107963120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-12 16:54:35,940][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 16:54:37,112][71000] Updated weights for policy 0, policy_version 35354 (0.0027) [2024-06-12 16:54:39,837][71000] Updated weights for policy 0, policy_version 35364 (0.0026) [2024-06-12 16:54:40,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 579436544. Throughput: 0: 48835.3. Samples: 108260280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-12 16:54:40,940][70768] Avg episode reward: [(0, '0.200')] [2024-06-12 16:54:43,495][71000] Updated weights for policy 0, policy_version 35374 (0.0036) [2024-06-12 16:54:45,939][70768] Fps is (10 sec: 54067.0, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 579698688. Throughput: 0: 48806.5. Samples: 108549800. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-12 16:54:45,940][70768] Avg episode reward: [(0, '0.213')] [2024-06-12 16:54:46,380][71000] Updated weights for policy 0, policy_version 35384 (0.0029) [2024-06-12 16:54:50,112][71000] Updated weights for policy 0, policy_version 35394 (0.0027) [2024-06-12 16:54:50,939][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.1, 300 sec: 48929.8). Total num frames: 579944448. Throughput: 0: 48956.0. Samples: 108711900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-12 16:54:50,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 16:54:52,842][71000] Updated weights for policy 0, policy_version 35404 (0.0040) [2024-06-12 16:54:55,939][70768] Fps is (10 sec: 44237.0, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 580141056. Throughput: 0: 48665.8. Samples: 108994580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 16:54:55,940][70768] Avg episode reward: [(0, '0.196')] [2024-06-12 16:54:56,871][71000] Updated weights for policy 0, policy_version 35414 (0.0030) [2024-06-12 16:54:59,755][71000] Updated weights for policy 0, policy_version 35424 (0.0032) [2024-06-12 16:55:00,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49154.2, 300 sec: 48818.7). Total num frames: 580419584. Throughput: 0: 48513.8. Samples: 109281640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 16:55:00,940][70768] Avg episode reward: [(0, '0.212')] [2024-06-12 16:55:03,885][71000] Updated weights for policy 0, policy_version 35434 (0.0029) [2024-06-12 16:55:04,366][70980] Signal inference workers to stop experience collection... (1600 times) [2024-06-12 16:55:04,409][71000] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-06-12 16:55:04,472][70980] Signal inference workers to resume experience collection... 
(1600 times) [2024-06-12 16:55:04,473][71000] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-06-12 16:55:05,940][70768] Fps is (10 sec: 54066.1, 60 sec: 49151.8, 300 sec: 48985.4). Total num frames: 580681728. Throughput: 0: 49080.8. Samples: 109441420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 16:55:05,940][70768] Avg episode reward: [(0, '0.212')] [2024-06-12 16:55:06,817][71000] Updated weights for policy 0, policy_version 35444 (0.0030) [2024-06-12 16:55:10,471][71000] Updated weights for policy 0, policy_version 35454 (0.0034) [2024-06-12 16:55:10,942][70768] Fps is (10 sec: 50776.3, 60 sec: 49149.7, 300 sec: 48929.4). Total num frames: 580927488. Throughput: 0: 48974.3. Samples: 109737960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 16:55:10,943][70768] Avg episode reward: [(0, '0.209')] [2024-06-12 16:55:13,389][71000] Updated weights for policy 0, policy_version 35464 (0.0030) [2024-06-12 16:55:15,939][70768] Fps is (10 sec: 40960.8, 60 sec: 48332.9, 300 sec: 48596.6). Total num frames: 581091328. Throughput: 0: 48942.7. Samples: 110024880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 16:55:15,940][70768] Avg episode reward: [(0, '0.217')] [2024-06-12 16:55:17,105][71000] Updated weights for policy 0, policy_version 35474 (0.0024) [2024-06-12 16:55:20,079][71000] Updated weights for policy 0, policy_version 35484 (0.0028) [2024-06-12 16:55:20,940][70768] Fps is (10 sec: 45888.1, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 581386240. Throughput: 0: 48703.9. Samples: 110154800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 16:55:20,940][70768] Avg episode reward: [(0, '0.212')] [2024-06-12 16:55:23,991][71000] Updated weights for policy 0, policy_version 35494 (0.0029) [2024-06-12 16:55:25,939][70768] Fps is (10 sec: 54067.0, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 581632000. Throughput: 0: 48704.9. Samples: 110452000. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 16:55:25,940][70768] Avg episode reward: [(0, '0.199')] [2024-06-12 16:55:26,741][71000] Updated weights for policy 0, policy_version 35504 (0.0031) [2024-06-12 16:55:30,461][71000] Updated weights for policy 0, policy_version 35514 (0.0027) [2024-06-12 16:55:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 581877760. Throughput: 0: 48955.0. Samples: 110752780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 16:55:30,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 16:55:33,899][71000] Updated weights for policy 0, policy_version 35524 (0.0039) [2024-06-12 16:55:35,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48878.8, 300 sec: 48652.1). Total num frames: 582090752. Throughput: 0: 48541.2. Samples: 110896260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 16:55:35,940][70768] Avg episode reward: [(0, '0.219')] [2024-06-12 16:55:37,226][71000] Updated weights for policy 0, policy_version 35534 (0.0033) [2024-06-12 16:55:40,199][71000] Updated weights for policy 0, policy_version 35544 (0.0027) [2024-06-12 16:55:40,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 582369280. Throughput: 0: 48917.3. Samples: 111195860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 16:55:40,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 16:55:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000035545_582369280.pth... [2024-06-12 16:55:41,020][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000034832_570687488.pth [2024-06-12 16:55:43,771][71000] Updated weights for policy 0, policy_version 35554 (0.0032) [2024-06-12 16:55:45,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48332.8, 300 sec: 48819.1). Total num frames: 582598656. Throughput: 0: 48827.1. Samples: 111478860. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-12 16:55:45,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 16:55:46,911][71000] Updated weights for policy 0, policy_version 35564 (0.0041) [2024-06-12 16:55:50,498][71000] Updated weights for policy 0, policy_version 35574 (0.0030) [2024-06-12 16:55:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 582860800. Throughput: 0: 48509.4. Samples: 111624340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-12 16:55:50,940][70768] Avg episode reward: [(0, '0.202')] [2024-06-12 16:55:51,197][70980] Signal inference workers to stop experience collection... (1650 times) [2024-06-12 16:55:51,249][70980] Signal inference workers to resume experience collection... (1650 times) [2024-06-12 16:55:51,249][71000] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-06-12 16:55:51,263][71000] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-06-12 16:55:53,617][71000] Updated weights for policy 0, policy_version 35584 (0.0032) [2024-06-12 16:55:55,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 48652.2). Total num frames: 583073792. Throughput: 0: 48511.8. Samples: 111920860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-12 16:55:55,940][70768] Avg episode reward: [(0, '0.220')] [2024-06-12 16:55:57,148][71000] Updated weights for policy 0, policy_version 35594 (0.0029) [2024-06-12 16:56:00,218][71000] Updated weights for policy 0, policy_version 35604 (0.0034) [2024-06-12 16:56:00,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 583368704. Throughput: 0: 48778.4. Samples: 112219920. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-12 16:56:00,940][70768] Avg episode reward: [(0, '0.209')] [2024-06-12 16:56:03,533][71000] Updated weights for policy 0, policy_version 35614 (0.0031) [2024-06-12 16:56:05,940][70768] Fps is (10 sec: 52429.5, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 583598080. Throughput: 0: 49464.0. Samples: 112380680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 16:56:05,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 16:56:06,639][71000] Updated weights for policy 0, policy_version 35624 (0.0035) [2024-06-12 16:56:10,355][71000] Updated weights for policy 0, policy_version 35634 (0.0026) [2024-06-12 16:56:10,940][70768] Fps is (10 sec: 49150.3, 60 sec: 48880.8, 300 sec: 48929.8). Total num frames: 583860224. Throughput: 0: 49468.3. Samples: 112678100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 16:56:10,941][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 16:56:13,641][71000] Updated weights for policy 0, policy_version 35644 (0.0032) [2024-06-12 16:56:15,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49971.0, 300 sec: 48818.7). Total num frames: 584089600. Throughput: 0: 49208.8. Samples: 112967180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 16:56:15,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 16:56:17,014][71000] Updated weights for policy 0, policy_version 35654 (0.0027) [2024-06-12 16:56:20,191][71000] Updated weights for policy 0, policy_version 35664 (0.0030) [2024-06-12 16:56:20,940][70768] Fps is (10 sec: 47516.1, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 584335360. Throughput: 0: 49240.5. Samples: 113112080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 16:56:20,940][70768] Avg episode reward: [(0, '0.219')] [2024-06-12 16:56:23,627][71000] Updated weights for policy 0, policy_version 35674 (0.0028) [2024-06-12 16:56:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49151.9, 300 sec: 48929.8). 
Total num frames: 584581120. Throughput: 0: 48975.0. Samples: 113399740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 16:56:25,940][70768] Avg episode reward: [(0, '0.200')] [2024-06-12 16:56:26,846][71000] Updated weights for policy 0, policy_version 35684 (0.0034) [2024-06-12 16:56:30,210][71000] Updated weights for policy 0, policy_version 35694 (0.0032) [2024-06-12 16:56:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 584843264. Throughput: 0: 49340.8. Samples: 113699200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 16:56:30,940][70768] Avg episode reward: [(0, '0.215')] [2024-06-12 16:56:33,463][71000] Updated weights for policy 0, policy_version 35704 (0.0028) [2024-06-12 16:56:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 48763.2). Total num frames: 585056256. Throughput: 0: 49435.5. Samples: 113848940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 16:56:35,940][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 16:56:36,940][71000] Updated weights for policy 0, policy_version 35714 (0.0026) [2024-06-12 16:56:40,398][71000] Updated weights for policy 0, policy_version 35724 (0.0033) [2024-06-12 16:56:40,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 48930.4). Total num frames: 585318400. Throughput: 0: 49237.1. Samples: 114136520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 16:56:40,940][70768] Avg episode reward: [(0, '0.217')] [2024-06-12 16:56:43,655][71000] Updated weights for policy 0, policy_version 35734 (0.0025) [2024-06-12 16:56:45,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 585547776. Throughput: 0: 49102.8. Samples: 114429540. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 16:56:45,940][70768] Avg episode reward: [(0, '0.207')] [2024-06-12 16:56:47,346][71000] Updated weights for policy 0, policy_version 35744 (0.0038) [2024-06-12 16:56:50,509][71000] Updated weights for policy 0, policy_version 35754 (0.0025) [2024-06-12 16:56:50,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 585809920. Throughput: 0: 48799.4. Samples: 114576660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 16:56:50,940][70768] Avg episode reward: [(0, '0.214')] [2024-06-12 16:56:53,921][71000] Updated weights for policy 0, policy_version 35764 (0.0031) [2024-06-12 16:56:55,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.2, 300 sec: 48818.8). Total num frames: 586055680. Throughput: 0: 48575.6. Samples: 114863980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 16:56:55,940][70768] Avg episode reward: [(0, '0.214')] [2024-06-12 16:56:57,057][71000] Updated weights for policy 0, policy_version 35774 (0.0026) [2024-06-12 16:57:00,869][71000] Updated weights for policy 0, policy_version 35784 (0.0029) [2024-06-12 16:57:00,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 586285056. Throughput: 0: 48679.2. Samples: 115157740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 16:57:00,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 16:57:03,837][71000] Updated weights for policy 0, policy_version 35794 (0.0031) [2024-06-12 16:57:05,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 586514432. Throughput: 0: 48539.6. Samples: 115296360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 16:57:05,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 16:57:06,046][70980] Signal inference workers to stop experience collection... 
(1700 times) [2024-06-12 16:57:06,046][70980] Signal inference workers to resume experience collection... (1700 times) [2024-06-12 16:57:06,066][71000] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-06-12 16:57:06,067][71000] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-06-12 16:57:07,526][71000] Updated weights for policy 0, policy_version 35804 (0.0030) [2024-06-12 16:57:10,509][71000] Updated weights for policy 0, policy_version 35814 (0.0032) [2024-06-12 16:57:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48606.2, 300 sec: 48818.8). Total num frames: 586776576. Throughput: 0: 48653.7. Samples: 115589160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 16:57:10,952][70768] Avg episode reward: [(0, '0.217')] [2024-06-12 16:57:14,474][71000] Updated weights for policy 0, policy_version 35824 (0.0039) [2024-06-12 16:57:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 587022336. Throughput: 0: 48668.5. Samples: 115889280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 16:57:15,940][70768] Avg episode reward: [(0, '0.214')] [2024-06-12 16:57:17,242][71000] Updated weights for policy 0, policy_version 35834 (0.0026) [2024-06-12 16:57:20,939][70768] Fps is (10 sec: 47514.7, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 587251712. Throughput: 0: 48506.8. Samples: 116031740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 16:57:20,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 16:57:20,983][71000] Updated weights for policy 0, policy_version 35844 (0.0028) [2024-06-12 16:57:23,947][71000] Updated weights for policy 0, policy_version 35854 (0.0036) [2024-06-12 16:57:25,939][70768] Fps is (10 sec: 45875.4, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 587481088. Throughput: 0: 48473.3. Samples: 116317820. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 16:57:25,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 16:57:27,773][71000] Updated weights for policy 0, policy_version 35864 (0.0035) [2024-06-12 16:57:30,507][71000] Updated weights for policy 0, policy_version 35874 (0.0022) [2024-06-12 16:57:30,940][70768] Fps is (10 sec: 50789.2, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 587759616. Throughput: 0: 48522.1. Samples: 116613040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 16:57:30,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 16:57:34,716][71000] Updated weights for policy 0, policy_version 35884 (0.0028) [2024-06-12 16:57:35,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 588005376. Throughput: 0: 48699.3. Samples: 116768120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 16:57:35,940][70768] Avg episode reward: [(0, '0.209')] [2024-06-12 16:57:37,091][71000] Updated weights for policy 0, policy_version 35894 (0.0037) [2024-06-12 16:57:40,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48332.7, 300 sec: 48763.2). Total num frames: 588218368. Throughput: 0: 48749.8. Samples: 117057720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 16:57:40,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 16:57:40,960][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000035902_588218368.pth... [2024-06-12 16:57:41,009][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000035187_576503808.pth [2024-06-12 16:57:41,483][71000] Updated weights for policy 0, policy_version 35904 (0.0032) [2024-06-12 16:57:44,189][71000] Updated weights for policy 0, policy_version 35914 (0.0038) [2024-06-12 16:57:45,940][70768] Fps is (10 sec: 44236.9, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 588447744. Throughput: 0: 48637.8. Samples: 117346440. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 16:57:45,940][70768] Avg episode reward: [(0, '0.209')] [2024-06-12 16:57:48,273][71000] Updated weights for policy 0, policy_version 35924 (0.0033) [2024-06-12 16:57:50,865][71000] Updated weights for policy 0, policy_version 35934 (0.0037) [2024-06-12 16:57:50,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48879.1, 300 sec: 48929.8). Total num frames: 588742656. Throughput: 0: 48794.6. Samples: 117492120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 16:57:50,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 16:57:55,018][71000] Updated weights for policy 0, policy_version 35944 (0.0027) [2024-06-12 16:57:55,939][70768] Fps is (10 sec: 50790.6, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 588955648. Throughput: 0: 48751.3. Samples: 117782960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 16:57:55,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 16:57:57,504][71000] Updated weights for policy 0, policy_version 35954 (0.0024) [2024-06-12 16:58:00,940][70768] Fps is (10 sec: 44236.0, 60 sec: 48332.7, 300 sec: 48707.7). Total num frames: 589185024. Throughput: 0: 48826.9. Samples: 118086500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 16:58:00,940][70768] Avg episode reward: [(0, '0.214')] [2024-06-12 16:58:01,422][71000] Updated weights for policy 0, policy_version 35964 (0.0028) [2024-06-12 16:58:04,089][71000] Updated weights for policy 0, policy_version 35974 (0.0020) [2024-06-12 16:58:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 589430784. Throughput: 0: 48674.6. Samples: 118222100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-12 16:58:05,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 16:58:08,100][71000] Updated weights for policy 0, policy_version 35984 (0.0032) [2024-06-12 16:58:10,939][70768] Fps is (10 sec: 52430.2, 60 sec: 48879.1, 300 sec: 48929.8). 
Total num frames: 589709312. Throughput: 0: 48690.2. Samples: 118508880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-12 16:58:10,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 16:58:10,982][71000] Updated weights for policy 0, policy_version 35994 (0.0031) [2024-06-12 16:58:14,382][70980] Signal inference workers to stop experience collection... (1750 times) [2024-06-12 16:58:14,383][70980] Signal inference workers to resume experience collection... (1750 times) [2024-06-12 16:58:14,418][71000] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-06-12 16:58:14,418][71000] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-06-12 16:58:14,785][71000] Updated weights for policy 0, policy_version 36004 (0.0028) [2024-06-12 16:58:15,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 589955072. Throughput: 0: 48737.5. Samples: 118806220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-12 16:58:15,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 16:58:17,868][71000] Updated weights for policy 0, policy_version 36014 (0.0036) [2024-06-12 16:58:20,939][70768] Fps is (10 sec: 44236.7, 60 sec: 48332.8, 300 sec: 48707.7). Total num frames: 590151680. Throughput: 0: 48457.4. Samples: 118948700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-12 16:58:20,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 16:58:21,659][71000] Updated weights for policy 0, policy_version 36024 (0.0029) [2024-06-12 16:58:24,247][71000] Updated weights for policy 0, policy_version 36034 (0.0029) [2024-06-12 16:58:25,942][70768] Fps is (10 sec: 45862.7, 60 sec: 48876.7, 300 sec: 48707.2). Total num frames: 590413824. Throughput: 0: 48486.9. Samples: 119239760. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-12 16:58:25,943][70768] Avg episode reward: [(0, '0.203')] [2024-06-12 16:58:28,402][71000] Updated weights for policy 0, policy_version 36044 (0.0028) [2024-06-12 16:58:30,940][70768] Fps is (10 sec: 54067.1, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 590692352. Throughput: 0: 48582.7. Samples: 119532660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-12 16:58:30,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 16:58:30,993][71000] Updated weights for policy 0, policy_version 36054 (0.0028) [2024-06-12 16:58:35,089][71000] Updated weights for policy 0, policy_version 36064 (0.0036) [2024-06-12 16:58:35,940][70768] Fps is (10 sec: 50804.4, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 590921728. Throughput: 0: 48874.3. Samples: 119691460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-12 16:58:35,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 16:58:37,885][71000] Updated weights for policy 0, policy_version 36074 (0.0030) [2024-06-12 16:58:40,939][70768] Fps is (10 sec: 44236.9, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 591134720. Throughput: 0: 48970.2. Samples: 119986620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-12 16:58:40,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 16:58:41,571][71000] Updated weights for policy 0, policy_version 36084 (0.0024) [2024-06-12 16:58:44,779][71000] Updated weights for policy 0, policy_version 36094 (0.0038) [2024-06-12 16:58:45,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 48763.3). Total num frames: 591396864. Throughput: 0: 48572.3. Samples: 120272240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-12 16:58:45,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 16:58:48,539][71000] Updated weights for policy 0, policy_version 36104 (0.0026) [2024-06-12 16:58:50,939][70768] Fps is (10 sec: 54067.1, 60 sec: 48879.0, 300 sec: 48985.4). 
Total num frames: 591675392. Throughput: 0: 48982.3. Samples: 120426300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 16:58:50,940][70768] Avg episode reward: [(0, '0.212')] [2024-06-12 16:58:51,256][71000] Updated weights for policy 0, policy_version 36114 (0.0038) [2024-06-12 16:58:55,104][71000] Updated weights for policy 0, policy_version 36124 (0.0028) [2024-06-12 16:58:55,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 48819.2). Total num frames: 591872000. Throughput: 0: 49101.8. Samples: 120718460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 16:58:55,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 16:58:58,063][71000] Updated weights for policy 0, policy_version 36134 (0.0042) [2024-06-12 16:59:00,940][70768] Fps is (10 sec: 44236.1, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 592117760. Throughput: 0: 48965.6. Samples: 121009680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 16:59:00,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 16:59:00,947][70980] Saving new best policy, reward=0.243! [2024-06-12 16:59:01,812][71000] Updated weights for policy 0, policy_version 36144 (0.0030) [2024-06-12 16:59:04,592][71000] Updated weights for policy 0, policy_version 36154 (0.0034) [2024-06-12 16:59:05,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 592363520. Throughput: 0: 48919.5. Samples: 121150080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 16:59:05,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 16:59:08,544][71000] Updated weights for policy 0, policy_version 36164 (0.0030) [2024-06-12 16:59:10,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48878.7, 300 sec: 48985.4). Total num frames: 592642048. Throughput: 0: 49125.9. Samples: 121450300. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 16:59:10,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 16:59:11,675][71000] Updated weights for policy 0, policy_version 36174 (0.0026) [2024-06-12 16:59:15,241][71000] Updated weights for policy 0, policy_version 36184 (0.0032) [2024-06-12 16:59:15,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 592855040. Throughput: 0: 49131.6. Samples: 121743580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 16:59:15,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 16:59:18,134][71000] Updated weights for policy 0, policy_version 36194 (0.0027) [2024-06-12 16:59:20,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49151.9, 300 sec: 48707.7). Total num frames: 593100800. Throughput: 0: 48763.0. Samples: 121885800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 16:59:20,940][70768] Avg episode reward: [(0, '0.220')] [2024-06-12 16:59:21,748][71000] Updated weights for policy 0, policy_version 36204 (0.0026) [2024-06-12 16:59:22,407][70980] Signal inference workers to stop experience collection... (1800 times) [2024-06-12 16:59:22,447][71000] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-06-12 16:59:22,454][70980] Signal inference workers to resume experience collection... (1800 times) [2024-06-12 16:59:22,463][71000] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-06-12 16:59:24,790][71000] Updated weights for policy 0, policy_version 36214 (0.0034) [2024-06-12 16:59:25,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49154.3, 300 sec: 48874.3). Total num frames: 593362944. Throughput: 0: 48646.6. Samples: 122175720. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 16:59:25,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 16:59:28,998][71000] Updated weights for policy 0, policy_version 36224 (0.0034) [2024-06-12 16:59:30,940][70768] Fps is (10 sec: 52428.2, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 593625088. Throughput: 0: 48782.9. Samples: 122467480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 16:59:30,940][70768] Avg episode reward: [(0, '0.205')] [2024-06-12 16:59:31,812][71000] Updated weights for policy 0, policy_version 36234 (0.0028) [2024-06-12 16:59:35,401][71000] Updated weights for policy 0, policy_version 36244 (0.0039) [2024-06-12 16:59:35,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 593838080. Throughput: 0: 48985.7. Samples: 122630660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 16:59:35,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 16:59:38,334][71000] Updated weights for policy 0, policy_version 36254 (0.0035) [2024-06-12 16:59:40,940][70768] Fps is (10 sec: 44237.7, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 594067456. Throughput: 0: 48706.1. Samples: 122910240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 16:59:40,940][70768] Avg episode reward: [(0, '0.219')] [2024-06-12 16:59:41,013][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000036260_594083840.pth... [2024-06-12 16:59:41,053][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000035545_582369280.pth [2024-06-12 16:59:42,029][71000] Updated weights for policy 0, policy_version 36264 (0.0029) [2024-06-12 16:59:44,752][71000] Updated weights for policy 0, policy_version 36274 (0.0039) [2024-06-12 16:59:45,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49151.9, 300 sec: 48818.7). Total num frames: 594345984. Throughput: 0: 48814.8. Samples: 123206340. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 16:59:45,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 16:59:48,876][71000] Updated weights for policy 0, policy_version 36284 (0.0031) [2024-06-12 16:59:50,940][70768] Fps is (10 sec: 54067.0, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 594608128. Throughput: 0: 49155.6. Samples: 123362080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 16:59:50,940][70768] Avg episode reward: [(0, '0.217')] [2024-06-12 16:59:51,536][71000] Updated weights for policy 0, policy_version 36294 (0.0030) [2024-06-12 16:59:55,833][71000] Updated weights for policy 0, policy_version 36304 (0.0031) [2024-06-12 16:59:55,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 594804736. Throughput: 0: 48937.6. Samples: 123652480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 16:59:55,940][70768] Avg episode reward: [(0, '0.223')] [2024-06-12 16:59:58,478][71000] Updated weights for policy 0, policy_version 36314 (0.0028) [2024-06-12 17:00:00,940][70768] Fps is (10 sec: 44236.9, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 595050496. Throughput: 0: 48918.6. Samples: 123944920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 17:00:00,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 17:00:02,352][71000] Updated weights for policy 0, policy_version 36324 (0.0031) [2024-06-12 17:00:05,199][71000] Updated weights for policy 0, policy_version 36334 (0.0036) [2024-06-12 17:00:05,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49152.0, 300 sec: 48763.7). Total num frames: 595312640. Throughput: 0: 48856.9. Samples: 124084360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 17:00:05,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 17:00:08,886][71000] Updated weights for policy 0, policy_version 36344 (0.0025) [2024-06-12 17:00:10,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48879.1, 300 sec: 49096.4). 
Total num frames: 595574784. Throughput: 0: 49074.2. Samples: 124384060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 17:00:10,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:00:11,904][71000] Updated weights for policy 0, policy_version 36354 (0.0027) [2024-06-12 17:00:15,835][71000] Updated weights for policy 0, policy_version 36364 (0.0031) [2024-06-12 17:00:15,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 595787776. Throughput: 0: 49105.1. Samples: 124677200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 17:00:15,940][70768] Avg episode reward: [(0, '0.209')] [2024-06-12 17:00:18,706][71000] Updated weights for policy 0, policy_version 36374 (0.0042) [2024-06-12 17:00:20,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 596033536. Throughput: 0: 48439.2. Samples: 124810420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 17:00:20,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 17:00:22,587][71000] Updated weights for policy 0, policy_version 36384 (0.0030) [2024-06-12 17:00:23,748][70980] Signal inference workers to stop experience collection... (1850 times) [2024-06-12 17:00:23,748][70980] Signal inference workers to resume experience collection... (1850 times) [2024-06-12 17:00:23,766][71000] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-06-12 17:00:23,766][71000] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-06-12 17:00:25,222][71000] Updated weights for policy 0, policy_version 36394 (0.0039) [2024-06-12 17:00:25,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 596295680. Throughput: 0: 48730.5. Samples: 125103120. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 17:00:25,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:00:29,466][71000] Updated weights for policy 0, policy_version 36404 (0.0034) [2024-06-12 17:00:30,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48332.9, 300 sec: 48929.8). Total num frames: 596525056. Throughput: 0: 48596.8. Samples: 125393200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 17:00:30,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:00:31,848][71000] Updated weights for policy 0, policy_version 36414 (0.0029) [2024-06-12 17:00:35,929][71000] Updated weights for policy 0, policy_version 36424 (0.0035) [2024-06-12 17:00:35,939][70768] Fps is (10 sec: 47515.0, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 596770816. Throughput: 0: 48671.3. Samples: 125552280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 17:00:35,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 17:00:38,324][71000] Updated weights for policy 0, policy_version 36434 (0.0032) [2024-06-12 17:00:40,939][70768] Fps is (10 sec: 50791.1, 60 sec: 49425.1, 300 sec: 48929.9). Total num frames: 597032960. Throughput: 0: 48801.8. Samples: 125848560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 17:00:40,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:00:42,445][71000] Updated weights for policy 0, policy_version 36444 (0.0036) [2024-06-12 17:00:45,146][71000] Updated weights for policy 0, policy_version 36454 (0.0027) [2024-06-12 17:00:45,940][70768] Fps is (10 sec: 50789.1, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 597278720. Throughput: 0: 48706.1. Samples: 126136700. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-12 17:00:45,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:00:49,199][71000] Updated weights for policy 0, policy_version 36464 (0.0024) [2024-06-12 17:00:50,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.8, 300 sec: 48985.4). 
Total num frames: 597524480. Throughput: 0: 49158.6. Samples: 126296500. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-12 17:00:50,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 17:00:51,639][71000] Updated weights for policy 0, policy_version 36474 (0.0035) [2024-06-12 17:00:55,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 597737472. Throughput: 0: 48632.0. Samples: 126572500. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-12 17:00:55,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:00:56,022][71000] Updated weights for policy 0, policy_version 36484 (0.0031) [2024-06-12 17:00:58,764][71000] Updated weights for policy 0, policy_version 36494 (0.0029) [2024-06-12 17:01:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 598016000. Throughput: 0: 48635.1. Samples: 126865780. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-12 17:01:00,940][70768] Avg episode reward: [(0, '0.223')] [2024-06-12 17:01:02,829][71000] Updated weights for policy 0, policy_version 36504 (0.0028) [2024-06-12 17:01:05,742][71000] Updated weights for policy 0, policy_version 36514 (0.0025) [2024-06-12 17:01:05,941][70768] Fps is (10 sec: 50785.0, 60 sec: 48878.1, 300 sec: 48763.1). Total num frames: 598245376. Throughput: 0: 49029.4. Samples: 127016800. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-12 17:01:05,950][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 17:01:09,427][71000] Updated weights for policy 0, policy_version 36524 (0.0032) [2024-06-12 17:01:10,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48059.7, 300 sec: 48707.7). Total num frames: 598458368. Throughput: 0: 48913.0. Samples: 127304200. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-12 17:01:10,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 17:01:12,479][71000] Updated weights for policy 0, policy_version 36534 (0.0035) [2024-06-12 17:01:15,939][70768] Fps is (10 sec: 45880.7, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 598704128. Throughput: 0: 48832.2. Samples: 127590640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-12 17:01:15,940][70768] Avg episode reward: [(0, '0.217')] [2024-06-12 17:01:16,398][71000] Updated weights for policy 0, policy_version 36544 (0.0038) [2024-06-12 17:01:18,574][70980] Signal inference workers to stop experience collection... (1900 times) [2024-06-12 17:01:18,575][70980] Signal inference workers to resume experience collection... (1900 times) [2024-06-12 17:01:18,594][71000] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-06-12 17:01:18,594][71000] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-06-12 17:01:19,135][71000] Updated weights for policy 0, policy_version 36554 (0.0036) [2024-06-12 17:01:20,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 598966272. Throughput: 0: 48481.7. Samples: 127733960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-12 17:01:20,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:01:23,296][71000] Updated weights for policy 0, policy_version 36564 (0.0030) [2024-06-12 17:01:25,939][70768] Fps is (10 sec: 49151.8, 60 sec: 48333.0, 300 sec: 48652.2). Total num frames: 599195648. Throughput: 0: 48266.7. Samples: 128020560. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-12 17:01:25,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 17:01:26,179][71000] Updated weights for policy 0, policy_version 36574 (0.0021) [2024-06-12 17:01:29,731][71000] Updated weights for policy 0, policy_version 36584 (0.0028) [2024-06-12 17:01:30,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 599457792. Throughput: 0: 48679.3. Samples: 128327260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-12 17:01:30,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:01:32,865][71000] Updated weights for policy 0, policy_version 36594 (0.0026) [2024-06-12 17:01:35,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 599687168. Throughput: 0: 48286.3. Samples: 128469380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 17:01:35,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 17:01:36,357][71000] Updated weights for policy 0, policy_version 36604 (0.0028) [2024-06-12 17:01:39,505][71000] Updated weights for policy 0, policy_version 36614 (0.0026) [2024-06-12 17:01:40,939][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 599965696. Throughput: 0: 48652.9. Samples: 128761880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 17:01:40,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 17:01:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000036619_599965696.pth... [2024-06-12 17:01:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000035902_588218368.pth [2024-06-12 17:01:43,230][71000] Updated weights for policy 0, policy_version 36624 (0.0027) [2024-06-12 17:01:45,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 600178688. Throughput: 0: 48531.7. Samples: 129049700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 17:01:45,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:01:46,301][71000] Updated weights for policy 0, policy_version 36634 (0.0028) [2024-06-12 17:01:50,079][71000] Updated weights for policy 0, policy_version 36644 (0.0037) [2024-06-12 17:01:50,940][70768] Fps is (10 sec: 44236.1, 60 sec: 48059.7, 300 sec: 48652.1). Total num frames: 600408064. Throughput: 0: 48187.3. Samples: 129185180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 17:01:50,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 17:01:53,325][71000] Updated weights for policy 0, policy_version 36654 (0.0041) [2024-06-12 17:01:55,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48332.8, 300 sec: 48652.1). Total num frames: 600637440. Throughput: 0: 48320.0. Samples: 129478600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 17:01:55,940][70768] Avg episode reward: [(0, '0.220')] [2024-06-12 17:01:56,943][71000] Updated weights for policy 0, policy_version 36664 (0.0034) [2024-06-12 17:01:59,987][71000] Updated weights for policy 0, policy_version 36674 (0.0029) [2024-06-12 17:02:00,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 600915968. Throughput: 0: 48512.7. Samples: 129773720. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-12 17:02:00,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:02:03,446][71000] Updated weights for policy 0, policy_version 36684 (0.0032) [2024-06-12 17:02:05,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48060.4, 300 sec: 48652.1). Total num frames: 601128960. Throughput: 0: 48620.7. Samples: 129921900. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-12 17:02:05,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 17:02:06,645][71000] Updated weights for policy 0, policy_version 36694 (0.0031) [2024-06-12 17:02:10,059][71000] Updated weights for policy 0, policy_version 36704 (0.0030) [2024-06-12 17:02:10,940][70768] Fps is (10 sec: 44236.3, 60 sec: 48332.8, 300 sec: 48596.6). Total num frames: 601358336. Throughput: 0: 48487.8. Samples: 130202520. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-12 17:02:10,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 17:02:13,629][71000] Updated weights for policy 0, policy_version 36714 (0.0042) [2024-06-12 17:02:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.6, 300 sec: 48707.6). Total num frames: 601620480. Throughput: 0: 47881.1. Samples: 130481920. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-12 17:02:15,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 17:02:17,046][71000] Updated weights for policy 0, policy_version 36724 (0.0027) [2024-06-12 17:02:18,374][70980] Signal inference workers to stop experience collection... (1950 times) [2024-06-12 17:02:18,374][70980] Signal inference workers to resume experience collection... (1950 times) [2024-06-12 17:02:18,386][71000] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-06-12 17:02:18,386][71000] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-06-12 17:02:20,575][71000] Updated weights for policy 0, policy_version 36734 (0.0032) [2024-06-12 17:02:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48332.6, 300 sec: 48763.2). Total num frames: 601866240. Throughput: 0: 48338.5. Samples: 130644620. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 17:02:20,944][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 17:02:23,811][71000] Updated weights for policy 0, policy_version 36744 (0.0027) [2024-06-12 17:02:25,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48332.6, 300 sec: 48596.6). Total num frames: 602095616. Throughput: 0: 48270.9. Samples: 130934080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 17:02:25,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:02:27,233][71000] Updated weights for policy 0, policy_version 36754 (0.0035) [2024-06-12 17:02:30,476][71000] Updated weights for policy 0, policy_version 36764 (0.0038) [2024-06-12 17:02:30,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48059.6, 300 sec: 48596.6). Total num frames: 602341376. Throughput: 0: 48155.9. Samples: 131216720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 17:02:30,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:02:33,942][71000] Updated weights for policy 0, policy_version 36774 (0.0037) [2024-06-12 17:02:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48605.7, 300 sec: 48763.2). Total num frames: 602603520. Throughput: 0: 48562.2. Samples: 131370480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 17:02:35,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 17:02:36,945][71000] Updated weights for policy 0, policy_version 36784 (0.0029) [2024-06-12 17:02:40,826][71000] Updated weights for policy 0, policy_version 36794 (0.0023) [2024-06-12 17:02:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 47786.6, 300 sec: 48763.2). Total num frames: 602832896. Throughput: 0: 48516.0. Samples: 131661820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 17:02:40,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:02:43,696][71000] Updated weights for policy 0, policy_version 36804 (0.0028) [2024-06-12 17:02:45,939][70768] Fps is (10 sec: 47514.6, 60 sec: 48332.8, 300 sec: 48596.6). 
Total num frames: 603078656. Throughput: 0: 48658.3. Samples: 131963340. Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-06-12 17:02:45,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 17:02:47,342][71000] Updated weights for policy 0, policy_version 36814 (0.0023) [2024-06-12 17:02:50,358][71000] Updated weights for policy 0, policy_version 36824 (0.0028) [2024-06-12 17:02:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 603324416. Throughput: 0: 48434.7. Samples: 132101460. Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-06-12 17:02:50,940][70768] Avg episode reward: [(0, '0.212')] [2024-06-12 17:02:54,089][71000] Updated weights for policy 0, policy_version 36834 (0.0035) [2024-06-12 17:02:55,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.1, 300 sec: 48818.8). Total num frames: 603586560. Throughput: 0: 49068.7. Samples: 132410600. Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-06-12 17:02:55,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 17:02:56,840][71000] Updated weights for policy 0, policy_version 36844 (0.0035) [2024-06-12 17:03:00,609][71000] Updated weights for policy 0, policy_version 36854 (0.0024) [2024-06-12 17:03:00,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 603815936. Throughput: 0: 49341.6. Samples: 132702280. Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-06-12 17:03:00,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 17:03:03,710][71000] Updated weights for policy 0, policy_version 36864 (0.0023) [2024-06-12 17:03:05,939][70768] Fps is (10 sec: 45875.1, 60 sec: 48606.1, 300 sec: 48596.6). Total num frames: 604045312. Throughput: 0: 48781.1. Samples: 132839760. 
Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-06-12 17:03:05,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 17:03:07,334][71000] Updated weights for policy 0, policy_version 36874 (0.0034) [2024-06-12 17:03:10,245][71000] Updated weights for policy 0, policy_version 36884 (0.0036) [2024-06-12 17:03:10,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49152.0, 300 sec: 48652.1). Total num frames: 604307456. Throughput: 0: 48793.3. Samples: 133129780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 17:03:10,940][70768] Avg episode reward: [(0, '0.217')] [2024-06-12 17:03:14,007][71000] Updated weights for policy 0, policy_version 36894 (0.0027) [2024-06-12 17:03:15,940][70768] Fps is (10 sec: 52427.5, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 604569600. Throughput: 0: 49021.2. Samples: 133422680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 17:03:15,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:03:16,994][71000] Updated weights for policy 0, policy_version 36904 (0.0029) [2024-06-12 17:03:20,834][71000] Updated weights for policy 0, policy_version 36914 (0.0030) [2024-06-12 17:03:20,939][70768] Fps is (10 sec: 49152.9, 60 sec: 48879.1, 300 sec: 48763.7). Total num frames: 604798976. Throughput: 0: 49101.1. Samples: 133580020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 17:03:20,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 17:03:23,577][71000] Updated weights for policy 0, policy_version 36924 (0.0029) [2024-06-12 17:03:25,940][70768] Fps is (10 sec: 44237.6, 60 sec: 48606.0, 300 sec: 48541.1). Total num frames: 605011968. Throughput: 0: 48888.0. Samples: 133861780. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 17:03:25,940][70768] Avg episode reward: [(0, '0.210')] [2024-06-12 17:03:27,713][71000] Updated weights for policy 0, policy_version 36934 (0.0030) [2024-06-12 17:03:30,756][71000] Updated weights for policy 0, policy_version 36944 (0.0036) [2024-06-12 17:03:30,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 48707.7). Total num frames: 605290496. Throughput: 0: 48497.8. Samples: 134145740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 17:03:30,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:03:30,952][70980] Saving new best policy, reward=0.245! [2024-06-12 17:03:32,950][70980] Signal inference workers to stop experience collection... (2000 times) [2024-06-12 17:03:32,951][70980] Signal inference workers to resume experience collection... (2000 times) [2024-06-12 17:03:32,991][71000] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-06-12 17:03:32,991][71000] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-06-12 17:03:34,292][71000] Updated weights for policy 0, policy_version 36954 (0.0027) [2024-06-12 17:03:35,940][70768] Fps is (10 sec: 54067.4, 60 sec: 49152.2, 300 sec: 48874.3). Total num frames: 605552640. Throughput: 0: 48993.1. Samples: 134306140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 17:03:35,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 17:03:37,243][71000] Updated weights for policy 0, policy_version 36964 (0.0031) [2024-06-12 17:03:40,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 605765632. Throughput: 0: 48771.5. Samples: 134605320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 17:03:40,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 17:03:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000036974_605782016.pth... 
[2024-06-12 17:03:40,959][71000] Updated weights for policy 0, policy_version 36974 (0.0024) [2024-06-12 17:03:40,987][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000036260_594083840.pth [2024-06-12 17:03:43,775][71000] Updated weights for policy 0, policy_version 36984 (0.0034) [2024-06-12 17:03:45,939][70768] Fps is (10 sec: 44236.9, 60 sec: 48605.9, 300 sec: 48541.1). Total num frames: 605995008. Throughput: 0: 48800.0. Samples: 134898280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 17:03:45,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:03:47,651][71000] Updated weights for policy 0, policy_version 36994 (0.0029) [2024-06-12 17:03:50,519][71000] Updated weights for policy 0, policy_version 37004 (0.0024) [2024-06-12 17:03:50,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49152.0, 300 sec: 48818.7). Total num frames: 606273536. Throughput: 0: 48898.0. Samples: 135040180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 17:03:50,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 17:03:54,468][71000] Updated weights for policy 0, policy_version 37014 (0.0028) [2024-06-12 17:03:55,940][70768] Fps is (10 sec: 54065.2, 60 sec: 49151.7, 300 sec: 48874.3). Total num frames: 606535680. Throughput: 0: 48965.6. Samples: 135333240. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-12 17:03:55,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 17:03:57,207][71000] Updated weights for policy 0, policy_version 37024 (0.0030) [2024-06-12 17:04:00,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 606748672. Throughput: 0: 49096.6. Samples: 135632020. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-12 17:04:00,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 17:04:00,995][71000] Updated weights for policy 0, policy_version 37034 (0.0032) [2024-06-12 17:04:04,089][71000] Updated weights for policy 0, policy_version 37044 (0.0025) [2024-06-12 17:04:05,940][70768] Fps is (10 sec: 44238.1, 60 sec: 48878.9, 300 sec: 48596.6). Total num frames: 606978048. Throughput: 0: 48535.5. Samples: 135764120. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-12 17:04:05,940][70768] Avg episode reward: [(0, '0.198')] [2024-06-12 17:04:07,774][71000] Updated weights for policy 0, policy_version 37054 (0.0039) [2024-06-12 17:04:10,573][71000] Updated weights for policy 0, policy_version 37064 (0.0033) [2024-06-12 17:04:10,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49152.0, 300 sec: 48818.7). Total num frames: 607256576. Throughput: 0: 48974.5. Samples: 136065640. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-12 17:04:10,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:04:14,502][71000] Updated weights for policy 0, policy_version 37074 (0.0028) [2024-06-12 17:04:15,940][70768] Fps is (10 sec: 52429.0, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 607502336. Throughput: 0: 49182.2. Samples: 136358940. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-12 17:04:15,940][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 17:04:17,545][71000] Updated weights for policy 0, policy_version 37084 (0.0030) [2024-06-12 17:04:20,940][70768] Fps is (10 sec: 44237.3, 60 sec: 48332.7, 300 sec: 48596.6). Total num frames: 607698944. Throughput: 0: 48899.5. Samples: 136506620. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 17:04:20,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:04:21,319][71000] Updated weights for policy 0, policy_version 37094 (0.0029) [2024-06-12 17:04:24,323][71000] Updated weights for policy 0, policy_version 37104 (0.0034) [2024-06-12 17:04:25,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49151.9, 300 sec: 48596.6). Total num frames: 607961088. Throughput: 0: 48705.6. Samples: 136797080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 17:04:25,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 17:04:27,841][71000] Updated weights for policy 0, policy_version 37114 (0.0028) [2024-06-12 17:04:29,393][70980] Signal inference workers to stop experience collection... (2050 times) [2024-06-12 17:04:29,393][70980] Signal inference workers to resume experience collection... (2050 times) [2024-06-12 17:04:29,427][71000] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-06-12 17:04:29,427][71000] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-06-12 17:04:30,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 608206848. Throughput: 0: 48551.5. Samples: 137083100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 17:04:30,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:04:31,160][71000] Updated weights for policy 0, policy_version 37124 (0.0036) [2024-06-12 17:04:34,553][71000] Updated weights for policy 0, policy_version 37134 (0.0032) [2024-06-12 17:04:35,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 608468992. Throughput: 0: 49021.0. Samples: 137246120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 17:04:35,940][70768] Avg episode reward: [(0, '0.209')] [2024-06-12 17:04:37,579][71000] Updated weights for policy 0, policy_version 37144 (0.0037) [2024-06-12 17:04:40,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.8, 300 sec: 48652.1). Total num frames: 608698368. Throughput: 0: 48853.1. Samples: 137531620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 17:04:40,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:04:41,055][70980] Saving new best policy, reward=0.253! [2024-06-12 17:04:41,280][71000] Updated weights for policy 0, policy_version 37154 (0.0032) [2024-06-12 17:04:44,556][71000] Updated weights for policy 0, policy_version 37164 (0.0039) [2024-06-12 17:04:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 48652.1). Total num frames: 608960512. Throughput: 0: 48717.8. Samples: 137824320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 17:04:45,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:04:48,190][71000] Updated weights for policy 0, policy_version 37174 (0.0024) [2024-06-12 17:04:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 48818.7). Total num frames: 609206272. Throughput: 0: 49228.7. Samples: 137979420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 17:04:50,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 17:04:51,082][71000] Updated weights for policy 0, policy_version 37184 (0.0024) [2024-06-12 17:04:54,582][71000] Updated weights for policy 0, policy_version 37194 (0.0027) [2024-06-12 17:04:55,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48332.9, 300 sec: 48763.2). Total num frames: 609435648. Throughput: 0: 49208.9. Samples: 138280040. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 17:04:55,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 17:04:57,695][71000] Updated weights for policy 0, policy_version 37204 (0.0024) [2024-06-12 17:05:00,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 609681408. Throughput: 0: 49127.0. Samples: 138569660. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 17:05:00,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:05:01,300][71000] Updated weights for policy 0, policy_version 37214 (0.0023) [2024-06-12 17:05:04,326][71000] Updated weights for policy 0, policy_version 37224 (0.0028) [2024-06-12 17:05:05,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.1, 300 sec: 48707.7). Total num frames: 609943552. Throughput: 0: 48964.5. Samples: 138710020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 17:05:05,940][70768] Avg episode reward: [(0, '0.223')] [2024-06-12 17:05:08,201][71000] Updated weights for policy 0, policy_version 37234 (0.0038) [2024-06-12 17:05:10,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48606.0, 300 sec: 48763.2). Total num frames: 610172928. Throughput: 0: 48870.3. Samples: 138996240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-12 17:05:10,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:05:11,408][71000] Updated weights for policy 0, policy_version 37244 (0.0031) [2024-06-12 17:05:15,254][71000] Updated weights for policy 0, policy_version 37254 (0.0028) [2024-06-12 17:05:15,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 610418688. Throughput: 0: 48970.6. Samples: 139286780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-12 17:05:15,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:05:18,162][71000] Updated weights for policy 0, policy_version 37264 (0.0026) [2024-06-12 17:05:20,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48878.9, 300 sec: 48596.6). 
Total num frames: 610631680. Throughput: 0: 48339.5. Samples: 139421400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-12 17:05:20,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:05:21,845][71000] Updated weights for policy 0, policy_version 37274 (0.0038) [2024-06-12 17:05:24,592][71000] Updated weights for policy 0, policy_version 37284 (0.0028) [2024-06-12 17:05:25,943][70768] Fps is (10 sec: 50774.5, 60 sec: 49422.5, 300 sec: 48818.3). Total num frames: 610926592. Throughput: 0: 48603.4. Samples: 139718920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-12 17:05:25,943][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:05:28,469][71000] Updated weights for policy 0, policy_version 37294 (0.0030) [2024-06-12 17:05:30,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 611139584. Throughput: 0: 48591.6. Samples: 140010940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 17:05:30,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:05:31,554][71000] Updated weights for policy 0, policy_version 37304 (0.0032) [2024-06-12 17:05:35,526][71000] Updated weights for policy 0, policy_version 37314 (0.0032) [2024-06-12 17:05:35,940][70768] Fps is (10 sec: 44249.9, 60 sec: 48332.7, 300 sec: 48596.6). Total num frames: 611368960. Throughput: 0: 48296.4. Samples: 140152760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 17:05:35,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:05:38,411][71000] Updated weights for policy 0, policy_version 37324 (0.0029) [2024-06-12 17:05:40,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48333.0, 300 sec: 48541.1). Total num frames: 611598336. Throughput: 0: 47938.0. Samples: 140437240. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 17:05:40,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:05:41,035][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000037330_611614720.pth... [2024-06-12 17:05:41,071][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000036619_599965696.pth [2024-06-12 17:05:42,247][71000] Updated weights for policy 0, policy_version 37334 (0.0034) [2024-06-12 17:05:42,640][70980] Signal inference workers to stop experience collection... (2100 times) [2024-06-12 17:05:42,686][71000] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-06-12 17:05:42,692][70980] Signal inference workers to resume experience collection... (2100 times) [2024-06-12 17:05:42,697][71000] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-06-12 17:05:45,105][71000] Updated weights for policy 0, policy_version 37344 (0.0031) [2024-06-12 17:05:45,940][70768] Fps is (10 sec: 52429.4, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 611893248. Throughput: 0: 47933.3. Samples: 140726660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 17:05:45,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 17:05:49,013][71000] Updated weights for policy 0, policy_version 37354 (0.0026) [2024-06-12 17:05:50,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48059.9, 300 sec: 48652.2). Total num frames: 612089856. Throughput: 0: 48424.9. Samples: 140889140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 17:05:50,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 17:05:51,852][71000] Updated weights for policy 0, policy_version 37364 (0.0022) [2024-06-12 17:05:55,647][71000] Updated weights for policy 0, policy_version 37374 (0.0034) [2024-06-12 17:05:55,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.9, 300 sec: 48596.6). Total num frames: 612352000. Throughput: 0: 48545.2. Samples: 141180780. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:05:55,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 17:05:58,607][71000] Updated weights for policy 0, policy_version 37384 (0.0028) [2024-06-12 17:06:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48605.9, 300 sec: 48652.3). Total num frames: 612597760. Throughput: 0: 48326.3. Samples: 141461460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:06:00,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:06:02,522][71000] Updated weights for policy 0, policy_version 37394 (0.0037) [2024-06-12 17:06:05,400][71000] Updated weights for policy 0, policy_version 37404 (0.0029) [2024-06-12 17:06:05,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 612859904. Throughput: 0: 48557.4. Samples: 141606480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:06:05,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:06:09,409][71000] Updated weights for policy 0, policy_version 37414 (0.0025) [2024-06-12 17:06:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48332.8, 300 sec: 48707.7). Total num frames: 613072896. Throughput: 0: 48659.0. Samples: 141908420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:06:10,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:06:12,004][71000] Updated weights for policy 0, policy_version 37424 (0.0029) [2024-06-12 17:06:15,853][71000] Updated weights for policy 0, policy_version 37434 (0.0026) [2024-06-12 17:06:15,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48332.9, 300 sec: 48652.2). Total num frames: 613318656. Throughput: 0: 48736.1. Samples: 142204060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:06:15,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:06:18,761][71000] Updated weights for policy 0, policy_version 37444 (0.0030) [2024-06-12 17:06:20,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.1, 300 sec: 48763.2). 
Total num frames: 613580800. Throughput: 0: 48650.9. Samples: 142342040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-12 17:06:20,940][70768] Avg episode reward: [(0, '0.207')] [2024-06-12 17:06:22,801][71000] Updated weights for policy 0, policy_version 37454 (0.0032) [2024-06-12 17:06:25,638][71000] Updated weights for policy 0, policy_version 37464 (0.0033) [2024-06-12 17:06:25,940][70768] Fps is (10 sec: 50789.5, 60 sec: 48335.3, 300 sec: 48707.7). Total num frames: 613826560. Throughput: 0: 48765.1. Samples: 142631680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-12 17:06:25,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:06:29,799][71000] Updated weights for policy 0, policy_version 37474 (0.0030) [2024-06-12 17:06:30,939][70768] Fps is (10 sec: 44237.3, 60 sec: 48059.8, 300 sec: 48596.6). Total num frames: 614023168. Throughput: 0: 48547.3. Samples: 142911280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-12 17:06:30,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:06:32,190][70980] Signal inference workers to stop experience collection... (2150 times) [2024-06-12 17:06:32,222][71000] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-06-12 17:06:32,247][70980] Signal inference workers to resume experience collection... (2150 times) [2024-06-12 17:06:32,247][71000] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-06-12 17:06:32,520][71000] Updated weights for policy 0, policy_version 37484 (0.0031) [2024-06-12 17:06:35,940][70768] Fps is (10 sec: 42598.8, 60 sec: 48059.9, 300 sec: 48430.0). Total num frames: 614252544. Throughput: 0: 47988.4. Samples: 143048620. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-12 17:06:35,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:06:36,690][71000] Updated weights for policy 0, policy_version 37494 (0.0036) [2024-06-12 17:06:39,124][71000] Updated weights for policy 0, policy_version 37504 (0.0027) [2024-06-12 17:06:40,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49151.9, 300 sec: 48707.7). Total num frames: 614547456. Throughput: 0: 47997.0. Samples: 143340640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-12 17:06:40,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:06:43,194][71000] Updated weights for policy 0, policy_version 37514 (0.0030) [2024-06-12 17:06:45,917][71000] Updated weights for policy 0, policy_version 37524 (0.0029) [2024-06-12 17:06:45,940][70768] Fps is (10 sec: 54067.3, 60 sec: 48332.9, 300 sec: 48763.2). Total num frames: 614793216. Throughput: 0: 48479.1. Samples: 143643020. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 17:06:45,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:06:49,907][71000] Updated weights for policy 0, policy_version 37534 (0.0032) [2024-06-12 17:06:50,940][70768] Fps is (10 sec: 44236.8, 60 sec: 48332.8, 300 sec: 48652.2). Total num frames: 614989824. Throughput: 0: 48405.8. Samples: 143784740. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 17:06:50,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:06:52,630][71000] Updated weights for policy 0, policy_version 37544 (0.0031) [2024-06-12 17:06:55,940][70768] Fps is (10 sec: 44236.3, 60 sec: 48059.7, 300 sec: 48541.1). Total num frames: 615235584. Throughput: 0: 47971.0. Samples: 144067120. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 17:06:55,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:06:56,979][71000] Updated weights for policy 0, policy_version 37554 (0.0031) [2024-06-12 17:06:59,722][71000] Updated weights for policy 0, policy_version 37564 (0.0026) [2024-06-12 17:07:00,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48332.7, 300 sec: 48707.7). Total num frames: 615497728. Throughput: 0: 47662.5. Samples: 144348880. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 17:07:00,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:07:04,142][71000] Updated weights for policy 0, policy_version 37574 (0.0032) [2024-06-12 17:07:05,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48059.7, 300 sec: 48763.2). Total num frames: 615743488. Throughput: 0: 47939.5. Samples: 144499320. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 17:07:05,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:07:06,448][71000] Updated weights for policy 0, policy_version 37584 (0.0024) [2024-06-12 17:07:10,618][71000] Updated weights for policy 0, policy_version 37594 (0.0025) [2024-06-12 17:07:10,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48059.7, 300 sec: 48596.6). Total num frames: 615956480. Throughput: 0: 48085.9. Samples: 144795540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 17:07:10,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:07:13,411][71000] Updated weights for policy 0, policy_version 37604 (0.0037) [2024-06-12 17:07:15,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48059.6, 300 sec: 48596.6). Total num frames: 616202240. Throughput: 0: 48139.8. Samples: 145077580. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 17:07:15,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 17:07:17,114][71000] Updated weights for policy 0, policy_version 37614 (0.0027) [2024-06-12 17:07:19,975][71000] Updated weights for policy 0, policy_version 37624 (0.0029) [2024-06-12 17:07:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 47786.6, 300 sec: 48652.2). Total num frames: 616448000. Throughput: 0: 48527.1. Samples: 145232340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 17:07:20,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:07:24,419][71000] Updated weights for policy 0, policy_version 37634 (0.0037) [2024-06-12 17:07:25,940][70768] Fps is (10 sec: 49152.5, 60 sec: 47786.8, 300 sec: 48652.2). Total num frames: 616693760. Throughput: 0: 48370.2. Samples: 145517300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 17:07:25,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 17:07:27,090][71000] Updated weights for policy 0, policy_version 37644 (0.0029) [2024-06-12 17:07:30,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48332.8, 300 sec: 48541.1). Total num frames: 616923136. Throughput: 0: 48138.7. Samples: 145809260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 17:07:30,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:07:30,948][71000] Updated weights for policy 0, policy_version 37654 (0.0028) [2024-06-12 17:07:31,833][70980] Signal inference workers to stop experience collection... (2200 times) [2024-06-12 17:07:31,884][71000] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-06-12 17:07:31,891][70980] Signal inference workers to resume experience collection... 
(2200 times) [2024-06-12 17:07:31,892][71000] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-06-12 17:07:33,643][71000] Updated weights for policy 0, policy_version 37664 (0.0028) [2024-06-12 17:07:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 48652.2). Total num frames: 617185280. Throughput: 0: 48125.3. Samples: 145950380. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-12 17:07:35,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 17:07:35,940][70980] Saving new best policy, reward=0.255! [2024-06-12 17:07:37,738][71000] Updated weights for policy 0, policy_version 37674 (0.0026) [2024-06-12 17:07:40,562][71000] Updated weights for policy 0, policy_version 37684 (0.0030) [2024-06-12 17:07:40,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48059.8, 300 sec: 48652.1). Total num frames: 617431040. Throughput: 0: 48355.2. Samples: 146243100. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-12 17:07:40,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:07:40,954][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000037685_617431040.pth... [2024-06-12 17:07:40,997][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000036974_605782016.pth [2024-06-12 17:07:44,638][71000] Updated weights for policy 0, policy_version 37694 (0.0026) [2024-06-12 17:07:45,940][70768] Fps is (10 sec: 47513.7, 60 sec: 47786.7, 300 sec: 48596.6). Total num frames: 617660416. Throughput: 0: 48698.8. Samples: 146540320. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-12 17:07:45,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:07:47,200][71000] Updated weights for policy 0, policy_version 37704 (0.0032) [2024-06-12 17:07:50,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48332.7, 300 sec: 48485.5). Total num frames: 617889792. Throughput: 0: 48362.1. Samples: 146675620. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-12 17:07:50,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:07:51,407][71000] Updated weights for policy 0, policy_version 37714 (0.0027) [2024-06-12 17:07:53,989][71000] Updated weights for policy 0, policy_version 37724 (0.0024) [2024-06-12 17:07:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.9, 300 sec: 48541.1). Total num frames: 618135552. Throughput: 0: 48115.1. Samples: 146960720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 17:07:55,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:07:58,311][71000] Updated weights for policy 0, policy_version 37734 (0.0024) [2024-06-12 17:08:00,859][71000] Updated weights for policy 0, policy_version 37744 (0.0026) [2024-06-12 17:08:00,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48332.9, 300 sec: 48652.1). Total num frames: 618397696. Throughput: 0: 48310.3. Samples: 147251540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 17:08:00,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:08:04,695][71000] Updated weights for policy 0, policy_version 37754 (0.0031) [2024-06-12 17:08:05,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48059.7, 300 sec: 48541.1). Total num frames: 618627072. Throughput: 0: 48219.9. Samples: 147402240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 17:08:05,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:08:07,657][71000] Updated weights for policy 0, policy_version 37764 (0.0030) [2024-06-12 17:08:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48485.6). Total num frames: 618872832. Throughput: 0: 48351.5. Samples: 147693120. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 17:08:10,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:08:11,678][71000] Updated weights for policy 0, policy_version 37774 (0.0035) [2024-06-12 17:08:14,326][71000] Updated weights for policy 0, policy_version 37784 (0.0043) [2024-06-12 17:08:15,942][70768] Fps is (10 sec: 47500.7, 60 sec: 48330.6, 300 sec: 48485.1). Total num frames: 619102208. Throughput: 0: 48078.2. Samples: 147972920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 17:08:15,943][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:08:18,480][71000] Updated weights for policy 0, policy_version 37794 (0.0037) [2024-06-12 17:08:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 619364352. Throughput: 0: 48217.8. Samples: 148120180. Policy #0 lag: (min: 2.0, avg: 10.2, max: 21.0) [2024-06-12 17:08:20,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 17:08:21,543][71000] Updated weights for policy 0, policy_version 37804 (0.0034) [2024-06-12 17:08:25,419][71000] Updated weights for policy 0, policy_version 37814 (0.0026) [2024-06-12 17:08:25,939][70768] Fps is (10 sec: 47527.3, 60 sec: 48059.8, 300 sec: 48430.0). Total num frames: 619577344. Throughput: 0: 47968.0. Samples: 148401660. Policy #0 lag: (min: 2.0, avg: 10.2, max: 21.0) [2024-06-12 17:08:25,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 17:08:28,311][71000] Updated weights for policy 0, policy_version 37824 (0.0032) [2024-06-12 17:08:30,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48605.7, 300 sec: 48430.0). Total num frames: 619839488. Throughput: 0: 48017.6. Samples: 148701120. 
Policy #0 lag: (min: 2.0, avg: 10.2, max: 21.0) [2024-06-12 17:08:30,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:08:31,947][71000] Updated weights for policy 0, policy_version 37834 (0.0028) [2024-06-12 17:08:35,029][71000] Updated weights for policy 0, policy_version 37844 (0.0032) [2024-06-12 17:08:35,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48059.7, 300 sec: 48485.5). Total num frames: 620068864. Throughput: 0: 48122.8. Samples: 148841140. Policy #0 lag: (min: 2.0, avg: 10.2, max: 21.0) [2024-06-12 17:08:35,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:08:38,648][71000] Updated weights for policy 0, policy_version 37854 (0.0026) [2024-06-12 17:08:40,769][70980] Signal inference workers to stop experience collection... (2250 times) [2024-06-12 17:08:40,813][71000] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-06-12 17:08:40,820][70980] Signal inference workers to resume experience collection... (2250 times) [2024-06-12 17:08:40,823][71000] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-06-12 17:08:40,940][70768] Fps is (10 sec: 45875.6, 60 sec: 47786.6, 300 sec: 48485.5). Total num frames: 620298240. Throughput: 0: 48357.7. Samples: 149136820. Policy #0 lag: (min: 2.0, avg: 10.2, max: 21.0) [2024-06-12 17:08:40,940][70768] Avg episode reward: [(0, '0.220')] [2024-06-12 17:08:41,691][71000] Updated weights for policy 0, policy_version 37864 (0.0031) [2024-06-12 17:08:45,697][71000] Updated weights for policy 0, policy_version 37874 (0.0035) [2024-06-12 17:08:45,943][70768] Fps is (10 sec: 45857.9, 60 sec: 47783.6, 300 sec: 48318.3). Total num frames: 620527616. Throughput: 0: 48305.7. Samples: 149425480. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 17:08:45,944][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:08:48,763][71000] Updated weights for policy 0, policy_version 37884 (0.0030) [2024-06-12 17:08:50,939][70768] Fps is (10 sec: 49152.6, 60 sec: 48332.9, 300 sec: 48319.0). Total num frames: 620789760. Throughput: 0: 47906.4. Samples: 149558020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 17:08:50,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:08:52,544][71000] Updated weights for policy 0, policy_version 37894 (0.0029) [2024-06-12 17:08:55,465][71000] Updated weights for policy 0, policy_version 37904 (0.0026) [2024-06-12 17:08:55,940][70768] Fps is (10 sec: 49170.6, 60 sec: 48059.7, 300 sec: 48374.5). Total num frames: 621019136. Throughput: 0: 48039.6. Samples: 149854900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 17:08:55,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:08:59,053][71000] Updated weights for policy 0, policy_version 37914 (0.0032) [2024-06-12 17:09:00,944][70768] Fps is (10 sec: 49130.5, 60 sec: 48056.3, 300 sec: 48484.8). Total num frames: 621281280. Throughput: 0: 48307.3. Samples: 150146820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 17:09:00,945][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:09:02,174][71000] Updated weights for policy 0, policy_version 37924 (0.0029) [2024-06-12 17:09:05,817][71000] Updated weights for policy 0, policy_version 37934 (0.0041) [2024-06-12 17:09:05,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48059.9, 300 sec: 48319.0). Total num frames: 621510656. Throughput: 0: 48225.8. Samples: 150290340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 17:09:05,940][70768] Avg episode reward: [(0, '0.223')] [2024-06-12 17:09:09,163][71000] Updated weights for policy 0, policy_version 37944 (0.0036) [2024-06-12 17:09:10,940][70768] Fps is (10 sec: 47533.2, 60 sec: 48059.6, 300 sec: 48318.9). 
Total num frames: 621756416. Throughput: 0: 48291.3. Samples: 150574780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 17:09:10,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:09:12,916][71000] Updated weights for policy 0, policy_version 37954 (0.0035) [2024-06-12 17:09:15,880][71000] Updated weights for policy 0, policy_version 37964 (0.0024) [2024-06-12 17:09:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48335.1, 300 sec: 48485.5). Total num frames: 622002176. Throughput: 0: 48044.1. Samples: 150863100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 17:09:15,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:09:19,705][71000] Updated weights for policy 0, policy_version 37974 (0.0025) [2024-06-12 17:09:20,940][70768] Fps is (10 sec: 47514.2, 60 sec: 47786.6, 300 sec: 48374.5). Total num frames: 622231552. Throughput: 0: 48163.1. Samples: 151008480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 17:09:20,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:09:22,913][71000] Updated weights for policy 0, policy_version 37984 (0.0030) [2024-06-12 17:09:25,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48059.7, 300 sec: 48318.9). Total num frames: 622460928. Throughput: 0: 48092.1. Samples: 151300960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 17:09:25,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 17:09:26,178][71000] Updated weights for policy 0, policy_version 37994 (0.0031) [2024-06-12 17:09:29,611][71000] Updated weights for policy 0, policy_version 38004 (0.0029) [2024-06-12 17:09:30,940][70768] Fps is (10 sec: 47513.9, 60 sec: 47786.8, 300 sec: 48263.4). Total num frames: 622706688. Throughput: 0: 47968.0. Samples: 151583860. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 17:09:30,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:09:33,104][71000] Updated weights for policy 0, policy_version 38014 (0.0033) [2024-06-12 17:09:35,940][70768] Fps is (10 sec: 47513.2, 60 sec: 47786.7, 300 sec: 48263.4). Total num frames: 622936064. Throughput: 0: 48209.7. Samples: 151727460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 17:09:35,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:09:36,380][71000] Updated weights for policy 0, policy_version 38024 (0.0030) [2024-06-12 17:09:40,017][71000] Updated weights for policy 0, policy_version 38034 (0.0036) [2024-06-12 17:09:40,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48332.9, 300 sec: 48263.4). Total num frames: 623198208. Throughput: 0: 48033.4. Samples: 152016400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 17:09:40,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:09:40,962][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000038038_623214592.pth... [2024-06-12 17:09:41,006][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000037330_611614720.pth [2024-06-12 17:09:43,158][71000] Updated weights for policy 0, policy_version 38044 (0.0025) [2024-06-12 17:09:45,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48062.6, 300 sec: 48152.3). Total num frames: 623411200. Throughput: 0: 47933.3. Samples: 152303620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 17:09:45,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 17:09:46,717][71000] Updated weights for policy 0, policy_version 38054 (0.0031) [2024-06-12 17:09:50,078][71000] Updated weights for policy 0, policy_version 38064 (0.0026) [2024-06-12 17:09:50,940][70768] Fps is (10 sec: 45875.0, 60 sec: 47786.6, 300 sec: 48207.9). Total num frames: 623656960. Throughput: 0: 47877.2. Samples: 152444820. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 17:09:50,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:09:52,678][70980] Signal inference workers to stop experience collection... (2300 times) [2024-06-12 17:09:52,732][71000] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-06-12 17:09:52,736][70980] Signal inference workers to resume experience collection... (2300 times) [2024-06-12 17:09:52,740][71000] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-06-12 17:09:53,237][71000] Updated weights for policy 0, policy_version 38074 (0.0027) [2024-06-12 17:09:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48059.6, 300 sec: 48207.8). Total num frames: 623902720. Throughput: 0: 48004.5. Samples: 152734980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 17:09:55,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:09:57,027][71000] Updated weights for policy 0, policy_version 38084 (0.0024) [2024-06-12 17:10:00,403][71000] Updated weights for policy 0, policy_version 38094 (0.0044) [2024-06-12 17:10:00,940][70768] Fps is (10 sec: 52428.1, 60 sec: 48336.2, 300 sec: 48263.4). Total num frames: 624181248. Throughput: 0: 48051.0. Samples: 153025400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 17:10:00,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:10:03,803][71000] Updated weights for policy 0, policy_version 38104 (0.0037) [2024-06-12 17:10:05,940][70768] Fps is (10 sec: 49153.0, 60 sec: 48059.7, 300 sec: 48207.8). Total num frames: 624394240. Throughput: 0: 48085.4. Samples: 153172320. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 17:10:05,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:10:07,089][71000] Updated weights for policy 0, policy_version 38114 (0.0024) [2024-06-12 17:10:10,758][71000] Updated weights for policy 0, policy_version 38124 (0.0035) [2024-06-12 17:10:10,939][70768] Fps is (10 sec: 44237.9, 60 sec: 47786.9, 300 sec: 48152.3). Total num frames: 624623616. Throughput: 0: 47958.3. Samples: 153459080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 17:10:10,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:10:13,740][71000] Updated weights for policy 0, policy_version 38134 (0.0027) [2024-06-12 17:10:15,940][70768] Fps is (10 sec: 47513.6, 60 sec: 47786.7, 300 sec: 48263.4). Total num frames: 624869376. Throughput: 0: 47934.2. Samples: 153740900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 17:10:15,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:10:17,371][71000] Updated weights for policy 0, policy_version 38144 (0.0031) [2024-06-12 17:10:20,520][71000] Updated weights for policy 0, policy_version 38154 (0.0024) [2024-06-12 17:10:20,940][70768] Fps is (10 sec: 50789.3, 60 sec: 48332.8, 300 sec: 48152.8). Total num frames: 625131520. Throughput: 0: 48145.3. Samples: 153894000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 17:10:20,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:10:24,304][71000] Updated weights for policy 0, policy_version 38164 (0.0036) [2024-06-12 17:10:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48332.7, 300 sec: 48207.8). Total num frames: 625360896. Throughput: 0: 48307.5. Samples: 154190240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-12 17:10:25,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:10:27,172][71000] Updated weights for policy 0, policy_version 38174 (0.0024) [2024-06-12 17:10:30,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48059.7, 300 sec: 48207.9). 
Total num frames: 625590272. Throughput: 0: 48474.7. Samples: 154484980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-12 17:10:30,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 17:10:31,019][71000] Updated weights for policy 0, policy_version 38184 (0.0036) [2024-06-12 17:10:33,906][71000] Updated weights for policy 0, policy_version 38194 (0.0023) [2024-06-12 17:10:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 48318.9). Total num frames: 625852416. Throughput: 0: 48424.4. Samples: 154623920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-12 17:10:35,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:10:37,622][71000] Updated weights for policy 0, policy_version 38204 (0.0028) [2024-06-12 17:10:40,546][71000] Updated weights for policy 0, policy_version 38214 (0.0031) [2024-06-12 17:10:40,940][70768] Fps is (10 sec: 52429.2, 60 sec: 48605.8, 300 sec: 48207.8). Total num frames: 626114560. Throughput: 0: 48557.9. Samples: 154920080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-12 17:10:40,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 17:10:44,522][71000] Updated weights for policy 0, policy_version 38224 (0.0032) [2024-06-12 17:10:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.1, 300 sec: 48318.9). Total num frames: 626343936. Throughput: 0: 48553.9. Samples: 155210320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 17:10:45,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:10:47,183][71000] Updated weights for policy 0, policy_version 38234 (0.0028) [2024-06-12 17:10:50,940][70768] Fps is (10 sec: 44236.0, 60 sec: 48332.6, 300 sec: 48152.3). Total num frames: 626556928. Throughput: 0: 48327.3. Samples: 155347060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 17:10:50,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:10:51,379][71000] Updated weights for policy 0, policy_version 38244 (0.0028) [2024-06-12 17:10:54,071][71000] Updated weights for policy 0, policy_version 38254 (0.0025) [2024-06-12 17:10:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.1, 300 sec: 48263.4). Total num frames: 626835456. Throughput: 0: 48303.4. Samples: 155632740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 17:10:55,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 17:10:58,306][71000] Updated weights for policy 0, policy_version 38264 (0.0027) [2024-06-12 17:10:58,463][70980] Signal inference workers to stop experience collection... (2350 times) [2024-06-12 17:10:58,514][71000] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-06-12 17:10:58,522][70980] Signal inference workers to resume experience collection... (2350 times) [2024-06-12 17:10:58,525][71000] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-06-12 17:11:00,714][71000] Updated weights for policy 0, policy_version 38274 (0.0034) [2024-06-12 17:11:00,940][70768] Fps is (10 sec: 54067.7, 60 sec: 48605.9, 300 sec: 48263.4). Total num frames: 627097600. Throughput: 0: 48626.1. Samples: 155929080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 17:11:00,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:11:04,934][71000] Updated weights for policy 0, policy_version 38284 (0.0029) [2024-06-12 17:11:05,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48332.8, 300 sec: 48207.8). Total num frames: 627294208. Throughput: 0: 48525.5. Samples: 156077640. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 17:11:05,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:11:07,379][71000] Updated weights for policy 0, policy_version 38294 (0.0026) [2024-06-12 17:11:10,940][70768] Fps is (10 sec: 42598.1, 60 sec: 48332.6, 300 sec: 48152.3). Total num frames: 627523584. Throughput: 0: 48395.8. Samples: 156368060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 17:11:10,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:11:11,631][71000] Updated weights for policy 0, policy_version 38304 (0.0030) [2024-06-12 17:11:14,190][71000] Updated weights for policy 0, policy_version 38314 (0.0037) [2024-06-12 17:11:15,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48605.8, 300 sec: 48152.3). Total num frames: 627785728. Throughput: 0: 48417.3. Samples: 156663760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 17:11:15,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 17:11:18,189][71000] Updated weights for policy 0, policy_version 38324 (0.0030) [2024-06-12 17:11:20,940][70768] Fps is (10 sec: 52429.6, 60 sec: 48605.9, 300 sec: 48207.9). Total num frames: 628047872. Throughput: 0: 48677.8. Samples: 156814420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 17:11:20,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:11:20,969][71000] Updated weights for policy 0, policy_version 38334 (0.0036) [2024-06-12 17:11:25,145][71000] Updated weights for policy 0, policy_version 38344 (0.0030) [2024-06-12 17:11:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.8, 300 sec: 48318.9). Total num frames: 628277248. Throughput: 0: 48662.6. Samples: 157109900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 17:11:25,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:11:27,578][71000] Updated weights for policy 0, policy_version 38354 (0.0036) [2024-06-12 17:11:30,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48332.8, 300 sec: 48263.4). 
Total num frames: 628490240. Throughput: 0: 48577.7. Samples: 157396320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 17:11:30,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:11:31,786][71000] Updated weights for policy 0, policy_version 38364 (0.0031) [2024-06-12 17:11:34,193][71000] Updated weights for policy 0, policy_version 38374 (0.0029) [2024-06-12 17:11:35,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.8, 300 sec: 48263.4). Total num frames: 628785152. Throughput: 0: 48774.3. Samples: 157541900. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-12 17:11:35,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:11:38,583][71000] Updated weights for policy 0, policy_version 38384 (0.0031) [2024-06-12 17:11:40,940][70768] Fps is (10 sec: 54066.8, 60 sec: 48605.8, 300 sec: 48263.4). Total num frames: 629030912. Throughput: 0: 48919.0. Samples: 157834100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-12 17:11:40,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 17:11:40,958][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000038393_629030912.pth... [2024-06-12 17:11:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000037685_617431040.pth [2024-06-12 17:11:41,411][71000] Updated weights for policy 0, policy_version 38394 (0.0025) [2024-06-12 17:11:45,412][71000] Updated weights for policy 0, policy_version 38404 (0.0039) [2024-06-12 17:11:45,940][70768] Fps is (10 sec: 44237.3, 60 sec: 48059.7, 300 sec: 48263.4). Total num frames: 629227520. Throughput: 0: 48680.1. Samples: 158119680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-12 17:11:45,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:11:48,241][71000] Updated weights for policy 0, policy_version 38414 (0.0032) [2024-06-12 17:11:50,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48605.9, 300 sec: 48263.4). Total num frames: 629473280. Throughput: 0: 48410.0. 
Samples: 158256100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-12 17:11:50,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 17:11:52,370][71000] Updated weights for policy 0, policy_version 38424 (0.0030) [2024-06-12 17:11:54,882][71000] Updated weights for policy 0, policy_version 38434 (0.0029) [2024-06-12 17:11:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48332.8, 300 sec: 48263.4). Total num frames: 629735424. Throughput: 0: 48344.2. Samples: 158543540. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-12 17:11:55,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:11:58,821][70980] Signal inference workers to stop experience collection... (2400 times) [2024-06-12 17:11:58,823][70980] Signal inference workers to resume experience collection... (2400 times) [2024-06-12 17:11:58,861][71000] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-06-12 17:11:58,861][71000] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-06-12 17:11:59,159][71000] Updated weights for policy 0, policy_version 38444 (0.0027) [2024-06-12 17:12:00,940][70768] Fps is (10 sec: 50791.3, 60 sec: 48059.8, 300 sec: 48263.4). Total num frames: 629981184. Throughput: 0: 48293.5. Samples: 158836960. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-12 17:12:00,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:12:01,883][71000] Updated weights for policy 0, policy_version 38454 (0.0028) [2024-06-12 17:12:05,838][71000] Updated weights for policy 0, policy_version 38464 (0.0021) [2024-06-12 17:12:05,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48332.8, 300 sec: 48263.4). Total num frames: 630194176. Throughput: 0: 48036.0. Samples: 158976040. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-12 17:12:05,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 17:12:08,531][71000] Updated weights for policy 0, policy_version 38474 (0.0046) [2024-06-12 17:12:10,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48605.9, 300 sec: 48263.4). Total num frames: 630439936. Throughput: 0: 47883.9. Samples: 159264680. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-12 17:12:10,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:12:12,990][71000] Updated weights for policy 0, policy_version 38484 (0.0040) [2024-06-12 17:12:15,423][71000] Updated weights for policy 0, policy_version 38494 (0.0030) [2024-06-12 17:12:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.9, 300 sec: 48318.9). Total num frames: 630702080. Throughput: 0: 47881.3. Samples: 159550980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-12 17:12:15,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:12:19,989][71000] Updated weights for policy 0, policy_version 38504 (0.0031) [2024-06-12 17:12:20,939][70768] Fps is (10 sec: 49152.9, 60 sec: 48059.8, 300 sec: 48263.4). Total num frames: 630931456. Throughput: 0: 47933.1. Samples: 159698880. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-12 17:12:20,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 17:12:22,331][71000] Updated weights for policy 0, policy_version 38514 (0.0035) [2024-06-12 17:12:25,939][70768] Fps is (10 sec: 40960.5, 60 sec: 47240.6, 300 sec: 48096.8). Total num frames: 631111680. Throughput: 0: 47675.3. Samples: 159979480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 17:12:25,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:12:26,713][71000] Updated weights for policy 0, policy_version 38524 (0.0024) [2024-06-12 17:12:29,123][71000] Updated weights for policy 0, policy_version 38534 (0.0034) [2024-06-12 17:12:30,940][70768] Fps is (10 sec: 45874.4, 60 sec: 48332.8, 300 sec: 48152.3). 
Total num frames: 631390208. Throughput: 0: 47723.0. Samples: 160267220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 17:12:30,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:12:33,559][71000] Updated weights for policy 0, policy_version 38544 (0.0032) [2024-06-12 17:12:35,939][70768] Fps is (10 sec: 54067.0, 60 sec: 47786.8, 300 sec: 48207.8). Total num frames: 631652352. Throughput: 0: 47882.0. Samples: 160410780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 17:12:35,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:12:35,992][71000] Updated weights for policy 0, policy_version 38554 (0.0025) [2024-06-12 17:12:40,518][71000] Updated weights for policy 0, policy_version 38564 (0.0039) [2024-06-12 17:12:40,940][70768] Fps is (10 sec: 45872.6, 60 sec: 46967.0, 300 sec: 48096.7). Total num frames: 631848960. Throughput: 0: 47885.1. Samples: 160698400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 17:12:40,941][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:12:42,786][71000] Updated weights for policy 0, policy_version 38574 (0.0021) [2024-06-12 17:12:45,940][70768] Fps is (10 sec: 44236.5, 60 sec: 47786.6, 300 sec: 48152.3). Total num frames: 632094720. Throughput: 0: 47848.0. Samples: 160990120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 17:12:45,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:12:47,122][71000] Updated weights for policy 0, policy_version 38584 (0.0029) [2024-06-12 17:12:49,646][71000] Updated weights for policy 0, policy_version 38594 (0.0029) [2024-06-12 17:12:50,939][70768] Fps is (10 sec: 52432.7, 60 sec: 48333.0, 300 sec: 48263.4). Total num frames: 632373248. Throughput: 0: 47953.0. Samples: 161133920. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-12 17:12:50,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:12:53,855][71000] Updated weights for policy 0, policy_version 38604 (0.0028) [2024-06-12 17:12:54,102][70980] Signal inference workers to stop experience collection... (2450 times) [2024-06-12 17:12:54,106][70980] Signal inference workers to resume experience collection... (2450 times) [2024-06-12 17:12:54,112][71000] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-06-12 17:12:54,125][71000] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-06-12 17:12:55,940][70768] Fps is (10 sec: 54067.2, 60 sec: 48332.8, 300 sec: 48263.4). Total num frames: 632635392. Throughput: 0: 48211.6. Samples: 161434200. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-12 17:12:55,940][70768] Avg episode reward: [(0, '0.217')] [2024-06-12 17:12:56,331][71000] Updated weights for policy 0, policy_version 38614 (0.0027) [2024-06-12 17:13:00,462][71000] Updated weights for policy 0, policy_version 38624 (0.0034) [2024-06-12 17:13:00,940][70768] Fps is (10 sec: 45874.7, 60 sec: 47513.6, 300 sec: 48152.3). Total num frames: 632832000. Throughput: 0: 48209.3. Samples: 161720400. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-12 17:13:00,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:13:03,010][71000] Updated weights for policy 0, policy_version 38634 (0.0023) [2024-06-12 17:13:05,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48059.6, 300 sec: 48152.3). Total num frames: 633077760. Throughput: 0: 47903.8. Samples: 161854560. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-12 17:13:05,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:13:07,213][71000] Updated weights for policy 0, policy_version 38644 (0.0027) [2024-06-12 17:13:09,897][71000] Updated weights for policy 0, policy_version 38654 (0.0033) [2024-06-12 17:13:10,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48059.8, 300 sec: 48208.3). Total num frames: 633323520. Throughput: 0: 48127.0. Samples: 162145200. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-12 17:13:10,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:13:14,051][71000] Updated weights for policy 0, policy_version 38664 (0.0047) [2024-06-12 17:13:15,940][70768] Fps is (10 sec: 49152.3, 60 sec: 47786.7, 300 sec: 48152.3). Total num frames: 633569280. Throughput: 0: 48272.1. Samples: 162439460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 17:13:15,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 17:13:16,081][70980] Saving new best policy, reward=0.256! [2024-06-12 17:13:16,706][71000] Updated weights for policy 0, policy_version 38674 (0.0020) [2024-06-12 17:13:20,801][71000] Updated weights for policy 0, policy_version 38684 (0.0031) [2024-06-12 17:13:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 47786.6, 300 sec: 48207.8). Total num frames: 633798656. Throughput: 0: 48297.7. Samples: 162584180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 17:13:20,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:13:23,575][71000] Updated weights for policy 0, policy_version 38694 (0.0023) [2024-06-12 17:13:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 48152.3). Total num frames: 634044416. Throughput: 0: 48382.0. Samples: 162875560. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 17:13:25,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:13:27,827][71000] Updated weights for policy 0, policy_version 38704 (0.0029) [2024-06-12 17:13:30,445][71000] Updated weights for policy 0, policy_version 38714 (0.0023) [2024-06-12 17:13:30,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48605.9, 300 sec: 48263.4). Total num frames: 634306560. Throughput: 0: 48241.2. Samples: 163160980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 17:13:30,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:13:34,274][71000] Updated weights for policy 0, policy_version 38724 (0.0032) [2024-06-12 17:13:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48059.7, 300 sec: 48263.4). Total num frames: 634535936. Throughput: 0: 48412.3. Samples: 163312480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 17:13:35,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:13:37,028][71000] Updated weights for policy 0, policy_version 38734 (0.0027) [2024-06-12 17:13:40,940][70768] Fps is (10 sec: 44237.1, 60 sec: 48333.3, 300 sec: 48208.5). Total num frames: 634748928. Throughput: 0: 48161.3. Samples: 163601460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:13:40,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:13:40,985][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000038743_634765312.pth... [2024-06-12 17:13:41,025][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000038038_623214592.pth [2024-06-12 17:13:41,153][71000] Updated weights for policy 0, policy_version 38744 (0.0038) [2024-06-12 17:13:43,916][71000] Updated weights for policy 0, policy_version 38754 (0.0021) [2024-06-12 17:13:45,941][70768] Fps is (10 sec: 47504.8, 60 sec: 48604.4, 300 sec: 48207.5). Total num frames: 635011072. Throughput: 0: 48200.2. Samples: 163889500. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:13:45,942][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:13:47,824][71000] Updated weights for policy 0, policy_version 38764 (0.0036) [2024-06-12 17:13:50,666][71000] Updated weights for policy 0, policy_version 38774 (0.0030) [2024-06-12 17:13:50,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48332.7, 300 sec: 48318.9). Total num frames: 635273216. Throughput: 0: 48426.7. Samples: 164033760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:13:50,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:13:54,740][71000] Updated weights for policy 0, policy_version 38784 (0.0027) [2024-06-12 17:13:55,940][70768] Fps is (10 sec: 50800.0, 60 sec: 48059.8, 300 sec: 48264.1). Total num frames: 635518976. Throughput: 0: 48431.6. Samples: 164324620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:13:55,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:13:57,625][71000] Updated weights for policy 0, policy_version 38794 (0.0032) [2024-06-12 17:14:00,940][70768] Fps is (10 sec: 44237.2, 60 sec: 48059.8, 300 sec: 48152.3). Total num frames: 635715584. Throughput: 0: 48390.3. Samples: 164617020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:14:00,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 17:14:01,054][70980] Saving new best policy, reward=0.258! [2024-06-12 17:14:01,385][71000] Updated weights for policy 0, policy_version 38804 (0.0023) [2024-06-12 17:14:04,301][71000] Updated weights for policy 0, policy_version 38814 (0.0035) [2024-06-12 17:14:05,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48332.9, 300 sec: 48207.9). Total num frames: 635977728. Throughput: 0: 48309.8. Samples: 164758120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 17:14:05,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:14:06,305][70980] Signal inference workers to stop experience collection... 
(2500 times) [2024-06-12 17:14:06,306][70980] Signal inference workers to resume experience collection... (2500 times) [2024-06-12 17:14:06,321][71000] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-06-12 17:14:06,321][71000] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-06-12 17:14:08,562][71000] Updated weights for policy 0, policy_version 38824 (0.0035) [2024-06-12 17:14:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48332.8, 300 sec: 48207.8). Total num frames: 636223488. Throughput: 0: 48092.4. Samples: 165039720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 17:14:10,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 17:14:11,515][71000] Updated weights for policy 0, policy_version 38834 (0.0024) [2024-06-12 17:14:15,367][71000] Updated weights for policy 0, policy_version 38844 (0.0033) [2024-06-12 17:14:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48332.8, 300 sec: 48263.4). Total num frames: 636469248. Throughput: 0: 48203.1. Samples: 165330120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 17:14:15,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:14:18,277][71000] Updated weights for policy 0, policy_version 38854 (0.0034) [2024-06-12 17:14:20,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48059.8, 300 sec: 48207.8). Total num frames: 636682240. Throughput: 0: 47826.7. Samples: 165464680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 17:14:20,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 17:14:22,062][71000] Updated weights for policy 0, policy_version 38864 (0.0030) [2024-06-12 17:14:25,318][71000] Updated weights for policy 0, policy_version 38874 (0.0032) [2024-06-12 17:14:25,939][70768] Fps is (10 sec: 45875.9, 60 sec: 48059.8, 300 sec: 48207.8). Total num frames: 636928000. Throughput: 0: 47881.4. Samples: 165756120. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-12 17:14:25,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:14:28,966][71000] Updated weights for policy 0, policy_version 38884 (0.0027) [2024-06-12 17:14:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 47513.7, 300 sec: 48207.8). Total num frames: 637157376. Throughput: 0: 47702.9. Samples: 166036040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-12 17:14:30,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:14:32,014][71000] Updated weights for policy 0, policy_version 38894 (0.0037) [2024-06-12 17:14:35,682][71000] Updated weights for policy 0, policy_version 38904 (0.0037) [2024-06-12 17:14:35,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48059.7, 300 sec: 48207.8). Total num frames: 637419520. Throughput: 0: 47855.2. Samples: 166187240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-12 17:14:35,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:14:38,937][71000] Updated weights for policy 0, policy_version 38914 (0.0032) [2024-06-12 17:14:40,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48605.9, 300 sec: 48318.9). Total num frames: 637665280. Throughput: 0: 47802.1. Samples: 166475720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-12 17:14:40,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 17:14:42,427][71000] Updated weights for policy 0, policy_version 38924 (0.0024) [2024-06-12 17:14:45,653][71000] Updated weights for policy 0, policy_version 38934 (0.0035) [2024-06-12 17:14:45,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48061.1, 300 sec: 48263.4). Total num frames: 637894656. Throughput: 0: 47795.8. Samples: 166767840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-12 17:14:45,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:14:49,551][71000] Updated weights for policy 0, policy_version 38944 (0.0028) [2024-06-12 17:14:50,939][70768] Fps is (10 sec: 45875.6, 60 sec: 47513.7, 300 sec: 48207.9). 
Total num frames: 638124032. Throughput: 0: 47757.8. Samples: 166907220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-12 17:14:50,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:14:52,826][71000] Updated weights for policy 0, policy_version 38954 (0.0031) [2024-06-12 17:14:55,940][70768] Fps is (10 sec: 45875.6, 60 sec: 47240.5, 300 sec: 48041.2). Total num frames: 638353408. Throughput: 0: 47796.9. Samples: 167190580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-12 17:14:55,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:14:56,278][71000] Updated weights for policy 0, policy_version 38964 (0.0028) [2024-06-12 17:14:59,578][71000] Updated weights for policy 0, policy_version 38974 (0.0029) [2024-06-12 17:15:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48059.7, 300 sec: 48152.3). Total num frames: 638599168. Throughput: 0: 47640.5. Samples: 167473940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-12 17:15:00,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 17:15:01,136][70980] Signal inference workers to stop experience collection... (2550 times) [2024-06-12 17:15:01,164][71000] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-06-12 17:15:01,189][70980] Signal inference workers to resume experience collection... (2550 times) [2024-06-12 17:15:01,189][71000] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-06-12 17:15:02,871][71000] Updated weights for policy 0, policy_version 38984 (0.0033) [2024-06-12 17:15:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 47786.6, 300 sec: 48207.8). Total num frames: 638844928. Throughput: 0: 48008.7. Samples: 167625080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-12 17:15:05,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:15:06,602][71000] Updated weights for policy 0, policy_version 38994 (0.0030) [2024-06-12 17:15:09,889][71000] Updated weights for policy 0, policy_version 39004 (0.0035) [2024-06-12 17:15:10,943][70768] Fps is (10 sec: 49136.8, 60 sec: 47784.2, 300 sec: 48207.3). Total num frames: 639090688. Throughput: 0: 47853.9. Samples: 167909700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-12 17:15:10,943][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:15:13,172][71000] Updated weights for policy 0, policy_version 39014 (0.0037) [2024-06-12 17:15:15,940][70768] Fps is (10 sec: 47513.3, 60 sec: 47513.5, 300 sec: 48096.7). Total num frames: 639320064. Throughput: 0: 48038.5. Samples: 168197780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-12 17:15:15,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:15:16,407][71000] Updated weights for policy 0, policy_version 39024 (0.0033) [2024-06-12 17:15:20,374][71000] Updated weights for policy 0, policy_version 39034 (0.0029) [2024-06-12 17:15:20,939][70768] Fps is (10 sec: 45889.7, 60 sec: 47786.7, 300 sec: 48096.8). Total num frames: 639549440. Throughput: 0: 47738.7. Samples: 168335480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 17:15:20,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:15:23,082][71000] Updated weights for policy 0, policy_version 39044 (0.0030) [2024-06-12 17:15:25,940][70768] Fps is (10 sec: 47514.1, 60 sec: 47786.5, 300 sec: 48152.3). Total num frames: 639795200. Throughput: 0: 47847.5. Samples: 168628860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 17:15:25,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 17:15:27,241][71000] Updated weights for policy 0, policy_version 39054 (0.0025) [2024-06-12 17:15:30,265][71000] Updated weights for policy 0, policy_version 39064 (0.0031) [2024-06-12 17:15:30,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48059.8, 300 sec: 48096.8). Total num frames: 640040960. Throughput: 0: 47623.7. Samples: 168910900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 17:15:30,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:15:34,337][71000] Updated weights for policy 0, policy_version 39074 (0.0034) [2024-06-12 17:15:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 47786.7, 300 sec: 48041.2). Total num frames: 640286720. Throughput: 0: 47899.1. Samples: 169062680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 17:15:35,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:15:36,943][71000] Updated weights for policy 0, policy_version 39084 (0.0028) [2024-06-12 17:15:40,758][71000] Updated weights for policy 0, policy_version 39094 (0.0034) [2024-06-12 17:15:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 47786.7, 300 sec: 48096.8). Total num frames: 640532480. Throughput: 0: 48024.9. Samples: 169351700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 17:15:40,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:15:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000039095_640532480.pth... [2024-06-12 17:15:40,986][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000038393_629030912.pth [2024-06-12 17:15:43,712][71000] Updated weights for policy 0, policy_version 39104 (0.0032) [2024-06-12 17:15:45,940][70768] Fps is (10 sec: 47513.8, 60 sec: 47786.8, 300 sec: 48152.3). Total num frames: 640761856. Throughput: 0: 47937.8. Samples: 169631140. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-12 17:15:45,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:15:47,795][71000] Updated weights for policy 0, policy_version 39114 (0.0023) [2024-06-12 17:15:50,244][71000] Updated weights for policy 0, policy_version 39124 (0.0032) [2024-06-12 17:15:50,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48332.7, 300 sec: 48096.8). Total num frames: 641024000. Throughput: 0: 47920.1. Samples: 169781480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-12 17:15:50,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:15:54,611][71000] Updated weights for policy 0, policy_version 39134 (0.0030) [2024-06-12 17:15:55,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48332.8, 300 sec: 47985.7). Total num frames: 641253376. Throughput: 0: 48122.8. Samples: 170075080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-12 17:15:55,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:15:57,143][71000] Updated weights for policy 0, policy_version 39144 (0.0028) [2024-06-12 17:16:00,940][70768] Fps is (10 sec: 44237.0, 60 sec: 47786.7, 300 sec: 48041.2). Total num frames: 641466368. Throughput: 0: 48290.4. Samples: 170370840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-12 17:16:00,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:16:01,355][71000] Updated weights for policy 0, policy_version 39154 (0.0026) [2024-06-12 17:16:03,967][71000] Updated weights for policy 0, policy_version 39164 (0.0027) [2024-06-12 17:16:05,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48332.9, 300 sec: 48207.9). Total num frames: 641744896. Throughput: 0: 48276.0. Samples: 170507900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-12 17:16:05,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 17:16:08,147][71000] Updated weights for policy 0, policy_version 39174 (0.0028) [2024-06-12 17:16:09,786][70980] Signal inference workers to stop experience collection... 
(2600 times) [2024-06-12 17:16:09,838][71000] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-06-12 17:16:09,894][70980] Signal inference workers to resume experience collection... (2600 times) [2024-06-12 17:16:09,894][71000] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-06-12 17:16:10,939][70768] Fps is (10 sec: 50790.8, 60 sec: 48062.3, 300 sec: 48096.8). Total num frames: 641974272. Throughput: 0: 48064.2. Samples: 170791740. Policy #0 lag: (min: 3.0, avg: 10.9, max: 24.0) [2024-06-12 17:16:10,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:16:10,999][71000] Updated weights for policy 0, policy_version 39184 (0.0028) [2024-06-12 17:16:15,122][71000] Updated weights for policy 0, policy_version 39194 (0.0039) [2024-06-12 17:16:15,939][70768] Fps is (10 sec: 45875.4, 60 sec: 48059.9, 300 sec: 47985.7). Total num frames: 642203648. Throughput: 0: 48390.2. Samples: 171088460. Policy #0 lag: (min: 3.0, avg: 10.9, max: 24.0) [2024-06-12 17:16:15,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:16:17,675][71000] Updated weights for policy 0, policy_version 39204 (0.0027) [2024-06-12 17:16:20,940][70768] Fps is (10 sec: 44236.0, 60 sec: 47786.6, 300 sec: 47930.1). Total num frames: 642416640. Throughput: 0: 47975.9. Samples: 171221600. Policy #0 lag: (min: 3.0, avg: 10.9, max: 24.0) [2024-06-12 17:16:20,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:16:21,762][71000] Updated weights for policy 0, policy_version 39214 (0.0034) [2024-06-12 17:16:24,254][71000] Updated weights for policy 0, policy_version 39224 (0.0031) [2024-06-12 17:16:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48332.9, 300 sec: 48152.3). Total num frames: 642695168. Throughput: 0: 47924.4. Samples: 171508300. 
Policy #0 lag: (min: 3.0, avg: 10.9, max: 24.0) [2024-06-12 17:16:25,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:16:28,825][71000] Updated weights for policy 0, policy_version 39234 (0.0030) [2024-06-12 17:16:30,940][70768] Fps is (10 sec: 52428.4, 60 sec: 48332.6, 300 sec: 47985.7). Total num frames: 642940928. Throughput: 0: 48202.0. Samples: 171800240. Policy #0 lag: (min: 3.0, avg: 10.9, max: 24.0) [2024-06-12 17:16:30,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:16:31,273][71000] Updated weights for policy 0, policy_version 39244 (0.0037) [2024-06-12 17:16:35,386][71000] Updated weights for policy 0, policy_version 39254 (0.0032) [2024-06-12 17:16:35,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48059.8, 300 sec: 47930.2). Total num frames: 643170304. Throughput: 0: 48217.5. Samples: 171951260. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-12 17:16:35,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:16:37,996][71000] Updated weights for policy 0, policy_version 39264 (0.0024) [2024-06-12 17:16:40,940][70768] Fps is (10 sec: 45876.1, 60 sec: 47786.7, 300 sec: 48041.2). Total num frames: 643399680. Throughput: 0: 47980.6. Samples: 172234200. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-12 17:16:40,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:16:42,187][71000] Updated weights for policy 0, policy_version 39274 (0.0033) [2024-06-12 17:16:44,941][71000] Updated weights for policy 0, policy_version 39284 (0.0033) [2024-06-12 17:16:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48332.8, 300 sec: 48096.8). Total num frames: 643661824. Throughput: 0: 47658.7. Samples: 172515480. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-12 17:16:45,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:16:49,223][71000] Updated weights for policy 0, policy_version 39294 (0.0028) [2024-06-12 17:16:50,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48059.7, 300 sec: 48041.2). 
Total num frames: 643907584. Throughput: 0: 48133.2. Samples: 172673900. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-12 17:16:50,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:16:51,611][71000] Updated weights for policy 0, policy_version 39304 (0.0031) [2024-06-12 17:16:55,850][71000] Updated weights for policy 0, policy_version 39314 (0.0033) [2024-06-12 17:16:55,940][70768] Fps is (10 sec: 45874.7, 60 sec: 47786.6, 300 sec: 47930.1). Total num frames: 644120576. Throughput: 0: 48150.9. Samples: 172958540. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-12 17:16:55,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:16:58,298][71000] Updated weights for policy 0, policy_version 39324 (0.0033) [2024-06-12 17:17:00,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48332.8, 300 sec: 48041.2). Total num frames: 644366336. Throughput: 0: 48046.6. Samples: 173250560. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-12 17:17:00,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 17:17:02,389][71000] Updated weights for policy 0, policy_version 39334 (0.0028) [2024-06-12 17:17:05,212][71000] Updated weights for policy 0, policy_version 39344 (0.0029) [2024-06-12 17:17:05,940][70768] Fps is (10 sec: 52429.1, 60 sec: 48332.8, 300 sec: 48152.3). Total num frames: 644644864. Throughput: 0: 48250.3. Samples: 173392860. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-12 17:17:05,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:17:09,421][71000] Updated weights for policy 0, policy_version 39354 (0.0035) [2024-06-12 17:17:10,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48332.7, 300 sec: 48041.2). Total num frames: 644874240. Throughput: 0: 48386.2. Samples: 173685680. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-12 17:17:10,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:17:12,085][71000] Updated weights for policy 0, policy_version 39364 (0.0024) [2024-06-12 17:17:15,940][70768] Fps is (10 sec: 42598.5, 60 sec: 47786.6, 300 sec: 47930.1). Total num frames: 645070848. Throughput: 0: 48265.9. Samples: 173972200. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-12 17:17:15,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:17:16,467][71000] Updated weights for policy 0, policy_version 39374 (0.0033) [2024-06-12 17:17:19,102][71000] Updated weights for policy 0, policy_version 39384 (0.0028) [2024-06-12 17:17:19,678][70980] Signal inference workers to stop experience collection... (2650 times) [2024-06-12 17:17:19,723][71000] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-06-12 17:17:19,732][70980] Signal inference workers to resume experience collection... (2650 times) [2024-06-12 17:17:19,738][71000] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-06-12 17:17:20,940][70768] Fps is (10 sec: 44236.4, 60 sec: 48332.8, 300 sec: 48152.3). Total num frames: 645316608. Throughput: 0: 47790.1. Samples: 174101820. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-12 17:17:20,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:17:23,030][71000] Updated weights for policy 0, policy_version 39394 (0.0040) [2024-06-12 17:17:25,674][71000] Updated weights for policy 0, policy_version 39404 (0.0039) [2024-06-12 17:17:25,939][70768] Fps is (10 sec: 52429.5, 60 sec: 48332.9, 300 sec: 48152.3). Total num frames: 645595136. Throughput: 0: 48050.3. Samples: 174396460. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 17:17:25,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:17:29,595][71000] Updated weights for policy 0, policy_version 39414 (0.0036) [2024-06-12 17:17:30,940][70768] Fps is (10 sec: 49151.8, 60 sec: 47786.7, 300 sec: 47985.7). Total num frames: 645808128. Throughput: 0: 48238.5. Samples: 174686220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 17:17:30,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:17:32,701][71000] Updated weights for policy 0, policy_version 39424 (0.0033) [2024-06-12 17:17:35,939][70768] Fps is (10 sec: 44236.6, 60 sec: 47786.7, 300 sec: 48096.9). Total num frames: 646037504. Throughput: 0: 47854.8. Samples: 174827360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 17:17:35,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 17:17:36,558][71000] Updated weights for policy 0, policy_version 39434 (0.0038) [2024-06-12 17:17:39,348][71000] Updated weights for policy 0, policy_version 39444 (0.0029) [2024-06-12 17:17:40,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48332.7, 300 sec: 48152.3). Total num frames: 646299648. Throughput: 0: 48013.3. Samples: 175119140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 17:17:40,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:17:40,946][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000039447_646299648.pth... [2024-06-12 17:17:40,997][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000038743_634765312.pth [2024-06-12 17:17:43,401][71000] Updated weights for policy 0, policy_version 39454 (0.0038) [2024-06-12 17:17:45,940][70768] Fps is (10 sec: 52428.3, 60 sec: 48332.8, 300 sec: 48096.7). Total num frames: 646561792. Throughput: 0: 47881.3. Samples: 175405220. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 17:17:45,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:17:46,325][71000] Updated weights for policy 0, policy_version 39464 (0.0031) [2024-06-12 17:17:50,113][71000] Updated weights for policy 0, policy_version 39474 (0.0033) [2024-06-12 17:17:50,940][70768] Fps is (10 sec: 47513.7, 60 sec: 47786.7, 300 sec: 47930.1). Total num frames: 646774784. Throughput: 0: 48059.9. Samples: 175555560. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-12 17:17:50,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:17:52,998][71000] Updated weights for policy 0, policy_version 39484 (0.0025) [2024-06-12 17:17:55,940][70768] Fps is (10 sec: 42598.1, 60 sec: 47786.7, 300 sec: 47985.7). Total num frames: 646987776. Throughput: 0: 47661.7. Samples: 175830460. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-12 17:17:55,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:17:57,194][71000] Updated weights for policy 0, policy_version 39494 (0.0028) [2024-06-12 17:18:00,272][71000] Updated weights for policy 0, policy_version 39504 (0.0025) [2024-06-12 17:18:00,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48332.7, 300 sec: 48096.8). Total num frames: 647266304. Throughput: 0: 47780.8. Samples: 176122340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-12 17:18:00,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 17:18:00,948][70980] Saving new best policy, reward=0.264! [2024-06-12 17:18:03,802][71000] Updated weights for policy 0, policy_version 39514 (0.0028) [2024-06-12 17:18:05,939][70768] Fps is (10 sec: 52429.6, 60 sec: 47786.8, 300 sec: 48096.8). Total num frames: 647512064. Throughput: 0: 48263.3. Samples: 176273660. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-12 17:18:05,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:18:06,832][71000] Updated weights for policy 0, policy_version 39524 (0.0024) [2024-06-12 17:18:10,939][70768] Fps is (10 sec: 44237.4, 60 sec: 47240.6, 300 sec: 47930.2). Total num frames: 647708672. Throughput: 0: 48135.0. Samples: 176562540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-12 17:18:10,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:18:10,990][71000] Updated weights for policy 0, policy_version 39534 (0.0024) [2024-06-12 17:18:13,892][71000] Updated weights for policy 0, policy_version 39544 (0.0027) [2024-06-12 17:18:15,940][70768] Fps is (10 sec: 44235.8, 60 sec: 48059.6, 300 sec: 47985.7). Total num frames: 647954432. Throughput: 0: 47990.2. Samples: 176845780. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-12 17:18:15,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:18:17,753][71000] Updated weights for policy 0, policy_version 39554 (0.0029) [2024-06-12 17:18:20,811][71000] Updated weights for policy 0, policy_version 39564 (0.0031) [2024-06-12 17:18:20,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48332.8, 300 sec: 48041.2). Total num frames: 648216576. Throughput: 0: 47980.8. Samples: 176986500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-12 17:18:20,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:18:24,263][71000] Updated weights for policy 0, policy_version 39574 (0.0036) [2024-06-12 17:18:25,940][70768] Fps is (10 sec: 49153.0, 60 sec: 47513.5, 300 sec: 47930.2). Total num frames: 648445952. Throughput: 0: 48051.7. Samples: 177281460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-12 17:18:25,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:18:27,378][71000] Updated weights for policy 0, policy_version 39584 (0.0031) [2024-06-12 17:18:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48059.8, 300 sec: 47985.7). 
Total num frames: 648691712. Throughput: 0: 48138.3. Samples: 177571440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-12 17:18:30,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:18:31,198][71000] Updated weights for policy 0, policy_version 39594 (0.0026) [2024-06-12 17:18:33,986][71000] Updated weights for policy 0, policy_version 39604 (0.0034) [2024-06-12 17:18:35,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48059.7, 300 sec: 48041.2). Total num frames: 648921088. Throughput: 0: 47854.4. Samples: 177709000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-12 17:18:35,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:18:37,179][70980] Signal inference workers to stop experience collection... (2700 times) [2024-06-12 17:18:37,179][70980] Signal inference workers to resume experience collection... (2700 times) [2024-06-12 17:18:37,197][71000] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-06-12 17:18:37,197][71000] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-06-12 17:18:38,200][71000] Updated weights for policy 0, policy_version 39614 (0.0036) [2024-06-12 17:18:40,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48059.8, 300 sec: 48041.5). Total num frames: 649183232. Throughput: 0: 48047.7. Samples: 177992600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 17:18:40,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:18:41,043][71000] Updated weights for policy 0, policy_version 39624 (0.0029) [2024-06-12 17:18:45,200][71000] Updated weights for policy 0, policy_version 39634 (0.0033) [2024-06-12 17:18:45,940][70768] Fps is (10 sec: 49151.4, 60 sec: 47513.6, 300 sec: 47930.1). Total num frames: 649412608. Throughput: 0: 47995.2. Samples: 178282120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 17:18:45,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:18:48,048][71000] Updated weights for policy 0, policy_version 39644 (0.0036) [2024-06-12 17:18:50,940][70768] Fps is (10 sec: 44236.9, 60 sec: 47513.7, 300 sec: 47819.1). Total num frames: 649625600. Throughput: 0: 47635.1. Samples: 178417240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 17:18:50,940][70768] Avg episode reward: [(0, '0.223')] [2024-06-12 17:18:52,096][71000] Updated weights for policy 0, policy_version 39654 (0.0031) [2024-06-12 17:18:54,990][71000] Updated weights for policy 0, policy_version 39664 (0.0029) [2024-06-12 17:18:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.8, 300 sec: 48041.2). Total num frames: 649887744. Throughput: 0: 47600.3. Samples: 178704560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 17:18:55,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:18:58,753][71000] Updated weights for policy 0, policy_version 39674 (0.0041) [2024-06-12 17:19:00,939][70768] Fps is (10 sec: 49152.2, 60 sec: 47513.7, 300 sec: 47930.2). Total num frames: 650117120. Throughput: 0: 47679.3. Samples: 178991340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 17:19:00,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:19:01,941][71000] Updated weights for policy 0, policy_version 39684 (0.0034) [2024-06-12 17:19:05,541][71000] Updated weights for policy 0, policy_version 39694 (0.0025) [2024-06-12 17:19:05,939][70768] Fps is (10 sec: 45875.9, 60 sec: 47240.6, 300 sec: 47874.6). Total num frames: 650346496. Throughput: 0: 47737.9. Samples: 179134700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:19:05,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:19:08,851][71000] Updated weights for policy 0, policy_version 39704 (0.0031) [2024-06-12 17:19:10,940][70768] Fps is (10 sec: 47512.4, 60 sec: 48059.5, 300 sec: 47874.6). 
Total num frames: 650592256. Throughput: 0: 47581.1. Samples: 179422620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:19:10,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:19:12,595][71000] Updated weights for policy 0, policy_version 39714 (0.0037) [2024-06-12 17:19:15,909][71000] Updated weights for policy 0, policy_version 39724 (0.0041) [2024-06-12 17:19:15,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48059.8, 300 sec: 47985.7). Total num frames: 650838016. Throughput: 0: 47427.0. Samples: 179705660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:19:15,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:19:19,609][71000] Updated weights for policy 0, policy_version 39734 (0.0034) [2024-06-12 17:19:20,940][70768] Fps is (10 sec: 49152.5, 60 sec: 47786.6, 300 sec: 47985.7). Total num frames: 651083776. Throughput: 0: 47485.6. Samples: 179845860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:19:20,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 17:19:22,681][71000] Updated weights for policy 0, policy_version 39744 (0.0028) [2024-06-12 17:19:25,939][70768] Fps is (10 sec: 45875.8, 60 sec: 47513.6, 300 sec: 47930.2). Total num frames: 651296768. Throughput: 0: 47549.0. Samples: 180132300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:19:25,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 17:19:26,233][71000] Updated weights for policy 0, policy_version 39754 (0.0033) [2024-06-12 17:19:29,229][71000] Updated weights for policy 0, policy_version 39764 (0.0032) [2024-06-12 17:19:30,940][70768] Fps is (10 sec: 47514.1, 60 sec: 47786.7, 300 sec: 47930.1). Total num frames: 651558912. Throughput: 0: 47659.6. Samples: 180426800. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 17:19:30,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:19:32,762][71000] Updated weights for policy 0, policy_version 39774 (0.0031) [2024-06-12 17:19:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 47786.6, 300 sec: 47874.6). Total num frames: 651788288. Throughput: 0: 47898.2. Samples: 180572660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 17:19:35,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:19:36,213][71000] Updated weights for policy 0, policy_version 39784 (0.0032) [2024-06-12 17:19:39,775][71000] Updated weights for policy 0, policy_version 39794 (0.0036) [2024-06-12 17:19:40,940][70768] Fps is (10 sec: 49151.4, 60 sec: 47786.6, 300 sec: 47985.7). Total num frames: 652050432. Throughput: 0: 47961.7. Samples: 180862840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 17:19:40,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:19:41,044][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000039799_652066816.pth... [2024-06-12 17:19:41,085][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000039095_640532480.pth [2024-06-12 17:19:43,033][71000] Updated weights for policy 0, policy_version 39804 (0.0024) [2024-06-12 17:19:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 47786.7, 300 sec: 47985.7). Total num frames: 652279808. Throughput: 0: 47964.8. Samples: 181149760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 17:19:45,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 17:19:46,379][71000] Updated weights for policy 0, policy_version 39814 (0.0030) [2024-06-12 17:19:49,842][71000] Updated weights for policy 0, policy_version 39824 (0.0025) [2024-06-12 17:19:50,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48059.7, 300 sec: 47985.7). Total num frames: 652509184. Throughput: 0: 48009.2. Samples: 181295120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 17:19:50,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:19:52,116][70980] Signal inference workers to stop experience collection... (2750 times) [2024-06-12 17:19:52,117][70980] Signal inference workers to resume experience collection... (2750 times) [2024-06-12 17:19:52,155][71000] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-06-12 17:19:52,155][71000] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-06-12 17:19:53,047][71000] Updated weights for policy 0, policy_version 39834 (0.0024) [2024-06-12 17:19:55,940][70768] Fps is (10 sec: 47513.9, 60 sec: 47786.7, 300 sec: 47985.7). Total num frames: 652754944. Throughput: 0: 48145.1. Samples: 181589140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-12 17:19:55,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:19:56,511][71000] Updated weights for policy 0, policy_version 39844 (0.0030) [2024-06-12 17:19:59,597][71000] Updated weights for policy 0, policy_version 39854 (0.0027) [2024-06-12 17:20:00,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48332.7, 300 sec: 48041.2). Total num frames: 653017088. Throughput: 0: 48318.2. Samples: 181879980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-12 17:20:00,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:20:03,424][71000] Updated weights for policy 0, policy_version 39864 (0.0024) [2024-06-12 17:20:05,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48605.7, 300 sec: 48041.7). Total num frames: 653262848. Throughput: 0: 48450.6. Samples: 182026140. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-12 17:20:05,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:20:06,574][71000] Updated weights for policy 0, policy_version 39874 (0.0020) [2024-06-12 17:20:09,890][71000] Updated weights for policy 0, policy_version 39884 (0.0033) [2024-06-12 17:20:10,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48333.0, 300 sec: 48041.2). Total num frames: 653492224. Throughput: 0: 48596.4. Samples: 182319140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-12 17:20:10,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 17:20:13,412][71000] Updated weights for policy 0, policy_version 39894 (0.0027) [2024-06-12 17:20:15,939][70768] Fps is (10 sec: 45876.0, 60 sec: 48059.8, 300 sec: 48041.2). Total num frames: 653721600. Throughput: 0: 48522.7. Samples: 182610320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-12 17:20:15,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 17:20:16,930][71000] Updated weights for policy 0, policy_version 39904 (0.0030) [2024-06-12 17:20:19,983][71000] Updated weights for policy 0, policy_version 39914 (0.0034) [2024-06-12 17:20:20,940][70768] Fps is (10 sec: 49150.9, 60 sec: 48332.7, 300 sec: 48096.7). Total num frames: 653983744. Throughput: 0: 48421.1. Samples: 182751620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 17:20:20,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:20:23,808][71000] Updated weights for policy 0, policy_version 39924 (0.0030) [2024-06-12 17:20:25,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 48041.2). Total num frames: 654213120. Throughput: 0: 48318.0. Samples: 183037140. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 17:20:25,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:20:26,590][71000] Updated weights for policy 0, policy_version 39934 (0.0027) [2024-06-12 17:20:30,500][71000] Updated weights for policy 0, policy_version 39944 (0.0042) [2024-06-12 17:20:30,940][70768] Fps is (10 sec: 47514.5, 60 sec: 48332.8, 300 sec: 48041.2). Total num frames: 654458880. Throughput: 0: 48424.0. Samples: 183328840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 17:20:30,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:20:33,436][71000] Updated weights for policy 0, policy_version 39954 (0.0034) [2024-06-12 17:20:35,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48332.8, 300 sec: 47985.7). Total num frames: 654688256. Throughput: 0: 48176.9. Samples: 183463080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 17:20:35,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:20:37,222][71000] Updated weights for policy 0, policy_version 39964 (0.0030) [2024-06-12 17:20:40,463][71000] Updated weights for policy 0, policy_version 39974 (0.0035) [2024-06-12 17:20:40,940][70768] Fps is (10 sec: 49150.9, 60 sec: 48332.7, 300 sec: 48096.7). Total num frames: 654950400. Throughput: 0: 48232.2. Samples: 183759600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 17:20:40,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:20:43,851][71000] Updated weights for policy 0, policy_version 39984 (0.0030) [2024-06-12 17:20:45,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48332.7, 300 sec: 47985.7). Total num frames: 655179776. Throughput: 0: 48207.5. Samples: 184049320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 17:20:45,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 17:20:47,042][71000] Updated weights for policy 0, policy_version 39994 (0.0028) [2024-06-12 17:20:50,940][70768] Fps is (10 sec: 47514.9, 60 sec: 48605.9, 300 sec: 48041.2). 
Total num frames: 655425536. Throughput: 0: 48150.8. Samples: 184192920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 17:20:50,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:20:50,943][71000] Updated weights for policy 0, policy_version 40004 (0.0035) [2024-06-12 17:20:53,865][71000] Updated weights for policy 0, policy_version 40014 (0.0028) [2024-06-12 17:20:55,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48059.7, 300 sec: 48041.2). Total num frames: 655638528. Throughput: 0: 48138.2. Samples: 184485360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 17:20:55,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 17:20:57,831][71000] Updated weights for policy 0, policy_version 40024 (0.0035) [2024-06-12 17:21:00,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48059.9, 300 sec: 47985.7). Total num frames: 655900672. Throughput: 0: 47949.4. Samples: 184768040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 17:21:00,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:21:01,007][71000] Updated weights for policy 0, policy_version 40034 (0.0029) [2024-06-12 17:21:04,589][71000] Updated weights for policy 0, policy_version 40044 (0.0038) [2024-06-12 17:21:05,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48059.9, 300 sec: 48041.2). Total num frames: 656146432. Throughput: 0: 48188.8. Samples: 184920100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 17:21:05,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:21:07,502][71000] Updated weights for policy 0, policy_version 40054 (0.0041) [2024-06-12 17:21:10,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48332.8, 300 sec: 48096.8). Total num frames: 656392192. Throughput: 0: 48425.7. Samples: 185216300. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 17:21:10,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:21:11,053][71000] Updated weights for policy 0, policy_version 40064 (0.0033) [2024-06-12 17:21:14,346][71000] Updated weights for policy 0, policy_version 40074 (0.0038) [2024-06-12 17:21:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48332.7, 300 sec: 48152.3). Total num frames: 656621568. Throughput: 0: 48144.9. Samples: 185495360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 17:21:15,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:21:18,192][71000] Updated weights for policy 0, policy_version 40084 (0.0030) [2024-06-12 17:21:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48059.9, 300 sec: 48041.2). Total num frames: 656867328. Throughput: 0: 48515.2. Samples: 185646260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 17:21:20,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 17:21:21,217][71000] Updated weights for policy 0, policy_version 40094 (0.0031) [2024-06-12 17:21:24,775][71000] Updated weights for policy 0, policy_version 40104 (0.0028) [2024-06-12 17:21:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48332.7, 300 sec: 48041.2). Total num frames: 657113088. Throughput: 0: 48420.6. Samples: 185938520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 17:21:25,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:21:26,318][70980] Signal inference workers to stop experience collection... (2800 times) [2024-06-12 17:21:26,319][70980] Signal inference workers to resume experience collection... 
(2800 times) [2024-06-12 17:21:26,352][71000] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-06-12 17:21:26,352][71000] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-06-12 17:21:27,962][71000] Updated weights for policy 0, policy_version 40114 (0.0032) [2024-06-12 17:21:30,940][70768] Fps is (10 sec: 45875.2, 60 sec: 47786.7, 300 sec: 47985.7). Total num frames: 657326080. Throughput: 0: 48301.5. Samples: 186222880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 17:21:30,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:21:31,674][71000] Updated weights for policy 0, policy_version 40124 (0.0032) [2024-06-12 17:21:34,702][71000] Updated weights for policy 0, policy_version 40134 (0.0030) [2024-06-12 17:21:35,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48059.8, 300 sec: 48041.2). Total num frames: 657571840. Throughput: 0: 48088.9. Samples: 186356920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 17:21:35,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 17:21:38,660][71000] Updated weights for policy 0, policy_version 40144 (0.0034) [2024-06-12 17:21:40,941][70768] Fps is (10 sec: 52419.5, 60 sec: 48331.6, 300 sec: 48096.5). Total num frames: 657850368. Throughput: 0: 48278.1. Samples: 186657960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 17:21:40,942][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:21:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000040152_657850368.pth... [2024-06-12 17:21:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000039447_646299648.pth [2024-06-12 17:21:41,453][71000] Updated weights for policy 0, policy_version 40154 (0.0031) [2024-06-12 17:21:45,385][71000] Updated weights for policy 0, policy_version 40164 (0.0035) [2024-06-12 17:21:45,939][70768] Fps is (10 sec: 47514.0, 60 sec: 47786.9, 300 sec: 47930.2). Total num frames: 658046976. 
Throughput: 0: 48372.9. Samples: 186944820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 17:21:45,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:21:48,291][71000] Updated weights for policy 0, policy_version 40174 (0.0023) [2024-06-12 17:21:50,940][70768] Fps is (10 sec: 44244.0, 60 sec: 47786.5, 300 sec: 48041.2). Total num frames: 658292736. Throughput: 0: 48034.4. Samples: 187081660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 17:21:50,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:21:52,272][71000] Updated weights for policy 0, policy_version 40184 (0.0024) [2024-06-12 17:21:55,049][71000] Updated weights for policy 0, policy_version 40194 (0.0030) [2024-06-12 17:21:55,939][70768] Fps is (10 sec: 50790.3, 60 sec: 48605.9, 300 sec: 48096.8). Total num frames: 658554880. Throughput: 0: 47852.1. Samples: 187369640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 17:21:55,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:21:59,068][71000] Updated weights for policy 0, policy_version 40204 (0.0036) [2024-06-12 17:22:00,940][70768] Fps is (10 sec: 50791.2, 60 sec: 48332.7, 300 sec: 47985.7). Total num frames: 658800640. Throughput: 0: 47960.5. Samples: 187653580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 17:22:00,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:22:02,032][71000] Updated weights for policy 0, policy_version 40214 (0.0032) [2024-06-12 17:22:05,939][70768] Fps is (10 sec: 44236.7, 60 sec: 47513.6, 300 sec: 47874.6). Total num frames: 658997248. Throughput: 0: 48057.8. Samples: 187808860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 17:22:05,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:22:06,106][71000] Updated weights for policy 0, policy_version 40224 (0.0036) [2024-06-12 17:22:08,598][71000] Updated weights for policy 0, policy_version 40234 (0.0033) [2024-06-12 17:22:10,940][70768] Fps is (10 sec: 45875.3, 60 sec: 47786.7, 300 sec: 48096.8). Total num frames: 659259392. Throughput: 0: 47812.6. Samples: 188090080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 17:22:10,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 17:22:12,937][71000] Updated weights for policy 0, policy_version 40244 (0.0048) [2024-06-12 17:22:15,837][71000] Updated weights for policy 0, policy_version 40254 (0.0031) [2024-06-12 17:22:15,939][70768] Fps is (10 sec: 52428.9, 60 sec: 48332.9, 300 sec: 48152.3). Total num frames: 659521536. Throughput: 0: 47652.5. Samples: 188367240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 17:22:15,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:22:19,980][71000] Updated weights for policy 0, policy_version 40264 (0.0037) [2024-06-12 17:22:20,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48059.6, 300 sec: 47985.6). Total num frames: 659750912. Throughput: 0: 48043.8. Samples: 188518900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 17:22:20,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:22:22,549][71000] Updated weights for policy 0, policy_version 40274 (0.0032) [2024-06-12 17:22:25,939][70768] Fps is (10 sec: 45875.2, 60 sec: 47786.8, 300 sec: 48041.3). Total num frames: 659980288. Throughput: 0: 47823.8. Samples: 188809940. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 17:22:25,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:22:26,765][71000] Updated weights for policy 0, policy_version 40284 (0.0029) [2024-06-12 17:22:29,640][71000] Updated weights for policy 0, policy_version 40294 (0.0029) [2024-06-12 17:22:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48332.7, 300 sec: 48096.7). Total num frames: 660226048. Throughput: 0: 47803.8. Samples: 189096000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:22:30,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:22:33,744][71000] Updated weights for policy 0, policy_version 40304 (0.0040) [2024-06-12 17:22:35,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48605.8, 300 sec: 48096.8). Total num frames: 660488192. Throughput: 0: 48048.1. Samples: 189243820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:22:35,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:22:36,660][71000] Updated weights for policy 0, policy_version 40314 (0.0030) [2024-06-12 17:22:40,481][70980] Signal inference workers to stop experience collection... (2850 times) [2024-06-12 17:22:40,506][71000] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-06-12 17:22:40,589][70980] Signal inference workers to resume experience collection... (2850 times) [2024-06-12 17:22:40,590][71000] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-06-12 17:22:40,591][71000] Updated weights for policy 0, policy_version 40324 (0.0024) [2024-06-12 17:22:40,940][70768] Fps is (10 sec: 47514.4, 60 sec: 47515.0, 300 sec: 47930.2). Total num frames: 660701184. Throughput: 0: 47856.0. Samples: 189523160. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:22:40,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:22:43,280][71000] Updated weights for policy 0, policy_version 40334 (0.0029) [2024-06-12 17:22:45,940][70768] Fps is (10 sec: 44236.9, 60 sec: 48059.7, 300 sec: 47985.7). Total num frames: 660930560. Throughput: 0: 48086.2. Samples: 189817460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:22:45,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:22:47,110][71000] Updated weights for policy 0, policy_version 40344 (0.0032) [2024-06-12 17:22:50,011][71000] Updated weights for policy 0, policy_version 40354 (0.0031) [2024-06-12 17:22:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48059.9, 300 sec: 48096.8). Total num frames: 661176320. Throughput: 0: 47726.2. Samples: 189956540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:22:50,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:22:53,999][71000] Updated weights for policy 0, policy_version 40364 (0.0027) [2024-06-12 17:22:55,940][70768] Fps is (10 sec: 49152.2, 60 sec: 47786.6, 300 sec: 47985.7). Total num frames: 661422080. Throughput: 0: 47927.6. Samples: 190246820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 17:22:55,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:22:56,651][71000] Updated weights for policy 0, policy_version 40374 (0.0021) [2024-06-12 17:23:00,899][71000] Updated weights for policy 0, policy_version 40384 (0.0024) [2024-06-12 17:23:00,940][70768] Fps is (10 sec: 47513.0, 60 sec: 47513.5, 300 sec: 47930.1). Total num frames: 661651456. Throughput: 0: 48380.7. Samples: 190544380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 17:23:00,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:23:03,401][71000] Updated weights for policy 0, policy_version 40394 (0.0028) [2024-06-12 17:23:05,940][70768] Fps is (10 sec: 47512.4, 60 sec: 48332.6, 300 sec: 48096.7). 
Total num frames: 661897216. Throughput: 0: 47923.1. Samples: 190675440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 17:23:05,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:23:07,731][71000] Updated weights for policy 0, policy_version 40404 (0.0028) [2024-06-12 17:23:10,385][71000] Updated weights for policy 0, policy_version 40414 (0.0033) [2024-06-12 17:23:10,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48332.7, 300 sec: 48152.3). Total num frames: 662159360. Throughput: 0: 47860.2. Samples: 190963660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 17:23:10,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:23:14,434][71000] Updated weights for policy 0, policy_version 40424 (0.0038) [2024-06-12 17:23:15,940][70768] Fps is (10 sec: 49152.9, 60 sec: 47786.6, 300 sec: 48041.2). Total num frames: 662388736. Throughput: 0: 47961.5. Samples: 191254260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 17:23:15,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:23:17,286][71000] Updated weights for policy 0, policy_version 40434 (0.0034) [2024-06-12 17:23:20,940][70768] Fps is (10 sec: 44237.3, 60 sec: 47513.7, 300 sec: 47985.7). Total num frames: 662601728. Throughput: 0: 47957.8. Samples: 191401920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 17:23:20,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 17:23:20,948][70980] Saving new best policy, reward=0.267! [2024-06-12 17:23:21,272][71000] Updated weights for policy 0, policy_version 40444 (0.0027) [2024-06-12 17:23:23,892][71000] Updated weights for policy 0, policy_version 40454 (0.0027) [2024-06-12 17:23:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48059.7, 300 sec: 48041.2). Total num frames: 662863872. Throughput: 0: 47980.0. Samples: 191682260. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 17:23:25,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:23:28,232][71000] Updated weights for policy 0, policy_version 40464 (0.0022) [2024-06-12 17:23:30,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48059.9, 300 sec: 48096.8). Total num frames: 663109632. Throughput: 0: 47773.8. Samples: 191967280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 17:23:30,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:23:31,060][71000] Updated weights for policy 0, policy_version 40474 (0.0026) [2024-06-12 17:23:35,029][71000] Updated weights for policy 0, policy_version 40484 (0.0030) [2024-06-12 17:23:35,940][70768] Fps is (10 sec: 47513.5, 60 sec: 47513.6, 300 sec: 47985.7). Total num frames: 663339008. Throughput: 0: 47921.3. Samples: 192113000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 17:23:35,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:23:38,159][71000] Updated weights for policy 0, policy_version 40494 (0.0040) [2024-06-12 17:23:40,940][70768] Fps is (10 sec: 45874.6, 60 sec: 47786.6, 300 sec: 47985.7). Total num frames: 663568384. Throughput: 0: 47763.4. Samples: 192396180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 17:23:40,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:23:40,945][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000040501_663568384.pth... [2024-06-12 17:23:40,995][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000039799_652066816.pth [2024-06-12 17:23:41,711][71000] Updated weights for policy 0, policy_version 40504 (0.0033) [2024-06-12 17:23:44,907][71000] Updated weights for policy 0, policy_version 40514 (0.0031) [2024-06-12 17:23:45,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48332.8, 300 sec: 48152.3). Total num frames: 663830528. Throughput: 0: 47670.8. Samples: 192689560. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 17:23:45,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:23:48,439][71000] Updated weights for policy 0, policy_version 40524 (0.0035) [2024-06-12 17:23:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48332.8, 300 sec: 48096.8). Total num frames: 664076288. Throughput: 0: 48050.0. Samples: 192837680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 17:23:50,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:23:51,418][71000] Updated weights for policy 0, policy_version 40534 (0.0038) [2024-06-12 17:23:55,614][71000] Updated weights for policy 0, policy_version 40544 (0.0023) [2024-06-12 17:23:55,940][70768] Fps is (10 sec: 45874.4, 60 sec: 47786.5, 300 sec: 48041.2). Total num frames: 664289280. Throughput: 0: 48169.7. Samples: 193131300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 17:23:55,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 17:23:58,589][71000] Updated weights for policy 0, policy_version 40554 (0.0028) [2024-06-12 17:24:00,940][70768] Fps is (10 sec: 44237.0, 60 sec: 47786.8, 300 sec: 48041.2). Total num frames: 664518656. Throughput: 0: 47864.9. Samples: 193408180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 17:24:00,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 17:24:01,234][70980] Signal inference workers to stop experience collection... (2900 times) [2024-06-12 17:24:01,235][70980] Signal inference workers to resume experience collection... 
(2900 times) [2024-06-12 17:24:01,243][71000] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-06-12 17:24:01,244][71000] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-06-12 17:24:02,360][71000] Updated weights for policy 0, policy_version 40564 (0.0026) [2024-06-12 17:24:05,665][71000] Updated weights for policy 0, policy_version 40574 (0.0032) [2024-06-12 17:24:05,939][70768] Fps is (10 sec: 47514.8, 60 sec: 47786.9, 300 sec: 48041.3). Total num frames: 664764416. Throughput: 0: 47838.8. Samples: 193554660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 17:24:05,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 17:24:09,138][71000] Updated weights for policy 0, policy_version 40584 (0.0025) [2024-06-12 17:24:10,940][70768] Fps is (10 sec: 52428.3, 60 sec: 48059.8, 300 sec: 48152.3). Total num frames: 665042944. Throughput: 0: 47951.5. Samples: 193840080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 17:24:10,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:24:12,242][71000] Updated weights for policy 0, policy_version 40594 (0.0028) [2024-06-12 17:24:15,776][71000] Updated weights for policy 0, policy_version 40604 (0.0035) [2024-06-12 17:24:15,940][70768] Fps is (10 sec: 50789.2, 60 sec: 48059.6, 300 sec: 48096.7). Total num frames: 665272320. Throughput: 0: 48131.3. Samples: 194133200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 17:24:15,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:24:18,926][71000] Updated weights for policy 0, policy_version 40614 (0.0032) [2024-06-12 17:24:20,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48332.8, 300 sec: 48152.3). Total num frames: 665501696. Throughput: 0: 48144.5. Samples: 194279500. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 17:24:20,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:24:22,429][71000] Updated weights for policy 0, policy_version 40624 (0.0029) [2024-06-12 17:24:25,756][71000] Updated weights for policy 0, policy_version 40634 (0.0042) [2024-06-12 17:24:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48059.6, 300 sec: 48096.7). Total num frames: 665747456. Throughput: 0: 48268.4. Samples: 194568260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 17:24:25,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:24:29,191][71000] Updated weights for policy 0, policy_version 40644 (0.0031) [2024-06-12 17:24:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 47786.6, 300 sec: 48096.8). Total num frames: 665976832. Throughput: 0: 48112.9. Samples: 194854640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 17:24:30,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:24:32,731][71000] Updated weights for policy 0, policy_version 40654 (0.0032) [2024-06-12 17:24:35,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48059.8, 300 sec: 48041.2). Total num frames: 666222592. Throughput: 0: 47882.7. Samples: 194992400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 17:24:35,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 17:24:36,571][71000] Updated weights for policy 0, policy_version 40664 (0.0024) [2024-06-12 17:24:39,436][71000] Updated weights for policy 0, policy_version 40674 (0.0028) [2024-06-12 17:24:40,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48332.9, 300 sec: 48096.8). Total num frames: 666468352. Throughput: 0: 48009.6. Samples: 195291720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 17:24:40,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:24:42,913][71000] Updated weights for policy 0, policy_version 40684 (0.0032) [2024-06-12 17:24:45,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48059.6, 300 sec: 48152.3). 
Total num frames: 666714112. Throughput: 0: 48323.4. Samples: 195582740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 17:24:45,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:24:46,216][71000] Updated weights for policy 0, policy_version 40694 (0.0046) [2024-06-12 17:24:49,583][71000] Updated weights for policy 0, policy_version 40704 (0.0027) [2024-06-12 17:24:50,940][70768] Fps is (10 sec: 47513.0, 60 sec: 47786.6, 300 sec: 48096.7). Total num frames: 666943488. Throughput: 0: 48202.5. Samples: 195723780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 17:24:50,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:24:53,002][71000] Updated weights for policy 0, policy_version 40714 (0.0036) [2024-06-12 17:24:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.8, 300 sec: 48041.2). Total num frames: 667189248. Throughput: 0: 48104.0. Samples: 196004760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 17:24:55,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:24:56,323][71000] Updated weights for policy 0, policy_version 40724 (0.0036) [2024-06-12 17:24:59,786][71000] Updated weights for policy 0, policy_version 40734 (0.0024) [2024-06-12 17:25:00,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48332.8, 300 sec: 47985.7). Total num frames: 667418624. Throughput: 0: 47983.8. Samples: 196292460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 17:25:00,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 17:25:03,650][71000] Updated weights for policy 0, policy_version 40744 (0.0033) [2024-06-12 17:25:05,939][70768] Fps is (10 sec: 47514.6, 60 sec: 48332.8, 300 sec: 48041.2). Total num frames: 667664384. Throughput: 0: 47883.2. Samples: 196434240. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 17:25:05,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:25:06,651][71000] Updated weights for policy 0, policy_version 40754 (0.0024) [2024-06-12 17:25:10,359][71000] Updated weights for policy 0, policy_version 40764 (0.0028) [2024-06-12 17:25:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 47513.7, 300 sec: 48041.2). Total num frames: 667893760. Throughput: 0: 48008.1. Samples: 196728620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 17:25:10,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:25:13,560][71000] Updated weights for policy 0, policy_version 40774 (0.0034) [2024-06-12 17:25:15,940][70768] Fps is (10 sec: 47510.9, 60 sec: 47786.4, 300 sec: 47985.6). Total num frames: 668139520. Throughput: 0: 47879.5. Samples: 197009240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 17:25:15,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:25:17,311][71000] Updated weights for policy 0, policy_version 40784 (0.0039) [2024-06-12 17:25:20,420][71000] Updated weights for policy 0, policy_version 40794 (0.0026) [2024-06-12 17:25:20,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48059.6, 300 sec: 48041.2). Total num frames: 668385280. Throughput: 0: 48028.7. Samples: 197153700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 17:25:20,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:25:22,979][70980] Signal inference workers to stop experience collection... (2950 times) [2024-06-12 17:25:22,980][70980] Signal inference workers to resume experience collection... 
(2950 times) [2024-06-12 17:25:23,027][71000] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-06-12 17:25:23,027][71000] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-06-12 17:25:23,986][71000] Updated weights for policy 0, policy_version 40804 (0.0030) [2024-06-12 17:25:25,939][70768] Fps is (10 sec: 45877.6, 60 sec: 47513.7, 300 sec: 47930.2). Total num frames: 668598272. Throughput: 0: 47762.7. Samples: 197441040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 17:25:25,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:25:27,221][71000] Updated weights for policy 0, policy_version 40814 (0.0030) [2024-06-12 17:25:30,609][71000] Updated weights for policy 0, policy_version 40824 (0.0032) [2024-06-12 17:25:30,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48059.8, 300 sec: 48041.2). Total num frames: 668860416. Throughput: 0: 47726.4. Samples: 197730420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 17:25:30,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:25:34,166][71000] Updated weights for policy 0, policy_version 40834 (0.0025) [2024-06-12 17:25:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48059.7, 300 sec: 47985.7). Total num frames: 669106176. Throughput: 0: 47876.5. Samples: 197878220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 17:25:35,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:25:37,726][71000] Updated weights for policy 0, policy_version 40844 (0.0026) [2024-06-12 17:25:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 47786.6, 300 sec: 47985.7). Total num frames: 669335552. Throughput: 0: 48109.0. Samples: 198169660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 17:25:40,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:25:40,958][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000040854_669351936.pth... 
[2024-06-12 17:25:40,961][71000] Updated weights for policy 0, policy_version 40854 (0.0034) [2024-06-12 17:25:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000040152_657850368.pth [2024-06-12 17:25:44,751][71000] Updated weights for policy 0, policy_version 40864 (0.0029) [2024-06-12 17:25:45,940][70768] Fps is (10 sec: 45875.1, 60 sec: 47513.7, 300 sec: 47930.1). Total num frames: 669564928. Throughput: 0: 47790.2. Samples: 198443020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 17:25:45,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:25:47,886][71000] Updated weights for policy 0, policy_version 40874 (0.0027) [2024-06-12 17:25:50,940][70768] Fps is (10 sec: 47513.7, 60 sec: 47786.7, 300 sec: 48041.2). Total num frames: 669810688. Throughput: 0: 47828.8. Samples: 198586540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 17:25:50,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:25:51,337][71000] Updated weights for policy 0, policy_version 40884 (0.0034) [2024-06-12 17:25:54,556][71000] Updated weights for policy 0, policy_version 40894 (0.0032) [2024-06-12 17:25:55,940][70768] Fps is (10 sec: 49151.3, 60 sec: 47786.7, 300 sec: 47985.6). Total num frames: 670056448. Throughput: 0: 47790.5. Samples: 198879200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 17:25:55,949][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:25:58,212][71000] Updated weights for policy 0, policy_version 40904 (0.0027) [2024-06-12 17:26:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48059.7, 300 sec: 47985.7). Total num frames: 670302208. Throughput: 0: 48019.2. Samples: 199170080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 17:26:00,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:26:01,365][71000] Updated weights for policy 0, policy_version 40914 (0.0032) [2024-06-12 17:26:05,197][71000] Updated weights for policy 0, policy_version 40924 (0.0027) [2024-06-12 17:26:05,940][70768] Fps is (10 sec: 45875.0, 60 sec: 47513.4, 300 sec: 47874.6). Total num frames: 670515200. Throughput: 0: 47839.9. Samples: 199306500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 17:26:05,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:26:08,382][71000] Updated weights for policy 0, policy_version 40934 (0.0027) [2024-06-12 17:26:10,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48059.7, 300 sec: 47985.7). Total num frames: 670777344. Throughput: 0: 47800.3. Samples: 199592060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 17:26:10,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:26:12,430][71000] Updated weights for policy 0, policy_version 40944 (0.0046) [2024-06-12 17:26:15,550][71000] Updated weights for policy 0, policy_version 40954 (0.0036) [2024-06-12 17:26:15,940][70768] Fps is (10 sec: 49152.3, 60 sec: 47786.9, 300 sec: 47930.1). Total num frames: 671006720. Throughput: 0: 47761.1. Samples: 199879680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 17:26:15,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:26:19,157][71000] Updated weights for policy 0, policy_version 40964 (0.0026) [2024-06-12 17:26:20,940][70768] Fps is (10 sec: 45874.6, 60 sec: 47513.6, 300 sec: 47874.6). Total num frames: 671236096. Throughput: 0: 47594.5. Samples: 200019980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 17:26:20,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 17:26:22,301][71000] Updated weights for policy 0, policy_version 40974 (0.0033) [2024-06-12 17:26:25,939][70768] Fps is (10 sec: 47514.8, 60 sec: 48059.8, 300 sec: 47985.7). 
Total num frames: 671481856. Throughput: 0: 47395.7. Samples: 200302460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 17:26:25,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:26:25,948][71000] Updated weights for policy 0, policy_version 40984 (0.0041) [2024-06-12 17:26:29,102][71000] Updated weights for policy 0, policy_version 40994 (0.0036) [2024-06-12 17:26:30,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48059.6, 300 sec: 48041.2). Total num frames: 671744000. Throughput: 0: 47980.8. Samples: 200602160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 17:26:30,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:26:32,383][71000] Updated weights for policy 0, policy_version 41004 (0.0029) [2024-06-12 17:26:35,773][71000] Updated weights for policy 0, policy_version 41014 (0.0030) [2024-06-12 17:26:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 47786.7, 300 sec: 47874.9). Total num frames: 671973376. Throughput: 0: 48164.5. Samples: 200753940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 17:26:35,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:26:37,624][70980] Signal inference workers to stop experience collection... (3000 times) [2024-06-12 17:26:37,625][70980] Signal inference workers to resume experience collection... (3000 times) [2024-06-12 17:26:37,653][71000] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-06-12 17:26:37,653][71000] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-06-12 17:26:39,585][71000] Updated weights for policy 0, policy_version 41024 (0.0028) [2024-06-12 17:26:40,940][70768] Fps is (10 sec: 45875.5, 60 sec: 47786.7, 300 sec: 47985.7). Total num frames: 672202752. Throughput: 0: 47878.8. Samples: 201033740. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 17:26:40,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:26:42,660][71000] Updated weights for policy 0, policy_version 41034 (0.0023) [2024-06-12 17:26:45,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48059.8, 300 sec: 47985.7). Total num frames: 672448512. Throughput: 0: 47606.2. Samples: 201312360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 17:26:45,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 17:26:46,441][71000] Updated weights for policy 0, policy_version 41044 (0.0033) [2024-06-12 17:26:49,544][71000] Updated weights for policy 0, policy_version 41054 (0.0028) [2024-06-12 17:26:50,940][70768] Fps is (10 sec: 47513.7, 60 sec: 47786.7, 300 sec: 47874.6). Total num frames: 672677888. Throughput: 0: 47761.1. Samples: 201455740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-12 17:26:50,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:26:52,927][71000] Updated weights for policy 0, policy_version 41064 (0.0030) [2024-06-12 17:26:55,940][70768] Fps is (10 sec: 47513.3, 60 sec: 47786.8, 300 sec: 47874.6). Total num frames: 672923648. Throughput: 0: 47964.5. Samples: 201750460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-12 17:26:55,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:26:56,592][71000] Updated weights for policy 0, policy_version 41074 (0.0028) [2024-06-12 17:26:59,587][71000] Updated weights for policy 0, policy_version 41084 (0.0033) [2024-06-12 17:27:00,940][70768] Fps is (10 sec: 47513.8, 60 sec: 47513.6, 300 sec: 47985.7). Total num frames: 673153024. Throughput: 0: 47871.7. Samples: 202033900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-12 17:27:00,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:27:03,456][71000] Updated weights for policy 0, policy_version 41094 (0.0032) [2024-06-12 17:27:05,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48332.9, 300 sec: 47985.7). 
Total num frames: 673415168. Throughput: 0: 48062.7. Samples: 202182800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-12 17:27:05,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:27:06,782][71000] Updated weights for policy 0, policy_version 41104 (0.0029) [2024-06-12 17:27:10,438][71000] Updated weights for policy 0, policy_version 41114 (0.0028) [2024-06-12 17:27:10,940][70768] Fps is (10 sec: 47512.8, 60 sec: 47513.5, 300 sec: 47819.0). Total num frames: 673628160. Throughput: 0: 48018.9. Samples: 202463320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-12 17:27:10,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:27:13,635][71000] Updated weights for policy 0, policy_version 41124 (0.0034) [2024-06-12 17:27:15,939][70768] Fps is (10 sec: 44237.5, 60 sec: 47513.8, 300 sec: 47819.1). Total num frames: 673857536. Throughput: 0: 47525.9. Samples: 202740820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 17:27:15,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:27:17,305][71000] Updated weights for policy 0, policy_version 41134 (0.0044) [2024-06-12 17:27:20,214][71000] Updated weights for policy 0, policy_version 41144 (0.0038) [2024-06-12 17:27:20,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48059.9, 300 sec: 47930.1). Total num frames: 674119680. Throughput: 0: 47464.9. Samples: 202889860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 17:27:20,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:27:24,179][71000] Updated weights for policy 0, policy_version 41154 (0.0030) [2024-06-12 17:27:25,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48059.7, 300 sec: 47930.2). Total num frames: 674365440. Throughput: 0: 47694.7. Samples: 203180000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 17:27:25,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:27:27,210][71000] Updated weights for policy 0, policy_version 41164 (0.0041) [2024-06-12 17:27:30,875][71000] Updated weights for policy 0, policy_version 41174 (0.0030) [2024-06-12 17:27:30,941][70768] Fps is (10 sec: 47508.5, 60 sec: 47512.8, 300 sec: 47818.9). Total num frames: 674594816. Throughput: 0: 47985.4. Samples: 203471760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 17:27:30,941][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:27:34,057][71000] Updated weights for policy 0, policy_version 41184 (0.0028) [2024-06-12 17:27:35,940][70768] Fps is (10 sec: 47513.3, 60 sec: 47786.6, 300 sec: 47930.1). Total num frames: 674840576. Throughput: 0: 47799.1. Samples: 203606700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 17:27:35,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:27:37,833][71000] Updated weights for policy 0, policy_version 41194 (0.0030) [2024-06-12 17:27:40,855][71000] Updated weights for policy 0, policy_version 41204 (0.0036) [2024-06-12 17:27:40,940][70768] Fps is (10 sec: 49156.6, 60 sec: 48059.6, 300 sec: 47985.7). Total num frames: 675086336. Throughput: 0: 47679.8. Samples: 203896060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 17:27:40,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:27:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000041204_675086336.pth... [2024-06-12 17:27:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000040501_663568384.pth [2024-06-12 17:27:44,710][71000] Updated weights for policy 0, policy_version 41214 (0.0033) [2024-06-12 17:27:45,939][70768] Fps is (10 sec: 49152.6, 60 sec: 48059.8, 300 sec: 47985.7). Total num frames: 675332096. Throughput: 0: 47970.7. Samples: 204192580. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 17:27:45,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 17:27:47,347][71000] Updated weights for policy 0, policy_version 41224 (0.0025) [2024-06-12 17:27:48,473][70980] Signal inference workers to stop experience collection... (3050 times) [2024-06-12 17:27:48,501][71000] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-06-12 17:27:48,531][70980] Signal inference workers to resume experience collection... (3050 times) [2024-06-12 17:27:48,531][71000] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-06-12 17:27:50,940][70768] Fps is (10 sec: 45875.1, 60 sec: 47786.5, 300 sec: 47874.6). Total num frames: 675545088. Throughput: 0: 47879.9. Samples: 204337400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 17:27:50,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:27:51,371][71000] Updated weights for policy 0, policy_version 41234 (0.0026) [2024-06-12 17:27:54,180][71000] Updated weights for policy 0, policy_version 41244 (0.0035) [2024-06-12 17:27:55,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48059.8, 300 sec: 47985.7). Total num frames: 675807232. Throughput: 0: 47929.1. Samples: 204620120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 17:27:55,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:27:58,469][71000] Updated weights for policy 0, policy_version 41254 (0.0030) [2024-06-12 17:28:00,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48059.6, 300 sec: 47930.2). Total num frames: 676036608. Throughput: 0: 48171.4. Samples: 204908540. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 17:28:00,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:28:01,211][71000] Updated weights for policy 0, policy_version 41264 (0.0034) [2024-06-12 17:28:05,263][71000] Updated weights for policy 0, policy_version 41274 (0.0027) [2024-06-12 17:28:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 47786.8, 300 sec: 47874.6). Total num frames: 676282368. Throughput: 0: 48077.8. Samples: 205053360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 17:28:05,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:28:08,274][71000] Updated weights for policy 0, policy_version 41284 (0.0033) [2024-06-12 17:28:10,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48059.8, 300 sec: 47874.6). Total num frames: 676511744. Throughput: 0: 48055.5. Samples: 205342500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 17:28:10,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:28:11,877][71000] Updated weights for policy 0, policy_version 41294 (0.0040) [2024-06-12 17:28:14,848][71000] Updated weights for policy 0, policy_version 41304 (0.0028) [2024-06-12 17:28:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.8, 300 sec: 48041.2). Total num frames: 676773888. Throughput: 0: 47966.9. Samples: 205630220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 17:28:15,949][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:28:18,574][71000] Updated weights for policy 0, policy_version 41314 (0.0028) [2024-06-12 17:28:20,939][70768] Fps is (10 sec: 49152.6, 60 sec: 48059.8, 300 sec: 47930.2). Total num frames: 677003264. Throughput: 0: 48320.1. Samples: 205781100. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 17:28:20,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:28:21,354][71000] Updated weights for policy 0, policy_version 41324 (0.0034) [2024-06-12 17:28:25,871][71000] Updated weights for policy 0, policy_version 41334 (0.0035) [2024-06-12 17:28:25,940][70768] Fps is (10 sec: 44236.8, 60 sec: 47513.6, 300 sec: 47819.1). Total num frames: 677216256. Throughput: 0: 48261.0. Samples: 206067800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 17:28:25,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 17:28:28,352][71000] Updated weights for policy 0, policy_version 41344 (0.0037) [2024-06-12 17:28:30,940][70768] Fps is (10 sec: 45874.7, 60 sec: 47787.5, 300 sec: 47874.6). Total num frames: 677462016. Throughput: 0: 48023.4. Samples: 206353640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 17:28:30,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:28:32,449][71000] Updated weights for policy 0, policy_version 41354 (0.0025) [2024-06-12 17:28:35,233][71000] Updated weights for policy 0, policy_version 41364 (0.0027) [2024-06-12 17:28:35,940][70768] Fps is (10 sec: 52428.2, 60 sec: 48332.7, 300 sec: 48041.2). Total num frames: 677740544. Throughput: 0: 47980.5. Samples: 206496520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-12 17:28:35,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:28:39,306][71000] Updated weights for policy 0, policy_version 41374 (0.0037) [2024-06-12 17:28:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 47786.8, 300 sec: 47874.6). Total num frames: 677953536. Throughput: 0: 48146.2. Samples: 206786700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-12 17:28:40,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:28:42,103][71000] Updated weights for policy 0, policy_version 41384 (0.0026) [2024-06-12 17:28:45,940][70768] Fps is (10 sec: 44237.4, 60 sec: 47513.5, 300 sec: 47819.1). 
Total num frames: 678182912. Throughput: 0: 48051.2. Samples: 207070840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-12 17:28:45,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:28:45,957][71000] Updated weights for policy 0, policy_version 41394 (0.0033) [2024-06-12 17:28:48,827][71000] Updated weights for policy 0, policy_version 41404 (0.0029) [2024-06-12 17:28:50,942][70768] Fps is (10 sec: 45862.1, 60 sec: 47784.5, 300 sec: 47874.2). Total num frames: 678412288. Throughput: 0: 47997.0. Samples: 207213360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-12 17:28:50,943][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:28:53,028][71000] Updated weights for policy 0, policy_version 41414 (0.0027) [2024-06-12 17:28:55,940][70768] Fps is (10 sec: 49151.5, 60 sec: 47786.5, 300 sec: 47985.7). Total num frames: 678674432. Throughput: 0: 47835.9. Samples: 207495120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-12 17:28:55,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 17:28:55,967][71000] Updated weights for policy 0, policy_version 41424 (0.0033) [2024-06-12 17:28:59,810][71000] Updated weights for policy 0, policy_version 41434 (0.0029) [2024-06-12 17:29:00,940][70768] Fps is (10 sec: 49165.2, 60 sec: 47786.6, 300 sec: 47930.1). Total num frames: 678903808. Throughput: 0: 47937.2. Samples: 207787400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 17:29:00,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:29:02,637][71000] Updated weights for policy 0, policy_version 41444 (0.0031) [2024-06-12 17:29:03,229][70980] Signal inference workers to stop experience collection... (3100 times) [2024-06-12 17:29:03,229][70980] Signal inference workers to resume experience collection... 
(3100 times) [2024-06-12 17:29:03,272][71000] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-06-12 17:29:03,272][71000] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-06-12 17:29:05,940][70768] Fps is (10 sec: 47514.1, 60 sec: 47786.7, 300 sec: 47819.1). Total num frames: 679149568. Throughput: 0: 47612.8. Samples: 207923680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 17:29:05,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:29:06,499][71000] Updated weights for policy 0, policy_version 41454 (0.0027) [2024-06-12 17:29:09,523][71000] Updated weights for policy 0, policy_version 41464 (0.0023) [2024-06-12 17:29:10,940][70768] Fps is (10 sec: 49152.9, 60 sec: 48059.8, 300 sec: 47874.6). Total num frames: 679395328. Throughput: 0: 47660.5. Samples: 208212520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 17:29:10,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:29:13,277][71000] Updated weights for policy 0, policy_version 41474 (0.0034) [2024-06-12 17:29:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 47786.7, 300 sec: 47930.1). Total num frames: 679641088. Throughput: 0: 47828.4. Samples: 208505920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 17:29:15,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:29:16,262][71000] Updated weights for policy 0, policy_version 41484 (0.0026) [2024-06-12 17:29:20,186][71000] Updated weights for policy 0, policy_version 41494 (0.0030) [2024-06-12 17:29:20,940][70768] Fps is (10 sec: 45875.2, 60 sec: 47513.6, 300 sec: 47819.1). Total num frames: 679854080. Throughput: 0: 47882.0. Samples: 208651200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 17:29:20,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:29:23,135][71000] Updated weights for policy 0, policy_version 41504 (0.0032) [2024-06-12 17:29:25,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48059.8, 300 sec: 47874.6). 
Total num frames: 680099840. Throughput: 0: 47613.0. Samples: 208929280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 17:29:25,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:29:26,983][71000] Updated weights for policy 0, policy_version 41514 (0.0030) [2024-06-12 17:29:29,837][71000] Updated weights for policy 0, policy_version 41524 (0.0030) [2024-06-12 17:29:30,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48605.9, 300 sec: 47985.7). Total num frames: 680378368. Throughput: 0: 47719.6. Samples: 209218220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 17:29:30,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:29:33,697][71000] Updated weights for policy 0, policy_version 41534 (0.0035) [2024-06-12 17:29:35,940][70768] Fps is (10 sec: 50789.8, 60 sec: 47786.7, 300 sec: 47930.1). Total num frames: 680607744. Throughput: 0: 48021.7. Samples: 209374200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 17:29:35,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:29:36,621][71000] Updated weights for policy 0, policy_version 41544 (0.0031) [2024-06-12 17:29:40,438][71000] Updated weights for policy 0, policy_version 41554 (0.0036) [2024-06-12 17:29:40,939][70768] Fps is (10 sec: 45875.4, 60 sec: 48059.8, 300 sec: 47874.6). Total num frames: 680837120. Throughput: 0: 48226.0. Samples: 209665280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 17:29:40,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:29:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000041555_680837120.pth... [2024-06-12 17:29:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000040854_669351936.pth [2024-06-12 17:29:43,436][71000] Updated weights for policy 0, policy_version 41564 (0.0031) [2024-06-12 17:29:45,939][70768] Fps is (10 sec: 45875.7, 60 sec: 48059.8, 300 sec: 47874.6). Total num frames: 681066496. Throughput: 0: 48001.6. 
Samples: 209947460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 17:29:45,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:29:47,283][71000] Updated weights for policy 0, policy_version 41574 (0.0032) [2024-06-12 17:29:50,275][71000] Updated weights for policy 0, policy_version 41584 (0.0022) [2024-06-12 17:29:50,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49154.2, 300 sec: 48041.2). Total num frames: 681361408. Throughput: 0: 48339.0. Samples: 210098940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 17:29:50,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:29:54,134][71000] Updated weights for policy 0, policy_version 41594 (0.0035) [2024-06-12 17:29:55,940][70768] Fps is (10 sec: 47513.3, 60 sec: 47786.8, 300 sec: 47874.6). Total num frames: 681541632. Throughput: 0: 48206.6. Samples: 210381820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 17:29:55,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:29:57,031][71000] Updated weights for policy 0, policy_version 41604 (0.0030) [2024-06-12 17:30:00,847][71000] Updated weights for policy 0, policy_version 41614 (0.0031) [2024-06-12 17:30:00,940][70768] Fps is (10 sec: 44237.3, 60 sec: 48332.9, 300 sec: 47930.1). Total num frames: 681803776. Throughput: 0: 48144.4. Samples: 210672420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 17:30:00,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:30:04,030][71000] Updated weights for policy 0, policy_version 41624 (0.0031) [2024-06-12 17:30:05,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48332.7, 300 sec: 47985.7). Total num frames: 682049536. Throughput: 0: 48041.6. Samples: 210813080. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 17:30:05,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:30:07,847][71000] Updated weights for policy 0, policy_version 41634 (0.0038) [2024-06-12 17:30:10,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48059.8, 300 sec: 47930.2). Total num frames: 682278912. Throughput: 0: 48376.0. Samples: 211106200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 17:30:10,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 17:30:11,074][71000] Updated weights for policy 0, policy_version 41644 (0.0024) [2024-06-12 17:30:12,345][70980] Signal inference workers to stop experience collection... (3150 times) [2024-06-12 17:30:12,346][70980] Signal inference workers to resume experience collection... (3150 times) [2024-06-12 17:30:12,375][71000] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-06-12 17:30:12,375][71000] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-06-12 17:30:14,523][71000] Updated weights for policy 0, policy_version 41654 (0.0028) [2024-06-12 17:30:15,940][70768] Fps is (10 sec: 44236.5, 60 sec: 47513.4, 300 sec: 47819.0). Total num frames: 682491904. Throughput: 0: 48252.7. Samples: 211389600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 17:30:15,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 17:30:17,693][71000] Updated weights for policy 0, policy_version 41664 (0.0025) [2024-06-12 17:30:20,939][70768] Fps is (10 sec: 47513.3, 60 sec: 48332.8, 300 sec: 47985.7). Total num frames: 682754048. Throughput: 0: 47943.6. Samples: 211531660. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 17:30:20,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 17:30:21,223][71000] Updated weights for policy 0, policy_version 41674 (0.0036) [2024-06-12 17:30:24,483][71000] Updated weights for policy 0, policy_version 41684 (0.0033) [2024-06-12 17:30:25,940][70768] Fps is (10 sec: 50791.4, 60 sec: 48332.7, 300 sec: 47930.1). Total num frames: 682999808. Throughput: 0: 47836.8. Samples: 211817940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 17:30:25,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:30:28,233][71000] Updated weights for policy 0, policy_version 41694 (0.0029) [2024-06-12 17:30:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 47786.6, 300 sec: 47930.1). Total num frames: 683245568. Throughput: 0: 48023.9. Samples: 212108540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 17:30:30,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:30:31,449][71000] Updated weights for policy 0, policy_version 41704 (0.0031) [2024-06-12 17:30:35,272][71000] Updated weights for policy 0, policy_version 41714 (0.0032) [2024-06-12 17:30:35,939][70768] Fps is (10 sec: 45875.5, 60 sec: 47513.7, 300 sec: 47874.6). Total num frames: 683458560. Throughput: 0: 47684.2. Samples: 212244720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 17:30:35,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:30:38,676][71000] Updated weights for policy 0, policy_version 41724 (0.0034) [2024-06-12 17:30:40,940][70768] Fps is (10 sec: 45875.0, 60 sec: 47786.6, 300 sec: 47930.1). Total num frames: 683704320. Throughput: 0: 47785.3. Samples: 212532160. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 17:30:40,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:30:42,094][71000] Updated weights for policy 0, policy_version 41734 (0.0041) [2024-06-12 17:30:45,278][71000] Updated weights for policy 0, policy_version 41744 (0.0023) [2024-06-12 17:30:45,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48059.6, 300 sec: 47930.1). Total num frames: 683950080. Throughput: 0: 47868.8. Samples: 212826520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 17:30:45,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:30:48,682][71000] Updated weights for policy 0, policy_version 41754 (0.0026) [2024-06-12 17:30:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 47240.6, 300 sec: 47930.2). Total num frames: 684195840. Throughput: 0: 47836.5. Samples: 212965720. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-12 17:30:50,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:30:52,166][71000] Updated weights for policy 0, policy_version 41764 (0.0028) [2024-06-12 17:30:55,939][70768] Fps is (10 sec: 45876.1, 60 sec: 47786.7, 300 sec: 47819.1). Total num frames: 684408832. Throughput: 0: 47743.5. Samples: 213254660. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-12 17:30:55,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:30:55,952][71000] Updated weights for policy 0, policy_version 41774 (0.0033) [2024-06-12 17:30:59,055][71000] Updated weights for policy 0, policy_version 41784 (0.0030) [2024-06-12 17:31:00,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48059.8, 300 sec: 48041.3). Total num frames: 684687360. Throughput: 0: 47733.6. Samples: 213537600. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-12 17:31:00,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:31:02,770][71000] Updated weights for policy 0, policy_version 41794 (0.0035) [2024-06-12 17:31:05,940][70768] Fps is (10 sec: 49150.7, 60 sec: 47513.5, 300 sec: 47874.6). 
Total num frames: 684900352. Throughput: 0: 48075.3. Samples: 213695060. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-12 17:31:05,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:31:06,086][71000] Updated weights for policy 0, policy_version 41804 (0.0032) [2024-06-12 17:31:09,407][71000] Updated weights for policy 0, policy_version 41814 (0.0037) [2024-06-12 17:31:10,940][70768] Fps is (10 sec: 45875.3, 60 sec: 47786.6, 300 sec: 47930.2). Total num frames: 685146112. Throughput: 0: 48001.4. Samples: 213978000. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-12 17:31:10,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:31:12,731][71000] Updated weights for policy 0, policy_version 41824 (0.0029) [2024-06-12 17:31:15,854][71000] Updated weights for policy 0, policy_version 41834 (0.0033) [2024-06-12 17:31:15,940][70768] Fps is (10 sec: 50791.4, 60 sec: 48606.0, 300 sec: 48041.2). Total num frames: 685408256. Throughput: 0: 47962.7. Samples: 214266860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 17:31:15,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:31:19,813][71000] Updated weights for policy 0, policy_version 41844 (0.0027) [2024-06-12 17:31:20,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48332.7, 300 sec: 48041.2). Total num frames: 685654016. Throughput: 0: 48176.3. Samples: 214412660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 17:31:20,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:31:22,854][71000] Updated weights for policy 0, policy_version 41854 (0.0034) [2024-06-12 17:31:25,940][70768] Fps is (10 sec: 45874.7, 60 sec: 47786.6, 300 sec: 47874.6). Total num frames: 685867008. Throughput: 0: 48307.0. Samples: 214705980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 17:31:25,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:31:26,317][70980] Signal inference workers to stop experience collection... 
(3200 times) [2024-06-12 17:31:26,319][70980] Signal inference workers to resume experience collection... (3200 times) [2024-06-12 17:31:26,350][71000] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-06-12 17:31:26,354][71000] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-06-12 17:31:26,456][71000] Updated weights for policy 0, policy_version 41864 (0.0027) [2024-06-12 17:31:29,969][71000] Updated weights for policy 0, policy_version 41874 (0.0033) [2024-06-12 17:31:30,939][70768] Fps is (10 sec: 45876.1, 60 sec: 47786.8, 300 sec: 47930.2). Total num frames: 686112768. Throughput: 0: 48174.9. Samples: 214994380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 17:31:30,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:31:33,068][71000] Updated weights for policy 0, policy_version 41884 (0.0031) [2024-06-12 17:31:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48332.7, 300 sec: 47985.7). Total num frames: 686358528. Throughput: 0: 48134.3. Samples: 215131760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 17:31:35,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 17:31:36,571][71000] Updated weights for policy 0, policy_version 41894 (0.0034) [2024-06-12 17:31:39,792][71000] Updated weights for policy 0, policy_version 41904 (0.0032) [2024-06-12 17:31:40,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48605.9, 300 sec: 48041.2). Total num frames: 686620672. Throughput: 0: 48381.3. Samples: 215431820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:31:40,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:31:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000041908_686620672.pth... 
[2024-06-12 17:31:40,997][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000041204_675086336.pth [2024-06-12 17:31:43,288][71000] Updated weights for policy 0, policy_version 41914 (0.0038) [2024-06-12 17:31:45,940][70768] Fps is (10 sec: 45875.6, 60 sec: 47786.8, 300 sec: 47930.1). Total num frames: 686817280. Throughput: 0: 48296.0. Samples: 215710920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:31:45,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:31:46,903][71000] Updated weights for policy 0, policy_version 41924 (0.0030) [2024-06-12 17:31:50,285][71000] Updated weights for policy 0, policy_version 41934 (0.0028) [2024-06-12 17:31:50,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48059.8, 300 sec: 47985.7). Total num frames: 687079424. Throughput: 0: 47965.5. Samples: 215853500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:31:50,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:31:53,569][71000] Updated weights for policy 0, policy_version 41944 (0.0045) [2024-06-12 17:31:55,940][70768] Fps is (10 sec: 52428.0, 60 sec: 48878.8, 300 sec: 48096.7). Total num frames: 687341568. Throughput: 0: 48119.4. Samples: 216143380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:31:55,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:31:57,176][71000] Updated weights for policy 0, policy_version 41954 (0.0034) [2024-06-12 17:32:00,431][71000] Updated weights for policy 0, policy_version 41964 (0.0025) [2024-06-12 17:32:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 47786.5, 300 sec: 47930.1). Total num frames: 687554560. Throughput: 0: 48179.9. Samples: 216434960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:32:00,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:32:03,874][71000] Updated weights for policy 0, policy_version 41974 (0.0033) [2024-06-12 17:32:05,939][70768] Fps is (10 sec: 45876.0, 60 sec: 48333.0, 300 sec: 48041.3). 
Total num frames: 687800320. Throughput: 0: 47987.7. Samples: 216572100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 17:32:05,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 17:32:07,258][71000] Updated weights for policy 0, policy_version 41984 (0.0043) [2024-06-12 17:32:10,639][71000] Updated weights for policy 0, policy_version 41994 (0.0030) [2024-06-12 17:32:10,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48332.7, 300 sec: 48096.7). Total num frames: 688046080. Throughput: 0: 47881.8. Samples: 216860660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 17:32:10,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:32:14,197][71000] Updated weights for policy 0, policy_version 42004 (0.0031) [2024-06-12 17:32:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48059.8, 300 sec: 48041.2). Total num frames: 688291840. Throughput: 0: 48010.6. Samples: 217154860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 17:32:15,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:32:17,462][71000] Updated weights for policy 0, policy_version 42014 (0.0031) [2024-06-12 17:32:20,702][71000] Updated weights for policy 0, policy_version 42024 (0.0028) [2024-06-12 17:32:20,940][70768] Fps is (10 sec: 47514.1, 60 sec: 47786.8, 300 sec: 47985.7). Total num frames: 688521216. Throughput: 0: 47981.4. Samples: 217290920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 17:32:20,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 17:32:24,431][71000] Updated weights for policy 0, policy_version 42034 (0.0030) [2024-06-12 17:32:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.9, 300 sec: 48041.4). Total num frames: 688766976. Throughput: 0: 47843.6. Samples: 217584780. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 17:32:25,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:32:27,772][71000] Updated weights for policy 0, policy_version 42044 (0.0030) [2024-06-12 17:32:30,942][70768] Fps is (10 sec: 47503.9, 60 sec: 48058.0, 300 sec: 47985.4). Total num frames: 688996352. Throughput: 0: 47936.5. Samples: 217868160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 17:32:30,942][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:32:31,070][71000] Updated weights for policy 0, policy_version 42054 (0.0030) [2024-06-12 17:32:34,737][71000] Updated weights for policy 0, policy_version 42064 (0.0032) [2024-06-12 17:32:35,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48059.8, 300 sec: 47985.7). Total num frames: 689242112. Throughput: 0: 48135.6. Samples: 218019600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 17:32:35,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 17:32:37,635][71000] Updated weights for policy 0, policy_version 42074 (0.0031) [2024-06-12 17:32:40,940][70768] Fps is (10 sec: 45884.5, 60 sec: 47240.5, 300 sec: 47874.6). Total num frames: 689455104. Throughput: 0: 48027.7. Samples: 218304620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 17:32:40,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 17:32:41,399][71000] Updated weights for policy 0, policy_version 42084 (0.0035) [2024-06-12 17:32:44,563][71000] Updated weights for policy 0, policy_version 42094 (0.0028) [2024-06-12 17:32:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48332.8, 300 sec: 48041.2). Total num frames: 689717248. Throughput: 0: 47881.0. Samples: 218589600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 17:32:45,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:32:47,377][70980] Signal inference workers to stop experience collection... 
(3250 times) [2024-06-12 17:32:47,381][70980] Signal inference workers to resume experience collection... (3250 times) [2024-06-12 17:32:47,396][71000] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-06-12 17:32:47,427][71000] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-06-12 17:32:48,354][71000] Updated weights for policy 0, policy_version 42104 (0.0036) [2024-06-12 17:32:50,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48332.8, 300 sec: 48041.2). Total num frames: 689979392. Throughput: 0: 48155.5. Samples: 218739100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 17:32:50,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:32:51,330][71000] Updated weights for policy 0, policy_version 42114 (0.0028) [2024-06-12 17:32:55,598][71000] Updated weights for policy 0, policy_version 42124 (0.0043) [2024-06-12 17:32:55,940][70768] Fps is (10 sec: 45875.2, 60 sec: 47240.6, 300 sec: 47930.2). Total num frames: 690176000. Throughput: 0: 47981.0. Samples: 219019800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 17:32:55,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:32:58,102][71000] Updated weights for policy 0, policy_version 42134 (0.0030) [2024-06-12 17:33:00,940][70768] Fps is (10 sec: 44236.5, 60 sec: 47786.7, 300 sec: 47930.1). Total num frames: 690421760. Throughput: 0: 47752.8. Samples: 219303740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 17:33:00,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:33:02,300][71000] Updated weights for policy 0, policy_version 42144 (0.0030) [2024-06-12 17:33:04,777][71000] Updated weights for policy 0, policy_version 42154 (0.0029) [2024-06-12 17:33:05,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48332.7, 300 sec: 48096.8). Total num frames: 690700288. Throughput: 0: 47996.8. Samples: 219450780. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 17:33:05,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 17:33:08,890][71000] Updated weights for policy 0, policy_version 42164 (0.0038) [2024-06-12 17:33:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 47786.6, 300 sec: 47930.1). Total num frames: 690913280. Throughput: 0: 47961.6. Samples: 219743060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 17:33:10,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:33:11,618][71000] Updated weights for policy 0, policy_version 42174 (0.0028) [2024-06-12 17:33:15,868][71000] Updated weights for policy 0, policy_version 42184 (0.0029) [2024-06-12 17:33:15,939][70768] Fps is (10 sec: 44237.3, 60 sec: 47513.7, 300 sec: 47930.1). Total num frames: 691142656. Throughput: 0: 48039.2. Samples: 220029820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 17:33:15,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:33:18,681][71000] Updated weights for policy 0, policy_version 42194 (0.0027) [2024-06-12 17:33:20,940][70768] Fps is (10 sec: 47514.2, 60 sec: 47786.7, 300 sec: 48041.2). Total num frames: 691388416. Throughput: 0: 47833.4. Samples: 220172100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 17:33:20,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:33:22,701][71000] Updated weights for policy 0, policy_version 42204 (0.0041) [2024-06-12 17:33:25,281][71000] Updated weights for policy 0, policy_version 42214 (0.0032) [2024-06-12 17:33:25,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48059.7, 300 sec: 48096.8). Total num frames: 691650560. Throughput: 0: 47887.0. Samples: 220459540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 17:33:25,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:33:29,461][71000] Updated weights for policy 0, policy_version 42224 (0.0034) [2024-06-12 17:33:30,939][70768] Fps is (10 sec: 47513.8, 60 sec: 47788.3, 300 sec: 47874.6). 
Total num frames: 691863552. Throughput: 0: 48058.3. Samples: 220752220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 17:33:30,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 17:33:32,127][71000] Updated weights for policy 0, policy_version 42234 (0.0029) [2024-06-12 17:33:35,939][70768] Fps is (10 sec: 44237.2, 60 sec: 47513.6, 300 sec: 47930.1). Total num frames: 692092928. Throughput: 0: 47813.0. Samples: 220890680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 17:33:35,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:33:36,262][71000] Updated weights for policy 0, policy_version 42244 (0.0029) [2024-06-12 17:33:39,087][71000] Updated weights for policy 0, policy_version 42254 (0.0025) [2024-06-12 17:33:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48332.8, 300 sec: 48041.2). Total num frames: 692355072. Throughput: 0: 47917.8. Samples: 221176100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 17:33:40,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:33:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000042258_692355072.pth... [2024-06-12 17:33:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000041555_680837120.pth [2024-06-12 17:33:43,154][71000] Updated weights for policy 0, policy_version 42264 (0.0030) [2024-06-12 17:33:45,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48059.7, 300 sec: 48097.2). Total num frames: 692600832. Throughput: 0: 47973.4. Samples: 221462540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 17:33:45,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:33:46,222][71000] Updated weights for policy 0, policy_version 42274 (0.0036) [2024-06-12 17:33:50,035][71000] Updated weights for policy 0, policy_version 42284 (0.0026) [2024-06-12 17:33:50,940][70768] Fps is (10 sec: 45875.3, 60 sec: 47240.6, 300 sec: 47930.2). Total num frames: 692813824. Throughput: 0: 47884.5. 
Samples: 221605580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 17:33:50,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:33:52,785][71000] Updated weights for policy 0, policy_version 42294 (0.0026) [2024-06-12 17:33:55,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48059.8, 300 sec: 47985.7). Total num frames: 693059584. Throughput: 0: 47811.3. Samples: 221894560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:33:55,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 17:33:56,769][71000] Updated weights for policy 0, policy_version 42304 (0.0022) [2024-06-12 17:33:59,668][71000] Updated weights for policy 0, policy_version 42314 (0.0039) [2024-06-12 17:34:00,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48332.8, 300 sec: 48041.2). Total num frames: 693321728. Throughput: 0: 47728.2. Samples: 222177600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:34:00,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:34:03,345][70980] Signal inference workers to stop experience collection... (3300 times) [2024-06-12 17:34:03,393][70980] Signal inference workers to resume experience collection... (3300 times) [2024-06-12 17:34:03,394][71000] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-06-12 17:34:03,409][71000] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-06-12 17:34:03,541][71000] Updated weights for policy 0, policy_version 42324 (0.0037) [2024-06-12 17:34:05,940][70768] Fps is (10 sec: 50789.7, 60 sec: 47786.6, 300 sec: 48041.2). Total num frames: 693567488. Throughput: 0: 48153.3. Samples: 222339000. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:34:05,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:34:06,319][71000] Updated weights for policy 0, policy_version 42334 (0.0027) [2024-06-12 17:34:10,272][71000] Updated weights for policy 0, policy_version 42344 (0.0036) [2024-06-12 17:34:10,940][70768] Fps is (10 sec: 45875.5, 60 sec: 47786.7, 300 sec: 47930.1). Total num frames: 693780480. Throughput: 0: 48098.7. Samples: 222623980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:34:10,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:34:13,578][71000] Updated weights for policy 0, policy_version 42354 (0.0030) [2024-06-12 17:34:15,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48059.6, 300 sec: 48041.2). Total num frames: 694026240. Throughput: 0: 47735.9. Samples: 222900340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:34:15,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:34:17,151][71000] Updated weights for policy 0, policy_version 42364 (0.0029) [2024-06-12 17:34:20,181][71000] Updated weights for policy 0, policy_version 42374 (0.0027) [2024-06-12 17:34:20,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48332.8, 300 sec: 48096.7). Total num frames: 694288384. Throughput: 0: 48072.4. Samples: 223053940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 17:34:20,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:34:23,916][71000] Updated weights for policy 0, policy_version 42384 (0.0025) [2024-06-12 17:34:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 47513.6, 300 sec: 47874.6). Total num frames: 694501376. Throughput: 0: 48043.1. Samples: 223338040. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 17:34:25,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 17:34:26,992][71000] Updated weights for policy 0, policy_version 42394 (0.0038) [2024-06-12 17:34:30,762][71000] Updated weights for policy 0, policy_version 42404 (0.0028) [2024-06-12 17:34:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48059.7, 300 sec: 47930.1). Total num frames: 694747136. Throughput: 0: 48082.7. Samples: 223626260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 17:34:30,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 17:34:33,723][71000] Updated weights for policy 0, policy_version 42414 (0.0031) [2024-06-12 17:34:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48332.8, 300 sec: 47985.7). Total num frames: 694992896. Throughput: 0: 48020.9. Samples: 223766520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 17:34:35,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:34:37,809][71000] Updated weights for policy 0, policy_version 42424 (0.0040) [2024-06-12 17:34:40,684][71000] Updated weights for policy 0, policy_version 42434 (0.0030) [2024-06-12 17:34:40,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48059.8, 300 sec: 48041.2). Total num frames: 695238656. Throughput: 0: 47961.8. Samples: 224052840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 17:34:40,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 17:34:44,529][71000] Updated weights for policy 0, policy_version 42444 (0.0041) [2024-06-12 17:34:45,940][70768] Fps is (10 sec: 45875.3, 60 sec: 47513.6, 300 sec: 47763.5). Total num frames: 695451648. Throughput: 0: 48010.8. Samples: 224338080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 17:34:45,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:34:47,733][71000] Updated weights for policy 0, policy_version 42454 (0.0027) [2024-06-12 17:34:50,940][70768] Fps is (10 sec: 45874.5, 60 sec: 48059.7, 300 sec: 47985.7). 
Total num frames: 695697408. Throughput: 0: 47559.1. Samples: 224479160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 17:34:50,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:34:51,301][71000] Updated weights for policy 0, policy_version 42464 (0.0044) [2024-06-12 17:34:54,648][71000] Updated weights for policy 0, policy_version 42474 (0.0030) [2024-06-12 17:34:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48332.7, 300 sec: 47985.7). Total num frames: 695959552. Throughput: 0: 47596.9. Samples: 224765840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 17:34:55,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:34:58,200][71000] Updated weights for policy 0, policy_version 42484 (0.0032) [2024-06-12 17:35:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 47513.6, 300 sec: 47874.6). Total num frames: 696172544. Throughput: 0: 47829.8. Samples: 225052680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 17:35:00,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:35:01,301][71000] Updated weights for policy 0, policy_version 42494 (0.0032) [2024-06-12 17:35:05,210][71000] Updated weights for policy 0, policy_version 42504 (0.0029) [2024-06-12 17:35:05,939][70768] Fps is (10 sec: 45875.4, 60 sec: 47513.7, 300 sec: 47930.1). Total num frames: 696418304. Throughput: 0: 47576.1. Samples: 225194860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 17:35:05,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:35:07,958][70980] Signal inference workers to stop experience collection... (3350 times) [2024-06-12 17:35:07,997][71000] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-06-12 17:35:08,011][70980] Signal inference workers to resume experience collection... 
(3350 times) [2024-06-12 17:35:08,012][71000] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-06-12 17:35:08,305][71000] Updated weights for policy 0, policy_version 42514 (0.0034) [2024-06-12 17:35:10,940][70768] Fps is (10 sec: 47514.0, 60 sec: 47786.7, 300 sec: 47985.7). Total num frames: 696647680. Throughput: 0: 47662.7. Samples: 225482860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 17:35:10,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:35:11,999][71000] Updated weights for policy 0, policy_version 42524 (0.0030) [2024-06-12 17:35:15,181][71000] Updated weights for policy 0, policy_version 42534 (0.0040) [2024-06-12 17:35:15,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48059.8, 300 sec: 47985.7). Total num frames: 696909824. Throughput: 0: 47597.8. Samples: 225768160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 17:35:15,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 17:35:18,871][71000] Updated weights for policy 0, policy_version 42544 (0.0032) [2024-06-12 17:35:20,940][70768] Fps is (10 sec: 47513.7, 60 sec: 47240.6, 300 sec: 47874.6). Total num frames: 697122816. Throughput: 0: 47759.6. Samples: 225915700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 17:35:20,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:35:21,991][71000] Updated weights for policy 0, policy_version 42554 (0.0024) [2024-06-12 17:35:25,698][71000] Updated weights for policy 0, policy_version 42564 (0.0027) [2024-06-12 17:35:25,939][70768] Fps is (10 sec: 45875.3, 60 sec: 47786.8, 300 sec: 47874.6). Total num frames: 697368576. Throughput: 0: 47797.3. Samples: 226203720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 17:35:25,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 17:35:28,823][71000] Updated weights for policy 0, policy_version 42574 (0.0032) [2024-06-12 17:35:30,939][70768] Fps is (10 sec: 49152.1, 60 sec: 47786.7, 300 sec: 47985.7). 
Total num frames: 697614336. Throughput: 0: 47659.6. Samples: 226482760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 17:35:30,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:35:32,262][71000] Updated weights for policy 0, policy_version 42584 (0.0027) [2024-06-12 17:35:35,702][71000] Updated weights for policy 0, policy_version 42594 (0.0033) [2024-06-12 17:35:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 47786.7, 300 sec: 47985.7). Total num frames: 697860096. Throughput: 0: 48012.9. Samples: 226639740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 17:35:35,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:35:39,117][71000] Updated weights for policy 0, policy_version 42604 (0.0039) [2024-06-12 17:35:40,939][70768] Fps is (10 sec: 47513.7, 60 sec: 47513.6, 300 sec: 47930.2). Total num frames: 698089472. Throughput: 0: 48075.6. Samples: 226929240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-12 17:35:40,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:35:40,967][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000042609_698105856.pth... [2024-06-12 17:35:41,046][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000041908_686620672.pth [2024-06-12 17:35:42,500][71000] Updated weights for policy 0, policy_version 42614 (0.0029) [2024-06-12 17:35:45,865][71000] Updated weights for policy 0, policy_version 42624 (0.0032) [2024-06-12 17:35:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48332.8, 300 sec: 47985.7). Total num frames: 698351616. Throughput: 0: 48029.4. Samples: 227214000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-12 17:35:45,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:35:49,347][71000] Updated weights for policy 0, policy_version 42634 (0.0032) [2024-06-12 17:35:50,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48332.8, 300 sec: 48096.7). Total num frames: 698597376. Throughput: 0: 48118.6. 
Samples: 227360200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-12 17:35:50,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:35:52,822][71000] Updated weights for policy 0, policy_version 42644 (0.0031) [2024-06-12 17:35:55,939][70768] Fps is (10 sec: 45875.7, 60 sec: 47513.7, 300 sec: 47874.6). Total num frames: 698810368. Throughput: 0: 48081.4. Samples: 227646520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-12 17:35:55,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:35:56,220][71000] Updated weights for policy 0, policy_version 42654 (0.0028) [2024-06-12 17:35:59,577][71000] Updated weights for policy 0, policy_version 42664 (0.0032) [2024-06-12 17:36:00,942][70768] Fps is (10 sec: 45863.1, 60 sec: 48057.7, 300 sec: 47985.3). Total num frames: 699056128. Throughput: 0: 48154.0. Samples: 227935220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-12 17:36:00,943][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:36:03,057][71000] Updated weights for policy 0, policy_version 42674 (0.0038) [2024-06-12 17:36:05,939][70768] Fps is (10 sec: 49151.9, 60 sec: 48059.7, 300 sec: 47985.7). Total num frames: 699301888. Throughput: 0: 48046.2. Samples: 228077780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-12 17:36:05,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 17:36:06,182][71000] Updated weights for policy 0, policy_version 42684 (0.0031) [2024-06-12 17:36:09,963][71000] Updated weights for policy 0, policy_version 42694 (0.0032) [2024-06-12 17:36:10,940][70768] Fps is (10 sec: 50803.0, 60 sec: 48605.7, 300 sec: 47985.7). Total num frames: 699564032. Throughput: 0: 48269.5. Samples: 228375860. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 17:36:10,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 17:36:12,750][71000] Updated weights for policy 0, policy_version 42704 (0.0031) [2024-06-12 17:36:15,940][70768] Fps is (10 sec: 47513.4, 60 sec: 47786.6, 300 sec: 47874.6). Total num frames: 699777024. Throughput: 0: 48339.1. Samples: 228658020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 17:36:15,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:36:16,733][71000] Updated weights for policy 0, policy_version 42714 (0.0030) [2024-06-12 17:36:19,923][71000] Updated weights for policy 0, policy_version 42724 (0.0029) [2024-06-12 17:36:20,940][70768] Fps is (10 sec: 44237.4, 60 sec: 48059.7, 300 sec: 47930.2). Total num frames: 700006400. Throughput: 0: 47964.0. Samples: 228798120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 17:36:20,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:36:23,238][71000] Updated weights for policy 0, policy_version 42734 (0.0024) [2024-06-12 17:36:25,940][70768] Fps is (10 sec: 50789.5, 60 sec: 48605.7, 300 sec: 48041.2). Total num frames: 700284928. Throughput: 0: 48015.8. Samples: 229089960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 17:36:25,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:36:26,848][71000] Updated weights for policy 0, policy_version 42744 (0.0036) [2024-06-12 17:36:29,702][70980] Signal inference workers to stop experience collection... (3400 times) [2024-06-12 17:36:29,755][71000] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-06-12 17:36:29,758][70980] Signal inference workers to resume experience collection... 
(3400 times) [2024-06-12 17:36:29,765][71000] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-06-12 17:36:29,920][71000] Updated weights for policy 0, policy_version 42754 (0.0029) [2024-06-12 17:36:30,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48332.8, 300 sec: 47985.7). Total num frames: 700514304. Throughput: 0: 48341.4. Samples: 229389360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 17:36:30,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:36:33,232][71000] Updated weights for policy 0, policy_version 42764 (0.0029) [2024-06-12 17:36:35,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48059.8, 300 sec: 47874.6). Total num frames: 700743680. Throughput: 0: 48312.0. Samples: 229534240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:36:35,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:36:37,102][71000] Updated weights for policy 0, policy_version 42774 (0.0032) [2024-06-12 17:36:39,839][71000] Updated weights for policy 0, policy_version 42784 (0.0031) [2024-06-12 17:36:40,941][70768] Fps is (10 sec: 47504.8, 60 sec: 48331.3, 300 sec: 48040.9). Total num frames: 700989440. Throughput: 0: 48326.4. Samples: 229821300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:36:40,942][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:36:43,635][71000] Updated weights for policy 0, policy_version 42794 (0.0035) [2024-06-12 17:36:45,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48059.8, 300 sec: 47985.7). Total num frames: 701235200. Throughput: 0: 48404.7. Samples: 230113300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:36:45,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 17:36:46,971][71000] Updated weights for policy 0, policy_version 42804 (0.0031) [2024-06-12 17:36:50,321][71000] Updated weights for policy 0, policy_version 42814 (0.0028) [2024-06-12 17:36:50,940][70768] Fps is (10 sec: 49160.6, 60 sec: 48059.7, 300 sec: 47930.1). 
Total num frames: 701480960. Throughput: 0: 48482.5. Samples: 230259500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:36:50,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:36:53,766][71000] Updated weights for policy 0, policy_version 42824 (0.0033) [2024-06-12 17:36:55,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48332.8, 300 sec: 47985.7). Total num frames: 701710336. Throughput: 0: 48189.5. Samples: 230544380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:36:55,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:36:56,867][71000] Updated weights for policy 0, policy_version 42834 (0.0029) [2024-06-12 17:37:00,239][71000] Updated weights for policy 0, policy_version 42844 (0.0043) [2024-06-12 17:37:00,939][70768] Fps is (10 sec: 49152.8, 60 sec: 48608.1, 300 sec: 48041.2). Total num frames: 701972480. Throughput: 0: 48248.5. Samples: 230829200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:37:00,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 17:37:04,197][71000] Updated weights for policy 0, policy_version 42854 (0.0037) [2024-06-12 17:37:05,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.8, 300 sec: 48041.2). Total num frames: 702218240. Throughput: 0: 48572.4. Samples: 230983880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 17:37:05,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:37:06,863][71000] Updated weights for policy 0, policy_version 42864 (0.0034) [2024-06-12 17:37:10,706][71000] Updated weights for policy 0, policy_version 42874 (0.0029) [2024-06-12 17:37:10,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48059.9, 300 sec: 47985.7). Total num frames: 702447616. Throughput: 0: 48575.3. Samples: 231275840. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 17:37:10,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 17:37:13,990][71000] Updated weights for policy 0, policy_version 42884 (0.0027) [2024-06-12 17:37:15,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48332.8, 300 sec: 47985.7). Total num frames: 702676992. Throughput: 0: 48264.9. Samples: 231561280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 17:37:15,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:37:17,272][71000] Updated weights for policy 0, policy_version 42894 (0.0031) [2024-06-12 17:37:20,911][71000] Updated weights for policy 0, policy_version 42904 (0.0029) [2024-06-12 17:37:20,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48878.9, 300 sec: 48041.2). Total num frames: 702939136. Throughput: 0: 48284.7. Samples: 231707060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 17:37:20,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:37:24,374][71000] Updated weights for policy 0, policy_version 42914 (0.0033) [2024-06-12 17:37:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48059.8, 300 sec: 48041.5). Total num frames: 703168512. Throughput: 0: 48371.2. Samples: 231997920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-12 17:37:25,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:37:27,560][71000] Updated weights for policy 0, policy_version 42924 (0.0034) [2024-06-12 17:37:30,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48059.7, 300 sec: 47985.7). Total num frames: 703397888. Throughput: 0: 48251.0. Samples: 232284600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 17:37:30,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:37:31,400][71000] Updated weights for policy 0, policy_version 42934 (0.0033) [2024-06-12 17:37:34,237][71000] Updated weights for policy 0, policy_version 42944 (0.0019) [2024-06-12 17:37:35,943][70768] Fps is (10 sec: 47499.3, 60 sec: 48330.3, 300 sec: 48096.3). 
Total num frames: 703643648. Throughput: 0: 48131.9. Samples: 232425580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 17:37:35,943][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:37:38,010][71000] Updated weights for policy 0, policy_version 42954 (0.0026) [2024-06-12 17:37:40,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48607.3, 300 sec: 48096.8). Total num frames: 703905792. Throughput: 0: 48191.5. Samples: 232713000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 17:37:40,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:37:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000042963_703905792.pth... [2024-06-12 17:37:40,995][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000042258_692355072.pth [2024-06-12 17:37:41,479][71000] Updated weights for policy 0, policy_version 42964 (0.0038) [2024-06-12 17:37:44,169][70980] Signal inference workers to stop experience collection... (3450 times) [2024-06-12 17:37:44,171][70980] Signal inference workers to resume experience collection... (3450 times) [2024-06-12 17:37:44,198][71000] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-06-12 17:37:44,198][71000] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-06-12 17:37:44,595][71000] Updated weights for policy 0, policy_version 42974 (0.0028) [2024-06-12 17:37:45,940][70768] Fps is (10 sec: 45889.3, 60 sec: 47786.6, 300 sec: 47874.6). Total num frames: 704102400. Throughput: 0: 48142.6. Samples: 232995620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 17:37:45,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:37:48,187][71000] Updated weights for policy 0, policy_version 42984 (0.0035) [2024-06-12 17:37:50,939][70768] Fps is (10 sec: 45875.7, 60 sec: 48059.9, 300 sec: 48096.8). Total num frames: 704364544. Throughput: 0: 47817.0. Samples: 233135640. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 17:37:50,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:37:51,910][71000] Updated weights for policy 0, policy_version 42994 (0.0036) [2024-06-12 17:37:54,857][71000] Updated weights for policy 0, policy_version 43004 (0.0027) [2024-06-12 17:37:55,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48332.8, 300 sec: 48096.8). Total num frames: 704610304. Throughput: 0: 47560.8. Samples: 233416080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 17:37:55,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:37:58,950][71000] Updated weights for policy 0, policy_version 43014 (0.0024) [2024-06-12 17:38:00,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48332.7, 300 sec: 48041.2). Total num frames: 704872448. Throughput: 0: 48038.1. Samples: 233723000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 17:38:00,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:38:01,772][71000] Updated weights for policy 0, policy_version 43024 (0.0031) [2024-06-12 17:38:05,536][71000] Updated weights for policy 0, policy_version 43034 (0.0033) [2024-06-12 17:38:05,944][70768] Fps is (10 sec: 45855.8, 60 sec: 47510.3, 300 sec: 47985.0). Total num frames: 705069056. Throughput: 0: 47809.0. Samples: 233858660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 17:38:05,944][70768] Avg episode reward: [(0, '0.223')] [2024-06-12 17:38:08,479][71000] Updated weights for policy 0, policy_version 43044 (0.0029) [2024-06-12 17:38:10,939][70768] Fps is (10 sec: 44237.0, 60 sec: 47786.7, 300 sec: 48041.2). Total num frames: 705314816. Throughput: 0: 47817.9. Samples: 234149720. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 17:38:10,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:38:12,001][71000] Updated weights for policy 0, policy_version 43054 (0.0036) [2024-06-12 17:38:15,156][71000] Updated weights for policy 0, policy_version 43064 (0.0027) [2024-06-12 17:38:15,940][70768] Fps is (10 sec: 50811.6, 60 sec: 48332.7, 300 sec: 48096.7). Total num frames: 705576960. Throughput: 0: 47862.1. Samples: 234438400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 17:38:15,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:38:18,857][71000] Updated weights for policy 0, policy_version 43074 (0.0030) [2024-06-12 17:38:20,939][70768] Fps is (10 sec: 50790.6, 60 sec: 48059.9, 300 sec: 48041.2). Total num frames: 705822720. Throughput: 0: 48150.9. Samples: 234592220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 17:38:20,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:38:21,679][71000] Updated weights for policy 0, policy_version 43084 (0.0034) [2024-06-12 17:38:25,940][70768] Fps is (10 sec: 45875.6, 60 sec: 47786.7, 300 sec: 48041.2). Total num frames: 706035712. Throughput: 0: 48144.5. Samples: 234879500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 17:38:25,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:38:26,182][71000] Updated weights for policy 0, policy_version 43094 (0.0032) [2024-06-12 17:38:28,746][71000] Updated weights for policy 0, policy_version 43104 (0.0033) [2024-06-12 17:38:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48332.8, 300 sec: 48152.3). Total num frames: 706297856. Throughput: 0: 48223.1. Samples: 235165660. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 17:38:30,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:38:32,720][71000] Updated weights for policy 0, policy_version 43114 (0.0029) [2024-06-12 17:38:35,627][71000] Updated weights for policy 0, policy_version 43124 (0.0031) [2024-06-12 17:38:35,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48608.4, 300 sec: 48152.3). Total num frames: 706560000. Throughput: 0: 48603.5. Samples: 235322800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 17:38:35,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:38:39,283][71000] Updated weights for policy 0, policy_version 43134 (0.0031) [2024-06-12 17:38:40,701][70980] Signal inference workers to stop experience collection... (3500 times) [2024-06-12 17:38:40,746][71000] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-06-12 17:38:40,753][70980] Signal inference workers to resume experience collection... (3500 times) [2024-06-12 17:38:40,754][71000] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-06-12 17:38:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48059.7, 300 sec: 48096.8). Total num frames: 706789376. Throughput: 0: 48615.5. Samples: 235603780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 17:38:40,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:38:42,272][71000] Updated weights for policy 0, policy_version 43144 (0.0030) [2024-06-12 17:38:45,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48605.8, 300 sec: 48152.3). Total num frames: 707018752. Throughput: 0: 48336.4. Samples: 235898140. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 17:38:45,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:38:46,195][71000] Updated weights for policy 0, policy_version 43154 (0.0034) [2024-06-12 17:38:48,785][71000] Updated weights for policy 0, policy_version 43164 (0.0031) [2024-06-12 17:38:50,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.9, 300 sec: 48207.8). Total num frames: 707280896. Throughput: 0: 48485.1. Samples: 236040280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 17:38:50,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:38:53,123][71000] Updated weights for policy 0, policy_version 43174 (0.0038) [2024-06-12 17:38:55,865][71000] Updated weights for policy 0, policy_version 43184 (0.0029) [2024-06-12 17:38:55,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48605.9, 300 sec: 48152.3). Total num frames: 707526656. Throughput: 0: 48452.4. Samples: 236330080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 17:38:55,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:38:59,862][71000] Updated weights for policy 0, policy_version 43194 (0.0036) [2024-06-12 17:39:00,940][70768] Fps is (10 sec: 45874.9, 60 sec: 47786.7, 300 sec: 48041.2). Total num frames: 707739648. Throughput: 0: 48395.2. Samples: 236616180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 17:39:00,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:39:02,740][71000] Updated weights for policy 0, policy_version 43204 (0.0024) [2024-06-12 17:39:05,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48609.3, 300 sec: 48152.3). Total num frames: 707985408. Throughput: 0: 47986.1. Samples: 236751600. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 17:39:05,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:39:06,658][71000] Updated weights for policy 0, policy_version 43214 (0.0028) [2024-06-12 17:39:09,316][71000] Updated weights for policy 0, policy_version 43224 (0.0041) [2024-06-12 17:39:10,940][70768] Fps is (10 sec: 47512.1, 60 sec: 48332.5, 300 sec: 48096.7). Total num frames: 708214784. Throughput: 0: 48099.2. Samples: 237043980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 17:39:10,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:39:13,572][71000] Updated weights for policy 0, policy_version 43234 (0.0023) [2024-06-12 17:39:15,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48332.9, 300 sec: 48096.8). Total num frames: 708476928. Throughput: 0: 48277.8. Samples: 237338160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 17:39:15,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:39:16,119][71000] Updated weights for policy 0, policy_version 43244 (0.0024) [2024-06-12 17:39:20,290][71000] Updated weights for policy 0, policy_version 43254 (0.0036) [2024-06-12 17:39:20,940][70768] Fps is (10 sec: 49153.6, 60 sec: 48059.7, 300 sec: 48152.3). Total num frames: 708706304. Throughput: 0: 47993.8. Samples: 237482520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:39:20,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 17:39:22,923][71000] Updated weights for policy 0, policy_version 43264 (0.0025) [2024-06-12 17:39:25,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48332.8, 300 sec: 48096.8). Total num frames: 708935680. Throughput: 0: 48118.2. Samples: 237769100. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:39:25,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:39:27,145][71000] Updated weights for policy 0, policy_version 43274 (0.0031) [2024-06-12 17:39:29,909][71000] Updated weights for policy 0, policy_version 43284 (0.0035) [2024-06-12 17:39:30,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48059.8, 300 sec: 48096.8). Total num frames: 709181440. Throughput: 0: 48056.2. Samples: 238060660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:39:30,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:39:33,805][71000] Updated weights for policy 0, policy_version 43294 (0.0037) [2024-06-12 17:39:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 47786.6, 300 sec: 48096.7). Total num frames: 709427200. Throughput: 0: 48252.8. Samples: 238211660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:39:35,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 17:39:36,691][71000] Updated weights for policy 0, policy_version 43304 (0.0029) [2024-06-12 17:39:40,940][70768] Fps is (10 sec: 45874.9, 60 sec: 47513.6, 300 sec: 48096.8). Total num frames: 709640192. Throughput: 0: 48111.6. Samples: 238495100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:39:40,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:39:41,048][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000043314_709656576.pth... [2024-06-12 17:39:41,050][71000] Updated weights for policy 0, policy_version 43314 (0.0029) [2024-06-12 17:39:41,094][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000042609_698105856.pth [2024-06-12 17:39:43,505][71000] Updated weights for policy 0, policy_version 43324 (0.0028) [2024-06-12 17:39:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48059.8, 300 sec: 48152.3). Total num frames: 709902336. Throughput: 0: 48083.1. Samples: 238779920. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:39:45,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:39:47,688][71000] Updated weights for policy 0, policy_version 43334 (0.0031) [2024-06-12 17:39:50,386][71000] Updated weights for policy 0, policy_version 43344 (0.0029) [2024-06-12 17:39:50,939][70768] Fps is (10 sec: 52429.0, 60 sec: 48059.7, 300 sec: 48152.3). Total num frames: 710164480. Throughput: 0: 48384.5. Samples: 238928900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 17:39:50,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:39:54,302][71000] Updated weights for policy 0, policy_version 43354 (0.0034) [2024-06-12 17:39:55,940][70768] Fps is (10 sec: 47513.8, 60 sec: 47513.6, 300 sec: 48152.3). Total num frames: 710377472. Throughput: 0: 48284.0. Samples: 239216740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 17:39:55,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:39:56,283][70980] Signal inference workers to stop experience collection... (3550 times) [2024-06-12 17:39:56,337][71000] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-06-12 17:39:56,343][70980] Signal inference workers to resume experience collection... (3550 times) [2024-06-12 17:39:56,358][71000] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-06-12 17:39:57,440][71000] Updated weights for policy 0, policy_version 43364 (0.0034) [2024-06-12 17:40:00,939][70768] Fps is (10 sec: 45875.4, 60 sec: 48059.8, 300 sec: 48152.3). Total num frames: 710623232. Throughput: 0: 48225.0. Samples: 239508280. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 17:40:00,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:40:01,026][71000] Updated weights for policy 0, policy_version 43374 (0.0035) [2024-06-12 17:40:04,159][71000] Updated weights for policy 0, policy_version 43384 (0.0030) [2024-06-12 17:40:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 47786.7, 300 sec: 48152.3). Total num frames: 710852608. Throughput: 0: 48200.5. Samples: 239651540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 17:40:05,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:40:07,894][71000] Updated weights for policy 0, policy_version 43394 (0.0046) [2024-06-12 17:40:10,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48333.1, 300 sec: 48152.3). Total num frames: 711114752. Throughput: 0: 48212.9. Samples: 239938680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 17:40:10,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:40:10,979][71000] Updated weights for policy 0, policy_version 43404 (0.0030) [2024-06-12 17:40:14,831][71000] Updated weights for policy 0, policy_version 43414 (0.0041) [2024-06-12 17:40:15,939][70768] Fps is (10 sec: 49152.2, 60 sec: 47786.7, 300 sec: 48207.8). Total num frames: 711344128. Throughput: 0: 48015.1. Samples: 240221340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 17:40:15,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:40:17,769][71000] Updated weights for policy 0, policy_version 43424 (0.0035) [2024-06-12 17:40:20,940][70768] Fps is (10 sec: 45875.3, 60 sec: 47786.7, 300 sec: 48152.3). Total num frames: 711573504. Throughput: 0: 47724.9. Samples: 240359280. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 17:40:20,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:40:21,474][71000] Updated weights for policy 0, policy_version 43434 (0.0026) [2024-06-12 17:40:24,881][71000] Updated weights for policy 0, policy_version 43444 (0.0036) [2024-06-12 17:40:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48332.8, 300 sec: 48207.8). Total num frames: 711835648. Throughput: 0: 47874.2. Samples: 240649440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 17:40:25,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 17:40:28,534][71000] Updated weights for policy 0, policy_version 43454 (0.0038) [2024-06-12 17:40:30,939][70768] Fps is (10 sec: 49152.6, 60 sec: 48059.8, 300 sec: 48152.3). Total num frames: 712065024. Throughput: 0: 48068.6. Samples: 240943000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 17:40:30,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 17:40:31,596][71000] Updated weights for policy 0, policy_version 43464 (0.0034) [2024-06-12 17:40:35,179][71000] Updated weights for policy 0, policy_version 43474 (0.0040) [2024-06-12 17:40:35,940][70768] Fps is (10 sec: 45875.1, 60 sec: 47786.6, 300 sec: 48152.3). Total num frames: 712294400. Throughput: 0: 48020.8. Samples: 241089840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 17:40:35,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:40:38,383][71000] Updated weights for policy 0, policy_version 43484 (0.0038) [2024-06-12 17:40:40,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48605.8, 300 sec: 48152.3). Total num frames: 712556544. Throughput: 0: 47931.9. Samples: 241373680. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 17:40:40,940][70768] Avg episode reward: [(0, '0.223')] [2024-06-12 17:40:41,847][71000] Updated weights for policy 0, policy_version 43494 (0.0032) [2024-06-12 17:40:45,355][71000] Updated weights for policy 0, policy_version 43504 (0.0038) [2024-06-12 17:40:45,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48332.8, 300 sec: 48152.3). Total num frames: 712802304. Throughput: 0: 47937.7. Samples: 241665480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-12 17:40:45,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:40:48,804][71000] Updated weights for policy 0, policy_version 43514 (0.0027) [2024-06-12 17:40:50,940][70768] Fps is (10 sec: 47513.9, 60 sec: 47786.6, 300 sec: 48207.8). Total num frames: 713031680. Throughput: 0: 47872.9. Samples: 241805820. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-12 17:40:50,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:40:52,081][71000] Updated weights for policy 0, policy_version 43524 (0.0039) [2024-06-12 17:40:55,569][71000] Updated weights for policy 0, policy_version 43534 (0.0028) [2024-06-12 17:40:55,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48332.8, 300 sec: 48208.3). Total num frames: 713277440. Throughput: 0: 48034.2. Samples: 242100220. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-12 17:40:55,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 17:40:59,031][71000] Updated weights for policy 0, policy_version 43544 (0.0027) [2024-06-12 17:41:00,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48059.7, 300 sec: 48152.3). Total num frames: 713506816. Throughput: 0: 48009.3. Samples: 242381760. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-12 17:41:00,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:41:02,334][71000] Updated weights for policy 0, policy_version 43554 (0.0032) [2024-06-12 17:41:05,827][71000] Updated weights for policy 0, policy_version 43564 (0.0026) [2024-06-12 17:41:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48332.8, 300 sec: 48096.8). Total num frames: 713752576. Throughput: 0: 48242.6. Samples: 242530200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-12 17:41:05,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:41:08,920][71000] Updated weights for policy 0, policy_version 43574 (0.0029) [2024-06-12 17:41:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48059.8, 300 sec: 48207.8). Total num frames: 713998336. Throughput: 0: 48107.1. Samples: 242814260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 17:41:10,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 17:41:12,627][71000] Updated weights for policy 0, policy_version 43584 (0.0028) [2024-06-12 17:41:15,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48059.6, 300 sec: 48207.8). Total num frames: 714227712. Throughput: 0: 48059.8. Samples: 243105700. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 17:41:15,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:41:16,112][71000] Updated weights for policy 0, policy_version 43594 (0.0032) [2024-06-12 17:41:17,135][70980] Signal inference workers to stop experience collection... (3600 times) [2024-06-12 17:41:17,135][70980] Signal inference workers to resume experience collection... 
(3600 times) [2024-06-12 17:41:17,153][71000] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-06-12 17:41:17,153][71000] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-06-12 17:41:19,409][71000] Updated weights for policy 0, policy_version 43604 (0.0031) [2024-06-12 17:41:20,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.8, 300 sec: 48152.3). Total num frames: 714489856. Throughput: 0: 47978.1. Samples: 243248860. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 17:41:20,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:41:22,853][71000] Updated weights for policy 0, policy_version 43614 (0.0036) [2024-06-12 17:41:25,940][70768] Fps is (10 sec: 47513.2, 60 sec: 47786.6, 300 sec: 48096.7). Total num frames: 714702848. Throughput: 0: 47996.8. Samples: 243533540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 17:41:25,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:41:26,343][71000] Updated weights for policy 0, policy_version 43624 (0.0030) [2024-06-12 17:41:29,410][71000] Updated weights for policy 0, policy_version 43634 (0.0025) [2024-06-12 17:41:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48059.6, 300 sec: 48152.3). Total num frames: 714948608. Throughput: 0: 48088.3. Samples: 243829460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 17:41:30,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 17:41:32,841][71000] Updated weights for policy 0, policy_version 43644 (0.0028) [2024-06-12 17:41:35,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48605.9, 300 sec: 48208.1). Total num frames: 715210752. Throughput: 0: 48114.2. Samples: 243970960. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 17:41:35,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:41:35,959][71000] Updated weights for policy 0, policy_version 43654 (0.0039) [2024-06-12 17:41:39,777][71000] Updated weights for policy 0, policy_version 43664 (0.0030) [2024-06-12 17:41:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48059.7, 300 sec: 48152.3). Total num frames: 715440128. Throughput: 0: 48117.2. Samples: 244265500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 17:41:40,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:41:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000043667_715440128.pth... [2024-06-12 17:41:41,019][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000042963_703905792.pth [2024-06-12 17:41:43,261][71000] Updated weights for policy 0, policy_version 43674 (0.0022) [2024-06-12 17:41:45,940][70768] Fps is (10 sec: 45874.6, 60 sec: 47786.5, 300 sec: 48096.7). Total num frames: 715669504. Throughput: 0: 48217.1. Samples: 244551540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 17:41:45,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:41:46,511][71000] Updated weights for policy 0, policy_version 43684 (0.0021) [2024-06-12 17:41:49,885][71000] Updated weights for policy 0, policy_version 43694 (0.0028) [2024-06-12 17:41:50,940][70768] Fps is (10 sec: 47514.4, 60 sec: 48059.8, 300 sec: 48152.3). Total num frames: 715915264. Throughput: 0: 48093.4. Samples: 244694400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 17:41:50,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:41:53,567][71000] Updated weights for policy 0, policy_version 43704 (0.0024) [2024-06-12 17:41:55,940][70768] Fps is (10 sec: 47514.3, 60 sec: 47786.7, 300 sec: 48041.2). Total num frames: 716144640. Throughput: 0: 48084.0. Samples: 244978040. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 17:41:55,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:41:56,727][71000] Updated weights for policy 0, policy_version 43714 (0.0032) [2024-06-12 17:42:00,158][71000] Updated weights for policy 0, policy_version 43724 (0.0037) [2024-06-12 17:42:00,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48059.8, 300 sec: 48041.2). Total num frames: 716390400. Throughput: 0: 47941.1. Samples: 245263040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 17:42:00,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:42:03,271][71000] Updated weights for policy 0, policy_version 43734 (0.0032) [2024-06-12 17:42:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48059.7, 300 sec: 48096.7). Total num frames: 716636160. Throughput: 0: 47957.4. Samples: 245406940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:42:05,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:42:07,038][71000] Updated weights for policy 0, policy_version 43744 (0.0039) [2024-06-12 17:42:10,309][71000] Updated weights for policy 0, policy_version 43754 (0.0032) [2024-06-12 17:42:10,940][70768] Fps is (10 sec: 47513.1, 60 sec: 47786.7, 300 sec: 48096.8). Total num frames: 716865536. Throughput: 0: 48195.7. Samples: 245702340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:42:10,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:42:13,957][71000] Updated weights for policy 0, policy_version 43764 (0.0032) [2024-06-12 17:42:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48332.8, 300 sec: 48096.8). Total num frames: 717127680. Throughput: 0: 47889.4. Samples: 245984480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:42:15,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:42:17,493][71000] Updated weights for policy 0, policy_version 43774 (0.0028) [2024-06-12 17:42:20,939][70768] Fps is (10 sec: 47513.8, 60 sec: 47513.7, 300 sec: 48041.2). 
Total num frames: 717340672. Throughput: 0: 47971.2. Samples: 246129660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:42:20,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:42:20,955][71000] Updated weights for policy 0, policy_version 43784 (0.0031) [2024-06-12 17:42:24,050][71000] Updated weights for policy 0, policy_version 43794 (0.0034) [2024-06-12 17:42:25,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48059.8, 300 sec: 48096.7). Total num frames: 717586432. Throughput: 0: 47801.4. Samples: 246416560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:42:25,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:42:27,338][70980] Signal inference workers to stop experience collection... (3650 times) [2024-06-12 17:42:27,393][71000] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-06-12 17:42:27,445][70980] Signal inference workers to resume experience collection... (3650 times) [2024-06-12 17:42:27,446][71000] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-06-12 17:42:27,583][71000] Updated weights for policy 0, policy_version 43804 (0.0028) [2024-06-12 17:42:30,871][71000] Updated weights for policy 0, policy_version 43814 (0.0025) [2024-06-12 17:42:30,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48332.8, 300 sec: 48152.8). Total num frames: 717848576. Throughput: 0: 47804.5. Samples: 246702740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:42:30,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:42:34,509][71000] Updated weights for policy 0, policy_version 43824 (0.0027) [2024-06-12 17:42:35,944][70768] Fps is (10 sec: 49130.8, 60 sec: 47783.2, 300 sec: 48040.5). Total num frames: 718077952. Throughput: 0: 47958.9. Samples: 246852760. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:42:35,945][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:42:37,962][71000] Updated weights for policy 0, policy_version 43834 (0.0028) [2024-06-12 17:42:40,940][70768] Fps is (10 sec: 45874.9, 60 sec: 47786.7, 300 sec: 48152.3). Total num frames: 718307328. Throughput: 0: 48092.3. Samples: 247142200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:42:40,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:42:41,273][71000] Updated weights for policy 0, policy_version 43844 (0.0029) [2024-06-12 17:42:44,935][71000] Updated weights for policy 0, policy_version 43854 (0.0028) [2024-06-12 17:42:45,940][70768] Fps is (10 sec: 47534.3, 60 sec: 48059.9, 300 sec: 48096.8). Total num frames: 718553088. Throughput: 0: 48064.3. Samples: 247425940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:42:45,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:42:48,379][71000] Updated weights for policy 0, policy_version 43864 (0.0031) [2024-06-12 17:42:50,942][70768] Fps is (10 sec: 50779.2, 60 sec: 48330.9, 300 sec: 48151.9). Total num frames: 718815232. Throughput: 0: 48069.1. Samples: 247570160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:42:50,942][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:42:51,734][71000] Updated weights for policy 0, policy_version 43874 (0.0032) [2024-06-12 17:42:55,021][71000] Updated weights for policy 0, policy_version 43884 (0.0027) [2024-06-12 17:42:55,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48059.8, 300 sec: 47985.7). Total num frames: 719028224. Throughput: 0: 47883.1. Samples: 247857080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:42:55,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:42:58,311][71000] Updated weights for policy 0, policy_version 43894 (0.0033) [2024-06-12 17:43:00,939][70768] Fps is (10 sec: 44247.7, 60 sec: 47786.7, 300 sec: 48097.5). 
Total num frames: 719257600. Throughput: 0: 48033.5. Samples: 248145980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:43:00,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:43:02,180][71000] Updated weights for policy 0, policy_version 43904 (0.0032) [2024-06-12 17:43:05,286][71000] Updated weights for policy 0, policy_version 43914 (0.0042) [2024-06-12 17:43:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48059.8, 300 sec: 48152.3). Total num frames: 719519744. Throughput: 0: 47888.4. Samples: 248284640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 17:43:05,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:43:08,677][71000] Updated weights for policy 0, policy_version 43924 (0.0032) [2024-06-12 17:43:10,940][70768] Fps is (10 sec: 50789.5, 60 sec: 48332.7, 300 sec: 48096.8). Total num frames: 719765504. Throughput: 0: 47943.5. Samples: 248574020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 17:43:10,951][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:43:12,144][71000] Updated weights for policy 0, policy_version 43934 (0.0028) [2024-06-12 17:43:15,930][71000] Updated weights for policy 0, policy_version 43944 (0.0033) [2024-06-12 17:43:15,944][70768] Fps is (10 sec: 45855.3, 60 sec: 47510.2, 300 sec: 47985.0). Total num frames: 719978496. Throughput: 0: 47878.2. Samples: 248857460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 17:43:15,945][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:43:18,883][71000] Updated weights for policy 0, policy_version 43954 (0.0022) [2024-06-12 17:43:20,940][70768] Fps is (10 sec: 44236.4, 60 sec: 47786.5, 300 sec: 48041.2). Total num frames: 720207872. Throughput: 0: 47652.4. Samples: 248996920. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 17:43:20,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 17:43:22,610][71000] Updated weights for policy 0, policy_version 43964 (0.0034) [2024-06-12 17:43:25,550][71000] Updated weights for policy 0, policy_version 43974 (0.0031) [2024-06-12 17:43:25,940][70768] Fps is (10 sec: 49173.2, 60 sec: 48059.7, 300 sec: 48041.2). Total num frames: 720470016. Throughput: 0: 47779.7. Samples: 249292280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 17:43:25,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:43:29,133][71000] Updated weights for policy 0, policy_version 43984 (0.0033) [2024-06-12 17:43:30,934][70980] Signal inference workers to stop experience collection... (3700 times) [2024-06-12 17:43:30,934][70980] Signal inference workers to resume experience collection... (3700 times) [2024-06-12 17:43:30,940][70768] Fps is (10 sec: 50791.3, 60 sec: 47786.8, 300 sec: 47985.7). Total num frames: 720715776. Throughput: 0: 47912.0. Samples: 249581980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:43:30,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:43:30,971][71000] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-06-12 17:43:30,971][71000] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-06-12 17:43:32,539][71000] Updated weights for policy 0, policy_version 43994 (0.0031) [2024-06-12 17:43:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48063.1, 300 sec: 48041.2). Total num frames: 720961536. Throughput: 0: 47996.2. Samples: 249729880. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:43:35,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:43:35,941][71000] Updated weights for policy 0, policy_version 44004 (0.0031) [2024-06-12 17:43:39,107][71000] Updated weights for policy 0, policy_version 44014 (0.0029) [2024-06-12 17:43:40,940][70768] Fps is (10 sec: 45875.2, 60 sec: 47786.8, 300 sec: 47985.7). Total num frames: 721174528. Throughput: 0: 47933.8. Samples: 250014100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:43:40,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 17:43:41,004][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000044018_721190912.pth... [2024-06-12 17:43:41,067][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000043314_709656576.pth [2024-06-12 17:43:42,911][71000] Updated weights for policy 0, policy_version 44024 (0.0032) [2024-06-12 17:43:45,771][71000] Updated weights for policy 0, policy_version 44034 (0.0029) [2024-06-12 17:43:45,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48332.8, 300 sec: 48041.2). Total num frames: 721453056. Throughput: 0: 47834.2. Samples: 250298520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:43:45,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:43:49,701][71000] Updated weights for policy 0, policy_version 44044 (0.0031) [2024-06-12 17:43:50,940][70768] Fps is (10 sec: 50789.6, 60 sec: 47788.4, 300 sec: 47985.7). Total num frames: 721682432. Throughput: 0: 48258.5. Samples: 250456280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:43:50,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:43:52,879][71000] Updated weights for policy 0, policy_version 44054 (0.0036) [2024-06-12 17:43:55,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48059.7, 300 sec: 48041.2). Total num frames: 721911808. Throughput: 0: 48278.3. Samples: 250746540. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:43:55,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 17:43:56,214][71000] Updated weights for policy 0, policy_version 44064 (0.0020) [2024-06-12 17:43:59,436][71000] Updated weights for policy 0, policy_version 44074 (0.0031) [2024-06-12 17:44:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.6, 300 sec: 48041.2). Total num frames: 722157568. Throughput: 0: 48430.7. Samples: 251036640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 17:44:00,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:44:03,154][71000] Updated weights for policy 0, policy_version 44084 (0.0037) [2024-06-12 17:44:05,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48332.7, 300 sec: 48152.3). Total num frames: 722419712. Throughput: 0: 48399.6. Samples: 251174900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 17:44:05,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 17:44:06,213][71000] Updated weights for policy 0, policy_version 44094 (0.0024) [2024-06-12 17:44:10,063][71000] Updated weights for policy 0, policy_version 44104 (0.0035) [2024-06-12 17:44:10,939][70768] Fps is (10 sec: 49153.0, 60 sec: 48059.8, 300 sec: 48041.2). Total num frames: 722649088. Throughput: 0: 48489.9. Samples: 251474320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 17:44:10,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:44:12,930][71000] Updated weights for policy 0, policy_version 44114 (0.0034) [2024-06-12 17:44:15,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48336.3, 300 sec: 48041.2). Total num frames: 722878464. Throughput: 0: 48285.8. Samples: 251754840. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 17:44:15,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 17:44:17,126][71000] Updated weights for policy 0, policy_version 44124 (0.0031) [2024-06-12 17:44:20,076][71000] Updated weights for policy 0, policy_version 44134 (0.0030) [2024-06-12 17:44:20,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48332.9, 300 sec: 48041.2). Total num frames: 723107840. Throughput: 0: 48062.3. Samples: 251892680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 17:44:20,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:44:23,796][71000] Updated weights for policy 0, policy_version 44144 (0.0035) [2024-06-12 17:44:25,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48332.9, 300 sec: 48096.8). Total num frames: 723369984. Throughput: 0: 48147.6. Samples: 252180740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 17:44:25,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:44:26,706][71000] Updated weights for policy 0, policy_version 44154 (0.0037) [2024-06-12 17:44:30,521][71000] Updated weights for policy 0, policy_version 44164 (0.0025) [2024-06-12 17:44:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 47786.7, 300 sec: 47985.7). Total num frames: 723582976. Throughput: 0: 48294.2. Samples: 252471760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 17:44:30,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:44:33,658][71000] Updated weights for policy 0, policy_version 44174 (0.0039) [2024-06-12 17:44:35,940][70768] Fps is (10 sec: 45874.9, 60 sec: 47786.8, 300 sec: 48096.8). Total num frames: 723828736. Throughput: 0: 47776.6. Samples: 252606220. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 17:44:35,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:44:37,599][71000] Updated weights for policy 0, policy_version 44184 (0.0039) [2024-06-12 17:44:40,450][71000] Updated weights for policy 0, policy_version 44194 (0.0032) [2024-06-12 17:44:40,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48332.8, 300 sec: 48041.2). Total num frames: 724074496. Throughput: 0: 47791.6. Samples: 252897160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 17:44:40,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:44:44,359][71000] Updated weights for policy 0, policy_version 44204 (0.0038) [2024-06-12 17:44:45,940][70768] Fps is (10 sec: 49151.2, 60 sec: 47786.5, 300 sec: 47985.7). Total num frames: 724320256. Throughput: 0: 47680.0. Samples: 253182240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 17:44:45,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:44:47,538][71000] Updated weights for policy 0, policy_version 44214 (0.0026) [2024-06-12 17:44:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 47786.8, 300 sec: 48041.2). Total num frames: 724549632. Throughput: 0: 47908.6. Samples: 253330780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 17:44:50,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:44:51,181][71000] Updated weights for policy 0, policy_version 44224 (0.0033) [2024-06-12 17:44:54,062][71000] Updated weights for policy 0, policy_version 44234 (0.0020) [2024-06-12 17:44:55,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48332.8, 300 sec: 48096.7). Total num frames: 724811776. Throughput: 0: 47853.3. Samples: 253627720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 17:44:55,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:44:57,540][71000] Updated weights for policy 0, policy_version 44244 (0.0028) [2024-06-12 17:45:00,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48059.9, 300 sec: 48096.8). 
Total num frames: 725041152. Throughput: 0: 48062.7. Samples: 253917660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 17:45:00,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:45:01,001][71000] Updated weights for policy 0, policy_version 44254 (0.0043) [2024-06-12 17:45:04,412][71000] Updated weights for policy 0, policy_version 44264 (0.0032) [2024-06-12 17:45:05,940][70768] Fps is (10 sec: 45875.2, 60 sec: 47513.7, 300 sec: 47985.7). Total num frames: 725270528. Throughput: 0: 48146.2. Samples: 254059260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 17:45:05,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:45:07,724][71000] Updated weights for policy 0, policy_version 44274 (0.0031) [2024-06-12 17:45:10,940][70768] Fps is (10 sec: 49148.7, 60 sec: 48059.2, 300 sec: 48096.6). Total num frames: 725532672. Throughput: 0: 47994.3. Samples: 254340520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 17:45:10,941][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:45:11,159][71000] Updated weights for policy 0, policy_version 44284 (0.0026) [2024-06-12 17:45:12,370][70980] Signal inference workers to stop experience collection... (3750 times) [2024-06-12 17:45:12,371][70980] Signal inference workers to resume experience collection... (3750 times) [2024-06-12 17:45:12,419][71000] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-06-12 17:45:12,419][71000] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-06-12 17:45:14,667][71000] Updated weights for policy 0, policy_version 44294 (0.0037) [2024-06-12 17:45:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48059.7, 300 sec: 48096.8). Total num frames: 725762048. Throughput: 0: 48074.6. Samples: 254635120. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 17:45:15,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:45:18,085][71000] Updated weights for policy 0, policy_version 44304 (0.0033) [2024-06-12 17:45:20,940][70768] Fps is (10 sec: 47516.1, 60 sec: 48332.7, 300 sec: 48041.2). Total num frames: 726007808. Throughput: 0: 48207.9. Samples: 254775580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 17:45:20,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:45:21,714][71000] Updated weights for policy 0, policy_version 44314 (0.0032) [2024-06-12 17:45:24,777][71000] Updated weights for policy 0, policy_version 44324 (0.0028) [2024-06-12 17:45:25,939][70768] Fps is (10 sec: 47514.1, 60 sec: 47786.6, 300 sec: 48041.2). Total num frames: 726237184. Throughput: 0: 48139.1. Samples: 255063420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:45:25,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:45:28,331][71000] Updated weights for policy 0, policy_version 44334 (0.0033) [2024-06-12 17:45:30,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48332.8, 300 sec: 48096.8). Total num frames: 726482944. Throughput: 0: 48289.9. Samples: 255355280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:45:30,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 17:45:31,530][71000] Updated weights for policy 0, policy_version 44344 (0.0030) [2024-06-12 17:45:35,190][71000] Updated weights for policy 0, policy_version 44354 (0.0038) [2024-06-12 17:45:35,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48332.7, 300 sec: 48041.2). Total num frames: 726728704. Throughput: 0: 48345.7. Samples: 255506340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:45:35,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:45:38,401][71000] Updated weights for policy 0, policy_version 44364 (0.0046) [2024-06-12 17:45:40,943][70768] Fps is (10 sec: 47495.0, 60 sec: 48056.6, 300 sec: 47985.0). 
Total num frames: 726958080. Throughput: 0: 47981.6. Samples: 255787080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:45:40,944][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:45:40,962][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000044370_726958080.pth... [2024-06-12 17:45:41,007][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000043667_715440128.pth [2024-06-12 17:45:42,141][71000] Updated weights for policy 0, policy_version 44374 (0.0037) [2024-06-12 17:45:45,124][71000] Updated weights for policy 0, policy_version 44384 (0.0034) [2024-06-12 17:45:45,939][70768] Fps is (10 sec: 49153.0, 60 sec: 48333.0, 300 sec: 48096.8). Total num frames: 727220224. Throughput: 0: 47985.4. Samples: 256077000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:45:45,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:45:48,776][71000] Updated weights for policy 0, policy_version 44394 (0.0028) [2024-06-12 17:45:50,940][70768] Fps is (10 sec: 49170.4, 60 sec: 48332.7, 300 sec: 48041.2). Total num frames: 727449600. Throughput: 0: 48253.2. Samples: 256230660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 17:45:50,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:45:51,944][71000] Updated weights for policy 0, policy_version 44404 (0.0026) [2024-06-12 17:45:55,449][71000] Updated weights for policy 0, policy_version 44414 (0.0035) [2024-06-12 17:45:55,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48059.8, 300 sec: 48096.8). Total num frames: 727695360. Throughput: 0: 48488.7. Samples: 256522480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 17:45:55,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:45:58,870][71000] Updated weights for policy 0, policy_version 44424 (0.0028) [2024-06-12 17:46:00,939][70768] Fps is (10 sec: 47514.7, 60 sec: 48059.8, 300 sec: 48041.2). Total num frames: 727924736. Throughput: 0: 48259.7. 
Samples: 256806800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 17:46:00,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 17:46:02,391][71000] Updated weights for policy 0, policy_version 44434 (0.0030) [2024-06-12 17:46:05,554][71000] Updated weights for policy 0, policy_version 44444 (0.0030) [2024-06-12 17:46:05,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48332.7, 300 sec: 48041.2). Total num frames: 728170496. Throughput: 0: 48185.3. Samples: 256943920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 17:46:05,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:46:09,319][71000] Updated weights for policy 0, policy_version 44454 (0.0029) [2024-06-12 17:46:10,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48060.3, 300 sec: 48096.8). Total num frames: 728416256. Throughput: 0: 48452.0. Samples: 257243760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 17:46:10,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:46:12,085][71000] Updated weights for policy 0, policy_version 44464 (0.0034) [2024-06-12 17:46:15,889][71000] Updated weights for policy 0, policy_version 44474 (0.0030) [2024-06-12 17:46:15,939][70768] Fps is (10 sec: 49152.9, 60 sec: 48332.9, 300 sec: 48041.2). Total num frames: 728662016. Throughput: 0: 48477.4. Samples: 257536760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 17:46:15,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:46:18,959][71000] Updated weights for policy 0, policy_version 44484 (0.0030) [2024-06-12 17:46:20,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48059.8, 300 sec: 48096.8). Total num frames: 728891392. Throughput: 0: 48144.1. Samples: 257672820. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 17:46:20,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:46:22,694][71000] Updated weights for policy 0, policy_version 44494 (0.0035) [2024-06-12 17:46:25,853][71000] Updated weights for policy 0, policy_version 44504 (0.0033) [2024-06-12 17:46:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.8, 300 sec: 48152.3). Total num frames: 729153536. Throughput: 0: 48364.1. Samples: 257963280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 17:46:25,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:46:29,600][71000] Updated weights for policy 0, policy_version 44514 (0.0033) [2024-06-12 17:46:30,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48059.6, 300 sec: 47985.7). Total num frames: 729366528. Throughput: 0: 48188.6. Samples: 258245500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 17:46:30,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:46:30,950][70980] Signal inference workers to stop experience collection... (3800 times) [2024-06-12 17:46:30,952][70980] Signal inference workers to resume experience collection... (3800 times) [2024-06-12 17:46:31,004][71000] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-06-12 17:46:31,004][71000] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-06-12 17:46:32,466][71000] Updated weights for policy 0, policy_version 44524 (0.0036) [2024-06-12 17:46:35,939][70768] Fps is (10 sec: 44237.5, 60 sec: 47786.8, 300 sec: 47985.7). Total num frames: 729595904. Throughput: 0: 47969.2. Samples: 258389260. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 17:46:35,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 17:46:36,526][71000] Updated weights for policy 0, policy_version 44534 (0.0026) [2024-06-12 17:46:39,272][71000] Updated weights for policy 0, policy_version 44544 (0.0029) [2024-06-12 17:46:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48335.8, 300 sec: 48096.8). Total num frames: 729858048. Throughput: 0: 47891.8. Samples: 258677620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 17:46:40,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:46:43,337][71000] Updated weights for policy 0, policy_version 44554 (0.0028) [2024-06-12 17:46:45,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48059.6, 300 sec: 48096.8). Total num frames: 730103808. Throughput: 0: 48109.3. Samples: 258971720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 17:46:45,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 17:46:46,133][71000] Updated weights for policy 0, policy_version 44564 (0.0031) [2024-06-12 17:46:50,040][71000] Updated weights for policy 0, policy_version 44574 (0.0035) [2024-06-12 17:46:50,940][70768] Fps is (10 sec: 49153.0, 60 sec: 48333.0, 300 sec: 48152.3). Total num frames: 730349568. Throughput: 0: 48429.5. Samples: 259123240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 17:46:50,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:46:52,896][71000] Updated weights for policy 0, policy_version 44584 (0.0026) [2024-06-12 17:46:55,940][70768] Fps is (10 sec: 45875.3, 60 sec: 47786.7, 300 sec: 48041.2). Total num frames: 730562560. Throughput: 0: 47951.5. Samples: 259401580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 17:46:55,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:46:56,776][71000] Updated weights for policy 0, policy_version 44594 (0.0031) [2024-06-12 17:46:59,498][71000] Updated weights for policy 0, policy_version 44604 (0.0029) [2024-06-12 17:47:00,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.7, 300 sec: 48152.3). Total num frames: 730841088. Throughput: 0: 47868.3. Samples: 259690840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 17:47:00,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:47:03,733][71000] Updated weights for policy 0, policy_version 44614 (0.0028) [2024-06-12 17:47:05,940][70768] Fps is (10 sec: 52427.9, 60 sec: 48605.9, 300 sec: 48207.8). Total num frames: 731086848. Throughput: 0: 48231.0. Samples: 259843220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 17:47:05,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:47:06,626][71000] Updated weights for policy 0, policy_version 44624 (0.0026) [2024-06-12 17:47:10,411][71000] Updated weights for policy 0, policy_version 44634 (0.0028) [2024-06-12 17:47:10,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48059.7, 300 sec: 48041.2). Total num frames: 731299840. Throughput: 0: 48328.5. Samples: 260138060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 17:47:10,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:47:13,558][71000] Updated weights for policy 0, policy_version 44644 (0.0021) [2024-06-12 17:47:15,940][70768] Fps is (10 sec: 44236.9, 60 sec: 47786.5, 300 sec: 48096.7). Total num frames: 731529216. Throughput: 0: 48384.0. Samples: 260422780. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-12 17:47:15,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:47:17,058][71000] Updated weights for policy 0, policy_version 44654 (0.0028) [2024-06-12 17:47:20,060][71000] Updated weights for policy 0, policy_version 44664 (0.0029) [2024-06-12 17:47:20,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.9, 300 sec: 48207.8). Total num frames: 731807744. Throughput: 0: 48479.9. Samples: 260570860. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-12 17:47:20,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:47:23,771][71000] Updated weights for policy 0, policy_version 44674 (0.0027) [2024-06-12 17:47:25,939][70768] Fps is (10 sec: 52430.0, 60 sec: 48332.9, 300 sec: 48152.3). Total num frames: 732053504. Throughput: 0: 48475.0. Samples: 260858980. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-12 17:47:25,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:47:27,069][71000] Updated weights for policy 0, policy_version 44684 (0.0036) [2024-06-12 17:47:30,845][71000] Updated weights for policy 0, policy_version 44694 (0.0034) [2024-06-12 17:47:30,939][70768] Fps is (10 sec: 45875.2, 60 sec: 48332.9, 300 sec: 48097.5). Total num frames: 732266496. Throughput: 0: 48344.5. Samples: 261147220. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-12 17:47:30,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:47:33,860][71000] Updated weights for policy 0, policy_version 44704 (0.0041) [2024-06-12 17:47:35,939][70768] Fps is (10 sec: 42598.1, 60 sec: 48059.7, 300 sec: 48041.3). Total num frames: 732479488. Throughput: 0: 47913.8. Samples: 261279360. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-12 17:47:35,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:47:37,563][71000] Updated weights for policy 0, policy_version 44714 (0.0035) [2024-06-12 17:47:40,687][70980] Signal inference workers to stop experience collection... 
(3850 times) [2024-06-12 17:47:40,688][70980] Signal inference workers to resume experience collection... (3850 times) [2024-06-12 17:47:40,718][71000] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-06-12 17:47:40,719][71000] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-06-12 17:47:40,824][71000] Updated weights for policy 0, policy_version 44724 (0.0033) [2024-06-12 17:47:40,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48332.8, 300 sec: 48152.3). Total num frames: 732758016. Throughput: 0: 48080.7. Samples: 261565220. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-12 17:47:40,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:47:40,954][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000044724_732758016.pth... [2024-06-12 17:47:41,011][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000044018_721190912.pth [2024-06-12 17:47:44,688][71000] Updated weights for policy 0, policy_version 44734 (0.0037) [2024-06-12 17:47:45,940][70768] Fps is (10 sec: 54067.0, 60 sec: 48605.9, 300 sec: 48152.7). Total num frames: 733020160. Throughput: 0: 48297.5. Samples: 261864220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 17:47:45,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 17:47:47,363][71000] Updated weights for policy 0, policy_version 44744 (0.0035) [2024-06-12 17:47:50,939][70768] Fps is (10 sec: 45876.1, 60 sec: 47786.7, 300 sec: 48096.8). Total num frames: 733216768. Throughput: 0: 48080.6. Samples: 262006840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 17:47:50,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:47:51,307][71000] Updated weights for policy 0, policy_version 44754 (0.0030) [2024-06-12 17:47:54,246][71000] Updated weights for policy 0, policy_version 44764 (0.0028) [2024-06-12 17:47:55,940][70768] Fps is (10 sec: 44236.3, 60 sec: 48332.7, 300 sec: 48152.3). 
Total num frames: 733462528. Throughput: 0: 47872.8. Samples: 262292340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 17:47:55,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:47:57,993][71000] Updated weights for policy 0, policy_version 44774 (0.0027) [2024-06-12 17:48:00,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48059.7, 300 sec: 48152.3). Total num frames: 733724672. Throughput: 0: 48189.3. Samples: 262591300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 17:48:00,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:48:01,001][71000] Updated weights for policy 0, policy_version 44784 (0.0036) [2024-06-12 17:48:04,899][71000] Updated weights for policy 0, policy_version 44794 (0.0036) [2024-06-12 17:48:05,940][70768] Fps is (10 sec: 52429.3, 60 sec: 48332.9, 300 sec: 48207.8). Total num frames: 733986816. Throughput: 0: 48264.0. Samples: 262742740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-12 17:48:05,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 17:48:07,619][71000] Updated weights for policy 0, policy_version 44804 (0.0036) [2024-06-12 17:48:10,939][70768] Fps is (10 sec: 47514.5, 60 sec: 48332.9, 300 sec: 48208.6). Total num frames: 734199808. Throughput: 0: 48212.4. Samples: 263028540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:48:10,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:48:11,703][71000] Updated weights for policy 0, policy_version 44814 (0.0027) [2024-06-12 17:48:14,285][71000] Updated weights for policy 0, policy_version 44824 (0.0026) [2024-06-12 17:48:15,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48606.0, 300 sec: 48263.4). Total num frames: 734445568. Throughput: 0: 48156.4. Samples: 263314260. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:48:15,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:48:18,519][71000] Updated weights for policy 0, policy_version 44834 (0.0048) [2024-06-12 17:48:20,940][70768] Fps is (10 sec: 50789.3, 60 sec: 48332.7, 300 sec: 48263.4). Total num frames: 734707712. Throughput: 0: 48552.3. Samples: 263464220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:48:20,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:48:21,129][71000] Updated weights for policy 0, policy_version 44844 (0.0027) [2024-06-12 17:48:24,986][71000] Updated weights for policy 0, policy_version 44854 (0.0029) [2024-06-12 17:48:25,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48059.5, 300 sec: 48207.8). Total num frames: 734937088. Throughput: 0: 48794.2. Samples: 263760960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:48:25,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:48:28,115][71000] Updated weights for policy 0, policy_version 44864 (0.0022) [2024-06-12 17:48:30,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48332.8, 300 sec: 48152.3). Total num frames: 735166464. Throughput: 0: 48592.9. Samples: 264050900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:48:30,940][70768] Avg episode reward: [(0, '0.215')] [2024-06-12 17:48:31,982][71000] Updated weights for policy 0, policy_version 44874 (0.0033) [2024-06-12 17:48:34,720][71000] Updated weights for policy 0, policy_version 44884 (0.0031) [2024-06-12 17:48:35,939][70768] Fps is (10 sec: 47514.8, 60 sec: 48879.0, 300 sec: 48263.4). Total num frames: 735412224. Throughput: 0: 48515.6. Samples: 264190040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 17:48:35,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:48:38,683][71000] Updated weights for policy 0, policy_version 44894 (0.0030) [2024-06-12 17:48:40,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48879.1, 300 sec: 48263.4). 
Total num frames: 735690752. Throughput: 0: 48735.7. Samples: 264485440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 17:48:40,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:48:41,226][71000] Updated weights for policy 0, policy_version 44904 (0.0029) [2024-06-12 17:48:45,286][70980] Signal inference workers to stop experience collection... (3900 times) [2024-06-12 17:48:45,323][71000] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-06-12 17:48:45,403][70980] Signal inference workers to resume experience collection... (3900 times) [2024-06-12 17:48:45,404][71000] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-06-12 17:48:45,406][71000] Updated weights for policy 0, policy_version 44914 (0.0034) [2024-06-12 17:48:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48332.8, 300 sec: 48263.4). Total num frames: 735920128. Throughput: 0: 48740.6. Samples: 264784620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 17:48:45,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:48:48,065][71000] Updated weights for policy 0, policy_version 44924 (0.0028) [2024-06-12 17:48:50,940][70768] Fps is (10 sec: 42598.2, 60 sec: 48332.7, 300 sec: 48152.3). Total num frames: 736116736. Throughput: 0: 48367.5. Samples: 264919280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 17:48:50,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:48:51,964][71000] Updated weights for policy 0, policy_version 44934 (0.0033) [2024-06-12 17:48:54,950][71000] Updated weights for policy 0, policy_version 44944 (0.0037) [2024-06-12 17:48:55,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48605.9, 300 sec: 48207.9). Total num frames: 736378880. Throughput: 0: 48577.6. Samples: 265214540. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 17:48:55,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 17:48:58,911][71000] Updated weights for policy 0, policy_version 44954 (0.0052) [2024-06-12 17:49:00,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48332.9, 300 sec: 48152.3). Total num frames: 736624640. Throughput: 0: 48476.5. Samples: 265495700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 17:49:00,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:49:01,859][71000] Updated weights for policy 0, policy_version 44964 (0.0033) [2024-06-12 17:49:05,699][71000] Updated weights for policy 0, policy_version 44974 (0.0034) [2024-06-12 17:49:05,939][70768] Fps is (10 sec: 47514.1, 60 sec: 47786.7, 300 sec: 48152.3). Total num frames: 736854016. Throughput: 0: 48338.0. Samples: 265639420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 17:49:05,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:49:08,706][71000] Updated weights for policy 0, policy_version 44984 (0.0037) [2024-06-12 17:49:10,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48332.7, 300 sec: 48207.8). Total num frames: 737099776. Throughput: 0: 48105.4. Samples: 265925700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:49:10,940][70768] Avg episode reward: [(0, '0.223')] [2024-06-12 17:49:12,615][71000] Updated weights for policy 0, policy_version 44994 (0.0028) [2024-06-12 17:49:15,521][71000] Updated weights for policy 0, policy_version 45004 (0.0032) [2024-06-12 17:49:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48332.8, 300 sec: 48263.4). Total num frames: 737345536. Throughput: 0: 48071.1. Samples: 266214100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:49:15,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 17:49:19,427][71000] Updated weights for policy 0, policy_version 45014 (0.0035) [2024-06-12 17:49:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48059.8, 300 sec: 48207.8). 
Total num frames: 737591296. Throughput: 0: 48295.8. Samples: 266363360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:49:20,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:49:22,261][71000] Updated weights for policy 0, policy_version 45024 (0.0026) [2024-06-12 17:49:25,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48059.9, 300 sec: 48263.4). Total num frames: 737820672. Throughput: 0: 48268.5. Samples: 266657520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:49:25,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:49:26,014][71000] Updated weights for policy 0, policy_version 45034 (0.0032) [2024-06-12 17:49:29,255][71000] Updated weights for policy 0, policy_version 45044 (0.0029) [2024-06-12 17:49:30,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48059.7, 300 sec: 48207.8). Total num frames: 738050048. Throughput: 0: 47843.9. Samples: 266937600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:49:30,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:49:32,953][71000] Updated weights for policy 0, policy_version 45054 (0.0031) [2024-06-12 17:49:35,922][71000] Updated weights for policy 0, policy_version 45064 (0.0024) [2024-06-12 17:49:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.8, 300 sec: 48318.9). Total num frames: 738328576. Throughput: 0: 48067.2. Samples: 267082300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 17:49:35,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:49:39,607][71000] Updated weights for policy 0, policy_version 45074 (0.0030) [2024-06-12 17:49:40,939][70768] Fps is (10 sec: 49152.6, 60 sec: 47513.7, 300 sec: 48207.9). Total num frames: 738541568. Throughput: 0: 47930.8. Samples: 267371420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 17:49:40,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:49:41,087][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000045078_738557952.pth... [2024-06-12 17:49:41,128][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000044370_726958080.pth [2024-06-12 17:49:42,922][71000] Updated weights for policy 0, policy_version 45084 (0.0029) [2024-06-12 17:49:45,940][70768] Fps is (10 sec: 45875.1, 60 sec: 47786.6, 300 sec: 48263.4). Total num frames: 738787328. Throughput: 0: 48184.9. Samples: 267664020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 17:49:45,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:49:46,449][71000] Updated weights for policy 0, policy_version 45094 (0.0035) [2024-06-12 17:49:49,709][71000] Updated weights for policy 0, policy_version 45104 (0.0034) [2024-06-12 17:49:50,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.8, 300 sec: 48207.8). Total num frames: 739033088. Throughput: 0: 48169.2. Samples: 267807040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 17:49:50,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:49:52,967][71000] Updated weights for policy 0, policy_version 45114 (0.0041) [2024-06-12 17:49:55,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48332.8, 300 sec: 48263.4). Total num frames: 739278848. Throughput: 0: 48315.1. Samples: 268099880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 17:49:55,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:49:56,507][71000] Updated weights for policy 0, policy_version 45124 (0.0031) [2024-06-12 17:49:59,978][71000] Updated weights for policy 0, policy_version 45134 (0.0041) [2024-06-12 17:50:00,939][70768] Fps is (10 sec: 45875.8, 60 sec: 47786.7, 300 sec: 48207.9). Total num frames: 739491840. Throughput: 0: 48168.5. Samples: 268381680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 17:50:00,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:50:03,244][71000] Updated weights for policy 0, policy_version 45144 (0.0030) [2024-06-12 17:50:05,062][70980] Signal inference workers to stop experience collection... (3950 times) [2024-06-12 17:50:05,063][70980] Signal inference workers to resume experience collection... (3950 times) [2024-06-12 17:50:05,075][71000] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-06-12 17:50:05,075][71000] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-06-12 17:50:05,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48332.7, 300 sec: 48207.9). Total num frames: 739753984. Throughput: 0: 48158.7. Samples: 268530500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 17:50:05,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:50:06,596][71000] Updated weights for policy 0, policy_version 45154 (0.0035) [2024-06-12 17:50:10,208][71000] Updated weights for policy 0, policy_version 45164 (0.0032) [2024-06-12 17:50:10,940][70768] Fps is (10 sec: 50789.2, 60 sec: 48332.7, 300 sec: 48263.4). Total num frames: 739999744. Throughput: 0: 48081.1. Samples: 268821180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 17:50:10,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:50:13,202][71000] Updated weights for policy 0, policy_version 45174 (0.0038) [2024-06-12 17:50:15,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48059.8, 300 sec: 48207.9). Total num frames: 740229120. Throughput: 0: 48244.1. Samples: 269108580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 17:50:15,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:50:16,933][71000] Updated weights for policy 0, policy_version 45184 (0.0026) [2024-06-12 17:50:19,974][71000] Updated weights for policy 0, policy_version 45194 (0.0024) [2024-06-12 17:50:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48059.6, 300 sec: 48263.3). Total num frames: 740474880. Throughput: 0: 48322.0. Samples: 269256800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 17:50:20,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 17:50:23,847][71000] Updated weights for policy 0, policy_version 45204 (0.0029) [2024-06-12 17:50:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48332.7, 300 sec: 48263.4). Total num frames: 740720640. Throughput: 0: 48298.1. Samples: 269544840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 17:50:25,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:50:26,897][71000] Updated weights for policy 0, policy_version 45214 (0.0034) [2024-06-12 17:50:30,692][71000] Updated weights for policy 0, policy_version 45224 (0.0037) [2024-06-12 17:50:30,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48332.7, 300 sec: 48207.8). Total num frames: 740950016. Throughput: 0: 48233.6. Samples: 269834540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 17:50:30,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:50:33,689][71000] Updated weights for policy 0, policy_version 45234 (0.0044) [2024-06-12 17:50:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 47786.7, 300 sec: 48264.0). Total num frames: 741195776. Throughput: 0: 48024.5. Samples: 269968140. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 17:50:35,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:50:37,679][71000] Updated weights for policy 0, policy_version 45244 (0.0025) [2024-06-12 17:50:40,422][71000] Updated weights for policy 0, policy_version 45254 (0.0030) [2024-06-12 17:50:40,940][70768] Fps is (10 sec: 50791.2, 60 sec: 48605.8, 300 sec: 48263.4). Total num frames: 741457920. Throughput: 0: 48046.8. Samples: 270261980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 17:50:40,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:50:44,273][71000] Updated weights for policy 0, policy_version 45264 (0.0027) [2024-06-12 17:50:45,939][70768] Fps is (10 sec: 49152.6, 60 sec: 48332.9, 300 sec: 48263.4). Total num frames: 741687296. Throughput: 0: 48184.1. Samples: 270549960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 17:50:45,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:50:47,337][71000] Updated weights for policy 0, policy_version 45274 (0.0029) [2024-06-12 17:50:50,939][70768] Fps is (10 sec: 44236.9, 60 sec: 47786.7, 300 sec: 48152.3). Total num frames: 741900288. Throughput: 0: 47863.6. Samples: 270684360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 17:50:50,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:50:51,400][71000] Updated weights for policy 0, policy_version 45284 (0.0029) [2024-06-12 17:50:54,225][71000] Updated weights for policy 0, policy_version 45294 (0.0040) [2024-06-12 17:50:55,940][70768] Fps is (10 sec: 45871.9, 60 sec: 47786.3, 300 sec: 48207.7). Total num frames: 742146048. Throughput: 0: 47746.2. Samples: 270969780. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 17:50:55,941][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:50:58,621][71000] Updated weights for policy 0, policy_version 45304 (0.0035) [2024-06-12 17:51:00,873][71000] Updated weights for policy 0, policy_version 45314 (0.0036) [2024-06-12 17:51:00,940][70768] Fps is (10 sec: 52428.1, 60 sec: 48878.8, 300 sec: 48318.9). Total num frames: 742424576. Throughput: 0: 47838.1. Samples: 271261300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 17:51:00,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 17:51:05,231][71000] Updated weights for policy 0, policy_version 45324 (0.0027) [2024-06-12 17:51:05,940][70768] Fps is (10 sec: 45877.8, 60 sec: 47513.6, 300 sec: 48096.7). Total num frames: 742604800. Throughput: 0: 47792.6. Samples: 271407460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 17:51:05,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:51:07,712][71000] Updated weights for policy 0, policy_version 45334 (0.0033) [2024-06-12 17:51:10,940][70768] Fps is (10 sec: 42598.6, 60 sec: 47513.7, 300 sec: 48096.7). Total num frames: 742850560. Throughput: 0: 47748.0. Samples: 271693500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 17:51:10,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 17:51:12,056][71000] Updated weights for policy 0, policy_version 45344 (0.0025) [2024-06-12 17:51:14,651][71000] Updated weights for policy 0, policy_version 45354 (0.0027) [2024-06-12 17:51:15,939][70768] Fps is (10 sec: 52429.4, 60 sec: 48332.8, 300 sec: 48263.4). Total num frames: 743129088. Throughput: 0: 47545.6. Samples: 271974080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 17:51:15,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:51:18,436][70980] Signal inference workers to stop experience collection... 
(4000 times) [2024-06-12 17:51:18,436][70980] Signal inference workers to resume experience collection... (4000 times) [2024-06-12 17:51:18,461][71000] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-06-12 17:51:18,461][71000] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-06-12 17:51:19,067][71000] Updated weights for policy 0, policy_version 45364 (0.0039) [2024-06-12 17:51:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 47786.8, 300 sec: 48096.8). Total num frames: 743342080. Throughput: 0: 47969.3. Samples: 272126760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 17:51:20,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:51:21,711][71000] Updated weights for policy 0, policy_version 45374 (0.0034) [2024-06-12 17:51:25,939][70768] Fps is (10 sec: 42598.4, 60 sec: 47240.6, 300 sec: 48096.8). Total num frames: 743555072. Throughput: 0: 47561.0. Samples: 272402220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 17:51:25,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:51:26,005][71000] Updated weights for policy 0, policy_version 45384 (0.0031) [2024-06-12 17:51:28,273][71000] Updated weights for policy 0, policy_version 45394 (0.0029) [2024-06-12 17:51:30,940][70768] Fps is (10 sec: 47513.7, 60 sec: 47786.8, 300 sec: 48207.8). Total num frames: 743817216. Throughput: 0: 47581.2. Samples: 272691120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 17:51:30,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:51:32,849][71000] Updated weights for policy 0, policy_version 45404 (0.0033) [2024-06-12 17:51:35,239][71000] Updated weights for policy 0, policy_version 45414 (0.0030) [2024-06-12 17:51:35,940][70768] Fps is (10 sec: 52427.6, 60 sec: 48059.6, 300 sec: 48207.8). Total num frames: 744079360. Throughput: 0: 47987.4. Samples: 272843800. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 17:51:35,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:51:39,617][71000] Updated weights for policy 0, policy_version 45424 (0.0031) [2024-06-12 17:51:40,940][70768] Fps is (10 sec: 49152.0, 60 sec: 47513.6, 300 sec: 48152.3). Total num frames: 744308736. Throughput: 0: 48162.9. Samples: 273137080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 17:51:40,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:51:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000045429_744308736.pth... [2024-06-12 17:51:41,011][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000044724_732758016.pth [2024-06-12 17:51:41,967][71000] Updated weights for policy 0, policy_version 45434 (0.0029) [2024-06-12 17:51:45,940][70768] Fps is (10 sec: 44237.5, 60 sec: 47240.4, 300 sec: 48041.2). Total num frames: 744521728. Throughput: 0: 48049.5. Samples: 273423520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 17:51:45,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:51:46,328][71000] Updated weights for policy 0, policy_version 45444 (0.0030) [2024-06-12 17:51:48,934][71000] Updated weights for policy 0, policy_version 45454 (0.0030) [2024-06-12 17:51:50,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48059.8, 300 sec: 48207.8). Total num frames: 744783872. Throughput: 0: 47811.2. Samples: 273558960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 17:51:50,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:51:53,018][71000] Updated weights for policy 0, policy_version 45464 (0.0027) [2024-06-12 17:51:55,450][71000] Updated weights for policy 0, policy_version 45474 (0.0039) [2024-06-12 17:51:55,940][70768] Fps is (10 sec: 54066.6, 60 sec: 48606.3, 300 sec: 48207.8). Total num frames: 745062400. Throughput: 0: 48073.7. Samples: 273856820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 17:51:55,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:51:59,930][71000] Updated weights for policy 0, policy_version 45484 (0.0030) [2024-06-12 17:52:00,940][70768] Fps is (10 sec: 49151.0, 60 sec: 47513.6, 300 sec: 48096.8). Total num frames: 745275392. Throughput: 0: 48412.2. Samples: 274152640. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-12 17:52:00,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:52:02,293][71000] Updated weights for policy 0, policy_version 45494 (0.0034) [2024-06-12 17:52:05,939][70768] Fps is (10 sec: 42599.0, 60 sec: 48059.8, 300 sec: 48096.8). Total num frames: 745488384. Throughput: 0: 48011.2. Samples: 274287260. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-12 17:52:05,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:52:06,538][71000] Updated weights for policy 0, policy_version 45504 (0.0035) [2024-06-12 17:52:09,341][71000] Updated weights for policy 0, policy_version 45514 (0.0043) [2024-06-12 17:52:10,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48605.9, 300 sec: 48263.4). Total num frames: 745766912. Throughput: 0: 48452.3. Samples: 274582580. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-12 17:52:10,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:52:13,314][71000] Updated weights for policy 0, policy_version 45524 (0.0033) [2024-06-12 17:52:15,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48059.7, 300 sec: 48152.3). Total num frames: 746012672. Throughput: 0: 48352.4. Samples: 274866980. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-12 17:52:15,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:52:16,098][71000] Updated weights for policy 0, policy_version 45534 (0.0031) [2024-06-12 17:52:20,114][71000] Updated weights for policy 0, policy_version 45544 (0.0031) [2024-06-12 17:52:20,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48332.8, 300 sec: 48096.7). 
Total num frames: 746242048. Throughput: 0: 48439.7. Samples: 275023580. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-12 17:52:20,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:52:22,632][71000] Updated weights for policy 0, policy_version 45554 (0.0032) [2024-06-12 17:52:25,939][70768] Fps is (10 sec: 44237.0, 60 sec: 48332.8, 300 sec: 48096.8). Total num frames: 746455040. Throughput: 0: 48286.3. Samples: 275309960. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-12 17:52:25,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:52:26,727][71000] Updated weights for policy 0, policy_version 45564 (0.0027) [2024-06-12 17:52:29,523][71000] Updated weights for policy 0, policy_version 45574 (0.0032) [2024-06-12 17:52:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.8, 300 sec: 48318.9). Total num frames: 746733568. Throughput: 0: 48267.5. Samples: 275595560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-12 17:52:30,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:52:33,455][71000] Updated weights for policy 0, policy_version 45584 (0.0031) [2024-06-12 17:52:35,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48332.9, 300 sec: 48207.9). Total num frames: 746979328. Throughput: 0: 48729.3. Samples: 275751780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-12 17:52:35,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:52:36,306][71000] Updated weights for policy 0, policy_version 45594 (0.0031) [2024-06-12 17:52:40,381][71000] Updated weights for policy 0, policy_version 45604 (0.0026) [2024-06-12 17:52:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48332.7, 300 sec: 48096.7). Total num frames: 747208704. Throughput: 0: 48487.1. Samples: 276038740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-12 17:52:40,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:52:41,732][70980] Signal inference workers to stop experience collection... 
(4050 times) [2024-06-12 17:52:41,732][70980] Signal inference workers to resume experience collection... (4050 times) [2024-06-12 17:52:41,750][71000] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-06-12 17:52:41,750][71000] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-06-12 17:52:42,982][71000] Updated weights for policy 0, policy_version 45614 (0.0026) [2024-06-12 17:52:45,939][70768] Fps is (10 sec: 44236.9, 60 sec: 48332.8, 300 sec: 48152.3). Total num frames: 747421696. Throughput: 0: 48446.4. Samples: 276332720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-12 17:52:45,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:52:47,329][71000] Updated weights for policy 0, policy_version 45624 (0.0035) [2024-06-12 17:52:50,006][71000] Updated weights for policy 0, policy_version 45634 (0.0034) [2024-06-12 17:52:50,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48878.8, 300 sec: 48318.9). Total num frames: 747716608. Throughput: 0: 48454.6. Samples: 276467720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-12 17:52:50,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:52:53,953][71000] Updated weights for policy 0, policy_version 45644 (0.0035) [2024-06-12 17:52:55,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48059.8, 300 sec: 48207.9). Total num frames: 747945984. Throughput: 0: 48411.6. Samples: 276761100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-12 17:52:55,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 17:52:56,821][71000] Updated weights for policy 0, policy_version 45654 (0.0025) [2024-06-12 17:53:00,550][71000] Updated weights for policy 0, policy_version 45664 (0.0026) [2024-06-12 17:53:00,940][70768] Fps is (10 sec: 44236.9, 60 sec: 48059.8, 300 sec: 48041.2). Total num frames: 748158976. Throughput: 0: 48573.3. Samples: 277052780. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 17:53:00,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:53:03,433][71000] Updated weights for policy 0, policy_version 45674 (0.0021) [2024-06-12 17:53:05,940][70768] Fps is (10 sec: 44236.8, 60 sec: 48332.7, 300 sec: 48096.7). Total num frames: 748388352. Throughput: 0: 48123.5. Samples: 277189140. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 17:53:05,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:53:07,523][71000] Updated weights for policy 0, policy_version 45684 (0.0032) [2024-06-12 17:53:10,390][71000] Updated weights for policy 0, policy_version 45694 (0.0039) [2024-06-12 17:53:10,940][70768] Fps is (10 sec: 50789.5, 60 sec: 48332.7, 300 sec: 48207.8). Total num frames: 748666880. Throughput: 0: 48058.4. Samples: 277472600. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 17:53:10,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:53:14,455][71000] Updated weights for policy 0, policy_version 45704 (0.0034) [2024-06-12 17:53:15,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48059.6, 300 sec: 48096.8). Total num frames: 748896256. Throughput: 0: 48187.0. Samples: 277763980. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 17:53:15,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:53:17,139][71000] Updated weights for policy 0, policy_version 45714 (0.0029) [2024-06-12 17:53:20,939][70768] Fps is (10 sec: 44238.1, 60 sec: 47786.7, 300 sec: 48041.3). Total num frames: 749109248. Throughput: 0: 47687.2. Samples: 277897700. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 17:53:20,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 17:53:21,151][71000] Updated weights for policy 0, policy_version 45724 (0.0028) [2024-06-12 17:53:24,131][71000] Updated weights for policy 0, policy_version 45734 (0.0037) [2024-06-12 17:53:25,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48878.9, 300 sec: 48207.8). 
Total num frames: 749387776. Throughput: 0: 47898.8. Samples: 278194180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:53:25,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:53:27,880][71000] Updated weights for policy 0, policy_version 45744 (0.0026) [2024-06-12 17:53:30,797][71000] Updated weights for policy 0, policy_version 45754 (0.0033) [2024-06-12 17:53:30,939][70768] Fps is (10 sec: 52428.7, 60 sec: 48332.9, 300 sec: 48207.8). Total num frames: 749633536. Throughput: 0: 47780.9. Samples: 278482860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:53:30,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:53:34,735][71000] Updated weights for policy 0, policy_version 45764 (0.0031) [2024-06-12 17:53:35,940][70768] Fps is (10 sec: 45875.0, 60 sec: 47786.6, 300 sec: 47985.7). Total num frames: 749846528. Throughput: 0: 48041.8. Samples: 278629600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:53:35,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:53:37,630][71000] Updated weights for policy 0, policy_version 45774 (0.0030) [2024-06-12 17:53:40,940][70768] Fps is (10 sec: 42598.2, 60 sec: 47513.7, 300 sec: 47930.1). Total num frames: 750059520. Throughput: 0: 47812.5. Samples: 278912660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:53:40,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:53:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000045780_750059520.pth... [2024-06-12 17:53:41,000][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000045078_738557952.pth [2024-06-12 17:53:41,723][71000] Updated weights for policy 0, policy_version 45784 (0.0032) [2024-06-12 17:53:44,499][70980] Signal inference workers to stop experience collection... (4100 times) [2024-06-12 17:53:44,499][70980] Signal inference workers to resume experience collection... 
(4100 times) [2024-06-12 17:53:44,517][71000] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-06-12 17:53:44,517][71000] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-06-12 17:53:44,637][71000] Updated weights for policy 0, policy_version 45794 (0.0036) [2024-06-12 17:53:45,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 48207.8). Total num frames: 750338048. Throughput: 0: 47469.8. Samples: 279188920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:53:45,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:53:48,354][71000] Updated weights for policy 0, policy_version 45804 (0.0027) [2024-06-12 17:53:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 47513.6, 300 sec: 48096.8). Total num frames: 750567424. Throughput: 0: 47880.9. Samples: 279343780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 17:53:50,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 17:53:51,511][71000] Updated weights for policy 0, policy_version 45814 (0.0030) [2024-06-12 17:53:55,093][71000] Updated weights for policy 0, policy_version 45824 (0.0032) [2024-06-12 17:53:55,940][70768] Fps is (10 sec: 45874.9, 60 sec: 47513.6, 300 sec: 48041.2). Total num frames: 750796800. Throughput: 0: 48085.1. Samples: 279636420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 17:53:55,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:53:58,186][71000] Updated weights for policy 0, policy_version 45834 (0.0033) [2024-06-12 17:54:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48059.7, 300 sec: 48096.8). Total num frames: 751042560. Throughput: 0: 47943.3. Samples: 279921420. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 17:54:00,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 17:54:02,000][71000] Updated weights for policy 0, policy_version 45844 (0.0026) [2024-06-12 17:54:04,954][71000] Updated weights for policy 0, policy_version 45854 (0.0030) [2024-06-12 17:54:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48605.8, 300 sec: 48152.3). Total num frames: 751304704. Throughput: 0: 48163.8. Samples: 280065080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 17:54:05,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:54:08,771][71000] Updated weights for policy 0, policy_version 45864 (0.0033) [2024-06-12 17:54:10,940][70768] Fps is (10 sec: 49151.6, 60 sec: 47786.8, 300 sec: 48096.7). Total num frames: 751534080. Throughput: 0: 48026.1. Samples: 280355360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 17:54:10,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:54:11,862][71000] Updated weights for policy 0, policy_version 45874 (0.0029) [2024-06-12 17:54:15,408][71000] Updated weights for policy 0, policy_version 45884 (0.0035) [2024-06-12 17:54:15,939][70768] Fps is (10 sec: 45875.7, 60 sec: 47786.8, 300 sec: 48041.2). Total num frames: 751763456. Throughput: 0: 48147.5. Samples: 280649500. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 17:54:15,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:54:18,750][71000] Updated weights for policy 0, policy_version 45894 (0.0029) [2024-06-12 17:54:20,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48332.7, 300 sec: 48096.7). Total num frames: 752009216. Throughput: 0: 48078.2. Samples: 280793120. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 17:54:20,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 17:54:22,330][71000] Updated weights for policy 0, policy_version 45904 (0.0041) [2024-06-12 17:54:25,537][71000] Updated weights for policy 0, policy_version 45914 (0.0031) [2024-06-12 17:54:25,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48059.7, 300 sec: 48207.8). Total num frames: 752271360. Throughput: 0: 48237.7. Samples: 281083360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:54:25,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:54:29,147][71000] Updated weights for policy 0, policy_version 45924 (0.0028) [2024-06-12 17:54:30,940][70768] Fps is (10 sec: 47513.7, 60 sec: 47513.5, 300 sec: 47985.7). Total num frames: 752484352. Throughput: 0: 48315.5. Samples: 281363120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:54:30,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:54:32,324][71000] Updated weights for policy 0, policy_version 45934 (0.0032) [2024-06-12 17:54:35,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48059.8, 300 sec: 48096.7). Total num frames: 752730112. Throughput: 0: 48308.5. Samples: 281517660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:54:35,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:54:36,024][71000] Updated weights for policy 0, policy_version 45944 (0.0039) [2024-06-12 17:54:39,210][71000] Updated weights for policy 0, policy_version 45954 (0.0025) [2024-06-12 17:54:40,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48605.9, 300 sec: 48096.8). Total num frames: 752975872. Throughput: 0: 48074.8. Samples: 281799780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:54:40,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:54:42,670][71000] Updated weights for policy 0, policy_version 45964 (0.0031) [2024-06-12 17:54:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48059.7, 300 sec: 48096.8). 
Total num frames: 753221632. Throughput: 0: 48203.1. Samples: 282090560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:54:45,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:54:46,072][71000] Updated weights for policy 0, policy_version 45974 (0.0035) [2024-06-12 17:54:49,379][71000] Updated weights for policy 0, policy_version 45984 (0.0033) [2024-06-12 17:54:50,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48059.7, 300 sec: 48041.2). Total num frames: 753451008. Throughput: 0: 48168.8. Samples: 282232680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 17:54:50,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:54:52,799][71000] Updated weights for policy 0, policy_version 45994 (0.0028) [2024-06-12 17:54:55,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48332.9, 300 sec: 48152.3). Total num frames: 753696768. Throughput: 0: 48245.9. Samples: 282526420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 17:54:55,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 17:54:56,314][71000] Updated weights for policy 0, policy_version 46004 (0.0026) [2024-06-12 17:54:58,950][70980] Signal inference workers to stop experience collection... (4150 times) [2024-06-12 17:54:58,950][70980] Signal inference workers to resume experience collection... (4150 times) [2024-06-12 17:54:58,986][71000] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-06-12 17:54:58,986][71000] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-06-12 17:54:59,405][71000] Updated weights for policy 0, policy_version 46014 (0.0029) [2024-06-12 17:55:00,939][70768] Fps is (10 sec: 49152.7, 60 sec: 48332.8, 300 sec: 48096.8). Total num frames: 753942528. Throughput: 0: 48095.5. Samples: 282813800. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 17:55:00,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:55:02,882][71000] Updated weights for policy 0, policy_version 46024 (0.0038) [2024-06-12 17:55:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48059.8, 300 sec: 48096.8). Total num frames: 754188288. Throughput: 0: 48104.1. Samples: 282957800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 17:55:05,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 17:55:06,468][71000] Updated weights for policy 0, policy_version 46034 (0.0024) [2024-06-12 17:55:09,700][71000] Updated weights for policy 0, policy_version 46044 (0.0032) [2024-06-12 17:55:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48059.8, 300 sec: 48096.8). Total num frames: 754417664. Throughput: 0: 48152.5. Samples: 283250220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 17:55:10,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:55:13,487][71000] Updated weights for policy 0, policy_version 46054 (0.0029) [2024-06-12 17:55:15,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48332.8, 300 sec: 48096.8). Total num frames: 754663424. Throughput: 0: 48056.6. Samples: 283525660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 17:55:15,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:55:16,710][71000] Updated weights for policy 0, policy_version 46064 (0.0028) [2024-06-12 17:55:20,319][71000] Updated weights for policy 0, policy_version 46074 (0.0022) [2024-06-12 17:55:20,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48059.7, 300 sec: 48041.2). Total num frames: 754892800. Throughput: 0: 47954.6. Samples: 283675620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 17:55:20,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:55:23,302][71000] Updated weights for policy 0, policy_version 46084 (0.0029) [2024-06-12 17:55:25,940][70768] Fps is (10 sec: 45874.8, 60 sec: 47513.6, 300 sec: 48041.2). 
Total num frames: 755122176. Throughput: 0: 48195.5. Samples: 283968580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:55:25,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 17:55:26,998][71000] Updated weights for policy 0, policy_version 46094 (0.0031) [2024-06-12 17:55:30,227][71000] Updated weights for policy 0, policy_version 46104 (0.0031) [2024-06-12 17:55:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48332.7, 300 sec: 48096.7). Total num frames: 755384320. Throughput: 0: 47870.5. Samples: 284244740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:55:30,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:55:34,349][71000] Updated weights for policy 0, policy_version 46114 (0.0021) [2024-06-12 17:55:35,939][70768] Fps is (10 sec: 50790.8, 60 sec: 48332.9, 300 sec: 48041.2). Total num frames: 755630080. Throughput: 0: 48101.5. Samples: 284397240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:55:35,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 17:55:36,895][71000] Updated weights for policy 0, policy_version 46124 (0.0026) [2024-06-12 17:55:40,940][70768] Fps is (10 sec: 45876.2, 60 sec: 47786.7, 300 sec: 47985.7). Total num frames: 755843072. Throughput: 0: 48016.0. Samples: 284687140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:55:40,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:55:41,026][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000046134_755859456.pth... [2024-06-12 17:55:41,035][71000] Updated weights for policy 0, policy_version 46134 (0.0029) [2024-06-12 17:55:41,068][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000045429_744308736.pth [2024-06-12 17:55:44,012][71000] Updated weights for policy 0, policy_version 46144 (0.0036) [2024-06-12 17:55:45,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48059.8, 300 sec: 48152.3). Total num frames: 756105216. Throughput: 0: 48028.0. 
Samples: 284975060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 17:55:45,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:55:47,576][71000] Updated weights for policy 0, policy_version 46154 (0.0033) [2024-06-12 17:55:50,525][71000] Updated weights for policy 0, policy_version 46164 (0.0035) [2024-06-12 17:55:50,940][70768] Fps is (10 sec: 52428.1, 60 sec: 48605.9, 300 sec: 48207.9). Total num frames: 756367360. Throughput: 0: 48169.7. Samples: 285125440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 17:55:50,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 17:55:54,396][71000] Updated weights for policy 0, policy_version 46174 (0.0028) [2024-06-12 17:55:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48332.7, 300 sec: 48041.2). Total num frames: 756596736. Throughput: 0: 48103.5. Samples: 285414880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 17:55:55,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:55:57,202][71000] Updated weights for policy 0, policy_version 46184 (0.0035) [2024-06-12 17:56:00,940][70768] Fps is (10 sec: 44237.2, 60 sec: 47786.6, 300 sec: 48152.3). Total num frames: 756809728. Throughput: 0: 48544.3. Samples: 285710160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 17:56:00,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:56:01,271][71000] Updated weights for policy 0, policy_version 46194 (0.0033) [2024-06-12 17:56:02,535][70980] Signal inference workers to stop experience collection... (4200 times) [2024-06-12 17:56:02,588][71000] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-06-12 17:56:02,588][70980] Signal inference workers to resume experience collection... 
(4200 times) [2024-06-12 17:56:02,600][71000] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-06-12 17:56:04,110][71000] Updated weights for policy 0, policy_version 46204 (0.0028) [2024-06-12 17:56:05,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48059.7, 300 sec: 48207.8). Total num frames: 757071872. Throughput: 0: 48233.9. Samples: 285846140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 17:56:05,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:56:07,974][71000] Updated weights for policy 0, policy_version 46214 (0.0032) [2024-06-12 17:56:10,849][71000] Updated weights for policy 0, policy_version 46224 (0.0038) [2024-06-12 17:56:10,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48605.8, 300 sec: 48152.3). Total num frames: 757334016. Throughput: 0: 48215.1. Samples: 286138260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 17:56:10,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 17:56:14,814][71000] Updated weights for policy 0, policy_version 46234 (0.0028) [2024-06-12 17:56:15,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48059.7, 300 sec: 48152.3). Total num frames: 757547008. Throughput: 0: 48483.0. Samples: 286426460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 17:56:15,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:56:17,605][71000] Updated weights for policy 0, policy_version 46244 (0.0030) [2024-06-12 17:56:20,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48059.7, 300 sec: 48207.8). Total num frames: 757776384. Throughput: 0: 48250.0. Samples: 286568500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 17:56:20,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 17:56:21,650][71000] Updated weights for policy 0, policy_version 46254 (0.0029) [2024-06-12 17:56:24,335][71000] Updated weights for policy 0, policy_version 46264 (0.0024) [2024-06-12 17:56:25,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48332.8, 300 sec: 48152.3). 
Total num frames: 758022144. Throughput: 0: 48237.8. Samples: 286857840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 17:56:25,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 17:56:28,313][71000] Updated weights for policy 0, policy_version 46274 (0.0032) [2024-06-12 17:56:30,940][70768] Fps is (10 sec: 52429.1, 60 sec: 48606.0, 300 sec: 48207.9). Total num frames: 758300672. Throughput: 0: 48377.7. Samples: 287152060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 17:56:30,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:56:31,039][71000] Updated weights for policy 0, policy_version 46284 (0.0034) [2024-06-12 17:56:35,070][71000] Updated weights for policy 0, policy_version 46294 (0.0024) [2024-06-12 17:56:35,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48332.6, 300 sec: 48207.8). Total num frames: 758530048. Throughput: 0: 48357.7. Samples: 287301540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 17:56:35,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:56:37,953][71000] Updated weights for policy 0, policy_version 46304 (0.0030) [2024-06-12 17:56:40,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48332.7, 300 sec: 48207.8). Total num frames: 758743040. Throughput: 0: 48264.0. Samples: 287586760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 17:56:40,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:56:41,938][71000] Updated weights for policy 0, policy_version 46314 (0.0028) [2024-06-12 17:56:44,724][71000] Updated weights for policy 0, policy_version 46324 (0.0023) [2024-06-12 17:56:45,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48059.6, 300 sec: 48152.3). Total num frames: 758988800. Throughput: 0: 47934.1. Samples: 287867200. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 17:56:45,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 17:56:48,930][71000] Updated weights for policy 0, policy_version 46334 (0.0028) [2024-06-12 17:56:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48059.8, 300 sec: 48096.8). Total num frames: 759250944. Throughput: 0: 48482.2. Samples: 288027840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-12 17:56:50,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:56:51,685][71000] Updated weights for policy 0, policy_version 46344 (0.0034) [2024-06-12 17:56:55,569][71000] Updated weights for policy 0, policy_version 46354 (0.0036) [2024-06-12 17:56:55,939][70768] Fps is (10 sec: 49152.8, 60 sec: 48059.8, 300 sec: 48152.3). Total num frames: 759480320. Throughput: 0: 48165.4. Samples: 288305700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-12 17:56:55,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:56:58,529][71000] Updated weights for policy 0, policy_version 46364 (0.0030) [2024-06-12 17:57:00,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48332.8, 300 sec: 48207.8). Total num frames: 759709696. Throughput: 0: 47967.5. Samples: 288585000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-12 17:57:00,940][70768] Avg episode reward: [(0, '0.219')] [2024-06-12 17:57:02,522][71000] Updated weights for policy 0, policy_version 46374 (0.0026) [2024-06-12 17:57:05,309][71000] Updated weights for policy 0, policy_version 46384 (0.0019) [2024-06-12 17:57:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48059.7, 300 sec: 48096.8). Total num frames: 759955456. Throughput: 0: 48135.7. Samples: 288734600. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-12 17:57:05,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:57:09,521][71000] Updated weights for policy 0, policy_version 46394 (0.0033) [2024-06-12 17:57:10,940][70768] Fps is (10 sec: 49151.1, 60 sec: 47786.6, 300 sec: 48096.7). 
Total num frames: 760201216. Throughput: 0: 48126.9. Samples: 289023560. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-12 17:57:10,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 17:57:12,035][71000] Updated weights for policy 0, policy_version 46404 (0.0020) [2024-06-12 17:57:15,940][70768] Fps is (10 sec: 45874.6, 60 sec: 47786.5, 300 sec: 48041.2). Total num frames: 760414208. Throughput: 0: 47887.0. Samples: 289306980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-12 17:57:15,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:57:16,294][71000] Updated weights for policy 0, policy_version 46414 (0.0034) [2024-06-12 17:57:19,080][71000] Updated weights for policy 0, policy_version 46424 (0.0032) [2024-06-12 17:57:20,940][70768] Fps is (10 sec: 45875.9, 60 sec: 48059.8, 300 sec: 48152.3). Total num frames: 760659968. Throughput: 0: 47598.4. Samples: 289443460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-12 17:57:20,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 17:57:22,972][71000] Updated weights for policy 0, policy_version 46434 (0.0028) [2024-06-12 17:57:24,784][70980] Signal inference workers to stop experience collection... (4250 times) [2024-06-12 17:57:24,784][70980] Signal inference workers to resume experience collection... (4250 times) [2024-06-12 17:57:24,801][71000] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-06-12 17:57:24,801][71000] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-06-12 17:57:25,888][71000] Updated weights for policy 0, policy_version 46444 (0.0030) [2024-06-12 17:57:25,940][70768] Fps is (10 sec: 52429.5, 60 sec: 48605.9, 300 sec: 48152.3). Total num frames: 760938496. Throughput: 0: 47714.7. Samples: 289733920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-12 17:57:25,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 17:57:29,806][71000] Updated weights for policy 0, policy_version 46454 (0.0029) [2024-06-12 17:57:30,940][70768] Fps is (10 sec: 47513.1, 60 sec: 47240.5, 300 sec: 47985.7). Total num frames: 761135104. Throughput: 0: 47968.0. Samples: 290025760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-12 17:57:30,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 17:57:32,774][71000] Updated weights for policy 0, policy_version 46464 (0.0032) [2024-06-12 17:57:35,939][70768] Fps is (10 sec: 42599.0, 60 sec: 47240.8, 300 sec: 47985.7). Total num frames: 761364480. Throughput: 0: 47435.7. Samples: 290162440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-12 17:57:35,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 17:57:36,724][71000] Updated weights for policy 0, policy_version 46474 (0.0029) [2024-06-12 17:57:39,691][71000] Updated weights for policy 0, policy_version 46484 (0.0032) [2024-06-12 17:57:40,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48332.8, 300 sec: 48207.8). Total num frames: 761643008. Throughput: 0: 47691.9. Samples: 290451840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-12 17:57:40,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 17:57:40,956][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000046487_761643008.pth... [2024-06-12 17:57:41,009][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000045780_750059520.pth [2024-06-12 17:57:43,476][71000] Updated weights for policy 0, policy_version 46494 (0.0041) [2024-06-12 17:57:45,939][70768] Fps is (10 sec: 52428.5, 60 sec: 48333.0, 300 sec: 48041.2). Total num frames: 761888768. Throughput: 0: 48011.2. Samples: 290745500. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-12 17:57:45,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 17:57:46,419][71000] Updated weights for policy 0, policy_version 46504 (0.0025) [2024-06-12 17:57:50,231][71000] Updated weights for policy 0, policy_version 46514 (0.0034) [2024-06-12 17:57:50,940][70768] Fps is (10 sec: 47513.7, 60 sec: 47786.7, 300 sec: 48041.2). Total num frames: 762118144. Throughput: 0: 47748.9. Samples: 290883300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-12 17:57:50,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 17:57:53,318][71000] Updated weights for policy 0, policy_version 46524 (0.0029) [2024-06-12 17:57:55,939][70768] Fps is (10 sec: 44236.9, 60 sec: 47513.6, 300 sec: 48041.2). Total num frames: 762331136. Throughput: 0: 47763.8. Samples: 291172920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-12 17:57:55,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:57:56,864][71000] Updated weights for policy 0, policy_version 46534 (0.0025) [2024-06-12 17:58:00,116][71000] Updated weights for policy 0, policy_version 46544 (0.0031) [2024-06-12 17:58:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48059.7, 300 sec: 48152.3). Total num frames: 762593280. Throughput: 0: 47867.7. Samples: 291461020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-12 17:58:00,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 17:58:03,649][71000] Updated weights for policy 0, policy_version 46554 (0.0041) [2024-06-12 17:58:05,939][70768] Fps is (10 sec: 50790.2, 60 sec: 48059.8, 300 sec: 48041.3). Total num frames: 762839040. Throughput: 0: 48204.5. Samples: 291612660. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-12 17:58:05,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 17:58:06,767][71000] Updated weights for policy 0, policy_version 46564 (0.0030) [2024-06-12 17:58:10,361][71000] Updated weights for policy 0, policy_version 46574 (0.0024) [2024-06-12 17:58:10,939][70768] Fps is (10 sec: 47513.9, 60 sec: 47786.8, 300 sec: 48041.3). Total num frames: 763068416. Throughput: 0: 48304.9. Samples: 291907640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-12 17:58:10,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:58:13,578][71000] Updated weights for policy 0, policy_version 46584 (0.0024) [2024-06-12 17:58:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48332.9, 300 sec: 48152.3). Total num frames: 763314176. Throughput: 0: 48025.8. Samples: 292186920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-12 17:58:15,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:58:17,087][71000] Updated weights for policy 0, policy_version 46594 (0.0029) [2024-06-12 17:58:20,292][70980] Signal inference workers to stop experience collection... (4300 times) [2024-06-12 17:58:20,328][71000] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-06-12 17:58:20,403][70980] Signal inference workers to resume experience collection... (4300 times) [2024-06-12 17:58:20,403][71000] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-06-12 17:58:20,534][71000] Updated weights for policy 0, policy_version 46604 (0.0029) [2024-06-12 17:58:20,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48605.8, 300 sec: 48096.8). Total num frames: 763576320. Throughput: 0: 48270.9. Samples: 292334640. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 17:58:20,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:58:24,090][71000] Updated weights for policy 0, policy_version 46614 (0.0033) [2024-06-12 17:58:25,940][70768] Fps is (10 sec: 45874.5, 60 sec: 47240.4, 300 sec: 47930.1). Total num frames: 763772928. Throughput: 0: 48155.8. Samples: 292618860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 17:58:25,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:58:27,227][71000] Updated weights for policy 0, policy_version 46624 (0.0038) [2024-06-12 17:58:30,762][71000] Updated weights for policy 0, policy_version 46634 (0.0033) [2024-06-12 17:58:30,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.9, 300 sec: 48152.3). Total num frames: 764051456. Throughput: 0: 48079.9. Samples: 292909100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 17:58:30,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 17:58:34,015][71000] Updated weights for policy 0, policy_version 46644 (0.0029) [2024-06-12 17:58:35,939][70768] Fps is (10 sec: 50791.6, 60 sec: 48605.8, 300 sec: 48207.8). Total num frames: 764280832. Throughput: 0: 48366.7. Samples: 293059800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 17:58:35,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 17:58:37,700][71000] Updated weights for policy 0, policy_version 46654 (0.0033) [2024-06-12 17:58:40,787][71000] Updated weights for policy 0, policy_version 46664 (0.0037) [2024-06-12 17:58:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48332.8, 300 sec: 48152.3). Total num frames: 764542976. Throughput: 0: 48399.9. Samples: 293350920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 17:58:40,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 17:58:44,386][71000] Updated weights for policy 0, policy_version 46674 (0.0026) [2024-06-12 17:58:45,940][70768] Fps is (10 sec: 47512.9, 60 sec: 47786.5, 300 sec: 48096.7). 
Total num frames: 764755968. Throughput: 0: 48254.6. Samples: 293632480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 17:58:45,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 17:58:47,958][71000] Updated weights for policy 0, policy_version 46684 (0.0023) [2024-06-12 17:58:50,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48059.7, 300 sec: 48152.3). Total num frames: 765001728. Throughput: 0: 48028.8. Samples: 293773960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 17:58:50,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 17:58:51,497][71000] Updated weights for policy 0, policy_version 46694 (0.0032) [2024-06-12 17:58:54,659][71000] Updated weights for policy 0, policy_version 46704 (0.0035) [2024-06-12 17:58:55,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48605.8, 300 sec: 48152.3). Total num frames: 765247488. Throughput: 0: 47784.4. Samples: 294057940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 17:58:55,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:58:58,258][71000] Updated weights for policy 0, policy_version 46714 (0.0041) [2024-06-12 17:59:00,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48059.8, 300 sec: 48041.2). Total num frames: 765476864. Throughput: 0: 48036.5. Samples: 294348560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 17:59:00,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 17:59:01,731][71000] Updated weights for policy 0, policy_version 46724 (0.0034) [2024-06-12 17:59:05,425][71000] Updated weights for policy 0, policy_version 46734 (0.0032) [2024-06-12 17:59:05,940][70768] Fps is (10 sec: 45875.0, 60 sec: 47786.6, 300 sec: 48041.2). Total num frames: 765706240. Throughput: 0: 47961.4. Samples: 294492900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 17:59:05,940][70768] Avg episode reward: [(0, '0.223')] [2024-06-12 17:59:08,459][71000] Updated weights for policy 0, policy_version 46744 (0.0026) [2024-06-12 17:59:10,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48605.7, 300 sec: 48207.8). Total num frames: 765984768. Throughput: 0: 47905.0. Samples: 294774580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 17:59:10,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 17:59:12,035][71000] Updated weights for policy 0, policy_version 46754 (0.0034) [2024-06-12 17:59:15,598][71000] Updated weights for policy 0, policy_version 46764 (0.0026) [2024-06-12 17:59:15,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48332.8, 300 sec: 48152.3). Total num frames: 766214144. Throughput: 0: 48052.9. Samples: 295071480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:59:15,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:59:19,075][71000] Updated weights for policy 0, policy_version 46774 (0.0028) [2024-06-12 17:59:20,940][70768] Fps is (10 sec: 44237.0, 60 sec: 47513.6, 300 sec: 47985.7). Total num frames: 766427136. Throughput: 0: 47772.8. Samples: 295209580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:59:20,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 17:59:22,154][71000] Updated weights for policy 0, policy_version 46784 (0.0032) [2024-06-12 17:59:25,679][71000] Updated weights for policy 0, policy_version 46794 (0.0024) [2024-06-12 17:59:25,942][70768] Fps is (10 sec: 45862.8, 60 sec: 48330.8, 300 sec: 48096.3). Total num frames: 766672896. Throughput: 0: 47797.1. Samples: 295501920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:59:25,943][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:59:28,826][70980] Signal inference workers to stop experience collection... (4350 times) [2024-06-12 17:59:28,826][70980] Signal inference workers to resume experience collection... 
(4350 times) [2024-06-12 17:59:28,865][71000] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-06-12 17:59:28,865][71000] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-06-12 17:59:28,991][71000] Updated weights for policy 0, policy_version 46804 (0.0023) [2024-06-12 17:59:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48059.7, 300 sec: 48152.3). Total num frames: 766935040. Throughput: 0: 48189.0. Samples: 295800980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:59:30,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 17:59:32,493][71000] Updated weights for policy 0, policy_version 46814 (0.0028) [2024-06-12 17:59:35,651][71000] Updated weights for policy 0, policy_version 46824 (0.0037) [2024-06-12 17:59:35,940][70768] Fps is (10 sec: 49164.7, 60 sec: 48059.6, 300 sec: 48096.7). Total num frames: 767164416. Throughput: 0: 48361.2. Samples: 295950220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:59:35,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 17:59:39,141][71000] Updated weights for policy 0, policy_version 46834 (0.0033) [2024-06-12 17:59:40,939][70768] Fps is (10 sec: 47513.9, 60 sec: 47786.7, 300 sec: 48096.8). Total num frames: 767410176. Throughput: 0: 48369.3. Samples: 296234560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 17:59:40,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 17:59:40,956][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000046840_767426560.pth... [2024-06-12 17:59:41,005][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000046134_755859456.pth [2024-06-12 17:59:42,604][71000] Updated weights for policy 0, policy_version 46844 (0.0041) [2024-06-12 17:59:45,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48059.8, 300 sec: 48096.8). Total num frames: 767639552. Throughput: 0: 48189.3. Samples: 296517080. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-12 17:59:45,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 17:59:46,201][71000] Updated weights for policy 0, policy_version 46854 (0.0031) [2024-06-12 17:59:49,386][71000] Updated weights for policy 0, policy_version 46864 (0.0040) [2024-06-12 17:59:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48332.8, 300 sec: 48152.3). Total num frames: 767901696. Throughput: 0: 48333.8. Samples: 296667920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-12 17:59:50,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 17:59:52,942][71000] Updated weights for policy 0, policy_version 46874 (0.0023) [2024-06-12 17:59:55,940][70768] Fps is (10 sec: 47513.4, 60 sec: 47786.6, 300 sec: 48041.2). Total num frames: 768114688. Throughput: 0: 48407.2. Samples: 296952900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-12 17:59:55,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 17:59:56,478][71000] Updated weights for policy 0, policy_version 46884 (0.0035) [2024-06-12 17:59:59,997][71000] Updated weights for policy 0, policy_version 46894 (0.0029) [2024-06-12 18:00:00,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48059.6, 300 sec: 48041.2). Total num frames: 768360448. Throughput: 0: 48350.9. Samples: 297247280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-12 18:00:00,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:00:03,122][71000] Updated weights for policy 0, policy_version 46904 (0.0030) [2024-06-12 18:00:05,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48332.7, 300 sec: 48096.7). Total num frames: 768606208. Throughput: 0: 48365.2. Samples: 297386020. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-12 18:00:05,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 18:00:06,715][71000] Updated weights for policy 0, policy_version 46914 (0.0030) [2024-06-12 18:00:10,002][71000] Updated weights for policy 0, policy_version 46924 (0.0025) [2024-06-12 18:00:10,940][70768] Fps is (10 sec: 52429.4, 60 sec: 48332.9, 300 sec: 48207.8). Total num frames: 768884736. Throughput: 0: 48379.7. Samples: 297678880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-12 18:00:10,940][70768] Avg episode reward: [(0, '0.219')] [2024-06-12 18:00:13,418][71000] Updated weights for policy 0, policy_version 46934 (0.0031) [2024-06-12 18:00:15,940][70768] Fps is (10 sec: 47514.4, 60 sec: 47786.6, 300 sec: 48096.8). Total num frames: 769081344. Throughput: 0: 48091.6. Samples: 297965100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 18:00:15,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 18:00:16,811][71000] Updated weights for policy 0, policy_version 46944 (0.0029) [2024-06-12 18:00:20,181][71000] Updated weights for policy 0, policy_version 46954 (0.0034) [2024-06-12 18:00:20,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.9, 300 sec: 48207.8). Total num frames: 769343488. Throughput: 0: 47771.2. Samples: 298099920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 18:00:20,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:00:23,615][71000] Updated weights for policy 0, policy_version 46964 (0.0034) [2024-06-12 18:00:25,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48335.0, 300 sec: 48096.8). Total num frames: 769572864. Throughput: 0: 47901.8. Samples: 298390140. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 18:00:25,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 18:00:27,241][71000] Updated weights for policy 0, policy_version 46974 (0.0025) [2024-06-12 18:00:30,279][71000] Updated weights for policy 0, policy_version 46984 (0.0032) [2024-06-12 18:00:30,939][70768] Fps is (10 sec: 45875.9, 60 sec: 47786.8, 300 sec: 48041.2). Total num frames: 769802240. Throughput: 0: 48229.4. Samples: 298687400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 18:00:30,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:00:33,877][71000] Updated weights for policy 0, policy_version 46994 (0.0033) [2024-06-12 18:00:35,944][70768] Fps is (10 sec: 49130.4, 60 sec: 48329.4, 300 sec: 48207.1). Total num frames: 770064384. Throughput: 0: 48145.7. Samples: 298834680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 18:00:35,945][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:00:37,277][71000] Updated weights for policy 0, policy_version 47004 (0.0037) [2024-06-12 18:00:37,286][70980] Signal inference workers to stop experience collection... (4400 times) [2024-06-12 18:00:37,286][70980] Signal inference workers to resume experience collection... (4400 times) [2024-06-12 18:00:37,305][71000] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-06-12 18:00:37,305][71000] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-06-12 18:00:40,344][71000] Updated weights for policy 0, policy_version 47014 (0.0035) [2024-06-12 18:00:40,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48059.8, 300 sec: 48096.8). Total num frames: 770293760. Throughput: 0: 48309.5. Samples: 299126820. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 18:00:40,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 18:00:43,802][71000] Updated weights for policy 0, policy_version 47024 (0.0031) [2024-06-12 18:00:45,940][70768] Fps is (10 sec: 47533.9, 60 sec: 48332.8, 300 sec: 48041.2). Total num frames: 770539520. Throughput: 0: 47982.8. Samples: 299406500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 18:00:45,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 18:00:47,052][71000] Updated weights for policy 0, policy_version 47034 (0.0021) [2024-06-12 18:00:50,711][71000] Updated weights for policy 0, policy_version 47044 (0.0036) [2024-06-12 18:00:50,940][70768] Fps is (10 sec: 47513.1, 60 sec: 47786.7, 300 sec: 48041.2). Total num frames: 770768896. Throughput: 0: 48374.4. Samples: 299562860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 18:00:50,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 18:00:54,082][71000] Updated weights for policy 0, policy_version 47054 (0.0034) [2024-06-12 18:00:55,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.8, 300 sec: 48207.8). Total num frames: 771031040. Throughput: 0: 48319.4. Samples: 299853260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 18:00:55,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:00:57,311][71000] Updated weights for policy 0, policy_version 47064 (0.0027) [2024-06-12 18:01:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48059.8, 300 sec: 48041.2). Total num frames: 771244032. Throughput: 0: 48391.5. Samples: 300142720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 18:01:00,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 18:01:00,942][71000] Updated weights for policy 0, policy_version 47074 (0.0022) [2024-06-12 18:01:04,275][71000] Updated weights for policy 0, policy_version 47084 (0.0027) [2024-06-12 18:01:05,939][70768] Fps is (10 sec: 49152.8, 60 sec: 48606.0, 300 sec: 48096.8). 
Total num frames: 771522560. Throughput: 0: 48635.7. Samples: 300288520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 18:01:05,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 18:01:07,417][71000] Updated weights for policy 0, policy_version 47094 (0.0025) [2024-06-12 18:01:10,939][71000] Updated weights for policy 0, policy_version 47104 (0.0034) [2024-06-12 18:01:10,940][70768] Fps is (10 sec: 50789.8, 60 sec: 47786.6, 300 sec: 48152.3). Total num frames: 771751936. Throughput: 0: 48577.1. Samples: 300576120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 18:01:10,949][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 18:01:14,524][71000] Updated weights for policy 0, policy_version 47114 (0.0030) [2024-06-12 18:01:15,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48332.8, 300 sec: 48152.3). Total num frames: 771981312. Throughput: 0: 48248.3. Samples: 300858580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:01:15,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 18:01:17,825][71000] Updated weights for policy 0, policy_version 47124 (0.0030) [2024-06-12 18:01:20,940][70768] Fps is (10 sec: 45875.1, 60 sec: 47786.6, 300 sec: 48096.7). Total num frames: 772210688. Throughput: 0: 48242.2. Samples: 301005380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:01:20,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:01:21,289][71000] Updated weights for policy 0, policy_version 47134 (0.0032) [2024-06-12 18:01:24,649][71000] Updated weights for policy 0, policy_version 47144 (0.0028) [2024-06-12 18:01:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48332.7, 300 sec: 48041.2). Total num frames: 772472832. Throughput: 0: 48185.6. Samples: 301295180. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:01:25,940][70768] Avg episode reward: [(0, '0.215')] [2024-06-12 18:01:28,107][71000] Updated weights for policy 0, policy_version 47154 (0.0030) [2024-06-12 18:01:30,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48059.6, 300 sec: 47985.7). Total num frames: 772685824. Throughput: 0: 48431.1. Samples: 301585900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:01:30,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:01:31,458][71000] Updated weights for policy 0, policy_version 47164 (0.0026) [2024-06-12 18:01:34,590][71000] Updated weights for policy 0, policy_version 47174 (0.0028) [2024-06-12 18:01:35,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48063.1, 300 sec: 48152.3). Total num frames: 772947968. Throughput: 0: 48118.6. Samples: 301728200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:01:35,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 18:01:37,969][71000] Updated weights for policy 0, policy_version 47184 (0.0032) [2024-06-12 18:01:40,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48332.6, 300 sec: 48152.3). Total num frames: 773193728. Throughput: 0: 48275.6. Samples: 302025660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:01:40,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 18:01:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000047192_773193728.pth... [2024-06-12 18:01:41,009][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000046487_761643008.pth [2024-06-12 18:01:41,290][71000] Updated weights for policy 0, policy_version 47194 (0.0028) [2024-06-12 18:01:44,794][71000] Updated weights for policy 0, policy_version 47204 (0.0029) [2024-06-12 18:01:45,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48332.8, 300 sec: 48096.8). Total num frames: 773439488. Throughput: 0: 48220.8. Samples: 302312660. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 18:01:45,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 18:01:47,986][71000] Updated weights for policy 0, policy_version 47214 (0.0024) [2024-06-12 18:01:50,836][70980] Signal inference workers to stop experience collection... (4450 times) [2024-06-12 18:01:50,836][70980] Signal inference workers to resume experience collection... (4450 times) [2024-06-12 18:01:50,847][71000] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-06-12 18:01:50,848][71000] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-06-12 18:01:50,939][70768] Fps is (10 sec: 47514.4, 60 sec: 48332.9, 300 sec: 48096.8). Total num frames: 773668864. Throughput: 0: 48166.7. Samples: 302456020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 18:01:50,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 18:01:51,414][71000] Updated weights for policy 0, policy_version 47224 (0.0039) [2024-06-12 18:01:54,913][71000] Updated weights for policy 0, policy_version 47234 (0.0035) [2024-06-12 18:01:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48332.8, 300 sec: 48207.8). Total num frames: 773931008. Throughput: 0: 48229.8. Samples: 302746460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 18:01:55,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 18:01:58,194][71000] Updated weights for policy 0, policy_version 47244 (0.0029) [2024-06-12 18:02:00,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48878.9, 300 sec: 48207.8). Total num frames: 774176768. Throughput: 0: 48451.9. Samples: 303038920. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 18:02:00,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 18:02:01,498][71000] Updated weights for policy 0, policy_version 47254 (0.0031) [2024-06-12 18:02:05,150][71000] Updated weights for policy 0, policy_version 47264 (0.0033) [2024-06-12 18:02:05,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48059.7, 300 sec: 48152.3). Total num frames: 774406144. Throughput: 0: 48352.1. Samples: 303181220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 18:02:05,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:02:08,363][71000] Updated weights for policy 0, policy_version 47274 (0.0027) [2024-06-12 18:02:10,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48059.8, 300 sec: 48207.9). Total num frames: 774635520. Throughput: 0: 48411.6. Samples: 303473700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 18:02:10,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 18:02:11,709][71000] Updated weights for policy 0, policy_version 47284 (0.0024) [2024-06-12 18:02:14,843][71000] Updated weights for policy 0, policy_version 47294 (0.0028) [2024-06-12 18:02:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.8, 300 sec: 48263.4). Total num frames: 774897664. Throughput: 0: 48452.4. Samples: 303766260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 18:02:15,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 18:02:18,445][71000] Updated weights for policy 0, policy_version 47304 (0.0032) [2024-06-12 18:02:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48606.0, 300 sec: 48096.7). Total num frames: 775127040. Throughput: 0: 48522.3. Samples: 303911700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 18:02:20,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 18:02:21,732][71000] Updated weights for policy 0, policy_version 47314 (0.0025) [2024-06-12 18:02:25,467][71000] Updated weights for policy 0, policy_version 47324 (0.0030) [2024-06-12 18:02:25,943][70768] Fps is (10 sec: 47497.7, 60 sec: 48330.0, 300 sec: 48262.8). Total num frames: 775372800. Throughput: 0: 48365.7. Samples: 304202280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 18:02:25,943][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 18:02:28,316][71000] Updated weights for policy 0, policy_version 47334 (0.0031) [2024-06-12 18:02:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 48263.4). Total num frames: 775602176. Throughput: 0: 48175.6. Samples: 304480560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 18:02:30,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 18:02:32,364][71000] Updated weights for policy 0, policy_version 47344 (0.0036) [2024-06-12 18:02:35,467][71000] Updated weights for policy 0, policy_version 47354 (0.0029) [2024-06-12 18:02:35,940][70768] Fps is (10 sec: 49169.0, 60 sec: 48606.0, 300 sec: 48207.8). Total num frames: 775864320. Throughput: 0: 48216.4. Samples: 304625760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 18:02:35,940][70768] Avg episode reward: [(0, '0.217')] [2024-06-12 18:02:39,238][71000] Updated weights for policy 0, policy_version 47364 (0.0024) [2024-06-12 18:02:40,940][70768] Fps is (10 sec: 47512.4, 60 sec: 48059.6, 300 sec: 48096.7). Total num frames: 776077312. Throughput: 0: 48194.1. Samples: 304915200. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 18:02:40,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 18:02:42,243][71000] Updated weights for policy 0, policy_version 47374 (0.0027) [2024-06-12 18:02:45,793][71000] Updated weights for policy 0, policy_version 47384 (0.0028) [2024-06-12 18:02:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 48263.4). Total num frames: 776355840. Throughput: 0: 48191.2. Samples: 305207520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 18:02:45,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 18:02:48,813][71000] Updated weights for policy 0, policy_version 47394 (0.0023) [2024-06-12 18:02:50,940][70768] Fps is (10 sec: 49153.1, 60 sec: 48332.7, 300 sec: 48263.4). Total num frames: 776568832. Throughput: 0: 48339.6. Samples: 305356500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 18:02:50,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 18:02:52,655][71000] Updated weights for policy 0, policy_version 47404 (0.0029) [2024-06-12 18:02:54,159][70980] Signal inference workers to stop experience collection... (4500 times) [2024-06-12 18:02:54,183][71000] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-06-12 18:02:54,217][70980] Signal inference workers to resume experience collection... (4500 times) [2024-06-12 18:02:54,217][71000] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-06-12 18:02:55,528][71000] Updated weights for policy 0, policy_version 47414 (0.0028) [2024-06-12 18:02:55,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48332.9, 300 sec: 48263.4). Total num frames: 776830976. Throughput: 0: 48342.7. Samples: 305649120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 18:02:55,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 18:02:59,439][71000] Updated weights for policy 0, policy_version 47424 (0.0026) [2024-06-12 18:03:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48059.8, 300 sec: 48207.8). Total num frames: 777060352. Throughput: 0: 48284.5. Samples: 305939060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 18:03:00,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 18:03:02,517][71000] Updated weights for policy 0, policy_version 47434 (0.0035) [2024-06-12 18:03:05,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48332.9, 300 sec: 48263.4). Total num frames: 777306112. Throughput: 0: 47999.2. Samples: 306071660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 18:03:05,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:03:06,506][71000] Updated weights for policy 0, policy_version 47444 (0.0033) [2024-06-12 18:03:09,336][71000] Updated weights for policy 0, policy_version 47454 (0.0035) [2024-06-12 18:03:10,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48332.8, 300 sec: 48207.9). Total num frames: 777535488. Throughput: 0: 48126.0. Samples: 306367780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 18:03:10,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 18:03:13,040][71000] Updated weights for policy 0, policy_version 47464 (0.0036) [2024-06-12 18:03:15,939][70768] Fps is (10 sec: 47513.5, 60 sec: 48059.8, 300 sec: 48152.3). Total num frames: 777781248. Throughput: 0: 48489.8. Samples: 306662600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 18:03:15,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 18:03:16,088][71000] Updated weights for policy 0, policy_version 47474 (0.0026) [2024-06-12 18:03:19,641][71000] Updated weights for policy 0, policy_version 47484 (0.0031) [2024-06-12 18:03:20,940][70768] Fps is (10 sec: 47512.5, 60 sec: 48059.6, 300 sec: 48263.4). 
Total num frames: 778010624. Throughput: 0: 48424.7. Samples: 306804880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 18:03:20,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 18:03:22,563][71000] Updated weights for policy 0, policy_version 47494 (0.0027) [2024-06-12 18:03:25,944][70768] Fps is (10 sec: 50768.2, 60 sec: 48605.1, 300 sec: 48262.7). Total num frames: 778289152. Throughput: 0: 48432.0. Samples: 307094840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 18:03:25,944][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 18:03:26,862][71000] Updated weights for policy 0, policy_version 47504 (0.0026) [2024-06-12 18:03:29,517][71000] Updated weights for policy 0, policy_version 47514 (0.0030) [2024-06-12 18:03:30,940][70768] Fps is (10 sec: 50791.1, 60 sec: 48605.8, 300 sec: 48263.4). Total num frames: 778518528. Throughput: 0: 48275.5. Samples: 307379920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 18:03:30,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 18:03:33,593][71000] Updated weights for policy 0, policy_version 47524 (0.0035) [2024-06-12 18:03:35,939][70768] Fps is (10 sec: 47534.5, 60 sec: 48332.8, 300 sec: 48207.8). Total num frames: 778764288. Throughput: 0: 48414.3. Samples: 307535140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 18:03:35,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 18:03:36,183][71000] Updated weights for policy 0, policy_version 47534 (0.0028) [2024-06-12 18:03:40,097][71000] Updated weights for policy 0, policy_version 47544 (0.0023) [2024-06-12 18:03:40,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48606.1, 300 sec: 48263.4). Total num frames: 778993664. Throughput: 0: 48164.5. Samples: 307816520. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 18:03:40,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 18:03:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000047546_778993664.pth... [2024-06-12 18:03:40,991][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000046840_767426560.pth [2024-06-12 18:03:43,287][71000] Updated weights for policy 0, policy_version 47554 (0.0025) [2024-06-12 18:03:45,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48332.7, 300 sec: 48318.9). Total num frames: 779255808. Throughput: 0: 48273.6. Samples: 308111380. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-12 18:03:45,940][70768] Avg episode reward: [(0, '0.220')] [2024-06-12 18:03:46,831][71000] Updated weights for policy 0, policy_version 47564 (0.0029) [2024-06-12 18:03:49,860][71000] Updated weights for policy 0, policy_version 47574 (0.0033) [2024-06-12 18:03:50,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.9, 300 sec: 48318.9). Total num frames: 779501568. Throughput: 0: 48645.7. Samples: 308260720. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-12 18:03:50,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 18:03:53,906][71000] Updated weights for policy 0, policy_version 47584 (0.0042) [2024-06-12 18:03:55,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48059.7, 300 sec: 48263.4). Total num frames: 779714560. Throughput: 0: 48525.2. Samples: 308551420. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-12 18:03:55,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 18:03:56,670][71000] Updated weights for policy 0, policy_version 47594 (0.0028) [2024-06-12 18:04:00,438][71000] Updated weights for policy 0, policy_version 47604 (0.0025) [2024-06-12 18:04:00,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48059.7, 300 sec: 48263.4). Total num frames: 779943936. Throughput: 0: 48443.4. Samples: 308842560. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-12 18:04:00,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 18:04:03,285][71000] Updated weights for policy 0, policy_version 47614 (0.0027) [2024-06-12 18:04:05,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48605.8, 300 sec: 48263.4). Total num frames: 780222464. Throughput: 0: 48408.2. Samples: 308983240. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-12 18:04:05,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 18:04:07,323][71000] Updated weights for policy 0, policy_version 47624 (0.0026) [2024-06-12 18:04:10,647][71000] Updated weights for policy 0, policy_version 47634 (0.0038) [2024-06-12 18:04:10,942][70768] Fps is (10 sec: 50779.3, 60 sec: 48604.0, 300 sec: 48263.0). Total num frames: 780451840. Throughput: 0: 48531.1. Samples: 309278640. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-12 18:04:10,942][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 18:04:14,162][71000] Updated weights for policy 0, policy_version 47644 (0.0026) [2024-06-12 18:04:15,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48332.8, 300 sec: 48318.9). Total num frames: 780681216. Throughput: 0: 48637.4. Samples: 309568600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 18:04:15,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 18:04:17,212][71000] Updated weights for policy 0, policy_version 47654 (0.0033) [2024-06-12 18:04:18,485][70980] Signal inference workers to stop experience collection... (4550 times) [2024-06-12 18:04:18,516][71000] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-06-12 18:04:18,545][70980] Signal inference workers to resume experience collection... 
(4550 times) [2024-06-12 18:04:18,545][71000] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-06-12 18:04:20,642][71000] Updated weights for policy 0, policy_version 47664 (0.0025) [2024-06-12 18:04:20,940][70768] Fps is (10 sec: 47523.7, 60 sec: 48605.9, 300 sec: 48319.3). Total num frames: 780926976. Throughput: 0: 48351.8. Samples: 309710980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 18:04:20,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 18:04:23,954][71000] Updated weights for policy 0, policy_version 47674 (0.0032) [2024-06-12 18:04:25,939][70768] Fps is (10 sec: 52428.8, 60 sec: 48609.4, 300 sec: 48374.5). Total num frames: 781205504. Throughput: 0: 48481.3. Samples: 309998180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 18:04:25,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 18:04:27,219][71000] Updated weights for policy 0, policy_version 47684 (0.0030) [2024-06-12 18:04:30,626][71000] Updated weights for policy 0, policy_version 47694 (0.0033) [2024-06-12 18:04:30,939][70768] Fps is (10 sec: 49152.9, 60 sec: 48332.9, 300 sec: 48318.9). Total num frames: 781418496. Throughput: 0: 48411.7. Samples: 310289900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 18:04:30,940][70768] Avg episode reward: [(0, '0.213')] [2024-06-12 18:04:34,411][71000] Updated weights for policy 0, policy_version 47704 (0.0038) [2024-06-12 18:04:35,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48332.7, 300 sec: 48318.9). Total num frames: 781664256. Throughput: 0: 48276.4. Samples: 310433160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 18:04:35,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 18:04:37,697][71000] Updated weights for policy 0, policy_version 47714 (0.0028) [2024-06-12 18:04:40,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48332.7, 300 sec: 48318.9). Total num frames: 781893632. Throughput: 0: 48223.2. Samples: 310721460. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 18:04:40,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 18:04:41,147][71000] Updated weights for policy 0, policy_version 47724 (0.0036) [2024-06-12 18:04:44,339][71000] Updated weights for policy 0, policy_version 47734 (0.0026) [2024-06-12 18:04:45,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48332.9, 300 sec: 48318.9). Total num frames: 782155776. Throughput: 0: 48193.0. Samples: 311011240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:04:45,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:04:48,075][71000] Updated weights for policy 0, policy_version 47744 (0.0030) [2024-06-12 18:04:50,939][70768] Fps is (10 sec: 47513.9, 60 sec: 47786.7, 300 sec: 48318.9). Total num frames: 782368768. Throughput: 0: 48245.4. Samples: 311154280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:04:50,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:04:51,161][71000] Updated weights for policy 0, policy_version 47754 (0.0028) [2024-06-12 18:04:54,740][71000] Updated weights for policy 0, policy_version 47764 (0.0026) [2024-06-12 18:04:55,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48332.8, 300 sec: 48318.9). Total num frames: 782614528. Throughput: 0: 48381.5. Samples: 311455700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:04:55,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 18:04:57,830][71000] Updated weights for policy 0, policy_version 47774 (0.0029) [2024-06-12 18:05:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.9, 300 sec: 48318.9). Total num frames: 782860288. Throughput: 0: 48221.2. Samples: 311738560. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:05:00,940][70768] Avg episode reward: [(0, '0.213')] [2024-06-12 18:05:01,298][71000] Updated weights for policy 0, policy_version 47784 (0.0033) [2024-06-12 18:05:04,752][71000] Updated weights for policy 0, policy_version 47794 (0.0037) [2024-06-12 18:05:05,940][70768] Fps is (10 sec: 52428.1, 60 sec: 48605.8, 300 sec: 48318.9). Total num frames: 783138816. Throughput: 0: 48371.5. Samples: 311887700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:05:05,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 18:05:08,174][71000] Updated weights for policy 0, policy_version 47804 (0.0036) [2024-06-12 18:05:10,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48061.6, 300 sec: 48318.9). Total num frames: 783335424. Throughput: 0: 48251.1. Samples: 312169480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:05:10,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 18:05:11,458][71000] Updated weights for policy 0, policy_version 47814 (0.0031) [2024-06-12 18:05:15,035][71000] Updated weights for policy 0, policy_version 47824 (0.0028) [2024-06-12 18:05:15,939][70768] Fps is (10 sec: 42599.4, 60 sec: 48059.8, 300 sec: 48207.9). Total num frames: 783564800. Throughput: 0: 48151.6. Samples: 312456720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:05:15,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 18:05:18,567][71000] Updated weights for policy 0, policy_version 47834 (0.0036) [2024-06-12 18:05:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48332.9, 300 sec: 48318.9). Total num frames: 783826944. Throughput: 0: 48309.4. Samples: 312607080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:05:20,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 18:05:21,866][70980] Signal inference workers to stop experience collection... 
(4600 times) [2024-06-12 18:05:21,894][71000] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-06-12 18:05:21,923][70980] Signal inference workers to resume experience collection... (4600 times) [2024-06-12 18:05:21,923][71000] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-06-12 18:05:22,057][71000] Updated weights for policy 0, policy_version 47844 (0.0027) [2024-06-12 18:05:25,138][71000] Updated weights for policy 0, policy_version 47854 (0.0041) [2024-06-12 18:05:25,939][70768] Fps is (10 sec: 52428.9, 60 sec: 48059.8, 300 sec: 48430.0). Total num frames: 784089088. Throughput: 0: 48338.3. Samples: 312896680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:05:25,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 18:05:28,514][71000] Updated weights for policy 0, policy_version 47864 (0.0027) [2024-06-12 18:05:30,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48059.5, 300 sec: 48264.0). Total num frames: 784302080. Throughput: 0: 48313.1. Samples: 313185340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:05:30,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:05:32,010][71000] Updated weights for policy 0, policy_version 47874 (0.0032) [2024-06-12 18:05:35,379][71000] Updated weights for policy 0, policy_version 47884 (0.0041) [2024-06-12 18:05:35,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48332.8, 300 sec: 48374.4). Total num frames: 784564224. Throughput: 0: 48187.5. Samples: 313322720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:05:35,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 18:05:38,871][71000] Updated weights for policy 0, policy_version 47894 (0.0025) [2024-06-12 18:05:40,939][70768] Fps is (10 sec: 49153.3, 60 sec: 48332.8, 300 sec: 48318.9). Total num frames: 784793600. Throughput: 0: 48113.4. Samples: 313620800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:05:40,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:05:40,967][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000047901_784809984.pth... [2024-06-12 18:05:41,005][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000047192_773193728.pth [2024-06-12 18:05:42,060][71000] Updated weights for policy 0, policy_version 47904 (0.0020) [2024-06-12 18:05:45,463][71000] Updated weights for policy 0, policy_version 47914 (0.0034) [2024-06-12 18:05:45,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48332.8, 300 sec: 48430.0). Total num frames: 785055744. Throughput: 0: 48439.7. Samples: 313918340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 18:05:45,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 18:05:48,983][71000] Updated weights for policy 0, policy_version 47924 (0.0032) [2024-06-12 18:05:50,939][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.8, 300 sec: 48318.9). Total num frames: 785285120. Throughput: 0: 48420.2. Samples: 314066600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 18:05:50,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 18:05:52,096][71000] Updated weights for policy 0, policy_version 47934 (0.0033) [2024-06-12 18:05:55,531][71000] Updated weights for policy 0, policy_version 47944 (0.0038) [2024-06-12 18:05:55,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.9, 300 sec: 48485.5). Total num frames: 785547264. Throughput: 0: 48514.5. Samples: 314352640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 18:05:55,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 18:05:58,907][71000] Updated weights for policy 0, policy_version 47954 (0.0032) [2024-06-12 18:06:00,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48605.8, 300 sec: 48318.9). Total num frames: 785776640. Throughput: 0: 48511.3. Samples: 314639740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 18:06:00,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:06:02,333][71000] Updated weights for policy 0, policy_version 47964 (0.0028) [2024-06-12 18:06:05,739][71000] Updated weights for policy 0, policy_version 47974 (0.0033) [2024-06-12 18:06:05,939][70768] Fps is (10 sec: 45875.9, 60 sec: 47786.8, 300 sec: 48318.9). Total num frames: 786006016. Throughput: 0: 48402.8. Samples: 314785200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 18:06:05,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 18:06:09,326][71000] Updated weights for policy 0, policy_version 47984 (0.0031) [2024-06-12 18:06:10,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48332.8, 300 sec: 48318.9). Total num frames: 786235392. Throughput: 0: 48546.5. Samples: 315081280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-12 18:06:10,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 18:06:12,412][71000] Updated weights for policy 0, policy_version 47994 (0.0034) [2024-06-12 18:06:15,927][71000] Updated weights for policy 0, policy_version 48004 (0.0028) [2024-06-12 18:06:15,939][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 48430.0). Total num frames: 786497536. Throughput: 0: 48538.5. Samples: 315369560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 18:06:15,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:06:19,013][71000] Updated weights for policy 0, policy_version 48014 (0.0036) [2024-06-12 18:06:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.8, 300 sec: 48374.4). Total num frames: 786743296. Throughput: 0: 48591.0. Samples: 315509320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 18:06:20,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 18:06:22,432][71000] Updated weights for policy 0, policy_version 48024 (0.0031) [2024-06-12 18:06:25,898][71000] Updated weights for policy 0, policy_version 48034 (0.0037) [2024-06-12 18:06:25,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48332.6, 300 sec: 48485.5). Total num frames: 786989056. Throughput: 0: 48574.0. Samples: 315806640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 18:06:25,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:06:29,129][70980] Signal inference workers to stop experience collection... (4650 times) [2024-06-12 18:06:29,130][70980] Signal inference workers to resume experience collection... (4650 times) [2024-06-12 18:06:29,172][71000] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-06-12 18:06:29,173][71000] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-06-12 18:06:29,511][71000] Updated weights for policy 0, policy_version 48044 (0.0031) [2024-06-12 18:06:30,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48878.9, 300 sec: 48430.0). Total num frames: 787234816. Throughput: 0: 48257.9. Samples: 316089960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 18:06:30,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 18:06:32,635][71000] Updated weights for policy 0, policy_version 48054 (0.0023) [2024-06-12 18:06:35,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48059.6, 300 sec: 48318.9). Total num frames: 787447808. Throughput: 0: 48122.0. Samples: 316232100. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 18:06:35,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 18:06:36,327][71000] Updated weights for policy 0, policy_version 48064 (0.0034) [2024-06-12 18:06:39,618][71000] Updated weights for policy 0, policy_version 48074 (0.0026) [2024-06-12 18:06:40,940][70768] Fps is (10 sec: 49153.1, 60 sec: 48878.8, 300 sec: 48430.0). Total num frames: 787726336. Throughput: 0: 48203.6. Samples: 316521800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 18:06:40,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 18:06:42,959][71000] Updated weights for policy 0, policy_version 48084 (0.0031) [2024-06-12 18:06:45,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48332.7, 300 sec: 48430.0). Total num frames: 787955712. Throughput: 0: 48552.5. Samples: 316824600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 18:06:45,940][70768] Avg episode reward: [(0, '0.217')] [2024-06-12 18:06:46,219][71000] Updated weights for policy 0, policy_version 48094 (0.0020) [2024-06-12 18:06:49,314][71000] Updated weights for policy 0, policy_version 48104 (0.0027) [2024-06-12 18:06:50,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.8, 300 sec: 48374.5). Total num frames: 788201472. Throughput: 0: 48682.6. Samples: 316975920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 18:06:50,940][70768] Avg episode reward: [(0, '0.213')] [2024-06-12 18:06:52,857][71000] Updated weights for policy 0, policy_version 48114 (0.0026) [2024-06-12 18:06:55,939][70768] Fps is (10 sec: 47514.4, 60 sec: 48059.8, 300 sec: 48318.9). Total num frames: 788430848. Throughput: 0: 48479.2. Samples: 317262840. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 18:06:55,940][70768] Avg episode reward: [(0, '0.213')] [2024-06-12 18:06:56,206][71000] Updated weights for policy 0, policy_version 48124 (0.0031) [2024-06-12 18:06:59,586][71000] Updated weights for policy 0, policy_version 48134 (0.0041) [2024-06-12 18:07:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.9, 300 sec: 48374.5). Total num frames: 788676608. Throughput: 0: 48434.6. Samples: 317549120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 18:07:00,940][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 18:07:02,909][71000] Updated weights for policy 0, policy_version 48144 (0.0027) [2024-06-12 18:07:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.8, 300 sec: 48430.0). Total num frames: 788922368. Throughput: 0: 48700.1. Samples: 317700820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 18:07:05,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 18:07:06,191][71000] Updated weights for policy 0, policy_version 48154 (0.0028) [2024-06-12 18:07:09,262][71000] Updated weights for policy 0, policy_version 48164 (0.0019) [2024-06-12 18:07:10,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 48430.0). Total num frames: 789184512. Throughput: 0: 48604.1. Samples: 317993820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 18:07:10,940][70768] Avg episode reward: [(0, '0.223')] [2024-06-12 18:07:12,845][71000] Updated weights for policy 0, policy_version 48174 (0.0024) [2024-06-12 18:07:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.9, 300 sec: 48485.5). Total num frames: 789430272. Throughput: 0: 49201.2. Samples: 318304000. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 18:07:15,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 18:07:15,986][71000] Updated weights for policy 0, policy_version 48184 (0.0029) [2024-06-12 18:07:19,317][71000] Updated weights for policy 0, policy_version 48194 (0.0029) [2024-06-12 18:07:20,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.9, 300 sec: 48486.1). Total num frames: 789676032. Throughput: 0: 49409.3. Samples: 318455520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 18:07:20,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 18:07:22,615][71000] Updated weights for policy 0, policy_version 48204 (0.0028) [2024-06-12 18:07:25,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.9, 300 sec: 48485.5). Total num frames: 789905408. Throughput: 0: 49459.1. Samples: 318747460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 18:07:25,940][70768] Avg episode reward: [(0, '0.209')] [2024-06-12 18:07:26,257][71000] Updated weights for policy 0, policy_version 48214 (0.0031) [2024-06-12 18:07:29,285][71000] Updated weights for policy 0, policy_version 48224 (0.0029) [2024-06-12 18:07:30,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48606.1, 300 sec: 48430.0). Total num frames: 790151168. Throughput: 0: 49033.0. Samples: 319031080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 18:07:30,944][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 18:07:33,053][71000] Updated weights for policy 0, policy_version 48234 (0.0035) [2024-06-12 18:07:35,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.2, 300 sec: 48596.6). Total num frames: 790413312. Throughput: 0: 48990.2. Samples: 319180480. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 18:07:35,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 18:07:35,943][71000] Updated weights for policy 0, policy_version 48244 (0.0028) [2024-06-12 18:07:39,396][71000] Updated weights for policy 0, policy_version 48254 (0.0029) [2024-06-12 18:07:40,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 48430.0). Total num frames: 790642688. Throughput: 0: 49328.4. Samples: 319482620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 18:07:40,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 18:07:41,034][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000048258_790659072.pth... [2024-06-12 18:07:41,082][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000047546_778993664.pth [2024-06-12 18:07:42,472][71000] Updated weights for policy 0, policy_version 48264 (0.0031) [2024-06-12 18:07:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 48596.6). Total num frames: 790904832. Throughput: 0: 49487.1. Samples: 319776040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 18:07:45,940][70768] Avg episode reward: [(0, '0.212')] [2024-06-12 18:07:46,174][71000] Updated weights for policy 0, policy_version 48274 (0.0026) [2024-06-12 18:07:48,680][70980] Signal inference workers to stop experience collection... (4700 times) [2024-06-12 18:07:48,710][71000] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-06-12 18:07:48,732][70980] Signal inference workers to resume experience collection... (4700 times) [2024-06-12 18:07:48,736][71000] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-06-12 18:07:49,282][71000] Updated weights for policy 0, policy_version 48284 (0.0039) [2024-06-12 18:07:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 48541.1). Total num frames: 791150592. Throughput: 0: 49489.8. Samples: 319927860. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 18:07:50,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 18:07:53,090][71000] Updated weights for policy 0, policy_version 48294 (0.0029) [2024-06-12 18:07:55,636][71000] Updated weights for policy 0, policy_version 48304 (0.0030) [2024-06-12 18:07:55,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.1, 300 sec: 48652.1). Total num frames: 791412736. Throughput: 0: 49658.2. Samples: 320228440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 18:07:55,940][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 18:07:59,422][71000] Updated weights for policy 0, policy_version 48314 (0.0026) [2024-06-12 18:08:00,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 48596.6). Total num frames: 791642112. Throughput: 0: 49508.1. Samples: 320531860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 18:08:00,940][70768] Avg episode reward: [(0, '0.187')] [2024-06-12 18:08:02,373][71000] Updated weights for policy 0, policy_version 48324 (0.0029) [2024-06-12 18:08:05,743][71000] Updated weights for policy 0, policy_version 48334 (0.0032) [2024-06-12 18:08:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 48707.7). Total num frames: 791904256. Throughput: 0: 49277.9. Samples: 320673020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 18:08:05,940][70768] Avg episode reward: [(0, '0.207')] [2024-06-12 18:08:09,087][71000] Updated weights for policy 0, policy_version 48344 (0.0039) [2024-06-12 18:08:10,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.2, 300 sec: 48763.2). Total num frames: 792166400. Throughput: 0: 49356.6. Samples: 320968500. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 18:08:10,940][70768] Avg episode reward: [(0, '0.204')] [2024-06-12 18:08:12,695][71000] Updated weights for policy 0, policy_version 48354 (0.0026) [2024-06-12 18:08:15,470][71000] Updated weights for policy 0, policy_version 48364 (0.0031) [2024-06-12 18:08:15,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49698.0, 300 sec: 48818.8). Total num frames: 792412160. Throughput: 0: 49641.2. Samples: 321264940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-12 18:08:15,940][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 18:08:19,533][71000] Updated weights for policy 0, policy_version 48374 (0.0024) [2024-06-12 18:08:20,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.2, 300 sec: 48597.3). Total num frames: 792625152. Throughput: 0: 49604.5. Samples: 321412680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-12 18:08:20,940][70768] Avg episode reward: [(0, '0.196')] [2024-06-12 18:08:21,994][71000] Updated weights for policy 0, policy_version 48384 (0.0032) [2024-06-12 18:08:25,854][71000] Updated weights for policy 0, policy_version 48394 (0.0033) [2024-06-12 18:08:25,939][70768] Fps is (10 sec: 47514.6, 60 sec: 49698.2, 300 sec: 48707.7). Total num frames: 792887296. Throughput: 0: 49464.0. Samples: 321708500. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-12 18:08:25,940][70768] Avg episode reward: [(0, '0.209')] [2024-06-12 18:08:28,598][71000] Updated weights for policy 0, policy_version 48404 (0.0020) [2024-06-12 18:08:30,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 48707.7). Total num frames: 793133056. Throughput: 0: 49681.4. Samples: 322011700. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-12 18:08:30,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 18:08:32,235][71000] Updated weights for policy 0, policy_version 48414 (0.0032) [2024-06-12 18:08:35,390][71000] Updated weights for policy 0, policy_version 48424 (0.0034) [2024-06-12 18:08:35,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.2, 300 sec: 48818.8). Total num frames: 793395200. Throughput: 0: 49675.6. Samples: 322163260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-12 18:08:35,940][70768] Avg episode reward: [(0, '0.190')] [2024-06-12 18:08:39,262][71000] Updated weights for policy 0, policy_version 48434 (0.0026) [2024-06-12 18:08:40,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49698.1, 300 sec: 48707.7). Total num frames: 793624576. Throughput: 0: 49515.2. Samples: 322456620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-12 18:08:40,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 18:08:41,766][71000] Updated weights for policy 0, policy_version 48444 (0.0023) [2024-06-12 18:08:45,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 48652.1). Total num frames: 793853952. Throughput: 0: 49332.8. Samples: 322751840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-12 18:08:45,940][70768] Avg episode reward: [(0, '0.215')] [2024-06-12 18:08:46,193][71000] Updated weights for policy 0, policy_version 48454 (0.0030) [2024-06-12 18:08:48,163][71000] Updated weights for policy 0, policy_version 48464 (0.0028) [2024-06-12 18:08:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.2, 300 sec: 48874.3). Total num frames: 794132480. Throughput: 0: 49372.0. Samples: 322894760. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-12 18:08:50,940][70768] Avg episode reward: [(0, '0.197')] [2024-06-12 18:08:52,537][71000] Updated weights for policy 0, policy_version 48474 (0.0035) [2024-06-12 18:08:53,351][70980] Signal inference workers to stop experience collection... 
(4750 times) [2024-06-12 18:08:53,351][70980] Signal inference workers to resume experience collection... (4750 times) [2024-06-12 18:08:53,363][71000] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-06-12 18:08:53,363][71000] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-06-12 18:08:55,042][71000] Updated weights for policy 0, policy_version 48484 (0.0028) [2024-06-12 18:08:55,940][70768] Fps is (10 sec: 54066.3, 60 sec: 49698.0, 300 sec: 48985.4). Total num frames: 794394624. Throughput: 0: 49726.0. Samples: 323206180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-12 18:08:55,940][70768] Avg episode reward: [(0, '0.189')] [2024-06-12 18:08:58,739][71000] Updated weights for policy 0, policy_version 48494 (0.0035) [2024-06-12 18:09:00,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 48874.3). Total num frames: 794640384. Throughput: 0: 49801.6. Samples: 323506000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-12 18:09:00,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 18:09:01,555][71000] Updated weights for policy 0, policy_version 48504 (0.0033) [2024-06-12 18:09:05,535][71000] Updated weights for policy 0, policy_version 48514 (0.0034) [2024-06-12 18:09:05,940][70768] Fps is (10 sec: 45876.0, 60 sec: 49152.0, 300 sec: 48819.1). Total num frames: 794853376. Throughput: 0: 49795.5. Samples: 323653480. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-12 18:09:05,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 18:09:08,061][71000] Updated weights for policy 0, policy_version 48524 (0.0027) [2024-06-12 18:09:10,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 795131904. Throughput: 0: 49655.9. Samples: 323943020. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-12 18:09:10,940][70768] Avg episode reward: [(0, '0.193')] [2024-06-12 18:09:12,075][71000] Updated weights for policy 0, policy_version 48534 (0.0031) [2024-06-12 18:09:14,613][71000] Updated weights for policy 0, policy_version 48544 (0.0025) [2024-06-12 18:09:15,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49698.3, 300 sec: 49040.9). Total num frames: 795394048. Throughput: 0: 49638.7. Samples: 324245440. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-12 18:09:15,940][70768] Avg episode reward: [(0, '0.179')] [2024-06-12 18:09:18,642][71000] Updated weights for policy 0, policy_version 48554 (0.0024) [2024-06-12 18:09:20,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 48874.3). Total num frames: 795623424. Throughput: 0: 49614.6. Samples: 324395920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:09:20,940][70768] Avg episode reward: [(0, '0.203')] [2024-06-12 18:09:21,400][71000] Updated weights for policy 0, policy_version 48564 (0.0039) [2024-06-12 18:09:25,070][71000] Updated weights for policy 0, policy_version 48574 (0.0024) [2024-06-12 18:09:25,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 795852800. Throughput: 0: 49767.0. Samples: 324696140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:09:25,940][70768] Avg episode reward: [(0, '0.201')] [2024-06-12 18:09:28,106][71000] Updated weights for policy 0, policy_version 48584 (0.0031) [2024-06-12 18:09:30,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49971.2, 300 sec: 49040.9). Total num frames: 796131328. Throughput: 0: 49745.3. Samples: 324990380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:09:30,940][70768] Avg episode reward: [(0, '0.199')] [2024-06-12 18:09:31,922][71000] Updated weights for policy 0, policy_version 48594 (0.0029) [2024-06-12 18:09:34,688][71000] Updated weights for policy 0, policy_version 48604 (0.0026) [2024-06-12 18:09:35,942][70768] Fps is (10 sec: 52414.2, 60 sec: 49695.8, 300 sec: 49096.0). Total num frames: 796377088. Throughput: 0: 49901.3. Samples: 325140460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:09:35,943][70768] Avg episode reward: [(0, '0.184')] [2024-06-12 18:09:38,532][71000] Updated weights for policy 0, policy_version 48614 (0.0028) [2024-06-12 18:09:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49971.1, 300 sec: 49040.9). Total num frames: 796622848. Throughput: 0: 49553.9. Samples: 325436100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:09:40,940][70768] Avg episode reward: [(0, '0.203')] [2024-06-12 18:09:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000048622_796622848.pth... [2024-06-12 18:09:40,991][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000047901_784809984.pth [2024-06-12 18:09:41,379][71000] Updated weights for policy 0, policy_version 48624 (0.0026) [2024-06-12 18:09:45,376][71000] Updated weights for policy 0, policy_version 48634 (0.0023) [2024-06-12 18:09:45,940][70768] Fps is (10 sec: 49165.2, 60 sec: 50244.2, 300 sec: 49152.0). Total num frames: 796868608. Throughput: 0: 49687.8. Samples: 325741960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:09:45,940][70768] Avg episode reward: [(0, '0.182')] [2024-06-12 18:09:47,988][71000] Updated weights for policy 0, policy_version 48644 (0.0027) [2024-06-12 18:09:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49698.0, 300 sec: 49152.0). Total num frames: 797114368. Throughput: 0: 49555.4. Samples: 325883480. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-12 18:09:50,940][70768] Avg episode reward: [(0, '0.199')] [2024-06-12 18:09:51,673][71000] Updated weights for policy 0, policy_version 48654 (0.0029) [2024-06-12 18:09:54,583][71000] Updated weights for policy 0, policy_version 48664 (0.0033) [2024-06-12 18:09:55,939][70768] Fps is (10 sec: 52429.7, 60 sec: 49971.4, 300 sec: 49263.1). Total num frames: 797392896. Throughput: 0: 49985.4. Samples: 326192360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-12 18:09:55,940][70768] Avg episode reward: [(0, '0.204')] [2024-06-12 18:09:58,257][71000] Updated weights for policy 0, policy_version 48674 (0.0030) [2024-06-12 18:10:00,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 797622272. Throughput: 0: 49948.0. Samples: 326493100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-12 18:10:00,940][70768] Avg episode reward: [(0, '0.175')] [2024-06-12 18:10:00,972][71000] Updated weights for policy 0, policy_version 48684 (0.0034) [2024-06-12 18:10:04,850][71000] Updated weights for policy 0, policy_version 48694 (0.0027) [2024-06-12 18:10:05,920][70980] Signal inference workers to stop experience collection... (4800 times) [2024-06-12 18:10:05,939][70768] Fps is (10 sec: 45875.2, 60 sec: 49971.2, 300 sec: 49207.5). Total num frames: 797851648. Throughput: 0: 49863.2. Samples: 326639760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-12 18:10:05,940][70768] Avg episode reward: [(0, '0.194')] [2024-06-12 18:10:05,959][71000] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-06-12 18:10:05,965][70980] Signal inference workers to resume experience collection... 
(4800 times) [2024-06-12 18:10:05,978][71000] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-06-12 18:10:07,552][71000] Updated weights for policy 0, policy_version 48704 (0.0026) [2024-06-12 18:10:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 798097408. Throughput: 0: 49756.5. Samples: 326935180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-12 18:10:10,940][70768] Avg episode reward: [(0, '0.194')] [2024-06-12 18:10:11,500][71000] Updated weights for policy 0, policy_version 48714 (0.0030) [2024-06-12 18:10:14,235][71000] Updated weights for policy 0, policy_version 48724 (0.0023) [2024-06-12 18:10:15,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 798375936. Throughput: 0: 49925.8. Samples: 327237040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-12 18:10:15,940][70768] Avg episode reward: [(0, '0.181')] [2024-06-12 18:10:17,879][71000] Updated weights for policy 0, policy_version 48734 (0.0026) [2024-06-12 18:10:20,605][71000] Updated weights for policy 0, policy_version 48744 (0.0028) [2024-06-12 18:10:20,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 798621696. Throughput: 0: 50103.2. Samples: 327394960. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-12 18:10:20,940][70768] Avg episode reward: [(0, '0.196')] [2024-06-12 18:10:24,493][71000] Updated weights for policy 0, policy_version 48754 (0.0033) [2024-06-12 18:10:25,940][70768] Fps is (10 sec: 49151.2, 60 sec: 50244.2, 300 sec: 49374.2). Total num frames: 798867456. Throughput: 0: 50210.1. Samples: 327695560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-12 18:10:25,940][70768] Avg episode reward: [(0, '0.190')] [2024-06-12 18:10:27,498][71000] Updated weights for policy 0, policy_version 48764 (0.0025) [2024-06-12 18:10:30,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.1, 300 sec: 49207.6). 
Total num frames: 799080448. Throughput: 0: 49821.1. Samples: 327983900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-12 18:10:30,940][70768] Avg episode reward: [(0, '0.198')] [2024-06-12 18:10:31,186][71000] Updated weights for policy 0, policy_version 48774 (0.0021) [2024-06-12 18:10:34,139][71000] Updated weights for policy 0, policy_version 48784 (0.0035) [2024-06-12 18:10:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49973.5, 300 sec: 49429.7). Total num frames: 799375360. Throughput: 0: 50022.3. Samples: 328134480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-12 18:10:35,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 18:10:37,623][71000] Updated weights for policy 0, policy_version 48794 (0.0030) [2024-06-12 18:10:40,881][71000] Updated weights for policy 0, policy_version 48804 (0.0028) [2024-06-12 18:10:40,939][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 799604736. Throughput: 0: 49709.4. Samples: 328429280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-12 18:10:40,940][70768] Avg episode reward: [(0, '0.189')] [2024-06-12 18:10:44,403][71000] Updated weights for policy 0, policy_version 48814 (0.0031) [2024-06-12 18:10:45,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49698.3, 300 sec: 49374.2). Total num frames: 799850496. Throughput: 0: 49375.6. Samples: 328715000. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-12 18:10:45,940][70768] Avg episode reward: [(0, '0.188')] [2024-06-12 18:10:47,618][71000] Updated weights for policy 0, policy_version 48824 (0.0023) [2024-06-12 18:10:50,905][71000] Updated weights for policy 0, policy_version 48834 (0.0032) [2024-06-12 18:10:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 800096256. Throughput: 0: 49524.8. Samples: 328868380. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:10:50,940][70768] Avg episode reward: [(0, '0.197')] [2024-06-12 18:10:54,160][71000] Updated weights for policy 0, policy_version 48844 (0.0034) [2024-06-12 18:10:55,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 800358400. Throughput: 0: 49685.4. Samples: 329171020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:10:55,940][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 18:10:57,403][71000] Updated weights for policy 0, policy_version 48854 (0.0030) [2024-06-12 18:11:00,615][71000] Updated weights for policy 0, policy_version 48864 (0.0031) [2024-06-12 18:11:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 800587776. Throughput: 0: 49797.3. Samples: 329477920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:11:00,940][70768] Avg episode reward: [(0, '0.194')] [2024-06-12 18:11:03,725][71000] Updated weights for policy 0, policy_version 48874 (0.0026) [2024-06-12 18:11:05,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49971.1, 300 sec: 49540.8). Total num frames: 800849920. Throughput: 0: 49530.5. Samples: 329623840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:11:05,940][70768] Avg episode reward: [(0, '0.192')] [2024-06-12 18:11:07,421][71000] Updated weights for policy 0, policy_version 48884 (0.0029) [2024-06-12 18:11:10,273][71000] Updated weights for policy 0, policy_version 48894 (0.0024) [2024-06-12 18:11:10,940][70768] Fps is (10 sec: 52428.5, 60 sec: 50244.3, 300 sec: 49540.8). Total num frames: 801112064. Throughput: 0: 49516.1. Samples: 329923780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:11:10,940][70768] Avg episode reward: [(0, '0.189')] [2024-06-12 18:11:13,752][71000] Updated weights for policy 0, policy_version 48904 (0.0038) [2024-06-12 18:11:15,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 49540.8). 
Total num frames: 801357824. Throughput: 0: 49701.7. Samples: 330220480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:11:15,940][70768] Avg episode reward: [(0, '0.203')] [2024-06-12 18:11:16,885][70980] Signal inference workers to stop experience collection... (4850 times) [2024-06-12 18:11:16,933][71000] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-06-12 18:11:16,943][70980] Signal inference workers to resume experience collection... (4850 times) [2024-06-12 18:11:16,944][71000] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-06-12 18:11:16,946][71000] Updated weights for policy 0, policy_version 48914 (0.0028) [2024-06-12 18:11:20,616][71000] Updated weights for policy 0, policy_version 48924 (0.0034) [2024-06-12 18:11:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 801587200. Throughput: 0: 49621.8. Samples: 330367460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 18:11:20,940][70768] Avg episode reward: [(0, '0.203')] [2024-06-12 18:11:23,439][71000] Updated weights for policy 0, policy_version 48934 (0.0030) [2024-06-12 18:11:25,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 801832960. Throughput: 0: 49579.4. Samples: 330660360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 18:11:25,940][70768] Avg episode reward: [(0, '0.203')] [2024-06-12 18:11:27,093][71000] Updated weights for policy 0, policy_version 48944 (0.0028) [2024-06-12 18:11:29,891][71000] Updated weights for policy 0, policy_version 48954 (0.0035) [2024-06-12 18:11:30,939][70768] Fps is (10 sec: 52429.5, 60 sec: 50517.3, 300 sec: 49707.4). Total num frames: 802111488. Throughput: 0: 49815.6. Samples: 330956700. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 18:11:30,940][70768] Avg episode reward: [(0, '0.188')] [2024-06-12 18:11:33,803][71000] Updated weights for policy 0, policy_version 48964 (0.0029) [2024-06-12 18:11:35,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 802324480. Throughput: 0: 49929.4. Samples: 331115200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 18:11:35,940][70768] Avg episode reward: [(0, '0.209')] [2024-06-12 18:11:36,670][71000] Updated weights for policy 0, policy_version 48974 (0.0034) [2024-06-12 18:11:40,193][71000] Updated weights for policy 0, policy_version 48984 (0.0027) [2024-06-12 18:11:40,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49698.0, 300 sec: 49596.3). Total num frames: 802586624. Throughput: 0: 49900.3. Samples: 331416540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 18:11:40,940][70768] Avg episode reward: [(0, '0.199')] [2024-06-12 18:11:40,945][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000048986_802586624.pth... [2024-06-12 18:11:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000048258_790659072.pth [2024-06-12 18:11:43,101][71000] Updated weights for policy 0, policy_version 48994 (0.0031) [2024-06-12 18:11:45,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 802832384. Throughput: 0: 49512.5. Samples: 331705980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 18:11:45,940][70768] Avg episode reward: [(0, '0.194')] [2024-06-12 18:11:47,313][71000] Updated weights for policy 0, policy_version 49004 (0.0022) [2024-06-12 18:11:49,724][71000] Updated weights for policy 0, policy_version 49014 (0.0028) [2024-06-12 18:11:50,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49971.1, 300 sec: 49707.3). Total num frames: 803094528. Throughput: 0: 49492.4. Samples: 331851000. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-12 18:11:50,940][70768] Avg episode reward: [(0, '0.186')] [2024-06-12 18:11:53,964][71000] Updated weights for policy 0, policy_version 49024 (0.0032) [2024-06-12 18:11:55,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49651.9). Total num frames: 803323904. Throughput: 0: 49577.0. Samples: 332154740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-12 18:11:55,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 18:11:56,379][71000] Updated weights for policy 0, policy_version 49034 (0.0026) [2024-06-12 18:12:00,271][71000] Updated weights for policy 0, policy_version 49044 (0.0031) [2024-06-12 18:12:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 803586048. Throughput: 0: 49673.2. Samples: 332455780. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-12 18:12:00,940][70768] Avg episode reward: [(0, '0.203')] [2024-06-12 18:12:03,042][71000] Updated weights for policy 0, policy_version 49054 (0.0029) [2024-06-12 18:12:05,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 803815424. Throughput: 0: 49561.5. Samples: 332597720. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-12 18:12:05,940][70768] Avg episode reward: [(0, '0.205')] [2024-06-12 18:12:06,896][71000] Updated weights for policy 0, policy_version 49064 (0.0033) [2024-06-12 18:12:09,394][71000] Updated weights for policy 0, policy_version 49074 (0.0029) [2024-06-12 18:12:10,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.1, 300 sec: 49651.9). Total num frames: 804077568. Throughput: 0: 49742.8. Samples: 332898780. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-12 18:12:10,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 18:12:13,514][71000] Updated weights for policy 0, policy_version 49084 (0.0024) [2024-06-12 18:12:15,809][71000] Updated weights for policy 0, policy_version 49094 (0.0030) [2024-06-12 18:12:15,940][70768] Fps is (10 sec: 54066.0, 60 sec: 49971.1, 300 sec: 49762.9). Total num frames: 804356096. Throughput: 0: 49825.6. Samples: 333198860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-12 18:12:15,940][70768] Avg episode reward: [(0, '0.188')] [2024-06-12 18:12:18,149][70980] Signal inference workers to stop experience collection... (4900 times) [2024-06-12 18:12:18,154][70980] Signal inference workers to resume experience collection... (4900 times) [2024-06-12 18:12:18,189][71000] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-06-12 18:12:18,189][71000] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-06-12 18:12:19,850][71000] Updated weights for policy 0, policy_version 49104 (0.0028) [2024-06-12 18:12:20,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49698.1, 300 sec: 49707.4). Total num frames: 804569088. Throughput: 0: 49679.8. Samples: 333350800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-12 18:12:20,940][70768] Avg episode reward: [(0, '0.209')] [2024-06-12 18:12:22,704][71000] Updated weights for policy 0, policy_version 49114 (0.0028) [2024-06-12 18:12:25,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 804814848. Throughput: 0: 49648.5. Samples: 333650720. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 18:12:25,940][70768] Avg episode reward: [(0, '0.210')] [2024-06-12 18:12:26,355][71000] Updated weights for policy 0, policy_version 49124 (0.0029) [2024-06-12 18:12:29,078][71000] Updated weights for policy 0, policy_version 49134 (0.0023) [2024-06-12 18:12:30,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.0, 300 sec: 49762.9). Total num frames: 805093376. Throughput: 0: 49890.0. Samples: 333951040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 18:12:30,940][70768] Avg episode reward: [(0, '0.198')] [2024-06-12 18:12:33,222][71000] Updated weights for policy 0, policy_version 49144 (0.0031) [2024-06-12 18:12:35,515][71000] Updated weights for policy 0, policy_version 49154 (0.0036) [2024-06-12 18:12:35,940][70768] Fps is (10 sec: 54067.3, 60 sec: 50517.3, 300 sec: 49874.0). Total num frames: 805355520. Throughput: 0: 50175.8. Samples: 334108900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 18:12:35,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 18:12:39,564][71000] Updated weights for policy 0, policy_version 49164 (0.0031) [2024-06-12 18:12:40,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 805568512. Throughput: 0: 50053.3. Samples: 334407140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 18:12:40,940][70768] Avg episode reward: [(0, '0.210')] [2024-06-12 18:12:42,012][71000] Updated weights for policy 0, policy_version 49174 (0.0032) [2024-06-12 18:12:45,939][70768] Fps is (10 sec: 45875.6, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 805814272. Throughput: 0: 50002.9. Samples: 334705900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 18:12:45,940][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 18:12:46,013][71000] Updated weights for policy 0, policy_version 49184 (0.0026) [2024-06-12 18:12:49,009][71000] Updated weights for policy 0, policy_version 49194 (0.0035) [2024-06-12 18:12:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.2, 300 sec: 49651.8). Total num frames: 806060032. Throughput: 0: 49750.9. Samples: 334836520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 18:12:50,940][70768] Avg episode reward: [(0, '0.194')] [2024-06-12 18:12:52,703][71000] Updated weights for policy 0, policy_version 49204 (0.0032) [2024-06-12 18:12:55,259][71000] Updated weights for policy 0, policy_version 49214 (0.0026) [2024-06-12 18:12:55,940][70768] Fps is (10 sec: 52427.5, 60 sec: 50244.1, 300 sec: 49818.4). Total num frames: 806338560. Throughput: 0: 49974.9. Samples: 335147660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:12:55,940][70768] Avg episode reward: [(0, '0.207')] [2024-06-12 18:12:59,212][71000] Updated weights for policy 0, policy_version 49224 (0.0028) [2024-06-12 18:13:00,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49971.2, 300 sec: 49762.9). Total num frames: 806584320. Throughput: 0: 50235.5. Samples: 335459460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:13:00,940][70768] Avg episode reward: [(0, '0.207')] [2024-06-12 18:13:01,526][71000] Updated weights for policy 0, policy_version 49234 (0.0031) [2024-06-12 18:13:05,623][71000] Updated weights for policy 0, policy_version 49244 (0.0027) [2024-06-12 18:13:05,940][70768] Fps is (10 sec: 49152.6, 60 sec: 50244.2, 300 sec: 49707.4). Total num frames: 806830080. Throughput: 0: 49965.4. Samples: 335599240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:13:05,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 18:13:08,121][71000] Updated weights for policy 0, policy_version 49254 (0.0020) [2024-06-12 18:13:10,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 807075840. Throughput: 0: 50065.3. Samples: 335903660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:13:10,940][70768] Avg episode reward: [(0, '0.197')] [2024-06-12 18:13:12,123][71000] Updated weights for policy 0, policy_version 49264 (0.0022) [2024-06-12 18:13:14,755][71000] Updated weights for policy 0, policy_version 49274 (0.0024) [2024-06-12 18:13:15,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.1, 300 sec: 49874.0). Total num frames: 807337984. Throughput: 0: 49904.9. Samples: 336196760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:13:15,940][70768] Avg episode reward: [(0, '0.189')] [2024-06-12 18:13:18,723][71000] Updated weights for policy 0, policy_version 49284 (0.0027) [2024-06-12 18:13:20,015][70980] Signal inference workers to stop experience collection... (4950 times) [2024-06-12 18:13:20,052][71000] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-06-12 18:13:20,127][70980] Signal inference workers to resume experience collection... (4950 times) [2024-06-12 18:13:20,127][71000] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-06-12 18:13:20,940][70768] Fps is (10 sec: 50790.6, 60 sec: 50244.4, 300 sec: 49818.5). Total num frames: 807583744. Throughput: 0: 49782.7. Samples: 336349120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:13:20,940][70768] Avg episode reward: [(0, '0.183')] [2024-06-12 18:13:21,472][71000] Updated weights for policy 0, policy_version 49294 (0.0022) [2024-06-12 18:13:25,382][71000] Updated weights for policy 0, policy_version 49304 (0.0027) [2024-06-12 18:13:25,942][70768] Fps is (10 sec: 47502.7, 60 sec: 49969.2, 300 sec: 49762.5). Total num frames: 807813120. Throughput: 0: 49832.4. Samples: 336649720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 18:13:25,943][70768] Avg episode reward: [(0, '0.184')] [2024-06-12 18:13:27,939][71000] Updated weights for policy 0, policy_version 49314 (0.0032) [2024-06-12 18:13:30,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.2, 300 sec: 49707.4). Total num frames: 808058880. Throughput: 0: 49960.8. Samples: 336954140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 18:13:30,940][70768] Avg episode reward: [(0, '0.204')] [2024-06-12 18:13:31,817][71000] Updated weights for policy 0, policy_version 49324 (0.0042) [2024-06-12 18:13:34,505][71000] Updated weights for policy 0, policy_version 49334 (0.0027) [2024-06-12 18:13:35,939][70768] Fps is (10 sec: 52441.8, 60 sec: 49698.2, 300 sec: 49874.0). Total num frames: 808337408. Throughput: 0: 50326.4. Samples: 337101200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 18:13:35,940][70768] Avg episode reward: [(0, '0.213')] [2024-06-12 18:13:38,426][71000] Updated weights for policy 0, policy_version 49344 (0.0025) [2024-06-12 18:13:40,940][70768] Fps is (10 sec: 54066.8, 60 sec: 50517.3, 300 sec: 49985.1). Total num frames: 808599552. Throughput: 0: 50348.1. Samples: 337413320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 18:13:40,940][70768] Avg episode reward: [(0, '0.195')] [2024-06-12 18:13:41,064][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000049354_808615936.pth... 
[2024-06-12 18:13:41,067][71000] Updated weights for policy 0, policy_version 49354 (0.0034) [2024-06-12 18:13:41,113][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000048622_796622848.pth [2024-06-12 18:13:44,745][71000] Updated weights for policy 0, policy_version 49364 (0.0028) [2024-06-12 18:13:45,940][70768] Fps is (10 sec: 50790.0, 60 sec: 50517.2, 300 sec: 49874.0). Total num frames: 808845312. Throughput: 0: 50207.7. Samples: 337718800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 18:13:45,940][70768] Avg episode reward: [(0, '0.207')] [2024-06-12 18:13:47,527][71000] Updated weights for policy 0, policy_version 49374 (0.0031) [2024-06-12 18:13:50,939][70768] Fps is (10 sec: 47514.1, 60 sec: 50244.4, 300 sec: 49763.0). Total num frames: 809074688. Throughput: 0: 50200.5. Samples: 337858260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 18:13:50,940][70768] Avg episode reward: [(0, '0.199')] [2024-06-12 18:13:51,394][71000] Updated weights for policy 0, policy_version 49384 (0.0025) [2024-06-12 18:13:54,361][71000] Updated weights for policy 0, policy_version 49394 (0.0032) [2024-06-12 18:13:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49971.3, 300 sec: 49818.5). Total num frames: 809336832. Throughput: 0: 49990.3. Samples: 338153220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:13:55,940][70768] Avg episode reward: [(0, '0.214')] [2024-06-12 18:13:57,755][71000] Updated weights for policy 0, policy_version 49404 (0.0030) [2024-06-12 18:14:00,734][71000] Updated weights for policy 0, policy_version 49414 (0.0033) [2024-06-12 18:14:00,939][70768] Fps is (10 sec: 52428.9, 60 sec: 50244.5, 300 sec: 49985.1). Total num frames: 809598976. Throughput: 0: 50219.3. Samples: 338456620. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:14:00,940][70768] Avg episode reward: [(0, '0.201')] [2024-06-12 18:14:04,471][71000] Updated weights for policy 0, policy_version 49424 (0.0038) [2024-06-12 18:14:05,940][70768] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 49874.0). Total num frames: 809844736. Throughput: 0: 50304.9. Samples: 338612840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:14:05,940][70768] Avg episode reward: [(0, '0.200')] [2024-06-12 18:14:07,153][71000] Updated weights for policy 0, policy_version 49434 (0.0026) [2024-06-12 18:14:10,799][71000] Updated weights for policy 0, policy_version 49444 (0.0033) [2024-06-12 18:14:10,940][70768] Fps is (10 sec: 49151.1, 60 sec: 50244.2, 300 sec: 49818.4). Total num frames: 810090496. Throughput: 0: 50293.7. Samples: 338912820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:14:10,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 18:14:14,118][71000] Updated weights for policy 0, policy_version 49454 (0.0036) [2024-06-12 18:14:15,940][70768] Fps is (10 sec: 50789.8, 60 sec: 50244.3, 300 sec: 49929.5). Total num frames: 810352640. Throughput: 0: 50182.5. Samples: 339212360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:14:15,940][70768] Avg episode reward: [(0, '0.189')] [2024-06-12 18:14:17,527][71000] Updated weights for policy 0, policy_version 49464 (0.0027) [2024-06-12 18:14:20,409][71000] Updated weights for policy 0, policy_version 49474 (0.0022) [2024-06-12 18:14:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 50244.1, 300 sec: 49985.1). Total num frames: 810598400. Throughput: 0: 50299.3. Samples: 339364680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:14:20,940][70768] Avg episode reward: [(0, '0.206')] [2024-06-12 18:14:23,732][71000] Updated weights for policy 0, policy_version 49484 (0.0038) [2024-06-12 18:14:25,940][70768] Fps is (10 sec: 49152.1, 60 sec: 50519.3, 300 sec: 49874.0). 
Total num frames: 810844160. Throughput: 0: 50136.4. Samples: 339669460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 18:14:25,940][70768] Avg episode reward: [(0, '0.214')] [2024-06-12 18:14:26,841][71000] Updated weights for policy 0, policy_version 49494 (0.0029) [2024-06-12 18:14:30,487][71000] Updated weights for policy 0, policy_version 49504 (0.0034) [2024-06-12 18:14:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 50790.2, 300 sec: 49930.0). Total num frames: 811106304. Throughput: 0: 50122.0. Samples: 339974300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 18:14:30,941][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 18:14:33,486][71000] Updated weights for policy 0, policy_version 49514 (0.0032) [2024-06-12 18:14:35,940][70768] Fps is (10 sec: 50791.0, 60 sec: 50244.2, 300 sec: 49929.6). Total num frames: 811352064. Throughput: 0: 50269.7. Samples: 340120400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 18:14:35,940][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 18:14:37,024][71000] Updated weights for policy 0, policy_version 49524 (0.0028) [2024-06-12 18:14:40,338][71000] Updated weights for policy 0, policy_version 49534 (0.0039) [2024-06-12 18:14:40,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49971.2, 300 sec: 49929.5). Total num frames: 811597824. Throughput: 0: 50407.9. Samples: 340421580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 18:14:40,940][70768] Avg episode reward: [(0, '0.220')] [2024-06-12 18:14:41,863][70980] Signal inference workers to stop experience collection... (5000 times) [2024-06-12 18:14:41,864][70980] Signal inference workers to resume experience collection... 
(5000 times) [2024-06-12 18:14:41,877][71000] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-06-12 18:14:41,878][71000] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-06-12 18:14:43,361][71000] Updated weights for policy 0, policy_version 49544 (0.0035) [2024-06-12 18:14:45,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49698.0, 300 sec: 49874.0). Total num frames: 811827200. Throughput: 0: 50269.1. Samples: 340718740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 18:14:45,940][70768] Avg episode reward: [(0, '0.197')] [2024-06-12 18:14:46,815][71000] Updated weights for policy 0, policy_version 49554 (0.0032) [2024-06-12 18:14:50,025][71000] Updated weights for policy 0, policy_version 49564 (0.0035) [2024-06-12 18:14:50,939][70768] Fps is (10 sec: 50791.3, 60 sec: 50517.4, 300 sec: 49874.0). Total num frames: 812105728. Throughput: 0: 50115.2. Samples: 340868020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 18:14:50,940][70768] Avg episode reward: [(0, '0.208')] [2024-06-12 18:14:53,132][71000] Updated weights for policy 0, policy_version 49574 (0.0038) [2024-06-12 18:14:55,940][70768] Fps is (10 sec: 52428.7, 60 sec: 50244.1, 300 sec: 49929.5). Total num frames: 812351488. Throughput: 0: 49963.0. Samples: 341161160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 18:14:55,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 18:14:56,611][71000] Updated weights for policy 0, policy_version 49584 (0.0026) [2024-06-12 18:15:00,001][71000] Updated weights for policy 0, policy_version 49594 (0.0025) [2024-06-12 18:15:00,939][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.1, 300 sec: 49929.5). Total num frames: 812580864. Throughput: 0: 50058.9. Samples: 341465000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 18:15:00,940][70768] Avg episode reward: [(0, '0.196')] [2024-06-12 18:15:03,001][71000] Updated weights for policy 0, policy_version 49604 (0.0026) [2024-06-12 18:15:05,939][70768] Fps is (10 sec: 47514.7, 60 sec: 49698.2, 300 sec: 49929.6). Total num frames: 812826624. Throughput: 0: 49886.9. Samples: 341609580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 18:15:05,940][70768] Avg episode reward: [(0, '0.195')] [2024-06-12 18:15:06,766][71000] Updated weights for policy 0, policy_version 49614 (0.0031) [2024-06-12 18:15:09,606][71000] Updated weights for policy 0, policy_version 49624 (0.0028) [2024-06-12 18:15:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49971.3, 300 sec: 49874.0). Total num frames: 813088768. Throughput: 0: 49856.2. Samples: 341912980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 18:15:10,940][70768] Avg episode reward: [(0, '0.199')] [2024-06-12 18:15:13,353][71000] Updated weights for policy 0, policy_version 49634 (0.0032) [2024-06-12 18:15:15,939][70768] Fps is (10 sec: 52428.8, 60 sec: 49971.3, 300 sec: 49929.5). Total num frames: 813350912. Throughput: 0: 49696.7. Samples: 342210640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 18:15:15,940][70768] Avg episode reward: [(0, '0.204')] [2024-06-12 18:15:16,116][71000] Updated weights for policy 0, policy_version 49644 (0.0023) [2024-06-12 18:15:19,764][71000] Updated weights for policy 0, policy_version 49654 (0.0032) [2024-06-12 18:15:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.3, 300 sec: 49874.0). Total num frames: 813580288. Throughput: 0: 49917.4. Samples: 342366680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 18:15:20,940][70768] Avg episode reward: [(0, '0.207')] [2024-06-12 18:15:22,521][71000] Updated weights for policy 0, policy_version 49664 (0.0027) [2024-06-12 18:15:25,940][70768] Fps is (10 sec: 47512.6, 60 sec: 49698.1, 300 sec: 49985.0). 
Total num frames: 813826048. Throughput: 0: 49904.4. Samples: 342667280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 18:15:25,940][70768] Avg episode reward: [(0, '0.204')] [2024-06-12 18:15:26,375][71000] Updated weights for policy 0, policy_version 49674 (0.0023) [2024-06-12 18:15:29,083][71000] Updated weights for policy 0, policy_version 49684 (0.0027) [2024-06-12 18:15:30,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.3, 300 sec: 49818.5). Total num frames: 814071808. Throughput: 0: 49947.8. Samples: 342966380. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-12 18:15:30,940][70768] Avg episode reward: [(0, '0.201')] [2024-06-12 18:15:32,994][71000] Updated weights for policy 0, policy_version 49694 (0.0026) [2024-06-12 18:15:35,434][71000] Updated weights for policy 0, policy_version 49704 (0.0022) [2024-06-12 18:15:35,939][70768] Fps is (10 sec: 54068.4, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 814366720. Throughput: 0: 50093.3. Samples: 343122220. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-12 18:15:35,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 18:15:39,289][71000] Updated weights for policy 0, policy_version 49714 (0.0028) [2024-06-12 18:15:40,940][70768] Fps is (10 sec: 55705.2, 60 sec: 50517.4, 300 sec: 50096.2). Total num frames: 814628864. Throughput: 0: 50305.9. Samples: 343424920. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-12 18:15:40,949][70768] Avg episode reward: [(0, '0.198')] [2024-06-12 18:15:40,957][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000049721_814628864.pth... [2024-06-12 18:15:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000048986_802586624.pth [2024-06-12 18:15:42,228][71000] Updated weights for policy 0, policy_version 49724 (0.0028) [2024-06-12 18:15:45,940][70768] Fps is (10 sec: 44236.0, 60 sec: 49698.2, 300 sec: 49874.0). Total num frames: 814809088. Throughput: 0: 50099.4. 
Samples: 343719480. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-12 18:15:45,940][70768] Avg episode reward: [(0, '0.193')] [2024-06-12 18:15:46,158][71000] Updated weights for policy 0, policy_version 49734 (0.0028) [2024-06-12 18:15:46,957][70980] Signal inference workers to stop experience collection... (5050 times) [2024-06-12 18:15:47,004][71000] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-06-12 18:15:47,070][70980] Signal inference workers to resume experience collection... (5050 times) [2024-06-12 18:15:47,071][71000] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-06-12 18:15:48,751][71000] Updated weights for policy 0, policy_version 49744 (0.0026) [2024-06-12 18:15:50,939][70768] Fps is (10 sec: 45875.5, 60 sec: 49698.1, 300 sec: 49929.5). Total num frames: 815087616. Throughput: 0: 50093.8. Samples: 343863800. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-12 18:15:50,940][70768] Avg episode reward: [(0, '0.198')] [2024-06-12 18:15:52,636][71000] Updated weights for policy 0, policy_version 49754 (0.0025) [2024-06-12 18:15:55,401][71000] Updated weights for policy 0, policy_version 49764 (0.0034) [2024-06-12 18:15:55,939][70768] Fps is (10 sec: 54068.3, 60 sec: 49971.4, 300 sec: 50040.6). Total num frames: 815349760. Throughput: 0: 49875.2. Samples: 344157360. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-12 18:15:55,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 18:15:59,312][71000] Updated weights for policy 0, policy_version 49774 (0.0035) [2024-06-12 18:16:00,939][70768] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 49985.1). Total num frames: 815595520. Throughput: 0: 50023.6. Samples: 344461700. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:16:00,940][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 18:16:01,791][71000] Updated weights for policy 0, policy_version 49784 (0.0022) [2024-06-12 18:16:05,661][71000] Updated weights for policy 0, policy_version 49794 (0.0029) [2024-06-12 18:16:05,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49971.2, 300 sec: 49874.0). Total num frames: 815824896. Throughput: 0: 49809.7. Samples: 344608120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:16:05,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 18:16:08,372][71000] Updated weights for policy 0, policy_version 49804 (0.0040) [2024-06-12 18:16:10,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49698.1, 300 sec: 49874.0). Total num frames: 816070656. Throughput: 0: 49764.6. Samples: 344906680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:16:10,940][70768] Avg episode reward: [(0, '0.199')] [2024-06-12 18:16:12,501][71000] Updated weights for policy 0, policy_version 49814 (0.0028) [2024-06-12 18:16:15,078][71000] Updated weights for policy 0, policy_version 49824 (0.0024) [2024-06-12 18:16:15,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 816349184. Throughput: 0: 49646.2. Samples: 345200460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:16:15,940][70768] Avg episode reward: [(0, '0.210')] [2024-06-12 18:16:18,786][71000] Updated weights for policy 0, policy_version 49834 (0.0027) [2024-06-12 18:16:20,939][70768] Fps is (10 sec: 52429.4, 60 sec: 50244.3, 300 sec: 50040.7). Total num frames: 816594944. Throughput: 0: 49815.6. Samples: 345363920. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:16:20,940][70768] Avg episode reward: [(0, '0.217')] [2024-06-12 18:16:21,487][71000] Updated weights for policy 0, policy_version 49844 (0.0028) [2024-06-12 18:16:25,358][71000] Updated weights for policy 0, policy_version 49854 (0.0024) [2024-06-12 18:16:25,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49971.3, 300 sec: 49874.0). Total num frames: 816824320. Throughput: 0: 49733.4. Samples: 345662920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:16:25,940][70768] Avg episode reward: [(0, '0.215')] [2024-06-12 18:16:27,989][71000] Updated weights for policy 0, policy_version 49864 (0.0030) [2024-06-12 18:16:30,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49971.1, 300 sec: 49985.1). Total num frames: 817070080. Throughput: 0: 49829.4. Samples: 345961800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 18:16:30,940][70768] Avg episode reward: [(0, '0.217')] [2024-06-12 18:16:31,785][71000] Updated weights for policy 0, policy_version 49874 (0.0033) [2024-06-12 18:16:34,577][71000] Updated weights for policy 0, policy_version 49884 (0.0032) [2024-06-12 18:16:35,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 817348608. Throughput: 0: 49699.5. Samples: 346100280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 18:16:35,940][70768] Avg episode reward: [(0, '0.211')] [2024-06-12 18:16:38,388][71000] Updated weights for policy 0, policy_version 49894 (0.0030) [2024-06-12 18:16:40,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49698.1, 300 sec: 50096.1). Total num frames: 817610752. Throughput: 0: 49943.4. Samples: 346404820. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 18:16:40,940][70768] Avg episode reward: [(0, '0.210')] [2024-06-12 18:16:41,158][71000] Updated weights for policy 0, policy_version 49904 (0.0028) [2024-06-12 18:16:45,226][71000] Updated weights for policy 0, policy_version 49914 (0.0031) [2024-06-12 18:16:45,939][70768] Fps is (10 sec: 47513.7, 60 sec: 50244.4, 300 sec: 49929.6). Total num frames: 817823744. Throughput: 0: 49873.3. Samples: 346706000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 18:16:45,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 18:16:47,723][71000] Updated weights for policy 0, policy_version 49924 (0.0025) [2024-06-12 18:16:50,939][70768] Fps is (10 sec: 44237.2, 60 sec: 49425.1, 300 sec: 49929.5). Total num frames: 818053120. Throughput: 0: 49729.4. Samples: 346845940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 18:16:50,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 18:16:51,580][71000] Updated weights for policy 0, policy_version 49934 (0.0028) [2024-06-12 18:16:53,031][70980] Signal inference workers to stop experience collection... (5100 times) [2024-06-12 18:16:53,032][70980] Signal inference workers to resume experience collection... (5100 times) [2024-06-12 18:16:53,066][71000] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-06-12 18:16:53,066][71000] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-06-12 18:16:54,442][71000] Updated weights for policy 0, policy_version 49944 (0.0024) [2024-06-12 18:16:55,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 818331648. Throughput: 0: 49921.8. Samples: 347153160. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 18:16:55,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 18:16:58,281][71000] Updated weights for policy 0, policy_version 49954 (0.0038) [2024-06-12 18:17:00,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49698.0, 300 sec: 50040.6). Total num frames: 818577408. Throughput: 0: 49934.5. Samples: 347447520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 18:17:00,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 18:17:01,251][71000] Updated weights for policy 0, policy_version 49964 (0.0032) [2024-06-12 18:17:04,643][71000] Updated weights for policy 0, policy_version 49974 (0.0028) [2024-06-12 18:17:05,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 818823168. Throughput: 0: 49729.7. Samples: 347601760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 18:17:05,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 18:17:07,673][71000] Updated weights for policy 0, policy_version 49984 (0.0025) [2024-06-12 18:17:10,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49971.3, 300 sec: 49874.0). Total num frames: 819068928. Throughput: 0: 49628.9. Samples: 347896220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 18:17:10,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 18:17:11,391][71000] Updated weights for policy 0, policy_version 49994 (0.0029) [2024-06-12 18:17:14,365][71000] Updated weights for policy 0, policy_version 50004 (0.0033) [2024-06-12 18:17:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49985.1). Total num frames: 819314688. Throughput: 0: 49608.9. Samples: 348194200. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 18:17:15,940][70768] Avg episode reward: [(0, '0.214')] [2024-06-12 18:17:17,844][71000] Updated weights for policy 0, policy_version 50014 (0.0028) [2024-06-12 18:17:20,897][71000] Updated weights for policy 0, policy_version 50024 (0.0020) [2024-06-12 18:17:20,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49971.0, 300 sec: 50096.1). Total num frames: 819593216. Throughput: 0: 49768.3. Samples: 348339860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 18:17:20,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 18:17:24,495][71000] Updated weights for policy 0, policy_version 50034 (0.0026) [2024-06-12 18:17:25,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49698.0, 300 sec: 49874.0). Total num frames: 819806208. Throughput: 0: 49630.2. Samples: 348638180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 18:17:25,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 18:17:27,915][71000] Updated weights for policy 0, policy_version 50044 (0.0044) [2024-06-12 18:17:30,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49698.2, 300 sec: 49818.5). Total num frames: 820051968. Throughput: 0: 49548.4. Samples: 348935680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 18:17:30,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 18:17:31,223][71000] Updated weights for policy 0, policy_version 50054 (0.0037) [2024-06-12 18:17:34,352][71000] Updated weights for policy 0, policy_version 50064 (0.0036) [2024-06-12 18:17:35,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 49929.6). Total num frames: 820297728. Throughput: 0: 49738.7. Samples: 349084180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 18:17:35,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 18:17:37,763][71000] Updated weights for policy 0, policy_version 50074 (0.0038) [2024-06-12 18:17:40,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.1, 300 sec: 49985.1). 
Total num frames: 820559872. Throughput: 0: 49344.1. Samples: 349373640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 18:17:40,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 18:17:40,960][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000050084_820576256.pth... [2024-06-12 18:17:40,964][71000] Updated weights for policy 0, policy_version 50084 (0.0024) [2024-06-12 18:17:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000049354_808615936.pth [2024-06-12 18:17:44,373][71000] Updated weights for policy 0, policy_version 50094 (0.0028) [2024-06-12 18:17:45,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 820805632. Throughput: 0: 49477.9. Samples: 349674020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 18:17:45,940][70768] Avg episode reward: [(0, '0.213')] [2024-06-12 18:17:47,814][71000] Updated weights for policy 0, policy_version 50104 (0.0027) [2024-06-12 18:17:50,925][71000] Updated weights for policy 0, policy_version 50114 (0.0037) [2024-06-12 18:17:50,940][70768] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 49929.6). Total num frames: 821067776. Throughput: 0: 49487.5. Samples: 349828700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 18:17:50,940][70768] Avg episode reward: [(0, '0.218')] [2024-06-12 18:17:54,757][71000] Updated weights for policy 0, policy_version 50124 (0.0035) [2024-06-12 18:17:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.2, 300 sec: 49929.6). Total num frames: 821313536. Throughput: 0: 49504.4. Samples: 350123920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 18:17:55,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 18:17:57,400][71000] Updated weights for policy 0, policy_version 50134 (0.0029) [2024-06-12 18:18:00,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.1, 300 sec: 49818.5). Total num frames: 821526528. Throughput: 0: 49559.1. 
Samples: 350424360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 18:18:00,940][70768] Avg episode reward: [(0, '0.220')] [2024-06-12 18:18:01,198][71000] Updated weights for policy 0, policy_version 50144 (0.0038) [2024-06-12 18:18:03,935][71000] Updated weights for policy 0, policy_version 50154 (0.0038) [2024-06-12 18:18:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 49929.6). Total num frames: 821805056. Throughput: 0: 49612.1. Samples: 350572400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 18:18:05,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 18:18:07,742][71000] Updated weights for policy 0, policy_version 50164 (0.0032) [2024-06-12 18:18:10,468][70980] Signal inference workers to stop experience collection... (5150 times) [2024-06-12 18:18:10,468][70980] Signal inference workers to resume experience collection... (5150 times) [2024-06-12 18:18:10,504][71000] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-06-12 18:18:10,504][71000] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-06-12 18:18:10,598][71000] Updated weights for policy 0, policy_version 50174 (0.0029) [2024-06-12 18:18:10,940][70768] Fps is (10 sec: 54067.3, 60 sec: 49971.2, 300 sec: 49929.6). Total num frames: 822067200. Throughput: 0: 49465.5. Samples: 350864120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 18:18:10,948][70768] Avg episode reward: [(0, '0.219')] [2024-06-12 18:18:14,511][71000] Updated weights for policy 0, policy_version 50184 (0.0028) [2024-06-12 18:18:15,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.2, 300 sec: 49874.0). Total num frames: 822296576. Throughput: 0: 49506.7. Samples: 351163480. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 18:18:15,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 18:18:17,219][71000] Updated weights for policy 0, policy_version 50194 (0.0033) [2024-06-12 18:18:20,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48879.0, 300 sec: 49874.4). Total num frames: 822525952. Throughput: 0: 49439.5. Samples: 351308960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 18:18:20,940][70768] Avg episode reward: [(0, '0.217')] [2024-06-12 18:18:21,093][71000] Updated weights for policy 0, policy_version 50204 (0.0031) [2024-06-12 18:18:23,737][71000] Updated weights for policy 0, policy_version 50214 (0.0027) [2024-06-12 18:18:25,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.3, 300 sec: 49929.6). Total num frames: 822788096. Throughput: 0: 49436.0. Samples: 351598260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 18:18:25,940][70768] Avg episode reward: [(0, '0.212')] [2024-06-12 18:18:27,711][71000] Updated weights for policy 0, policy_version 50224 (0.0023) [2024-06-12 18:18:30,347][71000] Updated weights for policy 0, policy_version 50234 (0.0031) [2024-06-12 18:18:30,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49971.1, 300 sec: 49874.0). Total num frames: 823050240. Throughput: 0: 49509.3. Samples: 351901940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 18:18:30,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 18:18:34,103][71000] Updated weights for policy 0, policy_version 50244 (0.0032) [2024-06-12 18:18:35,939][70768] Fps is (10 sec: 52428.6, 60 sec: 50244.3, 300 sec: 49874.0). Total num frames: 823312384. Throughput: 0: 49687.6. Samples: 352064640. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 18:18:35,940][70768] Avg episode reward: [(0, '0.222')] [2024-06-12 18:18:36,758][71000] Updated weights for policy 0, policy_version 50254 (0.0021) [2024-06-12 18:18:40,605][71000] Updated weights for policy 0, policy_version 50264 (0.0036) [2024-06-12 18:18:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49424.9, 300 sec: 49762.9). Total num frames: 823525376. Throughput: 0: 49607.0. Samples: 352356240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 18:18:40,940][70768] Avg episode reward: [(0, '0.220')] [2024-06-12 18:18:43,174][71000] Updated weights for policy 0, policy_version 50274 (0.0029) [2024-06-12 18:18:45,939][70768] Fps is (10 sec: 45875.1, 60 sec: 49425.1, 300 sec: 49818.5). Total num frames: 823771136. Throughput: 0: 49524.0. Samples: 352652940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 18:18:45,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 18:18:47,526][71000] Updated weights for policy 0, policy_version 50284 (0.0028) [2024-06-12 18:18:50,177][71000] Updated weights for policy 0, policy_version 50294 (0.0030) [2024-06-12 18:18:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.0, 300 sec: 49818.5). Total num frames: 824033280. Throughput: 0: 49491.9. Samples: 352799540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 18:18:50,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 18:18:54,180][71000] Updated weights for policy 0, policy_version 50304 (0.0025) [2024-06-12 18:18:55,940][70768] Fps is (10 sec: 50786.9, 60 sec: 49424.5, 300 sec: 49762.8). Total num frames: 824279040. Throughput: 0: 49601.4. Samples: 353096220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 18:18:55,941][70768] Avg episode reward: [(0, '0.214')] [2024-06-12 18:18:56,663][71000] Updated weights for policy 0, policy_version 50314 (0.0027) [2024-06-12 18:19:00,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49425.0, 300 sec: 49651.9). 
Total num frames: 824492032. Throughput: 0: 49750.2. Samples: 353402240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 18:19:00,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 18:19:01,059][71000] Updated weights for policy 0, policy_version 50324 (0.0035) [2024-06-12 18:19:03,162][71000] Updated weights for policy 0, policy_version 50334 (0.0033) [2024-06-12 18:19:05,940][70768] Fps is (10 sec: 49155.3, 60 sec: 49425.1, 300 sec: 49762.9). Total num frames: 824770560. Throughput: 0: 49410.2. Samples: 353532420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 18:19:05,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 18:19:07,443][71000] Updated weights for policy 0, policy_version 50344 (0.0032) [2024-06-12 18:19:09,978][71000] Updated weights for policy 0, policy_version 50354 (0.0027) [2024-06-12 18:19:10,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48878.9, 300 sec: 49651.9). Total num frames: 824999936. Throughput: 0: 49488.4. Samples: 353825240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 18:19:10,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 18:19:14,124][71000] Updated weights for policy 0, policy_version 50364 (0.0025) [2024-06-12 18:19:15,907][70980] Signal inference workers to stop experience collection... (5200 times) [2024-06-12 18:19:15,917][71000] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-06-12 18:19:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49707.4). Total num frames: 825262080. Throughput: 0: 49548.9. Samples: 354131640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 18:19:15,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 18:19:15,963][70980] Signal inference workers to resume experience collection... 
(5200 times) [2024-06-12 18:19:15,964][71000] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-06-12 18:19:16,754][71000] Updated weights for policy 0, policy_version 50374 (0.0041) [2024-06-12 18:19:20,798][71000] Updated weights for policy 0, policy_version 50384 (0.0025) [2024-06-12 18:19:20,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49651.9). Total num frames: 825491456. Throughput: 0: 49068.5. Samples: 354272720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 18:19:20,940][70768] Avg episode reward: [(0, '0.217')] [2024-06-12 18:19:23,285][71000] Updated weights for policy 0, policy_version 50394 (0.0027) [2024-06-12 18:19:25,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49596.4). Total num frames: 825737216. Throughput: 0: 49403.7. Samples: 354579400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 18:19:25,940][70768] Avg episode reward: [(0, '0.213')] [2024-06-12 18:19:27,295][71000] Updated weights for policy 0, policy_version 50404 (0.0027) [2024-06-12 18:19:29,919][71000] Updated weights for policy 0, policy_version 50414 (0.0035) [2024-06-12 18:19:30,940][70768] Fps is (10 sec: 54066.1, 60 sec: 49698.1, 300 sec: 49762.9). Total num frames: 826032128. Throughput: 0: 49510.5. Samples: 354880920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 18:19:30,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 18:19:33,902][71000] Updated weights for policy 0, policy_version 50424 (0.0035) [2024-06-12 18:19:35,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49425.0, 300 sec: 49763.0). Total num frames: 826277888. Throughput: 0: 49811.7. Samples: 355041060. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 18:19:35,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 18:19:36,375][71000] Updated weights for policy 0, policy_version 50434 (0.0031) [2024-06-12 18:19:40,374][71000] Updated weights for policy 0, policy_version 50444 (0.0028) [2024-06-12 18:19:40,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49698.2, 300 sec: 49763.0). Total num frames: 826507264. Throughput: 0: 49774.5. Samples: 355336040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:19:40,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 18:19:40,959][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000050446_826507264.pth... [2024-06-12 18:19:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000049721_814628864.pth [2024-06-12 18:19:43,273][71000] Updated weights for policy 0, policy_version 50454 (0.0029) [2024-06-12 18:19:45,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 826736640. Throughput: 0: 49475.5. Samples: 355628640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:19:45,940][70768] Avg episode reward: [(0, '0.221')] [2024-06-12 18:19:47,135][71000] Updated weights for policy 0, policy_version 50464 (0.0033) [2024-06-12 18:19:49,898][71000] Updated weights for policy 0, policy_version 50474 (0.0020) [2024-06-12 18:19:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 827015168. Throughput: 0: 49882.2. Samples: 355777120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:19:50,940][70768] Avg episode reward: [(0, '0.223')] [2024-06-12 18:19:53,716][71000] Updated weights for policy 0, policy_version 50484 (0.0025) [2024-06-12 18:19:55,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.7, 300 sec: 49762.9). Total num frames: 827260928. Throughput: 0: 50125.3. Samples: 356080880. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:19:55,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:19:56,405][71000] Updated weights for policy 0, policy_version 50494 (0.0028) [2024-06-12 18:20:00,246][71000] Updated weights for policy 0, policy_version 50504 (0.0022) [2024-06-12 18:20:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49971.2, 300 sec: 49707.4). Total num frames: 827490304. Throughput: 0: 49903.9. Samples: 356377320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:20:00,940][70768] Avg episode reward: [(0, '0.229')] [2024-06-12 18:20:02,836][71000] Updated weights for policy 0, policy_version 50514 (0.0029) [2024-06-12 18:20:05,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49425.0, 300 sec: 49651.8). Total num frames: 827736064. Throughput: 0: 49864.2. Samples: 356516620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:20:05,940][70768] Avg episode reward: [(0, '0.199')] [2024-06-12 18:20:06,820][71000] Updated weights for policy 0, policy_version 50524 (0.0033) [2024-06-12 18:20:09,634][71000] Updated weights for policy 0, policy_version 50534 (0.0035) [2024-06-12 18:20:10,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49971.2, 300 sec: 49651.9). Total num frames: 827998208. Throughput: 0: 49674.7. Samples: 356814760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:20:10,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 18:20:13,261][71000] Updated weights for policy 0, policy_version 50544 (0.0025) [2024-06-12 18:20:15,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.1, 300 sec: 49707.4). Total num frames: 828243968. Throughput: 0: 49621.9. Samples: 357113900. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:20:15,940][70768] Avg episode reward: [(0, '0.216')] [2024-06-12 18:20:16,257][71000] Updated weights for policy 0, policy_version 50554 (0.0032) [2024-06-12 18:20:19,962][71000] Updated weights for policy 0, policy_version 50564 (0.0032) [2024-06-12 18:20:20,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 49707.4). Total num frames: 828489728. Throughput: 0: 49397.8. Samples: 357263960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:20:20,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:20:22,914][71000] Updated weights for policy 0, policy_version 50574 (0.0026) [2024-06-12 18:20:25,868][70980] Signal inference workers to stop experience collection... (5250 times) [2024-06-12 18:20:25,868][70980] Signal inference workers to resume experience collection... (5250 times) [2024-06-12 18:20:25,903][71000] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-06-12 18:20:25,903][71000] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-06-12 18:20:25,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 828719104. Throughput: 0: 49436.5. Samples: 357560680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:20:25,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 18:20:26,537][71000] Updated weights for policy 0, policy_version 50584 (0.0024) [2024-06-12 18:20:29,351][71000] Updated weights for policy 0, policy_version 50594 (0.0039) [2024-06-12 18:20:30,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49540.8). Total num frames: 828981248. Throughput: 0: 49438.8. Samples: 357853380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:20:30,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:20:33,119][71000] Updated weights for policy 0, policy_version 50604 (0.0025) [2024-06-12 18:20:35,822][71000] Updated weights for policy 0, policy_version 50614 (0.0033) [2024-06-12 18:20:35,939][70768] Fps is (10 sec: 54067.3, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 829259776. Throughput: 0: 49603.6. Samples: 358009280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:20:35,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 18:20:39,420][71000] Updated weights for policy 0, policy_version 50624 (0.0028) [2024-06-12 18:20:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.1, 300 sec: 49707.4). Total num frames: 829472768. Throughput: 0: 49524.8. Samples: 358309500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 18:20:40,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 18:20:42,548][71000] Updated weights for policy 0, policy_version 50634 (0.0025) [2024-06-12 18:20:45,939][70768] Fps is (10 sec: 45875.3, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 829718528. Throughput: 0: 49544.1. Samples: 358606800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 18:20:45,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 18:20:46,306][71000] Updated weights for policy 0, policy_version 50644 (0.0033) [2024-06-12 18:20:49,503][71000] Updated weights for policy 0, policy_version 50654 (0.0033) [2024-06-12 18:20:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 829964288. Throughput: 0: 49561.5. Samples: 358746880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 18:20:50,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:20:52,907][71000] Updated weights for policy 0, policy_version 50664 (0.0036) [2024-06-12 18:20:55,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49596.3). 
Total num frames: 830226432. Throughput: 0: 49432.4. Samples: 359039220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 18:20:55,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 18:20:56,085][71000] Updated weights for policy 0, policy_version 50674 (0.0031) [2024-06-12 18:20:59,505][71000] Updated weights for policy 0, policy_version 50684 (0.0024) [2024-06-12 18:21:00,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 830455808. Throughput: 0: 49578.3. Samples: 359344920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 18:21:00,940][70768] Avg episode reward: [(0, '0.224')] [2024-06-12 18:21:02,471][71000] Updated weights for policy 0, policy_version 50694 (0.0033) [2024-06-12 18:21:05,927][71000] Updated weights for policy 0, policy_version 50704 (0.0034) [2024-06-12 18:21:05,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 830734336. Throughput: 0: 49586.4. Samples: 359495360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 18:21:05,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:21:09,036][71000] Updated weights for policy 0, policy_version 50714 (0.0031) [2024-06-12 18:21:10,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 830980096. Throughput: 0: 49584.9. Samples: 359792000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 18:21:10,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:21:12,696][71000] Updated weights for policy 0, policy_version 50724 (0.0039) [2024-06-12 18:21:15,743][71000] Updated weights for policy 0, policy_version 50734 (0.0032) [2024-06-12 18:21:15,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 831225856. Throughput: 0: 49683.8. Samples: 360089160. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 18:21:15,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 18:21:19,427][71000] Updated weights for policy 0, policy_version 50744 (0.0031) [2024-06-12 18:21:20,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 831455232. Throughput: 0: 49480.5. Samples: 360235900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 18:21:20,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 18:21:22,590][71000] Updated weights for policy 0, policy_version 50754 (0.0025) [2024-06-12 18:21:25,708][71000] Updated weights for policy 0, policy_version 50764 (0.0022) [2024-06-12 18:21:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 49651.9). Total num frames: 831717376. Throughput: 0: 49481.3. Samples: 360536160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 18:21:25,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:21:28,998][71000] Updated weights for policy 0, policy_version 50774 (0.0030) [2024-06-12 18:21:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 831946752. Throughput: 0: 49522.6. Samples: 360835320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 18:21:30,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:21:32,334][71000] Updated weights for policy 0, policy_version 50784 (0.0023) [2024-06-12 18:21:35,380][71000] Updated weights for policy 0, policy_version 50794 (0.0030) [2024-06-12 18:21:35,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49424.9, 300 sec: 49540.8). Total num frames: 832225280. Throughput: 0: 49717.2. Samples: 360984160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 18:21:35,949][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 18:21:38,849][71000] Updated weights for policy 0, policy_version 50804 (0.0026) [2024-06-12 18:21:40,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 49540.8). 
Total num frames: 832438272. Throughput: 0: 49774.7. Samples: 361279080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 18:21:40,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:21:41,006][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000050809_832454656.pth... [2024-06-12 18:21:41,060][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000050084_820576256.pth [2024-06-12 18:21:42,252][71000] Updated weights for policy 0, policy_version 50814 (0.0031) [2024-06-12 18:21:45,779][71000] Updated weights for policy 0, policy_version 50824 (0.0032) [2024-06-12 18:21:45,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49698.0, 300 sec: 49651.8). Total num frames: 832700416. Throughput: 0: 49391.0. Samples: 361567520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 18:21:45,942][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:21:47,738][70980] Signal inference workers to stop experience collection... (5300 times) [2024-06-12 18:21:47,738][70980] Signal inference workers to resume experience collection... (5300 times) [2024-06-12 18:21:47,757][71000] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-06-12 18:21:47,757][71000] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-06-12 18:21:49,171][71000] Updated weights for policy 0, policy_version 50834 (0.0025) [2024-06-12 18:21:50,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 832929792. Throughput: 0: 49288.2. Samples: 361713320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 18:21:50,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 18:21:52,285][71000] Updated weights for policy 0, policy_version 50844 (0.0028) [2024-06-12 18:21:55,603][71000] Updated weights for policy 0, policy_version 50854 (0.0029) [2024-06-12 18:21:55,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49424.9, 300 sec: 49540.8). 
Total num frames: 833191936. Throughput: 0: 49378.9. Samples: 362014060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 18:21:55,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:21:58,720][71000] Updated weights for policy 0, policy_version 50864 (0.0024) [2024-06-12 18:22:00,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 833404928. Throughput: 0: 49484.5. Samples: 362315960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 18:22:00,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 18:22:02,352][71000] Updated weights for policy 0, policy_version 50874 (0.0030) [2024-06-12 18:22:05,532][71000] Updated weights for policy 0, policy_version 50884 (0.0035) [2024-06-12 18:22:05,940][70768] Fps is (10 sec: 50791.4, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 833699840. Throughput: 0: 49467.5. Samples: 362461940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 18:22:05,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 18:22:09,073][71000] Updated weights for policy 0, policy_version 50894 (0.0027) [2024-06-12 18:22:10,939][70768] Fps is (10 sec: 52429.7, 60 sec: 49152.1, 300 sec: 49540.8). Total num frames: 833929216. Throughput: 0: 49098.4. Samples: 362745580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 18:22:10,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-12 18:22:12,134][71000] Updated weights for policy 0, policy_version 50904 (0.0024) [2024-06-12 18:22:15,846][71000] Updated weights for policy 0, policy_version 50914 (0.0029) [2024-06-12 18:22:15,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 834174976. Throughput: 0: 49251.0. Samples: 363051620. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 18:22:15,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 18:22:18,575][71000] Updated weights for policy 0, policy_version 50924 (0.0029) [2024-06-12 18:22:20,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 834420736. Throughput: 0: 49046.4. Samples: 363191240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 18:22:20,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:22:22,484][71000] Updated weights for policy 0, policy_version 50934 (0.0030) [2024-06-12 18:22:25,014][71000] Updated weights for policy 0, policy_version 50944 (0.0020) [2024-06-12 18:22:25,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49424.9, 300 sec: 49596.3). Total num frames: 834682880. Throughput: 0: 49210.8. Samples: 363493580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 18:22:25,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:22:28,912][71000] Updated weights for policy 0, policy_version 50954 (0.0028) [2024-06-12 18:22:30,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49698.0, 300 sec: 49596.3). Total num frames: 834928640. Throughput: 0: 49341.7. Samples: 363787900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 18:22:30,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:22:31,828][71000] Updated weights for policy 0, policy_version 50964 (0.0022) [2024-06-12 18:22:35,870][71000] Updated weights for policy 0, policy_version 50974 (0.0039) [2024-06-12 18:22:35,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 835158016. Throughput: 0: 49333.1. Samples: 363933320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 18:22:35,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:22:39,035][71000] Updated weights for policy 0, policy_version 50984 (0.0032) [2024-06-12 18:22:40,940][70768] Fps is (10 sec: 45874.4, 60 sec: 49151.7, 300 sec: 49429.6). 
Total num frames: 835387392. Throughput: 0: 49119.9. Samples: 364224460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 18:22:40,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 18:22:42,669][71000] Updated weights for policy 0, policy_version 50994 (0.0035) [2024-06-12 18:22:45,563][71000] Updated weights for policy 0, policy_version 51004 (0.0033) [2024-06-12 18:22:45,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 835665920. Throughput: 0: 48871.2. Samples: 364515160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 18:22:45,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:22:49,537][71000] Updated weights for policy 0, policy_version 51014 (0.0028) [2024-06-12 18:22:50,939][70768] Fps is (10 sec: 52430.6, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 835911680. Throughput: 0: 49007.6. Samples: 364667280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 18:22:50,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:22:51,947][71000] Updated weights for policy 0, policy_version 51024 (0.0027) [2024-06-12 18:22:55,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48879.1, 300 sec: 49485.2). Total num frames: 836124672. Throughput: 0: 49323.0. Samples: 364965120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 18:22:55,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:22:56,034][71000] Updated weights for policy 0, policy_version 51034 (0.0020) [2024-06-12 18:22:58,831][71000] Updated weights for policy 0, policy_version 51044 (0.0029) [2024-06-12 18:23:00,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 836370432. Throughput: 0: 48876.5. Samples: 365251060. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 18:23:00,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 18:23:02,942][71000] Updated weights for policy 0, policy_version 51054 (0.0032) [2024-06-12 18:23:03,555][70980] Signal inference workers to stop experience collection... (5350 times) [2024-06-12 18:23:03,591][71000] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-06-12 18:23:03,612][70980] Signal inference workers to resume experience collection... (5350 times) [2024-06-12 18:23:03,612][71000] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-06-12 18:23:05,611][71000] Updated weights for policy 0, policy_version 51064 (0.0024) [2024-06-12 18:23:05,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.9, 300 sec: 49374.1). Total num frames: 836632576. Throughput: 0: 48926.1. Samples: 365392920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 18:23:05,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 18:23:09,565][71000] Updated weights for policy 0, policy_version 51074 (0.0022) [2024-06-12 18:23:10,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 836878336. Throughput: 0: 49081.1. Samples: 365702220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 18:23:10,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:23:12,215][71000] Updated weights for policy 0, policy_version 51084 (0.0025) [2024-06-12 18:23:15,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48606.0, 300 sec: 49374.2). Total num frames: 837091328. Throughput: 0: 49177.5. Samples: 366000880. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-12 18:23:15,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:23:16,297][71000] Updated weights for policy 0, policy_version 51094 (0.0036) [2024-06-12 18:23:18,674][71000] Updated weights for policy 0, policy_version 51104 (0.0038) [2024-06-12 18:23:20,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.8, 300 sec: 49374.1). Total num frames: 837353472. Throughput: 0: 48845.8. Samples: 366131380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 18:23:20,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:23:22,873][71000] Updated weights for policy 0, policy_version 51114 (0.0042) [2024-06-12 18:23:25,341][71000] Updated weights for policy 0, policy_version 51124 (0.0031) [2024-06-12 18:23:25,939][70768] Fps is (10 sec: 52429.0, 60 sec: 48879.2, 300 sec: 49374.2). Total num frames: 837615616. Throughput: 0: 48951.5. Samples: 366427260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 18:23:25,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:23:29,667][71000] Updated weights for policy 0, policy_version 51134 (0.0029) [2024-06-12 18:23:30,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 837861376. Throughput: 0: 48984.9. Samples: 366719480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 18:23:30,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:23:32,483][71000] Updated weights for policy 0, policy_version 51144 (0.0033) [2024-06-12 18:23:35,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48332.9, 300 sec: 49263.1). Total num frames: 838057984. Throughput: 0: 48694.2. Samples: 366858520. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 18:23:35,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 18:23:36,548][71000] Updated weights for policy 0, policy_version 51154 (0.0024) [2024-06-12 18:23:38,973][71000] Updated weights for policy 0, policy_version 51164 (0.0026) [2024-06-12 18:23:40,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.3, 300 sec: 49374.2). Total num frames: 838336512. Throughput: 0: 48769.4. Samples: 367159740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 18:23:40,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 18:23:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000051169_838352896.pth... [2024-06-12 18:23:40,993][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000050446_826507264.pth [2024-06-12 18:23:43,014][71000] Updated weights for policy 0, policy_version 51174 (0.0022) [2024-06-12 18:23:45,470][71000] Updated weights for policy 0, policy_version 51184 (0.0029) [2024-06-12 18:23:45,940][70768] Fps is (10 sec: 55705.7, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 838615040. Throughput: 0: 49006.8. Samples: 367456360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 18:23:45,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 18:23:49,719][71000] Updated weights for policy 0, policy_version 51194 (0.0036) [2024-06-12 18:23:50,939][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 49374.3). Total num frames: 838844416. Throughput: 0: 49277.9. Samples: 367610420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 18:23:50,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 18:23:52,486][71000] Updated weights for policy 0, policy_version 51204 (0.0026) [2024-06-12 18:23:55,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48878.9, 300 sec: 49374.2). Total num frames: 839057408. Throughput: 0: 48777.7. Samples: 367897220. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 18:23:55,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:23:56,714][71000] Updated weights for policy 0, policy_version 51214 (0.0025) [2024-06-12 18:23:59,145][71000] Updated weights for policy 0, policy_version 51224 (0.0031) [2024-06-12 18:24:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 839319552. Throughput: 0: 48587.5. Samples: 368187320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 18:24:00,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 18:24:03,087][71000] Updated weights for policy 0, policy_version 51234 (0.0030) [2024-06-12 18:24:03,431][70980] Signal inference workers to stop experience collection... (5400 times) [2024-06-12 18:24:03,480][71000] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-06-12 18:24:03,485][70980] Signal inference workers to resume experience collection... (5400 times) [2024-06-12 18:24:03,488][71000] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-06-12 18:24:05,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48878.9, 300 sec: 49374.1). Total num frames: 839565312. Throughput: 0: 49064.8. Samples: 368339300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 18:24:05,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:24:06,010][71000] Updated weights for policy 0, policy_version 51244 (0.0024) [2024-06-12 18:24:09,669][71000] Updated weights for policy 0, policy_version 51254 (0.0028) [2024-06-12 18:24:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 839811072. Throughput: 0: 49211.5. Samples: 368641780. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 18:24:10,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:24:12,448][71000] Updated weights for policy 0, policy_version 51264 (0.0029) [2024-06-12 18:24:15,941][70768] Fps is (10 sec: 44229.5, 60 sec: 48604.4, 300 sec: 49207.2). Total num frames: 840007680. Throughput: 0: 48877.1. Samples: 368919040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 18:24:15,942][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 18:24:16,387][71000] Updated weights for policy 0, policy_version 51274 (0.0029) [2024-06-12 18:24:19,611][71000] Updated weights for policy 0, policy_version 51284 (0.0034) [2024-06-12 18:24:20,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 840286208. Throughput: 0: 48902.9. Samples: 369059160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-12 18:24:20,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:24:23,427][71000] Updated weights for policy 0, policy_version 51294 (0.0031) [2024-06-12 18:24:25,940][70768] Fps is (10 sec: 52438.2, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 840531968. Throughput: 0: 48623.4. Samples: 369347800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 18:24:25,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:24:26,195][71000] Updated weights for policy 0, policy_version 51304 (0.0024) [2024-06-12 18:24:30,161][71000] Updated weights for policy 0, policy_version 51314 (0.0033) [2024-06-12 18:24:30,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 840777728. Throughput: 0: 48597.3. Samples: 369643240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 18:24:30,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:24:33,313][71000] Updated weights for policy 0, policy_version 51324 (0.0027) [2024-06-12 18:24:35,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 49152.0). 
Total num frames: 841007104. Throughput: 0: 48527.1. Samples: 369794140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 18:24:35,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:24:36,629][71000] Updated weights for policy 0, policy_version 51334 (0.0034) [2024-06-12 18:24:39,999][71000] Updated weights for policy 0, policy_version 51344 (0.0028) [2024-06-12 18:24:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.8, 300 sec: 49263.1). Total num frames: 841269248. Throughput: 0: 48451.5. Samples: 370077540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 18:24:40,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:24:43,470][71000] Updated weights for policy 0, policy_version 51354 (0.0031) [2024-06-12 18:24:45,939][70768] Fps is (10 sec: 47513.7, 60 sec: 47786.7, 300 sec: 49040.9). Total num frames: 841482240. Throughput: 0: 48358.3. Samples: 370363440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 18:24:45,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:24:46,720][71000] Updated weights for policy 0, policy_version 51364 (0.0035) [2024-06-12 18:24:50,258][71000] Updated weights for policy 0, policy_version 51374 (0.0029) [2024-06-12 18:24:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48332.7, 300 sec: 49096.4). Total num frames: 841744384. Throughput: 0: 48275.6. Samples: 370511700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 18:24:50,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 18:24:53,369][71000] Updated weights for policy 0, policy_version 51384 (0.0028) [2024-06-12 18:24:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 841973760. Throughput: 0: 48014.3. Samples: 370802420. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 18:24:55,940][70768] Avg episode reward: [(0, '0.232')] [2024-06-12 18:24:56,825][71000] Updated weights for policy 0, policy_version 51394 (0.0028) [2024-06-12 18:25:00,117][71000] Updated weights for policy 0, policy_version 51404 (0.0038) [2024-06-12 18:25:00,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 842235904. Throughput: 0: 48441.1. Samples: 371098800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 18:25:00,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:25:03,762][71000] Updated weights for policy 0, policy_version 51414 (0.0027) [2024-06-12 18:25:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48332.9, 300 sec: 49040.9). Total num frames: 842465280. Throughput: 0: 48713.5. Samples: 371251260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 18:25:05,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:25:06,751][71000] Updated weights for policy 0, policy_version 51424 (0.0027) [2024-06-12 18:25:10,507][71000] Updated weights for policy 0, policy_version 51434 (0.0025) [2024-06-12 18:25:10,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48332.8, 300 sec: 49040.9). Total num frames: 842711040. Throughput: 0: 48560.0. Samples: 371533000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 18:25:10,940][70768] Avg episode reward: [(0, '0.225')] [2024-06-12 18:25:13,767][71000] Updated weights for policy 0, policy_version 51444 (0.0045) [2024-06-12 18:25:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48880.3, 300 sec: 48985.4). Total num frames: 842940416. Throughput: 0: 48376.8. Samples: 371820200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 18:25:15,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:25:17,139][71000] Updated weights for policy 0, policy_version 51454 (0.0024) [2024-06-12 18:25:19,486][70980] Signal inference workers to stop experience collection... 
(5450 times) [2024-06-12 18:25:19,532][71000] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-06-12 18:25:19,539][70980] Signal inference workers to resume experience collection... (5450 times) [2024-06-12 18:25:19,545][71000] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-06-12 18:25:20,492][71000] Updated weights for policy 0, policy_version 51464 (0.0022) [2024-06-12 18:25:20,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48606.1, 300 sec: 49096.5). Total num frames: 843202560. Throughput: 0: 48332.0. Samples: 371969080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 18:25:20,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 18:25:23,690][71000] Updated weights for policy 0, policy_version 51474 (0.0030) [2024-06-12 18:25:25,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48332.8, 300 sec: 48985.4). Total num frames: 843431936. Throughput: 0: 48571.2. Samples: 372263240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 18:25:25,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 18:25:27,135][71000] Updated weights for policy 0, policy_version 51484 (0.0028) [2024-06-12 18:25:30,509][71000] Updated weights for policy 0, policy_version 51494 (0.0018) [2024-06-12 18:25:30,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 843694080. Throughput: 0: 48805.6. Samples: 372559700. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 18:25:30,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:25:33,830][71000] Updated weights for policy 0, policy_version 51504 (0.0025) [2024-06-12 18:25:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.7, 300 sec: 48985.4). Total num frames: 843923456. Throughput: 0: 48710.2. Samples: 372703660. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 18:25:35,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:25:37,373][71000] Updated weights for policy 0, policy_version 51514 (0.0030) [2024-06-12 18:25:40,697][71000] Updated weights for policy 0, policy_version 51524 (0.0034) [2024-06-12 18:25:40,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48332.9, 300 sec: 48985.4). Total num frames: 844169216. Throughput: 0: 48668.0. Samples: 372992480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 18:25:40,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:25:41,068][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000051525_844185600.pth... [2024-06-12 18:25:41,110][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000050809_832454656.pth [2024-06-12 18:25:43,851][71000] Updated weights for policy 0, policy_version 51534 (0.0037) [2024-06-12 18:25:45,940][70768] Fps is (10 sec: 49148.5, 60 sec: 48878.2, 300 sec: 48985.2). Total num frames: 844414976. Throughput: 0: 48621.3. Samples: 373286800. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 18:25:45,941][70768] Avg episode reward: [(0, '0.231')] [2024-06-12 18:25:47,083][71000] Updated weights for policy 0, policy_version 51544 (0.0026) [2024-06-12 18:25:50,419][71000] Updated weights for policy 0, policy_version 51554 (0.0022) [2024-06-12 18:25:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48606.0, 300 sec: 48929.8). Total num frames: 844660736. Throughput: 0: 48680.0. Samples: 373441860. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 18:25:50,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:25:54,088][71000] Updated weights for policy 0, policy_version 51564 (0.0029) [2024-06-12 18:25:55,944][70768] Fps is (10 sec: 49134.7, 60 sec: 48875.4, 300 sec: 48984.7). Total num frames: 844906496. Throughput: 0: 48786.0. Samples: 373728580. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 18:25:55,944][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 18:25:57,237][71000] Updated weights for policy 0, policy_version 51574 (0.0029) [2024-06-12 18:26:00,798][71000] Updated weights for policy 0, policy_version 51584 (0.0025) [2024-06-12 18:26:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 845152256. Throughput: 0: 48932.1. Samples: 374022140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-12 18:26:00,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:26:04,158][71000] Updated weights for policy 0, policy_version 51594 (0.0028) [2024-06-12 18:26:05,940][70768] Fps is (10 sec: 47533.3, 60 sec: 48605.7, 300 sec: 48818.7). Total num frames: 845381632. Throughput: 0: 48633.9. Samples: 374157620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-12 18:26:05,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 18:26:07,455][71000] Updated weights for policy 0, policy_version 51604 (0.0025) [2024-06-12 18:26:10,572][71000] Updated weights for policy 0, policy_version 51614 (0.0027) [2024-06-12 18:26:10,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 845643776. Throughput: 0: 48865.4. Samples: 374462180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-12 18:26:10,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:26:14,487][71000] Updated weights for policy 0, policy_version 51624 (0.0021) [2024-06-12 18:26:15,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48605.9, 300 sec: 48818.7). Total num frames: 845856768. Throughput: 0: 48680.5. Samples: 374750320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-12 18:26:15,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:26:17,437][71000] Updated weights for policy 0, policy_version 51634 (0.0027) [2024-06-12 18:26:20,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48332.7, 300 sec: 48763.2). 
Total num frames: 846102528. Throughput: 0: 48679.2. Samples: 374894220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-12 18:26:20,940][70768] Avg episode reward: [(0, '0.227')] [2024-06-12 18:26:21,443][71000] Updated weights for policy 0, policy_version 51644 (0.0032) [2024-06-12 18:26:24,198][71000] Updated weights for policy 0, policy_version 51654 (0.0038) [2024-06-12 18:26:25,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 846348288. Throughput: 0: 48499.6. Samples: 375174960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-12 18:26:25,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:26:28,377][71000] Updated weights for policy 0, policy_version 51664 (0.0027) [2024-06-12 18:26:30,939][70768] Fps is (10 sec: 50790.9, 60 sec: 48606.0, 300 sec: 48763.3). Total num frames: 846610432. Throughput: 0: 48393.4. Samples: 375464460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-12 18:26:30,946][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:26:31,031][71000] Updated weights for policy 0, policy_version 51674 (0.0029) [2024-06-12 18:26:34,913][71000] Updated weights for policy 0, policy_version 51684 (0.0025) [2024-06-12 18:26:35,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48332.9, 300 sec: 48763.2). Total num frames: 846823424. Throughput: 0: 48219.6. Samples: 375611740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:26:35,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 18:26:37,419][71000] Updated weights for policy 0, policy_version 51694 (0.0028) [2024-06-12 18:26:40,941][70768] Fps is (10 sec: 47504.4, 60 sec: 48604.4, 300 sec: 48762.9). Total num frames: 847085568. Throughput: 0: 48482.2. Samples: 375910160. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:26:40,942][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:26:41,484][71000] Updated weights for policy 0, policy_version 51704 (0.0030) [2024-06-12 18:26:43,418][70980] Signal inference workers to stop experience collection... (5500 times) [2024-06-12 18:26:43,422][70980] Signal inference workers to resume experience collection... (5500 times) [2024-06-12 18:26:43,436][71000] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-06-12 18:26:43,437][71000] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-06-12 18:26:44,231][71000] Updated weights for policy 0, policy_version 51714 (0.0026) [2024-06-12 18:26:45,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48879.6, 300 sec: 48874.3). Total num frames: 847347712. Throughput: 0: 48675.1. Samples: 376212520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:26:45,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:26:48,520][71000] Updated weights for policy 0, policy_version 51724 (0.0029) [2024-06-12 18:26:50,940][70768] Fps is (10 sec: 49161.3, 60 sec: 48605.9, 300 sec: 48763.3). Total num frames: 847577088. Throughput: 0: 48882.5. Samples: 376357320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:26:50,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:26:51,206][71000] Updated weights for policy 0, policy_version 51734 (0.0026) [2024-06-12 18:26:55,288][71000] Updated weights for policy 0, policy_version 51744 (0.0032) [2024-06-12 18:26:55,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48336.3, 300 sec: 48818.8). Total num frames: 847806464. Throughput: 0: 48337.8. Samples: 376637380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:26:55,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:26:58,061][71000] Updated weights for policy 0, policy_version 51754 (0.0030) [2024-06-12 18:27:00,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48332.8, 300 sec: 48652.2). Total num frames: 848052224. Throughput: 0: 48323.7. Samples: 376924880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:27:00,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:27:02,158][71000] Updated weights for policy 0, policy_version 51764 (0.0030) [2024-06-12 18:27:04,583][71000] Updated weights for policy 0, policy_version 51774 (0.0019) [2024-06-12 18:27:05,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48879.1, 300 sec: 48763.2). Total num frames: 848314368. Throughput: 0: 48460.9. Samples: 377074960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:27:05,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:27:08,838][71000] Updated weights for policy 0, policy_version 51784 (0.0034) [2024-06-12 18:27:10,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48605.7, 300 sec: 48763.2). Total num frames: 848560128. Throughput: 0: 48688.3. Samples: 377365940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:27:10,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:27:11,455][71000] Updated weights for policy 0, policy_version 51794 (0.0032) [2024-06-12 18:27:15,579][71000] Updated weights for policy 0, policy_version 51804 (0.0034) [2024-06-12 18:27:15,940][70768] Fps is (10 sec: 44236.8, 60 sec: 48332.9, 300 sec: 48596.6). Total num frames: 848756736. Throughput: 0: 48733.7. Samples: 377657480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:27:15,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:27:18,467][71000] Updated weights for policy 0, policy_version 51814 (0.0029) [2024-06-12 18:27:20,939][70768] Fps is (10 sec: 45875.8, 60 sec: 48605.9, 300 sec: 48596.7). 
Total num frames: 849018880. Throughput: 0: 48425.8. Samples: 377790900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:27:20,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:27:22,216][71000] Updated weights for policy 0, policy_version 51824 (0.0026) [2024-06-12 18:27:25,270][71000] Updated weights for policy 0, policy_version 51834 (0.0023) [2024-06-12 18:27:25,939][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.9, 300 sec: 48596.6). Total num frames: 849264640. Throughput: 0: 48352.7. Samples: 378085940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:27:25,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:27:29,282][71000] Updated weights for policy 0, policy_version 51844 (0.0034) [2024-06-12 18:27:30,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48332.7, 300 sec: 48652.2). Total num frames: 849510400. Throughput: 0: 48170.2. Samples: 378380180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:27:30,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:27:31,714][71000] Updated weights for policy 0, policy_version 51854 (0.0027) [2024-06-12 18:27:35,550][71000] Updated weights for policy 0, policy_version 51864 (0.0030) [2024-06-12 18:27:35,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 849739776. Throughput: 0: 48301.4. Samples: 378530880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:27:35,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 18:27:38,738][71000] Updated weights for policy 0, policy_version 51874 (0.0029) [2024-06-12 18:27:40,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48334.3, 300 sec: 48541.1). Total num frames: 849985536. Throughput: 0: 48458.2. Samples: 378818000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 18:27:40,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:27:40,956][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000051879_849985536.pth... [2024-06-12 18:27:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000051169_838352896.pth [2024-06-12 18:27:42,586][71000] Updated weights for policy 0, policy_version 51884 (0.0039) [2024-06-12 18:27:45,397][71000] Updated weights for policy 0, policy_version 51894 (0.0035) [2024-06-12 18:27:45,940][70768] Fps is (10 sec: 49150.7, 60 sec: 48059.6, 300 sec: 48541.0). Total num frames: 850231296. Throughput: 0: 48420.6. Samples: 379103820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 18:27:45,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:27:49,406][71000] Updated weights for policy 0, policy_version 51904 (0.0039) [2024-06-12 18:27:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48332.8, 300 sec: 48652.1). Total num frames: 850477056. Throughput: 0: 48447.1. Samples: 379255080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 18:27:50,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:27:51,897][70980] Signal inference workers to stop experience collection... (5550 times) [2024-06-12 18:27:51,933][71000] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-06-12 18:27:51,947][70980] Signal inference workers to resume experience collection... (5550 times) [2024-06-12 18:27:51,947][71000] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-06-12 18:27:52,080][71000] Updated weights for policy 0, policy_version 51914 (0.0026) [2024-06-12 18:27:55,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48059.5, 300 sec: 48541.0). Total num frames: 850690048. Throughput: 0: 48371.9. Samples: 379542680. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 18:27:55,941][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:27:56,137][71000] Updated weights for policy 0, policy_version 51924 (0.0027) [2024-06-12 18:27:58,811][71000] Updated weights for policy 0, policy_version 51934 (0.0023) [2024-06-12 18:28:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.8, 300 sec: 48596.6). Total num frames: 850968576. Throughput: 0: 48492.0. Samples: 379839620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 18:28:00,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:28:02,425][71000] Updated weights for policy 0, policy_version 51944 (0.0023) [2024-06-12 18:28:05,465][71000] Updated weights for policy 0, policy_version 51954 (0.0034) [2024-06-12 18:28:05,940][70768] Fps is (10 sec: 54068.3, 60 sec: 48605.8, 300 sec: 48652.1). Total num frames: 851230720. Throughput: 0: 49104.8. Samples: 380000620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 18:28:05,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 18:28:09,333][71000] Updated weights for policy 0, policy_version 51964 (0.0033) [2024-06-12 18:28:10,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 851460096. Throughput: 0: 48829.8. Samples: 380283280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 18:28:10,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 18:28:12,307][71000] Updated weights for policy 0, policy_version 51974 (0.0024) [2024-06-12 18:28:15,939][70768] Fps is (10 sec: 44237.2, 60 sec: 48605.9, 300 sec: 48541.1). Total num frames: 851673088. Throughput: 0: 48763.2. Samples: 380574520. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 18:28:15,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:28:16,148][71000] Updated weights for policy 0, policy_version 51984 (0.0034) [2024-06-12 18:28:19,227][71000] Updated weights for policy 0, policy_version 51994 (0.0043) [2024-06-12 18:28:20,940][70768] Fps is (10 sec: 47512.5, 60 sec: 48605.7, 300 sec: 48541.0). Total num frames: 851935232. Throughput: 0: 48370.8. Samples: 380707580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 18:28:20,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:28:22,862][71000] Updated weights for policy 0, policy_version 52004 (0.0030) [2024-06-12 18:28:25,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.8, 300 sec: 48541.1). Total num frames: 852180992. Throughput: 0: 48620.5. Samples: 381005920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 18:28:25,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:28:25,977][71000] Updated weights for policy 0, policy_version 52014 (0.0035) [2024-06-12 18:28:29,516][71000] Updated weights for policy 0, policy_version 52024 (0.0026) [2024-06-12 18:28:30,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 852426752. Throughput: 0: 48885.9. Samples: 381303680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 18:28:30,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:28:32,628][71000] Updated weights for policy 0, policy_version 52034 (0.0031) [2024-06-12 18:28:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.8, 300 sec: 48541.1). Total num frames: 852656128. Throughput: 0: 48570.2. Samples: 381440740. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 18:28:35,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:28:36,328][71000] Updated weights for policy 0, policy_version 52044 (0.0031) [2024-06-12 18:28:39,310][71000] Updated weights for policy 0, policy_version 52054 (0.0032) [2024-06-12 18:28:40,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 48485.5). Total num frames: 852918272. Throughput: 0: 48542.9. Samples: 381727100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 18:28:40,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:28:43,087][71000] Updated weights for policy 0, policy_version 52064 (0.0028) [2024-06-12 18:28:45,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.1, 300 sec: 48541.1). Total num frames: 853164032. Throughput: 0: 48563.6. Samples: 382024980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 18:28:45,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:28:46,207][71000] Updated weights for policy 0, policy_version 52074 (0.0029) [2024-06-12 18:28:49,263][70980] Signal inference workers to stop experience collection... (5600 times) [2024-06-12 18:28:49,298][71000] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-06-12 18:28:49,373][70980] Signal inference workers to resume experience collection... (5600 times) [2024-06-12 18:28:49,374][71000] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-06-12 18:28:49,509][71000] Updated weights for policy 0, policy_version 52084 (0.0031) [2024-06-12 18:28:50,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.8, 300 sec: 48596.6). Total num frames: 853393408. Throughput: 0: 48327.6. Samples: 382175360. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 18:28:50,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:28:52,887][71000] Updated weights for policy 0, policy_version 52094 (0.0026) [2024-06-12 18:28:55,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49152.1, 300 sec: 48541.1). Total num frames: 853639168. Throughput: 0: 48521.2. Samples: 382466740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 18:28:55,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:28:56,675][71000] Updated weights for policy 0, policy_version 52104 (0.0031) [2024-06-12 18:28:59,259][71000] Updated weights for policy 0, policy_version 52114 (0.0032) [2024-06-12 18:29:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 48596.6). Total num frames: 853901312. Throughput: 0: 48460.3. Samples: 382755240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 18:29:00,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:29:03,404][71000] Updated weights for policy 0, policy_version 52124 (0.0040) [2024-06-12 18:29:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48332.7, 300 sec: 48541.0). Total num frames: 854130688. Throughput: 0: 49018.2. Samples: 382913400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 18:29:05,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:29:06,236][71000] Updated weights for policy 0, policy_version 52134 (0.0039) [2024-06-12 18:29:09,886][71000] Updated weights for policy 0, policy_version 52144 (0.0027) [2024-06-12 18:29:10,939][70768] Fps is (10 sec: 47514.5, 60 sec: 48605.9, 300 sec: 48708.0). Total num frames: 854376448. Throughput: 0: 48782.3. Samples: 383201120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 18:29:10,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:29:12,931][71000] Updated weights for policy 0, policy_version 52154 (0.0031) [2024-06-12 18:29:15,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49151.8, 300 sec: 48596.6). 
Total num frames: 854622208. Throughput: 0: 48699.5. Samples: 383495160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 18:29:15,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:29:16,435][71000] Updated weights for policy 0, policy_version 52164 (0.0026) [2024-06-12 18:29:19,741][71000] Updated weights for policy 0, policy_version 52174 (0.0030) [2024-06-12 18:29:20,940][70768] Fps is (10 sec: 50789.1, 60 sec: 49152.0, 300 sec: 48652.1). Total num frames: 854884352. Throughput: 0: 48874.5. Samples: 383640100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 18:29:20,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 18:29:23,728][71000] Updated weights for policy 0, policy_version 52184 (0.0030) [2024-06-12 18:29:25,940][70768] Fps is (10 sec: 49150.1, 60 sec: 48878.5, 300 sec: 48596.5). Total num frames: 855113728. Throughput: 0: 48958.1. Samples: 383930240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 18:29:25,941][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:29:26,338][71000] Updated weights for policy 0, policy_version 52194 (0.0036) [2024-06-12 18:29:30,363][71000] Updated weights for policy 0, policy_version 52204 (0.0037) [2024-06-12 18:29:30,939][70768] Fps is (10 sec: 44237.7, 60 sec: 48332.9, 300 sec: 48541.1). Total num frames: 855326720. Throughput: 0: 48811.6. Samples: 384221500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 18:29:30,940][70768] Avg episode reward: [(0, '0.223')] [2024-06-12 18:29:33,249][71000] Updated weights for policy 0, policy_version 52214 (0.0025) [2024-06-12 18:29:35,940][70768] Fps is (10 sec: 47516.3, 60 sec: 48879.0, 300 sec: 48541.1). Total num frames: 855588864. Throughput: 0: 48645.9. Samples: 384364420. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 18:29:35,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 18:29:36,711][71000] Updated weights for policy 0, policy_version 52224 (0.0034) [2024-06-12 18:29:39,614][71000] Updated weights for policy 0, policy_version 52234 (0.0030) [2024-06-12 18:29:40,940][70768] Fps is (10 sec: 52427.6, 60 sec: 48878.8, 300 sec: 48707.7). Total num frames: 855851008. Throughput: 0: 48764.8. Samples: 384661160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 18:29:40,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:29:40,959][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000052237_855851008.pth... [2024-06-12 18:29:41,009][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000051525_844185600.pth [2024-06-12 18:29:43,486][71000] Updated weights for policy 0, policy_version 52244 (0.0033) [2024-06-12 18:29:45,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.8, 300 sec: 48596.6). Total num frames: 856080384. Throughput: 0: 48858.7. Samples: 384953880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 18:29:45,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:29:46,767][71000] Updated weights for policy 0, policy_version 52254 (0.0026) [2024-06-12 18:29:50,405][71000] Updated weights for policy 0, policy_version 52264 (0.0028) [2024-06-12 18:29:50,940][70768] Fps is (10 sec: 47514.7, 60 sec: 48879.0, 300 sec: 48652.2). Total num frames: 856326144. Throughput: 0: 48607.3. Samples: 385100720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 18:29:50,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 18:29:53,260][71000] Updated weights for policy 0, policy_version 52274 (0.0030) [2024-06-12 18:29:55,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.0, 300 sec: 48596.6). Total num frames: 856571904. Throughput: 0: 48648.8. Samples: 385390320. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 18:29:55,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:29:57,091][71000] Updated weights for policy 0, policy_version 52284 (0.0038) [2024-06-12 18:30:00,357][71000] Updated weights for policy 0, policy_version 52294 (0.0025) [2024-06-12 18:30:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48332.9, 300 sec: 48596.6). Total num frames: 856801280. Throughput: 0: 48604.2. Samples: 385682340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 18:30:00,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:30:03,560][71000] Updated weights for policy 0, policy_version 52304 (0.0031) [2024-06-12 18:30:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.1, 300 sec: 48652.2). Total num frames: 857063424. Throughput: 0: 48622.4. Samples: 385828100. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 18:30:05,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 18:30:06,921][71000] Updated weights for policy 0, policy_version 52314 (0.0040) [2024-06-12 18:30:10,256][71000] Updated weights for policy 0, policy_version 52324 (0.0031) [2024-06-12 18:30:10,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.8, 300 sec: 48652.2). Total num frames: 857292800. Throughput: 0: 48721.9. Samples: 386122700. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 18:30:10,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:30:11,043][70980] Signal inference workers to stop experience collection... (5650 times) [2024-06-12 18:30:11,043][70980] Signal inference workers to resume experience collection... 
(5650 times) [2024-06-12 18:30:11,063][71000] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-06-12 18:30:11,063][71000] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-06-12 18:30:14,058][71000] Updated weights for policy 0, policy_version 52334 (0.0036) [2024-06-12 18:30:15,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48606.0, 300 sec: 48596.6). Total num frames: 857538560. Throughput: 0: 48532.9. Samples: 386405480. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 18:30:15,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:30:17,198][71000] Updated weights for policy 0, policy_version 52344 (0.0032) [2024-06-12 18:30:20,623][71000] Updated weights for policy 0, policy_version 52354 (0.0024) [2024-06-12 18:30:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48332.9, 300 sec: 48652.1). Total num frames: 857784320. Throughput: 0: 48622.1. Samples: 386552420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 18:30:20,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:30:23,754][71000] Updated weights for policy 0, policy_version 52364 (0.0025) [2024-06-12 18:30:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48606.3, 300 sec: 48596.6). Total num frames: 858030080. Throughput: 0: 48552.6. Samples: 386846020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 18:30:25,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:30:27,421][71000] Updated weights for policy 0, policy_version 52374 (0.0034) [2024-06-12 18:30:30,282][71000] Updated weights for policy 0, policy_version 52384 (0.0033) [2024-06-12 18:30:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.8, 300 sec: 48652.1). Total num frames: 858275840. Throughput: 0: 48643.0. Samples: 387142820. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 18:30:30,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:30:33,979][71000] Updated weights for policy 0, policy_version 52394 (0.0036) [2024-06-12 18:30:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 48652.1). Total num frames: 858521600. Throughput: 0: 48764.8. Samples: 387295140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 18:30:35,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:30:37,124][71000] Updated weights for policy 0, policy_version 52404 (0.0025) [2024-06-12 18:30:40,847][71000] Updated weights for policy 0, policy_version 52414 (0.0033) [2024-06-12 18:30:40,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48332.9, 300 sec: 48596.7). Total num frames: 858750976. Throughput: 0: 48817.8. Samples: 387587120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 18:30:40,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:30:43,787][71000] Updated weights for policy 0, policy_version 52424 (0.0029) [2024-06-12 18:30:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48596.6). Total num frames: 858996736. Throughput: 0: 48717.3. Samples: 387874620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 18:30:45,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:30:47,418][71000] Updated weights for policy 0, policy_version 52434 (0.0035) [2024-06-12 18:30:50,481][71000] Updated weights for policy 0, policy_version 52444 (0.0032) [2024-06-12 18:30:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48878.9, 300 sec: 48652.9). Total num frames: 859258880. Throughput: 0: 48788.5. Samples: 388023580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 18:30:50,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:30:54,051][71000] Updated weights for policy 0, policy_version 52454 (0.0028) [2024-06-12 18:30:55,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.8, 300 sec: 48596.6). 
Total num frames: 859488256. Throughput: 0: 48881.7. Samples: 388322380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 18:30:55,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:30:56,968][71000] Updated weights for policy 0, policy_version 52464 (0.0023) [2024-06-12 18:31:00,892][71000] Updated weights for policy 0, policy_version 52474 (0.0036) [2024-06-12 18:31:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 48652.2). Total num frames: 859734016. Throughput: 0: 49232.8. Samples: 388620960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 18:31:00,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:31:03,572][71000] Updated weights for policy 0, policy_version 52484 (0.0030) [2024-06-12 18:31:05,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.8, 300 sec: 48652.1). Total num frames: 859996160. Throughput: 0: 49192.8. Samples: 388766100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 18:31:05,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 18:31:07,718][71000] Updated weights for policy 0, policy_version 52494 (0.0038) [2024-06-12 18:31:10,282][71000] Updated weights for policy 0, policy_version 52504 (0.0027) [2024-06-12 18:31:10,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 860241920. Throughput: 0: 49016.0. Samples: 389051740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 18:31:10,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:31:14,271][71000] Updated weights for policy 0, policy_version 52514 (0.0030) [2024-06-12 18:31:15,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.8, 300 sec: 48707.7). Total num frames: 860471296. Throughput: 0: 49034.3. Samples: 389349360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 18:31:15,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:31:17,015][71000] Updated weights for policy 0, policy_version 52524 (0.0022) [2024-06-12 18:31:20,846][71000] Updated weights for policy 0, policy_version 52534 (0.0030) [2024-06-12 18:31:20,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 860717056. Throughput: 0: 48932.0. Samples: 389497080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-12 18:31:20,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 18:31:23,494][71000] Updated weights for policy 0, policy_version 52544 (0.0032) [2024-06-12 18:31:25,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.0, 300 sec: 48707.7). Total num frames: 860979200. Throughput: 0: 48977.7. Samples: 389791120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 18:31:25,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 18:31:27,740][71000] Updated weights for policy 0, policy_version 52554 (0.0027) [2024-06-12 18:31:30,239][71000] Updated weights for policy 0, policy_version 52564 (0.0029) [2024-06-12 18:31:30,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49152.0, 300 sec: 48818.7). Total num frames: 861224960. Throughput: 0: 49023.4. Samples: 390080680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 18:31:30,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:31:34,299][71000] Updated weights for policy 0, policy_version 52574 (0.0026) [2024-06-12 18:31:35,817][70980] Signal inference workers to stop experience collection... (5700 times) [2024-06-12 18:31:35,818][70980] Signal inference workers to resume experience collection... 
(5700 times) [2024-06-12 18:31:35,851][71000] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-06-12 18:31:35,852][71000] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-06-12 18:31:35,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.9, 300 sec: 48652.5). Total num frames: 861437952. Throughput: 0: 49069.7. Samples: 390231720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 18:31:35,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:31:37,042][71000] Updated weights for policy 0, policy_version 52584 (0.0030) [2024-06-12 18:31:40,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48879.0, 300 sec: 48596.6). Total num frames: 861683712. Throughput: 0: 48807.3. Samples: 390518700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 18:31:40,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:31:41,048][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000052594_861700096.pth... [2024-06-12 18:31:41,059][71000] Updated weights for policy 0, policy_version 52594 (0.0032) [2024-06-12 18:31:41,092][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000051879_849985536.pth [2024-06-12 18:31:43,535][71000] Updated weights for policy 0, policy_version 52604 (0.0025) [2024-06-12 18:31:45,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49151.9, 300 sec: 48707.7). Total num frames: 861945856. Throughput: 0: 48773.6. Samples: 390815780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 18:31:45,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:31:47,731][71000] Updated weights for policy 0, policy_version 52614 (0.0032) [2024-06-12 18:31:50,139][71000] Updated weights for policy 0, policy_version 52624 (0.0039) [2024-06-12 18:31:50,940][70768] Fps is (10 sec: 52427.7, 60 sec: 49151.8, 300 sec: 48818.7). Total num frames: 862208000. Throughput: 0: 48987.0. Samples: 390970520. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 18:31:50,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 18:31:54,266][71000] Updated weights for policy 0, policy_version 52634 (0.0036) [2024-06-12 18:31:55,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49152.1, 300 sec: 48763.2). Total num frames: 862437376. Throughput: 0: 49242.3. Samples: 391267640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 18:31:55,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:31:56,941][71000] Updated weights for policy 0, policy_version 52644 (0.0029) [2024-06-12 18:32:00,940][70768] Fps is (10 sec: 45876.2, 60 sec: 48879.0, 300 sec: 48652.1). Total num frames: 862666752. Throughput: 0: 49143.3. Samples: 391560800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 18:32:00,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:32:00,984][71000] Updated weights for policy 0, policy_version 52654 (0.0033) [2024-06-12 18:32:03,586][71000] Updated weights for policy 0, policy_version 52664 (0.0036) [2024-06-12 18:32:05,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.1, 300 sec: 48707.7). Total num frames: 862928896. Throughput: 0: 48752.9. Samples: 391690960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 18:32:05,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:32:07,915][71000] Updated weights for policy 0, policy_version 52674 (0.0032) [2024-06-12 18:32:10,329][71000] Updated weights for policy 0, policy_version 52684 (0.0034) [2024-06-12 18:32:10,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 863174656. Throughput: 0: 48911.5. Samples: 391992140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 18:32:10,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:32:14,458][71000] Updated weights for policy 0, policy_version 52694 (0.0027) [2024-06-12 18:32:15,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 48818.8). 
Total num frames: 863420416. Throughput: 0: 49260.2. Samples: 392297380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 18:32:15,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:32:16,898][71000] Updated weights for policy 0, policy_version 52704 (0.0027) [2024-06-12 18:32:20,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 863649792. Throughput: 0: 49084.4. Samples: 392440520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 18:32:20,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:32:21,007][71000] Updated weights for policy 0, policy_version 52714 (0.0032) [2024-06-12 18:32:23,645][71000] Updated weights for policy 0, policy_version 52724 (0.0028) [2024-06-12 18:32:25,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 863911936. Throughput: 0: 48973.2. Samples: 392722500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 18:32:25,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:32:27,835][71000] Updated weights for policy 0, policy_version 52734 (0.0038) [2024-06-12 18:32:30,183][71000] Updated weights for policy 0, policy_version 52744 (0.0028) [2024-06-12 18:32:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 864157696. Throughput: 0: 48648.5. Samples: 393004960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 18:32:30,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:32:34,713][71000] Updated weights for policy 0, policy_version 52754 (0.0036) [2024-06-12 18:32:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 48874.3). Total num frames: 864403456. Throughput: 0: 48844.2. Samples: 393168500. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 18:32:35,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:32:36,996][71000] Updated weights for policy 0, policy_version 52764 (0.0035) [2024-06-12 18:32:40,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48878.9, 300 sec: 48763.3). Total num frames: 864616448. Throughput: 0: 48684.9. Samples: 393458460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 18:32:40,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:32:41,491][71000] Updated weights for policy 0, policy_version 52774 (0.0030) [2024-06-12 18:32:43,726][71000] Updated weights for policy 0, policy_version 52784 (0.0031) [2024-06-12 18:32:45,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 864878592. Throughput: 0: 48778.2. Samples: 393755820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 18:32:45,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:32:47,973][71000] Updated weights for policy 0, policy_version 52794 (0.0026) [2024-06-12 18:32:50,042][70980] Signal inference workers to stop experience collection... (5750 times) [2024-06-12 18:32:50,043][70980] Signal inference workers to resume experience collection... (5750 times) [2024-06-12 18:32:50,053][71000] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-06-12 18:32:50,064][71000] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-06-12 18:32:50,477][71000] Updated weights for policy 0, policy_version 52804 (0.0022) [2024-06-12 18:32:50,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 865140736. Throughput: 0: 49097.3. Samples: 393900340. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 18:32:50,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 18:32:54,812][71000] Updated weights for policy 0, policy_version 52814 (0.0033) [2024-06-12 18:32:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 865370112. Throughput: 0: 49013.9. Samples: 394197760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 18:32:55,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:32:57,172][71000] Updated weights for policy 0, policy_version 52824 (0.0034) [2024-06-12 18:33:00,940][70768] Fps is (10 sec: 45874.0, 60 sec: 48878.7, 300 sec: 48707.7). Total num frames: 865599488. Throughput: 0: 48782.8. Samples: 394492620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 18:33:00,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:33:01,353][71000] Updated weights for policy 0, policy_version 52834 (0.0033) [2024-06-12 18:33:03,984][71000] Updated weights for policy 0, policy_version 52844 (0.0028) [2024-06-12 18:33:05,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 865861632. Throughput: 0: 48649.9. Samples: 394629760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 18:33:05,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:33:08,032][71000] Updated weights for policy 0, policy_version 52854 (0.0042) [2024-06-12 18:33:10,496][71000] Updated weights for policy 0, policy_version 52864 (0.0029) [2024-06-12 18:33:10,940][70768] Fps is (10 sec: 54067.8, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 866140160. Throughput: 0: 49053.7. Samples: 394929920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 18:33:10,944][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:33:14,763][71000] Updated weights for policy 0, policy_version 52874 (0.0039) [2024-06-12 18:33:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 48874.3). 
Total num frames: 866353152. Throughput: 0: 49234.3. Samples: 395220500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 18:33:15,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:33:17,541][71000] Updated weights for policy 0, policy_version 52884 (0.0030) [2024-06-12 18:33:20,940][70768] Fps is (10 sec: 44237.2, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 866582528. Throughput: 0: 48687.5. Samples: 395359440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 18:33:20,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:33:21,514][71000] Updated weights for policy 0, policy_version 52894 (0.0029) [2024-06-12 18:33:23,989][71000] Updated weights for policy 0, policy_version 52904 (0.0028) [2024-06-12 18:33:25,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 866828288. Throughput: 0: 48881.2. Samples: 395658120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 18:33:25,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:33:27,919][71000] Updated weights for policy 0, policy_version 52914 (0.0024) [2024-06-12 18:33:30,626][71000] Updated weights for policy 0, policy_version 52924 (0.0033) [2024-06-12 18:33:30,940][70768] Fps is (10 sec: 54067.7, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 867123200. Throughput: 0: 49030.2. Samples: 395962180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 18:33:30,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:33:34,563][71000] Updated weights for policy 0, policy_version 52934 (0.0028) [2024-06-12 18:33:35,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 867352576. Throughput: 0: 49345.8. Samples: 396120900. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 18:33:35,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 18:33:37,348][71000] Updated weights for policy 0, policy_version 52944 (0.0021) [2024-06-12 18:33:40,940][70768] Fps is (10 sec: 44236.6, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 867565568. Throughput: 0: 49118.2. Samples: 396408080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 18:33:40,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:33:40,954][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000052952_867565568.pth... [2024-06-12 18:33:41,023][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000052237_855851008.pth [2024-06-12 18:33:41,455][71000] Updated weights for policy 0, policy_version 52954 (0.0029) [2024-06-12 18:33:44,379][71000] Updated weights for policy 0, policy_version 52964 (0.0029) [2024-06-12 18:33:45,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 867811328. Throughput: 0: 48733.6. Samples: 396685620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 18:33:45,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:33:48,154][71000] Updated weights for policy 0, policy_version 52974 (0.0035) [2024-06-12 18:33:50,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 868073472. Throughput: 0: 49042.6. Samples: 396836680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 18:33:50,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:33:50,974][71000] Updated weights for policy 0, policy_version 52984 (0.0031) [2024-06-12 18:33:54,680][71000] Updated weights for policy 0, policy_version 52994 (0.0032) [2024-06-12 18:33:55,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 868319232. Throughput: 0: 48915.1. Samples: 397131100. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 18:33:55,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:33:57,912][71000] Updated weights for policy 0, policy_version 53004 (0.0029) [2024-06-12 18:34:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.2, 300 sec: 48874.3). Total num frames: 868548608. Throughput: 0: 49125.7. Samples: 397431160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 18:34:00,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:34:01,356][71000] Updated weights for policy 0, policy_version 53014 (0.0020) [2024-06-12 18:34:04,702][71000] Updated weights for policy 0, policy_version 53024 (0.0030) [2024-06-12 18:34:05,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 868794368. Throughput: 0: 49076.5. Samples: 397567880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 18:34:05,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:34:06,405][70980] Signal inference workers to stop experience collection... (5800 times) [2024-06-12 18:34:06,447][71000] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-06-12 18:34:06,456][70980] Signal inference workers to resume experience collection... (5800 times) [2024-06-12 18:34:06,469][71000] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-06-12 18:34:08,377][71000] Updated weights for policy 0, policy_version 53034 (0.0031) [2024-06-12 18:34:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48059.8, 300 sec: 48818.8). Total num frames: 869023744. Throughput: 0: 48809.9. Samples: 397854560. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-12 18:34:10,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:34:11,462][71000] Updated weights for policy 0, policy_version 53044 (0.0032) [2024-06-12 18:34:14,927][71000] Updated weights for policy 0, policy_version 53054 (0.0032) [2024-06-12 18:34:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 869285888. Throughput: 0: 48594.2. Samples: 398148920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 18:34:15,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:34:18,206][71000] Updated weights for policy 0, policy_version 53064 (0.0025) [2024-06-12 18:34:20,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48606.0, 300 sec: 48763.3). Total num frames: 869498880. Throughput: 0: 48238.8. Samples: 398291640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 18:34:20,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 18:34:21,559][71000] Updated weights for policy 0, policy_version 53074 (0.0027) [2024-06-12 18:34:24,688][71000] Updated weights for policy 0, policy_version 53084 (0.0021) [2024-06-12 18:34:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 869777408. Throughput: 0: 48441.3. Samples: 398587940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 18:34:25,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:34:28,165][71000] Updated weights for policy 0, policy_version 53094 (0.0037) [2024-06-12 18:34:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48059.7, 300 sec: 48874.3). Total num frames: 870006784. Throughput: 0: 48920.9. Samples: 398887060. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 18:34:30,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:34:31,524][71000] Updated weights for policy 0, policy_version 53104 (0.0029) [2024-06-12 18:34:35,044][71000] Updated weights for policy 0, policy_version 53114 (0.0035) [2024-06-12 18:34:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 870268928. Throughput: 0: 48721.3. Samples: 399029140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 18:34:35,940][70768] Avg episode reward: [(0, '0.228')] [2024-06-12 18:34:38,177][71000] Updated weights for policy 0, policy_version 53124 (0.0029) [2024-06-12 18:34:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 870481920. Throughput: 0: 48759.7. Samples: 399325280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 18:34:40,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:34:41,619][71000] Updated weights for policy 0, policy_version 53134 (0.0025) [2024-06-12 18:34:44,913][71000] Updated weights for policy 0, policy_version 53144 (0.0031) [2024-06-12 18:34:45,941][70768] Fps is (10 sec: 47505.5, 60 sec: 48877.6, 300 sec: 48874.0). Total num frames: 870744064. Throughput: 0: 48556.0. Samples: 399616260. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-12 18:34:45,942][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:34:48,197][71000] Updated weights for policy 0, policy_version 53154 (0.0031) [2024-06-12 18:34:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 870989824. Throughput: 0: 48685.4. Samples: 399758720. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-12 18:34:50,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:34:51,512][71000] Updated weights for policy 0, policy_version 53164 (0.0028) [2024-06-12 18:34:55,035][71000] Updated weights for policy 0, policy_version 53174 (0.0028) [2024-06-12 18:34:55,940][70768] Fps is (10 sec: 50798.4, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 871251968. Throughput: 0: 49055.5. Samples: 400062060. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-12 18:34:55,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 18:34:58,483][71000] Updated weights for policy 0, policy_version 53184 (0.0042) [2024-06-12 18:35:00,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 871448576. Throughput: 0: 48690.7. Samples: 400340000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-12 18:35:00,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:35:01,983][71000] Updated weights for policy 0, policy_version 53194 (0.0026) [2024-06-12 18:35:03,470][70980] Signal inference workers to stop experience collection... (5850 times) [2024-06-12 18:35:03,471][70980] Signal inference workers to resume experience collection... (5850 times) [2024-06-12 18:35:03,519][71000] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-06-12 18:35:03,519][71000] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-06-12 18:35:05,236][71000] Updated weights for policy 0, policy_version 53204 (0.0033) [2024-06-12 18:35:05,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 871710720. Throughput: 0: 48788.8. Samples: 400487140. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-12 18:35:05,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:35:08,610][71000] Updated weights for policy 0, policy_version 53214 (0.0025) [2024-06-12 18:35:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 871940096. Throughput: 0: 48621.8. Samples: 400775920. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-12 18:35:10,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 18:35:12,021][71000] Updated weights for policy 0, policy_version 53224 (0.0025) [2024-06-12 18:35:15,227][71000] Updated weights for policy 0, policy_version 53234 (0.0029) [2024-06-12 18:35:15,939][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 872218624. Throughput: 0: 48573.3. Samples: 401072860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-12 18:35:15,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:35:18,767][71000] Updated weights for policy 0, policy_version 53244 (0.0028) [2024-06-12 18:35:20,939][70768] Fps is (10 sec: 49152.6, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 872431616. Throughput: 0: 48784.1. Samples: 401224420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 18:35:20,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 18:35:21,782][71000] Updated weights for policy 0, policy_version 53254 (0.0028) [2024-06-12 18:35:25,834][71000] Updated weights for policy 0, policy_version 53264 (0.0041) [2024-06-12 18:35:25,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 872677376. Throughput: 0: 48721.4. Samples: 401517740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 18:35:25,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:35:28,611][71000] Updated weights for policy 0, policy_version 53274 (0.0040) [2024-06-12 18:35:30,940][70768] Fps is (10 sec: 47512.1, 60 sec: 48332.6, 300 sec: 48763.2). 
Total num frames: 872906752. Throughput: 0: 48330.0. Samples: 401791040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 18:35:30,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:35:32,590][71000] Updated weights for policy 0, policy_version 53284 (0.0026) [2024-06-12 18:35:35,365][71000] Updated weights for policy 0, policy_version 53294 (0.0028) [2024-06-12 18:35:35,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48605.7, 300 sec: 48929.8). Total num frames: 873185280. Throughput: 0: 48554.5. Samples: 401943680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 18:35:35,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:35:38,962][71000] Updated weights for policy 0, policy_version 53304 (0.0023) [2024-06-12 18:35:40,939][70768] Fps is (10 sec: 50791.8, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 873414656. Throughput: 0: 48404.6. Samples: 402240260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 18:35:40,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:35:41,027][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000053310_873431040.pth... [2024-06-12 18:35:41,086][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000052594_861700096.pth [2024-06-12 18:35:42,031][71000] Updated weights for policy 0, policy_version 53314 (0.0037) [2024-06-12 18:35:45,617][71000] Updated weights for policy 0, policy_version 53324 (0.0033) [2024-06-12 18:35:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48607.1, 300 sec: 48818.7). Total num frames: 873660416. Throughput: 0: 48764.3. Samples: 402534400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 18:35:45,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 18:35:49,154][71000] Updated weights for policy 0, policy_version 53334 (0.0030) [2024-06-12 18:35:50,939][70768] Fps is (10 sec: 45875.2, 60 sec: 48059.8, 300 sec: 48763.3). Total num frames: 873873408. Throughput: 0: 48576.0. 
Samples: 402673060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 18:35:50,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:35:52,511][71000] Updated weights for policy 0, policy_version 53344 (0.0039) [2024-06-12 18:35:55,915][71000] Updated weights for policy 0, policy_version 53354 (0.0032) [2024-06-12 18:35:55,940][70768] Fps is (10 sec: 49152.9, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 874151936. Throughput: 0: 48682.7. Samples: 402966640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 18:35:55,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:35:58,988][71000] Updated weights for policy 0, policy_version 53364 (0.0035) [2024-06-12 18:36:00,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 874397696. Throughput: 0: 48717.2. Samples: 403265140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 18:36:00,944][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:36:02,201][71000] Updated weights for policy 0, policy_version 53374 (0.0024) [2024-06-12 18:36:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 874627072. Throughput: 0: 48460.3. Samples: 403405140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 18:36:05,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:36:06,185][71000] Updated weights for policy 0, policy_version 53384 (0.0033) [2024-06-12 18:36:08,645][71000] Updated weights for policy 0, policy_version 53394 (0.0032) [2024-06-12 18:36:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 874872832. Throughput: 0: 48401.4. Samples: 403695800. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 18:36:10,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 18:36:12,803][71000] Updated weights for policy 0, policy_version 53404 (0.0029) [2024-06-12 18:36:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 875118592. Throughput: 0: 48590.9. Samples: 403977620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 18:36:15,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:36:16,046][71000] Updated weights for policy 0, policy_version 53414 (0.0044) [2024-06-12 18:36:19,690][71000] Updated weights for policy 0, policy_version 53424 (0.0027) [2024-06-12 18:36:20,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48332.7, 300 sec: 48652.2). Total num frames: 875331584. Throughput: 0: 48425.9. Samples: 404122840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 18:36:20,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:36:21,270][70980] Signal inference workers to stop experience collection... (5900 times) [2024-06-12 18:36:21,270][70980] Signal inference workers to resume experience collection... (5900 times) [2024-06-12 18:36:21,288][71000] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-06-12 18:36:21,288][71000] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-06-12 18:36:22,794][71000] Updated weights for policy 0, policy_version 53434 (0.0032) [2024-06-12 18:36:25,940][70768] Fps is (10 sec: 45874.5, 60 sec: 48332.7, 300 sec: 48652.1). Total num frames: 875577344. Throughput: 0: 48427.3. Samples: 404419500. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:36:25,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:36:26,670][71000] Updated weights for policy 0, policy_version 53444 (0.0031) [2024-06-12 18:36:29,297][71000] Updated weights for policy 0, policy_version 53454 (0.0027) [2024-06-12 18:36:30,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48606.1, 300 sec: 48763.2). Total num frames: 875823104. Throughput: 0: 48392.7. Samples: 404712060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:36:30,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:36:33,372][71000] Updated weights for policy 0, policy_version 53464 (0.0025) [2024-06-12 18:36:35,786][71000] Updated weights for policy 0, policy_version 53474 (0.0028) [2024-06-12 18:36:35,940][70768] Fps is (10 sec: 54068.2, 60 sec: 48879.1, 300 sec: 48929.8). Total num frames: 876118016. Throughput: 0: 48690.6. Samples: 404864140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:36:35,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:36:40,128][71000] Updated weights for policy 0, policy_version 53484 (0.0039) [2024-06-12 18:36:40,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48605.8, 300 sec: 48763.3). Total num frames: 876331008. Throughput: 0: 48594.7. Samples: 405153400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:36:40,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:36:42,928][71000] Updated weights for policy 0, policy_version 53494 (0.0024) [2024-06-12 18:36:45,939][70768] Fps is (10 sec: 42598.5, 60 sec: 48059.9, 300 sec: 48596.6). Total num frames: 876544000. Throughput: 0: 48199.2. Samples: 405434100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:36:45,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:36:46,801][71000] Updated weights for policy 0, policy_version 53504 (0.0034) [2024-06-12 18:36:49,839][71000] Updated weights for policy 0, policy_version 53514 (0.0029) [2024-06-12 18:36:50,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48878.8, 300 sec: 48707.7). Total num frames: 876806144. Throughput: 0: 48280.7. Samples: 405577780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:36:50,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:36:53,623][71000] Updated weights for policy 0, policy_version 53524 (0.0034) [2024-06-12 18:36:55,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 877068288. Throughput: 0: 48459.6. Samples: 405876480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:36:55,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:36:56,195][71000] Updated weights for policy 0, policy_version 53534 (0.0021) [2024-06-12 18:37:00,879][71000] Updated weights for policy 0, policy_version 53544 (0.0030) [2024-06-12 18:37:00,940][70768] Fps is (10 sec: 45876.0, 60 sec: 47786.7, 300 sec: 48596.6). Total num frames: 877264896. Throughput: 0: 48663.6. Samples: 406167480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 18:37:00,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 18:37:03,107][71000] Updated weights for policy 0, policy_version 53554 (0.0031) [2024-06-12 18:37:05,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48332.8, 300 sec: 48652.2). Total num frames: 877527040. Throughput: 0: 48340.4. Samples: 406298160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 18:37:05,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:37:07,601][71000] Updated weights for policy 0, policy_version 53564 (0.0027) [2024-06-12 18:37:10,011][71000] Updated weights for policy 0, policy_version 53574 (0.0031) [2024-06-12 18:37:10,939][70768] Fps is (10 sec: 50790.4, 60 sec: 48332.8, 300 sec: 48652.1). Total num frames: 877772800. Throughput: 0: 48299.8. Samples: 406592980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 18:37:10,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:37:14,247][71000] Updated weights for policy 0, policy_version 53584 (0.0028) [2024-06-12 18:37:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48332.8, 300 sec: 48707.7). Total num frames: 878018560. Throughput: 0: 48349.2. Samples: 406887780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 18:37:15,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 18:37:16,715][71000] Updated weights for policy 0, policy_version 53594 (0.0030) [2024-06-12 18:37:20,870][71000] Updated weights for policy 0, policy_version 53604 (0.0026) [2024-06-12 18:37:20,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 48596.6). Total num frames: 878247936. Throughput: 0: 48274.7. Samples: 407036500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 18:37:20,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 18:37:23,255][71000] Updated weights for policy 0, policy_version 53614 (0.0026) [2024-06-12 18:37:25,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48606.0, 300 sec: 48596.6). Total num frames: 878493696. Throughput: 0: 48298.7. Samples: 407326840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 18:37:25,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 18:37:27,587][71000] Updated weights for policy 0, policy_version 53624 (0.0036) [2024-06-12 18:37:30,133][71000] Updated weights for policy 0, policy_version 53634 (0.0026) [2024-06-12 18:37:30,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.9, 300 sec: 48652.1). Total num frames: 878755840. Throughput: 0: 48442.6. Samples: 407614020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 18:37:30,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 18:37:34,296][71000] Updated weights for policy 0, policy_version 53644 (0.0034) [2024-06-12 18:37:35,939][70768] Fps is (10 sec: 50790.6, 60 sec: 48059.8, 300 sec: 48763.2). Total num frames: 879001600. Throughput: 0: 48757.6. Samples: 407771860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 18:37:35,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:37:36,646][70980] Signal inference workers to stop experience collection... (5950 times) [2024-06-12 18:37:36,666][71000] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-06-12 18:37:36,753][70980] Signal inference workers to resume experience collection... (5950 times) [2024-06-12 18:37:36,753][71000] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-06-12 18:37:36,878][71000] Updated weights for policy 0, policy_version 53654 (0.0032) [2024-06-12 18:37:40,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48059.7, 300 sec: 48596.6). Total num frames: 879214592. Throughput: 0: 48482.2. Samples: 408058180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 18:37:40,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:37:40,961][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000053663_879214592.pth... 
[2024-06-12 18:37:41,032][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000052952_867565568.pth [2024-06-12 18:37:41,174][71000] Updated weights for policy 0, policy_version 53664 (0.0025) [2024-06-12 18:37:43,853][71000] Updated weights for policy 0, policy_version 53674 (0.0023) [2024-06-12 18:37:45,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.9, 300 sec: 48596.6). Total num frames: 879476736. Throughput: 0: 48643.1. Samples: 408356420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 18:37:45,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 18:37:47,494][71000] Updated weights for policy 0, policy_version 53684 (0.0030) [2024-06-12 18:37:50,136][71000] Updated weights for policy 0, policy_version 53694 (0.0025) [2024-06-12 18:37:50,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 879755264. Throughput: 0: 49143.4. Samples: 408509620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 18:37:50,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:37:53,921][71000] Updated weights for policy 0, policy_version 53704 (0.0030) [2024-06-12 18:37:55,940][70768] Fps is (10 sec: 52429.1, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 880001024. Throughput: 0: 49238.7. Samples: 408808720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 18:37:55,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:37:57,193][71000] Updated weights for policy 0, policy_version 53714 (0.0025) [2024-06-12 18:38:00,610][71000] Updated weights for policy 0, policy_version 53724 (0.0036) [2024-06-12 18:38:00,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.0, 300 sec: 48652.1). Total num frames: 880214016. Throughput: 0: 49140.9. Samples: 409099120. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 18:38:00,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:38:03,898][71000] Updated weights for policy 0, policy_version 53734 (0.0031) [2024-06-12 18:38:05,940][70768] Fps is (10 sec: 45874.4, 60 sec: 48878.8, 300 sec: 48541.1). Total num frames: 880459776. Throughput: 0: 48852.7. Samples: 409234880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 18:38:05,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 18:38:07,514][71000] Updated weights for policy 0, policy_version 53744 (0.0036) [2024-06-12 18:38:10,241][71000] Updated weights for policy 0, policy_version 53754 (0.0026) [2024-06-12 18:38:10,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49151.8, 300 sec: 48707.7). Total num frames: 880721920. Throughput: 0: 48954.9. Samples: 409529820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 18:38:10,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:38:14,190][71000] Updated weights for policy 0, policy_version 53764 (0.0037) [2024-06-12 18:38:15,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49151.8, 300 sec: 48763.2). Total num frames: 880967680. Throughput: 0: 49205.5. Samples: 409828280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 18:38:15,941][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 18:38:16,989][71000] Updated weights for policy 0, policy_version 53774 (0.0040) [2024-06-12 18:38:20,939][70768] Fps is (10 sec: 45876.2, 60 sec: 48878.9, 300 sec: 48652.2). Total num frames: 881180672. Throughput: 0: 48810.2. Samples: 409968320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 18:38:20,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:38:20,947][71000] Updated weights for policy 0, policy_version 53784 (0.0024) [2024-06-12 18:38:23,702][71000] Updated weights for policy 0, policy_version 53794 (0.0035) [2024-06-12 18:38:25,940][70768] Fps is (10 sec: 47514.6, 60 sec: 49151.9, 300 sec: 48541.1). 
Total num frames: 881442816. Throughput: 0: 49015.1. Samples: 410263860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 18:38:25,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:38:27,555][71000] Updated weights for policy 0, policy_version 53804 (0.0024) [2024-06-12 18:38:30,371][71000] Updated weights for policy 0, policy_version 53814 (0.0035) [2024-06-12 18:38:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 48596.6). Total num frames: 881688576. Throughput: 0: 48704.4. Samples: 410548120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 18:38:30,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:38:34,183][71000] Updated weights for policy 0, policy_version 53824 (0.0025) [2024-06-12 18:38:35,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 881950720. Throughput: 0: 48915.7. Samples: 410710820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 18:38:35,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:38:37,118][71000] Updated weights for policy 0, policy_version 53834 (0.0032) [2024-06-12 18:38:38,615][70980] Signal inference workers to stop experience collection... (6000 times) [2024-06-12 18:38:38,647][71000] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-06-12 18:38:38,679][70980] Signal inference workers to resume experience collection... (6000 times) [2024-06-12 18:38:38,679][71000] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-06-12 18:38:40,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 48652.1). Total num frames: 882163712. Throughput: 0: 48670.1. Samples: 410998880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 18:38:40,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 18:38:41,247][71000] Updated weights for policy 0, policy_version 53844 (0.0020) [2024-06-12 18:38:43,674][71000] Updated weights for policy 0, policy_version 53854 (0.0028) [2024-06-12 18:38:45,940][70768] Fps is (10 sec: 47512.4, 60 sec: 49151.8, 300 sec: 48652.1). Total num frames: 882425856. Throughput: 0: 48670.9. Samples: 411289320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 18:38:45,944][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:38:47,802][71000] Updated weights for policy 0, policy_version 53864 (0.0033) [2024-06-12 18:38:50,302][71000] Updated weights for policy 0, policy_version 53874 (0.0033) [2024-06-12 18:38:50,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48605.8, 300 sec: 48652.1). Total num frames: 882671616. Throughput: 0: 49109.3. Samples: 411444800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 18:38:50,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:38:54,196][71000] Updated weights for policy 0, policy_version 53884 (0.0034) [2024-06-12 18:38:55,940][70768] Fps is (10 sec: 49153.1, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 882917376. Throughput: 0: 49150.0. Samples: 411741560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 18:38:55,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:38:57,293][71000] Updated weights for policy 0, policy_version 53894 (0.0031) [2024-06-12 18:39:00,852][71000] Updated weights for policy 0, policy_version 53904 (0.0040) [2024-06-12 18:39:00,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 48707.7). Total num frames: 883163136. Throughput: 0: 49186.5. Samples: 412041660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 18:39:00,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:39:03,888][71000] Updated weights for policy 0, policy_version 53914 (0.0031) [2024-06-12 18:39:05,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 883408896. Throughput: 0: 49162.0. Samples: 412180620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 18:39:05,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:39:07,582][71000] Updated weights for policy 0, policy_version 53924 (0.0024) [2024-06-12 18:39:10,413][71000] Updated weights for policy 0, policy_version 53934 (0.0027) [2024-06-12 18:39:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.1, 300 sec: 48763.2). Total num frames: 883671040. Throughput: 0: 49251.6. Samples: 412480180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 18:39:10,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:39:14,511][71000] Updated weights for policy 0, policy_version 53944 (0.0032) [2024-06-12 18:39:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.0, 300 sec: 48818.7). Total num frames: 883900416. Throughput: 0: 49299.4. Samples: 412766600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 18:39:15,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 18:39:17,239][71000] Updated weights for policy 0, policy_version 53954 (0.0033) [2024-06-12 18:39:20,940][70768] Fps is (10 sec: 44236.3, 60 sec: 48878.8, 300 sec: 48596.6). Total num frames: 884113408. Throughput: 0: 48878.9. Samples: 412910380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 18:39:20,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:39:21,168][71000] Updated weights for policy 0, policy_version 53964 (0.0035) [2024-06-12 18:39:24,269][71000] Updated weights for policy 0, policy_version 53974 (0.0031) [2024-06-12 18:39:25,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48605.9, 300 sec: 48652.1). 
Total num frames: 884359168. Throughput: 0: 49038.2. Samples: 413205600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 18:39:25,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:39:27,893][71000] Updated weights for policy 0, policy_version 53984 (0.0035) [2024-06-12 18:39:30,828][71000] Updated weights for policy 0, policy_version 53994 (0.0036) [2024-06-12 18:39:30,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49151.9, 300 sec: 48707.7). Total num frames: 884637696. Throughput: 0: 49129.8. Samples: 413500160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 18:39:30,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:39:34,607][71000] Updated weights for policy 0, policy_version 54004 (0.0034) [2024-06-12 18:39:35,939][70768] Fps is (10 sec: 50790.8, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 884867072. Throughput: 0: 48880.3. Samples: 413644400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 18:39:35,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:39:37,650][71000] Updated weights for policy 0, policy_version 54014 (0.0031) [2024-06-12 18:39:40,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48879.0, 300 sec: 48652.4). Total num frames: 885096448. Throughput: 0: 48721.3. Samples: 413934020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 18:39:40,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:39:40,965][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000054022_885096448.pth... [2024-06-12 18:39:41,022][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000053310_873431040.pth [2024-06-12 18:39:41,458][71000] Updated weights for policy 0, policy_version 54024 (0.0033) [2024-06-12 18:39:44,236][71000] Updated weights for policy 0, policy_version 54034 (0.0041) [2024-06-12 18:39:45,939][70768] Fps is (10 sec: 47513.5, 60 sec: 48606.1, 300 sec: 48652.2). Total num frames: 885342208. Throughput: 0: 48405.4. 
Samples: 414219900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:39:45,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:39:47,973][70980] Signal inference workers to stop experience collection... (6050 times) [2024-06-12 18:39:47,973][70980] Signal inference workers to resume experience collection... (6050 times) [2024-06-12 18:39:47,992][71000] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-06-12 18:39:47,992][71000] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-06-12 18:39:48,112][71000] Updated weights for policy 0, policy_version 54044 (0.0027) [2024-06-12 18:39:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48879.0, 300 sec: 48652.2). Total num frames: 885604352. Throughput: 0: 48703.3. Samples: 414372260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:39:50,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:39:51,284][71000] Updated weights for policy 0, policy_version 54054 (0.0039) [2024-06-12 18:39:54,526][71000] Updated weights for policy 0, policy_version 54064 (0.0032) [2024-06-12 18:39:55,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 885866496. Throughput: 0: 48779.9. Samples: 414675280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:39:55,949][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:39:57,748][71000] Updated weights for policy 0, policy_version 54074 (0.0032) [2024-06-12 18:40:00,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48332.7, 300 sec: 48652.1). Total num frames: 886063104. Throughput: 0: 48825.3. Samples: 414963740. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:40:00,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 18:40:01,335][71000] Updated weights for policy 0, policy_version 54084 (0.0022) [2024-06-12 18:40:04,484][71000] Updated weights for policy 0, policy_version 54094 (0.0032) [2024-06-12 18:40:05,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 886325248. Throughput: 0: 48883.1. Samples: 415110120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:40:05,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:40:08,397][71000] Updated weights for policy 0, policy_version 54104 (0.0038) [2024-06-12 18:40:10,940][70768] Fps is (10 sec: 50791.4, 60 sec: 48332.8, 300 sec: 48652.1). Total num frames: 886571008. Throughput: 0: 48598.7. Samples: 415392540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:40:10,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:40:11,252][71000] Updated weights for policy 0, policy_version 54114 (0.0028) [2024-06-12 18:40:14,835][71000] Updated weights for policy 0, policy_version 54124 (0.0033) [2024-06-12 18:40:15,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 886849536. Throughput: 0: 48668.4. Samples: 415690240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 18:40:15,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:40:17,784][71000] Updated weights for policy 0, policy_version 54134 (0.0028) [2024-06-12 18:40:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 48763.2). Total num frames: 887062528. Throughput: 0: 48844.4. Samples: 415842400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 18:40:20,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 18:40:21,218][71000] Updated weights for policy 0, policy_version 54144 (0.0030) [2024-06-12 18:40:24,555][71000] Updated weights for policy 0, policy_version 54154 (0.0034) [2024-06-12 18:40:25,939][70768] Fps is (10 sec: 45876.3, 60 sec: 49152.1, 300 sec: 48818.8). Total num frames: 887308288. Throughput: 0: 48966.7. Samples: 416137520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 18:40:25,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:40:27,852][71000] Updated weights for policy 0, policy_version 54164 (0.0028) [2024-06-12 18:40:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 887554048. Throughput: 0: 49056.8. Samples: 416427460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 18:40:30,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:40:31,407][71000] Updated weights for policy 0, policy_version 54174 (0.0022) [2024-06-12 18:40:34,564][71000] Updated weights for policy 0, policy_version 54184 (0.0018) [2024-06-12 18:40:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 887799808. Throughput: 0: 48985.0. Samples: 416576580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 18:40:35,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:40:38,230][71000] Updated weights for policy 0, policy_version 54194 (0.0022) [2024-06-12 18:40:40,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.1, 300 sec: 48818.8). Total num frames: 888061952. Throughput: 0: 48835.8. Samples: 416872880. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 18:40:40,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 18:40:40,994][71000] Updated weights for policy 0, policy_version 54204 (0.0033) [2024-06-12 18:40:44,499][71000] Updated weights for policy 0, policy_version 54214 (0.0029) [2024-06-12 18:40:45,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 888291328. Throughput: 0: 49221.6. Samples: 417178700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 18:40:45,940][70768] Avg episode reward: [(0, '0.239')] [2024-06-12 18:40:47,802][71000] Updated weights for policy 0, policy_version 54224 (0.0021) [2024-06-12 18:40:49,376][70980] Signal inference workers to stop experience collection... (6100 times) [2024-06-12 18:40:49,424][71000] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-06-12 18:40:49,430][70980] Signal inference workers to resume experience collection... (6100 times) [2024-06-12 18:40:49,435][71000] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-06-12 18:40:50,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 888537088. Throughput: 0: 49106.4. Samples: 417319900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 18:40:50,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:40:51,177][71000] Updated weights for policy 0, policy_version 54234 (0.0029) [2024-06-12 18:40:54,320][71000] Updated weights for policy 0, policy_version 54244 (0.0036) [2024-06-12 18:40:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48606.0, 300 sec: 48763.2). Total num frames: 888782848. Throughput: 0: 49336.9. Samples: 417612700. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 18:40:55,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:40:58,351][71000] Updated weights for policy 0, policy_version 54254 (0.0024) [2024-06-12 18:41:00,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.3, 300 sec: 48818.8). Total num frames: 889028608. Throughput: 0: 49293.2. Samples: 417908420. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 18:41:00,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:41:01,103][71000] Updated weights for policy 0, policy_version 54264 (0.0021) [2024-06-12 18:41:04,876][71000] Updated weights for policy 0, policy_version 54274 (0.0026) [2024-06-12 18:41:05,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 48818.8). Total num frames: 889274368. Throughput: 0: 49017.8. Samples: 418048200. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 18:41:05,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:41:07,586][71000] Updated weights for policy 0, policy_version 54284 (0.0033) [2024-06-12 18:41:10,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 889520128. Throughput: 0: 49161.3. Samples: 418349780. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 18:41:10,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:41:11,337][71000] Updated weights for policy 0, policy_version 54294 (0.0029) [2024-06-12 18:41:14,639][71000] Updated weights for policy 0, policy_version 54304 (0.0035) [2024-06-12 18:41:15,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 889765888. Throughput: 0: 49156.8. Samples: 418639520. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 18:41:15,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:41:18,200][71000] Updated weights for policy 0, policy_version 54314 (0.0031) [2024-06-12 18:41:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 48929.9). 
Total num frames: 890011648. Throughput: 0: 49154.2. Samples: 418788520. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 18:41:20,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:41:21,381][71000] Updated weights for policy 0, policy_version 54324 (0.0031) [2024-06-12 18:41:25,211][71000] Updated weights for policy 0, policy_version 54334 (0.0041) [2024-06-12 18:41:25,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 890257408. Throughput: 0: 48997.7. Samples: 419077780. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 18:41:25,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:41:28,153][71000] Updated weights for policy 0, policy_version 54344 (0.0028) [2024-06-12 18:41:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 890503168. Throughput: 0: 48728.6. Samples: 419371500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 18:41:30,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:41:31,875][71000] Updated weights for policy 0, policy_version 54354 (0.0030) [2024-06-12 18:41:34,434][71000] Updated weights for policy 0, policy_version 54364 (0.0029) [2024-06-12 18:41:35,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 890748928. Throughput: 0: 48932.5. Samples: 419521860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 18:41:35,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 18:41:38,297][71000] Updated weights for policy 0, policy_version 54374 (0.0037) [2024-06-12 18:41:40,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 890994688. Throughput: 0: 49186.1. Samples: 419826080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 18:41:40,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 18:41:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000054383_891011072.pth... [2024-06-12 18:41:41,000][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000053663_879214592.pth [2024-06-12 18:41:41,264][71000] Updated weights for policy 0, policy_version 54384 (0.0029) [2024-06-12 18:41:45,199][71000] Updated weights for policy 0, policy_version 54394 (0.0032) [2024-06-12 18:41:45,939][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 891224064. Throughput: 0: 49056.0. Samples: 420115940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 18:41:45,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:41:48,146][71000] Updated weights for policy 0, policy_version 54404 (0.0033) [2024-06-12 18:41:50,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 891469824. Throughput: 0: 48959.5. Samples: 420251380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 18:41:50,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:41:51,919][71000] Updated weights for policy 0, policy_version 54414 (0.0044) [2024-06-12 18:41:54,687][71000] Updated weights for policy 0, policy_version 54424 (0.0031) [2024-06-12 18:41:55,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 891731968. Throughput: 0: 48884.0. Samples: 420549560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 18:41:55,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:41:58,358][71000] Updated weights for policy 0, policy_version 54434 (0.0023) [2024-06-12 18:41:59,653][70980] Signal inference workers to stop experience collection... 
(6150 times) [2024-06-12 18:41:59,675][71000] InferenceWorker_p0-w0: stopping experience collection (6150 times) [2024-06-12 18:41:59,708][70980] Signal inference workers to resume experience collection... (6150 times) [2024-06-12 18:41:59,709][71000] InferenceWorker_p0-w0: resuming experience collection (6150 times) [2024-06-12 18:42:00,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 891994112. Throughput: 0: 49109.1. Samples: 420849420. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-12 18:42:00,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:42:00,979][71000] Updated weights for policy 0, policy_version 54444 (0.0037) [2024-06-12 18:42:04,747][71000] Updated weights for policy 0, policy_version 54454 (0.0024) [2024-06-12 18:42:05,940][70768] Fps is (10 sec: 47512.5, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 892207104. Throughput: 0: 49094.5. Samples: 420997780. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-12 18:42:05,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:42:07,844][71000] Updated weights for policy 0, policy_version 54464 (0.0027) [2024-06-12 18:42:10,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 892452864. Throughput: 0: 49085.8. Samples: 421286640. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-12 18:42:10,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 18:42:11,697][71000] Updated weights for policy 0, policy_version 54474 (0.0037) [2024-06-12 18:42:14,786][71000] Updated weights for policy 0, policy_version 54484 (0.0040) [2024-06-12 18:42:15,939][70768] Fps is (10 sec: 50791.6, 60 sec: 49152.2, 300 sec: 49040.9). Total num frames: 892715008. Throughput: 0: 48990.9. Samples: 421576080. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-12 18:42:15,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:42:18,594][71000] Updated weights for policy 0, policy_version 54494 (0.0027) [2024-06-12 18:42:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 892960768. Throughput: 0: 49101.3. Samples: 421731420. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-12 18:42:20,940][70768] Avg episode reward: [(0, '0.233')] [2024-06-12 18:42:21,325][71000] Updated weights for policy 0, policy_version 54504 (0.0025) [2024-06-12 18:42:24,999][71000] Updated weights for policy 0, policy_version 54514 (0.0021) [2024-06-12 18:42:25,940][70768] Fps is (10 sec: 45874.3, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 893173760. Throughput: 0: 48780.0. Samples: 422021180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-12 18:42:25,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 18:42:27,823][71000] Updated weights for policy 0, policy_version 54524 (0.0028) [2024-06-12 18:42:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 893435904. Throughput: 0: 49070.6. Samples: 422324120. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-12 18:42:30,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 18:42:31,606][71000] Updated weights for policy 0, policy_version 54534 (0.0028) [2024-06-12 18:42:34,707][71000] Updated weights for policy 0, policy_version 54544 (0.0028) [2024-06-12 18:42:35,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 893714432. Throughput: 0: 49293.2. Samples: 422469580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-12 18:42:35,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 18:42:38,683][71000] Updated weights for policy 0, policy_version 54554 (0.0034) [2024-06-12 18:42:40,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.1, 300 sec: 48985.4). 
Total num frames: 893927424. Throughput: 0: 49152.9. Samples: 422761440. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-12 18:42:40,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 18:42:41,457][71000] Updated weights for policy 0, policy_version 54564 (0.0026) [2024-06-12 18:42:45,499][71000] Updated weights for policy 0, policy_version 54574 (0.0037) [2024-06-12 18:42:45,939][70768] Fps is (10 sec: 44237.7, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 894156800. Throughput: 0: 48930.3. Samples: 423051280. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-12 18:42:45,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:42:48,167][71000] Updated weights for policy 0, policy_version 54584 (0.0028) [2024-06-12 18:42:50,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.9, 300 sec: 48818.7). Total num frames: 894402560. Throughput: 0: 48737.0. Samples: 423190940. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-12 18:42:50,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:42:51,891][71000] Updated weights for policy 0, policy_version 54594 (0.0035) [2024-06-12 18:42:54,715][71000] Updated weights for policy 0, policy_version 54604 (0.0025) [2024-06-12 18:42:55,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 894681088. Throughput: 0: 49028.8. Samples: 423492940. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-12 18:42:55,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:42:58,472][71000] Updated weights for policy 0, policy_version 54614 (0.0028) [2024-06-12 18:43:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 894910464. Throughput: 0: 49084.7. Samples: 423784900. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-12 18:43:00,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:43:01,471][71000] Updated weights for policy 0, policy_version 54624 (0.0034) [2024-06-12 18:43:05,201][71000] Updated weights for policy 0, policy_version 54634 (0.0024) [2024-06-12 18:43:05,940][70768] Fps is (10 sec: 45874.4, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 895139840. Throughput: 0: 48703.8. Samples: 423923100. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-12 18:43:05,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 18:43:08,282][71000] Updated weights for policy 0, policy_version 54644 (0.0036) [2024-06-12 18:43:10,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 895385600. Throughput: 0: 48893.5. Samples: 424221380. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 18:43:10,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:43:11,900][71000] Updated weights for policy 0, policy_version 54654 (0.0025) [2024-06-12 18:43:14,827][71000] Updated weights for policy 0, policy_version 54664 (0.0026) [2024-06-12 18:43:15,940][70768] Fps is (10 sec: 52430.0, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 895664128. Throughput: 0: 48720.1. Samples: 424516520. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 18:43:15,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 18:43:18,284][71000] Updated weights for policy 0, policy_version 54674 (0.0036) [2024-06-12 18:43:20,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 895893504. Throughput: 0: 49002.9. Samples: 424674700. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 18:43:20,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 18:43:21,467][71000] Updated weights for policy 0, policy_version 54684 (0.0028) [2024-06-12 18:43:24,783][70980] Signal inference workers to stop experience collection... 
(6200 times) [2024-06-12 18:43:24,784][70980] Signal inference workers to resume experience collection... (6200 times) [2024-06-12 18:43:24,831][71000] InferenceWorker_p0-w0: stopping experience collection (6200 times) [2024-06-12 18:43:24,831][71000] InferenceWorker_p0-w0: resuming experience collection (6200 times) [2024-06-12 18:43:24,915][71000] Updated weights for policy 0, policy_version 54694 (0.0026) [2024-06-12 18:43:25,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.2, 300 sec: 49040.9). Total num frames: 896155648. Throughput: 0: 49219.0. Samples: 424976300. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 18:43:25,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:43:27,915][71000] Updated weights for policy 0, policy_version 54704 (0.0036) [2024-06-12 18:43:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 896385024. Throughput: 0: 49202.5. Samples: 425265400. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 18:43:30,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:43:31,527][71000] Updated weights for policy 0, policy_version 54714 (0.0029) [2024-06-12 18:43:34,778][71000] Updated weights for policy 0, policy_version 54724 (0.0031) [2024-06-12 18:43:35,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 896647168. Throughput: 0: 49309.9. Samples: 425409880. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 18:43:35,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:43:38,411][71000] Updated weights for policy 0, policy_version 54734 (0.0031) [2024-06-12 18:43:40,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 896876544. Throughput: 0: 49085.9. Samples: 425701800. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-12 18:43:40,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:43:40,966][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000054742_896892928.pth... [2024-06-12 18:43:41,028][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000054022_885096448.pth [2024-06-12 18:43:41,472][71000] Updated weights for policy 0, policy_version 54744 (0.0026) [2024-06-12 18:43:45,073][71000] Updated weights for policy 0, policy_version 54754 (0.0029) [2024-06-12 18:43:45,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 897122304. Throughput: 0: 49137.5. Samples: 425996080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 18:43:45,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:43:48,007][71000] Updated weights for policy 0, policy_version 54764 (0.0031) [2024-06-12 18:43:50,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 897368064. Throughput: 0: 49232.1. Samples: 426138540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 18:43:50,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:43:51,779][71000] Updated weights for policy 0, policy_version 54774 (0.0026) [2024-06-12 18:43:54,730][71000] Updated weights for policy 0, policy_version 54784 (0.0027) [2024-06-12 18:43:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 897613824. Throughput: 0: 49100.9. Samples: 426430920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 18:43:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 18:43:58,556][71000] Updated weights for policy 0, policy_version 54794 (0.0028) [2024-06-12 18:44:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 897843200. Throughput: 0: 49159.9. Samples: 426728720. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 18:44:00,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:44:01,393][71000] Updated weights for policy 0, policy_version 54804 (0.0030) [2024-06-12 18:44:05,358][71000] Updated weights for policy 0, policy_version 54814 (0.0036) [2024-06-12 18:44:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.2, 300 sec: 48929.8). Total num frames: 898105344. Throughput: 0: 48781.7. Samples: 426869880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 18:44:05,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:44:07,895][71000] Updated weights for policy 0, policy_version 54824 (0.0022) [2024-06-12 18:44:10,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 898334720. Throughput: 0: 48936.5. Samples: 427178440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 18:44:10,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:44:11,790][71000] Updated weights for policy 0, policy_version 54834 (0.0030) [2024-06-12 18:44:14,616][71000] Updated weights for policy 0, policy_version 54844 (0.0026) [2024-06-12 18:44:15,939][70768] Fps is (10 sec: 49152.6, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 898596864. Throughput: 0: 48953.5. Samples: 427468300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 18:44:15,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 18:44:18,719][71000] Updated weights for policy 0, policy_version 54854 (0.0029) [2024-06-12 18:44:20,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 898859008. Throughput: 0: 49113.8. Samples: 427620000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 18:44:20,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:44:21,625][71000] Updated weights for policy 0, policy_version 54864 (0.0031) [2024-06-12 18:44:25,256][71000] Updated weights for policy 0, policy_version 54874 (0.0025) [2024-06-12 18:44:25,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 899088384. Throughput: 0: 49084.3. Samples: 427910600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 18:44:25,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:44:28,084][71000] Updated weights for policy 0, policy_version 54884 (0.0028) [2024-06-12 18:44:30,940][70768] Fps is (10 sec: 49150.5, 60 sec: 49424.9, 300 sec: 49096.4). Total num frames: 899350528. Throughput: 0: 49249.9. Samples: 428212340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 18:44:30,941][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 18:44:31,798][71000] Updated weights for policy 0, policy_version 54894 (0.0024) [2024-06-12 18:44:34,342][70980] Signal inference workers to stop experience collection... (6250 times) [2024-06-12 18:44:34,385][71000] InferenceWorker_p0-w0: stopping experience collection (6250 times) [2024-06-12 18:44:34,391][70980] Signal inference workers to resume experience collection... (6250 times) [2024-06-12 18:44:34,402][71000] InferenceWorker_p0-w0: resuming experience collection (6250 times) [2024-06-12 18:44:34,519][71000] Updated weights for policy 0, policy_version 54904 (0.0024) [2024-06-12 18:44:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 899579904. Throughput: 0: 49276.1. Samples: 428355960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 18:44:35,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 18:44:38,947][71000] Updated weights for policy 0, policy_version 54914 (0.0033) [2024-06-12 18:44:40,940][70768] Fps is (10 sec: 49153.3, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 899842048. Throughput: 0: 49149.3. Samples: 428642640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 18:44:40,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:44:41,668][71000] Updated weights for policy 0, policy_version 54924 (0.0038) [2024-06-12 18:44:45,527][71000] Updated weights for policy 0, policy_version 54934 (0.0033) [2024-06-12 18:44:45,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 900071424. Throughput: 0: 49153.4. Samples: 428940620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 18:44:45,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:44:48,586][71000] Updated weights for policy 0, policy_version 54944 (0.0029) [2024-06-12 18:44:50,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 900317184. Throughput: 0: 49013.7. Samples: 429075500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-12 18:44:50,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 18:44:52,238][71000] Updated weights for policy 0, policy_version 54954 (0.0037) [2024-06-12 18:44:54,989][71000] Updated weights for policy 0, policy_version 54964 (0.0026) [2024-06-12 18:44:55,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.8, 300 sec: 49096.5). Total num frames: 900546560. Throughput: 0: 48727.0. Samples: 429371160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-12 18:44:55,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:44:58,569][71000] Updated weights for policy 0, policy_version 54974 (0.0025) [2024-06-12 18:45:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49096.5). 
Total num frames: 900808704. Throughput: 0: 49052.6. Samples: 429675680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-12 18:45:00,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:45:01,394][71000] Updated weights for policy 0, policy_version 54984 (0.0031) [2024-06-12 18:45:05,215][71000] Updated weights for policy 0, policy_version 54994 (0.0030) [2024-06-12 18:45:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 901038080. Throughput: 0: 48966.6. Samples: 429823500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-12 18:45:05,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:45:08,350][71000] Updated weights for policy 0, policy_version 55004 (0.0024) [2024-06-12 18:45:10,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49424.9, 300 sec: 48985.4). Total num frames: 901300224. Throughput: 0: 49204.3. Samples: 430124800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-12 18:45:10,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:45:11,779][71000] Updated weights for policy 0, policy_version 55014 (0.0038) [2024-06-12 18:45:15,350][71000] Updated weights for policy 0, policy_version 55024 (0.0022) [2024-06-12 18:45:15,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 901513216. Throughput: 0: 48990.0. Samples: 430416880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-12 18:45:15,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:45:18,369][71000] Updated weights for policy 0, policy_version 55034 (0.0028) [2024-06-12 18:45:20,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 901808128. Throughput: 0: 49058.0. Samples: 430563580. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-12 18:45:20,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:45:21,667][71000] Updated weights for policy 0, policy_version 55044 (0.0024) [2024-06-12 18:45:24,861][71000] Updated weights for policy 0, policy_version 55054 (0.0030) [2024-06-12 18:45:25,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 902053888. Throughput: 0: 49442.5. Samples: 430867560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:45:25,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:45:28,384][71000] Updated weights for policy 0, policy_version 55064 (0.0030) [2024-06-12 18:45:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.1, 300 sec: 49096.4). Total num frames: 902283264. Throughput: 0: 49259.1. Samples: 431157280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:45:30,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 18:45:31,658][71000] Updated weights for policy 0, policy_version 55074 (0.0022) [2024-06-12 18:45:35,171][71000] Updated weights for policy 0, policy_version 55084 (0.0027) [2024-06-12 18:45:35,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 902529024. Throughput: 0: 49483.7. Samples: 431302260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:45:35,942][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:45:36,050][70980] Signal inference workers to stop experience collection... (6300 times) [2024-06-12 18:45:36,051][70980] Signal inference workers to resume experience collection... 
(6300 times) [2024-06-12 18:45:36,083][71000] InferenceWorker_p0-w0: stopping experience collection (6300 times) [2024-06-12 18:45:36,083][71000] InferenceWorker_p0-w0: resuming experience collection (6300 times) [2024-06-12 18:45:38,136][71000] Updated weights for policy 0, policy_version 55094 (0.0027) [2024-06-12 18:45:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 902774784. Throughput: 0: 49587.5. Samples: 431602600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:45:40,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:45:41,076][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000055102_902791168.pth... [2024-06-12 18:45:41,125][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000054383_891011072.pth [2024-06-12 18:45:41,656][71000] Updated weights for policy 0, policy_version 55104 (0.0024) [2024-06-12 18:45:44,767][71000] Updated weights for policy 0, policy_version 55114 (0.0031) [2024-06-12 18:45:45,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 903036928. Throughput: 0: 49493.4. Samples: 431902880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:45:45,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 18:45:48,351][71000] Updated weights for policy 0, policy_version 55124 (0.0031) [2024-06-12 18:45:50,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 903266304. Throughput: 0: 49776.0. Samples: 432063420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:45:50,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:45:51,311][71000] Updated weights for policy 0, policy_version 55134 (0.0025) [2024-06-12 18:45:54,814][71000] Updated weights for policy 0, policy_version 55144 (0.0033) [2024-06-12 18:45:55,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 903528448. 
Throughput: 0: 49558.8. Samples: 432354940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 18:45:55,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:45:58,145][71000] Updated weights for policy 0, policy_version 55154 (0.0022) [2024-06-12 18:46:00,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 903757824. Throughput: 0: 49339.6. Samples: 432637160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 18:46:00,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:46:01,820][71000] Updated weights for policy 0, policy_version 55164 (0.0029) [2024-06-12 18:46:04,895][71000] Updated weights for policy 0, policy_version 55174 (0.0033) [2024-06-12 18:46:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 904019968. Throughput: 0: 49251.7. Samples: 432779900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 18:46:05,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:46:08,284][71000] Updated weights for policy 0, policy_version 55184 (0.0026) [2024-06-12 18:46:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 904249344. Throughput: 0: 49237.9. Samples: 433083260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 18:46:10,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 18:46:11,423][71000] Updated weights for policy 0, policy_version 55194 (0.0029) [2024-06-12 18:46:15,083][71000] Updated weights for policy 0, policy_version 55204 (0.0035) [2024-06-12 18:46:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 49152.0). Total num frames: 904511488. Throughput: 0: 49309.0. Samples: 433376180. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 18:46:15,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:46:18,183][71000] Updated weights for policy 0, policy_version 55214 (0.0031) [2024-06-12 18:46:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 904740864. Throughput: 0: 49325.8. Samples: 433521920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 18:46:20,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:46:21,698][71000] Updated weights for policy 0, policy_version 55224 (0.0029) [2024-06-12 18:46:24,984][71000] Updated weights for policy 0, policy_version 55234 (0.0028) [2024-06-12 18:46:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.2, 300 sec: 49152.0). Total num frames: 905003008. Throughput: 0: 49226.4. Samples: 433817780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 18:46:25,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:46:28,535][71000] Updated weights for policy 0, policy_version 55244 (0.0035) [2024-06-12 18:46:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.1, 300 sec: 49096.4). Total num frames: 905232384. Throughput: 0: 48957.9. Samples: 434105980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 18:46:30,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 18:46:31,530][71000] Updated weights for policy 0, policy_version 55254 (0.0027) [2024-06-12 18:46:35,147][71000] Updated weights for policy 0, policy_version 55264 (0.0029) [2024-06-12 18:46:35,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 905494528. Throughput: 0: 48656.3. Samples: 434252960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 18:46:35,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:46:38,025][71000] Updated weights for policy 0, policy_version 55274 (0.0038) [2024-06-12 18:46:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.0, 300 sec: 49096.5). 
Total num frames: 905707520. Throughput: 0: 48812.9. Samples: 434551520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 18:46:40,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:46:41,367][70980] Signal inference workers to stop experience collection... (6350 times) [2024-06-12 18:46:41,419][71000] InferenceWorker_p0-w0: stopping experience collection (6350 times) [2024-06-12 18:46:41,419][70980] Signal inference workers to resume experience collection... (6350 times) [2024-06-12 18:46:41,438][71000] InferenceWorker_p0-w0: resuming experience collection (6350 times) [2024-06-12 18:46:41,553][71000] Updated weights for policy 0, policy_version 55284 (0.0023) [2024-06-12 18:46:44,755][71000] Updated weights for policy 0, policy_version 55294 (0.0032) [2024-06-12 18:46:45,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 906002432. Throughput: 0: 49248.0. Samples: 434853320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 18:46:45,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:46:48,210][71000] Updated weights for policy 0, policy_version 55304 (0.0033) [2024-06-12 18:46:50,939][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 906231808. Throughput: 0: 49396.1. Samples: 435002720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 18:46:50,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:46:51,127][71000] Updated weights for policy 0, policy_version 55314 (0.0031) [2024-06-12 18:46:54,798][71000] Updated weights for policy 0, policy_version 55324 (0.0023) [2024-06-12 18:46:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 906477568. Throughput: 0: 49340.0. Samples: 435303560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 18:46:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 18:46:58,046][71000] Updated weights for policy 0, policy_version 55334 (0.0034) [2024-06-12 18:47:00,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 906706944. Throughput: 0: 49207.2. Samples: 435590500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 18:47:00,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:47:01,674][71000] Updated weights for policy 0, policy_version 55344 (0.0024) [2024-06-12 18:47:04,498][71000] Updated weights for policy 0, policy_version 55354 (0.0032) [2024-06-12 18:47:05,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 906969088. Throughput: 0: 49336.0. Samples: 435742040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 18:47:05,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:47:07,978][71000] Updated weights for policy 0, policy_version 55364 (0.0033) [2024-06-12 18:47:10,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 907214848. Throughput: 0: 49221.2. Samples: 436032740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 18:47:10,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:47:11,324][71000] Updated weights for policy 0, policy_version 55374 (0.0026) [2024-06-12 18:47:14,852][71000] Updated weights for policy 0, policy_version 55384 (0.0030) [2024-06-12 18:47:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 907444224. Throughput: 0: 49420.4. Samples: 436329900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 18:47:15,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:47:17,918][71000] Updated weights for policy 0, policy_version 55394 (0.0034) [2024-06-12 18:47:20,944][70768] Fps is (10 sec: 47493.6, 60 sec: 49148.4, 300 sec: 49206.8). 
Total num frames: 907689984. Throughput: 0: 49194.5. Samples: 436466920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 18:47:20,945][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 18:47:21,446][71000] Updated weights for policy 0, policy_version 55404 (0.0030) [2024-06-12 18:47:24,662][71000] Updated weights for policy 0, policy_version 55414 (0.0026) [2024-06-12 18:47:25,940][70768] Fps is (10 sec: 50786.8, 60 sec: 49151.4, 300 sec: 49207.4). Total num frames: 907952128. Throughput: 0: 49265.4. Samples: 436768500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 18:47:25,941][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:47:28,133][71000] Updated weights for policy 0, policy_version 55424 (0.0028) [2024-06-12 18:47:30,940][70768] Fps is (10 sec: 50812.4, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 908197888. Throughput: 0: 49229.8. Samples: 437068660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 18:47:30,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 18:47:31,165][71000] Updated weights for policy 0, policy_version 55434 (0.0027) [2024-06-12 18:47:34,603][71000] Updated weights for policy 0, policy_version 55444 (0.0042) [2024-06-12 18:47:35,940][70768] Fps is (10 sec: 49155.5, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 908443648. Throughput: 0: 49324.8. Samples: 437222340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 18:47:35,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:47:37,865][71000] Updated weights for policy 0, policy_version 55454 (0.0034) [2024-06-12 18:47:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 908689408. Throughput: 0: 49128.0. Samples: 437514320. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 18:47:40,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:47:41,010][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000055463_908705792.pth... [2024-06-12 18:47:41,051][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000054742_896892928.pth [2024-06-12 18:47:41,256][71000] Updated weights for policy 0, policy_version 55464 (0.0025) [2024-06-12 18:47:44,766][71000] Updated weights for policy 0, policy_version 55474 (0.0037) [2024-06-12 18:47:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.8, 300 sec: 49207.5). Total num frames: 908918784. Throughput: 0: 49092.3. Samples: 437799660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 18:47:45,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:47:46,266][70980] Signal inference workers to stop experience collection... (6400 times) [2024-06-12 18:47:46,305][71000] InferenceWorker_p0-w0: stopping experience collection (6400 times) [2024-06-12 18:47:46,314][70980] Signal inference workers to resume experience collection... (6400 times) [2024-06-12 18:47:46,321][71000] InferenceWorker_p0-w0: resuming experience collection (6400 times) [2024-06-12 18:47:48,312][71000] Updated weights for policy 0, policy_version 55484 (0.0032) [2024-06-12 18:47:50,941][70768] Fps is (10 sec: 49143.5, 60 sec: 49150.5, 300 sec: 49151.7). Total num frames: 909180928. Throughput: 0: 49136.2. Samples: 437953260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 18:47:50,950][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:47:51,336][71000] Updated weights for policy 0, policy_version 55494 (0.0029) [2024-06-12 18:47:54,634][71000] Updated weights for policy 0, policy_version 55504 (0.0030) [2024-06-12 18:47:55,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 909426688. Throughput: 0: 49377.8. Samples: 438254740. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 18:47:55,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 18:47:57,842][71000] Updated weights for policy 0, policy_version 55514 (0.0030) [2024-06-12 18:48:00,940][70768] Fps is (10 sec: 50799.4, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 909688832. Throughput: 0: 49220.9. Samples: 438544840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 18:48:00,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 18:48:01,465][71000] Updated weights for policy 0, policy_version 55524 (0.0038) [2024-06-12 18:48:04,558][71000] Updated weights for policy 0, policy_version 55534 (0.0039) [2024-06-12 18:48:05,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 909901824. Throughput: 0: 49375.0. Samples: 438688580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 18:48:05,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 18:48:08,103][71000] Updated weights for policy 0, policy_version 55544 (0.0029) [2024-06-12 18:48:10,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 910163968. Throughput: 0: 49088.7. Samples: 438977460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 18:48:10,940][70768] Avg episode reward: [(0, '0.226')] [2024-06-12 18:48:11,337][71000] Updated weights for policy 0, policy_version 55554 (0.0032) [2024-06-12 18:48:15,079][71000] Updated weights for policy 0, policy_version 55564 (0.0035) [2024-06-12 18:48:15,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 910393344. Throughput: 0: 48849.1. Samples: 439266880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 18:48:15,940][70768] Avg episode reward: [(0, '0.242')] [2024-06-12 18:48:17,999][71000] Updated weights for policy 0, policy_version 55574 (0.0028) [2024-06-12 18:48:20,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49155.6, 300 sec: 49096.5). 
Total num frames: 910639104. Throughput: 0: 48573.0. Samples: 439408120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 18:48:20,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:48:21,673][71000] Updated weights for policy 0, policy_version 55584 (0.0024) [2024-06-12 18:48:24,698][71000] Updated weights for policy 0, policy_version 55594 (0.0026) [2024-06-12 18:48:25,940][70768] Fps is (10 sec: 47514.5, 60 sec: 48606.5, 300 sec: 49096.5). Total num frames: 910868480. Throughput: 0: 48844.5. Samples: 439712320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 18:48:25,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:48:28,070][71000] Updated weights for policy 0, policy_version 55604 (0.0028) [2024-06-12 18:48:30,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 911130624. Throughput: 0: 48928.6. Samples: 440001440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 18:48:30,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:48:31,387][71000] Updated weights for policy 0, policy_version 55614 (0.0029) [2024-06-12 18:48:35,199][71000] Updated weights for policy 0, policy_version 55624 (0.0029) [2024-06-12 18:48:35,940][70768] Fps is (10 sec: 52427.3, 60 sec: 49151.8, 300 sec: 49207.5). Total num frames: 911392768. Throughput: 0: 48867.4. Samples: 440152220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 18:48:35,941][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:48:38,196][71000] Updated weights for policy 0, policy_version 55634 (0.0034) [2024-06-12 18:48:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 911622144. Throughput: 0: 48538.7. Samples: 440438980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 18:48:40,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:48:41,895][71000] Updated weights for policy 0, policy_version 55644 (0.0027) [2024-06-12 18:48:45,175][71000] Updated weights for policy 0, policy_version 55654 (0.0030) [2024-06-12 18:48:45,940][70768] Fps is (10 sec: 45876.4, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 911851520. Throughput: 0: 48440.4. Samples: 440724660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 18:48:45,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 18:48:48,810][71000] Updated weights for policy 0, policy_version 55664 (0.0028) [2024-06-12 18:48:49,201][70980] Signal inference workers to stop experience collection... (6450 times) [2024-06-12 18:48:49,249][71000] InferenceWorker_p0-w0: stopping experience collection (6450 times) [2024-06-12 18:48:49,249][70980] Signal inference workers to resume experience collection... (6450 times) [2024-06-12 18:48:49,269][71000] InferenceWorker_p0-w0: resuming experience collection (6450 times) [2024-06-12 18:48:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48607.3, 300 sec: 49096.5). Total num frames: 912097280. Throughput: 0: 48675.1. Samples: 440878960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 18:48:50,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:48:51,646][71000] Updated weights for policy 0, policy_version 55674 (0.0033) [2024-06-12 18:48:55,623][71000] Updated weights for policy 0, policy_version 55684 (0.0026) [2024-06-12 18:48:55,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 912359424. Throughput: 0: 48681.0. Samples: 441168100. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 18:48:55,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 18:48:58,817][71000] Updated weights for policy 0, policy_version 55694 (0.0032) [2024-06-12 18:49:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48332.7, 300 sec: 49096.4). Total num frames: 912588800. Throughput: 0: 48633.8. Samples: 441455400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 18:49:00,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:49:02,226][71000] Updated weights for policy 0, policy_version 55704 (0.0031) [2024-06-12 18:49:05,590][71000] Updated weights for policy 0, policy_version 55714 (0.0031) [2024-06-12 18:49:05,940][70768] Fps is (10 sec: 45874.2, 60 sec: 48605.7, 300 sec: 49096.4). Total num frames: 912818176. Throughput: 0: 48563.7. Samples: 441593500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 18:49:05,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:49:09,161][71000] Updated weights for policy 0, policy_version 55724 (0.0031) [2024-06-12 18:49:10,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48332.9, 300 sec: 49040.9). Total num frames: 913063936. Throughput: 0: 48370.2. Samples: 441888980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 18:49:10,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:49:12,597][71000] Updated weights for policy 0, policy_version 55734 (0.0031) [2024-06-12 18:49:15,578][71000] Updated weights for policy 0, policy_version 55744 (0.0024) [2024-06-12 18:49:15,940][70768] Fps is (10 sec: 50791.6, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 913326080. Throughput: 0: 48659.5. Samples: 442191120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 18:49:15,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:49:19,120][71000] Updated weights for policy 0, policy_version 55754 (0.0041) [2024-06-12 18:49:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.8, 300 sec: 49040.9). 
Total num frames: 913555456. Throughput: 0: 48493.2. Samples: 442334400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 18:49:20,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:49:22,270][71000] Updated weights for policy 0, policy_version 55764 (0.0033) [2024-06-12 18:49:25,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48605.8, 300 sec: 48929.9). Total num frames: 913784832. Throughput: 0: 48575.5. Samples: 442624880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 18:49:25,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:49:26,142][71000] Updated weights for policy 0, policy_version 55774 (0.0037) [2024-06-12 18:49:28,897][71000] Updated weights for policy 0, policy_version 55784 (0.0031) [2024-06-12 18:49:30,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.7, 300 sec: 49040.9). Total num frames: 914046976. Throughput: 0: 48692.4. Samples: 442915820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 18:49:30,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:49:32,914][71000] Updated weights for policy 0, policy_version 55794 (0.0032) [2024-06-12 18:49:35,479][71000] Updated weights for policy 0, policy_version 55804 (0.0033) [2024-06-12 18:49:35,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48333.1, 300 sec: 48985.4). Total num frames: 914292736. Throughput: 0: 48566.3. Samples: 443064440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 18:49:35,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:49:39,385][71000] Updated weights for policy 0, policy_version 55814 (0.0028) [2024-06-12 18:49:40,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 914538496. Throughput: 0: 48867.5. Samples: 443367140. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 18:49:40,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:49:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000055819_914538496.pth... [2024-06-12 18:49:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000055102_902791168.pth [2024-06-12 18:49:42,195][71000] Updated weights for policy 0, policy_version 55824 (0.0032) [2024-06-12 18:49:45,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 914767872. Throughput: 0: 48802.4. Samples: 443651500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 18:49:45,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:49:46,000][71000] Updated weights for policy 0, policy_version 55834 (0.0028) [2024-06-12 18:49:49,259][71000] Updated weights for policy 0, policy_version 55844 (0.0038) [2024-06-12 18:49:50,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 915013632. Throughput: 0: 48808.5. Samples: 443789880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 18:49:50,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:49:53,298][71000] Updated weights for policy 0, policy_version 55854 (0.0028) [2024-06-12 18:49:55,866][71000] Updated weights for policy 0, policy_version 55864 (0.0032) [2024-06-12 18:49:55,940][70768] Fps is (10 sec: 50789.2, 60 sec: 48605.7, 300 sec: 49040.9). Total num frames: 915275776. Throughput: 0: 48553.2. Samples: 444073880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 18:49:55,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 18:49:59,940][71000] Updated weights for policy 0, policy_version 55874 (0.0023) [2024-06-12 18:50:00,564][70980] Signal inference workers to stop experience collection... (6500 times) [2024-06-12 18:50:00,564][70980] Signal inference workers to resume experience collection... 
(6500 times) [2024-06-12 18:50:00,580][71000] InferenceWorker_p0-w0: stopping experience collection (6500 times) [2024-06-12 18:50:00,581][71000] InferenceWorker_p0-w0: resuming experience collection (6500 times) [2024-06-12 18:50:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 49096.4). Total num frames: 915521536. Throughput: 0: 48538.9. Samples: 444375380. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-12 18:50:00,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:50:02,606][71000] Updated weights for policy 0, policy_version 55884 (0.0023) [2024-06-12 18:50:05,940][70768] Fps is (10 sec: 44237.6, 60 sec: 48333.0, 300 sec: 48874.3). Total num frames: 915718144. Throughput: 0: 48539.1. Samples: 444518660. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-12 18:50:05,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:50:06,607][71000] Updated weights for policy 0, policy_version 55894 (0.0033) [2024-06-12 18:50:09,045][71000] Updated weights for policy 0, policy_version 55904 (0.0027) [2024-06-12 18:50:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 915996672. Throughput: 0: 48434.1. Samples: 444804420. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-12 18:50:10,940][70768] Avg episode reward: [(0, '0.235')] [2024-06-12 18:50:13,143][71000] Updated weights for policy 0, policy_version 55914 (0.0032) [2024-06-12 18:50:15,930][71000] Updated weights for policy 0, policy_version 55924 (0.0035) [2024-06-12 18:50:15,940][70768] Fps is (10 sec: 54066.6, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 916258816. Throughput: 0: 48499.5. Samples: 445098300. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-12 18:50:15,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:50:20,167][71000] Updated weights for policy 0, policy_version 55934 (0.0031) [2024-06-12 18:50:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.8, 300 sec: 48929.8). 
Total num frames: 916488192. Throughput: 0: 48531.8. Samples: 445248380. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-12 18:50:20,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 18:50:22,446][71000] Updated weights for policy 0, policy_version 55944 (0.0025) [2024-06-12 18:50:25,940][70768] Fps is (10 sec: 44237.2, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 916701184. Throughput: 0: 48371.1. Samples: 445543840. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-12 18:50:25,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:50:26,730][71000] Updated weights for policy 0, policy_version 55954 (0.0026) [2024-06-12 18:50:29,322][71000] Updated weights for policy 0, policy_version 55964 (0.0025) [2024-06-12 18:50:30,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.7, 300 sec: 48985.3). Total num frames: 916979712. Throughput: 0: 48685.8. Samples: 445842380. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-12 18:50:30,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 18:50:33,240][71000] Updated weights for policy 0, policy_version 55974 (0.0031) [2024-06-12 18:50:35,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 917225472. Throughput: 0: 48934.4. Samples: 445991920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 18:50:35,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:50:35,995][71000] Updated weights for policy 0, policy_version 55984 (0.0028) [2024-06-12 18:50:40,113][71000] Updated weights for policy 0, policy_version 55994 (0.0032) [2024-06-12 18:50:40,940][70768] Fps is (10 sec: 47514.4, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 917454848. Throughput: 0: 49084.0. Samples: 446282660. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 18:50:40,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 18:50:42,589][71000] Updated weights for policy 0, policy_version 56004 (0.0027) [2024-06-12 18:50:45,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 917700608. Throughput: 0: 48825.8. Samples: 446572540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 18:50:45,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 18:50:46,692][71000] Updated weights for policy 0, policy_version 56014 (0.0041) [2024-06-12 18:50:49,664][71000] Updated weights for policy 0, policy_version 56024 (0.0023) [2024-06-12 18:50:50,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 917946368. Throughput: 0: 48782.9. Samples: 446713900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 18:50:50,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:50:53,218][71000] Updated weights for policy 0, policy_version 56034 (0.0032) [2024-06-12 18:50:55,939][70768] Fps is (10 sec: 49152.7, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 918192128. Throughput: 0: 49062.0. Samples: 447012200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 18:50:55,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 18:50:56,149][71000] Updated weights for policy 0, policy_version 56044 (0.0028) [2024-06-12 18:50:59,929][71000] Updated weights for policy 0, policy_version 56054 (0.0037) [2024-06-12 18:51:00,939][70768] Fps is (10 sec: 50791.9, 60 sec: 48879.1, 300 sec: 48929.9). Total num frames: 918454272. Throughput: 0: 49227.3. Samples: 447313520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 18:51:00,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:51:02,854][71000] Updated weights for policy 0, policy_version 56064 (0.0025) [2024-06-12 18:51:05,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 48874.3). 
Total num frames: 918667264. Throughput: 0: 49028.2. Samples: 447454640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 18:51:05,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:51:06,632][71000] Updated weights for policy 0, policy_version 56074 (0.0029) [2024-06-12 18:51:09,935][71000] Updated weights for policy 0, policy_version 56084 (0.0034) [2024-06-12 18:51:10,939][70768] Fps is (10 sec: 45875.0, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 918913024. Throughput: 0: 48847.2. Samples: 447741960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 18:51:10,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:51:13,213][71000] Updated weights for policy 0, policy_version 56094 (0.0036) [2024-06-12 18:51:14,239][70980] Signal inference workers to stop experience collection... (6550 times) [2024-06-12 18:51:14,285][71000] InferenceWorker_p0-w0: stopping experience collection (6550 times) [2024-06-12 18:51:14,295][70980] Signal inference workers to resume experience collection... (6550 times) [2024-06-12 18:51:14,305][71000] InferenceWorker_p0-w0: resuming experience collection (6550 times) [2024-06-12 18:51:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 919158784. Throughput: 0: 48702.6. Samples: 448033980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 18:51:15,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:51:16,478][71000] Updated weights for policy 0, policy_version 56104 (0.0031) [2024-06-12 18:51:19,961][71000] Updated weights for policy 0, policy_version 56114 (0.0020) [2024-06-12 18:51:20,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48605.9, 300 sec: 48818.7). Total num frames: 919404544. Throughput: 0: 48731.4. Samples: 448184840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 18:51:20,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:51:23,177][71000] Updated weights for policy 0, policy_version 56124 (0.0023) [2024-06-12 18:51:25,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 919633920. Throughput: 0: 48799.8. Samples: 448478640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 18:51:25,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:51:26,560][71000] Updated weights for policy 0, policy_version 56134 (0.0041) [2024-06-12 18:51:29,920][71000] Updated weights for policy 0, policy_version 56144 (0.0036) [2024-06-12 18:51:30,939][70768] Fps is (10 sec: 50791.3, 60 sec: 48879.2, 300 sec: 48874.3). Total num frames: 919912448. Throughput: 0: 48797.5. Samples: 448768420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 18:51:30,940][70768] Avg episode reward: [(0, '0.241')] [2024-06-12 18:51:33,218][71000] Updated weights for policy 0, policy_version 56154 (0.0038) [2024-06-12 18:51:35,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48332.7, 300 sec: 48874.3). Total num frames: 920125440. Throughput: 0: 48788.5. Samples: 448909380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 18:51:35,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:51:36,878][71000] Updated weights for policy 0, policy_version 56164 (0.0028) [2024-06-12 18:51:40,086][71000] Updated weights for policy 0, policy_version 56174 (0.0027) [2024-06-12 18:51:40,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 920387584. Throughput: 0: 48784.4. Samples: 449207500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 18:51:40,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:51:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000056176_920387584.pth... 
[2024-06-12 18:51:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000055463_908705792.pth [2024-06-12 18:51:43,498][71000] Updated weights for policy 0, policy_version 56184 (0.0026) [2024-06-12 18:51:45,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 920600576. Throughput: 0: 48398.1. Samples: 449491440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 18:51:45,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 18:51:46,884][71000] Updated weights for policy 0, policy_version 56194 (0.0024) [2024-06-12 18:51:50,094][71000] Updated weights for policy 0, policy_version 56204 (0.0027) [2024-06-12 18:51:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 920879104. Throughput: 0: 48705.7. Samples: 449646400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 18:51:50,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 18:51:53,559][71000] Updated weights for policy 0, policy_version 56214 (0.0028) [2024-06-12 18:51:55,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 921108480. Throughput: 0: 48900.0. Samples: 449942460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 18:51:55,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:51:56,822][71000] Updated weights for policy 0, policy_version 56224 (0.0041) [2024-06-12 18:52:00,048][71000] Updated weights for policy 0, policy_version 56234 (0.0030) [2024-06-12 18:52:00,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.7, 300 sec: 48818.7). Total num frames: 921370624. Throughput: 0: 48735.9. Samples: 450227100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 18:52:00,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:52:03,502][71000] Updated weights for policy 0, policy_version 56244 (0.0030) [2024-06-12 18:52:05,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 48763.3). Total num frames: 921600000. Throughput: 0: 48885.5. Samples: 450384680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 18:52:05,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 18:52:06,722][71000] Updated weights for policy 0, policy_version 56254 (0.0037) [2024-06-12 18:52:09,386][70980] Signal inference workers to stop experience collection... (6600 times) [2024-06-12 18:52:09,429][71000] InferenceWorker_p0-w0: stopping experience collection (6600 times) [2024-06-12 18:52:09,495][70980] Signal inference workers to resume experience collection... (6600 times) [2024-06-12 18:52:09,495][71000] InferenceWorker_p0-w0: resuming experience collection (6600 times) [2024-06-12 18:52:10,056][71000] Updated weights for policy 0, policy_version 56264 (0.0027) [2024-06-12 18:52:10,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49425.0, 300 sec: 48929.9). Total num frames: 921878528. Throughput: 0: 49004.4. Samples: 450683840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 18:52:10,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:52:13,431][71000] Updated weights for policy 0, policy_version 56274 (0.0030) [2024-06-12 18:52:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.9, 300 sec: 48819.5). Total num frames: 922091520. Throughput: 0: 48963.9. Samples: 450971800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 18:52:15,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 18:52:17,002][71000] Updated weights for policy 0, policy_version 56284 (0.0028) [2024-06-12 18:52:20,086][71000] Updated weights for policy 0, policy_version 56294 (0.0055) [2024-06-12 18:52:20,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48879.0, 300 sec: 48763.3). Total num frames: 922337280. Throughput: 0: 48922.8. Samples: 451110900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 18:52:20,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:52:23,773][71000] Updated weights for policy 0, policy_version 56304 (0.0042) [2024-06-12 18:52:25,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 922583040. Throughput: 0: 48734.3. Samples: 451400540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 18:52:25,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:52:26,845][71000] Updated weights for policy 0, policy_version 56314 (0.0032) [2024-06-12 18:52:30,478][71000] Updated weights for policy 0, policy_version 56324 (0.0023) [2024-06-12 18:52:30,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 922828800. Throughput: 0: 49099.6. Samples: 451700920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 18:52:30,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 18:52:33,751][71000] Updated weights for policy 0, policy_version 56334 (0.0029) [2024-06-12 18:52:35,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48606.0, 300 sec: 48652.2). Total num frames: 923041792. Throughput: 0: 48979.6. Samples: 451850480. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 18:52:35,948][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:52:36,983][71000] Updated weights for policy 0, policy_version 56344 (0.0026) [2024-06-12 18:52:40,295][71000] Updated weights for policy 0, policy_version 56354 (0.0020) [2024-06-12 18:52:40,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 923320320. Throughput: 0: 48853.6. Samples: 452140880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 18:52:40,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:52:43,831][71000] Updated weights for policy 0, policy_version 56364 (0.0032) [2024-06-12 18:52:45,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.1, 300 sec: 48763.5). Total num frames: 923566080. Throughput: 0: 48999.6. Samples: 452432080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 18:52:45,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:52:46,834][71000] Updated weights for policy 0, policy_version 56374 (0.0018) [2024-06-12 18:52:50,567][71000] Updated weights for policy 0, policy_version 56384 (0.0034) [2024-06-12 18:52:50,939][70768] Fps is (10 sec: 49152.9, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 923811840. Throughput: 0: 48844.0. Samples: 452582660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 18:52:50,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:52:53,412][71000] Updated weights for policy 0, policy_version 56394 (0.0042) [2024-06-12 18:52:55,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.8, 300 sec: 48596.6). Total num frames: 924024832. Throughput: 0: 48662.2. Samples: 452873640. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 18:52:55,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:52:57,124][71000] Updated weights for policy 0, policy_version 56404 (0.0029) [2024-06-12 18:53:00,364][71000] Updated weights for policy 0, policy_version 56414 (0.0030) [2024-06-12 18:53:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 924286976. Throughput: 0: 48707.5. Samples: 453163640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 18:53:00,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:53:03,870][71000] Updated weights for policy 0, policy_version 56424 (0.0023) [2024-06-12 18:53:04,318][70980] Signal inference workers to stop experience collection... (6650 times) [2024-06-12 18:53:04,320][70980] Signal inference workers to resume experience collection... (6650 times) [2024-06-12 18:53:04,341][71000] InferenceWorker_p0-w0: stopping experience collection (6650 times) [2024-06-12 18:53:04,342][71000] InferenceWorker_p0-w0: resuming experience collection (6650 times) [2024-06-12 18:53:05,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49152.0, 300 sec: 48763.3). Total num frames: 924549120. Throughput: 0: 49233.4. Samples: 453326400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 18:53:05,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:53:07,055][71000] Updated weights for policy 0, policy_version 56434 (0.0029) [2024-06-12 18:53:10,658][71000] Updated weights for policy 0, policy_version 56444 (0.0024) [2024-06-12 18:53:10,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48332.8, 300 sec: 48763.3). Total num frames: 924778496. Throughput: 0: 49136.4. Samples: 453611680. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 18:53:10,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 18:53:14,000][71000] Updated weights for policy 0, policy_version 56454 (0.0026) [2024-06-12 18:53:15,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 925007872. Throughput: 0: 48930.6. Samples: 453902800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 18:53:15,940][70768] Avg episode reward: [(0, '0.234')] [2024-06-12 18:53:17,467][71000] Updated weights for policy 0, policy_version 56464 (0.0031) [2024-06-12 18:53:20,418][71000] Updated weights for policy 0, policy_version 56474 (0.0024) [2024-06-12 18:53:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 925270016. Throughput: 0: 48760.4. Samples: 454044700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 18:53:20,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:53:24,053][71000] Updated weights for policy 0, policy_version 56484 (0.0027) [2024-06-12 18:53:25,940][70768] Fps is (10 sec: 54066.4, 60 sec: 49424.9, 300 sec: 48874.3). Total num frames: 925548544. Throughput: 0: 48889.7. Samples: 454340920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 18:53:25,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:53:27,236][71000] Updated weights for policy 0, policy_version 56494 (0.0041) [2024-06-12 18:53:30,693][71000] Updated weights for policy 0, policy_version 56504 (0.0027) [2024-06-12 18:53:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49151.9, 300 sec: 48763.3). Total num frames: 925777920. Throughput: 0: 49221.8. Samples: 454647060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 18:53:30,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 18:53:33,593][71000] Updated weights for policy 0, policy_version 56514 (0.0032) [2024-06-12 18:53:35,940][70768] Fps is (10 sec: 44237.2, 60 sec: 49151.9, 300 sec: 48707.7). 
Total num frames: 925990912. Throughput: 0: 48801.7. Samples: 454778740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 18:53:35,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:53:37,339][71000] Updated weights for policy 0, policy_version 56524 (0.0035) [2024-06-12 18:53:40,528][71000] Updated weights for policy 0, policy_version 56534 (0.0027) [2024-06-12 18:53:40,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 926253056. Throughput: 0: 48846.2. Samples: 455071720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 18:53:40,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:53:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000056534_926253056.pth... [2024-06-12 18:53:40,995][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000055819_914538496.pth [2024-06-12 18:53:44,348][71000] Updated weights for policy 0, policy_version 56544 (0.0033) [2024-06-12 18:53:45,940][70768] Fps is (10 sec: 54067.9, 60 sec: 49425.1, 300 sec: 48929.9). Total num frames: 926531584. Throughput: 0: 48911.2. Samples: 455364640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 18:53:45,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 18:53:47,248][71000] Updated weights for policy 0, policy_version 56554 (0.0028) [2024-06-12 18:53:50,916][71000] Updated weights for policy 0, policy_version 56564 (0.0035) [2024-06-12 18:53:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 926744576. Throughput: 0: 48680.4. Samples: 455517020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 18:53:50,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:53:53,763][71000] Updated weights for policy 0, policy_version 56574 (0.0033) [2024-06-12 18:53:55,939][70768] Fps is (10 sec: 44236.7, 60 sec: 49152.0, 300 sec: 48763.3). Total num frames: 926973952. Throughput: 0: 48745.8. 
Samples: 455805240. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 18:53:55,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 18:53:57,782][71000] Updated weights for policy 0, policy_version 56584 (0.0039) [2024-06-12 18:54:00,268][71000] Updated weights for policy 0, policy_version 56594 (0.0035) [2024-06-12 18:54:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 927236096. Throughput: 0: 48466.5. Samples: 456083800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-12 18:54:00,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 18:54:04,487][71000] Updated weights for policy 0, policy_version 56604 (0.0027) [2024-06-12 18:54:05,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 927481856. Throughput: 0: 48971.4. Samples: 456248420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 18:54:05,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 18:54:07,178][71000] Updated weights for policy 0, policy_version 56614 (0.0030) [2024-06-12 18:54:10,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48605.6, 300 sec: 48707.6). Total num frames: 927694848. Throughput: 0: 48690.2. Samples: 456531980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 18:54:10,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:54:11,482][71000] Updated weights for policy 0, policy_version 56624 (0.0030) [2024-06-12 18:54:12,977][70980] Signal inference workers to stop experience collection... (6700 times) [2024-06-12 18:54:13,005][71000] InferenceWorker_p0-w0: stopping experience collection (6700 times) [2024-06-12 18:54:13,031][70980] Signal inference workers to resume experience collection... 
(6700 times) [2024-06-12 18:54:13,032][71000] InferenceWorker_p0-w0: resuming experience collection (6700 times) [2024-06-12 18:54:14,109][71000] Updated weights for policy 0, policy_version 56634 (0.0029) [2024-06-12 18:54:15,939][70768] Fps is (10 sec: 44237.5, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 927924224. Throughput: 0: 48461.0. Samples: 456827800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 18:54:15,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 18:54:17,995][71000] Updated weights for policy 0, policy_version 56644 (0.0022) [2024-06-12 18:54:20,507][71000] Updated weights for policy 0, policy_version 56654 (0.0027) [2024-06-12 18:54:20,940][70768] Fps is (10 sec: 52430.1, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 928219136. Throughput: 0: 48704.5. Samples: 456970440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 18:54:20,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 18:54:24,910][71000] Updated weights for policy 0, policy_version 56664 (0.0033) [2024-06-12 18:54:25,940][70768] Fps is (10 sec: 54067.1, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 928464896. Throughput: 0: 48892.4. Samples: 457271880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 18:54:25,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 18:54:27,521][71000] Updated weights for policy 0, policy_version 56674 (0.0021) [2024-06-12 18:54:30,939][70768] Fps is (10 sec: 45875.3, 60 sec: 48332.9, 300 sec: 48763.2). Total num frames: 928677888. Throughput: 0: 48948.0. Samples: 457567300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 18:54:30,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 18:54:31,566][71000] Updated weights for policy 0, policy_version 56684 (0.0022) [2024-06-12 18:54:34,123][71000] Updated weights for policy 0, policy_version 56694 (0.0031) [2024-06-12 18:54:35,939][70768] Fps is (10 sec: 44237.1, 60 sec: 48606.0, 300 sec: 48707.7). 
Total num frames: 928907264. Throughput: 0: 48556.1. Samples: 457702040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 18:54:35,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 18:54:38,443][71000] Updated weights for policy 0, policy_version 56704 (0.0039) [2024-06-12 18:54:40,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 929185792. Throughput: 0: 48662.2. Samples: 457995040. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-12 18:54:40,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:54:41,054][71000] Updated weights for policy 0, policy_version 56714 (0.0027) [2024-06-12 18:54:45,144][71000] Updated weights for policy 0, policy_version 56724 (0.0023) [2024-06-12 18:54:45,939][70768] Fps is (10 sec: 50790.2, 60 sec: 48059.7, 300 sec: 48818.8). Total num frames: 929415168. Throughput: 0: 49199.3. Samples: 458297760. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-12 18:54:45,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 18:54:47,535][71000] Updated weights for policy 0, policy_version 56734 (0.0020) [2024-06-12 18:54:50,940][70768] Fps is (10 sec: 45874.0, 60 sec: 48332.6, 300 sec: 48707.7). Total num frames: 929644544. Throughput: 0: 48599.8. Samples: 458435420. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-12 18:54:50,941][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:54:51,728][71000] Updated weights for policy 0, policy_version 56744 (0.0037) [2024-06-12 18:54:54,788][71000] Updated weights for policy 0, policy_version 56754 (0.0024) [2024-06-12 18:54:55,940][70768] Fps is (10 sec: 45874.3, 60 sec: 48332.7, 300 sec: 48652.2). Total num frames: 929873920. Throughput: 0: 48519.2. Samples: 458715340. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-12 18:54:55,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:54:58,811][71000] Updated weights for policy 0, policy_version 56764 (0.0028) [2024-06-12 18:55:00,940][70768] Fps is (10 sec: 52429.6, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 930168832. Throughput: 0: 48403.0. Samples: 459005940. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-12 18:55:00,940][70768] Avg episode reward: [(0, '0.237')] [2024-06-12 18:55:01,245][71000] Updated weights for policy 0, policy_version 56774 (0.0024) [2024-06-12 18:55:05,565][71000] Updated weights for policy 0, policy_version 56784 (0.0033) [2024-06-12 18:55:05,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48059.8, 300 sec: 48707.7). Total num frames: 930365440. Throughput: 0: 48685.8. Samples: 459161300. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-12 18:55:05,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:55:07,925][71000] Updated weights for policy 0, policy_version 56794 (0.0027) [2024-06-12 18:55:09,358][70980] Signal inference workers to stop experience collection... (6750 times) [2024-06-12 18:55:09,358][70980] Signal inference workers to resume experience collection... (6750 times) [2024-06-12 18:55:09,371][71000] InferenceWorker_p0-w0: stopping experience collection (6750 times) [2024-06-12 18:55:09,371][71000] InferenceWorker_p0-w0: resuming experience collection (6750 times) [2024-06-12 18:55:10,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 930627584. Throughput: 0: 48539.9. Samples: 459456180. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-12 18:55:10,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:55:12,158][71000] Updated weights for policy 0, policy_version 56804 (0.0024) [2024-06-12 18:55:14,494][71000] Updated weights for policy 0, policy_version 56814 (0.0029) [2024-06-12 18:55:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 930856960. Throughput: 0: 48414.6. Samples: 459745960. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-12 18:55:15,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:55:18,692][71000] Updated weights for policy 0, policy_version 56824 (0.0034) [2024-06-12 18:55:20,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 931135488. Throughput: 0: 48736.8. Samples: 459895200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:55:20,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:55:21,373][71000] Updated weights for policy 0, policy_version 56834 (0.0031) [2024-06-12 18:55:25,285][71000] Updated weights for policy 0, policy_version 56844 (0.0026) [2024-06-12 18:55:25,940][70768] Fps is (10 sec: 50789.5, 60 sec: 48332.7, 300 sec: 48763.3). Total num frames: 931364864. Throughput: 0: 48729.6. Samples: 460187880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:55:25,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 18:55:27,842][71000] Updated weights for policy 0, policy_version 56854 (0.0040) [2024-06-12 18:55:30,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 931594240. Throughput: 0: 48589.3. Samples: 460484280. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:55:30,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:55:31,958][71000] Updated weights for policy 0, policy_version 56864 (0.0034) [2024-06-12 18:55:34,777][71000] Updated weights for policy 0, policy_version 56874 (0.0031) [2024-06-12 18:55:35,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49151.9, 300 sec: 48818.8). Total num frames: 931856384. Throughput: 0: 48832.7. Samples: 460632880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:55:35,940][70768] Avg episode reward: [(0, '0.236')] [2024-06-12 18:55:38,457][71000] Updated weights for policy 0, policy_version 56884 (0.0038) [2024-06-12 18:55:40,939][70768] Fps is (10 sec: 52428.7, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 932118528. Throughput: 0: 49194.0. Samples: 460929060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:55:40,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 18:55:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000056892_932118528.pth... [2024-06-12 18:55:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000056176_920387584.pth [2024-06-12 18:55:41,315][71000] Updated weights for policy 0, policy_version 56894 (0.0027) [2024-06-12 18:55:45,183][71000] Updated weights for policy 0, policy_version 56904 (0.0032) [2024-06-12 18:55:45,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 932331520. Throughput: 0: 49215.2. Samples: 461220620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:55:45,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 18:55:48,053][71000] Updated weights for policy 0, policy_version 56914 (0.0026) [2024-06-12 18:55:50,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48879.1, 300 sec: 48763.2). Total num frames: 932577280. Throughput: 0: 48929.3. Samples: 461363120. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 18:55:50,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 18:55:51,711][71000] Updated weights for policy 0, policy_version 56924 (0.0035) [2024-06-12 18:55:54,759][71000] Updated weights for policy 0, policy_version 56934 (0.0030) [2024-06-12 18:55:55,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.3, 300 sec: 48763.2). Total num frames: 932839424. Throughput: 0: 49056.3. Samples: 461663700. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-12 18:55:55,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 18:55:58,423][71000] Updated weights for policy 0, policy_version 56944 (0.0028) [2024-06-12 18:56:00,944][70768] Fps is (10 sec: 52407.9, 60 sec: 48875.7, 300 sec: 48929.2). Total num frames: 933101568. Throughput: 0: 49229.8. Samples: 461961500. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-12 18:56:00,944][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:56:01,457][71000] Updated weights for policy 0, policy_version 56954 (0.0029) [2024-06-12 18:56:04,954][71000] Updated weights for policy 0, policy_version 56964 (0.0022) [2024-06-12 18:56:05,940][70768] Fps is (10 sec: 50789.2, 60 sec: 49698.0, 300 sec: 48929.8). Total num frames: 933347328. Throughput: 0: 49239.0. Samples: 462110960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-12 18:56:05,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 18:56:07,896][71000] Updated weights for policy 0, policy_version 56974 (0.0032) [2024-06-12 18:56:10,940][70768] Fps is (10 sec: 45893.8, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 933560320. Throughput: 0: 49323.8. Samples: 462407440. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-12 18:56:10,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 18:56:11,595][71000] Updated weights for policy 0, policy_version 56984 (0.0024) [2024-06-12 18:56:14,970][71000] Updated weights for policy 0, policy_version 56994 (0.0026) [2024-06-12 18:56:15,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49698.2, 300 sec: 48929.9). Total num frames: 933838848. Throughput: 0: 49164.0. Samples: 462696660. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-12 18:56:15,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:56:18,097][71000] Updated weights for policy 0, policy_version 57004 (0.0028) [2024-06-12 18:56:20,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 934051840. Throughput: 0: 49055.1. Samples: 462840360. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-12 18:56:20,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 18:56:21,093][70980] Signal inference workers to stop experience collection... (6800 times) [2024-06-12 18:56:21,142][71000] InferenceWorker_p0-w0: stopping experience collection (6800 times) [2024-06-12 18:56:21,146][70980] Signal inference workers to resume experience collection... (6800 times) [2024-06-12 18:56:21,156][71000] InferenceWorker_p0-w0: resuming experience collection (6800 times) [2024-06-12 18:56:21,506][71000] Updated weights for policy 0, policy_version 57014 (0.0027) [2024-06-12 18:56:24,808][71000] Updated weights for policy 0, policy_version 57024 (0.0035) [2024-06-12 18:56:25,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 934297600. Throughput: 0: 49302.6. Samples: 463147680. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-12 18:56:25,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:56:28,071][71000] Updated weights for policy 0, policy_version 57034 (0.0032) [2024-06-12 18:56:30,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49424.9, 300 sec: 48929.8). Total num frames: 934559744. Throughput: 0: 49307.4. Samples: 463439460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:56:30,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:56:31,582][71000] Updated weights for policy 0, policy_version 57044 (0.0022) [2024-06-12 18:56:34,762][71000] Updated weights for policy 0, policy_version 57054 (0.0032) [2024-06-12 18:56:35,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 934821888. Throughput: 0: 49639.6. Samples: 463596900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:56:35,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 18:56:38,195][71000] Updated weights for policy 0, policy_version 57064 (0.0028) [2024-06-12 18:56:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 935051264. Throughput: 0: 49131.3. Samples: 463874620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:56:40,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 18:56:41,546][71000] Updated weights for policy 0, policy_version 57074 (0.0032) [2024-06-12 18:56:45,302][71000] Updated weights for policy 0, policy_version 57084 (0.0025) [2024-06-12 18:56:45,940][70768] Fps is (10 sec: 45874.3, 60 sec: 49151.9, 300 sec: 48818.7). Total num frames: 935280640. Throughput: 0: 49066.4. Samples: 464169300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:56:45,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 18:56:48,267][71000] Updated weights for policy 0, policy_version 57094 (0.0026) [2024-06-12 18:56:50,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.1, 300 sec: 48929.8). 
Total num frames: 935542784. Throughput: 0: 49005.9. Samples: 464316220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:56:50,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 18:56:51,726][71000] Updated weights for policy 0, policy_version 57104 (0.0024) [2024-06-12 18:56:55,241][71000] Updated weights for policy 0, policy_version 57114 (0.0028) [2024-06-12 18:56:55,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49424.9, 300 sec: 48929.8). Total num frames: 935804928. Throughput: 0: 49034.1. Samples: 464613980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:56:55,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 18:56:55,941][70980] Saving new best policy, reward=0.268! [2024-06-12 18:56:58,526][71000] Updated weights for policy 0, policy_version 57124 (0.0031) [2024-06-12 18:57:00,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48336.0, 300 sec: 48818.8). Total num frames: 936001536. Throughput: 0: 49151.5. Samples: 464908480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 18:57:00,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 18:57:01,906][71000] Updated weights for policy 0, policy_version 57134 (0.0029) [2024-06-12 18:57:05,173][71000] Updated weights for policy 0, policy_version 57144 (0.0031) [2024-06-12 18:57:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48818.7). Total num frames: 936280064. Throughput: 0: 48778.9. Samples: 465035420. Policy #0 lag: (min: 3.0, avg: 10.9, max: 23.0) [2024-06-12 18:57:05,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 18:57:08,823][71000] Updated weights for policy 0, policy_version 57154 (0.0025) [2024-06-12 18:57:10,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 936525824. Throughput: 0: 48588.5. Samples: 465334160. 
Policy #0 lag: (min: 3.0, avg: 10.9, max: 23.0) [2024-06-12 18:57:10,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 18:57:11,983][71000] Updated weights for policy 0, policy_version 57164 (0.0028) [2024-06-12 18:57:15,404][71000] Updated weights for policy 0, policy_version 57174 (0.0027) [2024-06-12 18:57:15,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 936787968. Throughput: 0: 48788.6. Samples: 465634940. Policy #0 lag: (min: 3.0, avg: 10.9, max: 23.0) [2024-06-12 18:57:15,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 18:57:18,782][71000] Updated weights for policy 0, policy_version 57184 (0.0026) [2024-06-12 18:57:20,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 936984576. Throughput: 0: 48551.4. Samples: 465781720. Policy #0 lag: (min: 3.0, avg: 10.9, max: 23.0) [2024-06-12 18:57:20,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 18:57:21,916][71000] Updated weights for policy 0, policy_version 57194 (0.0032) [2024-06-12 18:57:25,206][71000] Updated weights for policy 0, policy_version 57204 (0.0029) [2024-06-12 18:57:25,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 937246720. Throughput: 0: 48792.1. Samples: 466070260. Policy #0 lag: (min: 3.0, avg: 10.9, max: 23.0) [2024-06-12 18:57:25,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:57:27,561][70980] Signal inference workers to stop experience collection... (6850 times) [2024-06-12 18:57:27,614][71000] InferenceWorker_p0-w0: stopping experience collection (6850 times) [2024-06-12 18:57:27,615][70980] Signal inference workers to resume experience collection... 
(6850 times) [2024-06-12 18:57:27,629][71000] InferenceWorker_p0-w0: resuming experience collection (6850 times) [2024-06-12 18:57:28,823][71000] Updated weights for policy 0, policy_version 57214 (0.0027) [2024-06-12 18:57:30,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 937492480. Throughput: 0: 48690.9. Samples: 466360380. Policy #0 lag: (min: 3.0, avg: 10.9, max: 23.0) [2024-06-12 18:57:30,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 18:57:31,817][71000] Updated weights for policy 0, policy_version 57224 (0.0028) [2024-06-12 18:57:35,450][71000] Updated weights for policy 0, policy_version 57234 (0.0031) [2024-06-12 18:57:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 937738240. Throughput: 0: 48944.5. Samples: 466518720. Policy #0 lag: (min: 3.0, avg: 10.9, max: 23.0) [2024-06-12 18:57:35,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 18:57:38,412][71000] Updated weights for policy 0, policy_version 57244 (0.0032) [2024-06-12 18:57:40,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 937967616. Throughput: 0: 48942.3. Samples: 466816380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-12 18:57:40,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 18:57:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000057249_937967616.pth... [2024-06-12 18:57:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000056534_926253056.pth [2024-06-12 18:57:42,085][71000] Updated weights for policy 0, policy_version 57254 (0.0029) [2024-06-12 18:57:45,618][71000] Updated weights for policy 0, policy_version 57264 (0.0027) [2024-06-12 18:57:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 938229760. Throughput: 0: 48715.9. Samples: 467100700. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-12 18:57:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 18:57:48,842][71000] Updated weights for policy 0, policy_version 57274 (0.0038) [2024-06-12 18:57:50,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 938491904. Throughput: 0: 49291.3. Samples: 467253520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-12 18:57:50,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 18:57:52,145][71000] Updated weights for policy 0, policy_version 57284 (0.0030) [2024-06-12 18:57:55,630][71000] Updated weights for policy 0, policy_version 57294 (0.0037) [2024-06-12 18:57:55,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 938721280. Throughput: 0: 49182.7. Samples: 467547380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-12 18:57:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 18:57:58,596][71000] Updated weights for policy 0, policy_version 57304 (0.0029) [2024-06-12 18:58:00,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49152.0, 300 sec: 48818.7). Total num frames: 938950656. Throughput: 0: 49069.3. Samples: 467843060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-12 18:58:00,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 18:58:02,060][71000] Updated weights for policy 0, policy_version 57314 (0.0026) [2024-06-12 18:58:04,923][71000] Updated weights for policy 0, policy_version 57324 (0.0028) [2024-06-12 18:58:05,941][70768] Fps is (10 sec: 47504.7, 60 sec: 48604.5, 300 sec: 48874.0). Total num frames: 939196416. Throughput: 0: 48903.9. Samples: 467982480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-12 18:58:05,942][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:58:08,780][71000] Updated weights for policy 0, policy_version 57334 (0.0026) [2024-06-12 18:58:10,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49151.9, 300 sec: 49040.9). 
Total num frames: 939474944. Throughput: 0: 48997.6. Samples: 468275160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-12 18:58:10,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 18:58:12,411][71000] Updated weights for policy 0, policy_version 57344 (0.0029) [2024-06-12 18:58:15,403][71000] Updated weights for policy 0, policy_version 57354 (0.0037) [2024-06-12 18:58:15,940][70768] Fps is (10 sec: 49160.8, 60 sec: 48332.7, 300 sec: 48874.3). Total num frames: 939687936. Throughput: 0: 49039.9. Samples: 468567180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 18:58:15,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:58:18,996][71000] Updated weights for policy 0, policy_version 57364 (0.0031) [2024-06-12 18:58:20,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 939933696. Throughput: 0: 48606.9. Samples: 468706040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 18:58:20,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 18:58:22,291][71000] Updated weights for policy 0, policy_version 57374 (0.0024) [2024-06-12 18:58:25,598][71000] Updated weights for policy 0, policy_version 57384 (0.0027) [2024-06-12 18:58:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 940179456. Throughput: 0: 48663.1. Samples: 469006220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 18:58:25,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:58:28,734][71000] Updated weights for policy 0, policy_version 57394 (0.0024) [2024-06-12 18:58:30,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 940441600. Throughput: 0: 49079.2. Samples: 469309260. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 18:58:30,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 18:58:32,442][71000] Updated weights for policy 0, policy_version 57404 (0.0036) [2024-06-12 18:58:35,819][71000] Updated weights for policy 0, policy_version 57414 (0.0031) [2024-06-12 18:58:35,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 940670976. Throughput: 0: 48973.3. Samples: 469457320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 18:58:35,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 18:58:39,169][71000] Updated weights for policy 0, policy_version 57424 (0.0028) [2024-06-12 18:58:40,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 940900352. Throughput: 0: 48810.8. Samples: 469743860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 18:58:40,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 18:58:42,279][71000] Updated weights for policy 0, policy_version 57434 (0.0038) [2024-06-12 18:58:45,619][71000] Updated weights for policy 0, policy_version 57444 (0.0028) [2024-06-12 18:58:45,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 941178880. Throughput: 0: 48781.9. Samples: 470038240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-12 18:58:45,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:58:48,771][71000] Updated weights for policy 0, policy_version 57454 (0.0025) [2024-06-12 18:58:49,566][70980] Signal inference workers to stop experience collection... (6900 times) [2024-06-12 18:58:49,569][70980] Signal inference workers to resume experience collection... 
(6900 times) [2024-06-12 18:58:49,603][71000] InferenceWorker_p0-w0: stopping experience collection (6900 times) [2024-06-12 18:58:49,604][71000] InferenceWorker_p0-w0: resuming experience collection (6900 times) [2024-06-12 18:58:50,940][70768] Fps is (10 sec: 52427.2, 60 sec: 48878.7, 300 sec: 48985.3). Total num frames: 941424640. Throughput: 0: 49036.9. Samples: 470189060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 18:58:50,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 18:58:52,403][71000] Updated weights for policy 0, policy_version 57464 (0.0038) [2024-06-12 18:58:55,651][71000] Updated weights for policy 0, policy_version 57474 (0.0028) [2024-06-12 18:58:55,940][70768] Fps is (10 sec: 47512.6, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 941654016. Throughput: 0: 49006.1. Samples: 470480440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 18:58:55,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 18:58:59,525][71000] Updated weights for policy 0, policy_version 57484 (0.0036) [2024-06-12 18:59:00,940][70768] Fps is (10 sec: 45876.3, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 941883392. Throughput: 0: 49064.1. Samples: 470775060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 18:59:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 18:59:01,076][70980] Saving new best policy, reward=0.271! [2024-06-12 18:59:02,767][71000] Updated weights for policy 0, policy_version 57494 (0.0037) [2024-06-12 18:59:05,939][70768] Fps is (10 sec: 47514.7, 60 sec: 48880.5, 300 sec: 48929.9). Total num frames: 942129152. Throughput: 0: 48971.3. Samples: 470909740. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 18:59:05,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:59:06,085][71000] Updated weights for policy 0, policy_version 57504 (0.0041) [2024-06-12 18:59:09,107][71000] Updated weights for policy 0, policy_version 57514 (0.0023) [2024-06-12 18:59:10,940][70768] Fps is (10 sec: 50789.3, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 942391296. Throughput: 0: 48842.0. Samples: 471204120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 18:59:10,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 18:59:12,694][71000] Updated weights for policy 0, policy_version 57524 (0.0031) [2024-06-12 18:59:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 942620672. Throughput: 0: 48489.3. Samples: 471491280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 18:59:15,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 18:59:16,062][71000] Updated weights for policy 0, policy_version 57534 (0.0031) [2024-06-12 18:59:19,346][71000] Updated weights for policy 0, policy_version 57544 (0.0030) [2024-06-12 18:59:20,939][70768] Fps is (10 sec: 45876.7, 60 sec: 48606.1, 300 sec: 48763.2). Total num frames: 942850048. Throughput: 0: 48397.0. Samples: 471635180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 18:59:20,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 18:59:22,729][71000] Updated weights for policy 0, policy_version 57554 (0.0030) [2024-06-12 18:59:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 943095808. Throughput: 0: 48646.1. Samples: 471932940. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:59:25,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 18:59:26,350][71000] Updated weights for policy 0, policy_version 57564 (0.0020) [2024-06-12 18:59:29,391][71000] Updated weights for policy 0, policy_version 57574 (0.0033) [2024-06-12 18:59:30,940][70768] Fps is (10 sec: 52427.4, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 943374336. Throughput: 0: 48667.4. Samples: 472228280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:59:30,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 18:59:32,856][71000] Updated weights for policy 0, policy_version 57584 (0.0035) [2024-06-12 18:59:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 943587328. Throughput: 0: 48676.2. Samples: 472379480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:59:35,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 18:59:36,161][71000] Updated weights for policy 0, policy_version 57594 (0.0028) [2024-06-12 18:59:39,610][71000] Updated weights for policy 0, policy_version 57604 (0.0025) [2024-06-12 18:59:40,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 943833088. Throughput: 0: 48668.6. Samples: 472670520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:59:40,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:59:41,096][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000057609_943865856.pth... [2024-06-12 18:59:41,151][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000056892_932118528.pth [2024-06-12 18:59:42,878][71000] Updated weights for policy 0, policy_version 57614 (0.0025) [2024-06-12 18:59:45,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48332.7, 300 sec: 48929.9). Total num frames: 944078848. Throughput: 0: 48665.6. Samples: 472965020. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:59:45,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 18:59:46,340][71000] Updated weights for policy 0, policy_version 57624 (0.0021) [2024-06-12 18:59:49,346][71000] Updated weights for policy 0, policy_version 57634 (0.0025) [2024-06-12 18:59:50,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48606.0, 300 sec: 49040.9). Total num frames: 944340992. Throughput: 0: 48869.3. Samples: 473108860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:59:50,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 18:59:53,145][71000] Updated weights for policy 0, policy_version 57644 (0.0033) [2024-06-12 18:59:55,939][70768] Fps is (10 sec: 50791.4, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 944586752. Throughput: 0: 48915.4. Samples: 473405300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 18:59:55,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 18:59:56,041][71000] Updated weights for policy 0, policy_version 57654 (0.0023) [2024-06-12 18:59:59,709][71000] Updated weights for policy 0, policy_version 57664 (0.0030) [2024-06-12 19:00:00,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 944848896. Throughput: 0: 49181.8. Samples: 473704460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-12 19:00:00,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 19:00:02,993][71000] Updated weights for policy 0, policy_version 57674 (0.0031) [2024-06-12 19:00:05,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 945061888. Throughput: 0: 49091.3. Samples: 473844300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-12 19:00:05,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:00:06,363][71000] Updated weights for policy 0, policy_version 57684 (0.0031) [2024-06-12 19:00:07,716][70980] Signal inference workers to stop experience collection... 
(6950 times) [2024-06-12 19:00:07,759][71000] InferenceWorker_p0-w0: stopping experience collection (6950 times) [2024-06-12 19:00:07,825][70980] Signal inference workers to resume experience collection... (6950 times) [2024-06-12 19:00:07,826][71000] InferenceWorker_p0-w0: resuming experience collection (6950 times) [2024-06-12 19:00:09,584][71000] Updated weights for policy 0, policy_version 57694 (0.0027) [2024-06-12 19:00:10,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 945307648. Throughput: 0: 49044.0. Samples: 474139920. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-12 19:00:10,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:00:13,147][71000] Updated weights for policy 0, policy_version 57704 (0.0028) [2024-06-12 19:00:15,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 945569792. Throughput: 0: 49069.9. Samples: 474436420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-12 19:00:15,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:00:16,416][71000] Updated weights for policy 0, policy_version 57714 (0.0031) [2024-06-12 19:00:19,818][71000] Updated weights for policy 0, policy_version 57724 (0.0029) [2024-06-12 19:00:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49424.9, 300 sec: 48985.4). Total num frames: 945815552. Throughput: 0: 48967.9. Samples: 474583040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-12 19:00:20,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:00:22,974][71000] Updated weights for policy 0, policy_version 57734 (0.0022) [2024-06-12 19:00:25,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49151.9, 300 sec: 48985.3). Total num frames: 946044928. Throughput: 0: 49012.3. Samples: 474876080. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-12 19:00:25,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:00:26,327][71000] Updated weights for policy 0, policy_version 57744 (0.0043) [2024-06-12 19:00:29,805][71000] Updated weights for policy 0, policy_version 57754 (0.0026) [2024-06-12 19:00:30,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 946307072. Throughput: 0: 49043.2. Samples: 475171960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-12 19:00:30,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:00:33,095][71000] Updated weights for policy 0, policy_version 57764 (0.0024) [2024-06-12 19:00:35,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 946536448. Throughput: 0: 49104.5. Samples: 475318560. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-12 19:00:35,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 19:00:36,303][71000] Updated weights for policy 0, policy_version 57774 (0.0036) [2024-06-12 19:00:39,691][71000] Updated weights for policy 0, policy_version 57784 (0.0028) [2024-06-12 19:00:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 946798592. Throughput: 0: 49100.4. Samples: 475614820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 19:00:40,940][70768] Avg episode reward: [(0, '0.238')] [2024-06-12 19:00:42,893][71000] Updated weights for policy 0, policy_version 57794 (0.0028) [2024-06-12 19:00:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 947027968. Throughput: 0: 48836.0. Samples: 475902080. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 19:00:45,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 19:00:46,466][71000] Updated weights for policy 0, policy_version 57804 (0.0028) [2024-06-12 19:00:49,789][71000] Updated weights for policy 0, policy_version 57814 (0.0026) [2024-06-12 19:00:50,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 947257344. Throughput: 0: 48889.9. Samples: 476044340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 19:00:50,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:00:53,210][71000] Updated weights for policy 0, policy_version 57824 (0.0035) [2024-06-12 19:00:55,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.0, 300 sec: 48930.5). Total num frames: 947535872. Throughput: 0: 48759.2. Samples: 476334080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 19:00:55,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:00:56,447][71000] Updated weights for policy 0, policy_version 57834 (0.0028) [2024-06-12 19:00:59,629][71000] Updated weights for policy 0, policy_version 57844 (0.0029) [2024-06-12 19:01:00,939][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 947765248. Throughput: 0: 48948.5. Samples: 476639100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 19:01:00,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:01:02,814][71000] Updated weights for policy 0, policy_version 57854 (0.0028) [2024-06-12 19:01:05,940][70768] Fps is (10 sec: 47512.2, 60 sec: 49151.9, 300 sec: 48985.3). Total num frames: 948011008. Throughput: 0: 49031.0. Samples: 476789440. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 19:01:05,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:01:06,480][71000] Updated weights for policy 0, policy_version 57864 (0.0030) [2024-06-12 19:01:09,498][71000] Updated weights for policy 0, policy_version 57874 (0.0025) [2024-06-12 19:01:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 948256768. Throughput: 0: 49145.1. Samples: 477087600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 19:01:10,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:01:11,254][70980] Signal inference workers to stop experience collection... (7000 times) [2024-06-12 19:01:11,291][71000] InferenceWorker_p0-w0: stopping experience collection (7000 times) [2024-06-12 19:01:11,363][70980] Signal inference workers to resume experience collection... (7000 times) [2024-06-12 19:01:11,363][71000] InferenceWorker_p0-w0: resuming experience collection (7000 times) [2024-06-12 19:01:12,849][71000] Updated weights for policy 0, policy_version 57884 (0.0022) [2024-06-12 19:01:15,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 948518912. Throughput: 0: 49012.4. Samples: 477377520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 19:01:15,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:01:15,962][71000] Updated weights for policy 0, policy_version 57894 (0.0037) [2024-06-12 19:01:19,723][71000] Updated weights for policy 0, policy_version 57904 (0.0031) [2024-06-12 19:01:20,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49425.0, 300 sec: 49096.4). Total num frames: 948781056. Throughput: 0: 49259.8. Samples: 477535260. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 19:01:20,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:01:22,559][71000] Updated weights for policy 0, policy_version 57914 (0.0028) [2024-06-12 19:01:25,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.2, 300 sec: 48929.9). Total num frames: 948994048. Throughput: 0: 49235.6. Samples: 477830420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 19:01:25,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:01:26,213][71000] Updated weights for policy 0, policy_version 57924 (0.0027) [2024-06-12 19:01:29,178][71000] Updated weights for policy 0, policy_version 57934 (0.0032) [2024-06-12 19:01:30,941][70768] Fps is (10 sec: 44231.9, 60 sec: 48604.9, 300 sec: 48818.6). Total num frames: 949223424. Throughput: 0: 49282.7. Samples: 478119860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 19:01:30,941][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:01:33,186][71000] Updated weights for policy 0, policy_version 57944 (0.0030) [2024-06-12 19:01:35,898][71000] Updated weights for policy 0, policy_version 57954 (0.0031) [2024-06-12 19:01:35,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49698.0, 300 sec: 49040.9). Total num frames: 949518336. Throughput: 0: 49445.7. Samples: 478269400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 19:01:35,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 19:01:39,553][71000] Updated weights for policy 0, policy_version 57964 (0.0036) [2024-06-12 19:01:40,940][70768] Fps is (10 sec: 54073.9, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 949764096. Throughput: 0: 49476.3. Samples: 478560520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 19:01:40,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:01:40,946][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000057969_949764096.pth... 
[2024-06-12 19:01:40,995][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000057249_937967616.pth [2024-06-12 19:01:42,604][71000] Updated weights for policy 0, policy_version 57974 (0.0033) [2024-06-12 19:01:45,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 949977088. Throughput: 0: 49287.1. Samples: 478857020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 19:01:45,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 19:01:46,487][71000] Updated weights for policy 0, policy_version 57984 (0.0032) [2024-06-12 19:01:49,135][71000] Updated weights for policy 0, policy_version 57994 (0.0038) [2024-06-12 19:01:50,940][70768] Fps is (10 sec: 42598.2, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 950190080. Throughput: 0: 49011.3. Samples: 478994940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 19:01:50,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:01:52,984][71000] Updated weights for policy 0, policy_version 58004 (0.0029) [2024-06-12 19:01:55,867][71000] Updated weights for policy 0, policy_version 58014 (0.0031) [2024-06-12 19:01:55,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 950501376. Throughput: 0: 48998.2. Samples: 479292520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 19:01:55,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:01:59,733][71000] Updated weights for policy 0, policy_version 58024 (0.0026) [2024-06-12 19:02:00,940][70768] Fps is (10 sec: 55705.0, 60 sec: 49698.0, 300 sec: 49040.9). Total num frames: 950747136. Throughput: 0: 49242.2. Samples: 479593420. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 19:02:00,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:02:02,727][71000] Updated weights for policy 0, policy_version 58034 (0.0035) [2024-06-12 19:02:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.2, 300 sec: 48985.4). Total num frames: 950976512. Throughput: 0: 48972.5. Samples: 479739020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 19:02:05,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:02:06,140][71000] Updated weights for policy 0, policy_version 58044 (0.0025) [2024-06-12 19:02:09,127][71000] Updated weights for policy 0, policy_version 58054 (0.0036) [2024-06-12 19:02:10,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 951205888. Throughput: 0: 48986.5. Samples: 480034820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 19:02:10,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 19:02:13,034][71000] Updated weights for policy 0, policy_version 58064 (0.0032) [2024-06-12 19:02:13,733][70980] Signal inference workers to stop experience collection... (7050 times) [2024-06-12 19:02:13,734][70980] Signal inference workers to resume experience collection... (7050 times) [2024-06-12 19:02:13,762][71000] InferenceWorker_p0-w0: stopping experience collection (7050 times) [2024-06-12 19:02:13,762][71000] InferenceWorker_p0-w0: resuming experience collection (7050 times) [2024-06-12 19:02:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 951468032. Throughput: 0: 49094.2. Samples: 480329040. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 19:02:15,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:02:16,018][71000] Updated weights for policy 0, policy_version 58074 (0.0026) [2024-06-12 19:02:19,670][71000] Updated weights for policy 0, policy_version 58084 (0.0032) [2024-06-12 19:02:20,939][70768] Fps is (10 sec: 50791.0, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 951713792. Throughput: 0: 49118.4. Samples: 480479720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 19:02:20,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:02:22,810][71000] Updated weights for policy 0, policy_version 58094 (0.0030) [2024-06-12 19:02:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 951943168. Throughput: 0: 49175.5. Samples: 480773420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 19:02:25,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:02:26,474][71000] Updated weights for policy 0, policy_version 58104 (0.0028) [2024-06-12 19:02:29,551][71000] Updated weights for policy 0, policy_version 58114 (0.0034) [2024-06-12 19:02:30,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49153.0, 300 sec: 48929.8). Total num frames: 952172544. Throughput: 0: 49102.6. Samples: 481066640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 19:02:30,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:02:33,062][71000] Updated weights for policy 0, policy_version 58124 (0.0034) [2024-06-12 19:02:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 49096.4). Total num frames: 952451072. Throughput: 0: 49370.1. Samples: 481216600. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 19:02:35,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:02:36,136][71000] Updated weights for policy 0, policy_version 58134 (0.0032) [2024-06-12 19:02:39,829][71000] Updated weights for policy 0, policy_version 58144 (0.0031) [2024-06-12 19:02:40,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 952680448. Throughput: 0: 49240.9. Samples: 481508360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 19:02:40,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:02:42,897][71000] Updated weights for policy 0, policy_version 58154 (0.0023) [2024-06-12 19:02:45,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 952926208. Throughput: 0: 49055.2. Samples: 481800900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 19:02:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:02:46,470][71000] Updated weights for policy 0, policy_version 58164 (0.0031) [2024-06-12 19:02:49,818][71000] Updated weights for policy 0, policy_version 58174 (0.0033) [2024-06-12 19:02:50,939][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 953139200. Throughput: 0: 49073.4. Samples: 481947320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 19:02:50,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:02:53,132][71000] Updated weights for policy 0, policy_version 58184 (0.0028) [2024-06-12 19:02:55,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 953417728. Throughput: 0: 48869.0. Samples: 482233920. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 19:02:55,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 19:02:56,458][71000] Updated weights for policy 0, policy_version 58194 (0.0026) [2024-06-12 19:02:59,975][71000] Updated weights for policy 0, policy_version 58204 (0.0034) [2024-06-12 19:03:00,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48606.0, 300 sec: 49041.2). Total num frames: 953663488. Throughput: 0: 48964.0. Samples: 482532420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:03:00,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:03:03,310][71000] Updated weights for policy 0, policy_version 58214 (0.0029) [2024-06-12 19:03:05,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 953909248. Throughput: 0: 48970.9. Samples: 482683420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:03:05,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:03:06,380][71000] Updated weights for policy 0, policy_version 58224 (0.0030) [2024-06-12 19:03:09,997][71000] Updated weights for policy 0, policy_version 58234 (0.0032) [2024-06-12 19:03:10,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 954138624. Throughput: 0: 48959.0. Samples: 482976580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:03:10,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:03:13,221][71000] Updated weights for policy 0, policy_version 58244 (0.0035) [2024-06-12 19:03:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 954400768. Throughput: 0: 48766.5. Samples: 483261140. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:03:15,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:03:16,622][71000] Updated weights for policy 0, policy_version 58254 (0.0039) [2024-06-12 19:03:20,086][71000] Updated weights for policy 0, policy_version 58264 (0.0032) [2024-06-12 19:03:20,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 954646528. Throughput: 0: 48815.1. Samples: 483413280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:03:20,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:03:23,286][71000] Updated weights for policy 0, policy_version 58274 (0.0024) [2024-06-12 19:03:25,940][70768] Fps is (10 sec: 47514.5, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 954875904. Throughput: 0: 48970.2. Samples: 483712020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:03:25,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:03:26,654][71000] Updated weights for policy 0, policy_version 58284 (0.0043) [2024-06-12 19:03:30,258][71000] Updated weights for policy 0, policy_version 58294 (0.0031) [2024-06-12 19:03:30,939][70768] Fps is (10 sec: 45876.0, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 955105280. Throughput: 0: 48881.4. Samples: 484000560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:03:30,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:03:33,182][71000] Updated weights for policy 0, policy_version 58304 (0.0023) [2024-06-12 19:03:35,940][70768] Fps is (10 sec: 50789.4, 60 sec: 48878.9, 300 sec: 49096.4). Total num frames: 955383808. Throughput: 0: 48707.8. Samples: 484139180. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:03:35,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:03:37,226][71000] Updated weights for policy 0, policy_version 58314 (0.0033) [2024-06-12 19:03:40,317][71000] Updated weights for policy 0, policy_version 58324 (0.0034) [2024-06-12 19:03:40,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 955596800. Throughput: 0: 48865.2. Samples: 484432860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:03:40,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:03:40,954][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000058325_955596800.pth... [2024-06-12 19:03:41,019][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000057609_943865856.pth [2024-06-12 19:03:43,722][71000] Updated weights for policy 0, policy_version 58334 (0.0026) [2024-06-12 19:03:45,939][70768] Fps is (10 sec: 45876.2, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 955842560. Throughput: 0: 48679.6. Samples: 484723000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:03:45,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:03:46,876][71000] Updated weights for policy 0, policy_version 58344 (0.0026) [2024-06-12 19:03:49,501][70980] Signal inference workers to stop experience collection... (7100 times) [2024-06-12 19:03:49,503][70980] Signal inference workers to resume experience collection... (7100 times) [2024-06-12 19:03:49,519][71000] InferenceWorker_p0-w0: stopping experience collection (7100 times) [2024-06-12 19:03:49,519][71000] InferenceWorker_p0-w0: resuming experience collection (7100 times) [2024-06-12 19:03:50,195][71000] Updated weights for policy 0, policy_version 58354 (0.0032) [2024-06-12 19:03:50,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49151.9, 300 sec: 48929.9). Total num frames: 956088320. Throughput: 0: 48682.7. Samples: 484874140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:03:50,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:03:53,395][71000] Updated weights for policy 0, policy_version 58364 (0.0024) [2024-06-12 19:03:55,939][70768] Fps is (10 sec: 50790.6, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 956350464. Throughput: 0: 48686.9. Samples: 485167480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:03:55,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 19:03:56,995][71000] Updated weights for policy 0, policy_version 58374 (0.0038) [2024-06-12 19:04:00,098][71000] Updated weights for policy 0, policy_version 58384 (0.0031) [2024-06-12 19:04:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 956596224. Throughput: 0: 48748.0. Samples: 485454800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:04:00,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:04:03,814][71000] Updated weights for policy 0, policy_version 58394 (0.0026) [2024-06-12 19:04:05,940][70768] Fps is (10 sec: 47512.6, 60 sec: 48605.8, 300 sec: 48929.9). Total num frames: 956825600. Throughput: 0: 48727.5. Samples: 485606020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:04:05,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:04:07,116][71000] Updated weights for policy 0, policy_version 58404 (0.0040) [2024-06-12 19:04:10,279][71000] Updated weights for policy 0, policy_version 58414 (0.0020) [2024-06-12 19:04:10,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 957071360. Throughput: 0: 48536.2. Samples: 485896160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:04:10,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:04:13,663][71000] Updated weights for policy 0, policy_version 58424 (0.0024) [2024-06-12 19:04:15,939][70768] Fps is (10 sec: 49152.8, 60 sec: 48606.0, 300 sec: 49040.9). 
Total num frames: 957317120. Throughput: 0: 48641.8. Samples: 486189440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-12 19:04:15,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:04:17,098][71000] Updated weights for policy 0, policy_version 58434 (0.0029) [2024-06-12 19:04:20,349][71000] Updated weights for policy 0, policy_version 58444 (0.0033) [2024-06-12 19:04:20,942][70768] Fps is (10 sec: 49142.6, 60 sec: 48604.2, 300 sec: 49040.6). Total num frames: 957562880. Throughput: 0: 48748.1. Samples: 486332940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-12 19:04:20,942][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:04:23,993][71000] Updated weights for policy 0, policy_version 58454 (0.0022) [2024-06-12 19:04:25,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 957808640. Throughput: 0: 48631.2. Samples: 486621260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-12 19:04:25,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 19:04:27,368][71000] Updated weights for policy 0, policy_version 58464 (0.0026) [2024-06-12 19:04:30,531][71000] Updated weights for policy 0, policy_version 58474 (0.0023) [2024-06-12 19:04:30,940][70768] Fps is (10 sec: 49162.5, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 958054400. Throughput: 0: 48692.4. Samples: 486914160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-12 19:04:30,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:04:34,026][71000] Updated weights for policy 0, policy_version 58484 (0.0023) [2024-06-12 19:04:35,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48059.9, 300 sec: 48929.8). Total num frames: 958267392. Throughput: 0: 48610.3. Samples: 487061600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-12 19:04:35,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:04:37,076][71000] Updated weights for policy 0, policy_version 58494 (0.0025) [2024-06-12 19:04:40,761][71000] Updated weights for policy 0, policy_version 58504 (0.0029) [2024-06-12 19:04:40,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 958529536. Throughput: 0: 48673.5. Samples: 487357800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-12 19:04:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:04:44,091][71000] Updated weights for policy 0, policy_version 58514 (0.0031) [2024-06-12 19:04:45,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49151.8, 300 sec: 48985.4). Total num frames: 958791680. Throughput: 0: 48699.9. Samples: 487646300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-12 19:04:45,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:04:47,339][71000] Updated weights for policy 0, policy_version 58524 (0.0036) [2024-06-12 19:04:50,696][71000] Updated weights for policy 0, policy_version 58534 (0.0027) [2024-06-12 19:04:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49151.9, 300 sec: 48985.3). Total num frames: 959037440. Throughput: 0: 48690.6. Samples: 487797100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 19:04:50,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:04:54,131][71000] Updated weights for policy 0, policy_version 58544 (0.0030) [2024-06-12 19:04:55,940][70768] Fps is (10 sec: 44237.4, 60 sec: 48059.6, 300 sec: 48763.2). Total num frames: 959234048. Throughput: 0: 48723.8. Samples: 488088720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 19:04:55,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:04:57,485][71000] Updated weights for policy 0, policy_version 58554 (0.0029) [2024-06-12 19:04:57,640][70980] Signal inference workers to stop experience collection... 
(7150 times) [2024-06-12 19:04:57,640][70980] Signal inference workers to resume experience collection... (7150 times) [2024-06-12 19:04:57,670][71000] InferenceWorker_p0-w0: stopping experience collection (7150 times) [2024-06-12 19:04:57,671][71000] InferenceWorker_p0-w0: resuming experience collection (7150 times) [2024-06-12 19:05:00,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48332.7, 300 sec: 48929.8). Total num frames: 959496192. Throughput: 0: 48563.7. Samples: 488374820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 19:05:00,943][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:05:01,136][71000] Updated weights for policy 0, policy_version 58564 (0.0030) [2024-06-12 19:05:04,133][71000] Updated weights for policy 0, policy_version 58574 (0.0024) [2024-06-12 19:05:05,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 959758336. Throughput: 0: 48842.3. Samples: 488530740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 19:05:05,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:05:07,858][71000] Updated weights for policy 0, policy_version 58584 (0.0027) [2024-06-12 19:05:10,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 959987712. Throughput: 0: 48861.1. Samples: 488820020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 19:05:10,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:05:10,983][71000] Updated weights for policy 0, policy_version 58594 (0.0031) [2024-06-12 19:05:14,401][71000] Updated weights for policy 0, policy_version 58604 (0.0033) [2024-06-12 19:05:15,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48332.7, 300 sec: 48818.8). Total num frames: 960217088. Throughput: 0: 48862.5. Samples: 489112980. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 19:05:15,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:05:17,658][71000] Updated weights for policy 0, policy_version 58614 (0.0030) [2024-06-12 19:05:20,826][71000] Updated weights for policy 0, policy_version 58624 (0.0040) [2024-06-12 19:05:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48880.5, 300 sec: 48985.4). Total num frames: 960495616. Throughput: 0: 48536.7. Samples: 489245760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-12 19:05:20,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 19:05:24,318][71000] Updated weights for policy 0, policy_version 58634 (0.0031) [2024-06-12 19:05:25,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 960741376. Throughput: 0: 48526.7. Samples: 489541500. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-12 19:05:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:05:27,800][71000] Updated weights for policy 0, policy_version 58644 (0.0025) [2024-06-12 19:05:30,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 960970752. Throughput: 0: 48759.2. Samples: 489840460. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-12 19:05:30,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:05:31,000][71000] Updated weights for policy 0, policy_version 58654 (0.0033) [2024-06-12 19:05:34,670][71000] Updated weights for policy 0, policy_version 58664 (0.0027) [2024-06-12 19:05:35,940][70768] Fps is (10 sec: 44237.2, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 961183744. Throughput: 0: 48479.7. Samples: 489978680. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-12 19:05:35,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 19:05:37,711][71000] Updated weights for policy 0, policy_version 58674 (0.0019) [2024-06-12 19:05:40,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.1, 300 sec: 48929.8). 
Total num frames: 961462272. Throughput: 0: 48408.0. Samples: 490267080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-12 19:05:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:05:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000058683_961462272.pth... [2024-06-12 19:05:41,000][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000057969_949764096.pth [2024-06-12 19:05:41,171][71000] Updated weights for policy 0, policy_version 58684 (0.0028) [2024-06-12 19:05:44,842][71000] Updated weights for policy 0, policy_version 58694 (0.0029) [2024-06-12 19:05:45,939][70768] Fps is (10 sec: 52429.1, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 961708032. Throughput: 0: 48505.7. Samples: 490557560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-12 19:05:45,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:05:48,081][71000] Updated weights for policy 0, policy_version 58704 (0.0029) [2024-06-12 19:05:50,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48332.8, 300 sec: 48818.7). Total num frames: 961937408. Throughput: 0: 48283.9. Samples: 490703520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-12 19:05:50,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:05:51,261][71000] Updated weights for policy 0, policy_version 58714 (0.0038) [2024-06-12 19:05:54,477][71000] Updated weights for policy 0, policy_version 58724 (0.0030) [2024-06-12 19:05:55,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 962150400. Throughput: 0: 48509.9. Samples: 491002960. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-12 19:05:55,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 19:05:56,878][70980] Signal inference workers to stop experience collection... 
(7200 times) [2024-06-12 19:05:56,912][71000] InferenceWorker_p0-w0: stopping experience collection (7200 times) [2024-06-12 19:05:56,995][70980] Signal inference workers to resume experience collection... (7200 times) [2024-06-12 19:05:56,995][71000] InferenceWorker_p0-w0: resuming experience collection (7200 times) [2024-06-12 19:05:58,025][71000] Updated weights for policy 0, policy_version 58734 (0.0026) [2024-06-12 19:06:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 962428928. Throughput: 0: 48436.3. Samples: 491292620. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-12 19:06:00,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:06:01,407][71000] Updated weights for policy 0, policy_version 58744 (0.0023) [2024-06-12 19:06:04,523][71000] Updated weights for policy 0, policy_version 58754 (0.0029) [2024-06-12 19:06:05,940][70768] Fps is (10 sec: 55704.7, 60 sec: 49151.9, 300 sec: 48985.3). Total num frames: 962707456. Throughput: 0: 49208.9. Samples: 491460160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 19:06:05,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 19:06:07,946][71000] Updated weights for policy 0, policy_version 58764 (0.0033) [2024-06-12 19:06:10,940][70768] Fps is (10 sec: 49153.3, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 962920448. Throughput: 0: 49082.4. Samples: 491750200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 19:06:10,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 19:06:11,423][71000] Updated weights for policy 0, policy_version 58774 (0.0022) [2024-06-12 19:06:14,778][71000] Updated weights for policy 0, policy_version 58784 (0.0028) [2024-06-12 19:06:15,940][70768] Fps is (10 sec: 42598.7, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 963133440. Throughput: 0: 48683.1. Samples: 492031200. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 19:06:15,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:06:18,100][71000] Updated weights for policy 0, policy_version 58794 (0.0024) [2024-06-12 19:06:20,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48606.1, 300 sec: 48874.3). Total num frames: 963411968. Throughput: 0: 48693.8. Samples: 492169900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 19:06:20,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:06:21,427][71000] Updated weights for policy 0, policy_version 58804 (0.0029) [2024-06-12 19:06:24,671][71000] Updated weights for policy 0, policy_version 58814 (0.0026) [2024-06-12 19:06:25,940][70768] Fps is (10 sec: 54067.9, 60 sec: 48879.0, 300 sec: 48985.6). Total num frames: 963674112. Throughput: 0: 49019.6. Samples: 492472960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 19:06:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:06:28,286][71000] Updated weights for policy 0, policy_version 58824 (0.0033) [2024-06-12 19:06:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 963903488. Throughput: 0: 49328.8. Samples: 492777360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 19:06:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:06:31,327][71000] Updated weights for policy 0, policy_version 58834 (0.0027) [2024-06-12 19:06:34,737][71000] Updated weights for policy 0, policy_version 58844 (0.0035) [2024-06-12 19:06:35,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48878.9, 300 sec: 48652.1). Total num frames: 964116480. Throughput: 0: 49180.6. Samples: 492916640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 19:06:35,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:06:38,169][71000] Updated weights for policy 0, policy_version 58854 (0.0030) [2024-06-12 19:06:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 48874.3). 
Total num frames: 964395008. Throughput: 0: 48985.3. Samples: 493207300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:06:40,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:06:41,672][71000] Updated weights for policy 0, policy_version 58864 (0.0037) [2024-06-12 19:06:44,818][71000] Updated weights for policy 0, policy_version 58874 (0.0029) [2024-06-12 19:06:45,939][70768] Fps is (10 sec: 52429.1, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 964640768. Throughput: 0: 49152.3. Samples: 493504460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:06:45,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 19:06:48,132][71000] Updated weights for policy 0, policy_version 58884 (0.0025) [2024-06-12 19:06:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 48763.2). Total num frames: 964886528. Throughput: 0: 48803.2. Samples: 493656300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:06:50,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:06:51,345][71000] Updated weights for policy 0, policy_version 58894 (0.0022) [2024-06-12 19:06:55,008][71000] Updated weights for policy 0, policy_version 58904 (0.0027) [2024-06-12 19:06:55,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49152.0, 300 sec: 48652.2). Total num frames: 965099520. Throughput: 0: 48909.7. Samples: 493951140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:06:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:06:56,519][70980] Signal inference workers to stop experience collection... (7250 times) [2024-06-12 19:06:56,520][70980] Signal inference workers to resume experience collection... 
(7250 times) [2024-06-12 19:06:56,565][71000] InferenceWorker_p0-w0: stopping experience collection (7250 times) [2024-06-12 19:06:56,565][71000] InferenceWorker_p0-w0: resuming experience collection (7250 times) [2024-06-12 19:06:58,107][71000] Updated weights for policy 0, policy_version 58914 (0.0027) [2024-06-12 19:07:00,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.2, 300 sec: 48818.8). Total num frames: 965378048. Throughput: 0: 49170.3. Samples: 494243860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:07:00,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 19:07:01,914][71000] Updated weights for policy 0, policy_version 58924 (0.0028) [2024-06-12 19:07:04,879][71000] Updated weights for policy 0, policy_version 58934 (0.0031) [2024-06-12 19:07:05,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48333.0, 300 sec: 48818.8). Total num frames: 965607424. Throughput: 0: 49372.4. Samples: 494391660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:07:05,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 19:07:08,551][71000] Updated weights for policy 0, policy_version 58944 (0.0028) [2024-06-12 19:07:10,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 965869568. Throughput: 0: 49116.5. Samples: 494683200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:07:10,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:07:11,497][71000] Updated weights for policy 0, policy_version 58954 (0.0035) [2024-06-12 19:07:15,154][71000] Updated weights for policy 0, policy_version 58964 (0.0028) [2024-06-12 19:07:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49152.0, 300 sec: 48707.7). Total num frames: 966082560. Throughput: 0: 48822.1. Samples: 494974360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:07:15,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 19:07:17,986][71000] Updated weights for policy 0, policy_version 58974 (0.0024) [2024-06-12 19:07:20,939][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 966328320. Throughput: 0: 48936.5. Samples: 495118780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:07:20,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:07:22,115][71000] Updated weights for policy 0, policy_version 58984 (0.0027) [2024-06-12 19:07:24,759][71000] Updated weights for policy 0, policy_version 58994 (0.0031) [2024-06-12 19:07:25,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 966606848. Throughput: 0: 48980.4. Samples: 495411420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:07:25,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:07:28,709][71000] Updated weights for policy 0, policy_version 59004 (0.0033) [2024-06-12 19:07:30,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 966836224. Throughput: 0: 48785.7. Samples: 495699820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:07:30,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:07:31,640][71000] Updated weights for policy 0, policy_version 59014 (0.0034) [2024-06-12 19:07:35,727][71000] Updated weights for policy 0, policy_version 59024 (0.0026) [2024-06-12 19:07:35,940][70768] Fps is (10 sec: 44237.3, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 967049216. Throughput: 0: 48688.1. Samples: 495847260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:07:35,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:07:38,362][71000] Updated weights for policy 0, policy_version 59034 (0.0030) [2024-06-12 19:07:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.8, 300 sec: 48763.2). 
Total num frames: 967311360. Throughput: 0: 48495.9. Samples: 496133460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:07:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:07:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000059040_967311360.pth... [2024-06-12 19:07:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000058325_955596800.pth [2024-06-12 19:07:42,279][71000] Updated weights for policy 0, policy_version 59044 (0.0025) [2024-06-12 19:07:45,075][71000] Updated weights for policy 0, policy_version 59054 (0.0027) [2024-06-12 19:07:45,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 967557120. Throughput: 0: 48536.5. Samples: 496428000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:07:45,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:07:48,879][71000] Updated weights for policy 0, policy_version 59064 (0.0034) [2024-06-12 19:07:50,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 967802880. Throughput: 0: 48730.5. Samples: 496584540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-12 19:07:50,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 19:07:51,940][71000] Updated weights for policy 0, policy_version 59074 (0.0033) [2024-06-12 19:07:55,769][71000] Updated weights for policy 0, policy_version 59084 (0.0030) [2024-06-12 19:07:55,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 968032256. Throughput: 0: 48427.6. Samples: 496862440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 19:07:55,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:07:58,616][71000] Updated weights for policy 0, policy_version 59094 (0.0028) [2024-06-12 19:08:00,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48332.7, 300 sec: 48707.7). Total num frames: 968278016. Throughput: 0: 48438.2. 
Samples: 497154080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 19:08:00,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:08:02,432][71000] Updated weights for policy 0, policy_version 59104 (0.0032) [2024-06-12 19:08:05,575][71000] Updated weights for policy 0, policy_version 59114 (0.0025) [2024-06-12 19:08:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 968540160. Throughput: 0: 48432.8. Samples: 497298260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 19:08:05,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:08:09,212][71000] Updated weights for policy 0, policy_version 59124 (0.0035) [2024-06-12 19:08:10,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.7, 300 sec: 48763.2). Total num frames: 968785920. Throughput: 0: 48615.5. Samples: 497599120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 19:08:10,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:08:12,264][71000] Updated weights for policy 0, policy_version 59134 (0.0026) [2024-06-12 19:08:15,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48606.0, 300 sec: 48652.2). Total num frames: 968998912. Throughput: 0: 48718.7. Samples: 497892160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 19:08:15,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:08:15,949][71000] Updated weights for policy 0, policy_version 59144 (0.0030) [2024-06-12 19:08:16,613][70980] Signal inference workers to stop experience collection... (7300 times) [2024-06-12 19:08:16,633][71000] InferenceWorker_p0-w0: stopping experience collection (7300 times) [2024-06-12 19:08:16,668][70980] Signal inference workers to resume experience collection... 
(7300 times) [2024-06-12 19:08:16,669][71000] InferenceWorker_p0-w0: resuming experience collection (7300 times) [2024-06-12 19:08:18,852][71000] Updated weights for policy 0, policy_version 59154 (0.0025) [2024-06-12 19:08:20,939][70768] Fps is (10 sec: 47514.7, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 969261056. Throughput: 0: 48535.6. Samples: 498031360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 19:08:20,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:08:22,601][71000] Updated weights for policy 0, policy_version 59164 (0.0031) [2024-06-12 19:08:25,933][71000] Updated weights for policy 0, policy_version 59174 (0.0033) [2024-06-12 19:08:25,939][70768] Fps is (10 sec: 50790.4, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 969506816. Throughput: 0: 48567.7. Samples: 498319000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 19:08:25,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:08:29,230][71000] Updated weights for policy 0, policy_version 59184 (0.0036) [2024-06-12 19:08:30,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 969768960. Throughput: 0: 48622.1. Samples: 498616000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:08:30,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 19:08:32,503][71000] Updated weights for policy 0, policy_version 59194 (0.0044) [2024-06-12 19:08:35,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 969981952. Throughput: 0: 48597.8. Samples: 498771440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:08:35,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:08:36,076][71000] Updated weights for policy 0, policy_version 59204 (0.0027) [2024-06-12 19:08:38,747][71000] Updated weights for policy 0, policy_version 59214 (0.0023) [2024-06-12 19:08:40,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 48874.3). 
Total num frames: 970260480. Throughput: 0: 49018.7. Samples: 499068280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:08:40,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:08:42,766][71000] Updated weights for policy 0, policy_version 59224 (0.0031) [2024-06-12 19:08:45,856][71000] Updated weights for policy 0, policy_version 59234 (0.0038) [2024-06-12 19:08:45,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 970489856. Throughput: 0: 48908.1. Samples: 499354940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:08:45,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:08:49,371][71000] Updated weights for policy 0, policy_version 59244 (0.0040) [2024-06-12 19:08:50,940][70768] Fps is (10 sec: 47512.4, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 970735616. Throughput: 0: 49076.3. Samples: 499506700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:08:50,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:08:52,618][71000] Updated weights for policy 0, policy_version 59254 (0.0030) [2024-06-12 19:08:55,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 970964992. Throughput: 0: 48855.7. Samples: 499797620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:08:55,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:08:56,099][71000] Updated weights for policy 0, policy_version 59264 (0.0033) [2024-06-12 19:08:59,087][71000] Updated weights for policy 0, policy_version 59274 (0.0025) [2024-06-12 19:09:00,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.1, 300 sec: 48874.3). Total num frames: 971243520. Throughput: 0: 49052.7. Samples: 500099540. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:09:00,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:09:02,561][71000] Updated weights for policy 0, policy_version 59284 (0.0027) [2024-06-12 19:09:05,826][71000] Updated weights for policy 0, policy_version 59294 (0.0032) [2024-06-12 19:09:05,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 971472896. Throughput: 0: 49262.2. Samples: 500248160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:09:05,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:09:09,388][71000] Updated weights for policy 0, policy_version 59304 (0.0035) [2024-06-12 19:09:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 971735040. Throughput: 0: 49388.3. Samples: 500541480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:09:10,941][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:09:12,518][71000] Updated weights for policy 0, policy_version 59314 (0.0028) [2024-06-12 19:09:15,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 48819.1). Total num frames: 971964416. Throughput: 0: 49133.5. Samples: 500827000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:09:15,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:09:15,943][71000] Updated weights for policy 0, policy_version 59324 (0.0024) [2024-06-12 19:09:19,320][70980] Signal inference workers to stop experience collection... (7350 times) [2024-06-12 19:09:19,353][71000] InferenceWorker_p0-w0: stopping experience collection (7350 times) [2024-06-12 19:09:19,378][70980] Signal inference workers to resume experience collection... 
(7350 times) [2024-06-12 19:09:19,379][71000] InferenceWorker_p0-w0: resuming experience collection (7350 times) [2024-06-12 19:09:19,516][71000] Updated weights for policy 0, policy_version 59334 (0.0028) [2024-06-12 19:09:20,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48878.7, 300 sec: 48763.2). Total num frames: 972193792. Throughput: 0: 48977.6. Samples: 500975440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:09:20,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 19:09:22,462][71000] Updated weights for policy 0, policy_version 59344 (0.0021) [2024-06-12 19:09:25,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 972439552. Throughput: 0: 48847.0. Samples: 501266400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:09:25,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:09:26,046][71000] Updated weights for policy 0, policy_version 59354 (0.0031) [2024-06-12 19:09:29,118][71000] Updated weights for policy 0, policy_version 59364 (0.0032) [2024-06-12 19:09:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 972701696. Throughput: 0: 49053.1. Samples: 501562340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:09:30,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:09:32,936][71000] Updated weights for policy 0, policy_version 59374 (0.0022) [2024-06-12 19:09:35,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 972931072. Throughput: 0: 48992.5. Samples: 501711360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:09:35,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:09:36,110][71000] Updated weights for policy 0, policy_version 59384 (0.0031) [2024-06-12 19:09:39,367][71000] Updated weights for policy 0, policy_version 59394 (0.0034) [2024-06-12 19:09:40,940][70768] Fps is (10 sec: 45876.3, 60 sec: 48332.8, 300 sec: 48707.7). 
Total num frames: 973160448. Throughput: 0: 48965.8. Samples: 502001080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:09:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:09:41,034][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000059398_973176832.pth... [2024-06-12 19:09:41,089][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000058683_961462272.pth [2024-06-12 19:09:42,764][71000] Updated weights for policy 0, policy_version 59404 (0.0026) [2024-06-12 19:09:45,939][70768] Fps is (10 sec: 47514.4, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 973406208. Throughput: 0: 48737.5. Samples: 502292720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 19:09:45,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:09:46,245][71000] Updated weights for policy 0, policy_version 59414 (0.0027) [2024-06-12 19:09:49,331][71000] Updated weights for policy 0, policy_version 59424 (0.0026) [2024-06-12 19:09:50,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 973684736. Throughput: 0: 48804.4. Samples: 502444360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 19:09:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:09:52,693][71000] Updated weights for policy 0, policy_version 59434 (0.0031) [2024-06-12 19:09:55,730][71000] Updated weights for policy 0, policy_version 59444 (0.0032) [2024-06-12 19:09:55,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49424.9, 300 sec: 48929.9). Total num frames: 973930496. Throughput: 0: 48916.4. Samples: 502742720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 19:09:55,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:09:59,452][71000] Updated weights for policy 0, policy_version 59454 (0.0023) [2024-06-12 19:10:00,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48332.9, 300 sec: 48763.2). Total num frames: 974143488. Throughput: 0: 49154.6. 
Samples: 503038960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 19:10:00,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:10:02,547][71000] Updated weights for policy 0, policy_version 59464 (0.0027) [2024-06-12 19:10:05,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 974405632. Throughput: 0: 48900.7. Samples: 503175960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 19:10:05,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:10:05,982][71000] Updated weights for policy 0, policy_version 59474 (0.0027) [2024-06-12 19:10:09,305][71000] Updated weights for policy 0, policy_version 59484 (0.0027) [2024-06-12 19:10:10,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 974651392. Throughput: 0: 49106.7. Samples: 503476200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 19:10:10,940][70768] Avg episode reward: [(0, '0.243')] [2024-06-12 19:10:13,063][71000] Updated weights for policy 0, policy_version 59494 (0.0032) [2024-06-12 19:10:14,153][70980] Signal inference workers to stop experience collection... (7400 times) [2024-06-12 19:10:14,198][71000] InferenceWorker_p0-w0: stopping experience collection (7400 times) [2024-06-12 19:10:14,204][70980] Signal inference workers to resume experience collection... (7400 times) [2024-06-12 19:10:14,208][71000] InferenceWorker_p0-w0: resuming experience collection (7400 times) [2024-06-12 19:10:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 974897152. Throughput: 0: 49125.6. Samples: 503772980. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 19:10:15,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:10:15,991][71000] Updated weights for policy 0, policy_version 59504 (0.0029) [2024-06-12 19:10:19,712][71000] Updated weights for policy 0, policy_version 59514 (0.0029) [2024-06-12 19:10:20,940][70768] Fps is (10 sec: 47512.5, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 975126528. Throughput: 0: 48915.9. Samples: 503912580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 19:10:20,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:10:22,733][71000] Updated weights for policy 0, policy_version 59524 (0.0036) [2024-06-12 19:10:25,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 975372288. Throughput: 0: 49008.0. Samples: 504206440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 19:10:25,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:10:26,491][71000] Updated weights for policy 0, policy_version 59534 (0.0039) [2024-06-12 19:10:29,441][71000] Updated weights for policy 0, policy_version 59544 (0.0031) [2024-06-12 19:10:30,940][70768] Fps is (10 sec: 50791.3, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 975634432. Throughput: 0: 48794.6. Samples: 504488480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 19:10:30,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:10:33,067][71000] Updated weights for policy 0, policy_version 59554 (0.0024) [2024-06-12 19:10:35,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 975880192. Throughput: 0: 48998.1. Samples: 504649280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 19:10:35,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:10:36,015][71000] Updated weights for policy 0, policy_version 59564 (0.0028) [2024-06-12 19:10:39,771][71000] Updated weights for policy 0, policy_version 59574 (0.0026) [2024-06-12 19:10:40,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 976109568. Throughput: 0: 48866.4. Samples: 504941700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 19:10:40,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:10:42,690][71000] Updated weights for policy 0, policy_version 59584 (0.0022) [2024-06-12 19:10:45,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 976355328. Throughput: 0: 48643.1. Samples: 505227900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 19:10:45,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:10:46,589][71000] Updated weights for policy 0, policy_version 59594 (0.0029) [2024-06-12 19:10:49,602][71000] Updated weights for policy 0, policy_version 59604 (0.0026) [2024-06-12 19:10:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 976601088. Throughput: 0: 48935.5. Samples: 505378060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 19:10:50,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:10:53,544][71000] Updated weights for policy 0, policy_version 59614 (0.0022) [2024-06-12 19:10:55,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 976830464. Throughput: 0: 48510.2. Samples: 505659160. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 19:10:55,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:10:56,345][71000] Updated weights for policy 0, policy_version 59624 (0.0039) [2024-06-12 19:10:59,984][71000] Updated weights for policy 0, policy_version 59634 (0.0029) [2024-06-12 19:11:00,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 48707.7). Total num frames: 977076224. Throughput: 0: 48600.3. Samples: 505960000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 19:11:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:11:02,925][71000] Updated weights for policy 0, policy_version 59644 (0.0027) [2024-06-12 19:11:05,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.8, 300 sec: 48818.7). Total num frames: 977321984. Throughput: 0: 48457.4. Samples: 506093160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 19:11:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:11:06,493][71000] Updated weights for policy 0, policy_version 59654 (0.0023) [2024-06-12 19:11:09,800][71000] Updated weights for policy 0, policy_version 59664 (0.0031) [2024-06-12 19:11:10,940][70768] Fps is (10 sec: 50791.1, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 977584128. Throughput: 0: 48633.8. Samples: 506394960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 19:11:10,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:11:13,381][71000] Updated weights for policy 0, policy_version 59674 (0.0033) [2024-06-12 19:11:15,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 977797120. Throughput: 0: 48772.0. Samples: 506683220. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 19:11:15,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:11:16,621][71000] Updated weights for policy 0, policy_version 59684 (0.0033) [2024-06-12 19:11:20,369][71000] Updated weights for policy 0, policy_version 59694 (0.0031) [2024-06-12 19:11:20,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48879.1, 300 sec: 48763.2). Total num frames: 978059264. Throughput: 0: 48316.5. Samples: 506823520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 19:11:20,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:11:23,301][71000] Updated weights for policy 0, policy_version 59704 (0.0029) [2024-06-12 19:11:25,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48332.8, 300 sec: 48707.7). Total num frames: 978272256. Throughput: 0: 48288.8. Samples: 507114700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 19:11:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:11:27,100][71000] Updated weights for policy 0, policy_version 59714 (0.0035) [2024-06-12 19:11:27,666][70980] Signal inference workers to stop experience collection... (7450 times) [2024-06-12 19:11:27,666][70980] Signal inference workers to resume experience collection... (7450 times) [2024-06-12 19:11:27,680][71000] InferenceWorker_p0-w0: stopping experience collection (7450 times) [2024-06-12 19:11:27,681][71000] InferenceWorker_p0-w0: resuming experience collection (7450 times) [2024-06-12 19:11:30,070][71000] Updated weights for policy 0, policy_version 59724 (0.0030) [2024-06-12 19:11:30,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 978534400. Throughput: 0: 48370.7. Samples: 507404580. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-12 19:11:30,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:11:33,697][71000] Updated weights for policy 0, policy_version 59734 (0.0027) [2024-06-12 19:11:35,939][70768] Fps is (10 sec: 50790.8, 60 sec: 48332.9, 300 sec: 48763.3). Total num frames: 978780160. Throughput: 0: 48537.0. Samples: 507562220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 19:11:35,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:11:36,775][71000] Updated weights for policy 0, policy_version 59744 (0.0034) [2024-06-12 19:11:40,346][71000] Updated weights for policy 0, policy_version 59754 (0.0034) [2024-06-12 19:11:40,940][70768] Fps is (10 sec: 49150.8, 60 sec: 48605.6, 300 sec: 48763.2). Total num frames: 979025920. Throughput: 0: 48844.6. Samples: 507857180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 19:11:40,941][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 19:11:41,000][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000059756_979042304.pth... [2024-06-12 19:11:41,054][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000059040_967311360.pth [2024-06-12 19:11:43,685][71000] Updated weights for policy 0, policy_version 59764 (0.0025) [2024-06-12 19:11:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 979255296. Throughput: 0: 48282.0. Samples: 508132680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 19:11:45,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:11:47,149][71000] Updated weights for policy 0, policy_version 59774 (0.0032) [2024-06-12 19:11:50,404][71000] Updated weights for policy 0, policy_version 59784 (0.0032) [2024-06-12 19:11:50,940][70768] Fps is (10 sec: 49152.9, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 979517440. Throughput: 0: 48670.3. Samples: 508283320. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 19:11:50,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:11:53,764][71000] Updated weights for policy 0, policy_version 59794 (0.0022) [2024-06-12 19:11:55,940][70768] Fps is (10 sec: 52427.3, 60 sec: 49151.8, 300 sec: 48818.7). Total num frames: 979779584. Throughput: 0: 48718.0. Samples: 508587280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 19:11:55,941][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:11:57,206][71000] Updated weights for policy 0, policy_version 59804 (0.0032) [2024-06-12 19:12:00,423][71000] Updated weights for policy 0, policy_version 59814 (0.0033) [2024-06-12 19:12:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 980008960. Throughput: 0: 48811.1. Samples: 508879720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 19:12:00,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 19:12:03,850][71000] Updated weights for policy 0, policy_version 59824 (0.0031) [2024-06-12 19:12:05,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 980238336. Throughput: 0: 48890.1. Samples: 509023580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 19:12:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:12:05,941][70980] Saving new best policy, reward=0.273! [2024-06-12 19:12:07,139][71000] Updated weights for policy 0, policy_version 59834 (0.0029) [2024-06-12 19:12:10,594][71000] Updated weights for policy 0, policy_version 59844 (0.0032) [2024-06-12 19:12:10,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.7, 300 sec: 48874.3). Total num frames: 980500480. Throughput: 0: 48868.3. Samples: 509313780. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 19:12:10,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:12:13,738][71000] Updated weights for policy 0, policy_version 59854 (0.0024) [2024-06-12 19:12:15,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 980746240. Throughput: 0: 48938.3. Samples: 509606800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 19:12:15,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 19:12:17,522][71000] Updated weights for policy 0, policy_version 59864 (0.0027) [2024-06-12 19:12:20,843][71000] Updated weights for policy 0, policy_version 59874 (0.0034) [2024-06-12 19:12:20,940][70768] Fps is (10 sec: 47514.4, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 980975616. Throughput: 0: 48926.1. Samples: 509763900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 19:12:20,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:12:23,925][71000] Updated weights for policy 0, policy_version 59884 (0.0032) [2024-06-12 19:12:25,940][70768] Fps is (10 sec: 45874.4, 60 sec: 48878.8, 300 sec: 48707.7). Total num frames: 981204992. Throughput: 0: 48788.6. Samples: 510052660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 19:12:25,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:12:27,350][71000] Updated weights for policy 0, policy_version 59894 (0.0031) [2024-06-12 19:12:30,801][71000] Updated weights for policy 0, policy_version 59904 (0.0031) [2024-06-12 19:12:30,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 981483520. Throughput: 0: 49253.7. Samples: 510349100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 19:12:30,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:12:34,002][71000] Updated weights for policy 0, policy_version 59914 (0.0033) [2024-06-12 19:12:35,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49151.9, 300 sec: 48874.3). 
Total num frames: 981729280. Throughput: 0: 49106.7. Samples: 510493120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 19:12:35,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:12:37,325][71000] Updated weights for policy 0, policy_version 59924 (0.0030) [2024-06-12 19:12:40,568][71000] Updated weights for policy 0, policy_version 59934 (0.0028) [2024-06-12 19:12:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48879.1, 300 sec: 48818.7). Total num frames: 981958656. Throughput: 0: 48944.2. Samples: 510789760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 19:12:40,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:12:44,116][71000] Updated weights for policy 0, policy_version 59944 (0.0032) [2024-06-12 19:12:45,940][70768] Fps is (10 sec: 45874.3, 60 sec: 48878.7, 300 sec: 48763.2). Total num frames: 982188032. Throughput: 0: 49146.0. Samples: 511091300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 19:12:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:12:47,291][71000] Updated weights for policy 0, policy_version 59954 (0.0031) [2024-06-12 19:12:50,823][70980] Signal inference workers to stop experience collection... (7500 times) [2024-06-12 19:12:50,860][71000] InferenceWorker_p0-w0: stopping experience collection (7500 times) [2024-06-12 19:12:50,883][70980] Signal inference workers to resume experience collection... (7500 times) [2024-06-12 19:12:50,883][71000] InferenceWorker_p0-w0: resuming experience collection (7500 times) [2024-06-12 19:12:50,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 982433792. Throughput: 0: 48861.9. Samples: 511222360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 19:12:50,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 19:12:51,018][71000] Updated weights for policy 0, policy_version 59964 (0.0030) [2024-06-12 19:12:54,370][71000] Updated weights for policy 0, policy_version 59974 (0.0028) [2024-06-12 19:12:55,939][70768] Fps is (10 sec: 52430.1, 60 sec: 48879.2, 300 sec: 48929.9). Total num frames: 982712320. Throughput: 0: 49011.8. Samples: 511519300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 19:12:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:12:57,912][71000] Updated weights for policy 0, policy_version 59984 (0.0026) [2024-06-12 19:13:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 982925312. Throughput: 0: 48811.5. Samples: 511803320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 19:13:00,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:13:01,059][71000] Updated weights for policy 0, policy_version 59994 (0.0035) [2024-06-12 19:13:04,534][71000] Updated weights for policy 0, policy_version 60004 (0.0025) [2024-06-12 19:13:05,939][70768] Fps is (10 sec: 44236.9, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 983154688. Throughput: 0: 48423.2. Samples: 511942940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 19:13:05,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:13:07,831][71000] Updated weights for policy 0, policy_version 60014 (0.0025) [2024-06-12 19:13:10,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48606.1, 300 sec: 48874.3). Total num frames: 983416832. Throughput: 0: 48464.2. Samples: 512233540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 19:13:10,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:13:10,958][71000] Updated weights for policy 0, policy_version 60024 (0.0028) [2024-06-12 19:13:14,482][71000] Updated weights for policy 0, policy_version 60034 (0.0032) [2024-06-12 19:13:15,940][70768] Fps is (10 sec: 54066.4, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 983695360. Throughput: 0: 48522.1. Samples: 512532600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 19:13:15,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:13:18,040][71000] Updated weights for policy 0, policy_version 60044 (0.0031) [2024-06-12 19:13:20,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 983891968. Throughput: 0: 48711.1. Samples: 512685120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 19:13:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 19:13:20,964][70980] Saving new best policy, reward=0.276! [2024-06-12 19:13:21,190][71000] Updated weights for policy 0, policy_version 60054 (0.0025) [2024-06-12 19:13:24,857][71000] Updated weights for policy 0, policy_version 60064 (0.0033) [2024-06-12 19:13:25,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 984137728. Throughput: 0: 48604.4. Samples: 512976960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:13:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:13:27,923][71000] Updated weights for policy 0, policy_version 60074 (0.0026) [2024-06-12 19:13:30,939][70768] Fps is (10 sec: 50790.6, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 984399872. Throughput: 0: 48122.9. Samples: 513256820. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:13:30,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:13:31,345][71000] Updated weights for policy 0, policy_version 60084 (0.0031) [2024-06-12 19:13:34,470][71000] Updated weights for policy 0, policy_version 60094 (0.0032) [2024-06-12 19:13:35,939][70768] Fps is (10 sec: 49153.0, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 984629248. Throughput: 0: 48700.5. Samples: 513413880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:13:35,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:13:38,397][71000] Updated weights for policy 0, policy_version 60104 (0.0027) [2024-06-12 19:13:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 984891392. Throughput: 0: 48646.2. Samples: 513708380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:13:40,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:13:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000060113_984891392.pth... [2024-06-12 19:13:41,000][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000059398_973176832.pth [2024-06-12 19:13:41,361][71000] Updated weights for policy 0, policy_version 60114 (0.0035) [2024-06-12 19:13:45,026][71000] Updated weights for policy 0, policy_version 60124 (0.0031) [2024-06-12 19:13:45,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48333.0, 300 sec: 48652.2). Total num frames: 985088000. Throughput: 0: 48638.7. Samples: 513992060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:13:45,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:13:48,043][71000] Updated weights for policy 0, policy_version 60134 (0.0026) [2024-06-12 19:13:50,942][70768] Fps is (10 sec: 47502.7, 60 sec: 48877.0, 300 sec: 48818.4). Total num frames: 985366528. Throughput: 0: 48689.0. Samples: 514134060. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:13:50,943][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:13:51,538][71000] Updated weights for policy 0, policy_version 60144 (0.0032) [2024-06-12 19:13:55,048][70980] Signal inference workers to stop experience collection... (7550 times) [2024-06-12 19:13:55,075][71000] InferenceWorker_p0-w0: stopping experience collection (7550 times) [2024-06-12 19:13:55,151][70980] Signal inference workers to resume experience collection... (7550 times) [2024-06-12 19:13:55,152][71000] InferenceWorker_p0-w0: resuming experience collection (7550 times) [2024-06-12 19:13:55,153][71000] Updated weights for policy 0, policy_version 60154 (0.0024) [2024-06-12 19:13:55,939][70768] Fps is (10 sec: 52429.1, 60 sec: 48332.8, 300 sec: 48707.7). Total num frames: 985612288. Throughput: 0: 48747.1. Samples: 514427160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:13:55,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:13:58,000][71000] Updated weights for policy 0, policy_version 60164 (0.0037) [2024-06-12 19:14:00,939][70768] Fps is (10 sec: 47525.2, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 985841664. Throughput: 0: 48619.8. Samples: 514720480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:14:00,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:14:01,594][71000] Updated weights for policy 0, policy_version 60174 (0.0033) [2024-06-12 19:14:05,044][71000] Updated weights for policy 0, policy_version 60184 (0.0039) [2024-06-12 19:14:05,940][70768] Fps is (10 sec: 45874.2, 60 sec: 48605.7, 300 sec: 48596.6). Total num frames: 986071040. Throughput: 0: 48385.6. Samples: 514862480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 19:14:05,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:14:08,418][71000] Updated weights for policy 0, policy_version 60194 (0.0031) [2024-06-12 19:14:10,939][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 986333184. Throughput: 0: 48340.2. Samples: 515152260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 19:14:10,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:14:11,718][71000] Updated weights for policy 0, policy_version 60204 (0.0022) [2024-06-12 19:14:14,962][71000] Updated weights for policy 0, policy_version 60214 (0.0027) [2024-06-12 19:14:15,939][70768] Fps is (10 sec: 49153.3, 60 sec: 47786.8, 300 sec: 48707.7). Total num frames: 986562560. Throughput: 0: 48530.3. Samples: 515440680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 19:14:15,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:14:18,225][71000] Updated weights for policy 0, policy_version 60224 (0.0028) [2024-06-12 19:14:20,939][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 986808320. Throughput: 0: 48300.5. Samples: 515587400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 19:14:20,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:14:21,890][71000] Updated weights for policy 0, policy_version 60234 (0.0029) [2024-06-12 19:14:24,770][71000] Updated weights for policy 0, policy_version 60244 (0.0030) [2024-06-12 19:14:25,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 987054080. Throughput: 0: 48345.7. Samples: 515883940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 19:14:25,940][70768] Avg episode reward: [(0, '0.246')] [2024-06-12 19:14:28,380][71000] Updated weights for policy 0, policy_version 60254 (0.0027) [2024-06-12 19:14:30,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48605.8, 300 sec: 48763.2). 
Total num frames: 987316224. Throughput: 0: 48788.9. Samples: 516187560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 19:14:30,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:14:31,660][71000] Updated weights for policy 0, policy_version 60264 (0.0029) [2024-06-12 19:14:35,177][71000] Updated weights for policy 0, policy_version 60274 (0.0030) [2024-06-12 19:14:35,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 987561984. Throughput: 0: 48831.8. Samples: 516331380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 19:14:35,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:14:38,411][71000] Updated weights for policy 0, policy_version 60284 (0.0027) [2024-06-12 19:14:40,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48059.8, 300 sec: 48707.7). Total num frames: 987774976. Throughput: 0: 48827.5. Samples: 516624400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:14:40,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:14:41,792][71000] Updated weights for policy 0, policy_version 60294 (0.0030) [2024-06-12 19:14:44,937][71000] Updated weights for policy 0, policy_version 60304 (0.0026) [2024-06-12 19:14:45,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49151.9, 300 sec: 48652.1). Total num frames: 988037120. Throughput: 0: 48676.6. Samples: 516910940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:14:45,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:14:48,607][71000] Updated weights for policy 0, policy_version 60314 (0.0036) [2024-06-12 19:14:50,939][70768] Fps is (10 sec: 50790.8, 60 sec: 48607.8, 300 sec: 48652.2). Total num frames: 988282880. Throughput: 0: 48920.7. Samples: 517063900. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:14:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:14:51,419][71000] Updated weights for policy 0, policy_version 60324 (0.0023) [2024-06-12 19:14:55,123][70980] Signal inference workers to stop experience collection... (7600 times) [2024-06-12 19:14:55,173][71000] InferenceWorker_p0-w0: stopping experience collection (7600 times) [2024-06-12 19:14:55,236][70980] Signal inference workers to resume experience collection... (7600 times) [2024-06-12 19:14:55,237][71000] InferenceWorker_p0-w0: resuming experience collection (7600 times) [2024-06-12 19:14:55,394][71000] Updated weights for policy 0, policy_version 60334 (0.0025) [2024-06-12 19:14:55,940][70768] Fps is (10 sec: 50791.2, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 988545024. Throughput: 0: 48971.0. Samples: 517355960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:14:55,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:14:58,387][71000] Updated weights for policy 0, policy_version 60344 (0.0030) [2024-06-12 19:15:00,944][70768] Fps is (10 sec: 49130.2, 60 sec: 48875.3, 300 sec: 48707.0). Total num frames: 988774400. Throughput: 0: 49116.9. Samples: 517651160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:15:00,945][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:15:01,993][71000] Updated weights for policy 0, policy_version 60354 (0.0029) [2024-06-12 19:15:05,047][71000] Updated weights for policy 0, policy_version 60364 (0.0026) [2024-06-12 19:15:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.2, 300 sec: 48707.7). Total num frames: 989020160. Throughput: 0: 49137.7. Samples: 517798600. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:15:05,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:15:08,683][71000] Updated weights for policy 0, policy_version 60374 (0.0023) [2024-06-12 19:15:10,940][70768] Fps is (10 sec: 49173.5, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 989265920. Throughput: 0: 49181.9. Samples: 518097120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:15:10,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:15:11,948][71000] Updated weights for policy 0, policy_version 60384 (0.0033) [2024-06-12 19:15:15,311][71000] Updated weights for policy 0, policy_version 60394 (0.0035) [2024-06-12 19:15:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 48763.3). Total num frames: 989511680. Throughput: 0: 48852.9. Samples: 518385940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:15:15,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:15:18,595][71000] Updated weights for policy 0, policy_version 60404 (0.0031) [2024-06-12 19:15:20,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49151.8, 300 sec: 48763.2). Total num frames: 989757440. Throughput: 0: 49029.6. Samples: 518537720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:15:20,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:15:21,918][71000] Updated weights for policy 0, policy_version 60414 (0.0025) [2024-06-12 19:15:25,287][71000] Updated weights for policy 0, policy_version 60424 (0.0027) [2024-06-12 19:15:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 48707.7). Total num frames: 990003200. Throughput: 0: 48875.9. Samples: 518823820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:15:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:15:28,781][71000] Updated weights for policy 0, policy_version 60434 (0.0019) [2024-06-12 19:15:30,940][70768] Fps is (10 sec: 49153.1, 60 sec: 48879.0, 300 sec: 48707.7). 
Total num frames: 990248960. Throughput: 0: 49184.6. Samples: 519124240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:15:30,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:15:31,937][71000] Updated weights for policy 0, policy_version 60444 (0.0029) [2024-06-12 19:15:35,664][71000] Updated weights for policy 0, policy_version 60454 (0.0025) [2024-06-12 19:15:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 990494720. Throughput: 0: 49110.5. Samples: 519273880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:15:35,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:15:38,599][71000] Updated weights for policy 0, policy_version 60464 (0.0033) [2024-06-12 19:15:40,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49424.9, 300 sec: 48763.2). Total num frames: 990740480. Throughput: 0: 49105.6. Samples: 519565720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:15:40,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:15:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000060470_990740480.pth... [2024-06-12 19:15:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000059756_979042304.pth [2024-06-12 19:15:42,111][71000] Updated weights for policy 0, policy_version 60474 (0.0032) [2024-06-12 19:15:45,306][71000] Updated weights for policy 0, policy_version 60484 (0.0025) [2024-06-12 19:15:45,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 48763.2). Total num frames: 990986240. Throughput: 0: 49175.0. Samples: 519863820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:15:45,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 19:15:48,615][71000] Updated weights for policy 0, policy_version 60494 (0.0034) [2024-06-12 19:15:50,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 991215616. Throughput: 0: 49087.9. 
Samples: 520007560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:15:50,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:15:52,033][71000] Updated weights for policy 0, policy_version 60504 (0.0043) [2024-06-12 19:15:55,753][71000] Updated weights for policy 0, policy_version 60514 (0.0049) [2024-06-12 19:15:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 991461376. Throughput: 0: 48958.6. Samples: 520300260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 19:15:55,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 19:15:58,667][71000] Updated weights for policy 0, policy_version 60524 (0.0028) [2024-06-12 19:16:00,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49155.5, 300 sec: 48818.8). Total num frames: 991723520. Throughput: 0: 49042.5. Samples: 520592860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 19:16:00,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:16:02,269][71000] Updated weights for policy 0, policy_version 60534 (0.0029) [2024-06-12 19:16:05,666][71000] Updated weights for policy 0, policy_version 60544 (0.0030) [2024-06-12 19:16:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 991952896. Throughput: 0: 48917.1. Samples: 520738980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 19:16:05,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:16:09,117][71000] Updated weights for policy 0, policy_version 60554 (0.0041) [2024-06-12 19:16:10,355][70980] Signal inference workers to stop experience collection... (7650 times) [2024-06-12 19:16:10,383][71000] InferenceWorker_p0-w0: stopping experience collection (7650 times) [2024-06-12 19:16:10,407][70980] Signal inference workers to resume experience collection... 
(7650 times) [2024-06-12 19:16:10,408][71000] InferenceWorker_p0-w0: resuming experience collection (7650 times) [2024-06-12 19:16:10,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 992198656. Throughput: 0: 49086.3. Samples: 521032700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 19:16:10,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 19:16:12,594][71000] Updated weights for policy 0, policy_version 60564 (0.0025) [2024-06-12 19:16:15,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 992428032. Throughput: 0: 48630.7. Samples: 521312620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 19:16:15,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 19:16:15,979][71000] Updated weights for policy 0, policy_version 60574 (0.0047) [2024-06-12 19:16:19,559][71000] Updated weights for policy 0, policy_version 60584 (0.0027) [2024-06-12 19:16:20,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 992690176. Throughput: 0: 48612.8. Samples: 521461460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 19:16:20,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:16:22,830][71000] Updated weights for policy 0, policy_version 60594 (0.0028) [2024-06-12 19:16:25,939][70768] Fps is (10 sec: 47513.6, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 992903168. Throughput: 0: 48506.0. Samples: 521748480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 19:16:25,944][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:16:26,153][71000] Updated weights for policy 0, policy_version 60604 (0.0037) [2024-06-12 19:16:29,362][71000] Updated weights for policy 0, policy_version 60614 (0.0023) [2024-06-12 19:16:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 993181696. Throughput: 0: 48577.2. Samples: 522049800. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 19:16:30,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:16:32,772][71000] Updated weights for policy 0, policy_version 60624 (0.0030) [2024-06-12 19:16:35,940][70768] Fps is (10 sec: 50789.4, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 993411072. Throughput: 0: 48486.1. Samples: 522189440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 19:16:35,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:16:36,086][71000] Updated weights for policy 0, policy_version 60634 (0.0025) [2024-06-12 19:16:39,594][71000] Updated weights for policy 0, policy_version 60644 (0.0029) [2024-06-12 19:16:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 993673216. Throughput: 0: 48659.4. Samples: 522489940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 19:16:40,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:16:42,969][71000] Updated weights for policy 0, policy_version 60654 (0.0037) [2024-06-12 19:16:45,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 993902592. Throughput: 0: 48511.7. Samples: 522775880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 19:16:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:16:46,111][71000] Updated weights for policy 0, policy_version 60664 (0.0032) [2024-06-12 19:16:49,375][71000] Updated weights for policy 0, policy_version 60674 (0.0030) [2024-06-12 19:16:50,939][70768] Fps is (10 sec: 45876.6, 60 sec: 48606.0, 300 sec: 48652.2). Total num frames: 994131968. Throughput: 0: 48567.2. Samples: 522924500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 19:16:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:16:52,986][71000] Updated weights for policy 0, policy_version 60684 (0.0029) [2024-06-12 19:16:55,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.0, 300 sec: 48763.2). 
Total num frames: 994394112. Throughput: 0: 48362.2. Samples: 523209000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 19:16:55,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 19:16:55,981][71000] Updated weights for policy 0, policy_version 60694 (0.0023) [2024-06-12 19:16:59,719][71000] Updated weights for policy 0, policy_version 60704 (0.0031) [2024-06-12 19:17:00,940][70768] Fps is (10 sec: 52428.2, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 994656256. Throughput: 0: 48639.5. Samples: 523501400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 19:17:00,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:17:03,163][71000] Updated weights for policy 0, policy_version 60714 (0.0026) [2024-06-12 19:17:05,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48605.7, 300 sec: 48707.7). Total num frames: 994869248. Throughput: 0: 48735.9. Samples: 523654580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 19:17:05,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:17:06,274][71000] Updated weights for policy 0, policy_version 60724 (0.0030) [2024-06-12 19:17:06,860][70980] Signal inference workers to stop experience collection... (7700 times) [2024-06-12 19:17:06,863][70980] Signal inference workers to resume experience collection... (7700 times) [2024-06-12 19:17:06,876][71000] InferenceWorker_p0-w0: stopping experience collection (7700 times) [2024-06-12 19:17:06,876][71000] InferenceWorker_p0-w0: resuming experience collection (7700 times) [2024-06-12 19:17:09,706][71000] Updated weights for policy 0, policy_version 60734 (0.0035) [2024-06-12 19:17:10,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 995131392. Throughput: 0: 48904.0. Samples: 523949160. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:17:10,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:17:12,934][71000] Updated weights for policy 0, policy_version 60744 (0.0035) [2024-06-12 19:17:15,939][70768] Fps is (10 sec: 49153.1, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 995360768. Throughput: 0: 48772.6. Samples: 524244560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:17:15,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:17:16,171][71000] Updated weights for policy 0, policy_version 60754 (0.0032) [2024-06-12 19:17:19,755][71000] Updated weights for policy 0, policy_version 60764 (0.0029) [2024-06-12 19:17:20,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 995622912. Throughput: 0: 48817.8. Samples: 524386240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:17:20,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:17:23,155][71000] Updated weights for policy 0, policy_version 60774 (0.0026) [2024-06-12 19:17:25,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 48652.2). Total num frames: 995835904. Throughput: 0: 48601.7. Samples: 524677000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:17:25,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:17:26,502][71000] Updated weights for policy 0, policy_version 60784 (0.0032) [2024-06-12 19:17:29,667][71000] Updated weights for policy 0, policy_version 60794 (0.0029) [2024-06-12 19:17:30,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48332.9, 300 sec: 48652.2). Total num frames: 996081664. Throughput: 0: 48652.5. Samples: 524965240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:17:30,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:17:33,154][71000] Updated weights for policy 0, policy_version 60804 (0.0028) [2024-06-12 19:17:35,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.9, 300 sec: 48707.7). 
Total num frames: 996327424. Throughput: 0: 48589.6. Samples: 525111040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:17:35,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:17:36,399][71000] Updated weights for policy 0, policy_version 60814 (0.0028) [2024-06-12 19:17:39,805][71000] Updated weights for policy 0, policy_version 60824 (0.0024) [2024-06-12 19:17:40,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 996589568. Throughput: 0: 49002.2. Samples: 525414100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:17:40,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:17:41,057][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000060828_996605952.pth... [2024-06-12 19:17:41,119][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000060113_984891392.pth [2024-06-12 19:17:42,878][71000] Updated weights for policy 0, policy_version 60834 (0.0036) [2024-06-12 19:17:45,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 996818944. Throughput: 0: 48941.6. Samples: 525703780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 19:17:45,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:17:46,349][71000] Updated weights for policy 0, policy_version 60844 (0.0030) [2024-06-12 19:17:49,583][71000] Updated weights for policy 0, policy_version 60854 (0.0032) [2024-06-12 19:17:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 48652.1). Total num frames: 997064704. Throughput: 0: 48679.3. Samples: 525845140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 19:17:50,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:17:53,283][71000] Updated weights for policy 0, policy_version 60864 (0.0038) [2024-06-12 19:17:55,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 997326848. Throughput: 0: 48903.0. 
Samples: 526149800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 19:17:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 19:17:56,359][71000] Updated weights for policy 0, policy_version 60874 (0.0023) [2024-06-12 19:17:59,923][71000] Updated weights for policy 0, policy_version 60884 (0.0025) [2024-06-12 19:18:00,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 997556224. Throughput: 0: 48617.3. Samples: 526432340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 19:18:00,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:18:03,133][71000] Updated weights for policy 0, policy_version 60894 (0.0026) [2024-06-12 19:18:05,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48879.1, 300 sec: 48763.2). Total num frames: 997801984. Throughput: 0: 48768.6. Samples: 526580820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 19:18:05,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 19:18:06,392][71000] Updated weights for policy 0, policy_version 60904 (0.0031) [2024-06-12 19:18:08,244][70980] Signal inference workers to stop experience collection... (7750 times) [2024-06-12 19:18:08,291][71000] InferenceWorker_p0-w0: stopping experience collection (7750 times) [2024-06-12 19:18:08,356][70980] Signal inference workers to resume experience collection... (7750 times) [2024-06-12 19:18:08,357][71000] InferenceWorker_p0-w0: resuming experience collection (7750 times) [2024-06-12 19:18:09,847][71000] Updated weights for policy 0, policy_version 60914 (0.0034) [2024-06-12 19:18:10,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48332.8, 300 sec: 48596.6). Total num frames: 998031360. Throughput: 0: 48712.9. Samples: 526869080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 19:18:10,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:18:13,473][71000] Updated weights for policy 0, policy_version 60924 (0.0038) [2024-06-12 19:18:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 998309888. Throughput: 0: 48698.3. Samples: 527156660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 19:18:15,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:18:16,706][71000] Updated weights for policy 0, policy_version 60934 (0.0026) [2024-06-12 19:18:20,277][71000] Updated weights for policy 0, policy_version 60944 (0.0032) [2024-06-12 19:18:20,940][70768] Fps is (10 sec: 49150.9, 60 sec: 48332.7, 300 sec: 48763.2). Total num frames: 998522880. Throughput: 0: 48896.8. Samples: 527311400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 19:18:20,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:18:23,019][71000] Updated weights for policy 0, policy_version 60954 (0.0037) [2024-06-12 19:18:25,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 998785024. Throughput: 0: 48783.6. Samples: 527609360. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-12 19:18:25,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:18:26,723][71000] Updated weights for policy 0, policy_version 60964 (0.0034) [2024-06-12 19:18:30,027][71000] Updated weights for policy 0, policy_version 60974 (0.0028) [2024-06-12 19:18:30,939][70768] Fps is (10 sec: 50791.5, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 999030784. Throughput: 0: 48901.1. Samples: 527904320. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-12 19:18:30,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 19:18:33,224][71000] Updated weights for policy 0, policy_version 60984 (0.0025) [2024-06-12 19:18:35,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49698.1, 300 sec: 48874.3). 
Total num frames: 999309312. Throughput: 0: 49060.3. Samples: 528052860. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-12 19:18:35,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 19:18:36,419][71000] Updated weights for policy 0, policy_version 60994 (0.0036) [2024-06-12 19:18:40,101][71000] Updated weights for policy 0, policy_version 61004 (0.0031) [2024-06-12 19:18:40,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 999522304. Throughput: 0: 48847.0. Samples: 528347920. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-12 19:18:40,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:18:43,317][71000] Updated weights for policy 0, policy_version 61014 (0.0035) [2024-06-12 19:18:45,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48879.0, 300 sec: 48763.6). Total num frames: 999751680. Throughput: 0: 48956.2. Samples: 528635380. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-12 19:18:45,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:18:46,919][71000] Updated weights for policy 0, policy_version 61024 (0.0039) [2024-06-12 19:18:50,015][71000] Updated weights for policy 0, policy_version 61034 (0.0033) [2024-06-12 19:18:50,939][70768] Fps is (10 sec: 45876.0, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 999981056. Throughput: 0: 48822.2. Samples: 528777820. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-12 19:18:50,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:18:53,719][71000] Updated weights for policy 0, policy_version 61044 (0.0035) [2024-06-12 19:18:55,940][70768] Fps is (10 sec: 50791.2, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1000259584. Throughput: 0: 48947.5. Samples: 529071720. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-12 19:18:55,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:18:56,593][71000] Updated weights for policy 0, policy_version 61054 (0.0031) [2024-06-12 19:19:00,357][71000] Updated weights for policy 0, policy_version 61064 (0.0032) [2024-06-12 19:19:00,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1000472576. Throughput: 0: 48871.1. Samples: 529355860. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-12 19:19:00,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 19:19:03,646][71000] Updated weights for policy 0, policy_version 61074 (0.0030) [2024-06-12 19:19:05,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1000718336. Throughput: 0: 48506.0. Samples: 529494160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 19:19:05,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 19:19:07,269][71000] Updated weights for policy 0, policy_version 61084 (0.0036) [2024-06-12 19:19:08,263][70980] Signal inference workers to stop experience collection... (7800 times) [2024-06-12 19:19:08,315][70980] Signal inference workers to resume experience collection... (7800 times) [2024-06-12 19:19:08,315][71000] InferenceWorker_p0-w0: stopping experience collection (7800 times) [2024-06-12 19:19:08,326][71000] InferenceWorker_p0-w0: resuming experience collection (7800 times) [2024-06-12 19:19:10,130][71000] Updated weights for policy 0, policy_version 61094 (0.0032) [2024-06-12 19:19:10,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 1000964096. Throughput: 0: 48465.2. Samples: 529790300. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 19:19:10,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:19:14,189][71000] Updated weights for policy 0, policy_version 61104 (0.0034) [2024-06-12 19:19:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1001226240. Throughput: 0: 48479.5. Samples: 530085900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 19:19:15,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:19:17,147][71000] Updated weights for policy 0, policy_version 61114 (0.0026) [2024-06-12 19:19:20,912][71000] Updated weights for policy 0, policy_version 61124 (0.0025) [2024-06-12 19:19:20,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 1001455616. Throughput: 0: 48499.2. Samples: 530235320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 19:19:20,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:19:23,686][71000] Updated weights for policy 0, policy_version 61134 (0.0025) [2024-06-12 19:19:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1001701376. Throughput: 0: 48493.0. Samples: 530530100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 19:19:25,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:19:27,266][71000] Updated weights for policy 0, policy_version 61144 (0.0036) [2024-06-12 19:19:30,727][71000] Updated weights for policy 0, policy_version 61154 (0.0033) [2024-06-12 19:19:30,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1001947136. Throughput: 0: 48429.0. Samples: 530814680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 19:19:30,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:19:34,193][71000] Updated weights for policy 0, policy_version 61164 (0.0032) [2024-06-12 19:19:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48332.9, 300 sec: 48929.8). 
Total num frames: 1002209280. Throughput: 0: 48794.2. Samples: 530973560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-12 19:19:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:19:37,271][71000] Updated weights for policy 0, policy_version 61174 (0.0036) [2024-06-12 19:19:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48332.9, 300 sec: 48763.2). Total num frames: 1002422272. Throughput: 0: 48563.0. Samples: 531257060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:19:40,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:19:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000061183_1002422272.pth... [2024-06-12 19:19:41,006][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000060470_990740480.pth [2024-06-12 19:19:41,151][71000] Updated weights for policy 0, policy_version 61184 (0.0045) [2024-06-12 19:19:44,144][71000] Updated weights for policy 0, policy_version 61194 (0.0035) [2024-06-12 19:19:45,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48606.0, 300 sec: 48763.2). Total num frames: 1002668032. Throughput: 0: 48743.6. Samples: 531549320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:19:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:19:47,816][71000] Updated weights for policy 0, policy_version 61204 (0.0026) [2024-06-12 19:19:50,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.8, 300 sec: 48707.7). Total num frames: 1002913792. Throughput: 0: 48924.7. Samples: 531695780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:19:50,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:19:50,953][71000] Updated weights for policy 0, policy_version 61214 (0.0024) [2024-06-12 19:19:54,529][71000] Updated weights for policy 0, policy_version 61224 (0.0026) [2024-06-12 19:19:55,939][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.9, 300 sec: 48819.5). Total num frames: 1003175936. 
Throughput: 0: 48900.2. Samples: 531990800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:19:55,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:19:57,466][71000] Updated weights for policy 0, policy_version 61234 (0.0025) [2024-06-12 19:20:00,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1003388928. Throughput: 0: 48776.9. Samples: 532280860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:20:00,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 19:20:01,429][71000] Updated weights for policy 0, policy_version 61244 (0.0029) [2024-06-12 19:20:04,069][71000] Updated weights for policy 0, policy_version 61254 (0.0020) [2024-06-12 19:20:05,940][70768] Fps is (10 sec: 45874.0, 60 sec: 48605.7, 300 sec: 48707.7). Total num frames: 1003634688. Throughput: 0: 48532.3. Samples: 532419280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:20:05,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:20:07,961][71000] Updated weights for policy 0, policy_version 61264 (0.0031) [2024-06-12 19:20:10,939][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.1, 300 sec: 48763.2). Total num frames: 1003896832. Throughput: 0: 48454.3. Samples: 532710540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:20:10,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:20:11,062][71000] Updated weights for policy 0, policy_version 61274 (0.0024) [2024-06-12 19:20:14,741][71000] Updated weights for policy 0, policy_version 61284 (0.0030) [2024-06-12 19:20:15,940][70768] Fps is (10 sec: 52429.8, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1004158976. Throughput: 0: 48783.2. Samples: 533009920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:20:15,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:20:16,647][70980] Signal inference workers to stop experience collection... 
(7850 times) [2024-06-12 19:20:16,647][70980] Signal inference workers to resume experience collection... (7850 times) [2024-06-12 19:20:16,659][71000] InferenceWorker_p0-w0: stopping experience collection (7850 times) [2024-06-12 19:20:16,660][71000] InferenceWorker_p0-w0: resuming experience collection (7850 times) [2024-06-12 19:20:17,875][71000] Updated weights for policy 0, policy_version 61294 (0.0030) [2024-06-12 19:20:20,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48332.8, 300 sec: 48652.2). Total num frames: 1004355584. Throughput: 0: 48449.7. Samples: 533153800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 19:20:20,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 19:20:21,369][71000] Updated weights for policy 0, policy_version 61304 (0.0032) [2024-06-12 19:20:24,560][71000] Updated weights for policy 0, policy_version 61314 (0.0031) [2024-06-12 19:20:25,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1004617728. Throughput: 0: 48631.6. Samples: 533445480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 19:20:25,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:20:28,213][71000] Updated weights for policy 0, policy_version 61324 (0.0031) [2024-06-12 19:20:30,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1004863488. Throughput: 0: 48643.9. Samples: 533738300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 19:20:30,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:20:31,198][71000] Updated weights for policy 0, policy_version 61334 (0.0029) [2024-06-12 19:20:34,742][71000] Updated weights for policy 0, policy_version 61344 (0.0033) [2024-06-12 19:20:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48605.8, 300 sec: 48763.3). Total num frames: 1005125632. Throughput: 0: 48687.2. Samples: 533886700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 19:20:35,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:20:37,927][71000] Updated weights for policy 0, policy_version 61354 (0.0032) [2024-06-12 19:20:40,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 1005355008. Throughput: 0: 48670.2. Samples: 534180960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 19:20:40,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:20:41,443][71000] Updated weights for policy 0, policy_version 61364 (0.0026) [2024-06-12 19:20:44,723][71000] Updated weights for policy 0, policy_version 61374 (0.0026) [2024-06-12 19:20:45,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1005600768. Throughput: 0: 48693.4. Samples: 534472060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 19:20:45,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:20:48,023][71000] Updated weights for policy 0, policy_version 61384 (0.0034) [2024-06-12 19:20:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 1005846528. Throughput: 0: 48800.6. Samples: 534615300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 19:20:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:20:51,446][71000] Updated weights for policy 0, policy_version 61394 (0.0038) [2024-06-12 19:20:54,850][71000] Updated weights for policy 0, policy_version 61404 (0.0032) [2024-06-12 19:20:55,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 1006092288. Throughput: 0: 49032.9. Samples: 534917020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 19:20:55,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:20:58,275][71000] Updated weights for policy 0, policy_version 61414 (0.0031) [2024-06-12 19:21:00,940][70768] Fps is (10 sec: 45874.2, 60 sec: 48605.7, 300 sec: 48652.1). 
Total num frames: 1006305280. Throughput: 0: 48829.1. Samples: 535207240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 19:21:00,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:21:01,756][71000] Updated weights for policy 0, policy_version 61424 (0.0040) [2024-06-12 19:21:04,677][71000] Updated weights for policy 0, policy_version 61434 (0.0031) [2024-06-12 19:21:05,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.2, 300 sec: 48763.2). Total num frames: 1006583808. Throughput: 0: 48750.7. Samples: 535347580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 19:21:05,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:21:07,975][71000] Updated weights for policy 0, policy_version 61444 (0.0027) [2024-06-12 19:21:10,940][70768] Fps is (10 sec: 52430.0, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1006829568. Throughput: 0: 49050.2. Samples: 535652740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 19:21:10,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:21:11,366][71000] Updated weights for policy 0, policy_version 61454 (0.0033) [2024-06-12 19:21:14,659][71000] Updated weights for policy 0, policy_version 61464 (0.0026) [2024-06-12 19:21:15,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1007091712. Throughput: 0: 49075.1. Samples: 535946680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 19:21:15,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:21:18,029][71000] Updated weights for policy 0, policy_version 61474 (0.0031) [2024-06-12 19:21:20,939][70768] Fps is (10 sec: 50791.1, 60 sec: 49698.3, 300 sec: 48929.9). Total num frames: 1007337472. Throughput: 0: 49057.5. Samples: 536094280. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 19:21:20,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 19:21:21,070][71000] Updated weights for policy 0, policy_version 61484 (0.0025) [2024-06-12 19:21:24,991][71000] Updated weights for policy 0, policy_version 61494 (0.0029) [2024-06-12 19:21:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 1007566848. Throughput: 0: 49076.8. Samples: 536389420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 19:21:25,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 19:21:28,106][71000] Updated weights for policy 0, policy_version 61504 (0.0022) [2024-06-12 19:21:30,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 1007812608. Throughput: 0: 49096.8. Samples: 536681420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 19:21:30,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:21:31,451][71000] Updated weights for policy 0, policy_version 61514 (0.0027) [2024-06-12 19:21:34,657][71000] Updated weights for policy 0, policy_version 61524 (0.0032) [2024-06-12 19:21:35,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1008041984. Throughput: 0: 49179.5. Samples: 536828380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:21:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:21:38,210][70980] Signal inference workers to stop experience collection... (7900 times) [2024-06-12 19:21:38,213][70980] Signal inference workers to resume experience collection... 
(7900 times) [2024-06-12 19:21:38,220][71000] InferenceWorker_p0-w0: stopping experience collection (7900 times) [2024-06-12 19:21:38,232][71000] InferenceWorker_p0-w0: resuming experience collection (7900 times) [2024-06-12 19:21:38,357][71000] Updated weights for policy 0, policy_version 61534 (0.0035) [2024-06-12 19:21:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1008287744. Throughput: 0: 48799.6. Samples: 537113000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:21:40,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 19:21:41,038][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000061542_1008304128.pth... [2024-06-12 19:21:41,099][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000060828_996605952.pth [2024-06-12 19:21:41,479][71000] Updated weights for policy 0, policy_version 61544 (0.0026) [2024-06-12 19:21:45,095][71000] Updated weights for policy 0, policy_version 61554 (0.0029) [2024-06-12 19:21:45,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1008517120. Throughput: 0: 48832.7. Samples: 537404700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:21:45,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:21:48,118][71000] Updated weights for policy 0, policy_version 61564 (0.0030) [2024-06-12 19:21:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1008779264. Throughput: 0: 48912.7. Samples: 537548660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:21:50,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:21:51,866][71000] Updated weights for policy 0, policy_version 61574 (0.0030) [2024-06-12 19:21:54,839][71000] Updated weights for policy 0, policy_version 61584 (0.0029) [2024-06-12 19:21:55,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 1009041408. 
Throughput: 0: 48953.2. Samples: 537855640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:21:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:21:58,247][71000] Updated weights for policy 0, policy_version 61594 (0.0025) [2024-06-12 19:22:00,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49698.4, 300 sec: 48874.3). Total num frames: 1009287168. Throughput: 0: 49083.3. Samples: 538155420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:22:00,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:22:01,190][71000] Updated weights for policy 0, policy_version 61604 (0.0027) [2024-06-12 19:22:04,898][71000] Updated weights for policy 0, policy_version 61614 (0.0027) [2024-06-12 19:22:05,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1009516544. Throughput: 0: 49011.0. Samples: 538299780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:22:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:22:08,106][71000] Updated weights for policy 0, policy_version 61624 (0.0035) [2024-06-12 19:22:10,939][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1009778688. Throughput: 0: 48909.9. Samples: 538590360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:22:10,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 19:22:11,591][71000] Updated weights for policy 0, policy_version 61634 (0.0034) [2024-06-12 19:22:14,693][71000] Updated weights for policy 0, policy_version 61644 (0.0030) [2024-06-12 19:22:15,940][70768] Fps is (10 sec: 50789.3, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 1010024448. Throughput: 0: 48887.8. Samples: 538881380. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:22:15,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:22:18,216][71000] Updated weights for policy 0, policy_version 61654 (0.0021) [2024-06-12 19:22:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 1010270208. Throughput: 0: 48967.2. Samples: 539031900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:22:20,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:22:21,399][71000] Updated weights for policy 0, policy_version 61664 (0.0029) [2024-06-12 19:22:24,975][71000] Updated weights for policy 0, policy_version 61674 (0.0032) [2024-06-12 19:22:25,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48605.8, 300 sec: 48818.7). Total num frames: 1010483200. Throughput: 0: 49184.3. Samples: 539326300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:22:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:22:28,458][71000] Updated weights for policy 0, policy_version 61684 (0.0023) [2024-06-12 19:22:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1010745344. Throughput: 0: 49125.3. Samples: 539615340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:22:30,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:22:31,603][71000] Updated weights for policy 0, policy_version 61694 (0.0027) [2024-06-12 19:22:34,930][71000] Updated weights for policy 0, policy_version 61704 (0.0027) [2024-06-12 19:22:35,939][70768] Fps is (10 sec: 52429.7, 60 sec: 49425.1, 300 sec: 48874.3). Total num frames: 1011007488. Throughput: 0: 49411.7. Samples: 539772180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:22:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:22:38,215][71000] Updated weights for policy 0, policy_version 61714 (0.0025) [2024-06-12 19:22:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49151.9, 300 sec: 48874.3). 
Total num frames: 1011236864. Throughput: 0: 49094.2. Samples: 540064880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:22:40,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:22:41,327][71000] Updated weights for policy 0, policy_version 61724 (0.0029) [2024-06-12 19:22:45,107][71000] Updated weights for policy 0, policy_version 61734 (0.0032) [2024-06-12 19:22:45,940][70768] Fps is (10 sec: 45874.2, 60 sec: 49151.8, 300 sec: 48818.7). Total num frames: 1011466240. Throughput: 0: 48856.2. Samples: 540353960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 19:22:45,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:22:46,856][70980] Signal inference workers to stop experience collection... (7950 times) [2024-06-12 19:22:46,856][70980] Signal inference workers to resume experience collection... (7950 times) [2024-06-12 19:22:46,894][71000] InferenceWorker_p0-w0: stopping experience collection (7950 times) [2024-06-12 19:22:46,894][71000] InferenceWorker_p0-w0: resuming experience collection (7950 times) [2024-06-12 19:22:48,086][71000] Updated weights for policy 0, policy_version 61744 (0.0041) [2024-06-12 19:22:50,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1011712000. Throughput: 0: 48811.4. Samples: 540496300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:22:50,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:22:51,662][71000] Updated weights for policy 0, policy_version 61754 (0.0027) [2024-06-12 19:22:55,255][71000] Updated weights for policy 0, policy_version 61764 (0.0031) [2024-06-12 19:22:55,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 1011974144. Throughput: 0: 48840.6. Samples: 540788200. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:22:55,941][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:22:58,375][71000] Updated weights for policy 0, policy_version 61774 (0.0032) [2024-06-12 19:23:00,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.7, 300 sec: 48818.7). Total num frames: 1012203520. Throughput: 0: 48838.3. Samples: 541079100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:23:00,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 19:23:01,964][71000] Updated weights for policy 0, policy_version 61784 (0.0036) [2024-06-12 19:23:05,239][71000] Updated weights for policy 0, policy_version 61794 (0.0034) [2024-06-12 19:23:05,940][70768] Fps is (10 sec: 45876.4, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1012432896. Throughput: 0: 48665.3. Samples: 541221840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:23:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:23:08,453][71000] Updated weights for policy 0, policy_version 61804 (0.0034) [2024-06-12 19:23:10,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1012695040. Throughput: 0: 48774.3. Samples: 541521140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:23:10,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:23:12,102][71000] Updated weights for policy 0, policy_version 61814 (0.0035) [2024-06-12 19:23:15,223][71000] Updated weights for policy 0, policy_version 61824 (0.0028) [2024-06-12 19:23:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 1012940800. Throughput: 0: 48807.6. Samples: 541811680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:23:15,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:23:19,038][71000] Updated weights for policy 0, policy_version 61834 (0.0031) [2024-06-12 19:23:20,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48332.7, 300 sec: 48763.2). 
Total num frames: 1013170176. Throughput: 0: 48458.5. Samples: 541952820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:23:20,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:23:22,304][71000] Updated weights for policy 0, policy_version 61844 (0.0030) [2024-06-12 19:23:25,748][71000] Updated weights for policy 0, policy_version 61854 (0.0033) [2024-06-12 19:23:25,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.1, 300 sec: 48763.2). Total num frames: 1013415936. Throughput: 0: 48246.4. Samples: 542235960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:23:25,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 19:23:28,978][71000] Updated weights for policy 0, policy_version 61864 (0.0031) [2024-06-12 19:23:30,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.8, 300 sec: 48652.1). Total num frames: 1013661696. Throughput: 0: 48156.5. Samples: 542521000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:23:30,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:23:32,831][71000] Updated weights for policy 0, policy_version 61874 (0.0028) [2024-06-12 19:23:35,591][71000] Updated weights for policy 0, policy_version 61884 (0.0026) [2024-06-12 19:23:35,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 1013907456. Throughput: 0: 48542.4. Samples: 542680700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:23:35,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 19:23:39,213][71000] Updated weights for policy 0, policy_version 61894 (0.0026) [2024-06-12 19:23:40,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1014153216. Throughput: 0: 48662.4. Samples: 542978000. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:23:40,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:23:40,946][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000061899_1014153216.pth... [2024-06-12 19:23:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000061183_1002422272.pth [2024-06-12 19:23:42,207][71000] Updated weights for policy 0, policy_version 61904 (0.0024) [2024-06-12 19:23:45,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48333.0, 300 sec: 48763.2). Total num frames: 1014366208. Throughput: 0: 48559.7. Samples: 543264280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:23:45,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 19:23:46,189][71000] Updated weights for policy 0, policy_version 61914 (0.0039) [2024-06-12 19:23:49,424][71000] Updated weights for policy 0, policy_version 61924 (0.0027) [2024-06-12 19:23:49,950][70980] Signal inference workers to stop experience collection... (8000 times) [2024-06-12 19:23:49,950][70980] Signal inference workers to resume experience collection... (8000 times) [2024-06-12 19:23:49,994][71000] InferenceWorker_p0-w0: stopping experience collection (8000 times) [2024-06-12 19:23:49,994][71000] InferenceWorker_p0-w0: resuming experience collection (8000 times) [2024-06-12 19:23:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.1, 300 sec: 48763.2). Total num frames: 1014644736. Throughput: 0: 48616.0. Samples: 543409560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:23:50,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:23:52,853][71000] Updated weights for policy 0, policy_version 61934 (0.0032) [2024-06-12 19:23:55,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48060.0, 300 sec: 48763.2). Total num frames: 1014857728. Throughput: 0: 48241.5. Samples: 543692000. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:23:55,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:23:56,230][71000] Updated weights for policy 0, policy_version 61944 (0.0038) [2024-06-12 19:23:59,509][71000] Updated weights for policy 0, policy_version 61954 (0.0018) [2024-06-12 19:24:00,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48332.9, 300 sec: 48763.2). Total num frames: 1015103488. Throughput: 0: 48248.4. Samples: 543982860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:24:00,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:24:02,866][71000] Updated weights for policy 0, policy_version 61964 (0.0023) [2024-06-12 19:24:05,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1015349248. Throughput: 0: 48370.3. Samples: 544129480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:24:05,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:24:06,170][71000] Updated weights for policy 0, policy_version 61974 (0.0025) [2024-06-12 19:24:09,526][71000] Updated weights for policy 0, policy_version 61984 (0.0029) [2024-06-12 19:24:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1015611392. Throughput: 0: 48623.5. Samples: 544424020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 19:24:10,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:24:13,400][71000] Updated weights for policy 0, policy_version 61994 (0.0028) [2024-06-12 19:24:15,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48059.7, 300 sec: 48707.7). Total num frames: 1015824384. Throughput: 0: 48624.5. Samples: 544709100. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 19:24:15,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:24:16,438][71000] Updated weights for policy 0, policy_version 62004 (0.0028) [2024-06-12 19:24:20,026][71000] Updated weights for policy 0, policy_version 62014 (0.0035) [2024-06-12 19:24:20,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1016086528. Throughput: 0: 48331.9. Samples: 544855640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 19:24:20,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:24:23,246][71000] Updated weights for policy 0, policy_version 62024 (0.0026) [2024-06-12 19:24:25,943][70768] Fps is (10 sec: 49135.6, 60 sec: 48330.1, 300 sec: 48707.1). Total num frames: 1016315904. Throughput: 0: 48111.1. Samples: 545143160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 19:24:25,943][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:24:26,550][71000] Updated weights for policy 0, policy_version 62034 (0.0026) [2024-06-12 19:24:29,906][71000] Updated weights for policy 0, policy_version 62044 (0.0038) [2024-06-12 19:24:30,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 1016578048. Throughput: 0: 48492.4. Samples: 545446440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 19:24:30,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:24:33,503][71000] Updated weights for policy 0, policy_version 62054 (0.0027) [2024-06-12 19:24:35,939][70768] Fps is (10 sec: 47529.9, 60 sec: 48059.8, 300 sec: 48707.7). Total num frames: 1016791040. Throughput: 0: 48724.1. Samples: 545602140. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 19:24:35,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:24:36,503][71000] Updated weights for policy 0, policy_version 62064 (0.0028) [2024-06-12 19:24:40,427][71000] Updated weights for policy 0, policy_version 62074 (0.0032) [2024-06-12 19:24:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 48818.7). Total num frames: 1017069568. Throughput: 0: 48856.3. Samples: 545890540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-12 19:24:40,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:24:43,483][71000] Updated weights for policy 0, policy_version 62084 (0.0038) [2024-06-12 19:24:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1017282560. Throughput: 0: 48565.8. Samples: 546168320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:24:45,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:24:47,109][71000] Updated weights for policy 0, policy_version 62094 (0.0026) [2024-06-12 19:24:49,839][71000] Updated weights for policy 0, policy_version 62104 (0.0028) [2024-06-12 19:24:50,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1017561088. Throughput: 0: 48610.8. Samples: 546316960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:24:50,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:24:53,326][70980] Signal inference workers to stop experience collection... (8050 times) [2024-06-12 19:24:53,377][71000] InferenceWorker_p0-w0: stopping experience collection (8050 times) [2024-06-12 19:24:53,379][71000] Updated weights for policy 0, policy_version 62114 (0.0027) [2024-06-12 19:24:53,382][70980] Signal inference workers to resume experience collection... 
(8050 times) [2024-06-12 19:24:53,392][71000] InferenceWorker_p0-w0: resuming experience collection (8050 times) [2024-06-12 19:24:55,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 1017806848. Throughput: 0: 48844.0. Samples: 546622000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:24:55,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:24:56,255][71000] Updated weights for policy 0, policy_version 62124 (0.0027) [2024-06-12 19:25:00,370][71000] Updated weights for policy 0, policy_version 62134 (0.0024) [2024-06-12 19:25:00,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1018052608. Throughput: 0: 49133.0. Samples: 546920080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:25:00,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:25:03,508][71000] Updated weights for policy 0, policy_version 62144 (0.0039) [2024-06-12 19:25:05,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1018265600. Throughput: 0: 48966.3. Samples: 547059120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:25:05,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:25:07,191][71000] Updated weights for policy 0, policy_version 62154 (0.0027) [2024-06-12 19:25:10,283][71000] Updated weights for policy 0, policy_version 62164 (0.0023) [2024-06-12 19:25:10,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1018527744. Throughput: 0: 49047.6. Samples: 547350140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:25:10,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:25:13,677][71000] Updated weights for policy 0, policy_version 62174 (0.0024) [2024-06-12 19:25:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1018773504. Throughput: 0: 48871.5. Samples: 547645660. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:25:15,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:25:16,683][71000] Updated weights for policy 0, policy_version 62184 (0.0028) [2024-06-12 19:25:20,162][71000] Updated weights for policy 0, policy_version 62194 (0.0033) [2024-06-12 19:25:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 1019035648. Throughput: 0: 48731.8. Samples: 547795080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:25:20,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:25:23,295][71000] Updated weights for policy 0, policy_version 62204 (0.0030) [2024-06-12 19:25:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48881.7, 300 sec: 48763.2). Total num frames: 1019248640. Throughput: 0: 49036.9. Samples: 548097200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:25:25,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:25:27,023][71000] Updated weights for policy 0, policy_version 62214 (0.0035) [2024-06-12 19:25:30,298][71000] Updated weights for policy 0, policy_version 62224 (0.0031) [2024-06-12 19:25:30,940][70768] Fps is (10 sec: 45875.9, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1019494400. Throughput: 0: 49223.1. Samples: 548383360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:25:30,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:25:33,820][71000] Updated weights for policy 0, policy_version 62234 (0.0029) [2024-06-12 19:25:35,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.1, 300 sec: 48818.8). Total num frames: 1019756544. Throughput: 0: 49198.2. Samples: 548530880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:25:35,940][70768] Avg episode reward: [(0, '0.245')] [2024-06-12 19:25:37,151][71000] Updated weights for policy 0, policy_version 62244 (0.0033) [2024-06-12 19:25:40,147][71000] Updated weights for policy 0, policy_version 62254 (0.0034) [2024-06-12 19:25:40,939][70768] Fps is (10 sec: 50790.6, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1020002304. Throughput: 0: 48944.1. Samples: 548824480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:25:40,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:25:40,965][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000062257_1020018688.pth... [2024-06-12 19:25:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000061542_1008304128.pth [2024-06-12 19:25:43,559][71000] Updated weights for policy 0, policy_version 62264 (0.0024) [2024-06-12 19:25:45,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 1020231680. Throughput: 0: 48894.7. Samples: 549120340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:25:45,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:25:47,029][71000] Updated weights for policy 0, policy_version 62274 (0.0022) [2024-06-12 19:25:50,157][71000] Updated weights for policy 0, policy_version 62284 (0.0026) [2024-06-12 19:25:50,161][70980] Signal inference workers to stop experience collection... (8100 times) [2024-06-12 19:25:50,161][70980] Signal inference workers to resume experience collection... (8100 times) [2024-06-12 19:25:50,201][71000] InferenceWorker_p0-w0: stopping experience collection (8100 times) [2024-06-12 19:25:50,201][71000] InferenceWorker_p0-w0: resuming experience collection (8100 times) [2024-06-12 19:25:50,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1020477440. Throughput: 0: 48883.1. Samples: 549258860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:25:50,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:25:53,606][71000] Updated weights for policy 0, policy_version 62294 (0.0025) [2024-06-12 19:25:55,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 1020755968. Throughput: 0: 48882.2. Samples: 549549840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:25:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:25:57,156][71000] Updated weights for policy 0, policy_version 62304 (0.0035) [2024-06-12 19:26:00,402][71000] Updated weights for policy 0, policy_version 62314 (0.0028) [2024-06-12 19:26:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1020968960. Throughput: 0: 48920.9. Samples: 549847100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:26:00,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:26:03,528][71000] Updated weights for policy 0, policy_version 62324 (0.0025) [2024-06-12 19:26:05,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 1021214720. Throughput: 0: 48944.5. Samples: 549997580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 19:26:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 19:26:07,029][71000] Updated weights for policy 0, policy_version 62334 (0.0034) [2024-06-12 19:26:10,670][71000] Updated weights for policy 0, policy_version 62344 (0.0040) [2024-06-12 19:26:10,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 1021460480. Throughput: 0: 48632.9. Samples: 550285680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 19:26:10,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:26:13,875][71000] Updated weights for policy 0, policy_version 62354 (0.0030) [2024-06-12 19:26:15,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 48763.2). 
Total num frames: 1021722624. Throughput: 0: 48786.9. Samples: 550578780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 19:26:15,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:26:17,236][71000] Updated weights for policy 0, policy_version 62364 (0.0028) [2024-06-12 19:26:20,338][71000] Updated weights for policy 0, policy_version 62374 (0.0027) [2024-06-12 19:26:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1021968384. Throughput: 0: 48895.5. Samples: 550731180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 19:26:20,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:26:24,022][71000] Updated weights for policy 0, policy_version 62384 (0.0026) [2024-06-12 19:26:25,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 1022197760. Throughput: 0: 49013.2. Samples: 551030080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 19:26:25,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:26:27,110][71000] Updated weights for policy 0, policy_version 62394 (0.0024) [2024-06-12 19:26:30,724][71000] Updated weights for policy 0, policy_version 62404 (0.0033) [2024-06-12 19:26:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 48818.8). Total num frames: 1022443520. Throughput: 0: 49023.0. Samples: 551326380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 19:26:30,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:26:33,495][71000] Updated weights for policy 0, policy_version 62414 (0.0030) [2024-06-12 19:26:35,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 1022722048. Throughput: 0: 49233.8. Samples: 551474380. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 19:26:35,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:26:37,179][71000] Updated weights for policy 0, policy_version 62424 (0.0026) [2024-06-12 19:26:40,198][71000] Updated weights for policy 0, policy_version 62434 (0.0029) [2024-06-12 19:26:40,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 1022951424. Throughput: 0: 49275.6. Samples: 551767240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:26:40,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:26:44,119][71000] Updated weights for policy 0, policy_version 62444 (0.0034) [2024-06-12 19:26:45,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 1023180800. Throughput: 0: 49191.1. Samples: 552060700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:26:45,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:26:46,757][71000] Updated weights for policy 0, policy_version 62454 (0.0038) [2024-06-12 19:26:50,575][71000] Updated weights for policy 0, policy_version 62464 (0.0029) [2024-06-12 19:26:50,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48878.8, 300 sec: 48707.7). Total num frames: 1023410176. Throughput: 0: 48991.5. Samples: 552202200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:26:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:26:53,505][71000] Updated weights for policy 0, policy_version 62474 (0.0029) [2024-06-12 19:26:55,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48879.0, 300 sec: 48818.7). Total num frames: 1023688704. Throughput: 0: 49061.7. Samples: 552493460. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:26:55,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:26:57,242][71000] Updated weights for policy 0, policy_version 62484 (0.0036) [2024-06-12 19:27:00,131][71000] Updated weights for policy 0, policy_version 62494 (0.0025) [2024-06-12 19:27:00,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 1023934464. Throughput: 0: 49108.9. Samples: 552788680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:27:00,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:27:03,635][71000] Updated weights for policy 0, policy_version 62504 (0.0026) [2024-06-12 19:27:05,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 1024147456. Throughput: 0: 49120.5. Samples: 552941600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:27:05,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:27:06,838][71000] Updated weights for policy 0, policy_version 62514 (0.0029) [2024-06-12 19:27:10,659][71000] Updated weights for policy 0, policy_version 62524 (0.0028) [2024-06-12 19:27:10,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 1024393216. Throughput: 0: 48820.9. Samples: 553227020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:27:10,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 19:27:13,515][71000] Updated weights for policy 0, policy_version 62534 (0.0035) [2024-06-12 19:27:15,942][70768] Fps is (10 sec: 50780.5, 60 sec: 48877.5, 300 sec: 48762.9). Total num frames: 1024655360. Throughput: 0: 48742.0. Samples: 553519860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 19:27:15,942][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 19:27:17,564][71000] Updated weights for policy 0, policy_version 62544 (0.0032) [2024-06-12 19:27:18,956][70980] Signal inference workers to stop experience collection... 
(8150 times) [2024-06-12 19:27:18,956][70980] Signal inference workers to resume experience collection... (8150 times) [2024-06-12 19:27:19,000][71000] InferenceWorker_p0-w0: stopping experience collection (8150 times) [2024-06-12 19:27:19,000][71000] InferenceWorker_p0-w0: resuming experience collection (8150 times) [2024-06-12 19:27:20,301][71000] Updated weights for policy 0, policy_version 62554 (0.0025) [2024-06-12 19:27:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1024901120. Throughput: 0: 48778.1. Samples: 553669400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:27:20,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:27:24,022][71000] Updated weights for policy 0, policy_version 62564 (0.0037) [2024-06-12 19:27:25,940][70768] Fps is (10 sec: 47522.1, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 1025130496. Throughput: 0: 48730.1. Samples: 553960100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:27:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:27:26,945][71000] Updated weights for policy 0, policy_version 62574 (0.0021) [2024-06-12 19:27:30,735][71000] Updated weights for policy 0, policy_version 62584 (0.0032) [2024-06-12 19:27:30,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 1025376256. Throughput: 0: 48600.4. Samples: 554247720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:27:30,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:27:33,688][71000] Updated weights for policy 0, policy_version 62594 (0.0029) [2024-06-12 19:27:35,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.7, 300 sec: 48818.8). Total num frames: 1025638400. Throughput: 0: 48694.7. Samples: 554393460. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:27:35,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:27:37,982][71000] Updated weights for policy 0, policy_version 62604 (0.0032) [2024-06-12 19:27:40,544][71000] Updated weights for policy 0, policy_version 62614 (0.0029) [2024-06-12 19:27:40,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1025884160. Throughput: 0: 48705.2. Samples: 554685200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:27:40,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:27:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000062615_1025884160.pth... [2024-06-12 19:27:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000061899_1014153216.pth [2024-06-12 19:27:44,332][71000] Updated weights for policy 0, policy_version 62624 (0.0024) [2024-06-12 19:27:45,939][70768] Fps is (10 sec: 44237.7, 60 sec: 48332.8, 300 sec: 48707.7). Total num frames: 1026080768. Throughput: 0: 48692.7. Samples: 554979840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:27:45,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:27:47,116][71000] Updated weights for policy 0, policy_version 62634 (0.0021) [2024-06-12 19:27:50,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 1026342912. Throughput: 0: 48456.8. Samples: 555122160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:27:50,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 19:27:51,275][71000] Updated weights for policy 0, policy_version 62644 (0.0030) [2024-06-12 19:27:54,198][71000] Updated weights for policy 0, policy_version 62654 (0.0024) [2024-06-12 19:27:55,940][70768] Fps is (10 sec: 52428.4, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1026605056. Throughput: 0: 48481.3. Samples: 555408680. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 19:27:55,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:27:57,981][71000] Updated weights for policy 0, policy_version 62664 (0.0035) [2024-06-12 19:28:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 1026834432. Throughput: 0: 48267.9. Samples: 555691820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:28:00,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:28:01,090][71000] Updated weights for policy 0, policy_version 62674 (0.0031) [2024-06-12 19:28:04,662][71000] Updated weights for policy 0, policy_version 62684 (0.0030) [2024-06-12 19:28:05,939][70768] Fps is (10 sec: 42598.7, 60 sec: 48059.8, 300 sec: 48596.6). Total num frames: 1027031040. Throughput: 0: 48062.4. Samples: 555832200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:28:05,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:28:07,793][71000] Updated weights for policy 0, policy_version 62694 (0.0031) [2024-06-12 19:28:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 1027309568. Throughput: 0: 48117.5. Samples: 556125380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:28:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 19:28:11,574][71000] Updated weights for policy 0, policy_version 62704 (0.0027) [2024-06-12 19:28:14,726][71000] Updated weights for policy 0, policy_version 62714 (0.0037) [2024-06-12 19:28:15,940][70768] Fps is (10 sec: 52428.4, 60 sec: 48334.4, 300 sec: 48763.2). Total num frames: 1027555328. Throughput: 0: 48332.5. Samples: 556422680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:28:15,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:28:18,206][71000] Updated weights for policy 0, policy_version 62724 (0.0024) [2024-06-12 19:28:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48332.8, 300 sec: 48763.2). 
Total num frames: 1027801088. Throughput: 0: 48389.0. Samples: 556570960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:28:20,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 19:28:21,252][71000] Updated weights for policy 0, policy_version 62734 (0.0024) [2024-06-12 19:28:24,821][71000] Updated weights for policy 0, policy_version 62744 (0.0026) [2024-06-12 19:28:25,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48059.9, 300 sec: 48652.2). Total num frames: 1028014080. Throughput: 0: 48339.7. Samples: 556860480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:28:25,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:28:26,051][70980] Signal inference workers to stop experience collection... (8200 times) [2024-06-12 19:28:26,051][70980] Signal inference workers to resume experience collection... (8200 times) [2024-06-12 19:28:26,091][71000] InferenceWorker_p0-w0: stopping experience collection (8200 times) [2024-06-12 19:28:26,091][71000] InferenceWorker_p0-w0: resuming experience collection (8200 times) [2024-06-12 19:28:28,191][71000] Updated weights for policy 0, policy_version 62754 (0.0043) [2024-06-12 19:28:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48332.7, 300 sec: 48707.7). Total num frames: 1028276224. Throughput: 0: 48441.2. Samples: 557159700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:28:30,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:28:31,723][71000] Updated weights for policy 0, policy_version 62764 (0.0025) [2024-06-12 19:28:34,577][71000] Updated weights for policy 0, policy_version 62774 (0.0022) [2024-06-12 19:28:35,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48332.9, 300 sec: 48763.2). Total num frames: 1028538368. Throughput: 0: 48557.8. Samples: 557307260. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:28:35,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:28:38,403][71000] Updated weights for policy 0, policy_version 62784 (0.0032) [2024-06-12 19:28:40,939][70768] Fps is (10 sec: 50791.3, 60 sec: 48333.0, 300 sec: 48874.3). Total num frames: 1028784128. Throughput: 0: 48905.9. Samples: 557609440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:28:40,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:28:41,281][71000] Updated weights for policy 0, policy_version 62794 (0.0041) [2024-06-12 19:28:45,035][71000] Updated weights for policy 0, policy_version 62804 (0.0039) [2024-06-12 19:28:45,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 1029013504. Throughput: 0: 48988.9. Samples: 557896320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:28:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:28:47,982][71000] Updated weights for policy 0, policy_version 62814 (0.0029) [2024-06-12 19:28:50,940][70768] Fps is (10 sec: 49150.8, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1029275648. Throughput: 0: 48917.5. Samples: 558033500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:28:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:28:51,740][71000] Updated weights for policy 0, policy_version 62824 (0.0024) [2024-06-12 19:28:55,063][71000] Updated weights for policy 0, policy_version 62834 (0.0026) [2024-06-12 19:28:55,939][70768] Fps is (10 sec: 49152.7, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 1029505024. Throughput: 0: 49041.9. Samples: 558332260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:28:55,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 19:28:58,487][71000] Updated weights for policy 0, policy_version 62844 (0.0035) [2024-06-12 19:29:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 48874.3). 
Total num frames: 1029767168. Throughput: 0: 49245.7. Samples: 558638740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:29:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:29:01,596][71000] Updated weights for policy 0, policy_version 62854 (0.0024) [2024-06-12 19:29:05,031][71000] Updated weights for policy 0, policy_version 62864 (0.0040) [2024-06-12 19:29:05,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49698.1, 300 sec: 48818.8). Total num frames: 1030012928. Throughput: 0: 49090.7. Samples: 558780040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:29:05,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 19:29:08,083][71000] Updated weights for policy 0, policy_version 62874 (0.0038) [2024-06-12 19:29:10,939][70768] Fps is (10 sec: 47514.5, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1030242304. Throughput: 0: 49113.4. Samples: 559070580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:29:10,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:29:11,995][71000] Updated weights for policy 0, policy_version 62884 (0.0024) [2024-06-12 19:29:14,803][71000] Updated weights for policy 0, policy_version 62894 (0.0026) [2024-06-12 19:29:15,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1030471680. Throughput: 0: 48741.0. Samples: 559353040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-12 19:29:15,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:29:18,758][71000] Updated weights for policy 0, policy_version 62904 (0.0033) [2024-06-12 19:29:20,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.9, 300 sec: 48930.4). Total num frames: 1030750208. Throughput: 0: 48841.3. Samples: 559505120. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 19:29:20,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:29:21,679][71000] Updated weights for policy 0, policy_version 62914 (0.0027) [2024-06-12 19:29:25,281][71000] Updated weights for policy 0, policy_version 62924 (0.0029) [2024-06-12 19:29:25,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.1, 300 sec: 48763.2). Total num frames: 1030963200. Throughput: 0: 48726.7. Samples: 559802140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 19:29:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:29:28,386][71000] Updated weights for policy 0, policy_version 62934 (0.0037) [2024-06-12 19:29:30,940][70768] Fps is (10 sec: 45875.9, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1031208960. Throughput: 0: 48767.6. Samples: 560090860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 19:29:30,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:29:31,724][71000] Updated weights for policy 0, policy_version 62944 (0.0033) [2024-06-12 19:29:35,240][71000] Updated weights for policy 0, policy_version 62954 (0.0025) [2024-06-12 19:29:35,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1031454720. Throughput: 0: 48992.2. Samples: 560238140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 19:29:35,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:29:37,236][70980] Signal inference workers to stop experience collection... (8250 times) [2024-06-12 19:29:37,236][70980] Signal inference workers to resume experience collection... 
(8250 times) [2024-06-12 19:29:37,283][71000] InferenceWorker_p0-w0: stopping experience collection (8250 times) [2024-06-12 19:29:37,283][71000] InferenceWorker_p0-w0: resuming experience collection (8250 times) [2024-06-12 19:29:38,634][71000] Updated weights for policy 0, policy_version 62964 (0.0029) [2024-06-12 19:29:40,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 1031716864. Throughput: 0: 48770.5. Samples: 560526940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 19:29:40,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:29:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000062971_1031716864.pth... [2024-06-12 19:29:40,995][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000062257_1020018688.pth [2024-06-12 19:29:41,726][71000] Updated weights for policy 0, policy_version 62974 (0.0027) [2024-06-12 19:29:45,271][71000] Updated weights for policy 0, policy_version 62984 (0.0023) [2024-06-12 19:29:45,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 1031946240. Throughput: 0: 48649.6. Samples: 560827960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 19:29:45,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:29:48,303][71000] Updated weights for policy 0, policy_version 62994 (0.0030) [2024-06-12 19:29:50,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48606.0, 300 sec: 48763.2). Total num frames: 1032192000. Throughput: 0: 48815.6. Samples: 560976740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 19:29:50,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:29:51,603][71000] Updated weights for policy 0, policy_version 63004 (0.0023) [2024-06-12 19:29:55,241][71000] Updated weights for policy 0, policy_version 63014 (0.0023) [2024-06-12 19:29:55,939][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 48818.8). 
Total num frames: 1032454144. Throughput: 0: 48914.2. Samples: 561271720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:29:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:29:58,333][71000] Updated weights for policy 0, policy_version 63024 (0.0033) [2024-06-12 19:30:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1032683520. Throughput: 0: 49033.3. Samples: 561559540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:30:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:30:01,849][71000] Updated weights for policy 0, policy_version 63034 (0.0028) [2024-06-12 19:30:05,108][71000] Updated weights for policy 0, policy_version 63044 (0.0032) [2024-06-12 19:30:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1032929280. Throughput: 0: 48928.1. Samples: 561706880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:30:05,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:30:08,365][71000] Updated weights for policy 0, policy_version 63054 (0.0032) [2024-06-12 19:30:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1033175040. Throughput: 0: 48937.7. Samples: 562004340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:30:10,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:30:11,685][71000] Updated weights for policy 0, policy_version 63064 (0.0028) [2024-06-12 19:30:15,026][71000] Updated weights for policy 0, policy_version 63074 (0.0029) [2024-06-12 19:30:15,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 48763.3). Total num frames: 1033420800. Throughput: 0: 49047.1. Samples: 562297980. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:30:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:30:18,135][71000] Updated weights for policy 0, policy_version 63084 (0.0028) [2024-06-12 19:30:20,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 1033666560. Throughput: 0: 48916.5. Samples: 562439380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:30:20,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:30:21,919][71000] Updated weights for policy 0, policy_version 63094 (0.0044) [2024-06-12 19:30:25,016][71000] Updated weights for policy 0, policy_version 63104 (0.0028) [2024-06-12 19:30:25,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49151.8, 300 sec: 48874.3). Total num frames: 1033912320. Throughput: 0: 49138.2. Samples: 562738160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:30:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:30:26,937][70980] Signal inference workers to stop experience collection... (8300 times) [2024-06-12 19:30:26,983][71000] InferenceWorker_p0-w0: stopping experience collection (8300 times) [2024-06-12 19:30:26,990][70980] Signal inference workers to resume experience collection... (8300 times) [2024-06-12 19:30:27,001][71000] InferenceWorker_p0-w0: resuming experience collection (8300 times) [2024-06-12 19:30:28,682][71000] Updated weights for policy 0, policy_version 63114 (0.0026) [2024-06-12 19:30:30,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 1034141696. Throughput: 0: 48819.3. Samples: 563024840. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 19:30:30,948][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:30:31,954][71000] Updated weights for policy 0, policy_version 63124 (0.0030) [2024-06-12 19:30:35,006][71000] Updated weights for policy 0, policy_version 63134 (0.0022) [2024-06-12 19:30:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49151.9, 300 sec: 48818.7). Total num frames: 1034403840. Throughput: 0: 48963.5. Samples: 563180100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 19:30:35,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:30:38,571][71000] Updated weights for policy 0, policy_version 63144 (0.0033) [2024-06-12 19:30:40,939][70768] Fps is (10 sec: 49152.6, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 1034633216. Throughput: 0: 48944.9. Samples: 563474240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 19:30:40,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:30:42,078][71000] Updated weights for policy 0, policy_version 63154 (0.0025) [2024-06-12 19:30:45,059][71000] Updated weights for policy 0, policy_version 63164 (0.0033) [2024-06-12 19:30:45,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 1034895360. Throughput: 0: 49047.1. Samples: 563766660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 19:30:45,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:30:48,735][71000] Updated weights for policy 0, policy_version 63174 (0.0032) [2024-06-12 19:30:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 1035141120. Throughput: 0: 49193.3. Samples: 563920580. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 19:30:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:30:51,977][71000] Updated weights for policy 0, policy_version 63184 (0.0034) [2024-06-12 19:30:55,317][71000] Updated weights for policy 0, policy_version 63194 (0.0027) [2024-06-12 19:30:55,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1035386880. Throughput: 0: 49041.9. Samples: 564211220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 19:30:55,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:30:58,535][71000] Updated weights for policy 0, policy_version 63204 (0.0035) [2024-06-12 19:31:00,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1035632640. Throughput: 0: 48998.2. Samples: 564502900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 19:31:00,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:31:01,839][71000] Updated weights for policy 0, policy_version 63214 (0.0022) [2024-06-12 19:31:05,262][71000] Updated weights for policy 0, policy_version 63224 (0.0029) [2024-06-12 19:31:05,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1035878400. Throughput: 0: 49353.8. Samples: 564660300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 19:31:05,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:31:08,646][71000] Updated weights for policy 0, policy_version 63234 (0.0027) [2024-06-12 19:31:10,940][70768] Fps is (10 sec: 50789.1, 60 sec: 49424.9, 300 sec: 48874.3). Total num frames: 1036140544. Throughput: 0: 49208.4. Samples: 564952540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 19:31:10,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:31:11,850][71000] Updated weights for policy 0, policy_version 63244 (0.0025) [2024-06-12 19:31:15,336][71000] Updated weights for policy 0, policy_version 63254 (0.0036) [2024-06-12 19:31:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 1036369920. Throughput: 0: 49316.1. Samples: 565244060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:31:15,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:31:18,772][71000] Updated weights for policy 0, policy_version 63264 (0.0027) [2024-06-12 19:31:20,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 1036632064. Throughput: 0: 49000.0. Samples: 565385100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:31:20,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 19:31:22,015][71000] Updated weights for policy 0, policy_version 63274 (0.0035) [2024-06-12 19:31:25,199][71000] Updated weights for policy 0, policy_version 63284 (0.0030) [2024-06-12 19:31:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1036861440. Throughput: 0: 49247.5. Samples: 565690380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:31:25,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:31:28,658][71000] Updated weights for policy 0, policy_version 63294 (0.0033) [2024-06-12 19:31:30,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49425.0, 300 sec: 48763.2). Total num frames: 1037107200. Throughput: 0: 49290.5. Samples: 565984740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:31:30,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:31:31,992][71000] Updated weights for policy 0, policy_version 63304 (0.0033) [2024-06-12 19:31:35,354][71000] Updated weights for policy 0, policy_version 63314 (0.0024) [2024-06-12 19:31:35,941][70768] Fps is (10 sec: 50784.0, 60 sec: 49424.1, 300 sec: 48874.1). Total num frames: 1037369344. Throughput: 0: 49094.2. Samples: 566129880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:31:35,941][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:31:38,577][71000] Updated weights for policy 0, policy_version 63324 (0.0034) [2024-06-12 19:31:39,720][70980] Signal inference workers to stop experience collection... (8350 times) [2024-06-12 19:31:39,762][71000] InferenceWorker_p0-w0: stopping experience collection (8350 times) [2024-06-12 19:31:39,770][70980] Signal inference workers to resume experience collection... (8350 times) [2024-06-12 19:31:39,774][71000] InferenceWorker_p0-w0: resuming experience collection (8350 times) [2024-06-12 19:31:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49424.9, 300 sec: 48874.3). Total num frames: 1037598720. Throughput: 0: 49326.5. Samples: 566430920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:31:40,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 19:31:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000063330_1037598720.pth... [2024-06-12 19:31:41,021][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000062615_1025884160.pth [2024-06-12 19:31:41,953][71000] Updated weights for policy 0, policy_version 63334 (0.0034) [2024-06-12 19:31:45,290][71000] Updated weights for policy 0, policy_version 63344 (0.0029) [2024-06-12 19:31:45,939][70768] Fps is (10 sec: 45881.3, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1037828096. Throughput: 0: 49158.7. Samples: 566715040. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:31:45,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:31:48,706][71000] Updated weights for policy 0, policy_version 63354 (0.0023) [2024-06-12 19:31:50,942][70768] Fps is (10 sec: 49142.0, 60 sec: 49150.2, 300 sec: 48818.4). Total num frames: 1038090240. Throughput: 0: 48951.3. Samples: 566863220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 19:31:50,942][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:31:52,045][71000] Updated weights for policy 0, policy_version 63364 (0.0035) [2024-06-12 19:31:55,323][71000] Updated weights for policy 0, policy_version 63374 (0.0020) [2024-06-12 19:31:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1038319616. Throughput: 0: 48863.8. Samples: 567151400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:31:55,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:31:58,601][71000] Updated weights for policy 0, policy_version 63384 (0.0029) [2024-06-12 19:32:00,940][70768] Fps is (10 sec: 49162.3, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 1038581760. Throughput: 0: 48890.6. Samples: 567444140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:32:00,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:32:02,383][71000] Updated weights for policy 0, policy_version 63394 (0.0023) [2024-06-12 19:32:05,703][71000] Updated weights for policy 0, policy_version 63404 (0.0032) [2024-06-12 19:32:05,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1038811136. Throughput: 0: 48994.8. Samples: 567589860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:32:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:32:08,903][71000] Updated weights for policy 0, policy_version 63414 (0.0025) [2024-06-12 19:32:10,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48606.0, 300 sec: 48819.1). 
Total num frames: 1039056896. Throughput: 0: 48781.4. Samples: 567885540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:32:10,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:32:12,231][71000] Updated weights for policy 0, policy_version 63424 (0.0027) [2024-06-12 19:32:15,432][71000] Updated weights for policy 0, policy_version 63434 (0.0034) [2024-06-12 19:32:15,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1039319040. Throughput: 0: 48826.0. Samples: 568181900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:32:15,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:32:18,893][71000] Updated weights for policy 0, policy_version 63444 (0.0027) [2024-06-12 19:32:20,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1039548416. Throughput: 0: 48921.2. Samples: 568331280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:32:20,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:32:22,079][71000] Updated weights for policy 0, policy_version 63454 (0.0027) [2024-06-12 19:32:25,468][71000] Updated weights for policy 0, policy_version 63464 (0.0036) [2024-06-12 19:32:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 1039810560. Throughput: 0: 48869.5. Samples: 568630040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:32:25,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:32:28,837][71000] Updated weights for policy 0, policy_version 63474 (0.0025) [2024-06-12 19:32:30,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1040039936. Throughput: 0: 49038.6. Samples: 568921780. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:32:30,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:32:32,400][71000] Updated weights for policy 0, policy_version 63484 (0.0024) [2024-06-12 19:32:35,520][71000] Updated weights for policy 0, policy_version 63494 (0.0027) [2024-06-12 19:32:35,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48879.9, 300 sec: 48874.3). Total num frames: 1040302080. Throughput: 0: 48821.4. Samples: 569060080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-12 19:32:35,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:32:39,083][71000] Updated weights for policy 0, policy_version 63504 (0.0026) [2024-06-12 19:32:40,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 1040531456. Throughput: 0: 49109.8. Samples: 569361340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-12 19:32:40,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:32:42,117][71000] Updated weights for policy 0, policy_version 63514 (0.0024) [2024-06-12 19:32:45,398][71000] Updated weights for policy 0, policy_version 63524 (0.0029) [2024-06-12 19:32:45,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 1040793600. Throughput: 0: 49378.3. Samples: 569666160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-12 19:32:45,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:32:48,573][71000] Updated weights for policy 0, policy_version 63534 (0.0023) [2024-06-12 19:32:50,939][70768] Fps is (10 sec: 50790.3, 60 sec: 49153.8, 300 sec: 48929.9). Total num frames: 1041039360. Throughput: 0: 49346.2. Samples: 569810440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-12 19:32:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:32:52,071][71000] Updated weights for policy 0, policy_version 63544 (0.0027) [2024-06-12 19:32:52,314][70980] Signal inference workers to stop experience collection... 
(8400 times) [2024-06-12 19:32:52,315][70980] Signal inference workers to resume experience collection... (8400 times) [2024-06-12 19:32:52,329][71000] InferenceWorker_p0-w0: stopping experience collection (8400 times) [2024-06-12 19:32:52,329][71000] InferenceWorker_p0-w0: resuming experience collection (8400 times) [2024-06-12 19:32:55,474][71000] Updated weights for policy 0, policy_version 63554 (0.0027) [2024-06-12 19:32:55,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 1041285120. Throughput: 0: 49311.0. Samples: 570104540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-12 19:32:55,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:32:58,888][71000] Updated weights for policy 0, policy_version 63564 (0.0029) [2024-06-12 19:33:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48879.0, 300 sec: 49096.4). Total num frames: 1041514496. Throughput: 0: 49166.6. Samples: 570394400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-12 19:33:00,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:33:02,097][71000] Updated weights for policy 0, policy_version 63574 (0.0021) [2024-06-12 19:33:05,311][71000] Updated weights for policy 0, policy_version 63584 (0.0041) [2024-06-12 19:33:05,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 1041793024. Throughput: 0: 49185.1. Samples: 570544600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-12 19:33:05,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:33:08,700][71000] Updated weights for policy 0, policy_version 63594 (0.0023) [2024-06-12 19:33:10,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 1042022400. Throughput: 0: 49330.6. Samples: 570849920. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 19:33:10,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:33:11,839][71000] Updated weights for policy 0, policy_version 63604 (0.0028) [2024-06-12 19:33:15,289][71000] Updated weights for policy 0, policy_version 63614 (0.0044) [2024-06-12 19:33:15,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1042268160. Throughput: 0: 49351.2. Samples: 571142580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 19:33:15,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:33:18,349][71000] Updated weights for policy 0, policy_version 63624 (0.0033) [2024-06-12 19:33:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 1042513920. Throughput: 0: 49455.2. Samples: 571285560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 19:33:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:33:21,844][71000] Updated weights for policy 0, policy_version 63634 (0.0030) [2024-06-12 19:33:25,319][71000] Updated weights for policy 0, policy_version 63644 (0.0024) [2024-06-12 19:33:25,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 1042743296. Throughput: 0: 49319.0. Samples: 571580700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 19:33:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:33:28,841][71000] Updated weights for policy 0, policy_version 63654 (0.0032) [2024-06-12 19:33:30,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 1042989056. Throughput: 0: 49111.6. Samples: 571876180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 19:33:30,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:33:32,099][71000] Updated weights for policy 0, policy_version 63664 (0.0030) [2024-06-12 19:33:35,772][71000] Updated weights for policy 0, policy_version 63674 (0.0030) [2024-06-12 19:33:35,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 1043251200. Throughput: 0: 49037.1. Samples: 572017120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 19:33:35,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:33:38,890][71000] Updated weights for policy 0, policy_version 63684 (0.0032) [2024-06-12 19:33:40,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 1043480576. Throughput: 0: 48900.5. Samples: 572305060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 19:33:40,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 19:33:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000063689_1043480576.pth... [2024-06-12 19:33:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000062971_1031716864.pth [2024-06-12 19:33:42,238][71000] Updated weights for policy 0, policy_version 63694 (0.0028) [2024-06-12 19:33:45,643][71000] Updated weights for policy 0, policy_version 63704 (0.0029) [2024-06-12 19:33:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 1043726336. Throughput: 0: 48859.4. Samples: 572593080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 19:33:45,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:33:49,243][71000] Updated weights for policy 0, policy_version 63714 (0.0025) [2024-06-12 19:33:50,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 1043972096. Throughput: 0: 48787.6. Samples: 572740040. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 19:33:50,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:33:52,521][71000] Updated weights for policy 0, policy_version 63724 (0.0029) [2024-06-12 19:33:55,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 1044201472. Throughput: 0: 48344.0. Samples: 573025400. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 19:33:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:33:56,014][71000] Updated weights for policy 0, policy_version 63734 (0.0024) [2024-06-12 19:33:59,519][71000] Updated weights for policy 0, policy_version 63744 (0.0027) [2024-06-12 19:34:00,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 1044463616. Throughput: 0: 48454.9. Samples: 573323060. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 19:34:00,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:34:02,802][71000] Updated weights for policy 0, policy_version 63754 (0.0023) [2024-06-12 19:34:05,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48332.8, 300 sec: 48985.4). Total num frames: 1044692992. Throughput: 0: 48516.1. Samples: 573468780. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 19:34:05,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:34:06,097][71000] Updated weights for policy 0, policy_version 63764 (0.0030) [2024-06-12 19:34:09,565][71000] Updated weights for policy 0, policy_version 63774 (0.0031) [2024-06-12 19:34:09,843][70980] Signal inference workers to stop experience collection... (8450 times) [2024-06-12 19:34:09,844][70980] Signal inference workers to resume experience collection... 
(8450 times) [2024-06-12 19:34:09,884][71000] InferenceWorker_p0-w0: stopping experience collection (8450 times) [2024-06-12 19:34:09,884][71000] InferenceWorker_p0-w0: resuming experience collection (8450 times) [2024-06-12 19:34:10,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 1044938752. Throughput: 0: 48510.7. Samples: 573763680. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 19:34:10,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:34:12,884][71000] Updated weights for policy 0, policy_version 63784 (0.0030) [2024-06-12 19:34:15,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48605.7, 300 sec: 48929.8). Total num frames: 1045184512. Throughput: 0: 48316.7. Samples: 574050440. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 19:34:15,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:34:16,180][71000] Updated weights for policy 0, policy_version 63794 (0.0041) [2024-06-12 19:34:19,385][71000] Updated weights for policy 0, policy_version 63804 (0.0024) [2024-06-12 19:34:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 1045430272. Throughput: 0: 48620.1. Samples: 574205020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 19:34:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 19:34:22,647][71000] Updated weights for policy 0, policy_version 63814 (0.0028) [2024-06-12 19:34:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 1045676032. Throughput: 0: 48817.4. Samples: 574501840. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 19:34:25,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:34:26,209][71000] Updated weights for policy 0, policy_version 63824 (0.0026) [2024-06-12 19:34:29,563][71000] Updated weights for policy 0, policy_version 63834 (0.0028) [2024-06-12 19:34:30,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 1045921792. Throughput: 0: 49037.4. Samples: 574799760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 19:34:30,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:34:32,803][71000] Updated weights for policy 0, policy_version 63844 (0.0027) [2024-06-12 19:34:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 1046167552. Throughput: 0: 49036.0. Samples: 574946660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 19:34:35,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:34:36,289][71000] Updated weights for policy 0, policy_version 63854 (0.0030) [2024-06-12 19:34:39,796][71000] Updated weights for policy 0, policy_version 63864 (0.0030) [2024-06-12 19:34:40,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 1046396928. Throughput: 0: 49204.4. Samples: 575239600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 19:34:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:34:42,874][71000] Updated weights for policy 0, policy_version 63874 (0.0040) [2024-06-12 19:34:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 1046642688. Throughput: 0: 48851.3. Samples: 575521360. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 19:34:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:34:46,399][71000] Updated weights for policy 0, policy_version 63884 (0.0036) [2024-06-12 19:34:49,631][71000] Updated weights for policy 0, policy_version 63894 (0.0031) [2024-06-12 19:34:50,939][70768] Fps is (10 sec: 50790.6, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 1046904832. Throughput: 0: 48900.4. Samples: 575669300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 19:34:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:34:52,793][71000] Updated weights for policy 0, policy_version 63904 (0.0027) [2024-06-12 19:34:55,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 1047134208. Throughput: 0: 48944.0. Samples: 575966160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 19:34:55,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:34:56,327][71000] Updated weights for policy 0, policy_version 63914 (0.0032) [2024-06-12 19:34:59,727][71000] Updated weights for policy 0, policy_version 63924 (0.0032) [2024-06-12 19:35:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 1047396352. Throughput: 0: 49114.8. Samples: 576260600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 19:35:00,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:35:02,982][71000] Updated weights for policy 0, policy_version 63934 (0.0037) [2024-06-12 19:35:05,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 1047625728. Throughput: 0: 48947.9. Samples: 576407680. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 19:35:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:35:06,494][71000] Updated weights for policy 0, policy_version 63944 (0.0037) [2024-06-12 19:35:09,952][71000] Updated weights for policy 0, policy_version 63954 (0.0032) [2024-06-12 19:35:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 1047871488. Throughput: 0: 48781.8. Samples: 576697020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 19:35:10,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:35:13,039][71000] Updated weights for policy 0, policy_version 63964 (0.0030) [2024-06-12 19:35:15,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 1048117248. Throughput: 0: 48316.9. Samples: 576974020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 19:35:15,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:35:16,686][71000] Updated weights for policy 0, policy_version 63974 (0.0031) [2024-06-12 19:35:19,851][71000] Updated weights for policy 0, policy_version 63984 (0.0029) [2024-06-12 19:35:20,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 1048346624. Throughput: 0: 48246.1. Samples: 577117740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 19:35:20,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:35:23,327][71000] Updated weights for policy 0, policy_version 63994 (0.0029) [2024-06-12 19:35:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 1048592384. Throughput: 0: 48407.5. Samples: 577417940. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 19:35:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 19:35:26,870][71000] Updated weights for policy 0, policy_version 64004 (0.0027) [2024-06-12 19:35:30,157][71000] Updated weights for policy 0, policy_version 64014 (0.0030) [2024-06-12 19:35:30,939][70768] Fps is (10 sec: 49152.9, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 1048838144. Throughput: 0: 48465.4. Samples: 577702300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 19:35:30,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:35:33,453][71000] Updated weights for policy 0, policy_version 64024 (0.0030) [2024-06-12 19:35:35,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48332.8, 300 sec: 48929.8). Total num frames: 1049067520. Throughput: 0: 48565.3. Samples: 577854740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 19:35:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:35:37,084][71000] Updated weights for policy 0, policy_version 64034 (0.0027) [2024-06-12 19:35:40,014][71000] Updated weights for policy 0, policy_version 64044 (0.0031) [2024-06-12 19:35:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1049313280. Throughput: 0: 48220.0. Samples: 578136060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 19:35:40,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:35:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000064045_1049313280.pth... [2024-06-12 19:35:40,997][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000063330_1037598720.pth [2024-06-12 19:35:41,569][70980] Signal inference workers to stop experience collection... (8500 times) [2024-06-12 19:35:41,569][70980] Signal inference workers to resume experience collection... 
(8500 times) [2024-06-12 19:35:41,591][71000] InferenceWorker_p0-w0: stopping experience collection (8500 times) [2024-06-12 19:35:41,591][71000] InferenceWorker_p0-w0: resuming experience collection (8500 times) [2024-06-12 19:35:43,691][71000] Updated weights for policy 0, policy_version 64054 (0.0024) [2024-06-12 19:35:45,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 1049542656. Throughput: 0: 48033.4. Samples: 578422100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 19:35:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:35:47,400][71000] Updated weights for policy 0, policy_version 64064 (0.0034) [2024-06-12 19:35:50,711][71000] Updated weights for policy 0, policy_version 64074 (0.0030) [2024-06-12 19:35:50,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48332.6, 300 sec: 48874.3). Total num frames: 1049804800. Throughput: 0: 48005.3. Samples: 578567920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:35:50,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:35:54,367][71000] Updated weights for policy 0, policy_version 64084 (0.0034) [2024-06-12 19:35:55,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 1050034176. Throughput: 0: 48090.3. Samples: 578861080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:35:55,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:35:57,535][71000] Updated weights for policy 0, policy_version 64094 (0.0028) [2024-06-12 19:36:00,940][70768] Fps is (10 sec: 45876.0, 60 sec: 47786.7, 300 sec: 48763.2). Total num frames: 1050263552. Throughput: 0: 48223.6. Samples: 579144080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:36:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:36:01,087][71000] Updated weights for policy 0, policy_version 64104 (0.0031) [2024-06-12 19:36:04,517][71000] Updated weights for policy 0, policy_version 64114 (0.0037) [2024-06-12 19:36:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48332.9, 300 sec: 48763.3). Total num frames: 1050525696. Throughput: 0: 48365.9. Samples: 579294200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:36:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 19:36:07,904][71000] Updated weights for policy 0, policy_version 64124 (0.0032) [2024-06-12 19:36:10,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48059.7, 300 sec: 48763.2). Total num frames: 1050755072. Throughput: 0: 48083.1. Samples: 579581680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:36:10,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:36:11,054][71000] Updated weights for policy 0, policy_version 64134 (0.0030) [2024-06-12 19:36:14,574][71000] Updated weights for policy 0, policy_version 64144 (0.0029) [2024-06-12 19:36:15,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48059.8, 300 sec: 48707.7). Total num frames: 1051000832. Throughput: 0: 48121.7. Samples: 579867780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:36:15,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:36:17,552][71000] Updated weights for policy 0, policy_version 64154 (0.0031) [2024-06-12 19:36:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48332.9, 300 sec: 48763.2). Total num frames: 1051246592. Throughput: 0: 48228.8. Samples: 580025040. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:36:20,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:36:20,985][71000] Updated weights for policy 0, policy_version 64164 (0.0028) [2024-06-12 19:36:24,375][71000] Updated weights for policy 0, policy_version 64174 (0.0031) [2024-06-12 19:36:25,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1051508736. Throughput: 0: 48568.4. Samples: 580321640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 19:36:25,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:36:27,693][71000] Updated weights for policy 0, policy_version 64184 (0.0034) [2024-06-12 19:36:30,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48332.8, 300 sec: 48707.9). Total num frames: 1051738112. Throughput: 0: 48580.9. Samples: 580608240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:36:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:36:31,072][71000] Updated weights for policy 0, policy_version 64194 (0.0035) [2024-06-12 19:36:34,406][71000] Updated weights for policy 0, policy_version 64204 (0.0029) [2024-06-12 19:36:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1051983872. Throughput: 0: 48616.6. Samples: 580755660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:36:35,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:36:37,813][71000] Updated weights for policy 0, policy_version 64214 (0.0026) [2024-06-12 19:36:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 1052229632. Throughput: 0: 48633.2. Samples: 581049580. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:36:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:36:41,067][71000] Updated weights for policy 0, policy_version 64224 (0.0033) [2024-06-12 19:36:44,604][71000] Updated weights for policy 0, policy_version 64234 (0.0030) [2024-06-12 19:36:45,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49151.9, 300 sec: 48819.1). Total num frames: 1052491776. Throughput: 0: 48900.3. Samples: 581344600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:36:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:36:47,833][71000] Updated weights for policy 0, policy_version 64244 (0.0038) [2024-06-12 19:36:50,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48333.0, 300 sec: 48763.2). Total num frames: 1052704768. Throughput: 0: 48818.8. Samples: 581491040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:36:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:36:51,187][71000] Updated weights for policy 0, policy_version 64254 (0.0033) [2024-06-12 19:36:54,558][71000] Updated weights for policy 0, policy_version 64264 (0.0031) [2024-06-12 19:36:55,940][70768] Fps is (10 sec: 44237.3, 60 sec: 48332.8, 300 sec: 48652.2). Total num frames: 1052934144. Throughput: 0: 48899.6. Samples: 581782160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:36:55,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:36:57,848][71000] Updated weights for policy 0, policy_version 64274 (0.0029) [2024-06-12 19:37:00,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1053196288. Throughput: 0: 49006.6. Samples: 582073080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:37:00,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:37:01,407][71000] Updated weights for policy 0, policy_version 64284 (0.0031) [2024-06-12 19:37:04,803][71000] Updated weights for policy 0, policy_version 64294 (0.0034) [2024-06-12 19:37:05,940][70768] Fps is (10 sec: 52428.3, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1053458432. Throughput: 0: 48736.8. Samples: 582218200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:37:05,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:37:07,878][71000] Updated weights for policy 0, policy_version 64304 (0.0031) [2024-06-12 19:37:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 1053687808. Throughput: 0: 48659.5. Samples: 582511320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:37:10,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:37:11,266][71000] Updated weights for policy 0, policy_version 64314 (0.0029) [2024-06-12 19:37:12,554][70980] Signal inference workers to stop experience collection... (8550 times) [2024-06-12 19:37:12,555][70980] Signal inference workers to resume experience collection... (8550 times) [2024-06-12 19:37:12,601][71000] InferenceWorker_p0-w0: stopping experience collection (8550 times) [2024-06-12 19:37:12,601][71000] InferenceWorker_p0-w0: resuming experience collection (8550 times) [2024-06-12 19:37:14,882][71000] Updated weights for policy 0, policy_version 64324 (0.0029) [2024-06-12 19:37:15,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1053933568. Throughput: 0: 48799.5. Samples: 582804220. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:37:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 19:37:18,185][71000] Updated weights for policy 0, policy_version 64334 (0.0028) [2024-06-12 19:37:20,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 1054179328. Throughput: 0: 48687.6. Samples: 582946600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:37:20,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:37:21,711][71000] Updated weights for policy 0, policy_version 64344 (0.0032) [2024-06-12 19:37:24,964][71000] Updated weights for policy 0, policy_version 64354 (0.0039) [2024-06-12 19:37:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1054425088. Throughput: 0: 48632.1. Samples: 583238020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:37:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:37:28,164][71000] Updated weights for policy 0, policy_version 64364 (0.0037) [2024-06-12 19:37:30,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 1054654464. Throughput: 0: 48713.1. Samples: 583536680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:37:30,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:37:31,722][71000] Updated weights for policy 0, policy_version 64374 (0.0030) [2024-06-12 19:37:34,958][71000] Updated weights for policy 0, policy_version 64384 (0.0027) [2024-06-12 19:37:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 1054916608. Throughput: 0: 48731.5. Samples: 583683960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:37:35,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:37:38,250][71000] Updated weights for policy 0, policy_version 64394 (0.0025) [2024-06-12 19:37:40,940][70768] Fps is (10 sec: 50789.1, 60 sec: 48878.8, 300 sec: 48707.7). 
Total num frames: 1055162368. Throughput: 0: 49021.6. Samples: 583988140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:37:40,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:37:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000064402_1055162368.pth... [2024-06-12 19:37:41,018][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000063689_1043480576.pth [2024-06-12 19:37:41,779][71000] Updated weights for policy 0, policy_version 64404 (0.0023) [2024-06-12 19:37:44,826][71000] Updated weights for policy 0, policy_version 64414 (0.0032) [2024-06-12 19:37:45,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48332.9, 300 sec: 48652.1). Total num frames: 1055391744. Throughput: 0: 48873.4. Samples: 584272380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 19:37:45,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:37:48,523][71000] Updated weights for policy 0, policy_version 64424 (0.0036) [2024-06-12 19:37:50,940][70768] Fps is (10 sec: 49153.0, 60 sec: 49152.0, 300 sec: 48707.7). Total num frames: 1055653888. Throughput: 0: 48966.3. Samples: 584421680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 19:37:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:37:52,016][71000] Updated weights for policy 0, policy_version 64434 (0.0030) [2024-06-12 19:37:55,255][71000] Updated weights for policy 0, policy_version 64444 (0.0028) [2024-06-12 19:37:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49151.9, 300 sec: 48707.7). Total num frames: 1055883264. Throughput: 0: 48694.6. Samples: 584702580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 19:37:55,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:37:58,953][71000] Updated weights for policy 0, policy_version 64454 (0.0032) [2024-06-12 19:38:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 48652.1). Total num frames: 1056145408. 
Throughput: 0: 48763.1. Samples: 584998560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 19:38:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 19:38:00,951][70980] Saving new best policy, reward=0.277! [2024-06-12 19:38:01,788][71000] Updated weights for policy 0, policy_version 64464 (0.0033) [2024-06-12 19:38:05,268][71000] Updated weights for policy 0, policy_version 64474 (0.0032) [2024-06-12 19:38:05,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 1056374784. Throughput: 0: 49001.7. Samples: 585151680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 19:38:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:38:08,394][71000] Updated weights for policy 0, policy_version 64484 (0.0023) [2024-06-12 19:38:10,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 48707.7). Total num frames: 1056636928. Throughput: 0: 49177.8. Samples: 585451020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 19:38:10,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:38:12,035][71000] Updated weights for policy 0, policy_version 64494 (0.0029) [2024-06-12 19:38:15,506][71000] Updated weights for policy 0, policy_version 64504 (0.0033) [2024-06-12 19:38:15,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48596.6). Total num frames: 1056849920. Throughput: 0: 48944.4. Samples: 585739180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 19:38:15,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:38:18,907][71000] Updated weights for policy 0, policy_version 64514 (0.0025) [2024-06-12 19:38:20,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 1057128448. Throughput: 0: 48971.8. Samples: 585887700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 19:38:20,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:38:22,138][71000] Updated weights for policy 0, policy_version 64524 (0.0033) [2024-06-12 19:38:22,948][70980] Signal inference workers to stop experience collection... (8600 times) [2024-06-12 19:38:22,986][71000] InferenceWorker_p0-w0: stopping experience collection (8600 times) [2024-06-12 19:38:22,996][70980] Signal inference workers to resume experience collection... (8600 times) [2024-06-12 19:38:22,997][71000] InferenceWorker_p0-w0: resuming experience collection (8600 times) [2024-06-12 19:38:25,335][71000] Updated weights for policy 0, policy_version 64534 (0.0034) [2024-06-12 19:38:25,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.7, 300 sec: 48652.1). Total num frames: 1057341440. Throughput: 0: 48637.4. Samples: 586176820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 19:38:25,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:38:28,568][71000] Updated weights for policy 0, policy_version 64544 (0.0035) [2024-06-12 19:38:30,944][70768] Fps is (10 sec: 47493.7, 60 sec: 49148.4, 300 sec: 48651.5). Total num frames: 1057603584. Throughput: 0: 48761.6. Samples: 586466860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 19:38:30,944][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:38:32,188][71000] Updated weights for policy 0, policy_version 64554 (0.0029) [2024-06-12 19:38:35,234][71000] Updated weights for policy 0, policy_version 64564 (0.0030) [2024-06-12 19:38:35,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.8, 300 sec: 48652.2). Total num frames: 1057832960. Throughput: 0: 48799.5. Samples: 586617660. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 19:38:35,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:38:38,906][71000] Updated weights for policy 0, policy_version 64574 (0.0030) [2024-06-12 19:38:40,940][70768] Fps is (10 sec: 49172.5, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 1058095104. Throughput: 0: 49109.7. Samples: 586912520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 19:38:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 19:38:42,108][71000] Updated weights for policy 0, policy_version 64584 (0.0027) [2024-06-12 19:38:45,662][71000] Updated weights for policy 0, policy_version 64594 (0.0034) [2024-06-12 19:38:45,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48605.9, 300 sec: 48596.6). Total num frames: 1058308096. Throughput: 0: 49057.4. Samples: 587206140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 19:38:45,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:38:48,755][71000] Updated weights for policy 0, policy_version 64604 (0.0022) [2024-06-12 19:38:50,940][70768] Fps is (10 sec: 47514.4, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1058570240. Throughput: 0: 48806.7. Samples: 587347980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 19:38:50,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:38:52,215][71000] Updated weights for policy 0, policy_version 64614 (0.0042) [2024-06-12 19:38:55,346][71000] Updated weights for policy 0, policy_version 64624 (0.0028) [2024-06-12 19:38:55,939][70768] Fps is (10 sec: 50790.4, 60 sec: 48879.0, 300 sec: 48652.2). Total num frames: 1058816000. Throughput: 0: 48399.1. Samples: 587628980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 19:38:55,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:38:59,097][71000] Updated weights for policy 0, policy_version 64634 (0.0025) [2024-06-12 19:39:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.9, 300 sec: 48707.7). 
Total num frames: 1059061760. Throughput: 0: 48656.0. Samples: 587928700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 19:39:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:39:02,481][71000] Updated weights for policy 0, policy_version 64644 (0.0030) [2024-06-12 19:39:05,785][71000] Updated weights for policy 0, policy_version 64654 (0.0032) [2024-06-12 19:39:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 1059291136. Throughput: 0: 48471.7. Samples: 588068920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 19:39:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 19:39:09,063][71000] Updated weights for policy 0, policy_version 64664 (0.0032) [2024-06-12 19:39:10,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 1059553280. Throughput: 0: 48592.1. Samples: 588363460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:39:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 19:39:12,413][71000] Updated weights for policy 0, policy_version 64674 (0.0035) [2024-06-12 19:39:15,487][71000] Updated weights for policy 0, policy_version 64684 (0.0037) [2024-06-12 19:39:15,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49151.9, 300 sec: 48707.7). Total num frames: 1059799040. Throughput: 0: 48784.1. Samples: 588661940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:39:15,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:39:18,910][71000] Updated weights for policy 0, policy_version 64694 (0.0023) [2024-06-12 19:39:20,939][70768] Fps is (10 sec: 47514.3, 60 sec: 48333.0, 300 sec: 48652.2). Total num frames: 1060028416. Throughput: 0: 48626.8. Samples: 588805860. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:39:20,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:39:22,210][71000] Updated weights for policy 0, policy_version 64704 (0.0026) [2024-06-12 19:39:25,526][71000] Updated weights for policy 0, policy_version 64714 (0.0030) [2024-06-12 19:39:25,940][70768] Fps is (10 sec: 47512.6, 60 sec: 48878.8, 300 sec: 48652.1). Total num frames: 1060274176. Throughput: 0: 48546.5. Samples: 589097120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:39:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:39:28,987][71000] Updated weights for policy 0, policy_version 64724 (0.0032) [2024-06-12 19:39:30,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48609.3, 300 sec: 48652.1). Total num frames: 1060519936. Throughput: 0: 48522.1. Samples: 589389640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:39:30,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:39:32,650][71000] Updated weights for policy 0, policy_version 64734 (0.0031) [2024-06-12 19:39:35,631][71000] Updated weights for policy 0, policy_version 64744 (0.0030) [2024-06-12 19:39:35,940][70768] Fps is (10 sec: 49153.6, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 1060765696. Throughput: 0: 48632.9. Samples: 589536460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:39:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 19:39:36,055][70980] Saving new best policy, reward=0.279! [2024-06-12 19:39:39,408][71000] Updated weights for policy 0, policy_version 64754 (0.0029) [2024-06-12 19:39:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1061011456. Throughput: 0: 49046.1. Samples: 589836060. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:39:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:39:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000064759_1061011456.pth... [2024-06-12 19:39:40,990][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000064045_1049313280.pth [2024-06-12 19:39:41,875][70980] Signal inference workers to stop experience collection... (8650 times) [2024-06-12 19:39:41,875][70980] Signal inference workers to resume experience collection... (8650 times) [2024-06-12 19:39:41,901][71000] InferenceWorker_p0-w0: stopping experience collection (8650 times) [2024-06-12 19:39:41,901][71000] InferenceWorker_p0-w0: resuming experience collection (8650 times) [2024-06-12 19:39:42,376][71000] Updated weights for policy 0, policy_version 64764 (0.0031) [2024-06-12 19:39:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 48596.6). Total num frames: 1061240832. Throughput: 0: 48784.0. Samples: 590123980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:39:45,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:39:45,971][71000] Updated weights for policy 0, policy_version 64774 (0.0023) [2024-06-12 19:39:49,290][71000] Updated weights for policy 0, policy_version 64784 (0.0023) [2024-06-12 19:39:50,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 1061486592. Throughput: 0: 48808.5. Samples: 590265300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 19:39:50,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:39:52,624][71000] Updated weights for policy 0, policy_version 64794 (0.0022) [2024-06-12 19:39:55,863][71000] Updated weights for policy 0, policy_version 64804 (0.0030) [2024-06-12 19:39:55,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.8, 300 sec: 48652.1). Total num frames: 1061748736. Throughput: 0: 48884.4. Samples: 590563260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 19:39:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 19:39:59,575][71000] Updated weights for policy 0, policy_version 64814 (0.0025) [2024-06-12 19:40:00,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48332.9, 300 sec: 48596.6). Total num frames: 1061961728. Throughput: 0: 48590.4. Samples: 590848500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 19:40:00,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:40:02,527][71000] Updated weights for policy 0, policy_version 64824 (0.0028) [2024-06-12 19:40:05,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48878.9, 300 sec: 48652.1). Total num frames: 1062223872. Throughput: 0: 48682.5. Samples: 590996580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 19:40:05,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:40:06,281][71000] Updated weights for policy 0, policy_version 64834 (0.0037) [2024-06-12 19:40:09,245][71000] Updated weights for policy 0, policy_version 64844 (0.0034) [2024-06-12 19:40:10,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48605.9, 300 sec: 48652.1). Total num frames: 1062469632. Throughput: 0: 48667.4. Samples: 591287140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 19:40:10,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:40:12,782][71000] Updated weights for policy 0, policy_version 64854 (0.0030) [2024-06-12 19:40:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 1062715392. Throughput: 0: 48733.3. Samples: 591582640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 19:40:15,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 19:40:15,968][71000] Updated weights for policy 0, policy_version 64864 (0.0029) [2024-06-12 19:40:19,922][71000] Updated weights for policy 0, policy_version 64874 (0.0032) [2024-06-12 19:40:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.7, 300 sec: 48652.1). 
Total num frames: 1062944768. Throughput: 0: 48624.3. Samples: 591724560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 19:40:20,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:40:22,711][71000] Updated weights for policy 0, policy_version 64884 (0.0036) [2024-06-12 19:40:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48606.0, 300 sec: 48652.1). Total num frames: 1063190528. Throughput: 0: 48527.5. Samples: 592019800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 19:40:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:40:26,435][71000] Updated weights for policy 0, policy_version 64894 (0.0025) [2024-06-12 19:40:29,385][71000] Updated weights for policy 0, policy_version 64904 (0.0032) [2024-06-12 19:40:30,939][70768] Fps is (10 sec: 49152.8, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 1063436288. Throughput: 0: 48676.5. Samples: 592314420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 19:40:30,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:40:32,950][71000] Updated weights for policy 0, policy_version 64914 (0.0032) [2024-06-12 19:40:35,940][70768] Fps is (10 sec: 50791.2, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1063698432. Throughput: 0: 48909.3. Samples: 592466220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 19:40:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:40:36,050][71000] Updated weights for policy 0, policy_version 64924 (0.0032) [2024-06-12 19:40:39,853][71000] Updated weights for policy 0, policy_version 64934 (0.0034) [2024-06-12 19:40:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1063927808. Throughput: 0: 48781.4. Samples: 592758420. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 19:40:40,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:40:42,624][71000] Updated weights for policy 0, policy_version 64944 (0.0032) [2024-06-12 19:40:45,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 1064173568. Throughput: 0: 49001.6. Samples: 593053580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 19:40:45,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:40:46,654][71000] Updated weights for policy 0, policy_version 64954 (0.0022) [2024-06-12 19:40:49,488][71000] Updated weights for policy 0, policy_version 64964 (0.0026) [2024-06-12 19:40:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1064419328. Throughput: 0: 48792.5. Samples: 593192240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 19:40:50,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:40:53,225][71000] Updated weights for policy 0, policy_version 64974 (0.0032) [2024-06-12 19:40:54,115][70980] Signal inference workers to stop experience collection... (8700 times) [2024-06-12 19:40:54,115][70980] Signal inference workers to resume experience collection... (8700 times) [2024-06-12 19:40:54,139][71000] InferenceWorker_p0-w0: stopping experience collection (8700 times) [2024-06-12 19:40:54,139][71000] InferenceWorker_p0-w0: resuming experience collection (8700 times) [2024-06-12 19:40:55,939][71000] Updated weights for policy 0, policy_version 64984 (0.0024) [2024-06-12 19:40:55,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 1064697856. Throughput: 0: 49045.8. Samples: 593494200. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 19:40:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:40:59,784][71000] Updated weights for policy 0, policy_version 64994 (0.0031) [2024-06-12 19:41:00,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49151.8, 300 sec: 48763.2). Total num frames: 1064910848. Throughput: 0: 49009.8. Samples: 593788080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 19:41:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 19:41:02,526][71000] Updated weights for policy 0, policy_version 65004 (0.0024) [2024-06-12 19:41:05,939][70768] Fps is (10 sec: 44236.9, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1065140224. Throughput: 0: 48916.6. Samples: 593925800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 19:41:05,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:41:06,760][71000] Updated weights for policy 0, policy_version 65014 (0.0029) [2024-06-12 19:41:09,628][71000] Updated weights for policy 0, policy_version 65024 (0.0029) [2024-06-12 19:41:10,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1065385984. Throughput: 0: 48700.6. Samples: 594211320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 19:41:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 19:41:13,194][71000] Updated weights for policy 0, policy_version 65034 (0.0023) [2024-06-12 19:41:15,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1065664512. Throughput: 0: 48891.1. Samples: 594514520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 19:41:15,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:41:16,507][71000] Updated weights for policy 0, policy_version 65044 (0.0026) [2024-06-12 19:41:19,867][71000] Updated weights for policy 0, policy_version 65054 (0.0030) [2024-06-12 19:41:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 48763.2). 
Total num frames: 1065893888. Throughput: 0: 48772.4. Samples: 594660980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 19:41:20,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:41:22,954][71000] Updated weights for policy 0, policy_version 65064 (0.0027) [2024-06-12 19:41:25,939][70768] Fps is (10 sec: 44236.9, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 1066106880. Throughput: 0: 48849.4. Samples: 594956640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 19:41:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:41:26,563][71000] Updated weights for policy 0, policy_version 65074 (0.0031) [2024-06-12 19:41:29,431][71000] Updated weights for policy 0, policy_version 65084 (0.0033) [2024-06-12 19:41:30,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 1066352640. Throughput: 0: 48615.1. Samples: 595241260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 19:41:30,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:41:33,542][71000] Updated weights for policy 0, policy_version 65094 (0.0029) [2024-06-12 19:41:35,939][70768] Fps is (10 sec: 52428.6, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1066631168. Throughput: 0: 48915.1. Samples: 595393420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 19:41:35,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 19:41:36,448][71000] Updated weights for policy 0, policy_version 65104 (0.0035) [2024-06-12 19:41:40,191][71000] Updated weights for policy 0, policy_version 65114 (0.0029) [2024-06-12 19:41:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 1066844160. Throughput: 0: 48592.4. Samples: 595680860. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 19:41:40,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:41:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000065115_1066844160.pth... [2024-06-12 19:41:41,017][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000064402_1055162368.pth [2024-06-12 19:41:43,611][71000] Updated weights for policy 0, policy_version 65124 (0.0028) [2024-06-12 19:41:45,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1067089920. Throughput: 0: 48649.0. Samples: 595977280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 19:41:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:41:46,965][71000] Updated weights for policy 0, policy_version 65134 (0.0026) [2024-06-12 19:41:49,853][71000] Updated weights for policy 0, policy_version 65144 (0.0035) [2024-06-12 19:41:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1067335680. Throughput: 0: 48815.6. Samples: 596122500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 19:41:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 19:41:53,259][70980] Signal inference workers to stop experience collection... (8750 times) [2024-06-12 19:41:53,312][71000] InferenceWorker_p0-w0: stopping experience collection (8750 times) [2024-06-12 19:41:53,311][70980] Signal inference workers to resume experience collection... (8750 times) [2024-06-12 19:41:53,325][71000] InferenceWorker_p0-w0: resuming experience collection (8750 times) [2024-06-12 19:41:53,459][71000] Updated weights for policy 0, policy_version 65154 (0.0024) [2024-06-12 19:41:55,940][70768] Fps is (10 sec: 54066.8, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 1067630592. Throughput: 0: 49049.2. Samples: 596418540. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 19:41:55,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:41:56,717][71000] Updated weights for policy 0, policy_version 65164 (0.0037) [2024-06-12 19:42:00,541][71000] Updated weights for policy 0, policy_version 65174 (0.0028) [2024-06-12 19:42:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 1067827200. Throughput: 0: 48693.7. Samples: 596705740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 19:42:00,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:42:03,373][71000] Updated weights for policy 0, policy_version 65184 (0.0030) [2024-06-12 19:42:05,939][70768] Fps is (10 sec: 42599.1, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1068056576. Throughput: 0: 48491.6. Samples: 596843100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 19:42:05,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:42:07,096][71000] Updated weights for policy 0, policy_version 65194 (0.0028) [2024-06-12 19:42:09,931][71000] Updated weights for policy 0, policy_version 65204 (0.0031) [2024-06-12 19:42:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 1068318720. Throughput: 0: 48442.6. Samples: 597136560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 19:42:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 19:42:13,753][71000] Updated weights for policy 0, policy_version 65214 (0.0032) [2024-06-12 19:42:15,940][70768] Fps is (10 sec: 54067.0, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1068597248. Throughput: 0: 48929.8. Samples: 597443100. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 19:42:15,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:42:16,535][71000] Updated weights for policy 0, policy_version 65224 (0.0029) [2024-06-12 19:42:20,380][71000] Updated weights for policy 0, policy_version 65234 (0.0030) [2024-06-12 19:42:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1068810240. Throughput: 0: 48838.6. Samples: 597591160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 19:42:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:42:23,327][71000] Updated weights for policy 0, policy_version 65244 (0.0027) [2024-06-12 19:42:25,940][70768] Fps is (10 sec: 44236.0, 60 sec: 48878.7, 300 sec: 48763.2). Total num frames: 1069039616. Throughput: 0: 48801.2. Samples: 597876920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-12 19:42:25,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:42:27,479][71000] Updated weights for policy 0, policy_version 65254 (0.0034) [2024-06-12 19:42:29,914][71000] Updated weights for policy 0, policy_version 65264 (0.0031) [2024-06-12 19:42:30,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 1069301760. Throughput: 0: 48595.0. Samples: 598164060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 19:42:30,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:42:33,779][71000] Updated weights for policy 0, policy_version 65274 (0.0027) [2024-06-12 19:42:35,939][70768] Fps is (10 sec: 52430.3, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1069563904. Throughput: 0: 48915.2. Samples: 598323680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 19:42:35,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:42:36,719][71000] Updated weights for policy 0, policy_version 65284 (0.0030) [2024-06-12 19:42:40,528][71000] Updated weights for policy 0, policy_version 65294 (0.0033) [2024-06-12 19:42:40,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 1069793280. Throughput: 0: 49056.6. Samples: 598626080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 19:42:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 19:42:43,439][71000] Updated weights for policy 0, policy_version 65304 (0.0030) [2024-06-12 19:42:45,943][70768] Fps is (10 sec: 47496.9, 60 sec: 49149.2, 300 sec: 48762.7). Total num frames: 1070039040. Throughput: 0: 49299.8. Samples: 598924400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 19:42:45,944][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 19:42:46,022][70980] Signal inference workers to stop experience collection... (8800 times) [2024-06-12 19:42:46,068][71000] InferenceWorker_p0-w0: stopping experience collection (8800 times) [2024-06-12 19:42:46,133][70980] Signal inference workers to resume experience collection... (8800 times) [2024-06-12 19:42:46,133][71000] InferenceWorker_p0-w0: resuming experience collection (8800 times) [2024-06-12 19:42:47,102][71000] Updated weights for policy 0, policy_version 65314 (0.0032) [2024-06-12 19:42:50,051][71000] Updated weights for policy 0, policy_version 65324 (0.0028) [2024-06-12 19:42:50,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.9, 300 sec: 48818.8). Total num frames: 1070284800. Throughput: 0: 49264.3. Samples: 599060000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 19:42:50,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:42:54,099][71000] Updated weights for policy 0, policy_version 65334 (0.0031) [2024-06-12 19:42:55,940][70768] Fps is (10 sec: 50807.7, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 1070546944. Throughput: 0: 49410.6. Samples: 599360040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 19:42:55,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:42:56,789][71000] Updated weights for policy 0, policy_version 65344 (0.0027) [2024-06-12 19:43:00,681][71000] Updated weights for policy 0, policy_version 65354 (0.0026) [2024-06-12 19:43:00,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 1070776320. Throughput: 0: 49176.0. Samples: 599656020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 19:43:00,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 19:43:03,600][71000] Updated weights for policy 0, policy_version 65364 (0.0026) [2024-06-12 19:43:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 48818.8). Total num frames: 1071038464. Throughput: 0: 49153.8. Samples: 599803080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 19:43:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:43:07,279][71000] Updated weights for policy 0, policy_version 65374 (0.0027) [2024-06-12 19:43:10,165][71000] Updated weights for policy 0, policy_version 65384 (0.0032) [2024-06-12 19:43:10,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 1071267840. Throughput: 0: 49284.6. Samples: 600094720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 19:43:10,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:43:13,866][71000] Updated weights for policy 0, policy_version 65394 (0.0025) [2024-06-12 19:43:15,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.9, 300 sec: 48763.3). 
Total num frames: 1071513600. Throughput: 0: 49245.1. Samples: 600380080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 19:43:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:43:16,934][71000] Updated weights for policy 0, policy_version 65404 (0.0023) [2024-06-12 19:43:20,740][71000] Updated weights for policy 0, policy_version 65414 (0.0039) [2024-06-12 19:43:20,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1071759360. Throughput: 0: 49057.7. Samples: 600531280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 19:43:20,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:43:23,603][71000] Updated weights for policy 0, policy_version 65424 (0.0019) [2024-06-12 19:43:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.2, 300 sec: 48819.5). Total num frames: 1072005120. Throughput: 0: 49034.6. Samples: 600832640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 19:43:25,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:43:26,971][71000] Updated weights for policy 0, policy_version 65434 (0.0025) [2024-06-12 19:43:30,261][71000] Updated weights for policy 0, policy_version 65444 (0.0025) [2024-06-12 19:43:30,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 1072267264. Throughput: 0: 49005.8. Samples: 601129500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 19:43:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:43:33,718][71000] Updated weights for policy 0, policy_version 65454 (0.0031) [2024-06-12 19:43:35,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.7, 300 sec: 48818.8). Total num frames: 1072496640. Throughput: 0: 49403.1. Samples: 601283140. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 19:43:35,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:43:36,722][71000] Updated weights for policy 0, policy_version 65464 (0.0032) [2024-06-12 19:43:40,426][71000] Updated weights for policy 0, policy_version 65474 (0.0028) [2024-06-12 19:43:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.8, 300 sec: 48929.8). Total num frames: 1072742400. Throughput: 0: 49071.3. Samples: 601568260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 19:43:40,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:43:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000065475_1072742400.pth... [2024-06-12 19:43:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000064759_1061011456.pth [2024-06-12 19:43:43,653][71000] Updated weights for policy 0, policy_version 65484 (0.0034) [2024-06-12 19:43:45,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48881.6, 300 sec: 48818.7). Total num frames: 1072971776. Throughput: 0: 48910.9. Samples: 601857020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-12 19:43:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:43:47,146][71000] Updated weights for policy 0, policy_version 65494 (0.0023) [2024-06-12 19:43:50,418][71000] Updated weights for policy 0, policy_version 65504 (0.0031) [2024-06-12 19:43:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1073233920. Throughput: 0: 48980.7. Samples: 602007220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 19:43:50,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:43:53,923][71000] Updated weights for policy 0, policy_version 65514 (0.0028) [2024-06-12 19:43:55,939][70768] Fps is (10 sec: 50791.4, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1073479680. Throughput: 0: 49154.4. Samples: 602306660. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 19:43:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:43:57,032][71000] Updated weights for policy 0, policy_version 65524 (0.0029) [2024-06-12 19:44:00,619][71000] Updated weights for policy 0, policy_version 65534 (0.0036) [2024-06-12 19:44:00,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1073709056. Throughput: 0: 49155.0. Samples: 602592060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 19:44:00,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:44:03,956][71000] Updated weights for policy 0, policy_version 65544 (0.0033) [2024-06-12 19:44:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1073954816. Throughput: 0: 48779.1. Samples: 602726340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 19:44:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:44:07,605][71000] Updated weights for policy 0, policy_version 65554 (0.0030) [2024-06-12 19:44:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1074184192. Throughput: 0: 48566.2. Samples: 603018120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 19:44:10,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:44:10,951][71000] Updated weights for policy 0, policy_version 65564 (0.0028) [2024-06-12 19:44:11,229][70980] Signal inference workers to stop experience collection... (8850 times) [2024-06-12 19:44:11,229][70980] Signal inference workers to resume experience collection... 
(8850 times) [2024-06-12 19:44:11,254][71000] InferenceWorker_p0-w0: stopping experience collection (8850 times) [2024-06-12 19:44:11,254][71000] InferenceWorker_p0-w0: resuming experience collection (8850 times) [2024-06-12 19:44:14,244][71000] Updated weights for policy 0, policy_version 65574 (0.0022) [2024-06-12 19:44:15,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1074429952. Throughput: 0: 48555.7. Samples: 603314500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 19:44:15,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:44:17,442][71000] Updated weights for policy 0, policy_version 65584 (0.0028) [2024-06-12 19:44:20,699][71000] Updated weights for policy 0, policy_version 65594 (0.0031) [2024-06-12 19:44:20,939][70768] Fps is (10 sec: 50790.8, 60 sec: 48879.0, 300 sec: 48874.4). Total num frames: 1074692096. Throughput: 0: 48333.1. Samples: 603458120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 19:44:20,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:44:23,955][71000] Updated weights for policy 0, policy_version 65604 (0.0028) [2024-06-12 19:44:25,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1074937856. Throughput: 0: 48586.1. Samples: 603754620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 19:44:25,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:44:27,541][71000] Updated weights for policy 0, policy_version 65614 (0.0031) [2024-06-12 19:44:30,756][71000] Updated weights for policy 0, policy_version 65624 (0.0031) [2024-06-12 19:44:30,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1075183616. Throughput: 0: 48499.2. Samples: 604039480. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 19:44:30,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:44:34,361][71000] Updated weights for policy 0, policy_version 65634 (0.0029) [2024-06-12 19:44:35,940][70768] Fps is (10 sec: 45874.4, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 1075396608. Throughput: 0: 48440.5. Samples: 604187040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 19:44:35,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:44:37,662][71000] Updated weights for policy 0, policy_version 65644 (0.0025) [2024-06-12 19:44:40,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1075658752. Throughput: 0: 48277.6. Samples: 604479160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 19:44:40,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:44:41,074][71000] Updated weights for policy 0, policy_version 65654 (0.0023) [2024-06-12 19:44:44,327][71000] Updated weights for policy 0, policy_version 65664 (0.0025) [2024-06-12 19:44:45,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1075904512. Throughput: 0: 48544.9. Samples: 604776580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 19:44:45,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 19:44:47,686][71000] Updated weights for policy 0, policy_version 65674 (0.0029) [2024-06-12 19:44:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1076150272. Throughput: 0: 48844.4. Samples: 604924340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 19:44:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:44:51,156][71000] Updated weights for policy 0, policy_version 65684 (0.0024) [2024-06-12 19:44:54,316][71000] Updated weights for policy 0, policy_version 65694 (0.0040) [2024-06-12 19:44:55,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48059.6, 300 sec: 48818.7). 
Total num frames: 1076363264. Throughput: 0: 48868.8. Samples: 605217220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 19:44:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:44:57,778][71000] Updated weights for policy 0, policy_version 65704 (0.0025) [2024-06-12 19:45:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1076641792. Throughput: 0: 48785.7. Samples: 605509860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 19:45:00,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:45:01,090][71000] Updated weights for policy 0, policy_version 65714 (0.0035) [2024-06-12 19:45:04,398][71000] Updated weights for policy 0, policy_version 65724 (0.0032) [2024-06-12 19:45:05,939][70768] Fps is (10 sec: 54068.0, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 1076903936. Throughput: 0: 48921.3. Samples: 605659580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 19:45:05,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:45:07,491][71000] Updated weights for policy 0, policy_version 65734 (0.0028) [2024-06-12 19:45:10,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1077116928. Throughput: 0: 48938.7. Samples: 605956860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:45:10,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:45:11,091][71000] Updated weights for policy 0, policy_version 65744 (0.0026) [2024-06-12 19:45:14,318][71000] Updated weights for policy 0, policy_version 65754 (0.0028) [2024-06-12 19:45:15,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1077362688. Throughput: 0: 48929.9. Samples: 606241320. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:45:15,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 19:45:17,986][71000] Updated weights for policy 0, policy_version 65764 (0.0037) [2024-06-12 19:45:20,909][71000] Updated weights for policy 0, policy_version 65774 (0.0031) [2024-06-12 19:45:20,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 1077641216. Throughput: 0: 48891.2. Samples: 606387140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:45:20,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:45:24,916][71000] Updated weights for policy 0, policy_version 65784 (0.0029) [2024-06-12 19:45:25,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.7, 300 sec: 48874.3). Total num frames: 1077854208. Throughput: 0: 48872.9. Samples: 606678440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:45:25,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:45:27,956][71000] Updated weights for policy 0, policy_version 65794 (0.0033) [2024-06-12 19:45:30,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 1078099968. Throughput: 0: 49034.7. Samples: 606983140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:45:30,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:45:31,178][71000] Updated weights for policy 0, policy_version 65804 (0.0026) [2024-06-12 19:45:32,256][70980] Signal inference workers to stop experience collection... (8900 times) [2024-06-12 19:45:32,300][71000] InferenceWorker_p0-w0: stopping experience collection (8900 times) [2024-06-12 19:45:32,309][70980] Signal inference workers to resume experience collection... 
(8900 times) [2024-06-12 19:45:32,324][71000] InferenceWorker_p0-w0: resuming experience collection (8900 times) [2024-06-12 19:45:34,219][71000] Updated weights for policy 0, policy_version 65814 (0.0037) [2024-06-12 19:45:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1078345728. Throughput: 0: 48921.8. Samples: 607125820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:45:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:45:38,017][71000] Updated weights for policy 0, policy_version 65824 (0.0027) [2024-06-12 19:45:40,779][71000] Updated weights for policy 0, policy_version 65834 (0.0036) [2024-06-12 19:45:40,940][70768] Fps is (10 sec: 52427.7, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 1078624256. Throughput: 0: 49043.0. Samples: 607424160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:45:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:45:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000065834_1078624256.pth... [2024-06-12 19:45:40,985][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000065115_1066844160.pth [2024-06-12 19:45:44,537][71000] Updated weights for policy 0, policy_version 65844 (0.0030) [2024-06-12 19:45:45,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 1078853632. Throughput: 0: 49060.6. Samples: 607717580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:45:45,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:45:47,323][71000] Updated weights for policy 0, policy_version 65854 (0.0027) [2024-06-12 19:45:50,940][70768] Fps is (10 sec: 44237.4, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1079066624. Throughput: 0: 49050.1. Samples: 607866840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 19:45:50,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:45:51,436][71000] Updated weights for policy 0, policy_version 65864 (0.0031) [2024-06-12 19:45:54,587][71000] Updated weights for policy 0, policy_version 65874 (0.0029) [2024-06-12 19:45:55,940][70768] Fps is (10 sec: 47512.5, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 1079328768. Throughput: 0: 48922.4. Samples: 608158380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 19:45:55,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:45:58,115][71000] Updated weights for policy 0, policy_version 65884 (0.0025) [2024-06-12 19:46:00,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 1079590912. Throughput: 0: 49216.2. Samples: 608456060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 19:46:00,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:46:01,130][71000] Updated weights for policy 0, policy_version 65894 (0.0025) [2024-06-12 19:46:04,756][71000] Updated weights for policy 0, policy_version 65904 (0.0028) [2024-06-12 19:46:05,940][70768] Fps is (10 sec: 50791.2, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 1079836672. Throughput: 0: 49384.9. Samples: 608609460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 19:46:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 19:46:07,613][71000] Updated weights for policy 0, policy_version 65914 (0.0031) [2024-06-12 19:46:10,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 1080049664. Throughput: 0: 49410.3. Samples: 608901900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 19:46:10,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:46:11,413][71000] Updated weights for policy 0, policy_version 65924 (0.0025) [2024-06-12 19:46:14,090][71000] Updated weights for policy 0, policy_version 65934 (0.0027) [2024-06-12 19:46:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 1080328192. Throughput: 0: 49276.8. Samples: 609200600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 19:46:15,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:46:17,951][71000] Updated weights for policy 0, policy_version 65944 (0.0030) [2024-06-12 19:46:20,940][70768] Fps is (10 sec: 52428.4, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 1080573952. Throughput: 0: 49347.5. Samples: 609346460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 19:46:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:46:21,025][71000] Updated weights for policy 0, policy_version 65954 (0.0027) [2024-06-12 19:46:24,599][71000] Updated weights for policy 0, policy_version 65964 (0.0031) [2024-06-12 19:46:25,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.2, 300 sec: 49096.5). Total num frames: 1080836096. Throughput: 0: 49487.7. Samples: 609651100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 19:46:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:46:27,595][71000] Updated weights for policy 0, policy_version 65974 (0.0021) [2024-06-12 19:46:30,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1081049088. Throughput: 0: 49427.5. Samples: 609941820. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 19:46:30,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:46:31,196][71000] Updated weights for policy 0, policy_version 65984 (0.0029) [2024-06-12 19:46:34,251][71000] Updated weights for policy 0, policy_version 65994 (0.0028) [2024-06-12 19:46:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 1081311232. Throughput: 0: 49236.0. Samples: 610082460. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 19:46:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:46:38,063][71000] Updated weights for policy 0, policy_version 66004 (0.0025) [2024-06-12 19:46:39,048][70980] Signal inference workers to stop experience collection... (8950 times) [2024-06-12 19:46:39,066][71000] InferenceWorker_p0-w0: stopping experience collection (8950 times) [2024-06-12 19:46:39,104][70980] Signal inference workers to resume experience collection... (8950 times) [2024-06-12 19:46:39,104][71000] InferenceWorker_p0-w0: resuming experience collection (8950 times) [2024-06-12 19:46:40,775][71000] Updated weights for policy 0, policy_version 66014 (0.0029) [2024-06-12 19:46:40,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 1081573376. Throughput: 0: 49299.2. Samples: 610376840. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 19:46:40,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:46:44,631][71000] Updated weights for policy 0, policy_version 66024 (0.0034) [2024-06-12 19:46:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 1081802752. Throughput: 0: 49252.5. Samples: 610672420. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 19:46:45,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:46:47,785][71000] Updated weights for policy 0, policy_version 66034 (0.0034) [2024-06-12 19:46:50,940][70768] Fps is (10 sec: 44236.2, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 1082015744. Throughput: 0: 48974.9. Samples: 610813340. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 19:46:50,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:46:51,621][71000] Updated weights for policy 0, policy_version 66044 (0.0040) [2024-06-12 19:46:54,653][71000] Updated weights for policy 0, policy_version 66054 (0.0038) [2024-06-12 19:46:55,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.1, 300 sec: 49096.4). Total num frames: 1082310656. Throughput: 0: 48841.6. Samples: 611099780. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 19:46:55,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:46:58,249][71000] Updated weights for policy 0, policy_version 66064 (0.0031) [2024-06-12 19:47:00,940][70768] Fps is (10 sec: 49153.0, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 1082507264. Throughput: 0: 48651.1. Samples: 611389900. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 19:47:00,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:47:01,486][71000] Updated weights for policy 0, policy_version 66074 (0.0028) [2024-06-12 19:47:04,815][71000] Updated weights for policy 0, policy_version 66084 (0.0026) [2024-06-12 19:47:05,940][70768] Fps is (10 sec: 44237.6, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 1082753024. Throughput: 0: 48650.3. Samples: 611535720. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 19:47:05,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:47:07,896][71000] Updated weights for policy 0, policy_version 66094 (0.0029) [2024-06-12 19:47:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 48763.2). 
Total num frames: 1082982400. Throughput: 0: 48583.5. Samples: 611837360. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 19:47:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:47:11,521][71000] Updated weights for policy 0, policy_version 66104 (0.0029) [2024-06-12 19:47:14,802][71000] Updated weights for policy 0, policy_version 66114 (0.0033) [2024-06-12 19:47:15,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 1083277312. Throughput: 0: 48353.2. Samples: 612117720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:47:15,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:47:18,379][71000] Updated weights for policy 0, policy_version 66124 (0.0028) [2024-06-12 19:47:20,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 1083490304. Throughput: 0: 48737.6. Samples: 612275660. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:47:20,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:47:21,587][71000] Updated weights for policy 0, policy_version 66134 (0.0027) [2024-06-12 19:47:25,006][71000] Updated weights for policy 0, policy_version 66144 (0.0028) [2024-06-12 19:47:25,940][70768] Fps is (10 sec: 47514.4, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 1083752448. Throughput: 0: 48699.7. Samples: 612568320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:47:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:47:28,113][71000] Updated weights for policy 0, policy_version 66154 (0.0025) [2024-06-12 19:47:30,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1083981824. Throughput: 0: 48654.7. Samples: 612861880. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:47:30,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:47:31,555][71000] Updated weights for policy 0, policy_version 66164 (0.0037) [2024-06-12 19:47:34,653][71000] Updated weights for policy 0, policy_version 66174 (0.0027) [2024-06-12 19:47:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1084260352. Throughput: 0: 49007.3. Samples: 613018660. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:47:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:47:38,214][71000] Updated weights for policy 0, policy_version 66184 (0.0023) [2024-06-12 19:47:40,743][70980] Signal inference workers to stop experience collection... (9000 times) [2024-06-12 19:47:40,743][70980] Signal inference workers to resume experience collection... (9000 times) [2024-06-12 19:47:40,758][71000] InferenceWorker_p0-w0: stopping experience collection (9000 times) [2024-06-12 19:47:40,758][71000] InferenceWorker_p0-w0: resuming experience collection (9000 times) [2024-06-12 19:47:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48332.8, 300 sec: 48930.4). Total num frames: 1084473344. Throughput: 0: 49092.6. Samples: 613308940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:47:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:47:41,044][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000066192_1084489728.pth... [2024-06-12 19:47:41,100][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000065475_1072742400.pth [2024-06-12 19:47:41,385][71000] Updated weights for policy 0, policy_version 66194 (0.0032) [2024-06-12 19:47:44,966][71000] Updated weights for policy 0, policy_version 66204 (0.0028) [2024-06-12 19:47:45,942][70768] Fps is (10 sec: 44225.7, 60 sec: 48330.9, 300 sec: 48873.9). Total num frames: 1084702720. Throughput: 0: 48987.1. Samples: 613594440. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:47:45,943][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:47:48,450][71000] Updated weights for policy 0, policy_version 66214 (0.0024) [2024-06-12 19:47:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 1084948480. Throughput: 0: 48878.6. Samples: 613735260. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:47:50,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:47:51,877][71000] Updated weights for policy 0, policy_version 66224 (0.0022) [2024-06-12 19:47:55,055][71000] Updated weights for policy 0, policy_version 66234 (0.0031) [2024-06-12 19:47:55,940][70768] Fps is (10 sec: 52440.9, 60 sec: 48605.9, 300 sec: 48985.3). Total num frames: 1085227008. Throughput: 0: 48727.0. Samples: 614030080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:47:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:47:58,226][71000] Updated weights for policy 0, policy_version 66244 (0.0037) [2024-06-12 19:48:00,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 1085440000. Throughput: 0: 49231.5. Samples: 614333140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:48:00,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:48:01,752][71000] Updated weights for policy 0, policy_version 66254 (0.0036) [2024-06-12 19:48:05,353][71000] Updated weights for policy 0, policy_version 66264 (0.0025) [2024-06-12 19:48:05,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1085685760. Throughput: 0: 48662.7. Samples: 614465480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:48:05,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:48:08,512][71000] Updated weights for policy 0, policy_version 66274 (0.0031) [2024-06-12 19:48:10,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.1, 300 sec: 48929.8). 
Total num frames: 1085947904. Throughput: 0: 48718.6. Samples: 614760660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:48:10,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:48:11,883][71000] Updated weights for policy 0, policy_version 66284 (0.0027) [2024-06-12 19:48:15,479][71000] Updated weights for policy 0, policy_version 66294 (0.0030) [2024-06-12 19:48:15,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48606.0, 300 sec: 48929.8). Total num frames: 1086193664. Throughput: 0: 48712.5. Samples: 615053940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:48:15,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:48:18,650][71000] Updated weights for policy 0, policy_version 66304 (0.0028) [2024-06-12 19:48:20,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 1086406656. Throughput: 0: 48483.5. Samples: 615200420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:48:20,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:48:21,940][71000] Updated weights for policy 0, policy_version 66314 (0.0030) [2024-06-12 19:48:25,171][71000] Updated weights for policy 0, policy_version 66324 (0.0028) [2024-06-12 19:48:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1086685184. Throughput: 0: 48676.0. Samples: 615499360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:48:25,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:48:28,541][71000] Updated weights for policy 0, policy_version 66334 (0.0024) [2024-06-12 19:48:30,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 1086947328. Throughput: 0: 48752.0. Samples: 615788160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 19:48:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:48:31,953][71000] Updated weights for policy 0, policy_version 66344 (0.0026) [2024-06-12 19:48:35,378][71000] Updated weights for policy 0, policy_version 66354 (0.0024) [2024-06-12 19:48:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 1087176704. Throughput: 0: 49090.3. Samples: 615944320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 19:48:35,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:48:38,340][71000] Updated weights for policy 0, policy_version 66364 (0.0028) [2024-06-12 19:48:40,222][70980] Signal inference workers to stop experience collection... (9050 times) [2024-06-12 19:48:40,254][71000] InferenceWorker_p0-w0: stopping experience collection (9050 times) [2024-06-12 19:48:40,280][70980] Signal inference workers to resume experience collection... (9050 times) [2024-06-12 19:48:40,282][71000] InferenceWorker_p0-w0: resuming experience collection (9050 times) [2024-06-12 19:48:40,940][70768] Fps is (10 sec: 44236.8, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1087389696. Throughput: 0: 49093.1. Samples: 616239260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 19:48:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 19:48:42,091][71000] Updated weights for policy 0, policy_version 66374 (0.0024) [2024-06-12 19:48:45,186][71000] Updated weights for policy 0, policy_version 66384 (0.0027) [2024-06-12 19:48:45,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49427.1, 300 sec: 48929.9). Total num frames: 1087668224. Throughput: 0: 48826.8. Samples: 616530340. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 19:48:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:48:48,547][71000] Updated weights for policy 0, policy_version 66394 (0.0027) [2024-06-12 19:48:50,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1087897600. Throughput: 0: 49198.4. Samples: 616679400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 19:48:50,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:48:51,946][71000] Updated weights for policy 0, policy_version 66404 (0.0037) [2024-06-12 19:48:55,511][71000] Updated weights for policy 0, policy_version 66414 (0.0037) [2024-06-12 19:48:55,939][70768] Fps is (10 sec: 45875.4, 60 sec: 48333.0, 300 sec: 48874.3). Total num frames: 1088126976. Throughput: 0: 49082.3. Samples: 616969360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 19:48:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 19:48:58,532][71000] Updated weights for policy 0, policy_version 66424 (0.0021) [2024-06-12 19:49:00,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 1088389120. Throughput: 0: 49152.9. Samples: 617265820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 19:49:00,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:49:02,362][71000] Updated weights for policy 0, policy_version 66434 (0.0027) [2024-06-12 19:49:04,851][71000] Updated weights for policy 0, policy_version 66444 (0.0029) [2024-06-12 19:49:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 1088634880. Throughput: 0: 49110.7. Samples: 617410400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 19:49:05,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 19:49:08,817][71000] Updated weights for policy 0, policy_version 66454 (0.0033) [2024-06-12 19:49:10,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.0, 300 sec: 49040.9). 
Total num frames: 1088897024. Throughput: 0: 49107.2. Samples: 617709180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 19:49:10,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:49:11,535][71000] Updated weights for policy 0, policy_version 66464 (0.0026) [2024-06-12 19:49:15,389][71000] Updated weights for policy 0, policy_version 66474 (0.0025) [2024-06-12 19:49:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 1089126400. Throughput: 0: 49130.1. Samples: 617999020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 19:49:15,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:49:18,598][71000] Updated weights for policy 0, policy_version 66484 (0.0031) [2024-06-12 19:49:20,940][70768] Fps is (10 sec: 45874.3, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 1089355776. Throughput: 0: 48702.5. Samples: 618135940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 19:49:20,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:49:22,647][71000] Updated weights for policy 0, policy_version 66494 (0.0028) [2024-06-12 19:49:25,077][71000] Updated weights for policy 0, policy_version 66504 (0.0031) [2024-06-12 19:49:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 1089617920. Throughput: 0: 48705.6. Samples: 618431020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 19:49:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:49:29,080][71000] Updated weights for policy 0, policy_version 66514 (0.0033) [2024-06-12 19:49:30,940][70768] Fps is (10 sec: 52429.5, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 1089880064. Throughput: 0: 48935.1. Samples: 618732420. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 19:49:30,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:49:31,476][71000] Updated weights for policy 0, policy_version 66524 (0.0032) [2024-06-12 19:49:35,653][71000] Updated weights for policy 0, policy_version 66534 (0.0025) [2024-06-12 19:49:35,940][70768] Fps is (10 sec: 47514.4, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 1090093056. Throughput: 0: 48905.7. Samples: 618880160. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 19:49:35,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:49:38,513][71000] Updated weights for policy 0, policy_version 66544 (0.0038) [2024-06-12 19:49:40,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49424.9, 300 sec: 48985.4). Total num frames: 1090355200. Throughput: 0: 49078.8. Samples: 619177920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 19:49:40,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:49:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000066550_1090355200.pth... [2024-06-12 19:49:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000065834_1078624256.pth [2024-06-12 19:49:42,236][70980] Signal inference workers to stop experience collection... (9100 times) [2024-06-12 19:49:42,236][70980] Signal inference workers to resume experience collection... (9100 times) [2024-06-12 19:49:42,260][71000] InferenceWorker_p0-w0: stopping experience collection (9100 times) [2024-06-12 19:49:42,260][71000] InferenceWorker_p0-w0: resuming experience collection (9100 times) [2024-06-12 19:49:42,374][71000] Updated weights for policy 0, policy_version 66554 (0.0031) [2024-06-12 19:49:45,168][71000] Updated weights for policy 0, policy_version 66564 (0.0033) [2024-06-12 19:49:45,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 1090584576. Throughput: 0: 48752.1. Samples: 619459660. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 19:49:45,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:49:49,130][71000] Updated weights for policy 0, policy_version 66574 (0.0030) [2024-06-12 19:49:50,939][70768] Fps is (10 sec: 50791.7, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 1090863104. Throughput: 0: 49201.4. Samples: 619624460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-12 19:49:50,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:49:51,482][71000] Updated weights for policy 0, policy_version 66584 (0.0031) [2024-06-12 19:49:55,923][71000] Updated weights for policy 0, policy_version 66594 (0.0026) [2024-06-12 19:49:55,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 1091076096. Throughput: 0: 49050.1. Samples: 619916440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:49:55,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:49:58,415][71000] Updated weights for policy 0, policy_version 66604 (0.0023) [2024-06-12 19:50:00,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1091321856. Throughput: 0: 48952.1. Samples: 620201860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:50:00,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 19:50:02,674][71000] Updated weights for policy 0, policy_version 66614 (0.0034) [2024-06-12 19:50:05,616][71000] Updated weights for policy 0, policy_version 66624 (0.0031) [2024-06-12 19:50:05,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 1091567616. Throughput: 0: 49110.0. Samples: 620345880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:50:05,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:50:09,380][71000] Updated weights for policy 0, policy_version 66634 (0.0031) [2024-06-12 19:50:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.7, 300 sec: 48985.4). 
Total num frames: 1091813376. Throughput: 0: 49029.8. Samples: 620637360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:50:10,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 19:50:12,255][71000] Updated weights for policy 0, policy_version 66644 (0.0026) [2024-06-12 19:50:15,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 1092042752. Throughput: 0: 48996.5. Samples: 620937260. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:50:15,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:50:16,075][71000] Updated weights for policy 0, policy_version 66654 (0.0028) [2024-06-12 19:50:19,155][71000] Updated weights for policy 0, policy_version 66664 (0.0027) [2024-06-12 19:50:20,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 1092304896. Throughput: 0: 48771.0. Samples: 621074860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:50:20,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:50:22,674][71000] Updated weights for policy 0, policy_version 66674 (0.0024) [2024-06-12 19:50:25,731][71000] Updated weights for policy 0, policy_version 66684 (0.0031) [2024-06-12 19:50:25,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 1092567040. Throughput: 0: 48812.2. Samples: 621374460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:50:25,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:50:29,638][71000] Updated weights for policy 0, policy_version 66694 (0.0029) [2024-06-12 19:50:30,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 1092796416. Throughput: 0: 48984.4. Samples: 621663960. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:50:30,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:50:32,590][71000] Updated weights for policy 0, policy_version 66704 (0.0038) [2024-06-12 19:50:35,940][70768] Fps is (10 sec: 44235.5, 60 sec: 48605.6, 300 sec: 48763.2). Total num frames: 1093009408. Throughput: 0: 48420.5. Samples: 621803400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-12 19:50:35,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:50:36,211][71000] Updated weights for policy 0, policy_version 66714 (0.0030) [2024-06-12 19:50:38,914][71000] Updated weights for policy 0, policy_version 66724 (0.0026) [2024-06-12 19:50:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.1, 300 sec: 48929.8). Total num frames: 1093287936. Throughput: 0: 48575.2. Samples: 622102320. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-12 19:50:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:50:42,726][71000] Updated weights for policy 0, policy_version 66734 (0.0026) [2024-06-12 19:50:45,806][71000] Updated weights for policy 0, policy_version 66744 (0.0031) [2024-06-12 19:50:45,940][70768] Fps is (10 sec: 54068.6, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 1093550080. Throughput: 0: 48828.5. Samples: 622399140. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-12 19:50:45,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:50:49,605][71000] Updated weights for policy 0, policy_version 66754 (0.0035) [2024-06-12 19:50:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 1093779456. Throughput: 0: 48905.3. Samples: 622546620. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-12 19:50:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 19:50:51,848][70980] Signal inference workers to stop experience collection... 
(9150 times) [2024-06-12 19:50:51,869][71000] InferenceWorker_p0-w0: stopping experience collection (9150 times) [2024-06-12 19:50:51,955][70980] Signal inference workers to resume experience collection... (9150 times) [2024-06-12 19:50:51,956][71000] InferenceWorker_p0-w0: resuming experience collection (9150 times) [2024-06-12 19:50:52,427][71000] Updated weights for policy 0, policy_version 66764 (0.0028) [2024-06-12 19:50:55,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1094008832. Throughput: 0: 48661.9. Samples: 622827140. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-12 19:50:55,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:50:56,275][71000] Updated weights for policy 0, policy_version 66774 (0.0031) [2024-06-12 19:50:59,183][71000] Updated weights for policy 0, policy_version 66784 (0.0031) [2024-06-12 19:51:00,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 1094270976. Throughput: 0: 48560.8. Samples: 623122500. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-12 19:51:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:51:03,079][71000] Updated weights for policy 0, policy_version 66794 (0.0029) [2024-06-12 19:51:05,794][71000] Updated weights for policy 0, policy_version 66804 (0.0027) [2024-06-12 19:51:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1094516736. Throughput: 0: 48922.3. Samples: 623276360. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-12 19:51:05,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:51:09,637][71000] Updated weights for policy 0, policy_version 66814 (0.0028) [2024-06-12 19:51:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1094746112. Throughput: 0: 48920.4. Samples: 623575880. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-12 19:51:10,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:51:12,632][71000] Updated weights for policy 0, policy_version 66824 (0.0030) [2024-06-12 19:51:15,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48878.8, 300 sec: 48818.8). Total num frames: 1094975488. Throughput: 0: 48776.8. Samples: 623858920. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-12 19:51:15,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:51:16,312][71000] Updated weights for policy 0, policy_version 66834 (0.0041) [2024-06-12 19:51:19,279][71000] Updated weights for policy 0, policy_version 66844 (0.0039) [2024-06-12 19:51:20,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 1095254016. Throughput: 0: 49081.8. Samples: 624012080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 19:51:20,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:51:22,890][71000] Updated weights for policy 0, policy_version 66854 (0.0031) [2024-06-12 19:51:25,738][71000] Updated weights for policy 0, policy_version 66864 (0.0027) [2024-06-12 19:51:25,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 1095499776. Throughput: 0: 49092.8. Samples: 624311500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 19:51:25,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:51:29,411][71000] Updated weights for policy 0, policy_version 66874 (0.0037) [2024-06-12 19:51:30,940][70768] Fps is (10 sec: 45876.2, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1095712768. Throughput: 0: 49082.7. Samples: 624607860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 19:51:30,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:51:32,417][71000] Updated weights for policy 0, policy_version 66884 (0.0033) [2024-06-12 19:51:35,939][70768] Fps is (10 sec: 45875.9, 60 sec: 49152.2, 300 sec: 48763.2). 
Total num frames: 1095958528. Throughput: 0: 48968.5. Samples: 624750200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 19:51:35,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:51:36,192][71000] Updated weights for policy 0, policy_version 66894 (0.0033) [2024-06-12 19:51:39,358][71000] Updated weights for policy 0, policy_version 66904 (0.0048) [2024-06-12 19:51:40,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1096220672. Throughput: 0: 49140.9. Samples: 625038480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 19:51:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:51:40,967][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000066909_1096237056.pth... [2024-06-12 19:51:41,013][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000066192_1084489728.pth [2024-06-12 19:51:43,043][71000] Updated weights for policy 0, policy_version 66914 (0.0029) [2024-06-12 19:51:45,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48332.7, 300 sec: 48929.9). Total num frames: 1096450048. Throughput: 0: 49043.5. Samples: 625329460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 19:51:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 19:51:46,517][71000] Updated weights for policy 0, policy_version 66924 (0.0032) [2024-06-12 19:51:49,687][71000] Updated weights for policy 0, policy_version 66934 (0.0020) [2024-06-12 19:51:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.9, 300 sec: 48763.3). Total num frames: 1096695808. Throughput: 0: 48672.0. Samples: 625466600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 19:51:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:51:53,063][71000] Updated weights for policy 0, policy_version 66944 (0.0033) [2024-06-12 19:51:55,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 1096941568. 
Throughput: 0: 48528.0. Samples: 625759640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 19:51:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:51:56,201][71000] Updated weights for policy 0, policy_version 66954 (0.0031) [2024-06-12 19:51:59,734][71000] Updated weights for policy 0, policy_version 66964 (0.0027) [2024-06-12 19:52:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 1097187328. Throughput: 0: 48847.2. Samples: 626057040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-12 19:52:00,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:52:03,077][71000] Updated weights for policy 0, policy_version 66974 (0.0028) [2024-06-12 19:52:04,686][70980] Signal inference workers to stop experience collection... (9200 times) [2024-06-12 19:52:04,726][71000] InferenceWorker_p0-w0: stopping experience collection (9200 times) [2024-06-12 19:52:04,732][70980] Signal inference workers to resume experience collection... (9200 times) [2024-06-12 19:52:04,742][71000] InferenceWorker_p0-w0: resuming experience collection (9200 times) [2024-06-12 19:52:05,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48332.7, 300 sec: 48929.8). Total num frames: 1097416704. Throughput: 0: 48717.5. Samples: 626204360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-12 19:52:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 19:52:06,679][71000] Updated weights for policy 0, policy_version 66984 (0.0033) [2024-06-12 19:52:09,679][71000] Updated weights for policy 0, policy_version 66994 (0.0030) [2024-06-12 19:52:10,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.9, 300 sec: 48763.3). Total num frames: 1097662464. Throughput: 0: 48406.0. Samples: 626489760. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-12 19:52:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 19:52:13,585][71000] Updated weights for policy 0, policy_version 67004 (0.0033) [2024-06-12 19:52:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 1097924608. Throughput: 0: 48257.2. Samples: 626779440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-12 19:52:15,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:52:16,243][71000] Updated weights for policy 0, policy_version 67014 (0.0027) [2024-06-12 19:52:20,147][71000] Updated weights for policy 0, policy_version 67024 (0.0026) [2024-06-12 19:52:20,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48333.0, 300 sec: 48818.8). Total num frames: 1098153984. Throughput: 0: 48631.1. Samples: 626938600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-12 19:52:20,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:52:22,776][71000] Updated weights for policy 0, policy_version 67034 (0.0035) [2024-06-12 19:52:25,939][70768] Fps is (10 sec: 47514.6, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 1098399744. Throughput: 0: 48860.0. Samples: 627237180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-12 19:52:25,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:52:26,780][71000] Updated weights for policy 0, policy_version 67044 (0.0032) [2024-06-12 19:52:29,551][71000] Updated weights for policy 0, policy_version 67054 (0.0032) [2024-06-12 19:52:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1098629120. Throughput: 0: 48711.2. Samples: 627521460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-12 19:52:30,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:52:33,436][71000] Updated weights for policy 0, policy_version 67064 (0.0028) [2024-06-12 19:52:35,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 48929.9). 
Total num frames: 1098907648. Throughput: 0: 49076.0. Samples: 627675020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-12 19:52:35,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:52:36,110][71000] Updated weights for policy 0, policy_version 67074 (0.0033) [2024-06-12 19:52:40,181][71000] Updated weights for policy 0, policy_version 67084 (0.0036) [2024-06-12 19:52:40,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48605.8, 300 sec: 48930.2). Total num frames: 1099137024. Throughput: 0: 49076.8. Samples: 627968100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 19:52:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:52:43,021][71000] Updated weights for policy 0, policy_version 67094 (0.0030) [2024-06-12 19:52:45,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1099366400. Throughput: 0: 48935.5. Samples: 628259140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 19:52:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:52:46,707][71000] Updated weights for policy 0, policy_version 67104 (0.0030) [2024-06-12 19:52:49,768][71000] Updated weights for policy 0, policy_version 67114 (0.0031) [2024-06-12 19:52:50,939][70768] Fps is (10 sec: 47514.3, 60 sec: 48605.9, 300 sec: 48763.3). Total num frames: 1099612160. Throughput: 0: 48946.8. Samples: 628406960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 19:52:50,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:52:53,465][71000] Updated weights for policy 0, policy_version 67124 (0.0036) [2024-06-12 19:52:55,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 1099890688. Throughput: 0: 49232.3. Samples: 628705220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 19:52:55,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:52:56,247][71000] Updated weights for policy 0, policy_version 67134 (0.0034) [2024-06-12 19:53:00,254][71000] Updated weights for policy 0, policy_version 67144 (0.0035) [2024-06-12 19:53:00,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1100103680. Throughput: 0: 49066.0. Samples: 628987400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 19:53:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:53:01,216][70980] Signal inference workers to stop experience collection... (9250 times) [2024-06-12 19:53:01,261][71000] InferenceWorker_p0-w0: stopping experience collection (9250 times) [2024-06-12 19:53:01,272][70980] Signal inference workers to resume experience collection... (9250 times) [2024-06-12 19:53:01,273][71000] InferenceWorker_p0-w0: resuming experience collection (9250 times) [2024-06-12 19:53:03,289][71000] Updated weights for policy 0, policy_version 67154 (0.0030) [2024-06-12 19:53:05,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 48818.7). Total num frames: 1100349440. Throughput: 0: 48696.3. Samples: 629129940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 19:53:05,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:53:07,044][71000] Updated weights for policy 0, policy_version 67164 (0.0029) [2024-06-12 19:53:10,003][71000] Updated weights for policy 0, policy_version 67174 (0.0025) [2024-06-12 19:53:10,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 1100595200. Throughput: 0: 48565.5. Samples: 629422640. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 19:53:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:53:13,654][71000] Updated weights for policy 0, policy_version 67184 (0.0029) [2024-06-12 19:53:15,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 1100857344. Throughput: 0: 48807.0. Samples: 629717780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 19:53:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:53:16,741][71000] Updated weights for policy 0, policy_version 67194 (0.0036) [2024-06-12 19:53:20,419][71000] Updated weights for policy 0, policy_version 67204 (0.0035) [2024-06-12 19:53:20,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.7, 300 sec: 48763.2). Total num frames: 1101070336. Throughput: 0: 48553.2. Samples: 629859920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 19:53:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:53:23,223][71000] Updated weights for policy 0, policy_version 67214 (0.0031) [2024-06-12 19:53:25,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 1101316096. Throughput: 0: 48452.6. Samples: 630148460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:53:25,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:53:27,257][71000] Updated weights for policy 0, policy_version 67224 (0.0024) [2024-06-12 19:53:30,193][71000] Updated weights for policy 0, policy_version 67234 (0.0025) [2024-06-12 19:53:30,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 1101578240. Throughput: 0: 48379.2. Samples: 630436200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:53:30,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:53:34,140][71000] Updated weights for policy 0, policy_version 67244 (0.0032) [2024-06-12 19:53:35,940][70768] Fps is (10 sec: 52427.9, 60 sec: 48878.8, 300 sec: 48985.4). 
Total num frames: 1101840384. Throughput: 0: 48607.8. Samples: 630594320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:53:35,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:53:36,931][71000] Updated weights for policy 0, policy_version 67254 (0.0022) [2024-06-12 19:53:40,614][71000] Updated weights for policy 0, policy_version 67264 (0.0031) [2024-06-12 19:53:40,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48606.0, 300 sec: 48763.2). Total num frames: 1102053376. Throughput: 0: 48477.5. Samples: 630886700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:53:40,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:53:41,047][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000067265_1102069760.pth... [2024-06-12 19:53:41,085][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000066550_1090355200.pth [2024-06-12 19:53:43,374][71000] Updated weights for policy 0, policy_version 67274 (0.0034) [2024-06-12 19:53:45,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1102282752. Throughput: 0: 48627.8. Samples: 631175660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:53:45,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 19:53:47,843][71000] Updated weights for policy 0, policy_version 67284 (0.0037) [2024-06-12 19:53:50,253][71000] Updated weights for policy 0, policy_version 67294 (0.0024) [2024-06-12 19:53:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 1102561280. Throughput: 0: 48640.1. Samples: 631318740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:53:50,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:53:54,375][71000] Updated weights for policy 0, policy_version 67304 (0.0025) [2024-06-12 19:53:55,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 1102790656. 
Throughput: 0: 48789.5. Samples: 631618160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:53:55,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:53:56,949][71000] Updated weights for policy 0, policy_version 67314 (0.0032) [2024-06-12 19:54:00,802][71000] Updated weights for policy 0, policy_version 67324 (0.0027) [2024-06-12 19:54:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1103036416. Throughput: 0: 48869.5. Samples: 631916900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-12 19:54:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:54:03,547][71000] Updated weights for policy 0, policy_version 67334 (0.0031) [2024-06-12 19:54:05,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 1103265792. Throughput: 0: 48778.4. Samples: 632054940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 19:54:05,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:54:07,410][71000] Updated weights for policy 0, policy_version 67344 (0.0035) [2024-06-12 19:54:10,271][71000] Updated weights for policy 0, policy_version 67354 (0.0021) [2024-06-12 19:54:10,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1103544320. Throughput: 0: 48968.0. Samples: 632352020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 19:54:10,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:54:14,338][71000] Updated weights for policy 0, policy_version 67364 (0.0032) [2024-06-12 19:54:15,604][70980] Signal inference workers to stop experience collection... (9300 times) [2024-06-12 19:54:15,640][71000] InferenceWorker_p0-w0: stopping experience collection (9300 times) [2024-06-12 19:54:15,662][70980] Signal inference workers to resume experience collection... 
(9300 times) [2024-06-12 19:54:15,663][71000] InferenceWorker_p0-w0: resuming experience collection (9300 times) [2024-06-12 19:54:15,940][70768] Fps is (10 sec: 50789.3, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1103773696. Throughput: 0: 49018.5. Samples: 632642040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 19:54:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:54:17,152][71000] Updated weights for policy 0, policy_version 67374 (0.0026) [2024-06-12 19:54:20,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 1104003072. Throughput: 0: 48689.0. Samples: 632785320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 19:54:20,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:54:21,299][71000] Updated weights for policy 0, policy_version 67384 (0.0025) [2024-06-12 19:54:23,919][71000] Updated weights for policy 0, policy_version 67394 (0.0026) [2024-06-12 19:54:25,940][70768] Fps is (10 sec: 47514.4, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 1104248832. Throughput: 0: 48511.1. Samples: 633069700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 19:54:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:54:27,699][71000] Updated weights for policy 0, policy_version 67404 (0.0020) [2024-06-12 19:54:30,444][71000] Updated weights for policy 0, policy_version 67414 (0.0032) [2024-06-12 19:54:30,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 1104527360. Throughput: 0: 48781.0. Samples: 633370800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 19:54:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:54:34,312][71000] Updated weights for policy 0, policy_version 67424 (0.0030) [2024-06-12 19:54:35,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 1104756736. Throughput: 0: 49111.6. Samples: 633528760. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 19:54:35,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:54:37,107][71000] Updated weights for policy 0, policy_version 67434 (0.0031) [2024-06-12 19:54:40,933][71000] Updated weights for policy 0, policy_version 67444 (0.0031) [2024-06-12 19:54:40,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1105002496. Throughput: 0: 48847.6. Samples: 633816300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 19:54:40,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 19:54:43,870][71000] Updated weights for policy 0, policy_version 67454 (0.0020) [2024-06-12 19:54:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.1, 300 sec: 48707.7). Total num frames: 1105231872. Throughput: 0: 48597.8. Samples: 634103800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:54:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:54:48,058][71000] Updated weights for policy 0, policy_version 67464 (0.0029) [2024-06-12 19:54:50,537][71000] Updated weights for policy 0, policy_version 67474 (0.0028) [2024-06-12 19:54:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1105494016. Throughput: 0: 49000.9. Samples: 634259980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:54:50,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:54:54,392][71000] Updated weights for policy 0, policy_version 67484 (0.0021) [2024-06-12 19:54:55,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1105723392. Throughput: 0: 49033.4. Samples: 634558520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:54:55,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:54:57,291][71000] Updated weights for policy 0, policy_version 67494 (0.0036) [2024-06-12 19:55:00,939][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 48818.8). 
Total num frames: 1105969152. Throughput: 0: 49174.5. Samples: 634854880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:55:00,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:55:01,038][71000] Updated weights for policy 0, policy_version 67504 (0.0036) [2024-06-12 19:55:03,929][71000] Updated weights for policy 0, policy_version 67514 (0.0032) [2024-06-12 19:55:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48763.3). Total num frames: 1106198528. Throughput: 0: 48982.7. Samples: 634989540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:55:05,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:55:08,018][71000] Updated weights for policy 0, policy_version 67524 (0.0035) [2024-06-12 19:55:10,782][71000] Updated weights for policy 0, policy_version 67534 (0.0031) [2024-06-12 19:55:10,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 1106477056. Throughput: 0: 49327.1. Samples: 635289420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:55:10,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:55:14,618][71000] Updated weights for policy 0, policy_version 67544 (0.0027) [2024-06-12 19:55:15,496][70980] Signal inference workers to stop experience collection... (9350 times) [2024-06-12 19:55:15,540][71000] InferenceWorker_p0-w0: stopping experience collection (9350 times) [2024-06-12 19:55:15,607][70980] Signal inference workers to resume experience collection... (9350 times) [2024-06-12 19:55:15,608][71000] InferenceWorker_p0-w0: resuming experience collection (9350 times) [2024-06-12 19:55:15,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1106722816. Throughput: 0: 49306.3. Samples: 635589580. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:55:15,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:55:17,699][71000] Updated weights for policy 0, policy_version 67554 (0.0024) [2024-06-12 19:55:20,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 1106935808. Throughput: 0: 48922.2. Samples: 635730260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 19:55:20,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:55:21,350][71000] Updated weights for policy 0, policy_version 67564 (0.0036) [2024-06-12 19:55:24,314][71000] Updated weights for policy 0, policy_version 67574 (0.0029) [2024-06-12 19:55:25,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 1107181568. Throughput: 0: 48896.0. Samples: 636016620. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-12 19:55:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:55:28,241][71000] Updated weights for policy 0, policy_version 67584 (0.0029) [2024-06-12 19:55:30,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 1107443712. Throughput: 0: 49059.9. Samples: 636311500. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-12 19:55:30,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:55:30,995][71000] Updated weights for policy 0, policy_version 67594 (0.0033) [2024-06-12 19:55:34,716][71000] Updated weights for policy 0, policy_version 67604 (0.0034) [2024-06-12 19:55:35,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1107689472. Throughput: 0: 48898.6. Samples: 636460420. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-12 19:55:35,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:55:37,587][71000] Updated weights for policy 0, policy_version 67614 (0.0031) [2024-06-12 19:55:40,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48879.0, 300 sec: 48763.2). 
Total num frames: 1107935232. Throughput: 0: 48966.6. Samples: 636762020. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-12 19:55:40,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 19:55:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000067623_1107935232.pth... [2024-06-12 19:55:41,006][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000066909_1096237056.pth [2024-06-12 19:55:41,263][71000] Updated weights for policy 0, policy_version 67624 (0.0030) [2024-06-12 19:55:44,443][71000] Updated weights for policy 0, policy_version 67634 (0.0027) [2024-06-12 19:55:45,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 48874.3). Total num frames: 1108197376. Throughput: 0: 48668.5. Samples: 637044960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-12 19:55:45,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:55:48,362][71000] Updated weights for policy 0, policy_version 67644 (0.0037) [2024-06-12 19:55:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 1108410368. Throughput: 0: 49013.8. Samples: 637195160. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-12 19:55:50,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:55:51,349][71000] Updated weights for policy 0, policy_version 67654 (0.0029) [2024-06-12 19:55:54,926][71000] Updated weights for policy 0, policy_version 67664 (0.0033) [2024-06-12 19:55:55,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1108656128. Throughput: 0: 48672.0. Samples: 637479660. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-12 19:55:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:55:58,233][71000] Updated weights for policy 0, policy_version 67674 (0.0027) [2024-06-12 19:56:00,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48878.7, 300 sec: 48763.2). Total num frames: 1108901888. 
Throughput: 0: 48299.8. Samples: 637763080. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-12 19:56:00,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 19:56:01,533][71000] Updated weights for policy 0, policy_version 67684 (0.0027) [2024-06-12 19:56:04,615][71000] Updated weights for policy 0, policy_version 67694 (0.0031) [2024-06-12 19:56:05,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 1109147648. Throughput: 0: 48685.4. Samples: 637921100. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-12 19:56:05,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 19:56:08,242][71000] Updated weights for policy 0, policy_version 67704 (0.0031) [2024-06-12 19:56:10,940][70768] Fps is (10 sec: 47514.6, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 1109377024. Throughput: 0: 48724.0. Samples: 638209200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 19:56:10,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:56:11,574][71000] Updated weights for policy 0, policy_version 67714 (0.0033) [2024-06-12 19:56:15,066][71000] Updated weights for policy 0, policy_version 67724 (0.0043) [2024-06-12 19:56:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.9, 300 sec: 48763.3). Total num frames: 1109639168. Throughput: 0: 48571.2. Samples: 638497200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 19:56:15,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:56:18,386][71000] Updated weights for policy 0, policy_version 67734 (0.0029) [2024-06-12 19:56:20,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 1109884928. Throughput: 0: 48446.9. Samples: 638640540. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 19:56:20,941][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:56:21,924][71000] Updated weights for policy 0, policy_version 67744 (0.0037) [2024-06-12 19:56:25,274][71000] Updated weights for policy 0, policy_version 67754 (0.0028) [2024-06-12 19:56:25,943][70768] Fps is (10 sec: 45861.1, 60 sec: 48603.3, 300 sec: 48762.7). Total num frames: 1110097920. Throughput: 0: 48141.5. Samples: 638928540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 19:56:25,943][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:56:28,564][71000] Updated weights for policy 0, policy_version 67764 (0.0025) [2024-06-12 19:56:30,940][70768] Fps is (10 sec: 45876.2, 60 sec: 48332.9, 300 sec: 48763.2). Total num frames: 1110343680. Throughput: 0: 48788.9. Samples: 639240460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 19:56:30,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:56:31,711][71000] Updated weights for policy 0, policy_version 67774 (0.0029) [2024-06-12 19:56:34,694][70980] Signal inference workers to stop experience collection... (9400 times) [2024-06-12 19:56:34,694][70980] Signal inference workers to resume experience collection... (9400 times) [2024-06-12 19:56:34,738][71000] InferenceWorker_p0-w0: stopping experience collection (9400 times) [2024-06-12 19:56:34,738][71000] InferenceWorker_p0-w0: resuming experience collection (9400 times) [2024-06-12 19:56:34,822][71000] Updated weights for policy 0, policy_version 67784 (0.0024) [2024-06-12 19:56:35,940][70768] Fps is (10 sec: 52444.7, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 1110622208. Throughput: 0: 48443.9. Samples: 639375140. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 19:56:35,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:56:38,483][71000] Updated weights for policy 0, policy_version 67794 (0.0033) [2024-06-12 19:56:40,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 1110851584. Throughput: 0: 48839.5. Samples: 639677440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 19:56:40,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:56:41,460][71000] Updated weights for policy 0, policy_version 67804 (0.0037) [2024-06-12 19:56:45,451][71000] Updated weights for policy 0, policy_version 67814 (0.0036) [2024-06-12 19:56:45,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48059.7, 300 sec: 48763.2). Total num frames: 1111080960. Throughput: 0: 49050.9. Samples: 639970360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-12 19:56:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:56:48,603][71000] Updated weights for policy 0, policy_version 67824 (0.0024) [2024-06-12 19:56:50,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48332.8, 300 sec: 48707.7). Total num frames: 1111310336. Throughput: 0: 48563.1. Samples: 640106440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 19:56:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:56:52,058][71000] Updated weights for policy 0, policy_version 67834 (0.0040) [2024-06-12 19:56:55,158][71000] Updated weights for policy 0, policy_version 67844 (0.0026) [2024-06-12 19:56:55,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1111588864. Throughput: 0: 48686.6. Samples: 640400100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 19:56:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:56:58,818][71000] Updated weights for policy 0, policy_version 67854 (0.0030) [2024-06-12 19:57:00,940][70768] Fps is (10 sec: 54066.3, 60 sec: 49152.1, 300 sec: 48929.8). 
Total num frames: 1111851008. Throughput: 0: 48938.1. Samples: 640699420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 19:57:00,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:57:01,570][71000] Updated weights for policy 0, policy_version 67864 (0.0028) [2024-06-12 19:57:05,697][71000] Updated weights for policy 0, policy_version 67874 (0.0032) [2024-06-12 19:57:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.7, 300 sec: 48818.7). Total num frames: 1112064000. Throughput: 0: 49006.8. Samples: 640845840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 19:57:05,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:57:08,505][71000] Updated weights for policy 0, policy_version 67884 (0.0030) [2024-06-12 19:57:10,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 1112309760. Throughput: 0: 49029.5. Samples: 641134720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 19:57:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 19:57:12,333][71000] Updated weights for policy 0, policy_version 67894 (0.0029) [2024-06-12 19:57:15,408][71000] Updated weights for policy 0, policy_version 67904 (0.0039) [2024-06-12 19:57:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.8, 300 sec: 48818.7). Total num frames: 1112555520. Throughput: 0: 48430.5. Samples: 641419840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 19:57:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:57:19,250][71000] Updated weights for policy 0, policy_version 67914 (0.0032) [2024-06-12 19:57:20,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 1112801280. Throughput: 0: 48907.2. Samples: 641575960. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 19:57:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 19:57:21,729][71000] Updated weights for policy 0, policy_version 67924 (0.0027) [2024-06-12 19:57:25,753][71000] Updated weights for policy 0, policy_version 67934 (0.0031) [2024-06-12 19:57:25,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48881.5, 300 sec: 48818.8). Total num frames: 1113030656. Throughput: 0: 48600.5. Samples: 641864460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-12 19:57:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:57:28,723][71000] Updated weights for policy 0, policy_version 67944 (0.0034) [2024-06-12 19:57:30,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 1113292800. Throughput: 0: 48620.7. Samples: 642158300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:57:30,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:57:32,406][71000] Updated weights for policy 0, policy_version 67954 (0.0029) [2024-06-12 19:57:35,331][71000] Updated weights for policy 0, policy_version 67964 (0.0031) [2024-06-12 19:57:35,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 1113538560. Throughput: 0: 48809.7. Samples: 642302880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:57:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:57:39,350][71000] Updated weights for policy 0, policy_version 67974 (0.0026) [2024-06-12 19:57:40,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1113784320. Throughput: 0: 48999.1. Samples: 642605060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:57:40,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:57:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000067980_1113784320.pth... 
[2024-06-12 19:57:41,010][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000067265_1102069760.pth [2024-06-12 19:57:41,888][71000] Updated weights for policy 0, policy_version 67984 (0.0028) [2024-06-12 19:57:45,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1113997312. Throughput: 0: 48942.3. Samples: 642901820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:57:45,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:57:46,007][71000] Updated weights for policy 0, policy_version 67994 (0.0032) [2024-06-12 19:57:48,672][71000] Updated weights for policy 0, policy_version 68004 (0.0040) [2024-06-12 19:57:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49424.9, 300 sec: 48763.2). Total num frames: 1114275840. Throughput: 0: 48823.1. Samples: 643042880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:57:50,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 19:57:52,609][71000] Updated weights for policy 0, policy_version 68014 (0.0034) [2024-06-12 19:57:52,612][70980] Signal inference workers to stop experience collection... (9450 times) [2024-06-12 19:57:52,612][70980] Signal inference workers to resume experience collection... (9450 times) [2024-06-12 19:57:52,654][71000] InferenceWorker_p0-w0: stopping experience collection (9450 times) [2024-06-12 19:57:52,654][71000] InferenceWorker_p0-w0: resuming experience collection (9450 times) [2024-06-12 19:57:55,486][71000] Updated weights for policy 0, policy_version 68024 (0.0027) [2024-06-12 19:57:55,940][70768] Fps is (10 sec: 52429.2, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1114521600. Throughput: 0: 48893.5. Samples: 643334920. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:57:55,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 19:57:59,219][71000] Updated weights for policy 0, policy_version 68034 (0.0031) [2024-06-12 19:58:00,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 1114750976. Throughput: 0: 49047.2. Samples: 643626960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:58:00,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 19:58:02,188][71000] Updated weights for policy 0, policy_version 68044 (0.0031) [2024-06-12 19:58:05,889][71000] Updated weights for policy 0, policy_version 68054 (0.0028) [2024-06-12 19:58:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1114996736. Throughput: 0: 48682.2. Samples: 643766660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:58:05,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:58:08,732][71000] Updated weights for policy 0, policy_version 68064 (0.0033) [2024-06-12 19:58:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 1115242496. Throughput: 0: 48984.9. Samples: 644068780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-12 19:58:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 19:58:12,813][71000] Updated weights for policy 0, policy_version 68074 (0.0029) [2024-06-12 19:58:15,576][71000] Updated weights for policy 0, policy_version 68084 (0.0027) [2024-06-12 19:58:15,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.2, 300 sec: 48929.9). Total num frames: 1115504640. Throughput: 0: 48914.0. Samples: 644359420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 19:58:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:58:19,481][71000] Updated weights for policy 0, policy_version 68094 (0.0024) [2024-06-12 19:58:20,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 48818.8). 
Total num frames: 1115717632. Throughput: 0: 48872.9. Samples: 644502160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 19:58:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 19:58:22,530][71000] Updated weights for policy 0, policy_version 68104 (0.0025) [2024-06-12 19:58:25,940][70768] Fps is (10 sec: 44236.4, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 1115947008. Throughput: 0: 48420.0. Samples: 644783960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 19:58:25,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:58:26,469][71000] Updated weights for policy 0, policy_version 68114 (0.0034) [2024-06-12 19:58:29,483][71000] Updated weights for policy 0, policy_version 68124 (0.0026) [2024-06-12 19:58:30,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 1116209152. Throughput: 0: 48176.5. Samples: 645069760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 19:58:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 19:58:33,076][71000] Updated weights for policy 0, policy_version 68134 (0.0028) [2024-06-12 19:58:35,882][71000] Updated weights for policy 0, policy_version 68144 (0.0026) [2024-06-12 19:58:35,940][70768] Fps is (10 sec: 52429.1, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1116471296. Throughput: 0: 48433.9. Samples: 645222400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 19:58:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:58:39,748][71000] Updated weights for policy 0, policy_version 68154 (0.0038) [2024-06-12 19:58:40,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 1116684288. Throughput: 0: 48493.8. Samples: 645517140. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 19:58:40,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:58:43,202][71000] Updated weights for policy 0, policy_version 68164 (0.0040) [2024-06-12 19:58:45,940][70768] Fps is (10 sec: 45874.4, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 1116930048. Throughput: 0: 48394.5. Samples: 645804720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 19:58:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 19:58:46,299][71000] Updated weights for policy 0, policy_version 68174 (0.0033) [2024-06-12 19:58:49,758][71000] Updated weights for policy 0, policy_version 68184 (0.0030) [2024-06-12 19:58:50,940][70768] Fps is (10 sec: 50789.0, 60 sec: 48605.8, 300 sec: 48818.7). Total num frames: 1117192192. Throughput: 0: 48620.2. Samples: 645954580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 19:58:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:58:53,123][71000] Updated weights for policy 0, policy_version 68194 (0.0033) [2024-06-12 19:58:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48332.6, 300 sec: 48763.2). Total num frames: 1117421568. Throughput: 0: 48388.7. Samples: 646246280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:58:55,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 19:58:56,546][71000] Updated weights for policy 0, policy_version 68204 (0.0030) [2024-06-12 19:58:59,794][71000] Updated weights for policy 0, policy_version 68214 (0.0027) [2024-06-12 19:59:00,940][70768] Fps is (10 sec: 44237.8, 60 sec: 48059.7, 300 sec: 48707.7). Total num frames: 1117634560. Throughput: 0: 48167.0. Samples: 646526940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:59:00,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 19:59:03,087][71000] Updated weights for policy 0, policy_version 68224 (0.0021) [2024-06-12 19:59:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.8, 300 sec: 48707.7). 
Total num frames: 1117913088. Throughput: 0: 48339.4. Samples: 646677440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:59:05,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 19:59:06,524][71000] Updated weights for policy 0, policy_version 68234 (0.0024) [2024-06-12 19:59:10,064][71000] Updated weights for policy 0, policy_version 68244 (0.0029) [2024-06-12 19:59:10,940][70768] Fps is (10 sec: 54067.3, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1118175232. Throughput: 0: 48725.8. Samples: 646976620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:59:10,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 19:59:13,055][71000] Updated weights for policy 0, policy_version 68254 (0.0026) [2024-06-12 19:59:15,939][70768] Fps is (10 sec: 47514.7, 60 sec: 48059.8, 300 sec: 48763.2). Total num frames: 1118388224. Throughput: 0: 49109.5. Samples: 647279680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:59:15,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 19:59:16,530][71000] Updated weights for policy 0, policy_version 68264 (0.0029) [2024-06-12 19:59:19,901][71000] Updated weights for policy 0, policy_version 68274 (0.0051) [2024-06-12 19:59:20,944][70768] Fps is (10 sec: 45855.3, 60 sec: 48602.4, 300 sec: 48762.5). Total num frames: 1118633984. Throughput: 0: 48705.1. Samples: 647414340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:59:20,944][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 19:59:21,600][70980] Signal inference workers to stop experience collection... (9500 times) [2024-06-12 19:59:21,600][70980] Signal inference workers to resume experience collection... 
(9500 times) [2024-06-12 19:59:21,637][71000] InferenceWorker_p0-w0: stopping experience collection (9500 times) [2024-06-12 19:59:21,637][71000] InferenceWorker_p0-w0: resuming experience collection (9500 times) [2024-06-12 19:59:23,240][71000] Updated weights for policy 0, policy_version 68284 (0.0032) [2024-06-12 19:59:25,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49151.9, 300 sec: 48707.7). Total num frames: 1118896128. Throughput: 0: 48747.4. Samples: 647710780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:59:25,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 19:59:26,366][71000] Updated weights for policy 0, policy_version 68294 (0.0034) [2024-06-12 19:59:29,807][71000] Updated weights for policy 0, policy_version 68304 (0.0029) [2024-06-12 19:59:30,940][70768] Fps is (10 sec: 50812.1, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1119141888. Throughput: 0: 48987.2. Samples: 648009140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:59:30,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 19:59:33,175][71000] Updated weights for policy 0, policy_version 68314 (0.0029) [2024-06-12 19:59:35,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48059.6, 300 sec: 48652.1). Total num frames: 1119354880. Throughput: 0: 48836.5. Samples: 648152220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 19:59:35,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 19:59:36,747][71000] Updated weights for policy 0, policy_version 68324 (0.0026) [2024-06-12 19:59:39,811][71000] Updated weights for policy 0, policy_version 68334 (0.0033) [2024-06-12 19:59:40,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 1119600640. Throughput: 0: 48780.6. Samples: 648441400. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 19:59:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 19:59:40,956][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000068335_1119600640.pth... [2024-06-12 19:59:41,017][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000067623_1107935232.pth [2024-06-12 19:59:43,580][71000] Updated weights for policy 0, policy_version 68344 (0.0025) [2024-06-12 19:59:45,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 1119879168. Throughput: 0: 49008.7. Samples: 648732340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 19:59:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 19:59:46,531][71000] Updated weights for policy 0, policy_version 68354 (0.0040) [2024-06-12 19:59:50,123][71000] Updated weights for policy 0, policy_version 68364 (0.0021) [2024-06-12 19:59:50,939][70768] Fps is (10 sec: 52429.2, 60 sec: 48879.2, 300 sec: 48818.8). Total num frames: 1120124928. Throughput: 0: 49250.4. Samples: 648893700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 19:59:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 19:59:53,298][71000] Updated weights for policy 0, policy_version 68374 (0.0025) [2024-06-12 19:59:55,939][70768] Fps is (10 sec: 45876.0, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 1120337920. Throughput: 0: 49095.1. Samples: 649185900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 19:59:55,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 19:59:56,745][71000] Updated weights for policy 0, policy_version 68384 (0.0029) [2024-06-12 19:59:59,777][71000] Updated weights for policy 0, policy_version 68394 (0.0027) [2024-06-12 20:00:00,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 1120583680. Throughput: 0: 48664.3. Samples: 649469580. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 20:00:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 20:00:03,552][71000] Updated weights for policy 0, policy_version 68404 (0.0025) [2024-06-12 20:00:05,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 1120862208. Throughput: 0: 49213.5. Samples: 649628740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 20:00:05,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:00:06,394][71000] Updated weights for policy 0, policy_version 68414 (0.0024) [2024-06-12 20:00:10,206][71000] Updated weights for policy 0, policy_version 68424 (0.0029) [2024-06-12 20:00:10,940][70768] Fps is (10 sec: 52429.2, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1121107968. Throughput: 0: 49104.1. Samples: 649920460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 20:00:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:00:12,957][71000] Updated weights for policy 0, policy_version 68434 (0.0029) [2024-06-12 20:00:15,939][70768] Fps is (10 sec: 44237.3, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 1121304576. Throughput: 0: 48963.6. Samples: 650212500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 20:00:15,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:00:16,964][71000] Updated weights for policy 0, policy_version 68444 (0.0027) [2024-06-12 20:00:17,460][70980] Signal inference workers to stop experience collection... (9550 times) [2024-06-12 20:00:17,460][70980] Signal inference workers to resume experience collection... 
(9550 times) [2024-06-12 20:00:17,486][71000] InferenceWorker_p0-w0: stopping experience collection (9550 times) [2024-06-12 20:00:17,486][71000] InferenceWorker_p0-w0: resuming experience collection (9550 times) [2024-06-12 20:00:19,848][71000] Updated weights for policy 0, policy_version 68454 (0.0028) [2024-06-12 20:00:20,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48882.4, 300 sec: 48763.2). Total num frames: 1121566720. Throughput: 0: 48745.0. Samples: 650345740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 20:00:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:00:23,478][71000] Updated weights for policy 0, policy_version 68464 (0.0031) [2024-06-12 20:00:25,940][70768] Fps is (10 sec: 54066.0, 60 sec: 49151.9, 300 sec: 48818.8). Total num frames: 1121845248. Throughput: 0: 48928.3. Samples: 650643180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 20:00:25,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:00:26,519][71000] Updated weights for policy 0, policy_version 68474 (0.0036) [2024-06-12 20:00:30,187][71000] Updated weights for policy 0, policy_version 68484 (0.0029) [2024-06-12 20:00:30,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49151.9, 300 sec: 48818.7). Total num frames: 1122091008. Throughput: 0: 49216.9. Samples: 650947100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 20:00:30,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:00:32,987][71000] Updated weights for policy 0, policy_version 68494 (0.0030) [2024-06-12 20:00:35,939][70768] Fps is (10 sec: 45876.3, 60 sec: 49152.2, 300 sec: 48707.7). Total num frames: 1122304000. Throughput: 0: 48823.6. Samples: 651090760. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 20:00:35,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 20:00:36,788][71000] Updated weights for policy 0, policy_version 68504 (0.0033) [2024-06-12 20:00:39,901][71000] Updated weights for policy 0, policy_version 68514 (0.0036) [2024-06-12 20:00:40,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 48652.1). Total num frames: 1122549760. Throughput: 0: 48781.6. Samples: 651381080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 20:00:40,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:00:43,497][71000] Updated weights for policy 0, policy_version 68524 (0.0034) [2024-06-12 20:00:45,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1122828288. Throughput: 0: 49132.1. Samples: 651680520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 20:00:45,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:00:46,291][71000] Updated weights for policy 0, policy_version 68534 (0.0027) [2024-06-12 20:00:49,978][71000] Updated weights for policy 0, policy_version 68544 (0.0025) [2024-06-12 20:00:50,939][70768] Fps is (10 sec: 52429.7, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1123074048. Throughput: 0: 49178.8. Samples: 651841780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 20:00:50,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:00:53,007][71000] Updated weights for policy 0, policy_version 68554 (0.0035) [2024-06-12 20:00:55,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 48763.3). Total num frames: 1123287040. Throughput: 0: 49171.1. Samples: 652133160. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 20:00:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:00:56,589][71000] Updated weights for policy 0, policy_version 68564 (0.0036) [2024-06-12 20:00:59,855][71000] Updated weights for policy 0, policy_version 68574 (0.0037) [2024-06-12 20:01:00,940][70768] Fps is (10 sec: 45874.1, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 1123532800. Throughput: 0: 48960.2. Samples: 652415720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-12 20:01:00,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:01:03,513][71000] Updated weights for policy 0, policy_version 68584 (0.0025) [2024-06-12 20:01:05,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 1123811328. Throughput: 0: 49252.4. Samples: 652562100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:01:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:01:06,501][71000] Updated weights for policy 0, policy_version 68594 (0.0030) [2024-06-12 20:01:10,311][71000] Updated weights for policy 0, policy_version 68604 (0.0027) [2024-06-12 20:01:10,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1124024320. Throughput: 0: 49291.3. Samples: 652861280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:01:10,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 20:01:13,160][71000] Updated weights for policy 0, policy_version 68614 (0.0030) [2024-06-12 20:01:15,939][70768] Fps is (10 sec: 44237.5, 60 sec: 49152.0, 300 sec: 48707.7). Total num frames: 1124253696. Throughput: 0: 49089.1. Samples: 653156100. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:01:15,940][70768] Avg episode reward: [(0, '0.249')] [2024-06-12 20:01:17,074][71000] Updated weights for policy 0, policy_version 68624 (0.0032) [2024-06-12 20:01:19,718][71000] Updated weights for policy 0, policy_version 68634 (0.0031) [2024-06-12 20:01:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 48874.8). Total num frames: 1124515840. Throughput: 0: 48882.0. Samples: 653290460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:01:20,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:01:21,925][70980] Signal inference workers to stop experience collection... (9600 times) [2024-06-12 20:01:21,925][70980] Signal inference workers to resume experience collection... (9600 times) [2024-06-12 20:01:21,950][71000] InferenceWorker_p0-w0: stopping experience collection (9600 times) [2024-06-12 20:01:21,950][71000] InferenceWorker_p0-w0: resuming experience collection (9600 times) [2024-06-12 20:01:23,584][71000] Updated weights for policy 0, policy_version 68644 (0.0029) [2024-06-12 20:01:25,939][70768] Fps is (10 sec: 52429.3, 60 sec: 48879.2, 300 sec: 48929.9). Total num frames: 1124777984. Throughput: 0: 48938.5. Samples: 653583300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:01:25,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:01:26,689][71000] Updated weights for policy 0, policy_version 68654 (0.0030) [2024-06-12 20:01:30,485][71000] Updated weights for policy 0, policy_version 68664 (0.0028) [2024-06-12 20:01:30,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48332.8, 300 sec: 48707.7). Total num frames: 1124990976. Throughput: 0: 48841.2. Samples: 653878380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:01:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:01:33,146][71000] Updated weights for policy 0, policy_version 68674 (0.0029) [2024-06-12 20:01:35,940][70768] Fps is (10 sec: 45873.9, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 1125236736. Throughput: 0: 48318.0. Samples: 654016100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:01:35,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:01:37,242][71000] Updated weights for policy 0, policy_version 68684 (0.0031) [2024-06-12 20:01:39,976][71000] Updated weights for policy 0, policy_version 68694 (0.0025) [2024-06-12 20:01:40,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1125498880. Throughput: 0: 48595.6. Samples: 654319960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:01:40,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:01:41,063][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000068696_1125515264.pth... [2024-06-12 20:01:41,100][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000067980_1113784320.pth [2024-06-12 20:01:43,755][71000] Updated weights for policy 0, policy_version 68704 (0.0040) [2024-06-12 20:01:45,940][70768] Fps is (10 sec: 54067.9, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1125777408. Throughput: 0: 48873.0. Samples: 654615000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 20:01:45,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:01:46,553][71000] Updated weights for policy 0, policy_version 68714 (0.0038) [2024-06-12 20:01:50,784][71000] Updated weights for policy 0, policy_version 68724 (0.0037) [2024-06-12 20:01:50,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48332.7, 300 sec: 48763.2). Total num frames: 1125974016. Throughput: 0: 48883.6. Samples: 654761860. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 20:01:50,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:01:53,813][71000] Updated weights for policy 0, policy_version 68734 (0.0027) [2024-06-12 20:01:55,939][70768] Fps is (10 sec: 42598.7, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 1126203392. Throughput: 0: 48736.6. Samples: 655054420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 20:01:55,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 20:01:57,413][71000] Updated weights for policy 0, policy_version 68744 (0.0029) [2024-06-12 20:02:00,426][71000] Updated weights for policy 0, policy_version 68754 (0.0028) [2024-06-12 20:02:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1126465536. Throughput: 0: 48463.8. Samples: 655336980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 20:02:00,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 20:02:04,271][71000] Updated weights for policy 0, policy_version 68764 (0.0035) [2024-06-12 20:02:05,940][70768] Fps is (10 sec: 54066.1, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 1126744064. Throughput: 0: 48950.2. Samples: 655493220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 20:02:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:02:07,073][71000] Updated weights for policy 0, policy_version 68774 (0.0036) [2024-06-12 20:02:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1126940672. Throughput: 0: 49051.2. Samples: 655790620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 20:02:10,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 20:02:11,002][71000] Updated weights for policy 0, policy_version 68784 (0.0033) [2024-06-12 20:02:13,718][71000] Updated weights for policy 0, policy_version 68794 (0.0030) [2024-06-12 20:02:15,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49152.0, 300 sec: 48818.8). 
Total num frames: 1127202816. Throughput: 0: 48907.2. Samples: 656079200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 20:02:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:02:17,572][71000] Updated weights for policy 0, policy_version 68804 (0.0025) [2024-06-12 20:02:20,797][71000] Updated weights for policy 0, policy_version 68814 (0.0039) [2024-06-12 20:02:20,940][70768] Fps is (10 sec: 50789.4, 60 sec: 48878.7, 300 sec: 48874.2). Total num frames: 1127448576. Throughput: 0: 48958.4. Samples: 656219240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 20:02:20,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 20:02:22,278][70980] Signal inference workers to stop experience collection... (9650 times) [2024-06-12 20:02:22,311][71000] InferenceWorker_p0-w0: stopping experience collection (9650 times) [2024-06-12 20:02:22,386][70980] Signal inference workers to resume experience collection... (9650 times) [2024-06-12 20:02:22,387][71000] InferenceWorker_p0-w0: resuming experience collection (9650 times) [2024-06-12 20:02:24,311][71000] Updated weights for policy 0, policy_version 68824 (0.0030) [2024-06-12 20:02:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.7, 300 sec: 48818.8). Total num frames: 1127694336. Throughput: 0: 48691.8. Samples: 656511100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 20:02:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 20:02:27,304][71000] Updated weights for policy 0, policy_version 68834 (0.0033) [2024-06-12 20:02:30,887][71000] Updated weights for policy 0, policy_version 68844 (0.0027) [2024-06-12 20:02:30,940][70768] Fps is (10 sec: 49153.1, 60 sec: 49152.0, 300 sec: 48818.7). Total num frames: 1127940096. Throughput: 0: 48820.8. Samples: 656811940. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 20:02:30,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 20:02:34,235][71000] Updated weights for policy 0, policy_version 68854 (0.0026) [2024-06-12 20:02:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 1128185856. Throughput: 0: 48817.8. Samples: 656958660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 20:02:35,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:02:37,779][71000] Updated weights for policy 0, policy_version 68864 (0.0028) [2024-06-12 20:02:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.7, 300 sec: 48874.3). Total num frames: 1128415232. Throughput: 0: 48581.6. Samples: 657240600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 20:02:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 20:02:41,092][71000] Updated weights for policy 0, policy_version 68874 (0.0028) [2024-06-12 20:02:44,412][71000] Updated weights for policy 0, policy_version 68884 (0.0037) [2024-06-12 20:02:45,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 1128677376. Throughput: 0: 48878.4. Samples: 657536500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 20:02:45,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:02:47,635][71000] Updated weights for policy 0, policy_version 68894 (0.0026) [2024-06-12 20:02:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1128906752. Throughput: 0: 48680.5. Samples: 657683840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 20:02:50,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:02:51,002][71000] Updated weights for policy 0, policy_version 68904 (0.0027) [2024-06-12 20:02:54,237][71000] Updated weights for policy 0, policy_version 68914 (0.0028) [2024-06-12 20:02:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 48874.3). 
Total num frames: 1129168896. Throughput: 0: 48606.4. Samples: 657977900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 20:02:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:02:57,511][71000] Updated weights for policy 0, policy_version 68924 (0.0027) [2024-06-12 20:03:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1129381888. Throughput: 0: 48807.8. Samples: 658275560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 20:03:00,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:03:01,125][71000] Updated weights for policy 0, policy_version 68934 (0.0038) [2024-06-12 20:03:04,938][71000] Updated weights for policy 0, policy_version 68944 (0.0021) [2024-06-12 20:03:05,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 1129644032. Throughput: 0: 48783.1. Samples: 658414460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 20:03:05,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:03:08,079][71000] Updated weights for policy 0, policy_version 68954 (0.0028) [2024-06-12 20:03:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 1129873408. Throughput: 0: 48500.9. Samples: 658693640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 20:03:10,949][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:03:11,561][71000] Updated weights for policy 0, policy_version 68964 (0.0033) [2024-06-12 20:03:14,985][71000] Updated weights for policy 0, policy_version 68974 (0.0026) [2024-06-12 20:03:15,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 1130151936. Throughput: 0: 48210.4. Samples: 658981400. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 20:03:15,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:03:18,264][71000] Updated weights for policy 0, policy_version 68984 (0.0024) [2024-06-12 20:03:20,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48060.0, 300 sec: 48763.2). Total num frames: 1130332160. Throughput: 0: 48265.9. Samples: 659130620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 20:03:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:03:21,692][71000] Updated weights for policy 0, policy_version 68994 (0.0027) [2024-06-12 20:03:25,170][71000] Updated weights for policy 0, policy_version 69004 (0.0026) [2024-06-12 20:03:25,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48332.9, 300 sec: 48763.2). Total num frames: 1130594304. Throughput: 0: 48514.8. Samples: 659423760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 20:03:25,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:03:28,507][71000] Updated weights for policy 0, policy_version 69014 (0.0029) [2024-06-12 20:03:29,009][70980] Signal inference workers to stop experience collection... (9700 times) [2024-06-12 20:03:29,048][71000] InferenceWorker_p0-w0: stopping experience collection (9700 times) [2024-06-12 20:03:29,070][70980] Signal inference workers to resume experience collection... (9700 times) [2024-06-12 20:03:29,073][71000] InferenceWorker_p0-w0: resuming experience collection (9700 times) [2024-06-12 20:03:30,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48332.8, 300 sec: 48707.7). Total num frames: 1130840064. Throughput: 0: 48301.6. Samples: 659710080. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 20:03:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:03:32,110][71000] Updated weights for policy 0, policy_version 69024 (0.0027) [2024-06-12 20:03:35,411][71000] Updated weights for policy 0, policy_version 69034 (0.0034) [2024-06-12 20:03:35,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1131102208. Throughput: 0: 48328.4. Samples: 659858620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 20:03:35,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:03:38,568][71000] Updated weights for policy 0, policy_version 69044 (0.0023) [2024-06-12 20:03:40,939][70768] Fps is (10 sec: 45876.1, 60 sec: 48059.9, 300 sec: 48707.7). Total num frames: 1131298816. Throughput: 0: 48240.5. Samples: 660148720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 20:03:40,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:03:40,973][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000069050_1131315200.pth... [2024-06-12 20:03:41,021][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000068335_1119600640.pth [2024-06-12 20:03:41,889][71000] Updated weights for policy 0, policy_version 69054 (0.0028) [2024-06-12 20:03:45,289][71000] Updated weights for policy 0, policy_version 69064 (0.0036) [2024-06-12 20:03:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.7, 300 sec: 48818.8). Total num frames: 1131593728. Throughput: 0: 48274.7. Samples: 660447920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-12 20:03:45,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:03:48,671][71000] Updated weights for policy 0, policy_version 69074 (0.0030) [2024-06-12 20:03:50,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 1131823104. Throughput: 0: 48516.5. Samples: 660597700. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 20:03:50,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:03:52,110][71000] Updated weights for policy 0, policy_version 69084 (0.0029) [2024-06-12 20:03:55,426][71000] Updated weights for policy 0, policy_version 69094 (0.0036) [2024-06-12 20:03:55,939][70768] Fps is (10 sec: 47514.7, 60 sec: 48332.9, 300 sec: 48929.9). Total num frames: 1132068864. Throughput: 0: 48671.8. Samples: 660883860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 20:03:55,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 20:03:58,935][71000] Updated weights for policy 0, policy_version 69104 (0.0035) [2024-06-12 20:04:00,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 1132281856. Throughput: 0: 48576.8. Samples: 661167360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 20:04:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:04:02,111][71000] Updated weights for policy 0, policy_version 69114 (0.0023) [2024-06-12 20:04:05,331][71000] Updated weights for policy 0, policy_version 69124 (0.0021) [2024-06-12 20:04:05,939][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 1132544000. Throughput: 0: 48503.2. Samples: 661313260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 20:04:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:04:08,787][71000] Updated weights for policy 0, policy_version 69134 (0.0034) [2024-06-12 20:04:10,939][70768] Fps is (10 sec: 52429.4, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 1132806144. Throughput: 0: 48717.9. Samples: 661616060. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 20:04:10,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:04:11,868][71000] Updated weights for policy 0, policy_version 69144 (0.0030) [2024-06-12 20:04:15,371][71000] Updated weights for policy 0, policy_version 69154 (0.0029) [2024-06-12 20:04:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48059.7, 300 sec: 48819.5). Total num frames: 1133035520. Throughput: 0: 48897.0. Samples: 661910440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 20:04:15,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:04:18,722][71000] Updated weights for policy 0, policy_version 69164 (0.0035) [2024-06-12 20:04:20,939][70768] Fps is (10 sec: 44236.6, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 1133248512. Throughput: 0: 48796.6. Samples: 662054460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 20:04:20,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:04:22,149][71000] Updated weights for policy 0, policy_version 69174 (0.0026) [2024-06-12 20:04:25,707][71000] Updated weights for policy 0, policy_version 69184 (0.0027) [2024-06-12 20:04:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1133510656. Throughput: 0: 48645.7. Samples: 662337780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 20:04:25,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:04:28,826][71000] Updated weights for policy 0, policy_version 69194 (0.0024) [2024-06-12 20:04:30,940][70768] Fps is (10 sec: 54066.8, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 1133789184. Throughput: 0: 48679.7. Samples: 662638500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 20:04:30,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:04:32,082][71000] Updated weights for policy 0, policy_version 69204 (0.0029) [2024-06-12 20:04:34,890][70980] Signal inference workers to stop experience collection... 
(9750 times) [2024-06-12 20:04:34,893][70980] Signal inference workers to resume experience collection... (9750 times) [2024-06-12 20:04:34,933][71000] InferenceWorker_p0-w0: stopping experience collection (9750 times) [2024-06-12 20:04:34,933][71000] InferenceWorker_p0-w0: resuming experience collection (9750 times) [2024-06-12 20:04:35,741][71000] Updated weights for policy 0, policy_version 69214 (0.0042) [2024-06-12 20:04:35,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 1134018560. Throughput: 0: 48669.8. Samples: 662787840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:04:35,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 20:04:38,687][71000] Updated weights for policy 0, policy_version 69224 (0.0034) [2024-06-12 20:04:40,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49151.9, 300 sec: 48707.7). Total num frames: 1134247936. Throughput: 0: 48710.1. Samples: 663075820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:04:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:04:42,417][71000] Updated weights for policy 0, policy_version 69234 (0.0032) [2024-06-12 20:04:45,531][71000] Updated weights for policy 0, policy_version 69244 (0.0036) [2024-06-12 20:04:45,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 1134493696. Throughput: 0: 48815.6. Samples: 663364060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:04:45,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:04:48,927][71000] Updated weights for policy 0, policy_version 69254 (0.0025) [2024-06-12 20:04:50,939][70768] Fps is (10 sec: 49152.7, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1134739456. Throughput: 0: 48884.0. Samples: 663513040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:04:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:04:52,549][71000] Updated weights for policy 0, policy_version 69264 (0.0031) [2024-06-12 20:04:55,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 1134985216. Throughput: 0: 48687.5. Samples: 663807000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:04:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:04:55,945][71000] Updated weights for policy 0, policy_version 69274 (0.0025) [2024-06-12 20:04:59,230][71000] Updated weights for policy 0, policy_version 69284 (0.0029) [2024-06-12 20:05:00,939][70768] Fps is (10 sec: 47513.3, 60 sec: 48879.0, 300 sec: 48652.2). Total num frames: 1135214592. Throughput: 0: 48748.1. Samples: 664104100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:05:00,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 20:05:02,466][71000] Updated weights for policy 0, policy_version 69294 (0.0038) [2024-06-12 20:05:05,748][71000] Updated weights for policy 0, policy_version 69304 (0.0029) [2024-06-12 20:05:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 1135476736. Throughput: 0: 48692.0. Samples: 664245600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:05:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:05:09,087][71000] Updated weights for policy 0, policy_version 69314 (0.0029) [2024-06-12 20:05:10,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1135722496. Throughput: 0: 48703.1. Samples: 664529420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:05:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:05:12,602][71000] Updated weights for policy 0, policy_version 69324 (0.0030) [2024-06-12 20:05:15,715][71000] Updated weights for policy 0, policy_version 69334 (0.0030) [2024-06-12 20:05:15,940][70768] Fps is (10 sec: 49150.8, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 1135968256. Throughput: 0: 48698.9. Samples: 664829960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:05:15,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:05:19,029][71000] Updated weights for policy 0, policy_version 69344 (0.0028) [2024-06-12 20:05:20,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.0, 300 sec: 48652.2). Total num frames: 1136197632. Throughput: 0: 48621.8. Samples: 664975820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:05:20,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:05:22,761][71000] Updated weights for policy 0, policy_version 69354 (0.0029) [2024-06-12 20:05:25,801][71000] Updated weights for policy 0, policy_version 69364 (0.0033) [2024-06-12 20:05:25,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 48707.7). Total num frames: 1136459776. Throughput: 0: 48715.5. Samples: 665268020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:05:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:05:29,352][71000] Updated weights for policy 0, policy_version 69374 (0.0032) [2024-06-12 20:05:30,940][70768] Fps is (10 sec: 50789.1, 60 sec: 48605.7, 300 sec: 48818.7). Total num frames: 1136705536. Throughput: 0: 48892.7. Samples: 665564240. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:05:30,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:05:32,406][71000] Updated weights for policy 0, policy_version 69384 (0.0033) [2024-06-12 20:05:35,814][71000] Updated weights for policy 0, policy_version 69394 (0.0030) [2024-06-12 20:05:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1136951296. Throughput: 0: 48927.9. Samples: 665714800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:05:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:05:39,570][71000] Updated weights for policy 0, policy_version 69404 (0.0041) [2024-06-12 20:05:40,941][70768] Fps is (10 sec: 47508.0, 60 sec: 48877.8, 300 sec: 48651.9). Total num frames: 1137180672. Throughput: 0: 48870.5. Samples: 666006240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:05:40,941][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:05:41,067][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000069409_1137197056.pth... [2024-06-12 20:05:41,123][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000068696_1125515264.pth [2024-06-12 20:05:42,583][71000] Updated weights for policy 0, policy_version 69414 (0.0026) [2024-06-12 20:05:45,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.9, 300 sec: 48652.1). Total num frames: 1137426432. Throughput: 0: 48728.0. Samples: 666296860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:05:45,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:05:46,158][71000] Updated weights for policy 0, policy_version 69424 (0.0032) [2024-06-12 20:05:48,032][70980] Signal inference workers to stop experience collection... (9800 times) [2024-06-12 20:05:48,033][70980] Signal inference workers to resume experience collection... 
(9800 times) [2024-06-12 20:05:48,050][71000] InferenceWorker_p0-w0: stopping experience collection (9800 times) [2024-06-12 20:05:48,051][71000] InferenceWorker_p0-w0: resuming experience collection (9800 times) [2024-06-12 20:05:49,503][71000] Updated weights for policy 0, policy_version 69434 (0.0039) [2024-06-12 20:05:50,943][70768] Fps is (10 sec: 49140.3, 60 sec: 48875.8, 300 sec: 48762.6). Total num frames: 1137672192. Throughput: 0: 48696.8. Samples: 666437140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:05:50,944][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:05:52,689][71000] Updated weights for policy 0, policy_version 69444 (0.0032) [2024-06-12 20:05:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1137917952. Throughput: 0: 49020.5. Samples: 666735340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:05:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:05:56,411][71000] Updated weights for policy 0, policy_version 69454 (0.0041) [2024-06-12 20:05:59,561][71000] Updated weights for policy 0, policy_version 69464 (0.0021) [2024-06-12 20:06:00,940][70768] Fps is (10 sec: 47531.3, 60 sec: 48878.9, 300 sec: 48596.6). Total num frames: 1138147328. Throughput: 0: 48888.2. Samples: 667029920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:06:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:06:02,799][71000] Updated weights for policy 0, policy_version 69474 (0.0037) [2024-06-12 20:06:05,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48605.7, 300 sec: 48707.7). Total num frames: 1138393088. Throughput: 0: 48906.9. Samples: 667176640. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:06:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:06:06,542][71000] Updated weights for policy 0, policy_version 69484 (0.0033) [2024-06-12 20:06:09,668][71000] Updated weights for policy 0, policy_version 69494 (0.0027) [2024-06-12 20:06:10,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1138655232. Throughput: 0: 48977.9. Samples: 667472020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:06:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:06:13,154][71000] Updated weights for policy 0, policy_version 69504 (0.0031) [2024-06-12 20:06:15,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 1138884608. Throughput: 0: 48838.0. Samples: 667761940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:06:15,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 20:06:16,287][71000] Updated weights for policy 0, policy_version 69514 (0.0028) [2024-06-12 20:06:19,778][71000] Updated weights for policy 0, policy_version 69524 (0.0046) [2024-06-12 20:06:20,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.8, 300 sec: 48707.6). Total num frames: 1139146752. Throughput: 0: 48746.9. Samples: 667908420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:06:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:06:22,932][71000] Updated weights for policy 0, policy_version 69534 (0.0044) [2024-06-12 20:06:25,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48606.0, 300 sec: 48763.3). Total num frames: 1139376128. Throughput: 0: 49006.0. Samples: 668211440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:06:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:06:26,337][71000] Updated weights for policy 0, policy_version 69544 (0.0030) [2024-06-12 20:06:29,546][71000] Updated weights for policy 0, policy_version 69554 (0.0028) [2024-06-12 20:06:30,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48606.0, 300 sec: 48763.2). Total num frames: 1139621888. Throughput: 0: 48850.2. Samples: 668495120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:06:30,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:06:33,174][71000] Updated weights for policy 0, policy_version 69564 (0.0030) [2024-06-12 20:06:35,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1139867648. Throughput: 0: 48996.1. Samples: 668641780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:06:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:06:36,329][71000] Updated weights for policy 0, policy_version 69574 (0.0035) [2024-06-12 20:06:39,986][71000] Updated weights for policy 0, policy_version 69584 (0.0027) [2024-06-12 20:06:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48880.0, 300 sec: 48596.6). Total num frames: 1140113408. Throughput: 0: 48815.5. Samples: 668932040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:06:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:06:42,947][71000] Updated weights for policy 0, policy_version 69594 (0.0026) [2024-06-12 20:06:45,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48332.8, 300 sec: 48652.2). Total num frames: 1140326400. Throughput: 0: 48748.5. Samples: 669223600. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 20:06:45,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 20:06:46,640][71000] Updated weights for policy 0, policy_version 69604 (0.0032) [2024-06-12 20:06:49,750][71000] Updated weights for policy 0, policy_version 69614 (0.0033) [2024-06-12 20:06:50,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48882.0, 300 sec: 48818.8). Total num frames: 1140604928. Throughput: 0: 48672.6. Samples: 669366900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 20:06:50,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:06:53,361][71000] Updated weights for policy 0, policy_version 69624 (0.0034) [2024-06-12 20:06:55,940][70768] Fps is (10 sec: 52428.0, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1140850688. Throughput: 0: 48647.9. Samples: 669661180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 20:06:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:06:56,690][71000] Updated weights for policy 0, policy_version 69634 (0.0034) [2024-06-12 20:07:00,135][71000] Updated weights for policy 0, policy_version 69644 (0.0031) [2024-06-12 20:07:00,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.9, 300 sec: 48596.6). Total num frames: 1141080064. Throughput: 0: 48751.5. Samples: 669955760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 20:07:00,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 20:07:03,181][71000] Updated weights for policy 0, policy_version 69654 (0.0040) [2024-06-12 20:07:05,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 1141309440. Throughput: 0: 48684.2. Samples: 670099200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 20:07:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:07:06,854][71000] Updated weights for policy 0, policy_version 69664 (0.0025) [2024-06-12 20:07:09,533][70980] Signal inference workers to stop experience collection... 
(9850 times) [2024-06-12 20:07:09,582][71000] InferenceWorker_p0-w0: stopping experience collection (9850 times) [2024-06-12 20:07:09,644][70980] Signal inference workers to resume experience collection... (9850 times) [2024-06-12 20:07:09,644][71000] InferenceWorker_p0-w0: resuming experience collection (9850 times) [2024-06-12 20:07:09,797][71000] Updated weights for policy 0, policy_version 69674 (0.0025) [2024-06-12 20:07:10,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 1141571584. Throughput: 0: 48451.9. Samples: 670391780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 20:07:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:07:13,385][71000] Updated weights for policy 0, policy_version 69684 (0.0041) [2024-06-12 20:07:15,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 1141800960. Throughput: 0: 48544.5. Samples: 670679620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 20:07:15,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:07:16,539][71000] Updated weights for policy 0, policy_version 69694 (0.0026) [2024-06-12 20:07:20,357][71000] Updated weights for policy 0, policy_version 69704 (0.0024) [2024-06-12 20:07:20,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48059.9, 300 sec: 48596.6). Total num frames: 1142030336. Throughput: 0: 48624.9. Samples: 670829900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 20:07:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:07:23,504][71000] Updated weights for policy 0, policy_version 69714 (0.0026) [2024-06-12 20:07:25,939][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.8, 300 sec: 48596.6). Total num frames: 1142276096. Throughput: 0: 48511.3. Samples: 671115040. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 20:07:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:07:27,421][71000] Updated weights for policy 0, policy_version 69724 (0.0037) [2024-06-12 20:07:30,071][71000] Updated weights for policy 0, policy_version 69734 (0.0035) [2024-06-12 20:07:30,940][70768] Fps is (10 sec: 50789.0, 60 sec: 48605.7, 300 sec: 48652.1). Total num frames: 1142538240. Throughput: 0: 48361.9. Samples: 671399900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 20:07:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:07:34,083][71000] Updated weights for policy 0, policy_version 69744 (0.0028) [2024-06-12 20:07:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1142784000. Throughput: 0: 48693.8. Samples: 671558120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 20:07:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:07:36,831][71000] Updated weights for policy 0, policy_version 69754 (0.0033) [2024-06-12 20:07:40,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48059.7, 300 sec: 48541.0). Total num frames: 1142996992. Throughput: 0: 48565.2. Samples: 671846620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 20:07:40,941][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:07:41,097][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000069764_1143013376.pth... [2024-06-12 20:07:41,113][71000] Updated weights for policy 0, policy_version 69764 (0.0033) [2024-06-12 20:07:41,150][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000069050_1131315200.pth [2024-06-12 20:07:43,503][71000] Updated weights for policy 0, policy_version 69774 (0.0028) [2024-06-12 20:07:45,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 48652.1). Total num frames: 1143259136. Throughput: 0: 48555.5. Samples: 672140760. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 20:07:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:07:47,650][71000] Updated weights for policy 0, policy_version 69784 (0.0030) [2024-06-12 20:07:50,545][71000] Updated weights for policy 0, policy_version 69794 (0.0031) [2024-06-12 20:07:50,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48605.7, 300 sec: 48652.1). Total num frames: 1143521280. Throughput: 0: 48543.3. Samples: 672283660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 20:07:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:07:54,346][71000] Updated weights for policy 0, policy_version 69804 (0.0028) [2024-06-12 20:07:55,939][70768] Fps is (10 sec: 49153.0, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 1143750656. Throughput: 0: 48656.6. Samples: 672581320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 20:07:55,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:07:57,008][71000] Updated weights for policy 0, policy_version 69814 (0.0026) [2024-06-12 20:08:00,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48332.9, 300 sec: 48596.6). Total num frames: 1143980032. Throughput: 0: 48683.5. Samples: 672870380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 20:08:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:08:01,033][71000] Updated weights for policy 0, policy_version 69824 (0.0031) [2024-06-12 20:08:03,725][71000] Updated weights for policy 0, policy_version 69834 (0.0026) [2024-06-12 20:08:05,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.8, 300 sec: 48652.2). Total num frames: 1144225792. Throughput: 0: 48439.9. Samples: 673009700. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 20:08:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:08:07,658][71000] Updated weights for policy 0, policy_version 69844 (0.0037) [2024-06-12 20:08:10,377][71000] Updated weights for policy 0, policy_version 69854 (0.0033) [2024-06-12 20:08:10,940][70768] Fps is (10 sec: 52428.3, 60 sec: 48878.9, 300 sec: 48652.1). Total num frames: 1144504320. Throughput: 0: 48858.5. Samples: 673313680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 20:08:10,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:08:12,553][70980] Signal inference workers to stop experience collection... (9900 times) [2024-06-12 20:08:12,598][71000] InferenceWorker_p0-w0: stopping experience collection (9900 times) [2024-06-12 20:08:12,663][70980] Signal inference workers to resume experience collection... (9900 times) [2024-06-12 20:08:12,663][71000] InferenceWorker_p0-w0: resuming experience collection (9900 times) [2024-06-12 20:08:14,368][71000] Updated weights for policy 0, policy_version 69864 (0.0030) [2024-06-12 20:08:15,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1144733696. Throughput: 0: 48931.8. Samples: 673601820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 20:08:15,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:08:16,960][71000] Updated weights for policy 0, policy_version 69874 (0.0027) [2024-06-12 20:08:20,940][70768] Fps is (10 sec: 44237.1, 60 sec: 48605.8, 300 sec: 48652.1). Total num frames: 1144946688. Throughput: 0: 48673.3. Samples: 673748420. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 20:08:20,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:08:21,168][71000] Updated weights for policy 0, policy_version 69884 (0.0033) [2024-06-12 20:08:23,855][71000] Updated weights for policy 0, policy_version 69894 (0.0030) [2024-06-12 20:08:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 1145208832. Throughput: 0: 48375.8. Samples: 674023520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 20:08:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:08:27,755][71000] Updated weights for policy 0, policy_version 69904 (0.0031) [2024-06-12 20:08:30,650][71000] Updated weights for policy 0, policy_version 69914 (0.0026) [2024-06-12 20:08:30,943][70768] Fps is (10 sec: 54048.2, 60 sec: 49149.3, 300 sec: 48762.7). Total num frames: 1145487360. Throughput: 0: 48640.3. Samples: 674329740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 20:08:30,944][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:08:34,348][71000] Updated weights for policy 0, policy_version 69924 (0.0040) [2024-06-12 20:08:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 1145683968. Throughput: 0: 48868.6. Samples: 674482740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 20:08:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:08:37,317][71000] Updated weights for policy 0, policy_version 69934 (0.0034) [2024-06-12 20:08:40,940][70768] Fps is (10 sec: 44252.2, 60 sec: 48879.0, 300 sec: 48596.6). Total num frames: 1145929728. Throughput: 0: 48699.4. Samples: 674772800. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 20:08:40,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:08:41,220][71000] Updated weights for policy 0, policy_version 69944 (0.0039) [2024-06-12 20:08:43,911][71000] Updated weights for policy 0, policy_version 69954 (0.0024) [2024-06-12 20:08:45,942][70768] Fps is (10 sec: 49137.7, 60 sec: 48603.6, 300 sec: 48651.7). Total num frames: 1146175488. Throughput: 0: 48676.9. Samples: 675060980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 20:08:45,943][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:08:47,888][71000] Updated weights for policy 0, policy_version 69964 (0.0034) [2024-06-12 20:08:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 1146437632. Throughput: 0: 48831.1. Samples: 675207100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 20:08:50,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:08:50,952][71000] Updated weights for policy 0, policy_version 69974 (0.0025) [2024-06-12 20:08:54,625][71000] Updated weights for policy 0, policy_version 69984 (0.0025) [2024-06-12 20:08:55,940][70768] Fps is (10 sec: 49166.4, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1146667008. Throughput: 0: 48512.1. Samples: 675496720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 20:08:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:08:57,822][71000] Updated weights for policy 0, policy_version 69994 (0.0026) [2024-06-12 20:09:00,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.0, 300 sec: 48818.7). Total num frames: 1146945536. Throughput: 0: 48818.5. Samples: 675798660. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 20:09:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:09:00,944][71000] Updated weights for policy 0, policy_version 70004 (0.0039) [2024-06-12 20:09:04,369][71000] Updated weights for policy 0, policy_version 70014 (0.0037) [2024-06-12 20:09:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.9, 300 sec: 48652.1). Total num frames: 1147158528. Throughput: 0: 48530.2. Samples: 675932280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 20:09:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 20:09:08,228][71000] Updated weights for policy 0, policy_version 70024 (0.0026) [2024-06-12 20:09:10,918][71000] Updated weights for policy 0, policy_version 70034 (0.0031) [2024-06-12 20:09:10,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1147437056. Throughput: 0: 49137.3. Samples: 676234700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 20:09:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:09:14,838][71000] Updated weights for policy 0, policy_version 70044 (0.0036) [2024-06-12 20:09:15,731][70980] Signal inference workers to stop experience collection... (9950 times) [2024-06-12 20:09:15,779][71000] InferenceWorker_p0-w0: stopping experience collection (9950 times) [2024-06-12 20:09:15,783][70980] Signal inference workers to resume experience collection... (9950 times) [2024-06-12 20:09:15,790][71000] InferenceWorker_p0-w0: resuming experience collection (9950 times) [2024-06-12 20:09:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.8, 300 sec: 48818.7). Total num frames: 1147650048. Throughput: 0: 48623.3. Samples: 676517620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 20:09:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:09:17,759][71000] Updated weights for policy 0, policy_version 70054 (0.0035) [2024-06-12 20:09:20,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 1147879424. Throughput: 0: 48360.4. Samples: 676658960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 20:09:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:09:21,458][71000] Updated weights for policy 0, policy_version 70064 (0.0023) [2024-06-12 20:09:24,861][71000] Updated weights for policy 0, policy_version 70074 (0.0030) [2024-06-12 20:09:25,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.9, 300 sec: 48596.6). Total num frames: 1148125184. Throughput: 0: 48403.2. Samples: 676950940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 20:09:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:09:28,007][71000] Updated weights for policy 0, policy_version 70084 (0.0022) [2024-06-12 20:09:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48335.7, 300 sec: 48707.7). Total num frames: 1148387328. Throughput: 0: 48632.9. Samples: 677249320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 20:09:30,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:09:31,350][71000] Updated weights for policy 0, policy_version 70094 (0.0025) [2024-06-12 20:09:34,787][71000] Updated weights for policy 0, policy_version 70104 (0.0035) [2024-06-12 20:09:35,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 1148616704. Throughput: 0: 48694.3. Samples: 677398340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 20:09:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:09:38,106][71000] Updated weights for policy 0, policy_version 70114 (0.0025) [2024-06-12 20:09:40,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48879.0, 300 sec: 48707.7). 
Total num frames: 1148862464. Throughput: 0: 48723.1. Samples: 677689260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 20:09:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:09:40,954][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000070121_1148862464.pth... [2024-06-12 20:09:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000069409_1137197056.pth [2024-06-12 20:09:41,587][71000] Updated weights for policy 0, policy_version 70124 (0.0033) [2024-06-12 20:09:45,068][71000] Updated weights for policy 0, policy_version 70134 (0.0031) [2024-06-12 20:09:45,944][70768] Fps is (10 sec: 50768.4, 60 sec: 49150.9, 300 sec: 48762.5). Total num frames: 1149124608. Throughput: 0: 48633.7. Samples: 677987380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 20:09:45,944][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:09:48,278][71000] Updated weights for policy 0, policy_version 70144 (0.0030) [2024-06-12 20:09:50,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 1149370368. Throughput: 0: 48905.6. Samples: 678133040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 20:09:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:09:51,676][71000] Updated weights for policy 0, policy_version 70154 (0.0029) [2024-06-12 20:09:55,172][71000] Updated weights for policy 0, policy_version 70164 (0.0034) [2024-06-12 20:09:55,939][70768] Fps is (10 sec: 49173.5, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 1149616128. Throughput: 0: 48869.9. Samples: 678433840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 20:09:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:09:58,429][71000] Updated weights for policy 0, policy_version 70174 (0.0029) [2024-06-12 20:10:00,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48059.8, 300 sec: 48652.1). Total num frames: 1149829120. 
Throughput: 0: 48977.0. Samples: 678721580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 20:10:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:10:01,812][71000] Updated weights for policy 0, policy_version 70184 (0.0028) [2024-06-12 20:10:05,427][71000] Updated weights for policy 0, policy_version 70194 (0.0033) [2024-06-12 20:10:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 1150091264. Throughput: 0: 48954.7. Samples: 678861920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 20:10:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:10:08,623][71000] Updated weights for policy 0, policy_version 70204 (0.0039) [2024-06-12 20:10:10,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48332.7, 300 sec: 48707.7). Total num frames: 1150337024. Throughput: 0: 48941.6. Samples: 679153320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 20:10:10,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:10:11,925][71000] Updated weights for policy 0, policy_version 70214 (0.0035) [2024-06-12 20:10:15,297][71000] Updated weights for policy 0, policy_version 70224 (0.0024) [2024-06-12 20:10:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 1150582784. Throughput: 0: 49128.5. Samples: 679460100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 20:10:15,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 20:10:18,522][71000] Updated weights for policy 0, policy_version 70234 (0.0026) [2024-06-12 20:10:20,939][70768] Fps is (10 sec: 47514.8, 60 sec: 48879.0, 300 sec: 48652.2). Total num frames: 1150812160. Throughput: 0: 48950.7. Samples: 679601120. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-12 20:10:20,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:10:22,151][71000] Updated weights for policy 0, policy_version 70244 (0.0028) [2024-06-12 20:10:25,177][71000] Updated weights for policy 0, policy_version 70254 (0.0021) [2024-06-12 20:10:25,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.0, 300 sec: 48652.2). Total num frames: 1151057920. Throughput: 0: 48833.9. Samples: 679886780. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-12 20:10:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:10:28,626][71000] Updated weights for policy 0, policy_version 70264 (0.0024) [2024-06-12 20:10:29,677][70980] Signal inference workers to stop experience collection... (10000 times) [2024-06-12 20:10:29,677][70980] Signal inference workers to resume experience collection... (10000 times) [2024-06-12 20:10:29,719][71000] InferenceWorker_p0-w0: stopping experience collection (10000 times) [2024-06-12 20:10:29,719][71000] InferenceWorker_p0-w0: resuming experience collection (10000 times) [2024-06-12 20:10:30,940][70768] Fps is (10 sec: 49150.9, 60 sec: 48605.7, 300 sec: 48652.1). Total num frames: 1151303680. Throughput: 0: 48716.5. Samples: 680179420. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-12 20:10:30,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:10:32,042][71000] Updated weights for policy 0, policy_version 70274 (0.0029) [2024-06-12 20:10:35,410][71000] Updated weights for policy 0, policy_version 70284 (0.0024) [2024-06-12 20:10:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.9, 300 sec: 48707.9). Total num frames: 1151549440. Throughput: 0: 48723.7. Samples: 680325600. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-12 20:10:35,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 20:10:38,548][71000] Updated weights for policy 0, policy_version 70294 (0.0032) [2024-06-12 20:10:40,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 1151795200. Throughput: 0: 48884.4. Samples: 680633640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-12 20:10:40,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:10:41,945][71000] Updated weights for policy 0, policy_version 70304 (0.0028) [2024-06-12 20:10:44,965][71000] Updated weights for policy 0, policy_version 70314 (0.0027) [2024-06-12 20:10:45,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48609.3, 300 sec: 48708.3). Total num frames: 1152040960. Throughput: 0: 48963.5. Samples: 680924940. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-12 20:10:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:10:48,637][71000] Updated weights for policy 0, policy_version 70324 (0.0031) [2024-06-12 20:10:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.1, 300 sec: 48763.2). Total num frames: 1152303104. Throughput: 0: 49252.0. Samples: 681078260. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-12 20:10:50,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:10:51,503][71000] Updated weights for policy 0, policy_version 70334 (0.0031) [2024-06-12 20:10:55,178][71000] Updated weights for policy 0, policy_version 70344 (0.0031) [2024-06-12 20:10:55,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1152548864. Throughput: 0: 49123.7. Samples: 681363880. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-12 20:10:55,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:10:58,915][71000] Updated weights for policy 0, policy_version 70354 (0.0042) [2024-06-12 20:11:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 48763.2). 
Total num frames: 1152778240. Throughput: 0: 48687.0. Samples: 681651020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:11:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:11:01,960][71000] Updated weights for policy 0, policy_version 70364 (0.0026) [2024-06-12 20:11:05,227][71000] Updated weights for policy 0, policy_version 70374 (0.0032) [2024-06-12 20:11:05,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 1153040384. Throughput: 0: 48942.0. Samples: 681803520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:11:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:11:08,434][71000] Updated weights for policy 0, policy_version 70384 (0.0027) [2024-06-12 20:11:10,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.1, 300 sec: 48818.8). Total num frames: 1153286144. Throughput: 0: 49225.7. Samples: 682101940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:11:10,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:11:12,005][71000] Updated weights for policy 0, policy_version 70394 (0.0028) [2024-06-12 20:11:15,324][71000] Updated weights for policy 0, policy_version 70404 (0.0028) [2024-06-12 20:11:15,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.1, 300 sec: 48818.8). Total num frames: 1153548288. Throughput: 0: 49137.9. Samples: 682390620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:11:15,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:11:18,871][71000] Updated weights for policy 0, policy_version 70414 (0.0037) [2024-06-12 20:11:20,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 1153761280. Throughput: 0: 49229.3. Samples: 682540920. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:11:20,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 20:11:21,888][71000] Updated weights for policy 0, policy_version 70424 (0.0028) [2024-06-12 20:11:25,683][71000] Updated weights for policy 0, policy_version 70434 (0.0028) [2024-06-12 20:11:25,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49151.8, 300 sec: 48763.2). Total num frames: 1154007040. Throughput: 0: 48738.1. Samples: 682826860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:11:25,949][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:11:28,679][71000] Updated weights for policy 0, policy_version 70444 (0.0038) [2024-06-12 20:11:30,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 48763.2). Total num frames: 1154252800. Throughput: 0: 48870.3. Samples: 683124100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:11:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:11:31,923][71000] Updated weights for policy 0, policy_version 70454 (0.0025) [2024-06-12 20:11:34,999][71000] Updated weights for policy 0, policy_version 70464 (0.0034) [2024-06-12 20:11:35,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49425.1, 300 sec: 48818.8). Total num frames: 1154514944. Throughput: 0: 48855.6. Samples: 683276760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:11:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:11:38,832][71000] Updated weights for policy 0, policy_version 70474 (0.0037) [2024-06-12 20:11:40,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49151.8, 300 sec: 48874.3). Total num frames: 1154744320. Throughput: 0: 49045.1. Samples: 683570920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:11:40,949][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:11:41,086][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000070481_1154760704.pth... 
[2024-06-12 20:11:41,137][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000069764_1143013376.pth [2024-06-12 20:11:41,746][71000] Updated weights for policy 0, policy_version 70484 (0.0024) [2024-06-12 20:11:45,360][71000] Updated weights for policy 0, policy_version 70494 (0.0028) [2024-06-12 20:11:45,944][70768] Fps is (10 sec: 47493.2, 60 sec: 49148.6, 300 sec: 48762.5). Total num frames: 1154990080. Throughput: 0: 49251.8. Samples: 683867560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 20:11:45,944][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 20:11:48,304][71000] Updated weights for policy 0, policy_version 70504 (0.0035) [2024-06-12 20:11:50,939][70768] Fps is (10 sec: 47514.9, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1155219456. Throughput: 0: 49018.9. Samples: 684009360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 20:11:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:11:52,193][71000] Updated weights for policy 0, policy_version 70514 (0.0024) [2024-06-12 20:11:52,691][70980] Signal inference workers to stop experience collection... (10050 times) [2024-06-12 20:11:52,693][70980] Signal inference workers to resume experience collection... (10050 times) [2024-06-12 20:11:52,713][71000] InferenceWorker_p0-w0: stopping experience collection (10050 times) [2024-06-12 20:11:52,713][71000] InferenceWorker_p0-w0: resuming experience collection (10050 times) [2024-06-12 20:11:55,199][71000] Updated weights for policy 0, policy_version 70524 (0.0028) [2024-06-12 20:11:55,940][70768] Fps is (10 sec: 50811.8, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 1155497984. Throughput: 0: 49097.2. Samples: 684311320. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 20:11:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:11:58,845][71000] Updated weights for policy 0, policy_version 70534 (0.0026) [2024-06-12 20:12:00,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1155710976. Throughput: 0: 49125.0. Samples: 684601240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 20:12:00,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:12:01,866][71000] Updated weights for policy 0, policy_version 70544 (0.0028) [2024-06-12 20:12:05,537][71000] Updated weights for policy 0, policy_version 70554 (0.0032) [2024-06-12 20:12:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1155973120. Throughput: 0: 48819.0. Samples: 684737780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 20:12:05,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:12:08,319][71000] Updated weights for policy 0, policy_version 70564 (0.0037) [2024-06-12 20:12:10,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1156202496. Throughput: 0: 48916.6. Samples: 685028100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 20:12:10,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:12:12,265][71000] Updated weights for policy 0, policy_version 70574 (0.0032) [2024-06-12 20:12:15,119][71000] Updated weights for policy 0, policy_version 70584 (0.0028) [2024-06-12 20:12:15,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1156497408. Throughput: 0: 48880.5. Samples: 685323720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 20:12:15,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:12:19,036][71000] Updated weights for policy 0, policy_version 70594 (0.0027) [2024-06-12 20:12:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.0, 300 sec: 48874.3). 
Total num frames: 1156694016. Throughput: 0: 48912.5. Samples: 685477820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 20:12:20,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:12:22,031][71000] Updated weights for policy 0, policy_version 70604 (0.0039) [2024-06-12 20:12:25,843][71000] Updated weights for policy 0, policy_version 70614 (0.0031) [2024-06-12 20:12:25,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1156939776. Throughput: 0: 48822.4. Samples: 685767920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 20:12:25,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:12:28,687][71000] Updated weights for policy 0, policy_version 70624 (0.0023) [2024-06-12 20:12:30,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1157185536. Throughput: 0: 48633.5. Samples: 686055860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 20:12:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:12:32,222][71000] Updated weights for policy 0, policy_version 70634 (0.0031) [2024-06-12 20:12:35,095][71000] Updated weights for policy 0, policy_version 70644 (0.0029) [2024-06-12 20:12:35,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1157464064. Throughput: 0: 49151.5. Samples: 686221180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 20:12:35,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 20:12:38,947][71000] Updated weights for policy 0, policy_version 70654 (0.0035) [2024-06-12 20:12:40,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 1157693440. Throughput: 0: 48981.7. Samples: 686515500. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 20:12:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:12:41,693][71000] Updated weights for policy 0, policy_version 70664 (0.0037) [2024-06-12 20:12:45,749][71000] Updated weights for policy 0, policy_version 70674 (0.0029) [2024-06-12 20:12:45,943][70768] Fps is (10 sec: 45858.5, 60 sec: 48879.4, 300 sec: 48818.2). Total num frames: 1157922816. Throughput: 0: 48939.9. Samples: 686803720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 20:12:45,944][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:12:48,628][71000] Updated weights for policy 0, policy_version 70684 (0.0026) [2024-06-12 20:12:50,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1158168576. Throughput: 0: 49033.1. Samples: 686944260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 20:12:50,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:12:52,231][71000] Updated weights for policy 0, policy_version 70694 (0.0028) [2024-06-12 20:12:55,337][71000] Updated weights for policy 0, policy_version 70704 (0.0029) [2024-06-12 20:12:55,939][70768] Fps is (10 sec: 49170.3, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 1158414336. Throughput: 0: 49224.0. Samples: 687243180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 20:12:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:12:57,739][70980] Signal inference workers to stop experience collection... (10100 times) [2024-06-12 20:12:57,766][71000] InferenceWorker_p0-w0: stopping experience collection (10100 times) [2024-06-12 20:12:57,789][70980] Signal inference workers to resume experience collection... 
(10100 times) [2024-06-12 20:12:57,790][71000] InferenceWorker_p0-w0: resuming experience collection (10100 times) [2024-06-12 20:12:59,062][71000] Updated weights for policy 0, policy_version 70714 (0.0023) [2024-06-12 20:13:00,941][70768] Fps is (10 sec: 50781.0, 60 sec: 49423.5, 300 sec: 48985.1). Total num frames: 1158676480. Throughput: 0: 49305.6. Samples: 687542560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 20:13:00,942][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:13:01,682][71000] Updated weights for policy 0, policy_version 70724 (0.0030) [2024-06-12 20:13:05,689][71000] Updated weights for policy 0, policy_version 70734 (0.0032) [2024-06-12 20:13:05,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1158905856. Throughput: 0: 49144.9. Samples: 687689340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-12 20:13:05,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:13:08,759][71000] Updated weights for policy 0, policy_version 70744 (0.0026) [2024-06-12 20:13:10,940][70768] Fps is (10 sec: 47521.5, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 1159151616. Throughput: 0: 49257.2. Samples: 687984500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 20:13:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:13:12,450][71000] Updated weights for policy 0, policy_version 70754 (0.0027) [2024-06-12 20:13:15,464][71000] Updated weights for policy 0, policy_version 70764 (0.0025) [2024-06-12 20:13:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 1159413760. Throughput: 0: 49129.4. Samples: 688266680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 20:13:15,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:13:18,906][71000] Updated weights for policy 0, policy_version 70774 (0.0032) [2024-06-12 20:13:20,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.0, 300 sec: 48929.8). 
Total num frames: 1159643136. Throughput: 0: 48942.3. Samples: 688423580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 20:13:20,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:13:22,144][71000] Updated weights for policy 0, policy_version 70784 (0.0029) [2024-06-12 20:13:25,651][71000] Updated weights for policy 0, policy_version 70794 (0.0031) [2024-06-12 20:13:25,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49151.9, 300 sec: 48819.3). Total num frames: 1159888896. Throughput: 0: 48938.7. Samples: 688717740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 20:13:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:13:28,750][71000] Updated weights for policy 0, policy_version 70804 (0.0033) [2024-06-12 20:13:30,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 1160134656. Throughput: 0: 49045.4. Samples: 689010580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 20:13:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:13:32,371][71000] Updated weights for policy 0, policy_version 70814 (0.0027) [2024-06-12 20:13:35,695][71000] Updated weights for policy 0, policy_version 70824 (0.0028) [2024-06-12 20:13:35,940][70768] Fps is (10 sec: 49152.9, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 1160380416. Throughput: 0: 49242.7. Samples: 689160180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 20:13:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 20:13:39,123][71000] Updated weights for policy 0, policy_version 70834 (0.0028) [2024-06-12 20:13:40,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.1, 300 sec: 49041.4). Total num frames: 1160642560. Throughput: 0: 49035.9. Samples: 689449800. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 20:13:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:13:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000070840_1160642560.pth... [2024-06-12 20:13:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000070121_1148862464.pth [2024-06-12 20:13:42,404][71000] Updated weights for policy 0, policy_version 70844 (0.0033) [2024-06-12 20:13:45,498][71000] Updated weights for policy 0, policy_version 70854 (0.0024) [2024-06-12 20:13:45,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49154.9, 300 sec: 48929.8). Total num frames: 1160871936. Throughput: 0: 48946.3. Samples: 689745060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 20:13:45,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:13:49,089][71000] Updated weights for policy 0, policy_version 70864 (0.0032) [2024-06-12 20:13:50,939][70768] Fps is (10 sec: 45875.7, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 1161101312. Throughput: 0: 48815.1. Samples: 689886020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-12 20:13:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:13:52,257][71000] Updated weights for policy 0, policy_version 70874 (0.0022) [2024-06-12 20:13:55,803][71000] Updated weights for policy 0, policy_version 70884 (0.0029) [2024-06-12 20:13:55,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49424.9, 300 sec: 48929.8). Total num frames: 1161379840. Throughput: 0: 48866.3. Samples: 690183480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:13:55,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:13:59,327][71000] Updated weights for policy 0, policy_version 70894 (0.0030) [2024-06-12 20:14:00,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49153.4, 300 sec: 49040.9). Total num frames: 1161625600. Throughput: 0: 49080.3. Samples: 690475300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:14:00,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 20:14:02,172][71000] Updated weights for policy 0, policy_version 70904 (0.0040) [2024-06-12 20:14:04,076][70980] Signal inference workers to stop experience collection... (10150 times) [2024-06-12 20:14:04,132][70980] Signal inference workers to resume experience collection... (10150 times) [2024-06-12 20:14:04,133][71000] InferenceWorker_p0-w0: stopping experience collection (10150 times) [2024-06-12 20:14:04,143][71000] InferenceWorker_p0-w0: resuming experience collection (10150 times) [2024-06-12 20:14:05,734][71000] Updated weights for policy 0, policy_version 70914 (0.0033) [2024-06-12 20:14:05,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1161854976. Throughput: 0: 48968.5. Samples: 690627160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:14:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:14:09,221][71000] Updated weights for policy 0, policy_version 70924 (0.0027) [2024-06-12 20:14:10,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 1162100736. Throughput: 0: 48993.9. Samples: 690922460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:14:10,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:14:12,384][71000] Updated weights for policy 0, policy_version 70934 (0.0028) [2024-06-12 20:14:15,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 1162330112. Throughput: 0: 48951.1. Samples: 691213380. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:14:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:14:16,071][71000] Updated weights for policy 0, policy_version 70944 (0.0031) [2024-06-12 20:14:19,492][71000] Updated weights for policy 0, policy_version 70954 (0.0035) [2024-06-12 20:14:20,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1162592256. Throughput: 0: 48907.6. Samples: 691361020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:14:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:14:22,799][71000] Updated weights for policy 0, policy_version 70964 (0.0031) [2024-06-12 20:14:25,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1162805248. Throughput: 0: 48980.0. Samples: 691653900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:14:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:14:26,261][71000] Updated weights for policy 0, policy_version 70974 (0.0031) [2024-06-12 20:14:29,195][71000] Updated weights for policy 0, policy_version 70984 (0.0031) [2024-06-12 20:14:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 1163067392. Throughput: 0: 49048.1. Samples: 691952220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:14:30,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 20:14:32,522][71000] Updated weights for policy 0, policy_version 70994 (0.0030) [2024-06-12 20:14:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 1163313152. Throughput: 0: 48891.8. Samples: 692086160. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 20:14:35,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:14:36,088][71000] Updated weights for policy 0, policy_version 71004 (0.0028) [2024-06-12 20:14:39,241][71000] Updated weights for policy 0, policy_version 71014 (0.0031) [2024-06-12 20:14:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48879.0, 300 sec: 48986.1). Total num frames: 1163575296. Throughput: 0: 49078.7. Samples: 692392020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 20:14:40,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:14:42,817][71000] Updated weights for policy 0, policy_version 71024 (0.0031) [2024-06-12 20:14:45,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 1163804672. Throughput: 0: 49285.0. Samples: 692693120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 20:14:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:14:46,031][71000] Updated weights for policy 0, policy_version 71034 (0.0031) [2024-06-12 20:14:49,637][71000] Updated weights for policy 0, policy_version 71044 (0.0024) [2024-06-12 20:14:50,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 1164066816. Throughput: 0: 49022.3. Samples: 692833160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 20:14:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:14:52,640][71000] Updated weights for policy 0, policy_version 71054 (0.0030) [2024-06-12 20:14:55,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48332.9, 300 sec: 48985.4). Total num frames: 1164279808. Throughput: 0: 49033.4. Samples: 693128960. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 20:14:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:14:56,212][71000] Updated weights for policy 0, policy_version 71064 (0.0028) [2024-06-12 20:14:59,054][71000] Updated weights for policy 0, policy_version 71074 (0.0032) [2024-06-12 20:15:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 1164558336. Throughput: 0: 49042.2. Samples: 693420280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 20:15:00,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:15:02,706][71000] Updated weights for policy 0, policy_version 71084 (0.0031) [2024-06-12 20:15:05,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 1164787712. Throughput: 0: 49307.0. Samples: 693579840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 20:15:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:15:06,131][70980] Signal inference workers to stop experience collection... (10200 times) [2024-06-12 20:15:06,132][70980] Signal inference workers to resume experience collection... (10200 times) [2024-06-12 20:15:06,134][71000] Updated weights for policy 0, policy_version 71094 (0.0032) [2024-06-12 20:15:06,151][71000] InferenceWorker_p0-w0: stopping experience collection (10200 times) [2024-06-12 20:15:06,151][71000] InferenceWorker_p0-w0: resuming experience collection (10200 times) [2024-06-12 20:15:09,276][71000] Updated weights for policy 0, policy_version 71104 (0.0022) [2024-06-12 20:15:10,940][70768] Fps is (10 sec: 45874.5, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 1165017088. Throughput: 0: 49053.7. Samples: 693861320. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 20:15:10,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:15:12,701][71000] Updated weights for policy 0, policy_version 71114 (0.0026) [2024-06-12 20:15:15,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 1165262848. Throughput: 0: 48958.7. Samples: 694155360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 20:15:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:15:16,460][71000] Updated weights for policy 0, policy_version 71124 (0.0025) [2024-06-12 20:15:19,143][71000] Updated weights for policy 0, policy_version 71134 (0.0025) [2024-06-12 20:15:20,940][70768] Fps is (10 sec: 54067.3, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 1165557760. Throughput: 0: 49352.5. Samples: 694307020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-12 20:15:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:15:22,716][71000] Updated weights for policy 0, policy_version 71144 (0.0021) [2024-06-12 20:15:25,903][71000] Updated weights for policy 0, policy_version 71154 (0.0026) [2024-06-12 20:15:25,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 1165787136. Throughput: 0: 49306.5. Samples: 694610820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-12 20:15:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:15:29,455][71000] Updated weights for policy 0, policy_version 71164 (0.0028) [2024-06-12 20:15:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49424.9, 300 sec: 49096.4). Total num frames: 1166032896. Throughput: 0: 49086.0. Samples: 694902000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-12 20:15:30,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:15:32,519][71000] Updated weights for policy 0, policy_version 71174 (0.0026) [2024-06-12 20:15:35,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48879.0, 300 sec: 48985.4). 
Total num frames: 1166245888. Throughput: 0: 49117.7. Samples: 695043460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-12 20:15:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:15:36,424][71000] Updated weights for policy 0, policy_version 71184 (0.0029) [2024-06-12 20:15:38,925][71000] Updated weights for policy 0, policy_version 71194 (0.0024) [2024-06-12 20:15:40,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 1166524416. Throughput: 0: 49055.9. Samples: 695336480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-12 20:15:40,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 20:15:41,036][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000071200_1166540800.pth... [2024-06-12 20:15:41,077][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000070481_1154760704.pth [2024-06-12 20:15:42,828][71000] Updated weights for policy 0, policy_version 71204 (0.0032) [2024-06-12 20:15:45,674][71000] Updated weights for policy 0, policy_version 71214 (0.0031) [2024-06-12 20:15:45,939][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 1166770176. Throughput: 0: 49232.0. Samples: 695635720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-12 20:15:45,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:15:49,202][71000] Updated weights for policy 0, policy_version 71224 (0.0025) [2024-06-12 20:15:50,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 1166999552. Throughput: 0: 49125.9. Samples: 695790500. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-12 20:15:50,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:15:52,358][71000] Updated weights for policy 0, policy_version 71234 (0.0028) [2024-06-12 20:15:55,907][71000] Updated weights for policy 0, policy_version 71244 (0.0031) [2024-06-12 20:15:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 1167261696. Throughput: 0: 49256.6. Samples: 696077860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-12 20:15:55,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 20:15:58,885][71000] Updated weights for policy 0, policy_version 71254 (0.0027) [2024-06-12 20:16:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1167507456. Throughput: 0: 49311.6. Samples: 696374380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-12 20:16:00,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:16:02,925][71000] Updated weights for policy 0, policy_version 71264 (0.0029) [2024-06-12 20:16:05,402][71000] Updated weights for policy 0, policy_version 71274 (0.0020) [2024-06-12 20:16:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 1167753216. Throughput: 0: 49283.2. Samples: 696524760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-12 20:16:05,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:16:07,102][70980] Signal inference workers to stop experience collection... (10250 times) [2024-06-12 20:16:07,128][71000] InferenceWorker_p0-w0: stopping experience collection (10250 times) [2024-06-12 20:16:07,145][70980] Signal inference workers to resume experience collection... 
(10250 times) [2024-06-12 20:16:07,146][71000] InferenceWorker_p0-w0: resuming experience collection (10250 times) [2024-06-12 20:16:09,568][71000] Updated weights for policy 0, policy_version 71284 (0.0033) [2024-06-12 20:16:10,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1167966208. Throughput: 0: 49073.5. Samples: 696819120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-12 20:16:10,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:16:12,352][71000] Updated weights for policy 0, policy_version 71294 (0.0030) [2024-06-12 20:16:15,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 1168228352. Throughput: 0: 48969.0. Samples: 697105600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-12 20:16:15,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:16:16,125][71000] Updated weights for policy 0, policy_version 71304 (0.0035) [2024-06-12 20:16:19,162][71000] Updated weights for policy 0, policy_version 71314 (0.0028) [2024-06-12 20:16:20,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 1168490496. Throughput: 0: 49137.7. Samples: 697254660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-12 20:16:20,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:16:22,820][71000] Updated weights for policy 0, policy_version 71324 (0.0029) [2024-06-12 20:16:25,601][71000] Updated weights for policy 0, policy_version 71334 (0.0029) [2024-06-12 20:16:25,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 1168736256. Throughput: 0: 49311.8. Samples: 697555520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-12 20:16:25,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 20:16:29,815][71000] Updated weights for policy 0, policy_version 71344 (0.0032) [2024-06-12 20:16:30,939][70768] Fps is (10 sec: 45875.9, 60 sec: 48606.0, 300 sec: 48929.9). 
Total num frames: 1168949248. Throughput: 0: 49116.5. Samples: 697845960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-12 20:16:30,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:16:32,285][71000] Updated weights for policy 0, policy_version 71354 (0.0026) [2024-06-12 20:16:35,939][70768] Fps is (10 sec: 45876.3, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 1169195008. Throughput: 0: 48808.4. Samples: 697986880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-12 20:16:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:16:36,161][71000] Updated weights for policy 0, policy_version 71364 (0.0026) [2024-06-12 20:16:39,113][71000] Updated weights for policy 0, policy_version 71374 (0.0022) [2024-06-12 20:16:40,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49151.9, 300 sec: 49097.2). Total num frames: 1169473536. Throughput: 0: 49079.4. Samples: 698286440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-12 20:16:40,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:16:42,755][71000] Updated weights for policy 0, policy_version 71384 (0.0026) [2024-06-12 20:16:45,522][71000] Updated weights for policy 0, policy_version 71394 (0.0032) [2024-06-12 20:16:45,940][70768] Fps is (10 sec: 54066.8, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 1169735680. Throughput: 0: 49141.7. Samples: 698585760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-12 20:16:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:16:49,143][71000] Updated weights for policy 0, policy_version 71404 (0.0025) [2024-06-12 20:16:50,940][70768] Fps is (10 sec: 45875.9, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 1169932288. Throughput: 0: 49123.1. Samples: 698735300. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:16:50,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:16:52,267][71000] Updated weights for policy 0, policy_version 71414 (0.0030) [2024-06-12 20:16:55,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 49096.4). Total num frames: 1170194432. Throughput: 0: 49022.2. Samples: 699025120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:16:55,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:16:55,999][71000] Updated weights for policy 0, policy_version 71424 (0.0032) [2024-06-12 20:16:58,959][71000] Updated weights for policy 0, policy_version 71434 (0.0021) [2024-06-12 20:17:00,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 1170456576. Throughput: 0: 49101.0. Samples: 699315140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:17:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:17:02,604][71000] Updated weights for policy 0, policy_version 71444 (0.0029) [2024-06-12 20:17:05,762][71000] Updated weights for policy 0, policy_version 71454 (0.0027) [2024-06-12 20:17:05,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 1170702336. Throughput: 0: 49146.4. Samples: 699466240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:17:05,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:17:09,623][71000] Updated weights for policy 0, policy_version 71464 (0.0028) [2024-06-12 20:17:10,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 1170931712. Throughput: 0: 49067.8. Samples: 699763560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:17:10,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:17:12,334][71000] Updated weights for policy 0, policy_version 71474 (0.0033) [2024-06-12 20:17:15,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.1, 300 sec: 49096.5). 
Total num frames: 1171177472. Throughput: 0: 49027.9. Samples: 700052220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:17:15,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:17:15,949][71000] Updated weights for policy 0, policy_version 71484 (0.0024) [2024-06-12 20:17:17,523][70980] Signal inference workers to stop experience collection... (10300 times) [2024-06-12 20:17:17,549][71000] InferenceWorker_p0-w0: stopping experience collection (10300 times) [2024-06-12 20:17:17,583][70980] Signal inference workers to resume experience collection... (10300 times) [2024-06-12 20:17:17,583][71000] InferenceWorker_p0-w0: resuming experience collection (10300 times) [2024-06-12 20:17:19,080][71000] Updated weights for policy 0, policy_version 71494 (0.0036) [2024-06-12 20:17:20,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 1171423232. Throughput: 0: 49099.0. Samples: 700196340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:17:20,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:17:22,962][71000] Updated weights for policy 0, policy_version 71504 (0.0028) [2024-06-12 20:17:25,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 1171668992. Throughput: 0: 48971.8. Samples: 700490160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:17:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:17:26,016][71000] Updated weights for policy 0, policy_version 71514 (0.0040) [2024-06-12 20:17:29,659][71000] Updated weights for policy 0, policy_version 71524 (0.0026) [2024-06-12 20:17:30,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 1171898368. Throughput: 0: 48826.7. Samples: 700782960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 20:17:30,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:17:32,506][71000] Updated weights for policy 0, policy_version 71534 (0.0029) [2024-06-12 20:17:35,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49041.0). Total num frames: 1172160512. Throughput: 0: 48724.5. Samples: 700927900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 20:17:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:17:35,983][71000] Updated weights for policy 0, policy_version 71544 (0.0025) [2024-06-12 20:17:39,097][71000] Updated weights for policy 0, policy_version 71554 (0.0037) [2024-06-12 20:17:40,940][70768] Fps is (10 sec: 50789.1, 60 sec: 48878.9, 300 sec: 49097.0). Total num frames: 1172406272. Throughput: 0: 48825.5. Samples: 701222280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 20:17:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:17:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000071558_1172406272.pth... [2024-06-12 20:17:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000070840_1160642560.pth [2024-06-12 20:17:42,554][71000] Updated weights for policy 0, policy_version 71564 (0.0035) [2024-06-12 20:17:45,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48605.8, 300 sec: 49096.4). Total num frames: 1172652032. Throughput: 0: 48949.2. Samples: 701517860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 20:17:45,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:17:45,949][71000] Updated weights for policy 0, policy_version 71574 (0.0038) [2024-06-12 20:17:49,474][71000] Updated weights for policy 0, policy_version 71584 (0.0026) [2024-06-12 20:17:50,940][70768] Fps is (10 sec: 47514.8, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1172881408. Throughput: 0: 49005.7. Samples: 701671500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 20:17:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:17:52,728][71000] Updated weights for policy 0, policy_version 71594 (0.0027) [2024-06-12 20:17:55,939][70768] Fps is (10 sec: 47514.6, 60 sec: 48879.0, 300 sec: 48985.7). Total num frames: 1173127168. Throughput: 0: 48897.9. Samples: 701963960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 20:17:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:17:56,196][71000] Updated weights for policy 0, policy_version 71604 (0.0027) [2024-06-12 20:17:59,184][71000] Updated weights for policy 0, policy_version 71614 (0.0023) [2024-06-12 20:18:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 1173389312. Throughput: 0: 48988.0. Samples: 702256680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 20:18:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:18:03,011][71000] Updated weights for policy 0, policy_version 71624 (0.0031) [2024-06-12 20:18:05,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 1173635072. Throughput: 0: 49240.6. Samples: 702412160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 20:18:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:18:06,044][71000] Updated weights for policy 0, policy_version 71634 (0.0034) [2024-06-12 20:18:09,534][71000] Updated weights for policy 0, policy_version 71644 (0.0027) [2024-06-12 20:18:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1173880832. Throughput: 0: 49242.6. Samples: 702706080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 20:18:10,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:18:12,533][71000] Updated weights for policy 0, policy_version 71654 (0.0021) [2024-06-12 20:18:15,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 49040.9). 
Total num frames: 1174110208. Throughput: 0: 49133.3. Samples: 702993960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:18:15,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:18:16,615][71000] Updated weights for policy 0, policy_version 71664 (0.0029) [2024-06-12 20:18:19,301][71000] Updated weights for policy 0, policy_version 71674 (0.0030) [2024-06-12 20:18:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 1174372352. Throughput: 0: 49078.6. Samples: 703136440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:18:20,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:18:23,275][71000] Updated weights for policy 0, policy_version 71684 (0.0028) [2024-06-12 20:18:25,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49151.8, 300 sec: 49096.4). Total num frames: 1174618112. Throughput: 0: 49104.1. Samples: 703431960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:18:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:18:26,098][71000] Updated weights for policy 0, policy_version 71694 (0.0029) [2024-06-12 20:18:27,632][70980] Signal inference workers to stop experience collection... (10350 times) [2024-06-12 20:18:27,633][70980] Signal inference workers to resume experience collection... (10350 times) [2024-06-12 20:18:27,673][71000] InferenceWorker_p0-w0: stopping experience collection (10350 times) [2024-06-12 20:18:27,673][71000] InferenceWorker_p0-w0: resuming experience collection (10350 times) [2024-06-12 20:18:30,055][71000] Updated weights for policy 0, policy_version 71704 (0.0031) [2024-06-12 20:18:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1174847488. Throughput: 0: 48988.1. Samples: 703722320. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:18:30,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:18:33,041][71000] Updated weights for policy 0, policy_version 71714 (0.0030) [2024-06-12 20:18:35,939][70768] Fps is (10 sec: 45876.0, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 1175076864. Throughput: 0: 48638.3. Samples: 703860220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:18:35,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:18:36,910][71000] Updated weights for policy 0, policy_version 71724 (0.0026) [2024-06-12 20:18:39,724][71000] Updated weights for policy 0, policy_version 71734 (0.0027) [2024-06-12 20:18:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 1175339008. Throughput: 0: 48734.1. Samples: 704157000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:18:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:18:43,651][71000] Updated weights for policy 0, policy_version 71744 (0.0040) [2024-06-12 20:18:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 1175584768. Throughput: 0: 48626.7. Samples: 704444880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:18:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:18:46,379][71000] Updated weights for policy 0, policy_version 71754 (0.0033) [2024-06-12 20:18:50,435][71000] Updated weights for policy 0, policy_version 71764 (0.0028) [2024-06-12 20:18:50,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 1175814144. Throughput: 0: 48429.3. Samples: 704591480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:18:50,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 20:18:53,228][71000] Updated weights for policy 0, policy_version 71774 (0.0034) [2024-06-12 20:18:55,942][70768] Fps is (10 sec: 45866.0, 60 sec: 48604.2, 300 sec: 48874.0). 
Total num frames: 1176043520. Throughput: 0: 48503.7. Samples: 704888840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 20:18:55,942][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:18:56,929][71000] Updated weights for policy 0, policy_version 71784 (0.0028) [2024-06-12 20:19:00,135][71000] Updated weights for policy 0, policy_version 71794 (0.0028) [2024-06-12 20:19:00,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 1176322048. Throughput: 0: 48629.7. Samples: 705182300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:19:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:19:03,441][71000] Updated weights for policy 0, policy_version 71804 (0.0027) [2024-06-12 20:19:05,940][70768] Fps is (10 sec: 50800.6, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 1176551424. Throughput: 0: 48937.4. Samples: 705338620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:19:05,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:19:06,440][71000] Updated weights for policy 0, policy_version 71814 (0.0038) [2024-06-12 20:19:10,061][71000] Updated weights for policy 0, policy_version 71824 (0.0031) [2024-06-12 20:19:10,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 1176797184. Throughput: 0: 48973.0. Samples: 705635740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:19:10,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:19:12,877][71000] Updated weights for policy 0, policy_version 71834 (0.0028) [2024-06-12 20:19:15,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48878.8, 300 sec: 48985.3). Total num frames: 1177042944. Throughput: 0: 48863.4. Samples: 705921180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:19:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:19:16,738][71000] Updated weights for policy 0, policy_version 71844 (0.0029) [2024-06-12 20:19:20,094][71000] Updated weights for policy 0, policy_version 71854 (0.0031) [2024-06-12 20:19:20,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48606.0, 300 sec: 49096.5). Total num frames: 1177288704. Throughput: 0: 49188.0. Samples: 706073680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:19:20,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:19:23,478][71000] Updated weights for policy 0, policy_version 71864 (0.0028) [2024-06-12 20:19:25,939][70768] Fps is (10 sec: 49153.2, 60 sec: 48606.0, 300 sec: 49040.9). Total num frames: 1177534464. Throughput: 0: 49011.6. Samples: 706362520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:19:25,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:19:26,563][71000] Updated weights for policy 0, policy_version 71874 (0.0029) [2024-06-12 20:19:30,498][71000] Updated weights for policy 0, policy_version 71884 (0.0028) [2024-06-12 20:19:30,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 1177763840. Throughput: 0: 49139.2. Samples: 706656140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:19:30,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:19:33,299][71000] Updated weights for policy 0, policy_version 71894 (0.0025) [2024-06-12 20:19:35,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 1178009600. Throughput: 0: 48982.2. Samples: 706795680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:19:35,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:19:37,119][71000] Updated weights for policy 0, policy_version 71904 (0.0037) [2024-06-12 20:19:37,805][70980] Signal inference workers to stop experience collection... 
(10400 times) [2024-06-12 20:19:37,805][70980] Signal inference workers to resume experience collection... (10400 times) [2024-06-12 20:19:37,815][71000] InferenceWorker_p0-w0: stopping experience collection (10400 times) [2024-06-12 20:19:37,846][71000] InferenceWorker_p0-w0: resuming experience collection (10400 times) [2024-06-12 20:19:39,793][71000] Updated weights for policy 0, policy_version 71914 (0.0036) [2024-06-12 20:19:40,944][70768] Fps is (10 sec: 50768.1, 60 sec: 48875.4, 300 sec: 49040.2). Total num frames: 1178271744. Throughput: 0: 48911.3. Samples: 707089960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 20:19:40,944][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:19:40,998][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000071917_1178288128.pth... [2024-06-12 20:19:41,045][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000071200_1166540800.pth [2024-06-12 20:19:43,719][71000] Updated weights for policy 0, policy_version 71924 (0.0030) [2024-06-12 20:19:45,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 1178501120. Throughput: 0: 48933.5. Samples: 707384300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:19:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:19:46,602][71000] Updated weights for policy 0, policy_version 71934 (0.0032) [2024-06-12 20:19:50,418][71000] Updated weights for policy 0, policy_version 71944 (0.0033) [2024-06-12 20:19:50,940][70768] Fps is (10 sec: 47533.4, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 1178746880. Throughput: 0: 48577.2. Samples: 707524600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:19:50,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:19:53,148][71000] Updated weights for policy 0, policy_version 71954 (0.0031) [2024-06-12 20:19:55,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49426.6, 300 sec: 48985.4). 
Total num frames: 1179009024. Throughput: 0: 48550.5. Samples: 707820520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:19:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:19:57,256][71000] Updated weights for policy 0, policy_version 71964 (0.0023) [2024-06-12 20:20:00,084][71000] Updated weights for policy 0, policy_version 71974 (0.0023) [2024-06-12 20:20:00,940][70768] Fps is (10 sec: 50791.1, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 1179254784. Throughput: 0: 48858.0. Samples: 708119780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:20:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:20:03,554][71000] Updated weights for policy 0, policy_version 71984 (0.0029) [2024-06-12 20:20:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 1179500544. Throughput: 0: 48902.0. Samples: 708274280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:20:05,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 20:20:06,358][71000] Updated weights for policy 0, policy_version 71994 (0.0021) [2024-06-12 20:20:09,956][71000] Updated weights for policy 0, policy_version 72004 (0.0031) [2024-06-12 20:20:10,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 1179729920. Throughput: 0: 49056.5. Samples: 708570060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:20:10,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:20:13,296][71000] Updated weights for policy 0, policy_version 72014 (0.0041) [2024-06-12 20:20:15,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.2, 300 sec: 48929.9). Total num frames: 1179992064. Throughput: 0: 48844.8. Samples: 708854160. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:20:15,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:20:16,739][71000] Updated weights for policy 0, policy_version 72024 (0.0035) [2024-06-12 20:20:20,300][71000] Updated weights for policy 0, policy_version 72034 (0.0030) [2024-06-12 20:20:20,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 1180221440. Throughput: 0: 49166.5. Samples: 709008180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:20:20,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:20:23,802][71000] Updated weights for policy 0, policy_version 72044 (0.0026) [2024-06-12 20:20:25,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.8, 300 sec: 48929.9). Total num frames: 1180467200. Throughput: 0: 49011.7. Samples: 709295280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:20:25,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:20:26,957][71000] Updated weights for policy 0, policy_version 72054 (0.0028) [2024-06-12 20:20:30,250][71000] Updated weights for policy 0, policy_version 72064 (0.0028) [2024-06-12 20:20:30,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 1180712960. Throughput: 0: 48961.2. Samples: 709587560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 20:20:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:20:33,668][71000] Updated weights for policy 0, policy_version 72074 (0.0027) [2024-06-12 20:20:35,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 1180975104. Throughput: 0: 49119.7. Samples: 709734980. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 20:20:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:20:36,876][71000] Updated weights for policy 0, policy_version 72084 (0.0031) [2024-06-12 20:20:40,552][71000] Updated weights for policy 0, policy_version 72094 (0.0028) [2024-06-12 20:20:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48882.4, 300 sec: 48929.8). Total num frames: 1181204480. Throughput: 0: 49001.9. Samples: 710025600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 20:20:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:20:44,191][71000] Updated weights for policy 0, policy_version 72104 (0.0027) [2024-06-12 20:20:45,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 1181433856. Throughput: 0: 48792.0. Samples: 710315420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 20:20:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:20:47,407][71000] Updated weights for policy 0, policy_version 72114 (0.0021) [2024-06-12 20:20:50,553][71000] Updated weights for policy 0, policy_version 72124 (0.0022) [2024-06-12 20:20:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 1181696000. Throughput: 0: 48607.3. Samples: 710461600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 20:20:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:20:53,287][70980] Signal inference workers to stop experience collection... (10450 times) [2024-06-12 20:20:53,287][70980] Signal inference workers to resume experience collection... 
(10450 times) [2024-06-12 20:20:53,301][71000] InferenceWorker_p0-w0: stopping experience collection (10450 times) [2024-06-12 20:20:53,301][71000] InferenceWorker_p0-w0: resuming experience collection (10450 times) [2024-06-12 20:20:53,912][71000] Updated weights for policy 0, policy_version 72134 (0.0023) [2024-06-12 20:20:55,939][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.1, 300 sec: 48929.8). Total num frames: 1181941760. Throughput: 0: 48596.9. Samples: 710756920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 20:20:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:20:56,905][71000] Updated weights for policy 0, policy_version 72144 (0.0030) [2024-06-12 20:21:00,497][71000] Updated weights for policy 0, policy_version 72154 (0.0031) [2024-06-12 20:21:00,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48605.7, 300 sec: 48874.3). Total num frames: 1182171136. Throughput: 0: 48905.6. Samples: 711054920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 20:21:00,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 20:21:03,994][71000] Updated weights for policy 0, policy_version 72164 (0.0027) [2024-06-12 20:21:05,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 1182433280. Throughput: 0: 48341.0. Samples: 711183520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-12 20:21:05,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:21:07,794][71000] Updated weights for policy 0, policy_version 72174 (0.0029) [2024-06-12 20:21:10,667][71000] Updated weights for policy 0, policy_version 72184 (0.0027) [2024-06-12 20:21:10,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 1182662656. Throughput: 0: 48541.9. Samples: 711479660. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 20:21:10,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:21:14,302][71000] Updated weights for policy 0, policy_version 72194 (0.0035) [2024-06-12 20:21:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1182908416. Throughput: 0: 48703.5. Samples: 711779220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 20:21:15,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 20:21:16,983][71000] Updated weights for policy 0, policy_version 72204 (0.0027) [2024-06-12 20:21:20,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1183137792. Throughput: 0: 48758.6. Samples: 711929120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 20:21:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:21:21,398][71000] Updated weights for policy 0, policy_version 72214 (0.0038) [2024-06-12 20:21:23,448][71000] Updated weights for policy 0, policy_version 72224 (0.0030) [2024-06-12 20:21:25,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 1183383552. Throughput: 0: 48729.8. Samples: 712218440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 20:21:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:21:28,057][71000] Updated weights for policy 0, policy_version 72234 (0.0032) [2024-06-12 20:21:30,529][71000] Updated weights for policy 0, policy_version 72244 (0.0033) [2024-06-12 20:21:30,940][70768] Fps is (10 sec: 50791.1, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 1183645696. Throughput: 0: 48748.9. Samples: 712509120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 20:21:30,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:21:34,514][71000] Updated weights for policy 0, policy_version 72254 (0.0029) [2024-06-12 20:21:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48605.8, 300 sec: 48874.3). 
Total num frames: 1183891456. Throughput: 0: 48875.1. Samples: 712660980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 20:21:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 20:21:35,951][70980] Saving new best policy, reward=0.281! [2024-06-12 20:21:37,008][71000] Updated weights for policy 0, policy_version 72264 (0.0035) [2024-06-12 20:21:40,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1184120832. Throughput: 0: 48734.1. Samples: 712949960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 20:21:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:21:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000072273_1184120832.pth... [2024-06-12 20:21:41,006][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000071558_1172406272.pth [2024-06-12 20:21:41,295][71000] Updated weights for policy 0, policy_version 72274 (0.0033) [2024-06-12 20:21:44,085][71000] Updated weights for policy 0, policy_version 72284 (0.0028) [2024-06-12 20:21:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 1184366592. Throughput: 0: 48564.1. Samples: 713240300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 20:21:45,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:21:48,024][71000] Updated weights for policy 0, policy_version 72294 (0.0020) [2024-06-12 20:21:50,507][71000] Updated weights for policy 0, policy_version 72304 (0.0035) [2024-06-12 20:21:50,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 1184628736. Throughput: 0: 49168.4. Samples: 713396100. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 20:21:50,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:21:54,539][71000] Updated weights for policy 0, policy_version 72314 (0.0027) [2024-06-12 20:21:55,548][70980] Signal inference workers to stop experience collection... (10500 times) [2024-06-12 20:21:55,549][70980] Signal inference workers to resume experience collection... (10500 times) [2024-06-12 20:21:55,564][71000] InferenceWorker_p0-w0: stopping experience collection (10500 times) [2024-06-12 20:21:55,564][71000] InferenceWorker_p0-w0: resuming experience collection (10500 times) [2024-06-12 20:21:55,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 1184841728. Throughput: 0: 48885.8. Samples: 713679520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:21:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:21:57,418][71000] Updated weights for policy 0, policy_version 72324 (0.0023) [2024-06-12 20:22:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1185103872. Throughput: 0: 48900.5. Samples: 713979740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:22:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:22:01,019][71000] Updated weights for policy 0, policy_version 72334 (0.0033) [2024-06-12 20:22:04,207][71000] Updated weights for policy 0, policy_version 72344 (0.0035) [2024-06-12 20:22:05,939][70768] Fps is (10 sec: 50790.4, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1185349632. Throughput: 0: 48726.8. Samples: 714121820. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:22:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 20:22:07,867][71000] Updated weights for policy 0, policy_version 72354 (0.0030) [2024-06-12 20:22:10,639][71000] Updated weights for policy 0, policy_version 72364 (0.0028) [2024-06-12 20:22:10,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 1185628160. Throughput: 0: 49100.1. Samples: 714427940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:22:10,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:22:14,654][71000] Updated weights for policy 0, policy_version 72374 (0.0027) [2024-06-12 20:22:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1185824768. Throughput: 0: 49233.7. Samples: 714724640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:22:15,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:22:17,228][71000] Updated weights for policy 0, policy_version 72384 (0.0028) [2024-06-12 20:22:20,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1186070528. Throughput: 0: 48809.8. Samples: 714857420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:22:20,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:22:21,236][71000] Updated weights for policy 0, policy_version 72394 (0.0033) [2024-06-12 20:22:24,190][71000] Updated weights for policy 0, policy_version 72404 (0.0024) [2024-06-12 20:22:25,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 1186332672. Throughput: 0: 48889.9. Samples: 715150000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:22:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:22:28,095][71000] Updated weights for policy 0, policy_version 72414 (0.0033) [2024-06-12 20:22:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48878.9, 300 sec: 48874.3). 
Total num frames: 1186578432. Throughput: 0: 48904.1. Samples: 715440980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:22:30,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:22:31,000][71000] Updated weights for policy 0, policy_version 72424 (0.0025) [2024-06-12 20:22:34,954][71000] Updated weights for policy 0, policy_version 72434 (0.0029) [2024-06-12 20:22:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1186824192. Throughput: 0: 48857.8. Samples: 715594700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-12 20:22:35,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 20:22:37,566][71000] Updated weights for policy 0, policy_version 72444 (0.0028) [2024-06-12 20:22:40,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48606.0, 300 sec: 48763.3). Total num frames: 1187037184. Throughput: 0: 48907.5. Samples: 715880360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 20:22:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:22:41,641][71000] Updated weights for policy 0, policy_version 72454 (0.0032) [2024-06-12 20:22:44,305][71000] Updated weights for policy 0, policy_version 72464 (0.0032) [2024-06-12 20:22:45,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1187299328. Throughput: 0: 48614.7. Samples: 716167400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 20:22:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:22:48,212][71000] Updated weights for policy 0, policy_version 72474 (0.0030) [2024-06-12 20:22:50,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 1187561472. Throughput: 0: 48910.2. Samples: 716322780. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 20:22:50,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:22:51,130][71000] Updated weights for policy 0, policy_version 72484 (0.0032) [2024-06-12 20:22:55,032][71000] Updated weights for policy 0, policy_version 72494 (0.0024) [2024-06-12 20:22:55,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.8, 300 sec: 48818.7). Total num frames: 1187790848. Throughput: 0: 48706.0. Samples: 716619720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 20:22:55,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:22:57,761][71000] Updated weights for policy 0, policy_version 72504 (0.0020) [2024-06-12 20:23:00,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1188020224. Throughput: 0: 48456.9. Samples: 716905200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 20:23:00,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:23:01,941][71000] Updated weights for policy 0, policy_version 72514 (0.0026) [2024-06-12 20:23:04,350][71000] Updated weights for policy 0, policy_version 72524 (0.0028) [2024-06-12 20:23:05,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1188282368. Throughput: 0: 48845.8. Samples: 717055480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 20:23:05,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:23:08,715][71000] Updated weights for policy 0, policy_version 72534 (0.0021) [2024-06-12 20:23:10,177][70980] Signal inference workers to stop experience collection... (10550 times) [2024-06-12 20:23:10,185][70980] Signal inference workers to resume experience collection... 
(10550 times) [2024-06-12 20:23:10,200][71000] InferenceWorker_p0-w0: stopping experience collection (10550 times) [2024-06-12 20:23:10,200][71000] InferenceWorker_p0-w0: resuming experience collection (10550 times) [2024-06-12 20:23:10,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48332.7, 300 sec: 48874.3). Total num frames: 1188528128. Throughput: 0: 49025.2. Samples: 717356140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 20:23:10,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:23:11,169][71000] Updated weights for policy 0, policy_version 72544 (0.0031) [2024-06-12 20:23:15,268][71000] Updated weights for policy 0, policy_version 72554 (0.0028) [2024-06-12 20:23:15,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 1188757504. Throughput: 0: 49029.8. Samples: 717647320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 20:23:15,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:23:18,186][71000] Updated weights for policy 0, policy_version 72564 (0.0032) [2024-06-12 20:23:20,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48605.7, 300 sec: 48707.7). Total num frames: 1188986880. Throughput: 0: 48746.1. Samples: 717788280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 20:23:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:23:21,876][71000] Updated weights for policy 0, policy_version 72574 (0.0029) [2024-06-12 20:23:24,674][71000] Updated weights for policy 0, policy_version 72584 (0.0026) [2024-06-12 20:23:25,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 1189281792. Throughput: 0: 48860.9. Samples: 718079100. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:23:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:23:28,870][71000] Updated weights for policy 0, policy_version 72594 (0.0033) [2024-06-12 20:23:30,940][70768] Fps is (10 sec: 52429.9, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 1189511168. Throughput: 0: 49069.4. Samples: 718375520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:23:30,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 20:23:31,282][71000] Updated weights for policy 0, policy_version 72604 (0.0023) [2024-06-12 20:23:35,368][71000] Updated weights for policy 0, policy_version 72614 (0.0031) [2024-06-12 20:23:35,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1189740544. Throughput: 0: 48767.1. Samples: 718517300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:23:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:23:38,257][71000] Updated weights for policy 0, policy_version 72624 (0.0026) [2024-06-12 20:23:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 48818.8). Total num frames: 1189986304. Throughput: 0: 48805.0. Samples: 718815940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:23:40,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 20:23:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000072631_1189986304.pth... [2024-06-12 20:23:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000071917_1178288128.pth [2024-06-12 20:23:42,077][71000] Updated weights for policy 0, policy_version 72634 (0.0022) [2024-06-12 20:23:44,961][71000] Updated weights for policy 0, policy_version 72644 (0.0034) [2024-06-12 20:23:45,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 1190264832. Throughput: 0: 48842.3. Samples: 719103100. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:23:45,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:23:48,600][71000] Updated weights for policy 0, policy_version 72654 (0.0026) [2024-06-12 20:23:50,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48332.8, 300 sec: 48874.6). Total num frames: 1190461440. Throughput: 0: 48980.5. Samples: 719259600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:23:50,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 20:23:51,539][71000] Updated weights for policy 0, policy_version 72664 (0.0027) [2024-06-12 20:23:55,463][71000] Updated weights for policy 0, policy_version 72674 (0.0027) [2024-06-12 20:23:55,940][70768] Fps is (10 sec: 44237.0, 60 sec: 48606.0, 300 sec: 48763.3). Total num frames: 1190707200. Throughput: 0: 48777.0. Samples: 719551100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:23:55,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 20:23:58,213][71000] Updated weights for policy 0, policy_version 72684 (0.0035) [2024-06-12 20:24:00,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1190952960. Throughput: 0: 48660.4. Samples: 719837040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:24:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:24:01,986][71000] Updated weights for policy 0, policy_version 72694 (0.0035) [2024-06-12 20:24:04,954][71000] Updated weights for policy 0, policy_version 72704 (0.0026) [2024-06-12 20:24:05,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1191215104. Throughput: 0: 48812.2. Samples: 719984820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:24:05,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:24:08,625][71000] Updated weights for policy 0, policy_version 72714 (0.0027) [2024-06-12 20:24:10,940][70768] Fps is (10 sec: 50789.1, 60 sec: 48878.8, 300 sec: 48874.3). 
Total num frames: 1191460864. Throughput: 0: 48973.1. Samples: 720282900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 20:24:10,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:24:11,591][71000] Updated weights for policy 0, policy_version 72724 (0.0038) [2024-06-12 20:24:15,333][71000] Updated weights for policy 0, policy_version 72734 (0.0027) [2024-06-12 20:24:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 1191690240. Throughput: 0: 48891.9. Samples: 720575660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 20:24:15,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:24:18,414][71000] Updated weights for policy 0, policy_version 72744 (0.0026) [2024-06-12 20:24:20,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.1, 300 sec: 48818.7). Total num frames: 1191936000. Throughput: 0: 48911.0. Samples: 720718300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 20:24:20,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:24:21,325][70980] Signal inference workers to stop experience collection... (10600 times) [2024-06-12 20:24:21,371][71000] InferenceWorker_p0-w0: stopping experience collection (10600 times) [2024-06-12 20:24:21,435][70980] Signal inference workers to resume experience collection... (10600 times) [2024-06-12 20:24:21,435][71000] InferenceWorker_p0-w0: resuming experience collection (10600 times) [2024-06-12 20:24:22,191][71000] Updated weights for policy 0, policy_version 72754 (0.0029) [2024-06-12 20:24:25,237][71000] Updated weights for policy 0, policy_version 72764 (0.0044) [2024-06-12 20:24:25,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 1192214528. Throughput: 0: 48752.4. Samples: 721009800. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 20:24:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:24:28,914][71000] Updated weights for policy 0, policy_version 72774 (0.0040) [2024-06-12 20:24:30,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1192427520. Throughput: 0: 48910.2. Samples: 721304060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 20:24:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:24:31,923][71000] Updated weights for policy 0, policy_version 72784 (0.0032) [2024-06-12 20:24:35,729][71000] Updated weights for policy 0, policy_version 72794 (0.0024) [2024-06-12 20:24:35,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48878.9, 300 sec: 48819.5). Total num frames: 1192673280. Throughput: 0: 48323.0. Samples: 721434140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 20:24:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:24:38,635][71000] Updated weights for policy 0, policy_version 72804 (0.0039) [2024-06-12 20:24:40,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48605.8, 300 sec: 48818.7). Total num frames: 1192902656. Throughput: 0: 48274.5. Samples: 721723460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 20:24:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:24:42,331][71000] Updated weights for policy 0, policy_version 72814 (0.0033) [2024-06-12 20:24:45,738][71000] Updated weights for policy 0, policy_version 72824 (0.0029) [2024-06-12 20:24:45,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 1193164800. Throughput: 0: 48402.2. Samples: 722015140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 20:24:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:24:49,154][71000] Updated weights for policy 0, policy_version 72834 (0.0027) [2024-06-12 20:24:50,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48332.7, 300 sec: 48652.1). 
Total num frames: 1193361408. Throughput: 0: 48363.8. Samples: 722161200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 20:24:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:24:52,225][71000] Updated weights for policy 0, policy_version 72844 (0.0034) [2024-06-12 20:24:55,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 1193623552. Throughput: 0: 48096.2. Samples: 722447220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 20:24:55,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:24:55,953][71000] Updated weights for policy 0, policy_version 72854 (0.0030) [2024-06-12 20:24:59,129][71000] Updated weights for policy 0, policy_version 72864 (0.0037) [2024-06-12 20:25:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48605.7, 300 sec: 48707.7). Total num frames: 1193869312. Throughput: 0: 48182.2. Samples: 722743860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 20:25:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:25:02,871][71000] Updated weights for policy 0, policy_version 72874 (0.0028) [2024-06-12 20:25:05,843][71000] Updated weights for policy 0, policy_version 72884 (0.0026) [2024-06-12 20:25:05,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48605.7, 300 sec: 48818.7). Total num frames: 1194131456. Throughput: 0: 48112.4. Samples: 722883360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 20:25:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:25:09,433][71000] Updated weights for policy 0, policy_version 72894 (0.0033) [2024-06-12 20:25:10,941][70768] Fps is (10 sec: 45871.6, 60 sec: 47786.1, 300 sec: 48596.5). Total num frames: 1194328064. Throughput: 0: 48262.2. Samples: 723181640. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 20:25:10,941][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:25:12,609][71000] Updated weights for policy 0, policy_version 72904 (0.0023) [2024-06-12 20:25:15,940][70768] Fps is (10 sec: 45876.1, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 1194590208. Throughput: 0: 48201.4. Samples: 723473120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 20:25:15,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:25:16,136][71000] Updated weights for policy 0, policy_version 72914 (0.0031) [2024-06-12 20:25:19,371][71000] Updated weights for policy 0, policy_version 72924 (0.0030) [2024-06-12 20:25:20,939][70768] Fps is (10 sec: 52434.1, 60 sec: 48606.0, 300 sec: 48763.3). Total num frames: 1194852352. Throughput: 0: 48542.4. Samples: 723618540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 20:25:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:25:22,890][71000] Updated weights for policy 0, policy_version 72934 (0.0023) [2024-06-12 20:25:25,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48059.9, 300 sec: 48763.2). Total num frames: 1195098112. Throughput: 0: 48737.6. Samples: 723916640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 20:25:25,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:25:26,021][71000] Updated weights for policy 0, policy_version 72944 (0.0034) [2024-06-12 20:25:29,682][71000] Updated weights for policy 0, policy_version 72954 (0.0026) [2024-06-12 20:25:30,940][70768] Fps is (10 sec: 47512.5, 60 sec: 48332.7, 300 sec: 48652.1). Total num frames: 1195327488. Throughput: 0: 48918.9. Samples: 724216500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 20:25:30,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:25:32,665][71000] Updated weights for policy 0, policy_version 72964 (0.0025) [2024-06-12 20:25:35,904][70980] Signal inference workers to stop experience collection... 
(10650 times) [2024-06-12 20:25:35,939][70768] Fps is (10 sec: 47513.4, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 1195573248. Throughput: 0: 48961.1. Samples: 724364440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 20:25:35,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:25:35,955][70980] Signal inference workers to resume experience collection... (10650 times) [2024-06-12 20:25:35,956][71000] InferenceWorker_p0-w0: stopping experience collection (10650 times) [2024-06-12 20:25:35,967][71000] InferenceWorker_p0-w0: resuming experience collection (10650 times) [2024-06-12 20:25:36,114][71000] Updated weights for policy 0, policy_version 72974 (0.0024) [2024-06-12 20:25:39,278][71000] Updated weights for policy 0, policy_version 72984 (0.0030) [2024-06-12 20:25:40,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1195851776. Throughput: 0: 49123.5. Samples: 724657780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 20:25:40,944][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:25:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000072989_1195851776.pth... [2024-06-12 20:25:41,007][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000072273_1184120832.pth [2024-06-12 20:25:42,953][71000] Updated weights for policy 0, policy_version 72994 (0.0036) [2024-06-12 20:25:45,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 1196064768. Throughput: 0: 49015.4. Samples: 724949540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 20:25:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:25:46,195][71000] Updated weights for policy 0, policy_version 73004 (0.0032) [2024-06-12 20:25:49,755][71000] Updated weights for policy 0, policy_version 73014 (0.0031) [2024-06-12 20:25:50,939][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.2, 300 sec: 48707.7). 
Total num frames: 1196310528. Throughput: 0: 48979.8. Samples: 725087440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 20:25:50,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:25:53,202][71000] Updated weights for policy 0, policy_version 73024 (0.0031) [2024-06-12 20:25:55,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1196556288. Throughput: 0: 48749.4. Samples: 725375320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 20:25:55,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 20:25:56,422][71000] Updated weights for policy 0, policy_version 73034 (0.0031) [2024-06-12 20:25:59,966][71000] Updated weights for policy 0, policy_version 73044 (0.0030) [2024-06-12 20:26:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 1196802048. Throughput: 0: 48814.2. Samples: 725669760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 20:26:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:26:02,912][71000] Updated weights for policy 0, policy_version 73054 (0.0034) [2024-06-12 20:26:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 1197031424. Throughput: 0: 48718.5. Samples: 725810880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 20:26:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:26:06,708][71000] Updated weights for policy 0, policy_version 73064 (0.0034) [2024-06-12 20:26:09,681][71000] Updated weights for policy 0, policy_version 73074 (0.0026) [2024-06-12 20:26:10,941][70768] Fps is (10 sec: 47507.7, 60 sec: 49151.7, 300 sec: 48707.5). Total num frames: 1197277184. Throughput: 0: 48810.1. Samples: 726113160. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 20:26:10,941][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:26:13,300][71000] Updated weights for policy 0, policy_version 73084 (0.0037) [2024-06-12 20:26:15,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49151.9, 300 sec: 48818.8). Total num frames: 1197539328. Throughput: 0: 48322.7. Samples: 726391020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 20:26:15,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:26:16,607][71000] Updated weights for policy 0, policy_version 73094 (0.0039) [2024-06-12 20:26:20,305][71000] Updated weights for policy 0, policy_version 73104 (0.0029) [2024-06-12 20:26:20,940][70768] Fps is (10 sec: 49158.1, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1197768704. Throughput: 0: 48320.4. Samples: 726538860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 20:26:20,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:26:23,148][71000] Updated weights for policy 0, policy_version 73114 (0.0026) [2024-06-12 20:26:25,940][70768] Fps is (10 sec: 45871.6, 60 sec: 48332.0, 300 sec: 48652.0). Total num frames: 1197998080. Throughput: 0: 48592.4. Samples: 726844480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 20:26:25,941][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:26:27,002][71000] Updated weights for policy 0, policy_version 73124 (0.0029) [2024-06-12 20:26:30,163][71000] Updated weights for policy 0, policy_version 73134 (0.0032) [2024-06-12 20:26:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48606.0, 300 sec: 48652.2). Total num frames: 1198243840. Throughput: 0: 48283.5. Samples: 727122300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 20:26:30,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:26:33,623][71000] Updated weights for policy 0, policy_version 73144 (0.0021) [2024-06-12 20:26:35,940][70768] Fps is (10 sec: 50795.0, 60 sec: 48878.9, 300 sec: 48763.2). 
Total num frames: 1198505984. Throughput: 0: 48619.0. Samples: 727275300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 20:26:35,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:26:36,883][71000] Updated weights for policy 0, policy_version 73154 (0.0032) [2024-06-12 20:26:40,335][71000] Updated weights for policy 0, policy_version 73164 (0.0035) [2024-06-12 20:26:40,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48059.6, 300 sec: 48707.7). Total num frames: 1198735360. Throughput: 0: 48743.0. Samples: 727568760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 20:26:40,944][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:26:43,528][71000] Updated weights for policy 0, policy_version 73174 (0.0026) [2024-06-12 20:26:45,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48332.8, 300 sec: 48596.6). Total num frames: 1198964736. Throughput: 0: 48337.4. Samples: 727844940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 20:26:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:26:47,274][71000] Updated weights for policy 0, policy_version 73184 (0.0026) [2024-06-12 20:26:50,145][71000] Updated weights for policy 0, policy_version 73194 (0.0032) [2024-06-12 20:26:50,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.7, 300 sec: 48763.2). Total num frames: 1199226880. Throughput: 0: 48719.0. Samples: 728003240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 20:26:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:26:53,971][71000] Updated weights for policy 0, policy_version 73204 (0.0033) [2024-06-12 20:26:55,940][70768] Fps is (10 sec: 52427.7, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 1199489024. Throughput: 0: 48510.1. Samples: 728296060. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 20:26:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 20:26:57,078][71000] Updated weights for policy 0, policy_version 73214 (0.0033) [2024-06-12 20:27:00,703][71000] Updated weights for policy 0, policy_version 73224 (0.0022) [2024-06-12 20:27:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 1199718400. Throughput: 0: 48996.0. Samples: 728595840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 20:27:00,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:27:01,368][70980] Signal inference workers to stop experience collection... (10700 times) [2024-06-12 20:27:01,402][71000] InferenceWorker_p0-w0: stopping experience collection (10700 times) [2024-06-12 20:27:01,422][70980] Signal inference workers to resume experience collection... (10700 times) [2024-06-12 20:27:01,424][71000] InferenceWorker_p0-w0: resuming experience collection (10700 times) [2024-06-12 20:27:03,703][71000] Updated weights for policy 0, policy_version 73234 (0.0034) [2024-06-12 20:27:05,940][70768] Fps is (10 sec: 45876.1, 60 sec: 48606.0, 300 sec: 48541.1). Total num frames: 1199947776. Throughput: 0: 48833.8. Samples: 728736380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 20:27:05,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 20:27:07,115][71000] Updated weights for policy 0, policy_version 73244 (0.0024) [2024-06-12 20:27:10,387][71000] Updated weights for policy 0, policy_version 73254 (0.0025) [2024-06-12 20:27:10,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.9, 300 sec: 48763.2). Total num frames: 1200209920. Throughput: 0: 48517.4. Samples: 729027720. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 20:27:10,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:27:13,946][71000] Updated weights for policy 0, policy_version 73264 (0.0028) [2024-06-12 20:27:15,940][70768] Fps is (10 sec: 54066.8, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1200488448. Throughput: 0: 49155.5. Samples: 729334300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 20:27:15,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 20:27:16,907][71000] Updated weights for policy 0, policy_version 73274 (0.0028) [2024-06-12 20:27:20,732][71000] Updated weights for policy 0, policy_version 73284 (0.0027) [2024-06-12 20:27:20,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.8, 300 sec: 48652.1). Total num frames: 1200685056. Throughput: 0: 48931.0. Samples: 729477200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 20:27:20,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:27:23,766][71000] Updated weights for policy 0, policy_version 73294 (0.0027) [2024-06-12 20:27:25,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.8, 300 sec: 48707.7). Total num frames: 1200947200. Throughput: 0: 48868.2. Samples: 729767820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 20:27:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:27:27,575][71000] Updated weights for policy 0, policy_version 73304 (0.0033) [2024-06-12 20:27:30,246][71000] Updated weights for policy 0, policy_version 73314 (0.0026) [2024-06-12 20:27:30,939][70768] Fps is (10 sec: 52429.9, 60 sec: 49425.1, 300 sec: 48763.2). Total num frames: 1201209344. Throughput: 0: 49210.7. Samples: 730059420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 20:27:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:27:33,991][71000] Updated weights for policy 0, policy_version 73324 (0.0036) [2024-06-12 20:27:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 48818.8). 
Total num frames: 1201438720. Throughput: 0: 49105.5. Samples: 730212980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 20:27:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:27:37,107][71000] Updated weights for policy 0, policy_version 73334 (0.0031) [2024-06-12 20:27:40,912][71000] Updated weights for policy 0, policy_version 73344 (0.0029) [2024-06-12 20:27:40,939][70768] Fps is (10 sec: 45875.2, 60 sec: 48879.1, 300 sec: 48707.7). Total num frames: 1201668096. Throughput: 0: 49039.8. Samples: 730502840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 20:27:40,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 20:27:41,047][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000073345_1201684480.pth... [2024-06-12 20:27:41,097][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000072631_1189986304.pth [2024-06-12 20:27:43,728][71000] Updated weights for policy 0, policy_version 73354 (0.0038) [2024-06-12 20:27:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 48707.7). Total num frames: 1201930240. Throughput: 0: 48840.1. Samples: 730793640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 20:27:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:27:47,720][71000] Updated weights for policy 0, policy_version 73364 (0.0028) [2024-06-12 20:27:50,335][71000] Updated weights for policy 0, policy_version 73374 (0.0027) [2024-06-12 20:27:50,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.2, 300 sec: 48818.8). Total num frames: 1202192384. Throughput: 0: 49064.0. Samples: 730944260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-12 20:27:50,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:27:54,222][71000] Updated weights for policy 0, policy_version 73384 (0.0025) [2024-06-12 20:27:55,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1202405376. 
Throughput: 0: 49230.6. Samples: 731243100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 20:27:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:27:57,076][71000] Updated weights for policy 0, policy_version 73394 (0.0028) [2024-06-12 20:28:00,749][71000] Updated weights for policy 0, policy_version 73404 (0.0022) [2024-06-12 20:28:00,939][70768] Fps is (10 sec: 45875.4, 60 sec: 48879.1, 300 sec: 48707.7). Total num frames: 1202651136. Throughput: 0: 48997.4. Samples: 731539180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 20:28:00,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:28:03,517][71000] Updated weights for policy 0, policy_version 73414 (0.0033) [2024-06-12 20:28:05,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49698.0, 300 sec: 48818.8). Total num frames: 1202929664. Throughput: 0: 49030.7. Samples: 731683580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 20:28:05,940][70768] Avg episode reward: [(0, '0.244')] [2024-06-12 20:28:07,307][71000] Updated weights for policy 0, policy_version 73424 (0.0034) [2024-06-12 20:28:10,256][71000] Updated weights for policy 0, policy_version 73434 (0.0034) [2024-06-12 20:28:10,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.1, 300 sec: 48874.3). Total num frames: 1203175424. Throughput: 0: 49216.4. Samples: 731982560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 20:28:10,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:28:14,068][70980] Signal inference workers to stop experience collection... (10750 times) [2024-06-12 20:28:14,068][70980] Signal inference workers to resume experience collection... 
(10750 times) [2024-06-12 20:28:14,087][71000] InferenceWorker_p0-w0: stopping experience collection (10750 times) [2024-06-12 20:28:14,087][71000] InferenceWorker_p0-w0: resuming experience collection (10750 times) [2024-06-12 20:28:14,219][71000] Updated weights for policy 0, policy_version 73444 (0.0025) [2024-06-12 20:28:15,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 1203388416. Throughput: 0: 49200.8. Samples: 732273460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 20:28:15,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:28:16,700][71000] Updated weights for policy 0, policy_version 73454 (0.0030) [2024-06-12 20:28:20,595][71000] Updated weights for policy 0, policy_version 73464 (0.0030) [2024-06-12 20:28:20,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.1, 300 sec: 48652.1). Total num frames: 1203634176. Throughput: 0: 49109.2. Samples: 732422900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 20:28:20,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:28:23,398][71000] Updated weights for policy 0, policy_version 73474 (0.0025) [2024-06-12 20:28:25,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.1, 300 sec: 48818.8). Total num frames: 1203912704. Throughput: 0: 48984.4. Samples: 732707140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 20:28:25,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:28:27,691][71000] Updated weights for policy 0, policy_version 73484 (0.0024) [2024-06-12 20:28:30,393][71000] Updated weights for policy 0, policy_version 73494 (0.0024) [2024-06-12 20:28:30,939][70768] Fps is (10 sec: 50791.1, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1204142080. Throughput: 0: 49058.3. Samples: 733001260. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 20:28:30,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 20:28:34,315][71000] Updated weights for policy 0, policy_version 73504 (0.0027) [2024-06-12 20:28:35,940][70768] Fps is (10 sec: 44236.8, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1204355072. Throughput: 0: 49066.7. Samples: 733152260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 20:28:35,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 20:28:37,036][71000] Updated weights for policy 0, policy_version 73514 (0.0031) [2024-06-12 20:28:40,649][71000] Updated weights for policy 0, policy_version 73524 (0.0042) [2024-06-12 20:28:40,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49424.9, 300 sec: 48707.7). Total num frames: 1204633600. Throughput: 0: 49032.5. Samples: 733449560. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-12 20:28:40,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:28:43,412][71000] Updated weights for policy 0, policy_version 73534 (0.0029) [2024-06-12 20:28:45,939][70768] Fps is (10 sec: 50790.6, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1204862976. Throughput: 0: 49017.8. Samples: 733744980. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-12 20:28:45,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:28:47,329][71000] Updated weights for policy 0, policy_version 73544 (0.0029) [2024-06-12 20:28:50,273][71000] Updated weights for policy 0, policy_version 73554 (0.0028) [2024-06-12 20:28:50,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1205125120. Throughput: 0: 49206.8. Samples: 733897880. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-12 20:28:50,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 20:28:54,024][71000] Updated weights for policy 0, policy_version 73564 (0.0028) [2024-06-12 20:28:55,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.3, 300 sec: 48874.3). 
Total num frames: 1205370880. Throughput: 0: 49159.7. Samples: 734194740. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-12 20:28:55,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:28:56,916][71000] Updated weights for policy 0, policy_version 73574 (0.0030) [2024-06-12 20:29:00,485][71000] Updated weights for policy 0, policy_version 73584 (0.0030) [2024-06-12 20:29:00,944][70768] Fps is (10 sec: 50768.7, 60 sec: 49694.5, 300 sec: 48873.6). Total num frames: 1205633024. Throughput: 0: 49306.4. Samples: 734492460. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-12 20:29:00,944][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:29:03,327][71000] Updated weights for policy 0, policy_version 73594 (0.0030) [2024-06-12 20:29:05,939][70768] Fps is (10 sec: 47513.3, 60 sec: 48606.0, 300 sec: 48763.3). Total num frames: 1205846016. Throughput: 0: 49169.9. Samples: 734635540. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-12 20:29:05,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 20:29:06,864][71000] Updated weights for policy 0, policy_version 73604 (0.0035) [2024-06-12 20:29:10,146][71000] Updated weights for policy 0, policy_version 73614 (0.0024) [2024-06-12 20:29:10,940][70768] Fps is (10 sec: 47533.9, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1206108160. Throughput: 0: 49443.1. Samples: 734932080. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-12 20:29:10,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 20:29:13,770][71000] Updated weights for policy 0, policy_version 73624 (0.0035) [2024-06-12 20:29:15,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 1206353920. Throughput: 0: 49513.2. Samples: 735229360. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-12 20:29:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:29:17,097][71000] Updated weights for policy 0, policy_version 73634 (0.0028) [2024-06-12 20:29:20,458][71000] Updated weights for policy 0, policy_version 73644 (0.0026) [2024-06-12 20:29:20,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.1, 300 sec: 48707.7). Total num frames: 1206583296. Throughput: 0: 49315.6. Samples: 735371460. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-12 20:29:20,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:29:23,860][71000] Updated weights for policy 0, policy_version 73654 (0.0035) [2024-06-12 20:29:24,636][70980] Signal inference workers to stop experience collection... (10800 times) [2024-06-12 20:29:24,691][71000] InferenceWorker_p0-w0: stopping experience collection (10800 times) [2024-06-12 20:29:24,756][70980] Signal inference workers to resume experience collection... (10800 times) [2024-06-12 20:29:24,756][71000] InferenceWorker_p0-w0: resuming experience collection (10800 times) [2024-06-12 20:29:25,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.8, 300 sec: 48818.7). Total num frames: 1206829056. Throughput: 0: 49107.6. Samples: 735659400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 20:29:25,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:29:27,059][71000] Updated weights for policy 0, policy_version 73664 (0.0023) [2024-06-12 20:29:30,317][71000] Updated weights for policy 0, policy_version 73674 (0.0027) [2024-06-12 20:29:30,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 1207091200. Throughput: 0: 49087.0. Samples: 735953900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 20:29:30,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:29:34,031][71000] Updated weights for policy 0, policy_version 73684 (0.0026) [2024-06-12 20:29:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.1, 300 sec: 48929.9). Total num frames: 1207336960. Throughput: 0: 49080.4. Samples: 736106500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 20:29:35,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:29:37,043][71000] Updated weights for policy 0, policy_version 73694 (0.0034) [2024-06-12 20:29:40,541][71000] Updated weights for policy 0, policy_version 73704 (0.0036) [2024-06-12 20:29:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1207566336. Throughput: 0: 48951.0. Samples: 736397540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 20:29:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:29:41,020][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000073705_1207582720.pth... [2024-06-12 20:29:41,069][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000072989_1195851776.pth [2024-06-12 20:29:43,703][71000] Updated weights for policy 0, policy_version 73714 (0.0042) [2024-06-12 20:29:45,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.8, 300 sec: 48985.4). Total num frames: 1207812096. Throughput: 0: 48901.0. Samples: 736692800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 20:29:45,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:29:47,304][71000] Updated weights for policy 0, policy_version 73724 (0.0028) [2024-06-12 20:29:50,887][71000] Updated weights for policy 0, policy_version 73734 (0.0043) [2024-06-12 20:29:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 1208057856. Throughput: 0: 48885.1. Samples: 736835380. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 20:29:50,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:29:54,010][71000] Updated weights for policy 0, policy_version 73744 (0.0029) [2024-06-12 20:29:55,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.8, 300 sec: 48929.9). Total num frames: 1208303616. Throughput: 0: 48714.1. Samples: 737124220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 20:29:55,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:29:57,358][71000] Updated weights for policy 0, policy_version 73754 (0.0029) [2024-06-12 20:30:00,879][71000] Updated weights for policy 0, policy_version 73764 (0.0030) [2024-06-12 20:30:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48609.2, 300 sec: 48874.3). Total num frames: 1208549376. Throughput: 0: 48796.8. Samples: 737425220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 20:30:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:30:04,001][71000] Updated weights for policy 0, policy_version 73774 (0.0039) [2024-06-12 20:30:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49151.9, 300 sec: 49041.1). Total num frames: 1208795136. Throughput: 0: 48701.2. Samples: 737563020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 20:30:05,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:30:07,382][71000] Updated weights for policy 0, policy_version 73784 (0.0032) [2024-06-12 20:30:10,811][71000] Updated weights for policy 0, policy_version 73794 (0.0034) [2024-06-12 20:30:10,939][70768] Fps is (10 sec: 49153.1, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 1209040896. Throughput: 0: 48856.6. Samples: 737857940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-12 20:30:10,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 20:30:14,691][71000] Updated weights for policy 0, policy_version 73804 (0.0035) [2024-06-12 20:30:15,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48879.0, 300 sec: 48929.8). 
Total num frames: 1209286656. Throughput: 0: 48784.5. Samples: 738149200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-12 20:30:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:30:17,453][71000] Updated weights for policy 0, policy_version 73814 (0.0026) [2024-06-12 20:30:20,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1209516032. Throughput: 0: 48692.0. Samples: 738297640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-12 20:30:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:30:21,247][71000] Updated weights for policy 0, policy_version 73824 (0.0028) [2024-06-12 20:30:24,062][71000] Updated weights for policy 0, policy_version 73834 (0.0029) [2024-06-12 20:30:25,135][70980] Signal inference workers to stop experience collection... (10850 times) [2024-06-12 20:30:25,135][70980] Signal inference workers to resume experience collection... (10850 times) [2024-06-12 20:30:25,176][71000] InferenceWorker_p0-w0: stopping experience collection (10850 times) [2024-06-12 20:30:25,176][71000] InferenceWorker_p0-w0: resuming experience collection (10850 times) [2024-06-12 20:30:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 1209778176. Throughput: 0: 48576.0. Samples: 738583460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-12 20:30:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:30:28,154][71000] Updated weights for policy 0, policy_version 73844 (0.0029) [2024-06-12 20:30:30,718][71000] Updated weights for policy 0, policy_version 73854 (0.0023) [2024-06-12 20:30:30,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 1210023936. Throughput: 0: 48505.8. Samples: 738875560. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-12 20:30:30,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:30:34,657][71000] Updated weights for policy 0, policy_version 73864 (0.0030) [2024-06-12 20:30:35,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48332.9, 300 sec: 48763.2). Total num frames: 1210236928. Throughput: 0: 48830.0. Samples: 739032720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-12 20:30:35,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:30:37,466][71000] Updated weights for policy 0, policy_version 73874 (0.0025) [2024-06-12 20:30:40,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1210482688. Throughput: 0: 48790.2. Samples: 739319780. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-12 20:30:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:30:41,347][71000] Updated weights for policy 0, policy_version 73884 (0.0024) [2024-06-12 20:30:44,325][71000] Updated weights for policy 0, policy_version 73894 (0.0026) [2024-06-12 20:30:45,939][70768] Fps is (10 sec: 50790.4, 60 sec: 48879.1, 300 sec: 48929.8). Total num frames: 1210744832. Throughput: 0: 48434.0. Samples: 739604740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-12 20:30:45,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:30:48,307][71000] Updated weights for policy 0, policy_version 73904 (0.0030) [2024-06-12 20:30:50,940][70768] Fps is (10 sec: 50791.2, 60 sec: 48879.1, 300 sec: 48929.9). Total num frames: 1210990592. Throughput: 0: 48703.2. Samples: 739754660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-12 20:30:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:30:51,111][71000] Updated weights for policy 0, policy_version 73914 (0.0047) [2024-06-12 20:30:54,955][71000] Updated weights for policy 0, policy_version 73924 (0.0026) [2024-06-12 20:30:55,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48606.0, 300 sec: 48874.3). 
Total num frames: 1211219968. Throughput: 0: 48659.5. Samples: 740047620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 20:30:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:30:57,632][71000] Updated weights for policy 0, policy_version 73934 (0.0034) [2024-06-12 20:31:00,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 1211449344. Throughput: 0: 48649.2. Samples: 740338420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 20:31:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 20:31:01,479][71000] Updated weights for policy 0, policy_version 73944 (0.0024) [2024-06-12 20:31:04,420][71000] Updated weights for policy 0, policy_version 73954 (0.0037) [2024-06-12 20:31:05,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 48930.1). Total num frames: 1211711488. Throughput: 0: 48677.0. Samples: 740488100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 20:31:05,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:31:08,383][71000] Updated weights for policy 0, policy_version 73964 (0.0035) [2024-06-12 20:31:10,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 1211973632. Throughput: 0: 48835.5. Samples: 740781060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 20:31:10,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:31:11,198][71000] Updated weights for policy 0, policy_version 73974 (0.0025) [2024-06-12 20:31:15,042][71000] Updated weights for policy 0, policy_version 73984 (0.0033) [2024-06-12 20:31:15,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48332.7, 300 sec: 48874.3). Total num frames: 1212186624. Throughput: 0: 48806.8. Samples: 741071860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 20:31:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:31:17,797][71000] Updated weights for policy 0, policy_version 73994 (0.0035) [2024-06-12 20:31:20,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48332.8, 300 sec: 48874.4). Total num frames: 1212416000. Throughput: 0: 48433.2. Samples: 741212220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 20:31:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:31:21,758][71000] Updated weights for policy 0, policy_version 74004 (0.0040) [2024-06-12 20:31:24,874][71000] Updated weights for policy 0, policy_version 74014 (0.0033) [2024-06-12 20:31:25,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 1212694528. Throughput: 0: 48532.6. Samples: 741503740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 20:31:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:31:28,733][71000] Updated weights for policy 0, policy_version 74024 (0.0025) [2024-06-12 20:31:30,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 1212923904. Throughput: 0: 48574.6. Samples: 741790600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 20:31:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:31:31,650][71000] Updated weights for policy 0, policy_version 74034 (0.0037) [2024-06-12 20:31:35,415][71000] Updated weights for policy 0, policy_version 74044 (0.0027) [2024-06-12 20:31:35,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1213153280. Throughput: 0: 48505.7. Samples: 741937420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 20:31:35,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:31:38,328][71000] Updated weights for policy 0, policy_version 74054 (0.0026) [2024-06-12 20:31:38,343][70980] Signal inference workers to stop experience collection... 
(10900 times) [2024-06-12 20:31:38,343][70980] Signal inference workers to resume experience collection... (10900 times) [2024-06-12 20:31:38,376][71000] InferenceWorker_p0-w0: stopping experience collection (10900 times) [2024-06-12 20:31:38,376][71000] InferenceWorker_p0-w0: resuming experience collection (10900 times) [2024-06-12 20:31:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48606.0, 300 sec: 48929.8). Total num frames: 1213399040. Throughput: 0: 48478.6. Samples: 742229160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:31:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:31:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000074060_1213399040.pth... [2024-06-12 20:31:40,989][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000073345_1201684480.pth [2024-06-12 20:31:41,849][71000] Updated weights for policy 0, policy_version 74064 (0.0021) [2024-06-12 20:31:44,786][71000] Updated weights for policy 0, policy_version 74074 (0.0024) [2024-06-12 20:31:45,940][70768] Fps is (10 sec: 52428.2, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 1213677568. Throughput: 0: 48683.0. Samples: 742529160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:31:45,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:31:48,471][71000] Updated weights for policy 0, policy_version 74084 (0.0040) [2024-06-12 20:31:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1213906944. Throughput: 0: 48675.9. Samples: 742678520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:31:50,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:31:51,695][71000] Updated weights for policy 0, policy_version 74094 (0.0041) [2024-06-12 20:31:55,403][71000] Updated weights for policy 0, policy_version 74104 (0.0049) [2024-06-12 20:31:55,940][70768] Fps is (10 sec: 44237.3, 60 sec: 48332.8, 300 sec: 48818.8). 
Total num frames: 1214119936. Throughput: 0: 48624.5. Samples: 742969160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:31:55,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 20:31:58,472][71000] Updated weights for policy 0, policy_version 74114 (0.0031) [2024-06-12 20:32:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 1214382080. Throughput: 0: 48464.3. Samples: 743252760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:32:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:32:02,068][71000] Updated weights for policy 0, policy_version 74124 (0.0025) [2024-06-12 20:32:04,915][71000] Updated weights for policy 0, policy_version 74134 (0.0027) [2024-06-12 20:32:05,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 1214644224. Throughput: 0: 48896.6. Samples: 743412560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:32:05,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:32:08,706][71000] Updated weights for policy 0, policy_version 74144 (0.0032) [2024-06-12 20:32:10,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 1214873600. Throughput: 0: 48858.6. Samples: 743702380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:32:10,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:32:11,909][71000] Updated weights for policy 0, policy_version 74154 (0.0040) [2024-06-12 20:32:15,093][71000] Updated weights for policy 0, policy_version 74164 (0.0028) [2024-06-12 20:32:15,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 1215119360. Throughput: 0: 48954.2. Samples: 743993540. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:32:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:32:18,344][71000] Updated weights for policy 0, policy_version 74174 (0.0026) [2024-06-12 20:32:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1215365120. Throughput: 0: 48989.4. Samples: 744141940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:32:20,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:32:21,994][71000] Updated weights for policy 0, policy_version 74184 (0.0027) [2024-06-12 20:32:25,211][71000] Updated weights for policy 0, policy_version 74194 (0.0028) [2024-06-12 20:32:25,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1215627264. Throughput: 0: 49062.2. Samples: 744436960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 20:32:25,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:32:28,579][71000] Updated weights for policy 0, policy_version 74204 (0.0033) [2024-06-12 20:32:30,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 1215856640. Throughput: 0: 49025.7. Samples: 744735320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 20:32:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:32:31,678][71000] Updated weights for policy 0, policy_version 74214 (0.0030) [2024-06-12 20:32:35,366][71000] Updated weights for policy 0, policy_version 74224 (0.0027) [2024-06-12 20:32:35,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1216086016. Throughput: 0: 48766.4. Samples: 744873000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 20:32:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:32:38,553][71000] Updated weights for policy 0, policy_version 74234 (0.0036) [2024-06-12 20:32:40,940][70768] Fps is (10 sec: 49153.0, 60 sec: 49152.0, 300 sec: 48874.3). 
Total num frames: 1216348160. Throughput: 0: 48857.8. Samples: 745167760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 20:32:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:32:41,971][71000] Updated weights for policy 0, policy_version 74244 (0.0036) [2024-06-12 20:32:43,602][70980] Signal inference workers to stop experience collection... (10950 times) [2024-06-12 20:32:43,602][70980] Signal inference workers to resume experience collection... (10950 times) [2024-06-12 20:32:43,618][71000] InferenceWorker_p0-w0: stopping experience collection (10950 times) [2024-06-12 20:32:43,619][71000] InferenceWorker_p0-w0: resuming experience collection (10950 times) [2024-06-12 20:32:45,195][71000] Updated weights for policy 0, policy_version 74254 (0.0045) [2024-06-12 20:32:45,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1216610304. Throughput: 0: 48927.3. Samples: 745454480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 20:32:45,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:32:49,077][71000] Updated weights for policy 0, policy_version 74264 (0.0028) [2024-06-12 20:32:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1216823296. Throughput: 0: 48821.8. Samples: 745609540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 20:32:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:32:51,762][71000] Updated weights for policy 0, policy_version 74274 (0.0027) [2024-06-12 20:32:55,718][71000] Updated weights for policy 0, policy_version 74284 (0.0022) [2024-06-12 20:32:55,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 1217085440. Throughput: 0: 49092.0. Samples: 745911520. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 20:32:55,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 20:32:58,263][71000] Updated weights for policy 0, policy_version 74294 (0.0028) [2024-06-12 20:33:00,940][70768] Fps is (10 sec: 52427.7, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 1217347584. Throughput: 0: 49495.3. Samples: 746220840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 20:33:00,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:33:01,889][71000] Updated weights for policy 0, policy_version 74304 (0.0026) [2024-06-12 20:33:04,587][71000] Updated weights for policy 0, policy_version 74314 (0.0028) [2024-06-12 20:33:05,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 1217609728. Throughput: 0: 49578.2. Samples: 746372960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 20:33:05,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 20:33:08,255][71000] Updated weights for policy 0, policy_version 74324 (0.0030) [2024-06-12 20:33:10,939][70768] Fps is (10 sec: 52430.1, 60 sec: 49971.3, 300 sec: 49096.5). Total num frames: 1217871872. Throughput: 0: 49893.0. Samples: 746682140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 20:33:10,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 20:33:11,062][71000] Updated weights for policy 0, policy_version 74334 (0.0025) [2024-06-12 20:33:14,915][71000] Updated weights for policy 0, policy_version 74344 (0.0038) [2024-06-12 20:33:15,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49424.9, 300 sec: 48985.4). Total num frames: 1218084864. Throughput: 0: 49781.8. Samples: 746975500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 20:33:15,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 20:33:17,575][71000] Updated weights for policy 0, policy_version 74354 (0.0030) [2024-06-12 20:33:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.1, 300 sec: 48929.8). 
Total num frames: 1218347008. Throughput: 0: 49937.7. Samples: 747120200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 20:33:20,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 20:33:21,508][71000] Updated weights for policy 0, policy_version 74364 (0.0029) [2024-06-12 20:33:24,334][71000] Updated weights for policy 0, policy_version 74374 (0.0023) [2024-06-12 20:33:25,939][70768] Fps is (10 sec: 52430.1, 60 sec: 49698.2, 300 sec: 49040.9). Total num frames: 1218609152. Throughput: 0: 50062.3. Samples: 747420560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 20:33:25,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 20:33:28,193][71000] Updated weights for policy 0, policy_version 74384 (0.0031) [2024-06-12 20:33:30,940][70768] Fps is (10 sec: 52428.7, 60 sec: 50244.4, 300 sec: 49207.5). Total num frames: 1218871296. Throughput: 0: 50273.7. Samples: 747716800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 20:33:30,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 20:33:30,941][71000] Updated weights for policy 0, policy_version 74394 (0.0029) [2024-06-12 20:33:34,881][71000] Updated weights for policy 0, policy_version 74404 (0.0031) [2024-06-12 20:33:35,939][70768] Fps is (10 sec: 44236.7, 60 sec: 49425.1, 300 sec: 48874.3). Total num frames: 1219051520. Throughput: 0: 50012.5. Samples: 747860100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 20:33:35,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:33:37,922][71000] Updated weights for policy 0, policy_version 74414 (0.0032) [2024-06-12 20:33:40,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49698.1, 300 sec: 49040.9). Total num frames: 1219330048. Throughput: 0: 49630.7. Samples: 748144900. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 20:33:40,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 20:33:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000074422_1219330048.pth... [2024-06-12 20:33:40,993][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000073705_1207582720.pth [2024-06-12 20:33:41,474][71000] Updated weights for policy 0, policy_version 74424 (0.0036) [2024-06-12 20:33:44,538][71000] Updated weights for policy 0, policy_version 74434 (0.0028) [2024-06-12 20:33:45,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49698.2, 300 sec: 49040.9). Total num frames: 1219592192. Throughput: 0: 49496.3. Samples: 748448160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 20:33:45,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:33:48,039][71000] Updated weights for policy 0, policy_version 74444 (0.0023) [2024-06-12 20:33:50,940][70768] Fps is (10 sec: 50789.7, 60 sec: 50244.2, 300 sec: 49040.9). Total num frames: 1219837952. Throughput: 0: 49610.5. Samples: 748605440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 20:33:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:33:51,128][71000] Updated weights for policy 0, policy_version 74454 (0.0031) [2024-06-12 20:33:51,494][70980] Signal inference workers to stop experience collection... (11000 times) [2024-06-12 20:33:51,494][70980] Signal inference workers to resume experience collection... (11000 times) [2024-06-12 20:33:51,509][71000] InferenceWorker_p0-w0: stopping experience collection (11000 times) [2024-06-12 20:33:51,510][71000] InferenceWorker_p0-w0: resuming experience collection (11000 times) [2024-06-12 20:33:54,607][71000] Updated weights for policy 0, policy_version 74464 (0.0036) [2024-06-12 20:33:55,942][70768] Fps is (10 sec: 47502.2, 60 sec: 49696.2, 300 sec: 48930.2). Total num frames: 1220067328. Throughput: 0: 49396.0. Samples: 748905080. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-12 20:33:55,943][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 20:33:57,731][71000] Updated weights for policy 0, policy_version 74474 (0.0024) [2024-06-12 20:34:00,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49698.4, 300 sec: 49096.5). Total num frames: 1220329472. Throughput: 0: 49532.7. Samples: 749204460. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-12 20:34:00,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:34:01,034][71000] Updated weights for policy 0, policy_version 74484 (0.0020) [2024-06-12 20:34:04,326][71000] Updated weights for policy 0, policy_version 74494 (0.0029) [2024-06-12 20:34:05,940][70768] Fps is (10 sec: 50802.3, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 1220575232. Throughput: 0: 49544.9. Samples: 749349720. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-12 20:34:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:34:07,647][71000] Updated weights for policy 0, policy_version 74504 (0.0025) [2024-06-12 20:34:10,876][71000] Updated weights for policy 0, policy_version 74514 (0.0034) [2024-06-12 20:34:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 1220837376. Throughput: 0: 49552.8. Samples: 749650440. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-12 20:34:10,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:34:14,356][71000] Updated weights for policy 0, policy_version 74524 (0.0026) [2024-06-12 20:34:15,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.3, 300 sec: 49040.9). Total num frames: 1221050368. Throughput: 0: 49449.5. Samples: 749942020. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-12 20:34:15,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:34:17,699][71000] Updated weights for policy 0, policy_version 74534 (0.0029) [2024-06-12 20:34:20,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49424.9, 300 sec: 49096.5). 
Total num frames: 1221312512. Throughput: 0: 49407.3. Samples: 750083440. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-12 20:34:20,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:34:21,245][71000] Updated weights for policy 0, policy_version 74544 (0.0028) [2024-06-12 20:34:24,570][71000] Updated weights for policy 0, policy_version 74554 (0.0027) [2024-06-12 20:34:25,940][70768] Fps is (10 sec: 50788.2, 60 sec: 49151.7, 300 sec: 49040.9). Total num frames: 1221558272. Throughput: 0: 49490.3. Samples: 750371980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-12 20:34:25,941][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:34:28,250][71000] Updated weights for policy 0, policy_version 74564 (0.0028) [2024-06-12 20:34:30,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 1221787648. Throughput: 0: 49463.5. Samples: 750674020. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-12 20:34:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:34:31,215][71000] Updated weights for policy 0, policy_version 74574 (0.0028) [2024-06-12 20:34:34,482][71000] Updated weights for policy 0, policy_version 74584 (0.0034) [2024-06-12 20:34:35,939][70768] Fps is (10 sec: 49153.8, 60 sec: 49971.2, 300 sec: 49096.5). Total num frames: 1222049792. Throughput: 0: 49260.7. Samples: 750822160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-12 20:34:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:34:37,661][71000] Updated weights for policy 0, policy_version 74594 (0.0025) [2024-06-12 20:34:40,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 1222295552. Throughput: 0: 49075.8. Samples: 751113380. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-12 20:34:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:34:41,198][71000] Updated weights for policy 0, policy_version 74604 (0.0028) [2024-06-12 20:34:44,267][71000] Updated weights for policy 0, policy_version 74614 (0.0031) [2024-06-12 20:34:45,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 1222541312. Throughput: 0: 49039.1. Samples: 751411220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-12 20:34:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 20:34:48,092][71000] Updated weights for policy 0, policy_version 74624 (0.0025) [2024-06-12 20:34:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 1222770688. Throughput: 0: 48949.2. Samples: 751552440. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-12 20:34:50,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:34:51,224][71000] Updated weights for policy 0, policy_version 74634 (0.0028) [2024-06-12 20:34:54,505][71000] Updated weights for policy 0, policy_version 74644 (0.0031) [2024-06-12 20:34:55,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49426.9, 300 sec: 49096.5). Total num frames: 1223032832. Throughput: 0: 48860.3. Samples: 751849160. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-12 20:34:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:34:57,822][71000] Updated weights for policy 0, policy_version 74654 (0.0030) [2024-06-12 20:35:00,944][70768] Fps is (10 sec: 50769.1, 60 sec: 49148.4, 300 sec: 49095.8). Total num frames: 1223278592. Throughput: 0: 48825.0. Samples: 752139360. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-12 20:35:00,944][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:35:01,102][71000] Updated weights for policy 0, policy_version 74664 (0.0033) [2024-06-12 20:35:04,506][71000] Updated weights for policy 0, policy_version 74674 (0.0032) [2024-06-12 20:35:05,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 1223507968. Throughput: 0: 48987.6. Samples: 752287880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-12 20:35:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:35:07,777][71000] Updated weights for policy 0, policy_version 74684 (0.0028) [2024-06-12 20:35:10,940][70768] Fps is (10 sec: 47533.5, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 1223753728. Throughput: 0: 49076.2. Samples: 752580400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-12 20:35:10,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:35:11,203][71000] Updated weights for policy 0, policy_version 74694 (0.0027) [2024-06-12 20:35:14,734][71000] Updated weights for policy 0, policy_version 74704 (0.0030) [2024-06-12 20:35:15,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 1223999488. Throughput: 0: 48809.8. Samples: 752870460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-12 20:35:15,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:35:17,413][70980] Signal inference workers to stop experience collection... (11050 times) [2024-06-12 20:35:17,414][70980] Signal inference workers to resume experience collection... 
(11050 times) [2024-06-12 20:35:17,448][71000] InferenceWorker_p0-w0: stopping experience collection (11050 times) [2024-06-12 20:35:17,449][71000] InferenceWorker_p0-w0: resuming experience collection (11050 times) [2024-06-12 20:35:18,284][71000] Updated weights for policy 0, policy_version 74714 (0.0035) [2024-06-12 20:35:20,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 1224261632. Throughput: 0: 48931.4. Samples: 753024080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-12 20:35:20,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 20:35:21,185][71000] Updated weights for policy 0, policy_version 74724 (0.0031) [2024-06-12 20:35:24,912][71000] Updated weights for policy 0, policy_version 74734 (0.0023) [2024-06-12 20:35:25,939][70768] Fps is (10 sec: 47514.3, 60 sec: 48606.2, 300 sec: 48985.4). Total num frames: 1224474624. Throughput: 0: 48940.7. Samples: 753315700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 20:35:25,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:35:27,974][71000] Updated weights for policy 0, policy_version 74744 (0.0037) [2024-06-12 20:35:30,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 1224736768. Throughput: 0: 48755.5. Samples: 753605220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 20:35:30,940][70768] Avg episode reward: [(0, '0.248')] [2024-06-12 20:35:31,530][71000] Updated weights for policy 0, policy_version 74754 (0.0020) [2024-06-12 20:35:34,930][71000] Updated weights for policy 0, policy_version 74764 (0.0026) [2024-06-12 20:35:35,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 1224982528. Throughput: 0: 48957.9. Samples: 753755540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 20:35:35,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 20:35:38,042][71000] Updated weights for policy 0, policy_version 74774 (0.0034) [2024-06-12 20:35:40,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 1225228288. Throughput: 0: 48927.7. Samples: 754050900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 20:35:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:35:41,031][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000074783_1225244672.pth... [2024-06-12 20:35:41,080][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000074060_1213399040.pth [2024-06-12 20:35:41,501][71000] Updated weights for policy 0, policy_version 74784 (0.0023) [2024-06-12 20:35:45,175][71000] Updated weights for policy 0, policy_version 74794 (0.0033) [2024-06-12 20:35:45,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48332.7, 300 sec: 48985.4). Total num frames: 1225441280. Throughput: 0: 48776.2. Samples: 754334080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 20:35:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:35:48,068][71000] Updated weights for policy 0, policy_version 74804 (0.0024) [2024-06-12 20:35:50,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 1225703424. Throughput: 0: 48625.8. Samples: 754476040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 20:35:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:35:51,976][71000] Updated weights for policy 0, policy_version 74814 (0.0022) [2024-06-12 20:35:55,230][71000] Updated weights for policy 0, policy_version 74824 (0.0029) [2024-06-12 20:35:55,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48606.0, 300 sec: 49152.0). Total num frames: 1225949184. Throughput: 0: 48566.8. Samples: 754765900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 20:35:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:35:58,602][71000] Updated weights for policy 0, policy_version 74834 (0.0031) [2024-06-12 20:36:00,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48336.3, 300 sec: 49040.9). Total num frames: 1226178560. Throughput: 0: 48624.5. Samples: 755058560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 20:36:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:36:01,853][71000] Updated weights for policy 0, policy_version 74844 (0.0022) [2024-06-12 20:36:05,505][71000] Updated weights for policy 0, policy_version 74854 (0.0028) [2024-06-12 20:36:05,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48332.9, 300 sec: 48929.8). Total num frames: 1226407936. Throughput: 0: 48289.8. Samples: 755197120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 20:36:05,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:36:08,323][71000] Updated weights for policy 0, policy_version 74864 (0.0037) [2024-06-12 20:36:10,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48605.8, 300 sec: 49096.4). Total num frames: 1226670080. Throughput: 0: 48370.8. Samples: 755492400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:36:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:36:12,289][71000] Updated weights for policy 0, policy_version 74874 (0.0025) [2024-06-12 20:36:14,795][71000] Updated weights for policy 0, policy_version 74884 (0.0034) [2024-06-12 20:36:15,940][70768] Fps is (10 sec: 52429.0, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 1226932224. Throughput: 0: 48536.9. Samples: 755789380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:36:15,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 20:36:19,001][71000] Updated weights for policy 0, policy_version 74894 (0.0031) [2024-06-12 20:36:20,087][70980] Signal inference workers to stop experience collection... 
(11100 times) [2024-06-12 20:36:20,139][71000] InferenceWorker_p0-w0: stopping experience collection (11100 times) [2024-06-12 20:36:20,142][70980] Signal inference workers to resume experience collection... (11100 times) [2024-06-12 20:36:20,151][71000] InferenceWorker_p0-w0: resuming experience collection (11100 times) [2024-06-12 20:36:20,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48059.8, 300 sec: 48985.4). Total num frames: 1227145216. Throughput: 0: 48570.2. Samples: 755941200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:36:20,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 20:36:21,906][71000] Updated weights for policy 0, policy_version 74904 (0.0032) [2024-06-12 20:36:25,545][71000] Updated weights for policy 0, policy_version 74914 (0.0029) [2024-06-12 20:36:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.8, 300 sec: 49096.5). Total num frames: 1227407360. Throughput: 0: 48390.6. Samples: 756228480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:36:25,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:36:28,390][71000] Updated weights for policy 0, policy_version 74924 (0.0029) [2024-06-12 20:36:30,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48332.8, 300 sec: 49096.5). Total num frames: 1227636736. Throughput: 0: 48705.9. Samples: 756525840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:36:30,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:36:32,166][71000] Updated weights for policy 0, policy_version 74934 (0.0026) [2024-06-12 20:36:34,920][71000] Updated weights for policy 0, policy_version 74944 (0.0025) [2024-06-12 20:36:35,939][70768] Fps is (10 sec: 52429.0, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1227931648. Throughput: 0: 49166.3. Samples: 756688520. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:36:35,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:36:38,852][71000] Updated weights for policy 0, policy_version 74954 (0.0029) [2024-06-12 20:36:40,940][70768] Fps is (10 sec: 52428.4, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 1228161024. Throughput: 0: 49260.0. Samples: 756982600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:36:40,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:36:41,795][71000] Updated weights for policy 0, policy_version 74964 (0.0031) [2024-06-12 20:36:45,552][71000] Updated weights for policy 0, policy_version 74974 (0.0034) [2024-06-12 20:36:45,940][70768] Fps is (10 sec: 44236.0, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 1228374016. Throughput: 0: 49128.7. Samples: 757269360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:36:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 20:36:48,502][71000] Updated weights for policy 0, policy_version 74984 (0.0035) [2024-06-12 20:36:50,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 1228619776. Throughput: 0: 49132.5. Samples: 757408080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:36:50,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:36:52,277][71000] Updated weights for policy 0, policy_version 74994 (0.0030) [2024-06-12 20:36:55,211][71000] Updated weights for policy 0, policy_version 75004 (0.0028) [2024-06-12 20:36:55,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 1228881920. Throughput: 0: 49018.2. Samples: 757698220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 20:36:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:36:59,088][71000] Updated weights for policy 0, policy_version 75014 (0.0022) [2024-06-12 20:37:00,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.1, 300 sec: 49096.5). 
Total num frames: 1229127680. Throughput: 0: 49099.7. Samples: 757998860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 20:37:00,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 20:37:02,123][71000] Updated weights for policy 0, policy_version 75024 (0.0037) [2024-06-12 20:37:05,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 1229340672. Throughput: 0: 48737.0. Samples: 758134360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 20:37:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:37:06,059][71000] Updated weights for policy 0, policy_version 75034 (0.0024) [2024-06-12 20:37:08,927][71000] Updated weights for policy 0, policy_version 75044 (0.0034) [2024-06-12 20:37:10,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48879.0, 300 sec: 49096.4). Total num frames: 1229602816. Throughput: 0: 48697.2. Samples: 758419860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 20:37:10,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:37:12,812][71000] Updated weights for policy 0, policy_version 75054 (0.0033) [2024-06-12 20:37:15,842][71000] Updated weights for policy 0, policy_version 75064 (0.0023) [2024-06-12 20:37:15,939][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 1229848576. Throughput: 0: 48458.7. Samples: 758706480. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 20:37:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:37:19,689][71000] Updated weights for policy 0, policy_version 75074 (0.0037) [2024-06-12 20:37:20,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 1230061568. Throughput: 0: 48143.5. Samples: 758854980. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 20:37:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:37:22,374][71000] Updated weights for policy 0, policy_version 75084 (0.0032) [2024-06-12 20:37:25,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48332.8, 300 sec: 48985.4). Total num frames: 1230307328. Throughput: 0: 47992.9. Samples: 759142280. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 20:37:25,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:37:26,566][71000] Updated weights for policy 0, policy_version 75094 (0.0030) [2024-06-12 20:37:29,261][71000] Updated weights for policy 0, policy_version 75104 (0.0036) [2024-06-12 20:37:29,671][70980] Signal inference workers to stop experience collection... (11150 times) [2024-06-12 20:37:29,719][70980] Signal inference workers to resume experience collection... (11150 times) [2024-06-12 20:37:29,720][71000] InferenceWorker_p0-w0: stopping experience collection (11150 times) [2024-06-12 20:37:29,732][71000] InferenceWorker_p0-w0: resuming experience collection (11150 times) [2024-06-12 20:37:30,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 1230569472. Throughput: 0: 48018.3. Samples: 759430180. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 20:37:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:37:33,258][71000] Updated weights for policy 0, policy_version 75114 (0.0028) [2024-06-12 20:37:35,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48059.7, 300 sec: 49040.9). Total num frames: 1230815232. Throughput: 0: 48314.7. Samples: 759582240. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 20:37:35,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:37:36,066][71000] Updated weights for policy 0, policy_version 75124 (0.0032) [2024-06-12 20:37:39,969][71000] Updated weights for policy 0, policy_version 75134 (0.0032) [2024-06-12 20:37:40,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48059.8, 300 sec: 48929.8). Total num frames: 1231044608. Throughput: 0: 48306.9. Samples: 759872020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-12 20:37:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:37:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000075137_1231044608.pth... [2024-06-12 20:37:40,993][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000074422_1219330048.pth [2024-06-12 20:37:42,935][71000] Updated weights for policy 0, policy_version 75144 (0.0035) [2024-06-12 20:37:45,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48332.9, 300 sec: 48985.4). Total num frames: 1231273984. Throughput: 0: 48115.9. Samples: 760164080. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 20:37:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:37:46,531][71000] Updated weights for policy 0, policy_version 75154 (0.0032) [2024-06-12 20:37:49,356][71000] Updated weights for policy 0, policy_version 75164 (0.0028) [2024-06-12 20:37:50,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 1231536128. Throughput: 0: 48337.2. Samples: 760309540. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 20:37:50,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 20:37:53,582][71000] Updated weights for policy 0, policy_version 75174 (0.0029) [2024-06-12 20:37:55,939][70768] Fps is (10 sec: 52429.1, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 1231798272. Throughput: 0: 48591.7. Samples: 760606480. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 20:37:55,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:37:56,035][71000] Updated weights for policy 0, policy_version 75184 (0.0031) [2024-06-12 20:38:00,112][71000] Updated weights for policy 0, policy_version 75194 (0.0030) [2024-06-12 20:38:00,939][70768] Fps is (10 sec: 49152.7, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 1232027648. Throughput: 0: 48895.1. Samples: 760906760. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 20:38:00,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:38:02,712][71000] Updated weights for policy 0, policy_version 75204 (0.0033) [2024-06-12 20:38:05,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 1232257024. Throughput: 0: 48669.8. Samples: 761045120. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 20:38:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:38:06,547][71000] Updated weights for policy 0, policy_version 75214 (0.0026) [2024-06-12 20:38:09,463][71000] Updated weights for policy 0, policy_version 75224 (0.0025) [2024-06-12 20:38:10,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 1232519168. Throughput: 0: 48916.5. Samples: 761343520. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 20:38:10,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:38:13,472][71000] Updated weights for policy 0, policy_version 75234 (0.0022) [2024-06-12 20:38:15,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48605.7, 300 sec: 48874.3). Total num frames: 1232764928. Throughput: 0: 48724.9. Samples: 761622800. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 20:38:15,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 20:38:16,254][71000] Updated weights for policy 0, policy_version 75244 (0.0030) [2024-06-12 20:38:20,387][71000] Updated weights for policy 0, policy_version 75254 (0.0039) [2024-06-12 20:38:20,939][70768] Fps is (10 sec: 45875.1, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1232977920. Throughput: 0: 48566.2. Samples: 761767720. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 20:38:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:38:23,196][71000] Updated weights for policy 0, policy_version 75264 (0.0032) [2024-06-12 20:38:25,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48605.9, 300 sec: 48652.1). Total num frames: 1233223680. Throughput: 0: 48686.2. Samples: 762062900. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 20:38:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:38:26,914][71000] Updated weights for policy 0, policy_version 75274 (0.0028) [2024-06-12 20:38:29,798][71000] Updated weights for policy 0, policy_version 75284 (0.0029) [2024-06-12 20:38:30,944][70768] Fps is (10 sec: 50768.5, 60 sec: 48602.5, 300 sec: 48929.1). Total num frames: 1233485824. Throughput: 0: 48758.9. Samples: 762358440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 20:38:30,944][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:38:33,737][71000] Updated weights for policy 0, policy_version 75294 (0.0022) [2024-06-12 20:38:35,939][70768] Fps is (10 sec: 50791.1, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1233731584. Throughput: 0: 48942.9. Samples: 762511960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 20:38:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:38:36,428][71000] Updated weights for policy 0, policy_version 75304 (0.0025) [2024-06-12 20:38:40,449][71000] Updated weights for policy 0, policy_version 75314 (0.0028) [2024-06-12 20:38:40,939][70768] Fps is (10 sec: 49173.5, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 1233977344. Throughput: 0: 48994.3. Samples: 762811220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 20:38:40,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 20:38:42,072][70980] Signal inference workers to stop experience collection... (11200 times) [2024-06-12 20:38:42,073][70980] Signal inference workers to resume experience collection... (11200 times) [2024-06-12 20:38:42,110][71000] InferenceWorker_p0-w0: stopping experience collection (11200 times) [2024-06-12 20:38:42,111][71000] InferenceWorker_p0-w0: resuming experience collection (11200 times) [2024-06-12 20:38:42,977][71000] Updated weights for policy 0, policy_version 75324 (0.0033) [2024-06-12 20:38:45,939][70768] Fps is (10 sec: 47513.4, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 1234206720. Throughput: 0: 48542.2. Samples: 763091160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 20:38:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:38:47,095][71000] Updated weights for policy 0, policy_version 75334 (0.0034) [2024-06-12 20:38:49,851][71000] Updated weights for policy 0, policy_version 75344 (0.0029) [2024-06-12 20:38:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48879.0, 300 sec: 48819.2). Total num frames: 1234468864. Throughput: 0: 48746.2. Samples: 763238700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 20:38:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:38:53,595][71000] Updated weights for policy 0, policy_version 75354 (0.0028) [2024-06-12 20:38:55,940][70768] Fps is (10 sec: 52428.4, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1234731008. Throughput: 0: 48721.3. Samples: 763535980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 20:38:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 20:38:56,631][71000] Updated weights for policy 0, policy_version 75364 (0.0025) [2024-06-12 20:39:00,226][71000] Updated weights for policy 0, policy_version 75374 (0.0030) [2024-06-12 20:39:00,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1234960384. Throughput: 0: 49045.5. Samples: 763829840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 20:39:00,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 20:39:03,813][71000] Updated weights for policy 0, policy_version 75384 (0.0037) [2024-06-12 20:39:05,940][70768] Fps is (10 sec: 44236.3, 60 sec: 48605.8, 300 sec: 48596.6). Total num frames: 1235173376. Throughput: 0: 48819.8. Samples: 763964620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 20:39:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:39:07,215][71000] Updated weights for policy 0, policy_version 75394 (0.0028) [2024-06-12 20:39:10,388][71000] Updated weights for policy 0, policy_version 75404 (0.0028) [2024-06-12 20:39:10,940][70768] Fps is (10 sec: 47511.2, 60 sec: 48605.5, 300 sec: 48763.1). Total num frames: 1235435520. Throughput: 0: 48572.4. Samples: 764248680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 20:39:10,941][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:39:14,201][71000] Updated weights for policy 0, policy_version 75414 (0.0027) [2024-06-12 20:39:15,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48606.0, 300 sec: 48707.7). 
Total num frames: 1235681280. Throughput: 0: 48444.2. Samples: 764538220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:39:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:39:16,951][71000] Updated weights for policy 0, policy_version 75424 (0.0033) [2024-06-12 20:39:20,440][71000] Updated weights for policy 0, policy_version 75434 (0.0038) [2024-06-12 20:39:20,940][70768] Fps is (10 sec: 49154.4, 60 sec: 49152.0, 300 sec: 48707.7). Total num frames: 1235927040. Throughput: 0: 48635.5. Samples: 764700560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:39:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:39:23,694][71000] Updated weights for policy 0, policy_version 75444 (0.0029) [2024-06-12 20:39:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 1236156416. Throughput: 0: 48497.7. Samples: 764993620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:39:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:39:27,338][71000] Updated weights for policy 0, policy_version 75454 (0.0036) [2024-06-12 20:39:30,344][71000] Updated weights for policy 0, policy_version 75464 (0.0042) [2024-06-12 20:39:30,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48882.4, 300 sec: 48707.7). Total num frames: 1236418560. Throughput: 0: 48575.5. Samples: 765277060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:39:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 20:39:33,962][71000] Updated weights for policy 0, policy_version 75474 (0.0031) [2024-06-12 20:39:35,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48878.7, 300 sec: 48707.7). Total num frames: 1236664320. Throughput: 0: 48626.5. Samples: 765426900. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:39:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:39:37,338][71000] Updated weights for policy 0, policy_version 75484 (0.0027) [2024-06-12 20:39:40,577][71000] Updated weights for policy 0, policy_version 75494 (0.0025) [2024-06-12 20:39:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.8, 300 sec: 48707.7). Total num frames: 1236910080. Throughput: 0: 48714.6. Samples: 765728140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:39:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:39:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000075495_1236910080.pth... [2024-06-12 20:39:40,989][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000074783_1225244672.pth [2024-06-12 20:39:44,115][71000] Updated weights for policy 0, policy_version 75504 (0.0022) [2024-06-12 20:39:45,940][70768] Fps is (10 sec: 44236.9, 60 sec: 48332.6, 300 sec: 48596.6). Total num frames: 1237106688. Throughput: 0: 48722.0. Samples: 766022340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:39:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:39:46,259][70980] Signal inference workers to stop experience collection... (11250 times) [2024-06-12 20:39:46,261][70980] Signal inference workers to resume experience collection... (11250 times) [2024-06-12 20:39:46,269][71000] InferenceWorker_p0-w0: stopping experience collection (11250 times) [2024-06-12 20:39:46,280][71000] InferenceWorker_p0-w0: resuming experience collection (11250 times) [2024-06-12 20:39:47,215][71000] Updated weights for policy 0, policy_version 75514 (0.0025) [2024-06-12 20:39:50,498][71000] Updated weights for policy 0, policy_version 75524 (0.0029) [2024-06-12 20:39:50,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 1237385216. Throughput: 0: 48902.4. Samples: 766165220. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:39:50,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:39:54,040][71000] Updated weights for policy 0, policy_version 75534 (0.0029) [2024-06-12 20:39:55,940][70768] Fps is (10 sec: 54067.8, 60 sec: 48605.9, 300 sec: 48708.4). Total num frames: 1237647360. Throughput: 0: 49122.7. Samples: 766459180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:39:55,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:39:56,967][71000] Updated weights for policy 0, policy_version 75544 (0.0020) [2024-06-12 20:40:00,691][71000] Updated weights for policy 0, policy_version 75554 (0.0030) [2024-06-12 20:40:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1237893120. Throughput: 0: 49225.3. Samples: 766753360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 20:40:00,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:40:04,343][71000] Updated weights for policy 0, policy_version 75564 (0.0034) [2024-06-12 20:40:05,939][70768] Fps is (10 sec: 45875.4, 60 sec: 48879.1, 300 sec: 48652.2). Total num frames: 1238106112. Throughput: 0: 48712.9. Samples: 766892640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 20:40:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:40:07,297][71000] Updated weights for policy 0, policy_version 75574 (0.0034) [2024-06-12 20:40:10,779][71000] Updated weights for policy 0, policy_version 75584 (0.0028) [2024-06-12 20:40:10,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48879.2, 300 sec: 48707.7). Total num frames: 1238368256. Throughput: 0: 48700.8. Samples: 767185160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 20:40:10,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:40:14,203][71000] Updated weights for policy 0, policy_version 75594 (0.0029) [2024-06-12 20:40:15,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.0, 300 sec: 48707.7). 
Total num frames: 1238630400. Throughput: 0: 48727.2. Samples: 767469780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 20:40:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:40:17,897][71000] Updated weights for policy 0, policy_version 75604 (0.0026) [2024-06-12 20:40:20,917][71000] Updated weights for policy 0, policy_version 75614 (0.0030) [2024-06-12 20:40:20,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1238859776. Throughput: 0: 48965.9. Samples: 767630360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 20:40:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:40:24,687][71000] Updated weights for policy 0, policy_version 75624 (0.0031) [2024-06-12 20:40:25,940][70768] Fps is (10 sec: 44236.4, 60 sec: 48605.9, 300 sec: 48596.6). Total num frames: 1239072768. Throughput: 0: 48594.8. Samples: 767914900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 20:40:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:40:27,650][71000] Updated weights for policy 0, policy_version 75634 (0.0031) [2024-06-12 20:40:30,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48605.7, 300 sec: 48652.1). Total num frames: 1239334912. Throughput: 0: 48611.9. Samples: 768209880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 20:40:30,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:40:31,165][71000] Updated weights for policy 0, policy_version 75644 (0.0034) [2024-06-12 20:40:34,335][71000] Updated weights for policy 0, policy_version 75654 (0.0035) [2024-06-12 20:40:35,939][70768] Fps is (10 sec: 54067.6, 60 sec: 49152.2, 300 sec: 48763.2). Total num frames: 1239613440. Throughput: 0: 48852.9. Samples: 768363600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 20:40:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:40:38,125][71000] Updated weights for policy 0, policy_version 75664 (0.0026) [2024-06-12 20:40:40,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1239826432. Throughput: 0: 48774.6. Samples: 768654040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 20:40:40,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:40:41,140][71000] Updated weights for policy 0, policy_version 75674 (0.0023) [2024-06-12 20:40:44,683][71000] Updated weights for policy 0, policy_version 75684 (0.0025) [2024-06-12 20:40:45,942][70768] Fps is (10 sec: 40951.2, 60 sec: 48604.3, 300 sec: 48540.7). Total num frames: 1240023040. Throughput: 0: 48648.5. Samples: 768942640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 20:40:45,942][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:40:47,972][71000] Updated weights for policy 0, policy_version 75694 (0.0039) [2024-06-12 20:40:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.8, 300 sec: 48652.1). Total num frames: 1240301568. Throughput: 0: 48381.6. Samples: 769069820. Policy #0 lag: (min: 0.0, avg: 7.3, max: 22.0) [2024-06-12 20:40:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:40:51,863][71000] Updated weights for policy 0, policy_version 75704 (0.0027) [2024-06-12 20:40:52,894][70980] Signal inference workers to stop experience collection... (11300 times) [2024-06-12 20:40:52,894][70980] Signal inference workers to resume experience collection... 
(11300 times) [2024-06-12 20:40:52,913][71000] InferenceWorker_p0-w0: stopping experience collection (11300 times) [2024-06-12 20:40:52,914][71000] InferenceWorker_p0-w0: resuming experience collection (11300 times) [2024-06-12 20:40:54,481][71000] Updated weights for policy 0, policy_version 75714 (0.0024) [2024-06-12 20:40:55,940][70768] Fps is (10 sec: 55717.0, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1240580096. Throughput: 0: 48666.7. Samples: 769375160. Policy #0 lag: (min: 0.0, avg: 7.3, max: 22.0) [2024-06-12 20:40:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:40:58,463][71000] Updated weights for policy 0, policy_version 75724 (0.0030) [2024-06-12 20:41:00,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 1240793088. Throughput: 0: 49020.8. Samples: 769675720. Policy #0 lag: (min: 0.0, avg: 7.3, max: 22.0) [2024-06-12 20:41:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:41:01,462][71000] Updated weights for policy 0, policy_version 75734 (0.0029) [2024-06-12 20:41:05,391][71000] Updated weights for policy 0, policy_version 75744 (0.0031) [2024-06-12 20:41:05,940][70768] Fps is (10 sec: 42598.6, 60 sec: 48332.8, 300 sec: 48596.6). Total num frames: 1241006080. Throughput: 0: 48465.8. Samples: 769811320. Policy #0 lag: (min: 0.0, avg: 7.3, max: 22.0) [2024-06-12 20:41:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:41:08,077][71000] Updated weights for policy 0, policy_version 75754 (0.0031) [2024-06-12 20:41:10,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 48652.1). Total num frames: 1241284608. Throughput: 0: 48483.1. Samples: 770096640. 
Policy #0 lag: (min: 0.0, avg: 7.3, max: 22.0) [2024-06-12 20:41:10,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:41:12,233][71000] Updated weights for policy 0, policy_version 75764 (0.0035) [2024-06-12 20:41:14,958][71000] Updated weights for policy 0, policy_version 75774 (0.0026) [2024-06-12 20:41:15,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48332.7, 300 sec: 48763.2). Total num frames: 1241530368. Throughput: 0: 48400.6. Samples: 770387900. Policy #0 lag: (min: 0.0, avg: 7.3, max: 22.0) [2024-06-12 20:41:15,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:41:18,523][71000] Updated weights for policy 0, policy_version 75784 (0.0024) [2024-06-12 20:41:20,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1241776128. Throughput: 0: 48593.3. Samples: 770550300. Policy #0 lag: (min: 0.0, avg: 7.3, max: 22.0) [2024-06-12 20:41:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:41:21,783][71000] Updated weights for policy 0, policy_version 75794 (0.0023) [2024-06-12 20:41:25,016][71000] Updated weights for policy 0, policy_version 75804 (0.0028) [2024-06-12 20:41:25,940][70768] Fps is (10 sec: 47511.7, 60 sec: 48878.6, 300 sec: 48707.6). Total num frames: 1242005504. Throughput: 0: 48421.4. Samples: 770833020. Policy #0 lag: (min: 0.0, avg: 7.3, max: 22.0) [2024-06-12 20:41:25,941][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:41:28,404][71000] Updated weights for policy 0, policy_version 75814 (0.0020) [2024-06-12 20:41:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48879.0, 300 sec: 48596.6). Total num frames: 1242267648. Throughput: 0: 48394.6. Samples: 771120300. 
Policy #0 lag: (min: 0.0, avg: 7.3, max: 22.0) [2024-06-12 20:41:30,948][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:41:32,122][71000] Updated weights for policy 0, policy_version 75824 (0.0025) [2024-06-12 20:41:35,143][71000] Updated weights for policy 0, policy_version 75834 (0.0023) [2024-06-12 20:41:35,940][70768] Fps is (10 sec: 49154.1, 60 sec: 48059.7, 300 sec: 48596.6). Total num frames: 1242497024. Throughput: 0: 49063.7. Samples: 771277680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 20:41:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:41:38,546][71000] Updated weights for policy 0, policy_version 75844 (0.0022) [2024-06-12 20:41:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 1242742784. Throughput: 0: 48811.1. Samples: 771571660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 20:41:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:41:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000075851_1242742784.pth... [2024-06-12 20:41:41,000][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000075137_1231044608.pth [2024-06-12 20:41:41,456][71000] Updated weights for policy 0, policy_version 75854 (0.0019) [2024-06-12 20:41:45,208][71000] Updated weights for policy 0, policy_version 75864 (0.0036) [2024-06-12 20:41:45,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49153.8, 300 sec: 48652.2). Total num frames: 1242972160. Throughput: 0: 48690.3. Samples: 771866780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 20:41:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 20:41:48,355][71000] Updated weights for policy 0, policy_version 75874 (0.0033) [2024-06-12 20:41:50,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49151.9, 300 sec: 48707.7). Total num frames: 1243250688. Throughput: 0: 48913.1. Samples: 772012420. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 20:41:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:41:51,574][71000] Updated weights for policy 0, policy_version 75884 (0.0040) [2024-06-12 20:41:54,346][70980] Signal inference workers to stop experience collection... (11350 times) [2024-06-12 20:41:54,393][71000] InferenceWorker_p0-w0: stopping experience collection (11350 times) [2024-06-12 20:41:54,397][70980] Signal inference workers to resume experience collection... (11350 times) [2024-06-12 20:41:54,406][71000] InferenceWorker_p0-w0: resuming experience collection (11350 times) [2024-06-12 20:41:55,137][71000] Updated weights for policy 0, policy_version 75894 (0.0034) [2024-06-12 20:41:55,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48332.8, 300 sec: 48652.1). Total num frames: 1243480064. Throughput: 0: 49340.0. Samples: 772316940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 20:41:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:41:58,362][71000] Updated weights for policy 0, policy_version 75904 (0.0027) [2024-06-12 20:42:00,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1243725824. Throughput: 0: 49175.5. Samples: 772600800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 20:42:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:42:01,530][71000] Updated weights for policy 0, policy_version 75914 (0.0027) [2024-06-12 20:42:04,653][71000] Updated weights for policy 0, policy_version 75924 (0.0033) [2024-06-12 20:42:05,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 48652.2). Total num frames: 1243955200. Throughput: 0: 48840.4. Samples: 772748120. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 20:42:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:42:07,934][71000] Updated weights for policy 0, policy_version 75934 (0.0031) [2024-06-12 20:42:10,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 1244233728. Throughput: 0: 49508.4. Samples: 773060880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 20:42:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:42:11,642][71000] Updated weights for policy 0, policy_version 75944 (0.0027) [2024-06-12 20:42:14,846][71000] Updated weights for policy 0, policy_version 75954 (0.0032) [2024-06-12 20:42:15,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 1244479488. Throughput: 0: 49420.0. Samples: 773344200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 20:42:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:42:18,084][71000] Updated weights for policy 0, policy_version 75964 (0.0030) [2024-06-12 20:42:20,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 1244725248. Throughput: 0: 49289.7. Samples: 773495720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:42:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:42:21,380][71000] Updated weights for policy 0, policy_version 75974 (0.0025) [2024-06-12 20:42:24,573][71000] Updated weights for policy 0, policy_version 75984 (0.0028) [2024-06-12 20:42:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.3, 300 sec: 48763.2). Total num frames: 1244954624. Throughput: 0: 49362.2. Samples: 773792960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:42:25,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:42:28,203][71000] Updated weights for policy 0, policy_version 75994 (0.0030) [2024-06-12 20:42:30,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49151.9, 300 sec: 48818.7). 
Total num frames: 1245216768. Throughput: 0: 49222.4. Samples: 774081800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:42:30,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:42:31,352][71000] Updated weights for policy 0, policy_version 76004 (0.0033) [2024-06-12 20:42:34,524][71000] Updated weights for policy 0, policy_version 76014 (0.0024) [2024-06-12 20:42:35,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49424.9, 300 sec: 48874.3). Total num frames: 1245462528. Throughput: 0: 49426.6. Samples: 774236620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:42:35,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:42:38,058][71000] Updated weights for policy 0, policy_version 76024 (0.0040) [2024-06-12 20:42:40,939][70768] Fps is (10 sec: 49153.2, 60 sec: 49425.2, 300 sec: 48929.9). Total num frames: 1245708288. Throughput: 0: 49102.3. Samples: 774526540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:42:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:42:41,525][71000] Updated weights for policy 0, policy_version 76034 (0.0022) [2024-06-12 20:42:44,905][71000] Updated weights for policy 0, policy_version 76044 (0.0033) [2024-06-12 20:42:45,939][70768] Fps is (10 sec: 49153.2, 60 sec: 49698.1, 300 sec: 48874.3). Total num frames: 1245954048. Throughput: 0: 49127.7. Samples: 774811540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:42:45,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:42:48,353][71000] Updated weights for policy 0, policy_version 76054 (0.0036) [2024-06-12 20:42:50,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48879.1, 300 sec: 48763.2). Total num frames: 1246183424. Throughput: 0: 49120.8. Samples: 774958560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:42:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:42:51,550][71000] Updated weights for policy 0, policy_version 76064 (0.0032) [2024-06-12 20:42:55,019][71000] Updated weights for policy 0, policy_version 76074 (0.0029) [2024-06-12 20:42:55,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 48818.8). Total num frames: 1246429184. Throughput: 0: 48681.5. Samples: 775251540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:42:55,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:42:58,056][71000] Updated weights for policy 0, policy_version 76084 (0.0025) [2024-06-12 20:43:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1246674944. Throughput: 0: 49077.4. Samples: 775552680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 20:43:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:43:01,401][71000] Updated weights for policy 0, policy_version 76094 (0.0032) [2024-06-12 20:43:04,939][71000] Updated weights for policy 0, policy_version 76104 (0.0032) [2024-06-12 20:43:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 48874.3). Total num frames: 1246937088. Throughput: 0: 49032.5. Samples: 775702180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 20:43:05,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:43:07,597][70980] Signal inference workers to stop experience collection... (11400 times) [2024-06-12 20:43:07,597][70980] Signal inference workers to resume experience collection... 
(11400 times) [2024-06-12 20:43:07,646][71000] InferenceWorker_p0-w0: stopping experience collection (11400 times) [2024-06-12 20:43:07,646][71000] InferenceWorker_p0-w0: resuming experience collection (11400 times) [2024-06-12 20:43:08,326][71000] Updated weights for policy 0, policy_version 76114 (0.0039) [2024-06-12 20:43:10,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1247166464. Throughput: 0: 48709.9. Samples: 775984900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 20:43:10,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:43:11,754][71000] Updated weights for policy 0, policy_version 76124 (0.0030) [2024-06-12 20:43:15,511][71000] Updated weights for policy 0, policy_version 76134 (0.0032) [2024-06-12 20:43:15,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1247395840. Throughput: 0: 48856.6. Samples: 776280340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 20:43:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:43:18,585][71000] Updated weights for policy 0, policy_version 76144 (0.0024) [2024-06-12 20:43:20,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 1247657984. Throughput: 0: 48574.5. Samples: 776422460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 20:43:20,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:43:22,032][71000] Updated weights for policy 0, policy_version 76154 (0.0025) [2024-06-12 20:43:25,278][71000] Updated weights for policy 0, policy_version 76164 (0.0028) [2024-06-12 20:43:25,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49425.0, 300 sec: 48930.5). Total num frames: 1247920128. Throughput: 0: 48706.5. Samples: 776718340. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 20:43:25,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 20:43:28,723][71000] Updated weights for policy 0, policy_version 76174 (0.0025) [2024-06-12 20:43:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48606.0, 300 sec: 48818.7). Total num frames: 1248133120. Throughput: 0: 48990.1. Samples: 777016100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 20:43:30,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 20:43:31,974][71000] Updated weights for policy 0, policy_version 76184 (0.0035) [2024-06-12 20:43:35,556][71000] Updated weights for policy 0, policy_version 76194 (0.0033) [2024-06-12 20:43:35,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 1248378880. Throughput: 0: 48879.6. Samples: 777158140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 20:43:35,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 20:43:38,856][71000] Updated weights for policy 0, policy_version 76204 (0.0029) [2024-06-12 20:43:40,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48878.7, 300 sec: 48929.8). Total num frames: 1248641024. Throughput: 0: 48790.4. Samples: 777447120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 20:43:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:43:40,993][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000076212_1248657408.pth... [2024-06-12 20:43:41,042][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000075495_1236910080.pth [2024-06-12 20:43:42,313][71000] Updated weights for policy 0, policy_version 76214 (0.0025) [2024-06-12 20:43:45,379][71000] Updated weights for policy 0, policy_version 76224 (0.0026) [2024-06-12 20:43:45,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 1248886784. Throughput: 0: 48827.5. Samples: 777749920. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 20:43:45,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:43:48,798][71000] Updated weights for policy 0, policy_version 76234 (0.0035) [2024-06-12 20:43:50,940][70768] Fps is (10 sec: 47514.5, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 1249116160. Throughput: 0: 48894.7. Samples: 777902440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 20:43:50,942][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:43:51,783][71000] Updated weights for policy 0, policy_version 76244 (0.0019) [2024-06-12 20:43:55,305][71000] Updated weights for policy 0, policy_version 76254 (0.0037) [2024-06-12 20:43:55,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 1249361920. Throughput: 0: 48986.5. Samples: 778189300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 20:43:55,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 20:43:58,626][71000] Updated weights for policy 0, policy_version 76264 (0.0024) [2024-06-12 20:44:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 1249607680. Throughput: 0: 48766.7. Samples: 778474840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 20:44:00,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:44:02,394][71000] Updated weights for policy 0, policy_version 76274 (0.0031) [2024-06-12 20:44:05,763][71000] Updated weights for policy 0, policy_version 76284 (0.0026) [2024-06-12 20:44:05,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48605.9, 300 sec: 48874.4). Total num frames: 1249853440. Throughput: 0: 48792.4. Samples: 778618120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 20:44:05,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:44:09,038][71000] Updated weights for policy 0, policy_version 76294 (0.0032) [2024-06-12 20:44:10,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48605.9, 300 sec: 48818.8). 
Total num frames: 1250082816. Throughput: 0: 48970.4. Samples: 778922000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 20:44:10,940][70768] Avg episode reward: [(0, '0.253')] [2024-06-12 20:44:12,077][71000] Updated weights for policy 0, policy_version 76304 (0.0034) [2024-06-12 20:44:15,707][71000] Updated weights for policy 0, policy_version 76314 (0.0034) [2024-06-12 20:44:15,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1250344960. Throughput: 0: 49115.6. Samples: 779226300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 20:44:15,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:44:19,068][71000] Updated weights for policy 0, policy_version 76324 (0.0030) [2024-06-12 20:44:19,178][70980] Signal inference workers to stop experience collection... (11450 times) [2024-06-12 20:44:19,235][71000] InferenceWorker_p0-w0: stopping experience collection (11450 times) [2024-06-12 20:44:19,292][70980] Signal inference workers to resume experience collection... (11450 times) [2024-06-12 20:44:19,292][71000] InferenceWorker_p0-w0: resuming experience collection (11450 times) [2024-06-12 20:44:20,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49151.8, 300 sec: 48985.4). Total num frames: 1250607104. Throughput: 0: 49028.7. Samples: 779364440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 20:44:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:44:22,395][71000] Updated weights for policy 0, policy_version 76334 (0.0032) [2024-06-12 20:44:25,814][71000] Updated weights for policy 0, policy_version 76344 (0.0026) [2024-06-12 20:44:25,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48332.8, 300 sec: 48818.7). Total num frames: 1250820096. Throughput: 0: 48924.5. Samples: 779648720. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 20:44:25,952][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:44:29,245][71000] Updated weights for policy 0, policy_version 76354 (0.0031) [2024-06-12 20:44:30,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1251082240. Throughput: 0: 48843.2. Samples: 779947860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 20:44:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:44:32,395][71000] Updated weights for policy 0, policy_version 76364 (0.0025) [2024-06-12 20:44:35,926][71000] Updated weights for policy 0, policy_version 76374 (0.0025) [2024-06-12 20:44:35,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1251311616. Throughput: 0: 48591.5. Samples: 780089060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-12 20:44:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:44:39,094][71000] Updated weights for policy 0, policy_version 76384 (0.0031) [2024-06-12 20:44:40,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 1251573760. Throughput: 0: 48897.3. Samples: 780389680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 20:44:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:44:42,436][71000] Updated weights for policy 0, policy_version 76394 (0.0026) [2024-06-12 20:44:45,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 1251786752. Throughput: 0: 48770.7. Samples: 780669520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 20:44:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:44:46,005][71000] Updated weights for policy 0, policy_version 76404 (0.0032) [2024-06-12 20:44:49,326][71000] Updated weights for policy 0, policy_version 76414 (0.0024) [2024-06-12 20:44:50,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48605.8, 300 sec: 48763.2). 
Total num frames: 1252032512. Throughput: 0: 48787.5. Samples: 780813560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 20:44:50,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:44:52,739][71000] Updated weights for policy 0, policy_version 76424 (0.0031) [2024-06-12 20:44:55,853][71000] Updated weights for policy 0, policy_version 76434 (0.0023) [2024-06-12 20:44:55,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1252294656. Throughput: 0: 48561.2. Samples: 781107260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 20:44:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:44:59,183][71000] Updated weights for policy 0, policy_version 76444 (0.0029) [2024-06-12 20:45:00,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 1252556800. Throughput: 0: 48561.6. Samples: 781411580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 20:45:00,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:45:02,528][71000] Updated weights for policy 0, policy_version 76454 (0.0030) [2024-06-12 20:45:05,676][71000] Updated weights for policy 0, policy_version 76464 (0.0024) [2024-06-12 20:45:05,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1252786176. Throughput: 0: 48860.1. Samples: 781563140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 20:45:05,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:45:09,211][71000] Updated weights for policy 0, policy_version 76474 (0.0022) [2024-06-12 20:45:10,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.8, 300 sec: 48818.7). Total num frames: 1253031936. Throughput: 0: 49104.0. Samples: 781858400. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 20:45:10,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:45:12,656][71000] Updated weights for policy 0, policy_version 76484 (0.0031) [2024-06-12 20:45:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1253277696. Throughput: 0: 48908.0. Samples: 782148720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 20:45:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:45:15,942][71000] Updated weights for policy 0, policy_version 76494 (0.0031) [2024-06-12 20:45:19,229][71000] Updated weights for policy 0, policy_version 76504 (0.0028) [2024-06-12 20:45:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 1253539840. Throughput: 0: 48942.5. Samples: 782291480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 20:45:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:45:22,528][71000] Updated weights for policy 0, policy_version 76514 (0.0026) [2024-06-12 20:45:25,871][71000] Updated weights for policy 0, policy_version 76524 (0.0030) [2024-06-12 20:45:25,941][70768] Fps is (10 sec: 49146.1, 60 sec: 49151.1, 300 sec: 48929.7). Total num frames: 1253769216. Throughput: 0: 48930.8. Samples: 782591620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:45:25,941][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:45:29,426][71000] Updated weights for policy 0, policy_version 76534 (0.0040) [2024-06-12 20:45:29,433][70980] Signal inference workers to stop experience collection... (11500 times) [2024-06-12 20:45:29,433][70980] Signal inference workers to resume experience collection... 
(11500 times) [2024-06-12 20:45:29,446][71000] InferenceWorker_p0-w0: stopping experience collection (11500 times) [2024-06-12 20:45:29,446][71000] InferenceWorker_p0-w0: resuming experience collection (11500 times) [2024-06-12 20:45:30,944][70768] Fps is (10 sec: 47495.1, 60 sec: 48875.6, 300 sec: 48818.1). Total num frames: 1254014976. Throughput: 0: 49217.3. Samples: 782884500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:45:30,944][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:45:32,315][71000] Updated weights for policy 0, policy_version 76544 (0.0029) [2024-06-12 20:45:35,869][71000] Updated weights for policy 0, policy_version 76554 (0.0030) [2024-06-12 20:45:35,939][70768] Fps is (10 sec: 49158.5, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 1254260736. Throughput: 0: 49397.1. Samples: 783036420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:45:35,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:45:39,215][71000] Updated weights for policy 0, policy_version 76564 (0.0037) [2024-06-12 20:45:40,939][70768] Fps is (10 sec: 49172.4, 60 sec: 48879.1, 300 sec: 49096.8). Total num frames: 1254506496. Throughput: 0: 49278.0. Samples: 783324760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:45:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:45:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000076569_1254506496.pth... [2024-06-12 20:45:40,993][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000075851_1242742784.pth [2024-06-12 20:45:42,404][71000] Updated weights for policy 0, policy_version 76574 (0.0035) [2024-06-12 20:45:45,860][71000] Updated weights for policy 0, policy_version 76584 (0.0022) [2024-06-12 20:45:45,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 1254752256. Throughput: 0: 49117.3. Samples: 783621860. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:45:45,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:45:48,819][71000] Updated weights for policy 0, policy_version 76594 (0.0029) [2024-06-12 20:45:50,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.2, 300 sec: 48929.9). Total num frames: 1255014400. Throughput: 0: 49064.5. Samples: 783771040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:45:50,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:45:52,681][71000] Updated weights for policy 0, policy_version 76604 (0.0025) [2024-06-12 20:45:55,562][71000] Updated weights for policy 0, policy_version 76614 (0.0026) [2024-06-12 20:45:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 1255243776. Throughput: 0: 49156.9. Samples: 784070460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:45:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:45:59,140][71000] Updated weights for policy 0, policy_version 76624 (0.0028) [2024-06-12 20:46:00,940][70768] Fps is (10 sec: 47510.4, 60 sec: 48878.5, 300 sec: 49096.4). Total num frames: 1255489536. Throughput: 0: 49090.4. Samples: 784357820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:46:00,941][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:46:02,109][71000] Updated weights for policy 0, policy_version 76634 (0.0026) [2024-06-12 20:46:05,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 1255718912. Throughput: 0: 49245.5. Samples: 784507520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:46:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:46:06,062][71000] Updated weights for policy 0, policy_version 76644 (0.0031) [2024-06-12 20:46:08,955][71000] Updated weights for policy 0, policy_version 76654 (0.0026) [2024-06-12 20:46:10,940][70768] Fps is (10 sec: 49155.2, 60 sec: 49152.1, 300 sec: 48985.4). 
Total num frames: 1255981056. Throughput: 0: 49143.1. Samples: 784803000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 20:46:10,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:46:12,584][71000] Updated weights for policy 0, policy_version 76664 (0.0032) [2024-06-12 20:46:15,603][71000] Updated weights for policy 0, policy_version 76674 (0.0028) [2024-06-12 20:46:15,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 1256226816. Throughput: 0: 49091.9. Samples: 785093440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 20:46:15,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:46:19,217][71000] Updated weights for policy 0, policy_version 76684 (0.0029) [2024-06-12 20:46:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.1, 300 sec: 49041.0). Total num frames: 1256472576. Throughput: 0: 49020.0. Samples: 785242320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 20:46:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:46:22,009][71000] Updated weights for policy 0, policy_version 76694 (0.0019) [2024-06-12 20:46:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.9, 300 sec: 48929.8). Total num frames: 1256701952. Throughput: 0: 49232.7. Samples: 785540240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 20:46:25,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:46:26,051][71000] Updated weights for policy 0, policy_version 76704 (0.0032) [2024-06-12 20:46:28,841][70980] Signal inference workers to stop experience collection... (11550 times) [2024-06-12 20:46:28,843][70980] Signal inference workers to resume experience collection... 
(11550 times) [2024-06-12 20:46:28,853][71000] InferenceWorker_p0-w0: stopping experience collection (11550 times) [2024-06-12 20:46:28,883][71000] InferenceWorker_p0-w0: resuming experience collection (11550 times) [2024-06-12 20:46:29,281][71000] Updated weights for policy 0, policy_version 76714 (0.0022) [2024-06-12 20:46:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48882.3, 300 sec: 48985.4). Total num frames: 1256947712. Throughput: 0: 48989.9. Samples: 785826400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 20:46:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:46:32,478][71000] Updated weights for policy 0, policy_version 76724 (0.0042) [2024-06-12 20:46:35,878][71000] Updated weights for policy 0, policy_version 76734 (0.0022) [2024-06-12 20:46:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 1257209856. Throughput: 0: 48937.3. Samples: 785973220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 20:46:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 20:46:39,167][71000] Updated weights for policy 0, policy_version 76744 (0.0022) [2024-06-12 20:46:40,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.8, 300 sec: 49096.4). Total num frames: 1257455616. Throughput: 0: 48929.3. Samples: 786272280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 20:46:40,949][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:46:42,308][71000] Updated weights for policy 0, policy_version 76754 (0.0026) [2024-06-12 20:46:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 1257684992. Throughput: 0: 49103.8. Samples: 786567460. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 20:46:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:46:46,047][71000] Updated weights for policy 0, policy_version 76764 (0.0027) [2024-06-12 20:46:48,969][71000] Updated weights for policy 0, policy_version 76774 (0.0030) [2024-06-12 20:46:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.7, 300 sec: 48985.4). Total num frames: 1257930752. Throughput: 0: 49186.9. Samples: 786720940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 20:46:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:46:52,638][71000] Updated weights for policy 0, policy_version 76784 (0.0021) [2024-06-12 20:46:55,941][70768] Fps is (10 sec: 47504.5, 60 sec: 48604.4, 300 sec: 48929.5). Total num frames: 1258160128. Throughput: 0: 48885.5. Samples: 787002940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 20:46:55,942][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:46:56,187][71000] Updated weights for policy 0, policy_version 76794 (0.0038) [2024-06-12 20:46:59,277][71000] Updated weights for policy 0, policy_version 76804 (0.0022) [2024-06-12 20:47:00,940][70768] Fps is (10 sec: 47514.6, 60 sec: 48606.4, 300 sec: 48985.4). Total num frames: 1258405888. Throughput: 0: 48847.2. Samples: 787291560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 20:47:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 20:47:02,847][71000] Updated weights for policy 0, policy_version 76814 (0.0032) [2024-06-12 20:47:05,915][71000] Updated weights for policy 0, policy_version 76824 (0.0029) [2024-06-12 20:47:05,940][70768] Fps is (10 sec: 52437.7, 60 sec: 49424.9, 300 sec: 48985.4). Total num frames: 1258684416. Throughput: 0: 48787.7. Samples: 787437780. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 20:47:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:47:09,565][71000] Updated weights for policy 0, policy_version 76834 (0.0036) [2024-06-12 20:47:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 1258913792. Throughput: 0: 48861.0. Samples: 787738980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 20:47:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 20:47:13,082][71000] Updated weights for policy 0, policy_version 76844 (0.0034) [2024-06-12 20:47:15,940][70768] Fps is (10 sec: 45876.3, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 1259143168. Throughput: 0: 48907.1. Samples: 788027220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 20:47:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:47:16,135][71000] Updated weights for policy 0, policy_version 76854 (0.0035) [2024-06-12 20:47:19,640][71000] Updated weights for policy 0, policy_version 76864 (0.0040) [2024-06-12 20:47:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 1259405312. Throughput: 0: 48808.0. Samples: 788169580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 20:47:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:47:23,169][71000] Updated weights for policy 0, policy_version 76874 (0.0034) [2024-06-12 20:47:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1259634688. Throughput: 0: 48676.7. Samples: 788462720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 20:47:25,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 20:47:26,167][71000] Updated weights for policy 0, policy_version 76884 (0.0030) [2024-06-12 20:47:29,751][71000] Updated weights for policy 0, policy_version 76894 (0.0028) [2024-06-12 20:47:30,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 48929.9). 
Total num frames: 1259896832. Throughput: 0: 48867.1. Samples: 788766480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 20:47:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:47:32,820][71000] Updated weights for policy 0, policy_version 76904 (0.0025) [2024-06-12 20:47:35,940][70768] Fps is (10 sec: 49148.3, 60 sec: 48605.3, 300 sec: 48874.2). Total num frames: 1260126208. Throughput: 0: 48687.4. Samples: 788911900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 20:47:35,941][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:47:36,622][71000] Updated weights for policy 0, policy_version 76914 (0.0033) [2024-06-12 20:47:39,839][71000] Updated weights for policy 0, policy_version 76924 (0.0027) [2024-06-12 20:47:40,940][70768] Fps is (10 sec: 49150.9, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 1260388352. Throughput: 0: 48806.3. Samples: 789199140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 20:47:40,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:47:40,945][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000076928_1260388352.pth... [2024-06-12 20:47:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000076212_1248657408.pth [2024-06-12 20:47:43,132][71000] Updated weights for policy 0, policy_version 76934 (0.0029) [2024-06-12 20:47:45,939][70768] Fps is (10 sec: 49155.9, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 1260617728. Throughput: 0: 49132.9. Samples: 789502540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 20:47:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:47:46,251][71000] Updated weights for policy 0, policy_version 76944 (0.0030) [2024-06-12 20:47:49,466][71000] Updated weights for policy 0, policy_version 76954 (0.0028) [2024-06-12 20:47:50,467][70980] Signal inference workers to stop experience collection... 
(11600 times) [2024-06-12 20:47:50,510][71000] InferenceWorker_p0-w0: stopping experience collection (11600 times) [2024-06-12 20:47:50,515][70980] Signal inference workers to resume experience collection... (11600 times) [2024-06-12 20:47:50,526][71000] InferenceWorker_p0-w0: resuming experience collection (11600 times) [2024-06-12 20:47:50,940][70768] Fps is (10 sec: 49153.3, 60 sec: 49152.2, 300 sec: 48985.4). Total num frames: 1260879872. Throughput: 0: 49028.3. Samples: 789644040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-12 20:47:50,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:47:52,669][71000] Updated weights for policy 0, policy_version 76964 (0.0032) [2024-06-12 20:47:55,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49153.6, 300 sec: 48929.9). Total num frames: 1261109248. Throughput: 0: 48990.7. Samples: 789943560. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-12 20:47:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:47:56,361][71000] Updated weights for policy 0, policy_version 76974 (0.0030) [2024-06-12 20:47:59,451][71000] Updated weights for policy 0, policy_version 76984 (0.0037) [2024-06-12 20:48:00,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 48929.9). Total num frames: 1261371392. Throughput: 0: 49068.0. Samples: 790235280. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-12 20:48:00,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:48:03,346][71000] Updated weights for policy 0, policy_version 76994 (0.0026) [2024-06-12 20:48:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48606.1, 300 sec: 48929.8). Total num frames: 1261600768. Throughput: 0: 49260.1. Samples: 790386280. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-12 20:48:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:48:06,206][71000] Updated weights for policy 0, policy_version 77004 (0.0026) [2024-06-12 20:48:09,730][71000] Updated weights for policy 0, policy_version 77014 (0.0022) [2024-06-12 20:48:10,939][70768] Fps is (10 sec: 47513.5, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 1261846528. Throughput: 0: 49260.0. Samples: 790679420. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-12 20:48:10,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 20:48:12,783][71000] Updated weights for policy 0, policy_version 77024 (0.0027) [2024-06-12 20:48:15,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 1262108672. Throughput: 0: 49153.0. Samples: 790978360. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-12 20:48:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:48:16,058][71000] Updated weights for policy 0, policy_version 77034 (0.0027) [2024-06-12 20:48:19,250][71000] Updated weights for policy 0, policy_version 77044 (0.0029) [2024-06-12 20:48:20,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 1262370816. Throughput: 0: 49442.6. Samples: 791136780. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-12 20:48:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:48:23,121][71000] Updated weights for policy 0, policy_version 77054 (0.0031) [2024-06-12 20:48:25,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 1262600192. Throughput: 0: 49441.9. Samples: 791424020. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-12 20:48:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:48:26,143][71000] Updated weights for policy 0, policy_version 77064 (0.0031) [2024-06-12 20:48:29,830][71000] Updated weights for policy 0, policy_version 77074 (0.0026) [2024-06-12 20:48:30,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 1262829568. Throughput: 0: 49240.3. Samples: 791718360. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-12 20:48:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:48:32,825][71000] Updated weights for policy 0, policy_version 77084 (0.0036) [2024-06-12 20:48:35,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.5, 300 sec: 48929.8). Total num frames: 1263075328. Throughput: 0: 49152.3. Samples: 791855900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 20:48:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:48:36,262][71000] Updated weights for policy 0, policy_version 77094 (0.0022) [2024-06-12 20:48:39,365][71000] Updated weights for policy 0, policy_version 77104 (0.0031) [2024-06-12 20:48:40,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.3, 300 sec: 49040.9). Total num frames: 1263353856. Throughput: 0: 49238.6. Samples: 792159300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 20:48:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:48:42,751][71000] Updated weights for policy 0, policy_version 77114 (0.0033) [2024-06-12 20:48:45,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 1263583232. Throughput: 0: 49549.3. Samples: 792465000. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 20:48:45,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:48:46,045][71000] Updated weights for policy 0, policy_version 77124 (0.0035) [2024-06-12 20:48:49,717][71000] Updated weights for policy 0, policy_version 77134 (0.0029) [2024-06-12 20:48:50,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 1263812608. Throughput: 0: 49295.0. Samples: 792604560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 20:48:50,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:48:52,517][71000] Updated weights for policy 0, policy_version 77144 (0.0026) [2024-06-12 20:48:55,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 1264074752. Throughput: 0: 49256.0. Samples: 792895940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 20:48:55,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:48:56,123][71000] Updated weights for policy 0, policy_version 77154 (0.0030) [2024-06-12 20:48:58,820][70980] Signal inference workers to stop experience collection... (11650 times) [2024-06-12 20:48:58,821][70980] Signal inference workers to resume experience collection... (11650 times) [2024-06-12 20:48:58,830][71000] InferenceWorker_p0-w0: stopping experience collection (11650 times) [2024-06-12 20:48:58,830][71000] InferenceWorker_p0-w0: resuming experience collection (11650 times) [2024-06-12 20:48:59,411][71000] Updated weights for policy 0, policy_version 77164 (0.0031) [2024-06-12 20:49:00,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49698.0, 300 sec: 49152.0). Total num frames: 1264353280. Throughput: 0: 49128.2. Samples: 793189140. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 20:49:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:49:02,494][71000] Updated weights for policy 0, policy_version 77174 (0.0032) [2024-06-12 20:49:05,892][71000] Updated weights for policy 0, policy_version 77184 (0.0028) [2024-06-12 20:49:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 1264582656. Throughput: 0: 49096.5. Samples: 793346120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 20:49:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:49:09,363][71000] Updated weights for policy 0, policy_version 77194 (0.0031) [2024-06-12 20:49:10,940][70768] Fps is (10 sec: 44236.8, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 1264795648. Throughput: 0: 49317.3. Samples: 793643300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 20:49:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:49:12,459][71000] Updated weights for policy 0, policy_version 77204 (0.0024) [2024-06-12 20:49:15,889][71000] Updated weights for policy 0, policy_version 77214 (0.0033) [2024-06-12 20:49:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 1265074176. Throughput: 0: 49167.2. Samples: 793930880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 20:49:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:49:19,227][71000] Updated weights for policy 0, policy_version 77224 (0.0031) [2024-06-12 20:49:20,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 1265319936. Throughput: 0: 49477.0. Samples: 794082360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 20:49:20,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:49:22,369][71000] Updated weights for policy 0, policy_version 77234 (0.0026) [2024-06-12 20:49:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.1, 300 sec: 49040.9). 
Total num frames: 1265549312. Throughput: 0: 49279.1. Samples: 794376860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 20:49:25,949][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:49:26,075][71000] Updated weights for policy 0, policy_version 77244 (0.0033) [2024-06-12 20:49:29,159][71000] Updated weights for policy 0, policy_version 77254 (0.0030) [2024-06-12 20:49:30,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1265778688. Throughput: 0: 49166.5. Samples: 794677500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 20:49:30,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 20:49:32,794][71000] Updated weights for policy 0, policy_version 77264 (0.0027) [2024-06-12 20:49:35,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 1266040832. Throughput: 0: 49120.4. Samples: 794814980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 20:49:35,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:49:36,080][71000] Updated weights for policy 0, policy_version 77274 (0.0034) [2024-06-12 20:49:39,213][71000] Updated weights for policy 0, policy_version 77284 (0.0028) [2024-06-12 20:49:40,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 1266302976. Throughput: 0: 49060.2. Samples: 795103660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 20:49:40,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:49:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000077289_1266302976.pth... [2024-06-12 20:49:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000076569_1254506496.pth [2024-06-12 20:49:42,671][71000] Updated weights for policy 0, policy_version 77294 (0.0031) [2024-06-12 20:49:45,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 1266532352. 
Throughput: 0: 49054.0. Samples: 795396560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 20:49:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:49:46,388][71000] Updated weights for policy 0, policy_version 77304 (0.0029) [2024-06-12 20:49:49,290][71000] Updated weights for policy 0, policy_version 77314 (0.0027) [2024-06-12 20:49:50,940][70768] Fps is (10 sec: 44237.5, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 1266745344. Throughput: 0: 48675.6. Samples: 795536520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 20:49:50,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:49:52,979][71000] Updated weights for policy 0, policy_version 77324 (0.0032) [2024-06-12 20:49:55,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 1267007488. Throughput: 0: 48591.1. Samples: 795829900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 20:49:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:49:56,310][71000] Updated weights for policy 0, policy_version 77334 (0.0043) [2024-06-12 20:49:59,831][71000] Updated weights for policy 0, policy_version 77344 (0.0025) [2024-06-12 20:50:00,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48332.9, 300 sec: 49040.9). Total num frames: 1267253248. Throughput: 0: 48593.8. Samples: 796117600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 20:50:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:50:03,303][71000] Updated weights for policy 0, policy_version 77354 (0.0028) [2024-06-12 20:50:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.7, 300 sec: 49040.9). Total num frames: 1267499008. Throughput: 0: 48551.0. Samples: 796267160. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 20:50:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:50:06,445][71000] Updated weights for policy 0, policy_version 77364 (0.0025) [2024-06-12 20:50:10,022][71000] Updated weights for policy 0, policy_version 77374 (0.0029) [2024-06-12 20:50:10,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48606.0, 300 sec: 48929.8). Total num frames: 1267712000. Throughput: 0: 48318.7. Samples: 796551200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:50:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:50:11,832][70980] Signal inference workers to stop experience collection... (11700 times) [2024-06-12 20:50:11,833][70980] Signal inference workers to resume experience collection... (11700 times) [2024-06-12 20:50:11,853][71000] InferenceWorker_p0-w0: stopping experience collection (11700 times) [2024-06-12 20:50:11,853][71000] InferenceWorker_p0-w0: resuming experience collection (11700 times) [2024-06-12 20:50:13,537][71000] Updated weights for policy 0, policy_version 77384 (0.0025) [2024-06-12 20:50:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.7, 300 sec: 48985.4). Total num frames: 1267990528. Throughput: 0: 48237.7. Samples: 796848200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:50:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:50:16,494][71000] Updated weights for policy 0, policy_version 77394 (0.0031) [2024-06-12 20:50:20,191][71000] Updated weights for policy 0, policy_version 77404 (0.0030) [2024-06-12 20:50:20,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48059.6, 300 sec: 48930.0). Total num frames: 1268203520. Throughput: 0: 48471.9. Samples: 796996220. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:50:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:50:23,385][71000] Updated weights for policy 0, policy_version 77414 (0.0025) [2024-06-12 20:50:25,940][70768] Fps is (10 sec: 45876.1, 60 sec: 48332.9, 300 sec: 48930.5). Total num frames: 1268449280. Throughput: 0: 48357.1. Samples: 797279720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:50:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:50:26,878][71000] Updated weights for policy 0, policy_version 77424 (0.0033) [2024-06-12 20:50:30,243][71000] Updated weights for policy 0, policy_version 77434 (0.0025) [2024-06-12 20:50:30,939][70768] Fps is (10 sec: 49153.1, 60 sec: 48606.0, 300 sec: 48929.8). Total num frames: 1268695040. Throughput: 0: 48186.7. Samples: 797564960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:50:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 20:50:33,602][71000] Updated weights for policy 0, policy_version 77444 (0.0028) [2024-06-12 20:50:35,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48332.9, 300 sec: 48929.8). Total num frames: 1268940800. Throughput: 0: 48391.1. Samples: 797714120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:50:35,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:50:36,933][71000] Updated weights for policy 0, policy_version 77454 (0.0037) [2024-06-12 20:50:40,579][71000] Updated weights for policy 0, policy_version 77464 (0.0036) [2024-06-12 20:50:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48059.8, 300 sec: 48929.9). Total num frames: 1269186560. Throughput: 0: 48349.0. Samples: 798005600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:50:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:50:44,035][71000] Updated weights for policy 0, policy_version 77474 (0.0034) [2024-06-12 20:50:45,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48059.6, 300 sec: 48818.7). 
Total num frames: 1269415936. Throughput: 0: 48351.4. Samples: 798293420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:50:45,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 20:50:47,278][71000] Updated weights for policy 0, policy_version 77484 (0.0034) [2024-06-12 20:50:50,677][71000] Updated weights for policy 0, policy_version 77494 (0.0036) [2024-06-12 20:50:50,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1269661696. Throughput: 0: 48206.3. Samples: 798436440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 20:50:50,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:50:54,089][71000] Updated weights for policy 0, policy_version 77504 (0.0031) [2024-06-12 20:50:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 1269923840. Throughput: 0: 48422.1. Samples: 798730200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:50:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:50:57,616][71000] Updated weights for policy 0, policy_version 77514 (0.0027) [2024-06-12 20:51:00,845][71000] Updated weights for policy 0, policy_version 77524 (0.0035) [2024-06-12 20:51:00,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48332.8, 300 sec: 48929.8). Total num frames: 1270153216. Throughput: 0: 48208.2. Samples: 799017560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:51:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:51:04,449][71000] Updated weights for policy 0, policy_version 77534 (0.0027) [2024-06-12 20:51:05,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48059.9, 300 sec: 48818.8). Total num frames: 1270382592. Throughput: 0: 48134.4. Samples: 799162260. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:51:05,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:51:07,780][71000] Updated weights for policy 0, policy_version 77544 (0.0039) [2024-06-12 20:51:10,939][70768] Fps is (10 sec: 45875.4, 60 sec: 48332.9, 300 sec: 48763.3). Total num frames: 1270611968. Throughput: 0: 48393.3. Samples: 799457420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:51:10,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:51:11,158][71000] Updated weights for policy 0, policy_version 77554 (0.0031) [2024-06-12 20:51:14,675][71000] Updated weights for policy 0, policy_version 77564 (0.0028) [2024-06-12 20:51:15,940][70768] Fps is (10 sec: 47513.0, 60 sec: 47786.7, 300 sec: 48763.2). Total num frames: 1270857728. Throughput: 0: 48254.5. Samples: 799736420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:51:15,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:51:18,189][71000] Updated weights for policy 0, policy_version 77574 (0.0029) [2024-06-12 20:51:20,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48059.9, 300 sec: 48763.2). Total num frames: 1271087104. Throughput: 0: 48264.9. Samples: 799886040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:51:20,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:51:21,520][71000] Updated weights for policy 0, policy_version 77584 (0.0034) [2024-06-12 20:51:25,000][71000] Updated weights for policy 0, policy_version 77594 (0.0033) [2024-06-12 20:51:25,492][70980] Signal inference workers to stop experience collection... (11750 times) [2024-06-12 20:51:25,492][70980] Signal inference workers to resume experience collection... 
(11750 times) [2024-06-12 20:51:25,532][71000] InferenceWorker_p0-w0: stopping experience collection (11750 times) [2024-06-12 20:51:25,533][71000] InferenceWorker_p0-w0: resuming experience collection (11750 times) [2024-06-12 20:51:25,944][70768] Fps is (10 sec: 49131.3, 60 sec: 48329.3, 300 sec: 48818.1). Total num frames: 1271349248. Throughput: 0: 48227.9. Samples: 800176060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:51:25,944][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:51:28,531][71000] Updated weights for policy 0, policy_version 77604 (0.0039) [2024-06-12 20:51:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48332.7, 300 sec: 48763.2). Total num frames: 1271595008. Throughput: 0: 48052.5. Samples: 800455780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:51:30,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:51:31,778][71000] Updated weights for policy 0, policy_version 77614 (0.0030) [2024-06-12 20:51:35,349][71000] Updated weights for policy 0, policy_version 77624 (0.0030) [2024-06-12 20:51:35,940][70768] Fps is (10 sec: 49173.1, 60 sec: 48332.8, 300 sec: 48763.3). Total num frames: 1271840768. Throughput: 0: 48127.2. Samples: 800602160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:51:35,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:51:38,430][71000] Updated weights for policy 0, policy_version 77634 (0.0034) [2024-06-12 20:51:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 1272086528. Throughput: 0: 48073.0. Samples: 800893480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 20:51:40,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:51:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000077642_1272086528.pth... 
[2024-06-12 20:51:40,991][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000076928_1260388352.pth [2024-06-12 20:51:42,125][71000] Updated weights for policy 0, policy_version 77644 (0.0025) [2024-06-12 20:51:45,054][71000] Updated weights for policy 0, policy_version 77654 (0.0036) [2024-06-12 20:51:45,944][70768] Fps is (10 sec: 49131.0, 60 sec: 48602.5, 300 sec: 48818.1). Total num frames: 1272332288. Throughput: 0: 48353.6. Samples: 801193680. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-12 20:51:45,944][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:51:48,679][71000] Updated weights for policy 0, policy_version 77664 (0.0025) [2024-06-12 20:51:50,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48332.9, 300 sec: 48819.1). Total num frames: 1272561664. Throughput: 0: 48363.9. Samples: 801338640. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-12 20:51:50,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:51:52,027][71000] Updated weights for policy 0, policy_version 77674 (0.0030) [2024-06-12 20:51:55,654][71000] Updated weights for policy 0, policy_version 77684 (0.0032) [2024-06-12 20:51:55,940][70768] Fps is (10 sec: 45894.4, 60 sec: 47786.7, 300 sec: 48763.2). Total num frames: 1272791040. Throughput: 0: 48055.4. Samples: 801619920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-12 20:51:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:51:58,452][71000] Updated weights for policy 0, policy_version 77694 (0.0026) [2024-06-12 20:52:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48332.7, 300 sec: 48707.7). Total num frames: 1273053184. Throughput: 0: 48423.1. Samples: 801915460. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-12 20:52:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 20:52:02,153][71000] Updated weights for policy 0, policy_version 77704 (0.0031) [2024-06-12 20:52:05,082][71000] Updated weights for policy 0, policy_version 77714 (0.0022) [2024-06-12 20:52:05,939][70768] Fps is (10 sec: 52429.6, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1273315328. Throughput: 0: 48528.5. Samples: 802069820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-12 20:52:05,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:52:08,608][71000] Updated weights for policy 0, policy_version 77724 (0.0024) [2024-06-12 20:52:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 1273544704. Throughput: 0: 48901.9. Samples: 802376440. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-12 20:52:10,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:52:11,618][71000] Updated weights for policy 0, policy_version 77734 (0.0023) [2024-06-12 20:52:15,719][71000] Updated weights for policy 0, policy_version 77744 (0.0028) [2024-06-12 20:52:15,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48332.9, 300 sec: 48652.2). Total num frames: 1273757696. Throughput: 0: 49049.8. Samples: 802663020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-12 20:52:15,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:52:18,430][71000] Updated weights for policy 0, policy_version 77754 (0.0027) [2024-06-12 20:52:20,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.1, 300 sec: 48874.3). Total num frames: 1274052608. Throughput: 0: 48882.7. Samples: 802801880. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-12 20:52:20,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:52:22,430][71000] Updated weights for policy 0, policy_version 77764 (0.0034) [2024-06-12 20:52:25,080][71000] Updated weights for policy 0, policy_version 77774 (0.0029) [2024-06-12 20:52:25,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48882.4, 300 sec: 48763.2). Total num frames: 1274281984. Throughput: 0: 48844.0. Samples: 803091460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-12 20:52:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:52:28,956][71000] Updated weights for policy 0, policy_version 77784 (0.0024) [2024-06-12 20:52:30,940][70768] Fps is (10 sec: 45874.4, 60 sec: 48605.8, 300 sec: 48763.3). Total num frames: 1274511360. Throughput: 0: 48798.3. Samples: 803389400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 20:52:30,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:52:30,997][70980] Signal inference workers to stop experience collection... (11800 times) [2024-06-12 20:52:31,001][70980] Signal inference workers to resume experience collection... (11800 times) [2024-06-12 20:52:31,023][71000] InferenceWorker_p0-w0: stopping experience collection (11800 times) [2024-06-12 20:52:31,024][71000] InferenceWorker_p0-w0: resuming experience collection (11800 times) [2024-06-12 20:52:31,621][71000] Updated weights for policy 0, policy_version 77794 (0.0034) [2024-06-12 20:52:35,940][70768] Fps is (10 sec: 44237.0, 60 sec: 48059.8, 300 sec: 48596.7). Total num frames: 1274724352. Throughput: 0: 48707.6. Samples: 803530480. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 20:52:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:52:36,004][71000] Updated weights for policy 0, policy_version 77804 (0.0024) [2024-06-12 20:52:38,499][71000] Updated weights for policy 0, policy_version 77814 (0.0028) [2024-06-12 20:52:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 1275019264. Throughput: 0: 48968.4. Samples: 803823500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 20:52:40,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:52:42,681][71000] Updated weights for policy 0, policy_version 77824 (0.0025) [2024-06-12 20:52:45,159][71000] Updated weights for policy 0, policy_version 77834 (0.0030) [2024-06-12 20:52:45,940][70768] Fps is (10 sec: 52427.7, 60 sec: 48609.2, 300 sec: 48707.7). Total num frames: 1275248640. Throughput: 0: 48757.3. Samples: 804109540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 20:52:45,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:52:49,076][71000] Updated weights for policy 0, policy_version 77844 (0.0031) [2024-06-12 20:52:50,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1275494400. Throughput: 0: 48894.6. Samples: 804270080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 20:52:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:52:51,626][71000] Updated weights for policy 0, policy_version 77854 (0.0027) [2024-06-12 20:52:55,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48605.8, 300 sec: 48596.6). Total num frames: 1275707392. Throughput: 0: 48662.2. Samples: 804566240. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 20:52:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:52:56,069][71000] Updated weights for policy 0, policy_version 77864 (0.0027) [2024-06-12 20:52:58,385][71000] Updated weights for policy 0, policy_version 77874 (0.0022) [2024-06-12 20:53:00,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.1, 300 sec: 48818.8). Total num frames: 1276002304. Throughput: 0: 48684.9. Samples: 804853840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 20:53:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:53:02,482][71000] Updated weights for policy 0, policy_version 77884 (0.0034) [2024-06-12 20:53:04,818][71000] Updated weights for policy 0, policy_version 77894 (0.0028) [2024-06-12 20:53:05,940][70768] Fps is (10 sec: 55706.3, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1276264448. Throughput: 0: 49260.9. Samples: 805018620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 20:53:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:53:09,306][71000] Updated weights for policy 0, policy_version 77904 (0.0032) [2024-06-12 20:53:10,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 1276477440. Throughput: 0: 49281.8. Samples: 805309140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 20:53:10,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:53:11,409][71000] Updated weights for policy 0, policy_version 77914 (0.0022) [2024-06-12 20:53:15,615][71000] Updated weights for policy 0, policy_version 77924 (0.0027) [2024-06-12 20:53:15,940][70768] Fps is (10 sec: 44236.9, 60 sec: 49152.0, 300 sec: 48596.6). Total num frames: 1276706816. Throughput: 0: 49232.6. Samples: 805604860. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 20:53:15,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:53:18,062][71000] Updated weights for policy 0, policy_version 77934 (0.0022) [2024-06-12 20:53:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 1276968960. Throughput: 0: 49102.6. Samples: 805740100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 20:53:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:53:22,256][71000] Updated weights for policy 0, policy_version 77944 (0.0031) [2024-06-12 20:53:24,949][71000] Updated weights for policy 0, policy_version 77954 (0.0027) [2024-06-12 20:53:25,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49151.9, 300 sec: 48818.8). Total num frames: 1277231104. Throughput: 0: 49144.4. Samples: 806035000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 20:53:25,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 20:53:28,661][71000] Updated weights for policy 0, policy_version 77964 (0.0025) [2024-06-12 20:53:30,169][70980] Signal inference workers to stop experience collection... (11850 times) [2024-06-12 20:53:30,170][70980] Signal inference workers to resume experience collection... (11850 times) [2024-06-12 20:53:30,180][71000] InferenceWorker_p0-w0: stopping experience collection (11850 times) [2024-06-12 20:53:30,180][71000] InferenceWorker_p0-w0: resuming experience collection (11850 times) [2024-06-12 20:53:30,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.2, 300 sec: 48818.8). Total num frames: 1277476864. Throughput: 0: 49614.0. Samples: 806342160. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 20:53:30,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 20:53:31,494][71000] Updated weights for policy 0, policy_version 77974 (0.0035) [2024-06-12 20:53:35,683][71000] Updated weights for policy 0, policy_version 77984 (0.0023) [2024-06-12 20:53:35,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49425.0, 300 sec: 48596.6). Total num frames: 1277689856. Throughput: 0: 49035.6. Samples: 806476680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 20:53:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:53:38,177][71000] Updated weights for policy 0, policy_version 77994 (0.0045) [2024-06-12 20:53:40,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 1277952000. Throughput: 0: 48967.7. Samples: 806769780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 20:53:40,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:53:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000078000_1277952000.pth... [2024-06-12 20:53:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000077289_1266302976.pth [2024-06-12 20:53:42,359][71000] Updated weights for policy 0, policy_version 78004 (0.0030) [2024-06-12 20:53:44,909][71000] Updated weights for policy 0, policy_version 78014 (0.0037) [2024-06-12 20:53:45,940][70768] Fps is (10 sec: 54067.4, 60 sec: 49698.3, 300 sec: 48874.3). Total num frames: 1278230528. Throughput: 0: 49199.5. Samples: 807067820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 20:53:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:53:48,917][71000] Updated weights for policy 0, policy_version 78024 (0.0033) [2024-06-12 20:53:50,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.9, 300 sec: 48652.1). Total num frames: 1278427136. Throughput: 0: 48808.4. Samples: 807215000. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 20:53:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 20:53:51,929][71000] Updated weights for policy 0, policy_version 78034 (0.0028) [2024-06-12 20:53:55,874][71000] Updated weights for policy 0, policy_version 78044 (0.0025) [2024-06-12 20:53:55,940][70768] Fps is (10 sec: 44236.9, 60 sec: 49425.2, 300 sec: 48541.1). Total num frames: 1278672896. Throughput: 0: 48815.6. Samples: 807505840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 20:53:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:53:58,490][71000] Updated weights for policy 0, policy_version 78054 (0.0027) [2024-06-12 20:54:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 48652.1). Total num frames: 1278935040. Throughput: 0: 48566.6. Samples: 807790360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 20:54:00,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:54:02,908][71000] Updated weights for policy 0, policy_version 78064 (0.0020) [2024-06-12 20:54:05,373][71000] Updated weights for policy 0, policy_version 78074 (0.0023) [2024-06-12 20:54:05,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1279197184. Throughput: 0: 49244.5. Samples: 807956100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:54:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:54:09,293][71000] Updated weights for policy 0, policy_version 78084 (0.0037) [2024-06-12 20:54:10,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 48652.1). Total num frames: 1279426560. Throughput: 0: 49408.9. Samples: 808258400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:54:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:54:11,975][71000] Updated weights for policy 0, policy_version 78094 (0.0029) [2024-06-12 20:54:15,940][70768] Fps is (10 sec: 44236.4, 60 sec: 48878.9, 300 sec: 48541.1). 
Total num frames: 1279639552. Throughput: 0: 48870.1. Samples: 808541320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:54:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:54:16,066][71000] Updated weights for policy 0, policy_version 78104 (0.0030) [2024-06-12 20:54:18,827][71000] Updated weights for policy 0, policy_version 78114 (0.0034) [2024-06-12 20:54:20,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 48707.7). Total num frames: 1279918080. Throughput: 0: 48990.7. Samples: 808681260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:54:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:54:22,934][71000] Updated weights for policy 0, policy_version 78124 (0.0024) [2024-06-12 20:54:25,551][71000] Updated weights for policy 0, policy_version 78134 (0.0027) [2024-06-12 20:54:25,940][70768] Fps is (10 sec: 52429.1, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 1280163840. Throughput: 0: 48911.1. Samples: 808970780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:54:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:54:29,889][71000] Updated weights for policy 0, policy_version 78144 (0.0032) [2024-06-12 20:54:30,940][70768] Fps is (10 sec: 44236.4, 60 sec: 48059.6, 300 sec: 48541.1). Total num frames: 1280360448. Throughput: 0: 48776.8. Samples: 809262780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:54:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:54:32,450][71000] Updated weights for policy 0, policy_version 78154 (0.0035) [2024-06-12 20:54:35,940][70768] Fps is (10 sec: 44236.9, 60 sec: 48605.9, 300 sec: 48485.6). Total num frames: 1280606208. Throughput: 0: 48354.7. Samples: 809390960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:54:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 20:54:36,538][70980] Signal inference workers to stop experience collection... 
(11900 times) [2024-06-12 20:54:36,538][70980] Signal inference workers to resume experience collection... (11900 times) [2024-06-12 20:54:36,575][71000] InferenceWorker_p0-w0: stopping experience collection (11900 times) [2024-06-12 20:54:36,575][71000] InferenceWorker_p0-w0: resuming experience collection (11900 times) [2024-06-12 20:54:36,676][71000] Updated weights for policy 0, policy_version 78164 (0.0026) [2024-06-12 20:54:39,296][71000] Updated weights for policy 0, policy_version 78174 (0.0024) [2024-06-12 20:54:40,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48878.8, 300 sec: 48652.1). Total num frames: 1280884736. Throughput: 0: 48326.9. Samples: 809680560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:54:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:54:43,367][71000] Updated weights for policy 0, policy_version 78184 (0.0024) [2024-06-12 20:54:45,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48059.7, 300 sec: 48707.7). Total num frames: 1281114112. Throughput: 0: 48741.8. Samples: 809983740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:54:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:54:46,005][71000] Updated weights for policy 0, policy_version 78194 (0.0025) [2024-06-12 20:54:50,228][71000] Updated weights for policy 0, policy_version 78204 (0.0031) [2024-06-12 20:54:50,939][70768] Fps is (10 sec: 45876.2, 60 sec: 48605.9, 300 sec: 48596.6). Total num frames: 1281343488. Throughput: 0: 48211.2. Samples: 810125600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 20:54:50,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:54:52,484][71000] Updated weights for policy 0, policy_version 78214 (0.0026) [2024-06-12 20:54:55,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48605.7, 300 sec: 48596.6). Total num frames: 1281589248. Throughput: 0: 48127.5. Samples: 810424140. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 19.0) [2024-06-12 20:54:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:54:56,607][71000] Updated weights for policy 0, policy_version 78224 (0.0032) [2024-06-12 20:54:59,329][71000] Updated weights for policy 0, policy_version 78234 (0.0030) [2024-06-12 20:55:00,940][70768] Fps is (10 sec: 52427.7, 60 sec: 48878.8, 300 sec: 48707.7). Total num frames: 1281867776. Throughput: 0: 48244.4. Samples: 810712320. Policy #0 lag: (min: 1.0, avg: 9.5, max: 19.0) [2024-06-12 20:55:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 20:55:03,307][71000] Updated weights for policy 0, policy_version 78244 (0.0033) [2024-06-12 20:55:05,940][70768] Fps is (10 sec: 50791.6, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 1282097152. Throughput: 0: 48769.8. Samples: 810875900. Policy #0 lag: (min: 1.0, avg: 9.5, max: 19.0) [2024-06-12 20:55:05,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:55:05,967][71000] Updated weights for policy 0, policy_version 78254 (0.0037) [2024-06-12 20:55:09,942][71000] Updated weights for policy 0, policy_version 78264 (0.0031) [2024-06-12 20:55:10,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48332.8, 300 sec: 48596.6). Total num frames: 1282326528. Throughput: 0: 48739.9. Samples: 811164080. Policy #0 lag: (min: 1.0, avg: 9.5, max: 19.0) [2024-06-12 20:55:10,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:55:12,495][71000] Updated weights for policy 0, policy_version 78274 (0.0028) [2024-06-12 20:55:15,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48879.1, 300 sec: 48707.7). Total num frames: 1282572288. Throughput: 0: 48757.1. Samples: 811456840. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 19.0) [2024-06-12 20:55:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:55:16,310][71000] Updated weights for policy 0, policy_version 78284 (0.0027) [2024-06-12 20:55:18,938][71000] Updated weights for policy 0, policy_version 78294 (0.0031) [2024-06-12 20:55:20,940][70768] Fps is (10 sec: 52429.6, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1282850816. Throughput: 0: 49411.1. Samples: 811614460. Policy #0 lag: (min: 1.0, avg: 9.5, max: 19.0) [2024-06-12 20:55:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:55:23,106][71000] Updated weights for policy 0, policy_version 78304 (0.0034) [2024-06-12 20:55:25,729][71000] Updated weights for policy 0, policy_version 78314 (0.0026) [2024-06-12 20:55:25,940][70768] Fps is (10 sec: 52426.8, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 1283096576. Throughput: 0: 49602.6. Samples: 811912680. Policy #0 lag: (min: 1.0, avg: 9.5, max: 19.0) [2024-06-12 20:55:25,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:55:29,783][71000] Updated weights for policy 0, policy_version 78324 (0.0025) [2024-06-12 20:55:30,940][70768] Fps is (10 sec: 45874.5, 60 sec: 49152.0, 300 sec: 48707.7). Total num frames: 1283309568. Throughput: 0: 49216.7. Samples: 812198500. Policy #0 lag: (min: 1.0, avg: 9.5, max: 19.0) [2024-06-12 20:55:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:55:32,745][71000] Updated weights for policy 0, policy_version 78334 (0.0032) [2024-06-12 20:55:35,940][70768] Fps is (10 sec: 45876.2, 60 sec: 49152.0, 300 sec: 48707.7). Total num frames: 1283555328. Throughput: 0: 49137.2. Samples: 812336780. Policy #0 lag: (min: 1.0, avg: 9.5, max: 19.0) [2024-06-12 20:55:35,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:55:36,493][71000] Updated weights for policy 0, policy_version 78344 (0.0026) [2024-06-12 20:55:38,203][70980] Signal inference workers to stop experience collection... 
(11950 times) [2024-06-12 20:55:38,217][71000] InferenceWorker_p0-w0: stopping experience collection (11950 times) [2024-06-12 20:55:38,259][70980] Signal inference workers to resume experience collection... (11950 times) [2024-06-12 20:55:38,260][71000] InferenceWorker_p0-w0: resuming experience collection (11950 times) [2024-06-12 20:55:39,269][71000] Updated weights for policy 0, policy_version 78354 (0.0034) [2024-06-12 20:55:40,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1283833856. Throughput: 0: 49036.2. Samples: 812630760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-12 20:55:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 20:55:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000078359_1283833856.pth... [2024-06-12 20:55:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000077642_1272086528.pth [2024-06-12 20:55:42,885][71000] Updated weights for policy 0, policy_version 78364 (0.0030) [2024-06-12 20:55:45,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 1284063232. Throughput: 0: 49197.5. Samples: 812926200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-12 20:55:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:55:45,990][71000] Updated weights for policy 0, policy_version 78374 (0.0024) [2024-06-12 20:55:49,770][71000] Updated weights for policy 0, policy_version 78384 (0.0029) [2024-06-12 20:55:50,940][70768] Fps is (10 sec: 42598.5, 60 sec: 48605.8, 300 sec: 48596.6). Total num frames: 1284259840. Throughput: 0: 48643.5. Samples: 813064860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-12 20:55:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:55:52,726][71000] Updated weights for policy 0, policy_version 78394 (0.0026) [2024-06-12 20:55:55,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.2, 300 sec: 48763.2). 
Total num frames: 1284538368. Throughput: 0: 48715.3. Samples: 813356260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-12 20:55:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:55:56,488][71000] Updated weights for policy 0, policy_version 78404 (0.0028) [2024-06-12 20:55:59,757][71000] Updated weights for policy 0, policy_version 78414 (0.0036) [2024-06-12 20:56:00,940][70768] Fps is (10 sec: 55705.8, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 1284816896. Throughput: 0: 48909.7. Samples: 813657780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-12 20:56:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:56:02,950][71000] Updated weights for policy 0, policy_version 78424 (0.0028) [2024-06-12 20:56:05,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 1285029888. Throughput: 0: 48660.8. Samples: 813804200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-12 20:56:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:56:06,506][71000] Updated weights for policy 0, policy_version 78434 (0.0028) [2024-06-12 20:56:09,231][71000] Updated weights for policy 0, policy_version 78444 (0.0024) [2024-06-12 20:56:10,939][70768] Fps is (10 sec: 42598.5, 60 sec: 48606.0, 300 sec: 48763.3). Total num frames: 1285242880. Throughput: 0: 48596.3. Samples: 814099500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-12 20:56:10,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:56:12,951][71000] Updated weights for policy 0, policy_version 78454 (0.0030) [2024-06-12 20:56:15,870][71000] Updated weights for policy 0, policy_version 78464 (0.0030) [2024-06-12 20:56:15,939][70768] Fps is (10 sec: 52429.8, 60 sec: 49698.1, 300 sec: 49040.9). Total num frames: 1285554176. Throughput: 0: 48829.1. Samples: 814395800. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-12 20:56:15,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:56:19,518][71000] Updated weights for policy 0, policy_version 78474 (0.0037) [2024-06-12 20:56:20,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48605.9, 300 sec: 48875.0). Total num frames: 1285767168. Throughput: 0: 49092.9. Samples: 814545960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-12 20:56:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:56:22,785][71000] Updated weights for policy 0, policy_version 78484 (0.0027) [2024-06-12 20:56:25,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 1286012928. Throughput: 0: 49144.4. Samples: 814842260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-12 20:56:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:56:26,462][71000] Updated weights for policy 0, policy_version 78494 (0.0034) [2024-06-12 20:56:29,933][71000] Updated weights for policy 0, policy_version 78504 (0.0026) [2024-06-12 20:56:30,939][70768] Fps is (10 sec: 45875.4, 60 sec: 48606.0, 300 sec: 48763.2). Total num frames: 1286225920. Throughput: 0: 48988.9. Samples: 815130700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 20:56:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:56:33,079][71000] Updated weights for policy 0, policy_version 78514 (0.0026) [2024-06-12 20:56:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1286504448. Throughput: 0: 49176.4. Samples: 815277800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 20:56:35,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 20:56:36,443][71000] Updated weights for policy 0, policy_version 78524 (0.0031) [2024-06-12 20:56:39,891][71000] Updated weights for policy 0, policy_version 78534 (0.0032) [2024-06-12 20:56:40,940][70768] Fps is (10 sec: 54066.4, 60 sec: 48878.9, 300 sec: 48930.5). 
Total num frames: 1286766592. Throughput: 0: 49104.7. Samples: 815565980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 20:56:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:56:43,015][71000] Updated weights for policy 0, policy_version 78544 (0.0033) [2024-06-12 20:56:45,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1286979584. Throughput: 0: 48935.1. Samples: 815859860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 20:56:45,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:56:46,435][71000] Updated weights for policy 0, policy_version 78554 (0.0025) [2024-06-12 20:56:49,650][71000] Updated weights for policy 0, policy_version 78564 (0.0036) [2024-06-12 20:56:50,939][70768] Fps is (10 sec: 44237.6, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1287208960. Throughput: 0: 48799.8. Samples: 816000180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 20:56:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:56:53,448][71000] Updated weights for policy 0, policy_version 78574 (0.0025) [2024-06-12 20:56:54,350][70980] Signal inference workers to stop experience collection... (12000 times) [2024-06-12 20:56:54,351][70980] Signal inference workers to resume experience collection... (12000 times) [2024-06-12 20:56:54,398][71000] InferenceWorker_p0-w0: stopping experience collection (12000 times) [2024-06-12 20:56:54,399][71000] InferenceWorker_p0-w0: resuming experience collection (12000 times) [2024-06-12 20:56:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 1287487488. Throughput: 0: 48888.4. Samples: 816299480. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 20:56:55,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:56:56,585][71000] Updated weights for policy 0, policy_version 78584 (0.0029) [2024-06-12 20:56:59,934][71000] Updated weights for policy 0, policy_version 78594 (0.0024) [2024-06-12 20:57:00,940][70768] Fps is (10 sec: 52428.0, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1287733248. Throughput: 0: 48852.3. Samples: 816594160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 20:57:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 20:57:03,439][71000] Updated weights for policy 0, policy_version 78604 (0.0022) [2024-06-12 20:57:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1287962624. Throughput: 0: 48833.7. Samples: 816743480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 20:57:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:57:06,475][71000] Updated weights for policy 0, policy_version 78614 (0.0038) [2024-06-12 20:57:09,975][71000] Updated weights for policy 0, policy_version 78624 (0.0032) [2024-06-12 20:57:10,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 1288208384. Throughput: 0: 48612.5. Samples: 817029820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-12 20:57:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:57:13,198][71000] Updated weights for policy 0, policy_version 78634 (0.0027) [2024-06-12 20:57:15,939][70768] Fps is (10 sec: 50791.0, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1288470528. Throughput: 0: 48845.4. Samples: 817328740. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-12 20:57:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:57:16,567][71000] Updated weights for policy 0, policy_version 78644 (0.0034) [2024-06-12 20:57:20,049][71000] Updated weights for policy 0, policy_version 78654 (0.0031) [2024-06-12 20:57:20,940][70768] Fps is (10 sec: 49150.8, 60 sec: 48878.7, 300 sec: 48874.3). Total num frames: 1288699904. Throughput: 0: 48922.8. Samples: 817479340. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-12 20:57:20,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 20:57:23,230][71000] Updated weights for policy 0, policy_version 78664 (0.0029) [2024-06-12 20:57:25,943][70768] Fps is (10 sec: 47495.0, 60 sec: 48875.9, 300 sec: 48929.2). Total num frames: 1288945664. Throughput: 0: 49045.7. Samples: 817773220. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-12 20:57:25,944][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:57:26,554][71000] Updated weights for policy 0, policy_version 78674 (0.0032) [2024-06-12 20:57:30,095][71000] Updated weights for policy 0, policy_version 78684 (0.0040) [2024-06-12 20:57:30,940][70768] Fps is (10 sec: 49153.2, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 1289191424. Throughput: 0: 49074.7. Samples: 818068220. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-12 20:57:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:57:33,182][71000] Updated weights for policy 0, policy_version 78694 (0.0029) [2024-06-12 20:57:35,940][70768] Fps is (10 sec: 50809.5, 60 sec: 49151.9, 300 sec: 48929.9). Total num frames: 1289453568. Throughput: 0: 49142.5. Samples: 818211600. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-12 20:57:35,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 20:57:36,601][71000] Updated weights for policy 0, policy_version 78704 (0.0038) [2024-06-12 20:57:40,026][71000] Updated weights for policy 0, policy_version 78714 (0.0037) [2024-06-12 20:57:40,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.8, 300 sec: 48929.9). Total num frames: 1289682944. Throughput: 0: 49153.2. Samples: 818511380. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-12 20:57:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:57:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000078716_1289682944.pth... [2024-06-12 20:57:40,986][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000078000_1277952000.pth [2024-06-12 20:57:43,195][71000] Updated weights for policy 0, policy_version 78724 (0.0025) [2024-06-12 20:57:45,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1289912320. Throughput: 0: 49169.9. Samples: 818806800. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-12 20:57:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:57:46,875][71000] Updated weights for policy 0, policy_version 78734 (0.0029) [2024-06-12 20:57:50,104][71000] Updated weights for policy 0, policy_version 78744 (0.0032) [2024-06-12 20:57:50,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.0, 300 sec: 49041.0). Total num frames: 1290174464. Throughput: 0: 48949.0. Samples: 818946180. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-12 20:57:50,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:57:53,610][71000] Updated weights for policy 0, policy_version 78754 (0.0028) [2024-06-12 20:57:55,940][70768] Fps is (10 sec: 50789.4, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 1290420224. Throughput: 0: 49345.6. Samples: 819250380. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-12 20:57:55,941][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:57:56,573][71000] Updated weights for policy 0, policy_version 78764 (0.0028) [2024-06-12 20:57:59,907][71000] Updated weights for policy 0, policy_version 78774 (0.0032) [2024-06-12 20:58:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1290665984. Throughput: 0: 49172.8. Samples: 819541520. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-12 20:58:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:58:01,299][70980] Signal inference workers to stop experience collection... (12050 times) [2024-06-12 20:58:01,305][70980] Signal inference workers to resume experience collection... (12050 times) [2024-06-12 20:58:01,310][71000] InferenceWorker_p0-w0: stopping experience collection (12050 times) [2024-06-12 20:58:01,342][71000] InferenceWorker_p0-w0: resuming experience collection (12050 times) [2024-06-12 20:58:03,322][71000] Updated weights for policy 0, policy_version 78784 (0.0025) [2024-06-12 20:58:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 1290911744. Throughput: 0: 49101.5. Samples: 819688900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 20:58:05,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 20:58:07,017][71000] Updated weights for policy 0, policy_version 78794 (0.0029) [2024-06-12 20:58:09,842][71000] Updated weights for policy 0, policy_version 78804 (0.0025) [2024-06-12 20:58:10,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 1291173888. Throughput: 0: 49210.0. Samples: 819987480. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 20:58:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 20:58:13,418][71000] Updated weights for policy 0, policy_version 78814 (0.0034) [2024-06-12 20:58:15,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 1291403264. Throughput: 0: 48977.4. Samples: 820272200. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 20:58:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:58:16,568][71000] Updated weights for policy 0, policy_version 78824 (0.0031) [2024-06-12 20:58:20,138][71000] Updated weights for policy 0, policy_version 78834 (0.0025) [2024-06-12 20:58:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.2, 300 sec: 48874.3). Total num frames: 1291649024. Throughput: 0: 49333.4. Samples: 820431600. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 20:58:20,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:58:23,356][71000] Updated weights for policy 0, policy_version 78844 (0.0020) [2024-06-12 20:58:25,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49155.2, 300 sec: 48874.3). Total num frames: 1291894784. Throughput: 0: 49224.7. Samples: 820726480. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 20:58:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:58:26,427][71000] Updated weights for policy 0, policy_version 78854 (0.0029) [2024-06-12 20:58:29,929][71000] Updated weights for policy 0, policy_version 78864 (0.0033) [2024-06-12 20:58:30,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 1292140544. Throughput: 0: 49031.1. Samples: 821013200. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 20:58:30,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:58:33,571][71000] Updated weights for policy 0, policy_version 78874 (0.0031) [2024-06-12 20:58:35,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48605.9, 300 sec: 48874.3). 
Total num frames: 1292369920. Throughput: 0: 49234.5. Samples: 821161740. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 20:58:35,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 20:58:36,772][71000] Updated weights for policy 0, policy_version 78884 (0.0038) [2024-06-12 20:58:40,100][71000] Updated weights for policy 0, policy_version 78894 (0.0029) [2024-06-12 20:58:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 48818.7). Total num frames: 1292632064. Throughput: 0: 48965.4. Samples: 821453820. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 20:58:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:58:43,505][71000] Updated weights for policy 0, policy_version 78904 (0.0026) [2024-06-12 20:58:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 1292861440. Throughput: 0: 49003.9. Samples: 821746700. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 20:58:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:58:47,158][71000] Updated weights for policy 0, policy_version 78914 (0.0022) [2024-06-12 20:58:50,336][71000] Updated weights for policy 0, policy_version 78924 (0.0041) [2024-06-12 20:58:50,939][70768] Fps is (10 sec: 47514.5, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 1293107200. Throughput: 0: 48863.8. Samples: 821887760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 20:58:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 20:58:53,828][71000] Updated weights for policy 0, policy_version 78934 (0.0029) [2024-06-12 20:58:55,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 1293336576. Throughput: 0: 48726.7. Samples: 822180180. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 20:58:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 20:58:57,061][71000] Updated weights for policy 0, policy_version 78944 (0.0034) [2024-06-12 20:59:00,433][71000] Updated weights for policy 0, policy_version 78954 (0.0026) [2024-06-12 20:59:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1293598720. Throughput: 0: 48892.0. Samples: 822472340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 20:59:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:59:03,538][71000] Updated weights for policy 0, policy_version 78964 (0.0025) [2024-06-12 20:59:05,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 1293828096. Throughput: 0: 48626.2. Samples: 822619780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 20:59:05,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 20:59:07,138][71000] Updated weights for policy 0, policy_version 78974 (0.0027) [2024-06-12 20:59:07,969][70980] Signal inference workers to stop experience collection... (12100 times) [2024-06-12 20:59:08,018][70980] Signal inference workers to resume experience collection... (12100 times) [2024-06-12 20:59:08,019][71000] InferenceWorker_p0-w0: stopping experience collection (12100 times) [2024-06-12 20:59:08,029][71000] InferenceWorker_p0-w0: resuming experience collection (12100 times) [2024-06-12 20:59:10,320][71000] Updated weights for policy 0, policy_version 78984 (0.0030) [2024-06-12 20:59:10,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 1294090240. Throughput: 0: 48608.9. Samples: 822913880. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 20:59:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 20:59:13,900][71000] Updated weights for policy 0, policy_version 78994 (0.0029) [2024-06-12 20:59:15,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 1294336000. Throughput: 0: 48712.8. Samples: 823205280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 20:59:15,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 20:59:17,089][71000] Updated weights for policy 0, policy_version 79004 (0.0028) [2024-06-12 20:59:20,633][71000] Updated weights for policy 0, policy_version 79014 (0.0023) [2024-06-12 20:59:20,940][70768] Fps is (10 sec: 49149.8, 60 sec: 48878.6, 300 sec: 48874.2). Total num frames: 1294581760. Throughput: 0: 48665.4. Samples: 823351700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 20:59:20,941][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:59:23,894][71000] Updated weights for policy 0, policy_version 79024 (0.0030) [2024-06-12 20:59:25,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 1294811136. Throughput: 0: 48394.3. Samples: 823631560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 20:59:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:59:27,451][71000] Updated weights for policy 0, policy_version 79034 (0.0027) [2024-06-12 20:59:30,690][71000] Updated weights for policy 0, policy_version 79044 (0.0040) [2024-06-12 20:59:30,939][70768] Fps is (10 sec: 47515.8, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 1295056896. Throughput: 0: 48287.7. Samples: 823919640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 20:59:30,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 20:59:34,131][71000] Updated weights for policy 0, policy_version 79054 (0.0034) [2024-06-12 20:59:35,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 48874.3). 
Total num frames: 1295302656. Throughput: 0: 48617.2. Samples: 824075540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-12 20:59:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 20:59:37,234][71000] Updated weights for policy 0, policy_version 79064 (0.0020) [2024-06-12 20:59:40,707][71000] Updated weights for policy 0, policy_version 79074 (0.0028) [2024-06-12 20:59:40,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 1295548416. Throughput: 0: 48611.0. Samples: 824367680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 20:59:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 20:59:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000079074_1295548416.pth... [2024-06-12 20:59:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000078359_1283833856.pth [2024-06-12 20:59:44,211][71000] Updated weights for policy 0, policy_version 79084 (0.0041) [2024-06-12 20:59:45,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 1295777792. Throughput: 0: 48687.6. Samples: 824663280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 20:59:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 20:59:47,736][71000] Updated weights for policy 0, policy_version 79094 (0.0027) [2024-06-12 20:59:50,704][71000] Updated weights for policy 0, policy_version 79104 (0.0027) [2024-06-12 20:59:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 1296039936. Throughput: 0: 48604.3. Samples: 824806980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 20:59:50,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 20:59:54,448][71000] Updated weights for policy 0, policy_version 79114 (0.0029) [2024-06-12 20:59:55,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1296285696. 
Throughput: 0: 48677.4. Samples: 825104360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 20:59:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 20:59:57,430][71000] Updated weights for policy 0, policy_version 79124 (0.0039) [2024-06-12 21:00:00,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1296515072. Throughput: 0: 48723.2. Samples: 825397820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 21:00:00,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:00:00,977][71000] Updated weights for policy 0, policy_version 79134 (0.0033) [2024-06-12 21:00:03,880][71000] Updated weights for policy 0, policy_version 79144 (0.0029) [2024-06-12 21:00:05,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1296744448. Throughput: 0: 48668.9. Samples: 825541780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 21:00:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:00:07,647][70980] Signal inference workers to stop experience collection... (12150 times) [2024-06-12 21:00:07,648][70980] Signal inference workers to resume experience collection... (12150 times) [2024-06-12 21:00:07,665][71000] Updated weights for policy 0, policy_version 79154 (0.0027) [2024-06-12 21:00:07,691][71000] InferenceWorker_p0-w0: stopping experience collection (12150 times) [2024-06-12 21:00:07,691][71000] InferenceWorker_p0-w0: resuming experience collection (12150 times) [2024-06-12 21:00:10,871][71000] Updated weights for policy 0, policy_version 79164 (0.0028) [2024-06-12 21:00:10,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48878.8, 300 sec: 48985.3). Total num frames: 1297022976. Throughput: 0: 49047.5. Samples: 825838700. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 21:00:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:00:14,507][71000] Updated weights for policy 0, policy_version 79174 (0.0032) [2024-06-12 21:00:15,940][70768] Fps is (10 sec: 52428.0, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1297268736. Throughput: 0: 48943.8. Samples: 826122120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 21:00:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:00:17,635][71000] Updated weights for policy 0, policy_version 79184 (0.0027) [2024-06-12 21:00:20,940][70768] Fps is (10 sec: 44237.4, 60 sec: 48060.1, 300 sec: 48707.7). Total num frames: 1297465344. Throughput: 0: 48816.6. Samples: 826272280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 21:00:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:00:21,401][71000] Updated weights for policy 0, policy_version 79194 (0.0048) [2024-06-12 21:00:24,274][71000] Updated weights for policy 0, policy_version 79204 (0.0036) [2024-06-12 21:00:25,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1297727488. Throughput: 0: 48650.2. Samples: 826556940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-12 21:00:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:00:28,171][71000] Updated weights for policy 0, policy_version 79214 (0.0029) [2024-06-12 21:00:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1297973248. Throughput: 0: 48483.5. Samples: 826845040. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 21:00:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:00:31,321][71000] Updated weights for policy 0, policy_version 79224 (0.0031) [2024-06-12 21:00:34,775][71000] Updated weights for policy 0, policy_version 79234 (0.0037) [2024-06-12 21:00:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48878.9, 300 sec: 48818.8). 
Total num frames: 1298235392. Throughput: 0: 48683.1. Samples: 826997720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 21:00:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:00:38,039][71000] Updated weights for policy 0, policy_version 79244 (0.0026) [2024-06-12 21:00:40,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 1298448384. Throughput: 0: 48503.3. Samples: 827287020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 21:00:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:00:41,601][71000] Updated weights for policy 0, policy_version 79254 (0.0027) [2024-06-12 21:00:44,741][71000] Updated weights for policy 0, policy_version 79264 (0.0025) [2024-06-12 21:00:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 1298710528. Throughput: 0: 48481.2. Samples: 827579480. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 21:00:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:00:48,257][71000] Updated weights for policy 0, policy_version 79274 (0.0029) [2024-06-12 21:00:50,940][70768] Fps is (10 sec: 52429.4, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 1298972672. Throughput: 0: 48813.7. Samples: 827738400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 21:00:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 21:00:51,041][71000] Updated weights for policy 0, policy_version 79284 (0.0034) [2024-06-12 21:00:54,715][71000] Updated weights for policy 0, policy_version 79294 (0.0029) [2024-06-12 21:00:55,940][70768] Fps is (10 sec: 50791.3, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1299218432. Throughput: 0: 48996.2. Samples: 828043520. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 21:00:55,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:00:57,626][71000] Updated weights for policy 0, policy_version 79304 (0.0029) [2024-06-12 21:01:00,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1299447808. Throughput: 0: 49343.8. Samples: 828342580. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 21:01:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:01:01,191][71000] Updated weights for policy 0, policy_version 79314 (0.0028) [2024-06-12 21:01:02,185][70980] Signal inference workers to stop experience collection... (12200 times) [2024-06-12 21:01:02,186][70980] Signal inference workers to resume experience collection... (12200 times) [2024-06-12 21:01:02,204][71000] InferenceWorker_p0-w0: stopping experience collection (12200 times) [2024-06-12 21:01:02,204][71000] InferenceWorker_p0-w0: resuming experience collection (12200 times) [2024-06-12 21:01:04,370][71000] Updated weights for policy 0, policy_version 79324 (0.0024) [2024-06-12 21:01:05,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 1299709952. Throughput: 0: 49151.9. Samples: 828484120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 21:01:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:01:07,918][71000] Updated weights for policy 0, policy_version 79334 (0.0024) [2024-06-12 21:01:10,940][70768] Fps is (10 sec: 50789.3, 60 sec: 48878.9, 300 sec: 48818.7). Total num frames: 1299955712. Throughput: 0: 49215.6. Samples: 828771640. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 21:01:10,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:01:11,142][71000] Updated weights for policy 0, policy_version 79344 (0.0023) [2024-06-12 21:01:14,683][71000] Updated weights for policy 0, policy_version 79354 (0.0034) [2024-06-12 21:01:15,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 1300217856. Throughput: 0: 49440.9. Samples: 829069880. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-12 21:01:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:01:17,650][71000] Updated weights for policy 0, policy_version 79364 (0.0026) [2024-06-12 21:01:20,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 1300430848. Throughput: 0: 49211.2. Samples: 829212220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 21:01:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:01:21,405][71000] Updated weights for policy 0, policy_version 79374 (0.0032) [2024-06-12 21:01:24,487][71000] Updated weights for policy 0, policy_version 79384 (0.0033) [2024-06-12 21:01:25,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.2, 300 sec: 49040.9). Total num frames: 1300692992. Throughput: 0: 49380.7. Samples: 829509140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 21:01:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:01:28,006][71000] Updated weights for policy 0, policy_version 79394 (0.0037) [2024-06-12 21:01:30,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 1300938752. Throughput: 0: 49389.8. Samples: 829802020. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 21:01:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:01:31,006][71000] Updated weights for policy 0, policy_version 79404 (0.0022) [2024-06-12 21:01:34,943][71000] Updated weights for policy 0, policy_version 79414 (0.0035) [2024-06-12 21:01:35,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1301168128. Throughput: 0: 49001.8. Samples: 829943480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 21:01:35,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:01:38,061][71000] Updated weights for policy 0, policy_version 79424 (0.0035) [2024-06-12 21:01:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 1301413888. Throughput: 0: 48585.2. Samples: 830229860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 21:01:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:01:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000079432_1301413888.pth... [2024-06-12 21:01:41,007][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000078716_1289682944.pth [2024-06-12 21:01:41,678][71000] Updated weights for policy 0, policy_version 79434 (0.0029) [2024-06-12 21:01:44,529][71000] Updated weights for policy 0, policy_version 79444 (0.0028) [2024-06-12 21:01:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 1301659648. Throughput: 0: 48596.8. Samples: 830529440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 21:01:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:01:48,286][71000] Updated weights for policy 0, policy_version 79454 (0.0027) [2024-06-12 21:01:50,939][70768] Fps is (10 sec: 49152.8, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1301905408. Throughput: 0: 48886.8. Samples: 830684020. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 21:01:50,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:01:51,303][71000] Updated weights for policy 0, policy_version 79464 (0.0033) [2024-06-12 21:01:54,778][71000] Updated weights for policy 0, policy_version 79474 (0.0026) [2024-06-12 21:01:55,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1302151168. Throughput: 0: 48960.6. Samples: 830974860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 21:01:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:01:57,882][71000] Updated weights for policy 0, policy_version 79484 (0.0025) [2024-06-12 21:02:00,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48878.7, 300 sec: 48874.3). Total num frames: 1302380544. Throughput: 0: 48752.7. Samples: 831263760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 21:02:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:02:01,888][71000] Updated weights for policy 0, policy_version 79494 (0.0023) [2024-06-12 21:02:04,515][71000] Updated weights for policy 0, policy_version 79504 (0.0029) [2024-06-12 21:02:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 1302642688. Throughput: 0: 48768.0. Samples: 831406780. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-12 21:02:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:02:08,494][71000] Updated weights for policy 0, policy_version 79514 (0.0024) [2024-06-12 21:02:10,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 1302904832. Throughput: 0: 48792.4. Samples: 831704800. 
Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-12 21:02:10,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:02:11,149][71000] Updated weights for policy 0, policy_version 79524 (0.0028) [2024-06-12 21:02:15,375][71000] Updated weights for policy 0, policy_version 79534 (0.0034) [2024-06-12 21:02:15,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48332.7, 300 sec: 48874.3). Total num frames: 1303117824. Throughput: 0: 48792.9. Samples: 831997700. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-12 21:02:15,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:02:18,266][71000] Updated weights for policy 0, policy_version 79544 (0.0028) [2024-06-12 21:02:20,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 48874.9). Total num frames: 1303363584. Throughput: 0: 48787.6. Samples: 832138920. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-12 21:02:20,948][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:02:21,966][71000] Updated weights for policy 0, policy_version 79554 (0.0028) [2024-06-12 21:02:25,038][71000] Updated weights for policy 0, policy_version 79564 (0.0027) [2024-06-12 21:02:25,939][70768] Fps is (10 sec: 50791.2, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 1303625728. Throughput: 0: 48790.0. Samples: 832425400. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-12 21:02:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:02:28,633][71000] Updated weights for policy 0, policy_version 79574 (0.0035) [2024-06-12 21:02:30,095][70980] Signal inference workers to stop experience collection... (12250 times) [2024-06-12 21:02:30,095][70980] Signal inference workers to resume experience collection... 
(12250 times) [2024-06-12 21:02:30,131][71000] InferenceWorker_p0-w0: stopping experience collection (12250 times) [2024-06-12 21:02:30,131][71000] InferenceWorker_p0-w0: resuming experience collection (12250 times) [2024-06-12 21:02:30,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1303871488. Throughput: 0: 48694.6. Samples: 832720700. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-12 21:02:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:02:31,559][71000] Updated weights for policy 0, policy_version 79584 (0.0029) [2024-06-12 21:02:35,271][71000] Updated weights for policy 0, policy_version 79594 (0.0033) [2024-06-12 21:02:35,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 1304068096. Throughput: 0: 48616.8. Samples: 832871780. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-12 21:02:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:02:38,284][71000] Updated weights for policy 0, policy_version 79604 (0.0021) [2024-06-12 21:02:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 1304346624. Throughput: 0: 48783.3. Samples: 833170120. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-12 21:02:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:02:41,973][71000] Updated weights for policy 0, policy_version 79614 (0.0029) [2024-06-12 21:02:44,950][71000] Updated weights for policy 0, policy_version 79624 (0.0039) [2024-06-12 21:02:45,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1304592384. Throughput: 0: 48794.8. Samples: 833459520. 
Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-12 21:02:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:02:48,633][71000] Updated weights for policy 0, policy_version 79634 (0.0030) [2024-06-12 21:02:50,939][70768] Fps is (10 sec: 49153.4, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1304838144. Throughput: 0: 48857.4. Samples: 833605360. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-12 21:02:50,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:02:51,833][71000] Updated weights for policy 0, policy_version 79644 (0.0032) [2024-06-12 21:02:55,114][71000] Updated weights for policy 0, policy_version 79654 (0.0027) [2024-06-12 21:02:55,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 1305051136. Throughput: 0: 48713.3. Samples: 833896900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:02:55,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:02:58,302][71000] Updated weights for policy 0, policy_version 79664 (0.0018) [2024-06-12 21:03:00,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.2, 300 sec: 48874.3). Total num frames: 1305329664. Throughput: 0: 48705.1. Samples: 834189420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:03:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:03:02,077][71000] Updated weights for policy 0, policy_version 79674 (0.0023) [2024-06-12 21:03:05,031][71000] Updated weights for policy 0, policy_version 79684 (0.0027) [2024-06-12 21:03:05,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1305559040. Throughput: 0: 49058.2. Samples: 834346540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:03:05,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 21:03:08,510][71000] Updated weights for policy 0, policy_version 79694 (0.0029) [2024-06-12 21:03:10,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.9, 300 sec: 48874.3). 
Total num frames: 1305821184. Throughput: 0: 49221.7. Samples: 834640380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:03:10,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:03:11,625][71000] Updated weights for policy 0, policy_version 79704 (0.0040) [2024-06-12 21:03:15,142][71000] Updated weights for policy 0, policy_version 79714 (0.0039) [2024-06-12 21:03:15,940][70768] Fps is (10 sec: 47512.5, 60 sec: 48605.7, 300 sec: 48763.2). Total num frames: 1306034176. Throughput: 0: 48927.4. Samples: 834922440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:03:15,941][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:03:18,362][71000] Updated weights for policy 0, policy_version 79724 (0.0031) [2024-06-12 21:03:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1306296320. Throughput: 0: 48903.6. Samples: 835072440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:03:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:03:21,894][71000] Updated weights for policy 0, policy_version 79734 (0.0034) [2024-06-12 21:03:25,144][71000] Updated weights for policy 0, policy_version 79744 (0.0034) [2024-06-12 21:03:25,940][70768] Fps is (10 sec: 52430.1, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1306558464. Throughput: 0: 48920.2. Samples: 835371520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:03:25,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:03:28,505][71000] Updated weights for policy 0, policy_version 79754 (0.0029) [2024-06-12 21:03:30,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.1, 300 sec: 48929.9). Total num frames: 1306804224. Throughput: 0: 49228.5. Samples: 835674800. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:03:30,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 21:03:31,600][71000] Updated weights for policy 0, policy_version 79764 (0.0032) [2024-06-12 21:03:34,950][71000] Updated weights for policy 0, policy_version 79774 (0.0025) [2024-06-12 21:03:35,939][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.1, 300 sec: 48763.3). Total num frames: 1307017216. Throughput: 0: 49100.0. Samples: 835814860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:03:35,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:03:37,292][70980] Signal inference workers to stop experience collection... (12300 times) [2024-06-12 21:03:37,341][71000] InferenceWorker_p0-w0: stopping experience collection (12300 times) [2024-06-12 21:03:37,346][70980] Signal inference workers to resume experience collection... (12300 times) [2024-06-12 21:03:37,352][71000] InferenceWorker_p0-w0: resuming experience collection (12300 times) [2024-06-12 21:03:38,222][71000] Updated weights for policy 0, policy_version 79784 (0.0028) [2024-06-12 21:03:40,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.2, 300 sec: 48985.4). Total num frames: 1307312128. Throughput: 0: 49324.0. Samples: 836116480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:03:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:03:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000079792_1307312128.pth... [2024-06-12 21:03:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000079074_1295548416.pth [2024-06-12 21:03:41,961][71000] Updated weights for policy 0, policy_version 79794 (0.0028) [2024-06-12 21:03:44,783][71000] Updated weights for policy 0, policy_version 79804 (0.0031) [2024-06-12 21:03:45,940][70768] Fps is (10 sec: 54066.6, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 1307557888. Throughput: 0: 49319.0. Samples: 836408780. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 21:03:45,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:03:48,330][71000] Updated weights for policy 0, policy_version 79814 (0.0025) [2024-06-12 21:03:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 1307787264. Throughput: 0: 49116.9. Samples: 836556800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 21:03:50,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 21:03:51,489][71000] Updated weights for policy 0, policy_version 79824 (0.0029) [2024-06-12 21:03:54,778][71000] Updated weights for policy 0, policy_version 79834 (0.0024) [2024-06-12 21:03:55,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 1308016640. Throughput: 0: 49141.3. Samples: 836851740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 21:03:55,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:03:58,250][71000] Updated weights for policy 0, policy_version 79844 (0.0025) [2024-06-12 21:04:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 1308295168. Throughput: 0: 49501.6. Samples: 837150000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 21:04:00,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:04:01,207][71000] Updated weights for policy 0, policy_version 79854 (0.0026) [2024-06-12 21:04:05,056][71000] Updated weights for policy 0, policy_version 79864 (0.0033) [2024-06-12 21:04:05,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.2, 300 sec: 48985.4). Total num frames: 1308540928. Throughput: 0: 49710.2. Samples: 837309400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 21:04:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:04:08,346][71000] Updated weights for policy 0, policy_version 79874 (0.0030) [2024-06-12 21:04:10,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48878.9, 300 sec: 48874.3). 
Total num frames: 1308753920. Throughput: 0: 49411.1. Samples: 837595020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 21:04:10,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:04:11,560][71000] Updated weights for policy 0, policy_version 79884 (0.0032) [2024-06-12 21:04:14,929][71000] Updated weights for policy 0, policy_version 79894 (0.0030) [2024-06-12 21:04:15,940][70768] Fps is (10 sec: 44236.2, 60 sec: 49152.1, 300 sec: 48818.8). Total num frames: 1308983296. Throughput: 0: 49151.8. Samples: 837886640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 21:04:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:04:18,139][71000] Updated weights for policy 0, policy_version 79904 (0.0028) [2024-06-12 21:04:20,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49698.1, 300 sec: 49040.9). Total num frames: 1309278208. Throughput: 0: 49160.3. Samples: 838027080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 21:04:20,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 21:04:21,659][71000] Updated weights for policy 0, policy_version 79914 (0.0029) [2024-06-12 21:04:24,832][71000] Updated weights for policy 0, policy_version 79924 (0.0031) [2024-06-12 21:04:25,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 1309523968. Throughput: 0: 49324.8. Samples: 838336100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-12 21:04:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:04:28,514][71000] Updated weights for policy 0, policy_version 79934 (0.0028) [2024-06-12 21:04:30,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 1309753344. Throughput: 0: 49524.0. Samples: 838637360. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 21:04:30,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:04:31,545][71000] Updated weights for policy 0, policy_version 79944 (0.0031) [2024-06-12 21:04:34,921][71000] Updated weights for policy 0, policy_version 79954 (0.0034) [2024-06-12 21:04:35,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49425.0, 300 sec: 48929.9). Total num frames: 1309982720. Throughput: 0: 49282.7. Samples: 838774520. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 21:04:35,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:04:38,132][70980] Signal inference workers to stop experience collection... (12350 times) [2024-06-12 21:04:38,187][71000] InferenceWorker_p0-w0: stopping experience collection (12350 times) [2024-06-12 21:04:38,244][70980] Signal inference workers to resume experience collection... (12350 times) [2024-06-12 21:04:38,244][71000] InferenceWorker_p0-w0: resuming experience collection (12350 times) [2024-06-12 21:04:38,246][71000] Updated weights for policy 0, policy_version 79964 (0.0027) [2024-06-12 21:04:40,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 1310261248. Throughput: 0: 49308.3. Samples: 839070620. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 21:04:40,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:04:41,511][71000] Updated weights for policy 0, policy_version 79974 (0.0026) [2024-06-12 21:04:44,793][71000] Updated weights for policy 0, policy_version 79984 (0.0024) [2024-06-12 21:04:45,939][70768] Fps is (10 sec: 54067.6, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 1310523392. Throughput: 0: 49343.2. Samples: 839370440. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 21:04:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:04:48,351][71000] Updated weights for policy 0, policy_version 79994 (0.0037) [2024-06-12 21:04:50,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 1310736384. Throughput: 0: 49157.8. Samples: 839521500. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 21:04:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:04:51,739][71000] Updated weights for policy 0, policy_version 80004 (0.0033) [2024-06-12 21:04:54,820][71000] Updated weights for policy 0, policy_version 80014 (0.0028) [2024-06-12 21:04:55,940][70768] Fps is (10 sec: 44235.8, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 1310965760. Throughput: 0: 49165.1. Samples: 839807460. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 21:04:55,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:04:58,514][71000] Updated weights for policy 0, policy_version 80024 (0.0029) [2024-06-12 21:05:00,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 1311244288. Throughput: 0: 49193.5. Samples: 840100340. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 21:05:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:05:01,502][71000] Updated weights for policy 0, policy_version 80034 (0.0028) [2024-06-12 21:05:05,018][71000] Updated weights for policy 0, policy_version 80044 (0.0036) [2024-06-12 21:05:05,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1311490048. Throughput: 0: 49493.9. Samples: 840254300. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 21:05:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:05:08,519][71000] Updated weights for policy 0, policy_version 80054 (0.0027) [2024-06-12 21:05:10,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 48929.9). 
Total num frames: 1311703040. Throughput: 0: 49278.3. Samples: 840553620. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 21:05:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:05:11,485][71000] Updated weights for policy 0, policy_version 80064 (0.0031) [2024-06-12 21:05:14,829][71000] Updated weights for policy 0, policy_version 80074 (0.0027) [2024-06-12 21:05:15,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 1311948800. Throughput: 0: 49210.2. Samples: 840851820. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-12 21:05:15,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:05:18,163][71000] Updated weights for policy 0, policy_version 80084 (0.0033) [2024-06-12 21:05:20,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 1312227328. Throughput: 0: 49261.1. Samples: 840991280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-12 21:05:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:05:21,646][71000] Updated weights for policy 0, policy_version 80094 (0.0026) [2024-06-12 21:05:24,651][71000] Updated weights for policy 0, policy_version 80104 (0.0031) [2024-06-12 21:05:25,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 1312473088. Throughput: 0: 49401.5. Samples: 841293680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-12 21:05:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:05:28,056][71000] Updated weights for policy 0, policy_version 80114 (0.0025) [2024-06-12 21:05:30,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1312702464. Throughput: 0: 49554.6. Samples: 841600400. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-12 21:05:30,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:05:31,263][71000] Updated weights for policy 0, policy_version 80124 (0.0021) [2024-06-12 21:05:32,987][70980] Signal inference workers to stop experience collection... (12400 times) [2024-06-12 21:05:32,987][70980] Signal inference workers to resume experience collection... (12400 times) [2024-06-12 21:05:33,002][71000] InferenceWorker_p0-w0: stopping experience collection (12400 times) [2024-06-12 21:05:33,002][71000] InferenceWorker_p0-w0: resuming experience collection (12400 times) [2024-06-12 21:05:35,041][71000] Updated weights for policy 0, policy_version 80134 (0.0028) [2024-06-12 21:05:35,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 1312931840. Throughput: 0: 48984.4. Samples: 841725800. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-12 21:05:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:05:38,108][71000] Updated weights for policy 0, policy_version 80144 (0.0034) [2024-06-12 21:05:40,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 1313210368. Throughput: 0: 49139.7. Samples: 842018740. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-12 21:05:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:05:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000080152_1313210368.pth... [2024-06-12 21:05:41,000][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000079432_1301413888.pth [2024-06-12 21:05:41,824][71000] Updated weights for policy 0, policy_version 80154 (0.0028) [2024-06-12 21:05:45,142][71000] Updated weights for policy 0, policy_version 80164 (0.0032) [2024-06-12 21:05:45,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 1313439744. Throughput: 0: 48936.0. Samples: 842302460. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-12 21:05:45,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:05:48,450][71000] Updated weights for policy 0, policy_version 80174 (0.0025) [2024-06-12 21:05:50,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 1313669120. Throughput: 0: 48961.0. Samples: 842457540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-12 21:05:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:05:51,674][71000] Updated weights for policy 0, policy_version 80184 (0.0027) [2024-06-12 21:05:54,700][71000] Updated weights for policy 0, policy_version 80194 (0.0028) [2024-06-12 21:05:55,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 1313931264. Throughput: 0: 48959.6. Samples: 842756800. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-12 21:05:55,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 21:05:58,190][71000] Updated weights for policy 0, policy_version 80204 (0.0029) [2024-06-12 21:06:00,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 1314193408. Throughput: 0: 48871.5. Samples: 843051040. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-12 21:06:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:06:01,437][71000] Updated weights for policy 0, policy_version 80214 (0.0026) [2024-06-12 21:06:04,859][71000] Updated weights for policy 0, policy_version 80224 (0.0022) [2024-06-12 21:06:05,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 1314439168. Throughput: 0: 49251.0. Samples: 843207560. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-12 21:06:05,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:06:08,167][71000] Updated weights for policy 0, policy_version 80234 (0.0028) [2024-06-12 21:06:10,940][70768] Fps is (10 sec: 45874.4, 60 sec: 49151.8, 300 sec: 48929.8). 
Total num frames: 1314652160. Throughput: 0: 48831.0. Samples: 843491080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 21:06:10,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:06:11,820][71000] Updated weights for policy 0, policy_version 80244 (0.0026) [2024-06-12 21:06:15,054][71000] Updated weights for policy 0, policy_version 80254 (0.0023) [2024-06-12 21:06:15,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 1314914304. Throughput: 0: 48549.8. Samples: 843785140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 21:06:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:06:18,769][71000] Updated weights for policy 0, policy_version 80264 (0.0025) [2024-06-12 21:06:20,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.1, 300 sec: 49096.4). Total num frames: 1315176448. Throughput: 0: 49132.8. Samples: 843936780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 21:06:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:06:21,718][71000] Updated weights for policy 0, policy_version 80274 (0.0026) [2024-06-12 21:06:25,549][71000] Updated weights for policy 0, policy_version 80284 (0.0035) [2024-06-12 21:06:25,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 1315389440. Throughput: 0: 48920.5. Samples: 844220160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 21:06:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:06:28,447][71000] Updated weights for policy 0, policy_version 80294 (0.0038) [2024-06-12 21:06:30,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 1315635200. Throughput: 0: 49122.6. Samples: 844512980. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 21:06:30,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:06:32,286][71000] Updated weights for policy 0, policy_version 80304 (0.0031) [2024-06-12 21:06:34,560][70980] Signal inference workers to stop experience collection... (12450 times) [2024-06-12 21:06:34,610][71000] InferenceWorker_p0-w0: stopping experience collection (12450 times) [2024-06-12 21:06:34,616][70980] Signal inference workers to resume experience collection... (12450 times) [2024-06-12 21:06:34,626][71000] InferenceWorker_p0-w0: resuming experience collection (12450 times) [2024-06-12 21:06:35,482][71000] Updated weights for policy 0, policy_version 80314 (0.0031) [2024-06-12 21:06:35,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 1315880960. Throughput: 0: 48817.3. Samples: 844654320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 21:06:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:06:38,929][71000] Updated weights for policy 0, policy_version 80324 (0.0027) [2024-06-12 21:06:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 1316126720. Throughput: 0: 48799.1. Samples: 844952760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 21:06:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:06:41,839][71000] Updated weights for policy 0, policy_version 80334 (0.0029) [2024-06-12 21:06:45,337][71000] Updated weights for policy 0, policy_version 80344 (0.0026) [2024-06-12 21:06:45,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 1316388864. Throughput: 0: 48819.5. Samples: 845247920. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 21:06:45,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:06:48,318][71000] Updated weights for policy 0, policy_version 80354 (0.0025) [2024-06-12 21:06:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1316618240. Throughput: 0: 48520.8. Samples: 845391000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 21:06:50,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:06:52,209][71000] Updated weights for policy 0, policy_version 80364 (0.0031) [2024-06-12 21:06:55,073][71000] Updated weights for policy 0, policy_version 80374 (0.0033) [2024-06-12 21:06:55,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 1316880384. Throughput: 0: 48886.0. Samples: 845690940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 21:06:55,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 21:06:58,551][71000] Updated weights for policy 0, policy_version 80384 (0.0023) [2024-06-12 21:07:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.8, 300 sec: 48985.4). Total num frames: 1317093376. Throughput: 0: 49141.3. Samples: 845996500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 21:07:00,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:07:01,646][71000] Updated weights for policy 0, policy_version 80394 (0.0019) [2024-06-12 21:07:05,149][71000] Updated weights for policy 0, policy_version 80404 (0.0030) [2024-06-12 21:07:05,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 1317355520. Throughput: 0: 48998.4. Samples: 846141700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 21:07:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:07:08,214][71000] Updated weights for policy 0, policy_version 80414 (0.0032) [2024-06-12 21:07:10,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.2, 300 sec: 49096.5). 
Total num frames: 1317601280. Throughput: 0: 49076.9. Samples: 846428620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 21:07:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:07:11,819][71000] Updated weights for policy 0, policy_version 80424 (0.0031) [2024-06-12 21:07:15,246][71000] Updated weights for policy 0, policy_version 80434 (0.0025) [2024-06-12 21:07:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 1317847040. Throughput: 0: 49093.8. Samples: 846722200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 21:07:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:07:18,646][71000] Updated weights for policy 0, policy_version 80444 (0.0030) [2024-06-12 21:07:20,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48332.8, 300 sec: 48985.4). Total num frames: 1318076416. Throughput: 0: 49348.3. Samples: 846875000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 21:07:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:07:21,785][71000] Updated weights for policy 0, policy_version 80454 (0.0027) [2024-06-12 21:07:25,061][71000] Updated weights for policy 0, policy_version 80464 (0.0025) [2024-06-12 21:07:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 1318338560. Throughput: 0: 49266.6. Samples: 847169760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 21:07:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:07:28,501][71000] Updated weights for policy 0, policy_version 80474 (0.0038) [2024-06-12 21:07:30,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 1318600704. Throughput: 0: 49068.1. Samples: 847455980. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 21:07:30,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:07:31,908][71000] Updated weights for policy 0, policy_version 80484 (0.0024) [2024-06-12 21:07:35,442][71000] Updated weights for policy 0, policy_version 80494 (0.0027) [2024-06-12 21:07:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 1318830080. Throughput: 0: 49309.7. Samples: 847609940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 21:07:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:07:38,706][71000] Updated weights for policy 0, policy_version 80504 (0.0034) [2024-06-12 21:07:40,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 1319059456. Throughput: 0: 48923.1. Samples: 847892480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-12 21:07:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:07:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000080509_1319059456.pth... [2024-06-12 21:07:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000079792_1307312128.pth [2024-06-12 21:07:42,128][71000] Updated weights for policy 0, policy_version 80514 (0.0029) [2024-06-12 21:07:45,633][71000] Updated weights for policy 0, policy_version 80524 (0.0037) [2024-06-12 21:07:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 49096.4). Total num frames: 1319321600. Throughput: 0: 48645.3. Samples: 848185540. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-12 21:07:45,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:07:48,052][70980] Signal inference workers to stop experience collection... (12500 times) [2024-06-12 21:07:48,052][70980] Signal inference workers to resume experience collection... 
(12500 times) [2024-06-12 21:07:48,068][71000] InferenceWorker_p0-w0: stopping experience collection (12500 times) [2024-06-12 21:07:48,068][71000] InferenceWorker_p0-w0: resuming experience collection (12500 times) [2024-06-12 21:07:48,721][71000] Updated weights for policy 0, policy_version 80534 (0.0025) [2024-06-12 21:07:50,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 1319567360. Throughput: 0: 48876.7. Samples: 848341160. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-12 21:07:50,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 21:07:51,874][71000] Updated weights for policy 0, policy_version 80544 (0.0021) [2024-06-12 21:07:55,595][71000] Updated weights for policy 0, policy_version 80554 (0.0027) [2024-06-12 21:07:55,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 1319813120. Throughput: 0: 48949.8. Samples: 848631360. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-12 21:07:55,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 21:07:58,917][71000] Updated weights for policy 0, policy_version 80564 (0.0031) [2024-06-12 21:08:00,940][70768] Fps is (10 sec: 45876.1, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 1320026112. Throughput: 0: 48776.5. Samples: 848917140. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-12 21:08:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:08:02,596][71000] Updated weights for policy 0, policy_version 80574 (0.0025) [2024-06-12 21:08:05,796][71000] Updated weights for policy 0, policy_version 80584 (0.0026) [2024-06-12 21:08:05,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 1320288256. Throughput: 0: 48473.1. Samples: 849056280. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-12 21:08:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:08:09,392][71000] Updated weights for policy 0, policy_version 80594 (0.0035) [2024-06-12 21:08:10,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 1320550400. Throughput: 0: 48505.9. Samples: 849352520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-12 21:08:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:08:12,624][71000] Updated weights for policy 0, policy_version 80604 (0.0021) [2024-06-12 21:08:15,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48332.9, 300 sec: 48985.4). Total num frames: 1320747008. Throughput: 0: 48643.6. Samples: 849644940. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-12 21:08:15,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:08:16,092][71000] Updated weights for policy 0, policy_version 80614 (0.0033) [2024-06-12 21:08:19,453][71000] Updated weights for policy 0, policy_version 80624 (0.0030) [2024-06-12 21:08:20,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 1321009152. Throughput: 0: 48356.9. Samples: 849786000. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-12 21:08:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:08:22,827][71000] Updated weights for policy 0, policy_version 80634 (0.0046) [2024-06-12 21:08:25,831][71000] Updated weights for policy 0, policy_version 80644 (0.0030) [2024-06-12 21:08:25,940][70768] Fps is (10 sec: 52428.2, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 1321271296. Throughput: 0: 48626.2. Samples: 850080660. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-12 21:08:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:08:29,329][71000] Updated weights for policy 0, policy_version 80654 (0.0030) [2024-06-12 21:08:30,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48878.9, 300 sec: 49207.5). 
Total num frames: 1321533440. Throughput: 0: 48806.2. Samples: 850381820. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-12 21:08:30,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:08:32,299][71000] Updated weights for policy 0, policy_version 80664 (0.0027) [2024-06-12 21:08:35,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 1321746432. Throughput: 0: 48839.3. Samples: 850538920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 21:08:35,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:08:35,987][71000] Updated weights for policy 0, policy_version 80674 (0.0028) [2024-06-12 21:08:39,202][71000] Updated weights for policy 0, policy_version 80684 (0.0026) [2024-06-12 21:08:40,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 1322008576. Throughput: 0: 49000.9. Samples: 850836400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 21:08:40,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:08:42,630][71000] Updated weights for policy 0, policy_version 80694 (0.0033) [2024-06-12 21:08:45,790][71000] Updated weights for policy 0, policy_version 80704 (0.0029) [2024-06-12 21:08:45,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 1322254336. Throughput: 0: 49024.5. Samples: 851123240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 21:08:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:08:49,606][71000] Updated weights for policy 0, policy_version 80714 (0.0031) [2024-06-12 21:08:49,776][70980] Signal inference workers to stop experience collection... (12550 times) [2024-06-12 21:08:49,831][71000] InferenceWorker_p0-w0: stopping experience collection (12550 times) [2024-06-12 21:08:49,885][70980] Signal inference workers to resume experience collection... 
(12550 times) [2024-06-12 21:08:49,885][71000] InferenceWorker_p0-w0: resuming experience collection (12550 times) [2024-06-12 21:08:50,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 1322516480. Throughput: 0: 49243.4. Samples: 851272240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 21:08:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:08:52,416][71000] Updated weights for policy 0, policy_version 80724 (0.0032) [2024-06-12 21:08:55,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 1322729472. Throughput: 0: 49125.3. Samples: 851563160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 21:08:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 21:08:56,067][71000] Updated weights for policy 0, policy_version 80734 (0.0032) [2024-06-12 21:08:59,541][71000] Updated weights for policy 0, policy_version 80744 (0.0026) [2024-06-12 21:09:00,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 1322975232. Throughput: 0: 49010.6. Samples: 851850420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 21:09:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:09:03,172][71000] Updated weights for policy 0, policy_version 80754 (0.0025) [2024-06-12 21:09:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 1323220992. Throughput: 0: 49100.4. Samples: 851995520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 21:09:05,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 21:09:06,276][71000] Updated weights for policy 0, policy_version 80764 (0.0021) [2024-06-12 21:09:09,567][71000] Updated weights for policy 0, policy_version 80774 (0.0026) [2024-06-12 21:09:10,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 1323483136. Throughput: 0: 49254.6. Samples: 852297120. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 21:09:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:09:13,030][71000] Updated weights for policy 0, policy_version 80784 (0.0030) [2024-06-12 21:09:15,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1323679744. Throughput: 0: 48834.7. Samples: 852579380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 21:09:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:09:16,485][71000] Updated weights for policy 0, policy_version 80794 (0.0029) [2024-06-12 21:09:19,753][71000] Updated weights for policy 0, policy_version 80804 (0.0031) [2024-06-12 21:09:20,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 1323958272. Throughput: 0: 48418.9. Samples: 852717780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 21:09:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:09:23,145][71000] Updated weights for policy 0, policy_version 80814 (0.0031) [2024-06-12 21:09:25,939][70768] Fps is (10 sec: 50791.0, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 1324187648. Throughput: 0: 48503.1. Samples: 853019040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-12 21:09:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:09:26,339][71000] Updated weights for policy 0, policy_version 80824 (0.0038) [2024-06-12 21:09:29,630][71000] Updated weights for policy 0, policy_version 80834 (0.0019) [2024-06-12 21:09:30,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 49096.4). Total num frames: 1324466176. Throughput: 0: 48874.9. Samples: 853322620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-12 21:09:30,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:09:33,085][71000] Updated weights for policy 0, policy_version 80844 (0.0029) [2024-06-12 21:09:35,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48878.8, 300 sec: 48874.3). 
Total num frames: 1324679168. Throughput: 0: 48799.1. Samples: 853468200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-12 21:09:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:09:36,331][71000] Updated weights for policy 0, policy_version 80854 (0.0024) [2024-06-12 21:09:39,489][71000] Updated weights for policy 0, policy_version 80864 (0.0031) [2024-06-12 21:09:40,939][70768] Fps is (10 sec: 45876.2, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1324924928. Throughput: 0: 48833.1. Samples: 853760640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-12 21:09:40,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:09:41,034][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000080868_1324941312.pth... [2024-06-12 21:09:41,068][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000080152_1313210368.pth [2024-06-12 21:09:43,096][71000] Updated weights for policy 0, policy_version 80874 (0.0036) [2024-06-12 21:09:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.7, 300 sec: 48929.8). Total num frames: 1325170688. Throughput: 0: 48765.2. Samples: 854044860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-12 21:09:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:09:46,339][71000] Updated weights for policy 0, policy_version 80884 (0.0031) [2024-06-12 21:09:49,722][71000] Updated weights for policy 0, policy_version 80894 (0.0027) [2024-06-12 21:09:50,939][70768] Fps is (10 sec: 50790.3, 60 sec: 48606.0, 300 sec: 49041.0). Total num frames: 1325432832. Throughput: 0: 48963.8. Samples: 854198880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-12 21:09:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:09:52,875][71000] Updated weights for policy 0, policy_version 80904 (0.0021) [2024-06-12 21:09:55,939][70768] Fps is (10 sec: 49153.2, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 1325662208. 
Throughput: 0: 48929.6. Samples: 854498940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-12 21:09:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:09:56,331][71000] Updated weights for policy 0, policy_version 80914 (0.0026) [2024-06-12 21:09:59,529][71000] Updated weights for policy 0, policy_version 80924 (0.0029) [2024-06-12 21:10:00,940][70768] Fps is (10 sec: 47512.2, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 1325907968. Throughput: 0: 49060.7. Samples: 854787120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-12 21:10:00,941][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:10:03,191][71000] Updated weights for policy 0, policy_version 80934 (0.0034) [2024-06-12 21:10:05,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 1326153728. Throughput: 0: 49093.6. Samples: 854926980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-12 21:10:05,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:10:06,259][71000] Updated weights for policy 0, policy_version 80944 (0.0033) [2024-06-12 21:10:09,878][71000] Updated weights for policy 0, policy_version 80954 (0.0021) [2024-06-12 21:10:10,939][70768] Fps is (10 sec: 49153.7, 60 sec: 48606.1, 300 sec: 48985.4). Total num frames: 1326399488. Throughput: 0: 48905.4. Samples: 855219780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-12 21:10:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:10:11,244][70980] Signal inference workers to stop experience collection... (12600 times) [2024-06-12 21:10:11,245][70980] Signal inference workers to resume experience collection... 
(12600 times) [2024-06-12 21:10:11,262][71000] InferenceWorker_p0-w0: stopping experience collection (12600 times) [2024-06-12 21:10:11,262][71000] InferenceWorker_p0-w0: resuming experience collection (12600 times) [2024-06-12 21:10:12,889][71000] Updated weights for policy 0, policy_version 80964 (0.0024) [2024-06-12 21:10:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.2, 300 sec: 48874.3). Total num frames: 1326645248. Throughput: 0: 48626.4. Samples: 855510800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 21:10:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:10:16,430][71000] Updated weights for policy 0, policy_version 80974 (0.0029) [2024-06-12 21:10:19,317][71000] Updated weights for policy 0, policy_version 80984 (0.0028) [2024-06-12 21:10:20,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 1326874624. Throughput: 0: 48735.2. Samples: 855661280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 21:10:20,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:10:22,977][71000] Updated weights for policy 0, policy_version 80994 (0.0025) [2024-06-12 21:10:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 1327136768. Throughput: 0: 48864.8. Samples: 855959560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 21:10:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:10:26,383][71000] Updated weights for policy 0, policy_version 81004 (0.0035) [2024-06-12 21:10:29,755][71000] Updated weights for policy 0, policy_version 81014 (0.0025) [2024-06-12 21:10:30,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48333.0, 300 sec: 48929.9). Total num frames: 1327366144. Throughput: 0: 48811.8. Samples: 856241380. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 21:10:30,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:10:33,087][71000] Updated weights for policy 0, policy_version 81024 (0.0028) [2024-06-12 21:10:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1327628288. Throughput: 0: 48687.1. Samples: 856389800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 21:10:35,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:10:36,750][71000] Updated weights for policy 0, policy_version 81034 (0.0036) [2024-06-12 21:10:40,112][71000] Updated weights for policy 0, policy_version 81044 (0.0034) [2024-06-12 21:10:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 1327841280. Throughput: 0: 48471.5. Samples: 856680160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 21:10:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:10:43,433][71000] Updated weights for policy 0, policy_version 81054 (0.0035) [2024-06-12 21:10:45,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1328087040. Throughput: 0: 48451.2. Samples: 856967420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 21:10:45,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 21:10:47,085][71000] Updated weights for policy 0, policy_version 81064 (0.0037) [2024-06-12 21:10:50,212][71000] Updated weights for policy 0, policy_version 81074 (0.0029) [2024-06-12 21:10:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1328349184. Throughput: 0: 48601.7. Samples: 857114060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 21:10:50,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:10:53,655][71000] Updated weights for policy 0, policy_version 81084 (0.0037) [2024-06-12 21:10:55,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48605.8, 300 sec: 48763.2). 
Total num frames: 1328578560. Throughput: 0: 48601.2. Samples: 857406840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 21:10:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:10:56,868][71000] Updated weights for policy 0, policy_version 81094 (0.0033) [2024-06-12 21:11:00,545][71000] Updated weights for policy 0, policy_version 81104 (0.0028) [2024-06-12 21:11:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48606.1, 300 sec: 48763.2). Total num frames: 1328824320. Throughput: 0: 48811.5. Samples: 857707320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-12 21:11:00,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:11:03,344][71000] Updated weights for policy 0, policy_version 81114 (0.0024) [2024-06-12 21:11:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.8, 300 sec: 48929.9). Total num frames: 1329086464. Throughput: 0: 48573.3. Samples: 857847080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 21:11:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:11:07,059][71000] Updated weights for policy 0, policy_version 81124 (0.0034) [2024-06-12 21:11:10,070][71000] Updated weights for policy 0, policy_version 81134 (0.0027) [2024-06-12 21:11:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 1329332224. Throughput: 0: 48614.2. Samples: 858147200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 21:11:10,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:11:13,761][71000] Updated weights for policy 0, policy_version 81144 (0.0030) [2024-06-12 21:11:15,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48332.8, 300 sec: 48707.7). Total num frames: 1329545216. Throughput: 0: 48978.6. Samples: 858445420. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 21:11:15,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:11:16,594][71000] Updated weights for policy 0, policy_version 81154 (0.0025) [2024-06-12 21:11:20,613][71000] Updated weights for policy 0, policy_version 81164 (0.0024) [2024-06-12 21:11:20,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1329807360. Throughput: 0: 49005.0. Samples: 858595020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 21:11:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:11:22,773][70980] Signal inference workers to stop experience collection... (12650 times) [2024-06-12 21:11:22,800][71000] InferenceWorker_p0-w0: stopping experience collection (12650 times) [2024-06-12 21:11:22,826][70980] Signal inference workers to resume experience collection... (12650 times) [2024-06-12 21:11:22,827][71000] InferenceWorker_p0-w0: resuming experience collection (12650 times) [2024-06-12 21:11:23,281][71000] Updated weights for policy 0, policy_version 81174 (0.0025) [2024-06-12 21:11:25,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 1330069504. Throughput: 0: 48866.7. Samples: 858879160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 21:11:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:11:27,054][71000] Updated weights for policy 0, policy_version 81184 (0.0029) [2024-06-12 21:11:30,090][71000] Updated weights for policy 0, policy_version 81194 (0.0027) [2024-06-12 21:11:30,939][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 1330331648. Throughput: 0: 48911.8. Samples: 859168440. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 21:11:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:11:33,863][71000] Updated weights for policy 0, policy_version 81204 (0.0029) [2024-06-12 21:11:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 1330561024. Throughput: 0: 49180.4. Samples: 859327180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 21:11:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:11:36,639][71000] Updated weights for policy 0, policy_version 81214 (0.0025) [2024-06-12 21:11:40,380][71000] Updated weights for policy 0, policy_version 81224 (0.0034) [2024-06-12 21:11:40,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 1330790400. Throughput: 0: 49190.2. Samples: 859620400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 21:11:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:11:40,978][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000081226_1330806784.pth... [2024-06-12 21:11:41,024][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000080509_1319059456.pth [2024-06-12 21:11:43,539][71000] Updated weights for policy 0, policy_version 81234 (0.0030) [2024-06-12 21:11:45,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1331019776. Throughput: 0: 48998.7. Samples: 859912260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 21:11:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:11:47,156][71000] Updated weights for policy 0, policy_version 81244 (0.0036) [2024-06-12 21:11:50,090][71000] Updated weights for policy 0, policy_version 81254 (0.0025) [2024-06-12 21:11:50,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 1331281920. Throughput: 0: 49040.6. Samples: 860053900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 21:11:50,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:11:54,017][71000] Updated weights for policy 0, policy_version 81264 (0.0032) [2024-06-12 21:11:55,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 1331527680. Throughput: 0: 48821.3. Samples: 860344160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:11:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:11:56,875][71000] Updated weights for policy 0, policy_version 81274 (0.0042) [2024-06-12 21:12:00,783][71000] Updated weights for policy 0, policy_version 81284 (0.0030) [2024-06-12 21:12:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1331773440. Throughput: 0: 48732.9. Samples: 860638400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:12:00,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:12:03,574][71000] Updated weights for policy 0, policy_version 81294 (0.0036) [2024-06-12 21:12:05,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 1332002816. Throughput: 0: 48574.1. Samples: 860780860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:12:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:12:07,155][71000] Updated weights for policy 0, policy_version 81304 (0.0027) [2024-06-12 21:12:10,340][71000] Updated weights for policy 0, policy_version 81314 (0.0030) [2024-06-12 21:12:10,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 1332281344. Throughput: 0: 48918.0. Samples: 861080480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:12:10,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:12:14,057][71000] Updated weights for policy 0, policy_version 81324 (0.0030) [2024-06-12 21:12:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 48929.8). 
Total num frames: 1332510720. Throughput: 0: 49082.0. Samples: 861377140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:12:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:12:16,873][71000] Updated weights for policy 0, policy_version 81334 (0.0023) [2024-06-12 21:12:20,737][71000] Updated weights for policy 0, policy_version 81344 (0.0031) [2024-06-12 21:12:20,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48878.8, 300 sec: 48818.8). Total num frames: 1332740096. Throughput: 0: 48725.6. Samples: 861519840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:12:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:12:23,538][71000] Updated weights for policy 0, policy_version 81354 (0.0030) [2024-06-12 21:12:25,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1332985856. Throughput: 0: 48777.8. Samples: 861815400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:12:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:12:27,449][71000] Updated weights for policy 0, policy_version 81364 (0.0033) [2024-06-12 21:12:30,202][71000] Updated weights for policy 0, policy_version 81374 (0.0033) [2024-06-12 21:12:30,939][70768] Fps is (10 sec: 49152.9, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 1333231616. Throughput: 0: 48662.3. Samples: 862102060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:12:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:12:34,244][71000] Updated weights for policy 0, policy_version 81384 (0.0047) [2024-06-12 21:12:35,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 1333493760. Throughput: 0: 49040.0. Samples: 862260700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:12:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:12:36,981][71000] Updated weights for policy 0, policy_version 81394 (0.0028) [2024-06-12 21:12:40,806][71000] Updated weights for policy 0, policy_version 81404 (0.0034) [2024-06-12 21:12:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 1333723136. Throughput: 0: 48966.3. Samples: 862547640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:12:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:12:41,941][70980] Signal inference workers to stop experience collection... (12700 times) [2024-06-12 21:12:41,964][71000] InferenceWorker_p0-w0: stopping experience collection (12700 times) [2024-06-12 21:12:41,995][70980] Signal inference workers to resume experience collection... (12700 times) [2024-06-12 21:12:41,995][71000] InferenceWorker_p0-w0: resuming experience collection (12700 times) [2024-06-12 21:12:44,158][71000] Updated weights for policy 0, policy_version 81414 (0.0031) [2024-06-12 21:12:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 1333968896. Throughput: 0: 48882.7. Samples: 862838120. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:12:45,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 21:12:47,817][71000] Updated weights for policy 0, policy_version 81424 (0.0028) [2024-06-12 21:12:50,693][71000] Updated weights for policy 0, policy_version 81434 (0.0027) [2024-06-12 21:12:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1334231040. Throughput: 0: 49215.2. Samples: 862995540. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:12:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:12:54,508][71000] Updated weights for policy 0, policy_version 81444 (0.0032) [2024-06-12 21:12:55,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 1334476800. Throughput: 0: 49130.3. Samples: 863291340. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:12:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:12:57,185][71000] Updated weights for policy 0, policy_version 81454 (0.0030) [2024-06-12 21:13:00,940][70768] Fps is (10 sec: 44236.4, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 1334673408. Throughput: 0: 48856.9. Samples: 863575700. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:13:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:13:01,359][71000] Updated weights for policy 0, policy_version 81464 (0.0032) [2024-06-12 21:13:04,251][71000] Updated weights for policy 0, policy_version 81474 (0.0038) [2024-06-12 21:13:05,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49151.9, 300 sec: 48818.8). Total num frames: 1334951936. Throughput: 0: 48792.5. Samples: 863715500. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:13:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:13:07,998][71000] Updated weights for policy 0, policy_version 81484 (0.0023) [2024-06-12 21:13:10,709][71000] Updated weights for policy 0, policy_version 81494 (0.0029) [2024-06-12 21:13:10,940][70768] Fps is (10 sec: 54067.5, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 1335214080. Throughput: 0: 48772.0. Samples: 864010140. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:13:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:13:14,374][71000] Updated weights for policy 0, policy_version 81504 (0.0026) [2024-06-12 21:13:15,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 48985.4). 
Total num frames: 1335459840. Throughput: 0: 49057.6. Samples: 864309660. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:13:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:13:17,502][71000] Updated weights for policy 0, policy_version 81514 (0.0021) [2024-06-12 21:13:20,940][70768] Fps is (10 sec: 44236.4, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1335656448. Throughput: 0: 48938.1. Samples: 864462920. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:13:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:13:21,269][71000] Updated weights for policy 0, policy_version 81524 (0.0029) [2024-06-12 21:13:23,955][71000] Updated weights for policy 0, policy_version 81534 (0.0033) [2024-06-12 21:13:25,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 48818.8). Total num frames: 1335934976. Throughput: 0: 48893.2. Samples: 864747840. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:13:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:13:27,911][71000] Updated weights for policy 0, policy_version 81544 (0.0033) [2024-06-12 21:13:30,939][70768] Fps is (10 sec: 50791.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1336164352. Throughput: 0: 48908.5. Samples: 865039000. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:13:30,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:13:30,993][71000] Updated weights for policy 0, policy_version 81554 (0.0031) [2024-06-12 21:13:34,644][71000] Updated weights for policy 0, policy_version 81564 (0.0032) [2024-06-12 21:13:35,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1336426496. Throughput: 0: 48833.3. Samples: 865193040. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:13:35,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 21:13:37,708][71000] Updated weights for policy 0, policy_version 81574 (0.0029) [2024-06-12 21:13:40,939][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 1336639488. Throughput: 0: 48819.7. Samples: 865488220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:13:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:13:41,143][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000081584_1336672256.pth... [2024-06-12 21:13:41,150][71000] Updated weights for policy 0, policy_version 81584 (0.0039) [2024-06-12 21:13:41,196][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000080868_1324941312.pth [2024-06-12 21:13:44,190][71000] Updated weights for policy 0, policy_version 81594 (0.0026) [2024-06-12 21:13:45,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 1336901632. Throughput: 0: 49042.7. Samples: 865782620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:13:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:13:47,667][70980] Signal inference workers to stop experience collection... (12750 times) [2024-06-12 21:13:47,668][70980] Signal inference workers to resume experience collection... (12750 times) [2024-06-12 21:13:47,708][71000] InferenceWorker_p0-w0: stopping experience collection (12750 times) [2024-06-12 21:13:47,708][71000] InferenceWorker_p0-w0: resuming experience collection (12750 times) [2024-06-12 21:13:47,820][71000] Updated weights for policy 0, policy_version 81604 (0.0045) [2024-06-12 21:13:50,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1337147392. Throughput: 0: 49140.9. Samples: 865926840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:13:50,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:13:51,041][71000] Updated weights for policy 0, policy_version 81614 (0.0039) [2024-06-12 21:13:54,434][71000] Updated weights for policy 0, policy_version 81624 (0.0024) [2024-06-12 21:13:55,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 1337393152. Throughput: 0: 49160.0. Samples: 866222340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:13:55,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:13:57,532][71000] Updated weights for policy 0, policy_version 81634 (0.0032) [2024-06-12 21:14:00,921][71000] Updated weights for policy 0, policy_version 81644 (0.0022) [2024-06-12 21:14:00,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.2, 300 sec: 48929.9). Total num frames: 1337655296. Throughput: 0: 49265.0. Samples: 866526580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:14:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:14:04,016][71000] Updated weights for policy 0, policy_version 81654 (0.0024) [2024-06-12 21:14:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 1337901056. Throughput: 0: 49054.3. Samples: 866670360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:14:05,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:14:07,717][71000] Updated weights for policy 0, policy_version 81664 (0.0032) [2024-06-12 21:14:10,740][71000] Updated weights for policy 0, policy_version 81674 (0.0029) [2024-06-12 21:14:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 1338146816. Throughput: 0: 49109.9. Samples: 866957780. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:14:10,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:14:14,538][71000] Updated weights for policy 0, policy_version 81684 (0.0036) [2024-06-12 21:14:15,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1338376192. Throughput: 0: 49050.6. Samples: 867246280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:14:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:14:17,674][71000] Updated weights for policy 0, policy_version 81694 (0.0026) [2024-06-12 21:14:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 1338621952. Throughput: 0: 48843.9. Samples: 867391020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:14:20,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:14:21,266][71000] Updated weights for policy 0, policy_version 81704 (0.0035) [2024-06-12 21:14:24,683][71000] Updated weights for policy 0, policy_version 81714 (0.0029) [2024-06-12 21:14:25,939][70768] Fps is (10 sec: 49152.6, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 1338867712. Throughput: 0: 48894.7. Samples: 867688480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:14:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:14:27,823][71000] Updated weights for policy 0, policy_version 81724 (0.0035) [2024-06-12 21:14:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49151.8, 300 sec: 48929.8). Total num frames: 1339113472. Throughput: 0: 48949.7. Samples: 867985360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:14:30,947][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:14:31,250][71000] Updated weights for policy 0, policy_version 81734 (0.0024) [2024-06-12 21:14:34,508][71000] Updated weights for policy 0, policy_version 81744 (0.0026) [2024-06-12 21:14:35,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48878.9, 300 sec: 48929.8). 
Total num frames: 1339359232. Throughput: 0: 49009.8. Samples: 868132280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:14:35,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:14:38,102][71000] Updated weights for policy 0, policy_version 81754 (0.0028) [2024-06-12 21:14:40,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.7, 300 sec: 48818.8). Total num frames: 1339572224. Throughput: 0: 48879.8. Samples: 868421940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:14:40,949][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:14:41,362][71000] Updated weights for policy 0, policy_version 81764 (0.0029) [2024-06-12 21:14:44,782][71000] Updated weights for policy 0, policy_version 81774 (0.0030) [2024-06-12 21:14:45,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48818.7). Total num frames: 1339834368. Throughput: 0: 48580.8. Samples: 868712720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:14:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:14:48,152][71000] Updated weights for policy 0, policy_version 81784 (0.0028) [2024-06-12 21:14:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 1340080128. Throughput: 0: 48577.6. Samples: 868856360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:14:50,941][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:14:51,284][71000] Updated weights for policy 0, policy_version 81794 (0.0027) [2024-06-12 21:14:54,670][71000] Updated weights for policy 0, policy_version 81804 (0.0032) [2024-06-12 21:14:55,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 1340358656. Throughput: 0: 48883.6. Samples: 869157540. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:14:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:14:58,321][71000] Updated weights for policy 0, policy_version 81814 (0.0031) [2024-06-12 21:15:00,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48332.7, 300 sec: 48818.7). Total num frames: 1340555264. Throughput: 0: 48827.4. Samples: 869443520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:15:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:15:01,481][71000] Updated weights for policy 0, policy_version 81824 (0.0043) [2024-06-12 21:15:03,609][70980] Signal inference workers to stop experience collection... (12800 times) [2024-06-12 21:15:03,610][70980] Signal inference workers to resume experience collection... (12800 times) [2024-06-12 21:15:03,652][71000] InferenceWorker_p0-w0: stopping experience collection (12800 times) [2024-06-12 21:15:03,652][71000] InferenceWorker_p0-w0: resuming experience collection (12800 times) [2024-06-12 21:15:05,248][71000] Updated weights for policy 0, policy_version 81834 (0.0033) [2024-06-12 21:15:05,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1340817408. Throughput: 0: 48786.7. Samples: 869586420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:15:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:15:08,071][71000] Updated weights for policy 0, policy_version 81844 (0.0027) [2024-06-12 21:15:10,939][70768] Fps is (10 sec: 49152.9, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 1341046784. Throughput: 0: 48619.1. Samples: 869876340. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:15:10,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 21:15:11,695][71000] Updated weights for policy 0, policy_version 81854 (0.0026) [2024-06-12 21:15:14,719][71000] Updated weights for policy 0, policy_version 81864 (0.0030) [2024-06-12 21:15:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 1341308928. Throughput: 0: 48595.6. Samples: 870172160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 21:15:15,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 21:15:18,347][71000] Updated weights for policy 0, policy_version 81874 (0.0027) [2024-06-12 21:15:20,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1341554688. Throughput: 0: 48864.9. Samples: 870331200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 21:15:20,940][70768] Avg episode reward: [(0, '0.240')] [2024-06-12 21:15:21,343][71000] Updated weights for policy 0, policy_version 81884 (0.0024) [2024-06-12 21:15:25,279][71000] Updated weights for policy 0, policy_version 81894 (0.0034) [2024-06-12 21:15:25,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48332.7, 300 sec: 48818.8). Total num frames: 1341767680. Throughput: 0: 48773.6. Samples: 870616740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 21:15:25,940][70768] Avg episode reward: [(0, '0.252')] [2024-06-12 21:15:28,223][71000] Updated weights for policy 0, policy_version 81904 (0.0033) [2024-06-12 21:15:30,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 1342046208. Throughput: 0: 48832.6. Samples: 870910180. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 21:15:30,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 21:15:31,727][71000] Updated weights for policy 0, policy_version 81914 (0.0034) [2024-06-12 21:15:34,572][71000] Updated weights for policy 0, policy_version 81924 (0.0033) [2024-06-12 21:15:35,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 1342291968. Throughput: 0: 49036.7. Samples: 871063000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 21:15:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:15:38,759][71000] Updated weights for policy 0, policy_version 81934 (0.0032) [2024-06-12 21:15:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.2, 300 sec: 48985.4). Total num frames: 1342537728. Throughput: 0: 48835.6. Samples: 871355140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 21:15:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:15:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000081943_1342554112.pth... [2024-06-12 21:15:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000081226_1330806784.pth [2024-06-12 21:15:41,189][71000] Updated weights for policy 0, policy_version 81944 (0.0024) [2024-06-12 21:15:45,024][71000] Updated weights for policy 0, policy_version 81954 (0.0038) [2024-06-12 21:15:45,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 1342750720. Throughput: 0: 49107.3. Samples: 871653340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 21:15:45,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 21:15:47,850][71000] Updated weights for policy 0, policy_version 81964 (0.0032) [2024-06-12 21:15:50,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 1343012864. Throughput: 0: 49001.2. Samples: 871791480. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 21:15:50,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 21:15:51,623][71000] Updated weights for policy 0, policy_version 81974 (0.0038) [2024-06-12 21:15:54,583][71000] Updated weights for policy 0, policy_version 81984 (0.0039) [2024-06-12 21:15:55,941][70768] Fps is (10 sec: 52422.9, 60 sec: 48605.0, 300 sec: 48985.2). Total num frames: 1343275008. Throughput: 0: 49230.7. Samples: 872091780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-12 21:15:55,941][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:15:58,132][71000] Updated weights for policy 0, policy_version 81994 (0.0027) [2024-06-12 21:16:00,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49698.2, 300 sec: 48985.4). Total num frames: 1343537152. Throughput: 0: 49245.0. Samples: 872388180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 21:16:00,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:16:01,163][71000] Updated weights for policy 0, policy_version 82004 (0.0030) [2024-06-12 21:16:05,500][71000] Updated weights for policy 0, policy_version 82014 (0.0031) [2024-06-12 21:16:05,940][70768] Fps is (10 sec: 47518.9, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 1343750144. Throughput: 0: 48882.7. Samples: 872530920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 21:16:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:16:07,793][71000] Updated weights for policy 0, policy_version 82024 (0.0027) [2024-06-12 21:16:09,237][70980] Signal inference workers to stop experience collection... (12850 times) [2024-06-12 21:16:09,263][71000] InferenceWorker_p0-w0: stopping experience collection (12850 times) [2024-06-12 21:16:09,295][70980] Signal inference workers to resume experience collection... 
(12850 times) [2024-06-12 21:16:09,296][71000] InferenceWorker_p0-w0: resuming experience collection (12850 times) [2024-06-12 21:16:10,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 1343995904. Throughput: 0: 49086.2. Samples: 872825620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 21:16:10,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:16:11,988][71000] Updated weights for policy 0, policy_version 82034 (0.0019) [2024-06-12 21:16:14,748][71000] Updated weights for policy 0, policy_version 82044 (0.0029) [2024-06-12 21:16:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 1344258048. Throughput: 0: 48981.3. Samples: 873114340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 21:16:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:16:18,345][71000] Updated weights for policy 0, policy_version 82054 (0.0022) [2024-06-12 21:16:20,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 1344503808. Throughput: 0: 49107.0. Samples: 873272820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 21:16:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:16:21,117][71000] Updated weights for policy 0, policy_version 82064 (0.0027) [2024-06-12 21:16:25,277][71000] Updated weights for policy 0, policy_version 82074 (0.0039) [2024-06-12 21:16:25,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 1344716800. Throughput: 0: 49042.6. Samples: 873562060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 21:16:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:16:27,813][71000] Updated weights for policy 0, policy_version 82084 (0.0029) [2024-06-12 21:16:30,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 1344978944. Throughput: 0: 48837.1. Samples: 873851020. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 21:16:30,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:16:32,037][71000] Updated weights for policy 0, policy_version 82094 (0.0035) [2024-06-12 21:16:35,033][71000] Updated weights for policy 0, policy_version 82104 (0.0034) [2024-06-12 21:16:35,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 1345241088. Throughput: 0: 49101.9. Samples: 874001060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 21:16:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:16:38,808][71000] Updated weights for policy 0, policy_version 82114 (0.0031) [2024-06-12 21:16:40,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 1345486848. Throughput: 0: 49050.3. Samples: 874299000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 21:16:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:16:41,525][71000] Updated weights for policy 0, policy_version 82124 (0.0034) [2024-06-12 21:16:45,383][71000] Updated weights for policy 0, policy_version 82134 (0.0031) [2024-06-12 21:16:45,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 1345699840. Throughput: 0: 49006.7. Samples: 874593480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 21:16:45,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-12 21:16:48,096][71000] Updated weights for policy 0, policy_version 82144 (0.0032) [2024-06-12 21:16:50,940][70768] Fps is (10 sec: 47514.6, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 1345961984. Throughput: 0: 48821.8. Samples: 874727900. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-12 21:16:50,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:16:52,179][71000] Updated weights for policy 0, policy_version 82154 (0.0031) [2024-06-12 21:16:55,099][71000] Updated weights for policy 0, policy_version 82164 (0.0024) [2024-06-12 21:16:55,939][70768] Fps is (10 sec: 52429.3, 60 sec: 49153.0, 300 sec: 48985.4). Total num frames: 1346224128. Throughput: 0: 48917.0. Samples: 875026880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 21:16:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:16:58,931][71000] Updated weights for policy 0, policy_version 82174 (0.0033) [2024-06-12 21:17:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 1346453504. Throughput: 0: 49031.0. Samples: 875320740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 21:17:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:17:01,794][71000] Updated weights for policy 0, policy_version 82184 (0.0031) [2024-06-12 21:17:05,766][71000] Updated weights for policy 0, policy_version 82194 (0.0035) [2024-06-12 21:17:05,939][70768] Fps is (10 sec: 44236.8, 60 sec: 48605.9, 300 sec: 48763.3). Total num frames: 1346666496. Throughput: 0: 48760.7. Samples: 875467040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 21:17:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:17:08,459][71000] Updated weights for policy 0, policy_version 82204 (0.0032) [2024-06-12 21:17:10,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 1346945024. Throughput: 0: 48735.1. Samples: 875755140. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 21:17:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:17:12,342][71000] Updated weights for policy 0, policy_version 82214 (0.0038) [2024-06-12 21:17:15,222][71000] Updated weights for policy 0, policy_version 82224 (0.0026) [2024-06-12 21:17:15,939][70768] Fps is (10 sec: 50790.2, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 1347174400. Throughput: 0: 48856.2. Samples: 876049540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 21:17:15,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 21:17:19,055][71000] Updated weights for policy 0, policy_version 82234 (0.0031) [2024-06-12 21:17:20,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48606.0, 300 sec: 48929.8). Total num frames: 1347420160. Throughput: 0: 48972.0. Samples: 876204800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 21:17:20,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:17:21,848][71000] Updated weights for policy 0, policy_version 82244 (0.0028) [2024-06-12 21:17:25,459][71000] Updated weights for policy 0, policy_version 82254 (0.0033) [2024-06-12 21:17:25,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1347649536. Throughput: 0: 48768.5. Samples: 876493580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 21:17:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:17:26,716][70980] Signal inference workers to stop experience collection... (12900 times) [2024-06-12 21:17:26,718][70980] Signal inference workers to resume experience collection... 
(12900 times) [2024-06-12 21:17:26,727][71000] InferenceWorker_p0-w0: stopping experience collection (12900 times) [2024-06-12 21:17:26,757][71000] InferenceWorker_p0-w0: resuming experience collection (12900 times) [2024-06-12 21:17:28,571][71000] Updated weights for policy 0, policy_version 82264 (0.0028) [2024-06-12 21:17:30,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 1347944448. Throughput: 0: 48735.9. Samples: 876786600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 21:17:30,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 21:17:32,278][71000] Updated weights for policy 0, policy_version 82274 (0.0035) [2024-06-12 21:17:35,124][71000] Updated weights for policy 0, policy_version 82284 (0.0039) [2024-06-12 21:17:35,939][70768] Fps is (10 sec: 50791.3, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 1348157440. Throughput: 0: 49185.8. Samples: 876941260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 21:17:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:17:38,975][71000] Updated weights for policy 0, policy_version 82294 (0.0035) [2024-06-12 21:17:40,939][70768] Fps is (10 sec: 45876.0, 60 sec: 48606.1, 300 sec: 48929.8). Total num frames: 1348403200. Throughput: 0: 49165.7. Samples: 877239340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 21:17:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:17:41,095][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000082301_1348419584.pth... [2024-06-12 21:17:41,160][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000081584_1336672256.pth [2024-06-12 21:17:41,755][71000] Updated weights for policy 0, policy_version 82304 (0.0028) [2024-06-12 21:17:45,382][71000] Updated weights for policy 0, policy_version 82314 (0.0034) [2024-06-12 21:17:45,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 48818.8). 
Total num frames: 1348632576. Throughput: 0: 49015.2. Samples: 877526420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 21:17:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:17:48,564][71000] Updated weights for policy 0, policy_version 82324 (0.0024) [2024-06-12 21:17:50,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1348894720. Throughput: 0: 48966.9. Samples: 877670560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 21:17:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:17:52,034][71000] Updated weights for policy 0, policy_version 82334 (0.0027) [2024-06-12 21:17:55,262][71000] Updated weights for policy 0, policy_version 82344 (0.0027) [2024-06-12 21:17:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 1349140480. Throughput: 0: 49048.5. Samples: 877962320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 21:17:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:17:59,095][71000] Updated weights for policy 0, policy_version 82354 (0.0036) [2024-06-12 21:18:00,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48879.1, 300 sec: 48929.9). Total num frames: 1349386240. Throughput: 0: 48955.1. Samples: 878252520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 21:18:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:18:02,061][71000] Updated weights for policy 0, policy_version 82364 (0.0032) [2024-06-12 21:18:05,458][71000] Updated weights for policy 0, policy_version 82374 (0.0037) [2024-06-12 21:18:05,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 1349615616. Throughput: 0: 48744.5. Samples: 878398300. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 21:18:05,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:18:08,669][71000] Updated weights for policy 0, policy_version 82384 (0.0023) [2024-06-12 21:18:10,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1349877760. Throughput: 0: 48966.2. Samples: 878697060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 21:18:10,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:18:12,030][71000] Updated weights for policy 0, policy_version 82394 (0.0031) [2024-06-12 21:18:15,404][71000] Updated weights for policy 0, policy_version 82404 (0.0028) [2024-06-12 21:18:15,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 1350123520. Throughput: 0: 48979.1. Samples: 878990660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 21:18:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:18:18,759][71000] Updated weights for policy 0, policy_version 82414 (0.0032) [2024-06-12 21:18:20,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1350352896. Throughput: 0: 48821.7. Samples: 879138240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 21:18:20,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:18:21,886][71000] Updated weights for policy 0, policy_version 82424 (0.0027) [2024-06-12 21:18:25,334][71000] Updated weights for policy 0, policy_version 82434 (0.0027) [2024-06-12 21:18:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 1350615040. Throughput: 0: 48827.8. Samples: 879436600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 21:18:25,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:18:28,719][71000] Updated weights for policy 0, policy_version 82444 (0.0032) [2024-06-12 21:18:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48059.8, 300 sec: 48818.8). 
Total num frames: 1350828032. Throughput: 0: 48872.1. Samples: 879725660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-12 21:18:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:18:31,981][71000] Updated weights for policy 0, policy_version 82454 (0.0033) [2024-06-12 21:18:35,346][71000] Updated weights for policy 0, policy_version 82464 (0.0033) [2024-06-12 21:18:35,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 1351106560. Throughput: 0: 48924.9. Samples: 879872180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 21:18:35,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:18:38,759][71000] Updated weights for policy 0, policy_version 82474 (0.0036) [2024-06-12 21:18:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 1351319552. Throughput: 0: 48782.2. Samples: 880157520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 21:18:40,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 21:18:41,351][70980] Signal inference workers to stop experience collection... (12950 times) [2024-06-12 21:18:41,370][71000] InferenceWorker_p0-w0: stopping experience collection (12950 times) [2024-06-12 21:18:41,409][70980] Signal inference workers to resume experience collection... (12950 times) [2024-06-12 21:18:41,409][71000] InferenceWorker_p0-w0: resuming experience collection (12950 times) [2024-06-12 21:18:42,119][71000] Updated weights for policy 0, policy_version 82484 (0.0034) [2024-06-12 21:18:45,794][71000] Updated weights for policy 0, policy_version 82494 (0.0021) [2024-06-12 21:18:45,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 1351581696. Throughput: 0: 48954.7. Samples: 880455480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 21:18:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:18:48,803][71000] Updated weights for policy 0, policy_version 82504 (0.0026) [2024-06-12 21:18:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 1351811072. Throughput: 0: 48967.1. Samples: 880601820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 21:18:50,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:18:52,068][71000] Updated weights for policy 0, policy_version 82514 (0.0027) [2024-06-12 21:18:55,207][71000] Updated weights for policy 0, policy_version 82524 (0.0028) [2024-06-12 21:18:55,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 1352089600. Throughput: 0: 49052.0. Samples: 880904400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 21:18:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:18:59,027][71000] Updated weights for policy 0, policy_version 82534 (0.0035) [2024-06-12 21:19:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 1352318976. Throughput: 0: 49067.6. Samples: 881198700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 21:19:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:19:02,051][71000] Updated weights for policy 0, policy_version 82544 (0.0033) [2024-06-12 21:19:05,499][71000] Updated weights for policy 0, policy_version 82554 (0.0030) [2024-06-12 21:19:05,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.0, 300 sec: 48929.9). Total num frames: 1352581120. Throughput: 0: 49121.8. Samples: 881348720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 21:19:05,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:19:08,516][71000] Updated weights for policy 0, policy_version 82564 (0.0027) [2024-06-12 21:19:10,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.1, 300 sec: 48929.9). 
Total num frames: 1352810496. Throughput: 0: 48960.6. Samples: 881639820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 21:19:10,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 21:19:12,174][71000] Updated weights for policy 0, policy_version 82574 (0.0031) [2024-06-12 21:19:15,517][71000] Updated weights for policy 0, policy_version 82584 (0.0028) [2024-06-12 21:19:15,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 1353072640. Throughput: 0: 49006.5. Samples: 881930960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 21:19:15,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:19:18,709][71000] Updated weights for policy 0, policy_version 82594 (0.0026) [2024-06-12 21:19:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 1353302016. Throughput: 0: 49092.0. Samples: 882081320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-12 21:19:20,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 21:19:21,861][71000] Updated weights for policy 0, policy_version 82604 (0.0031) [2024-06-12 21:19:25,400][71000] Updated weights for policy 0, policy_version 82614 (0.0030) [2024-06-12 21:19:25,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 1353564160. Throughput: 0: 49443.1. Samples: 882382460. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-12 21:19:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:19:28,663][71000] Updated weights for policy 0, policy_version 82624 (0.0029) [2024-06-12 21:19:30,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 48929.9). Total num frames: 1353793536. Throughput: 0: 49375.9. Samples: 882677400. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-12 21:19:30,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:19:32,250][71000] Updated weights for policy 0, policy_version 82634 (0.0026) [2024-06-12 21:19:35,261][71000] Updated weights for policy 0, policy_version 82644 (0.0035) [2024-06-12 21:19:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 1354055680. Throughput: 0: 49293.4. Samples: 882820020. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-12 21:19:35,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 21:19:39,032][71000] Updated weights for policy 0, policy_version 82654 (0.0040) [2024-06-12 21:19:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 1354285056. Throughput: 0: 49032.1. Samples: 883110840. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-12 21:19:40,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 21:19:41,051][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000082660_1354301440.pth... [2024-06-12 21:19:41,104][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000081943_1342554112.pth [2024-06-12 21:19:42,051][71000] Updated weights for policy 0, policy_version 82664 (0.0031) [2024-06-12 21:19:45,491][71000] Updated weights for policy 0, policy_version 82674 (0.0027) [2024-06-12 21:19:45,940][70768] Fps is (10 sec: 50789.0, 60 sec: 49697.9, 300 sec: 49096.5). Total num frames: 1354563584. Throughput: 0: 49236.6. Samples: 883414360. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-12 21:19:45,941][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:19:48,905][71000] Updated weights for policy 0, policy_version 82684 (0.0036) [2024-06-12 21:19:50,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49151.9, 300 sec: 48818.7). Total num frames: 1354760192. Throughput: 0: 49017.2. Samples: 883554500. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-12 21:19:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:19:52,264][71000] Updated weights for policy 0, policy_version 82694 (0.0023) [2024-06-12 21:19:55,484][71000] Updated weights for policy 0, policy_version 82704 (0.0032) [2024-06-12 21:19:55,939][70768] Fps is (10 sec: 47515.0, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 1355038720. Throughput: 0: 49051.1. Samples: 883847120. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-12 21:19:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:19:59,131][71000] Updated weights for policy 0, policy_version 82714 (0.0034) [2024-06-12 21:20:00,301][70980] Signal inference workers to stop experience collection... (13000 times) [2024-06-12 21:20:00,346][71000] InferenceWorker_p0-w0: stopping experience collection (13000 times) [2024-06-12 21:20:00,407][70980] Signal inference workers to resume experience collection... (13000 times) [2024-06-12 21:20:00,407][71000] InferenceWorker_p0-w0: resuming experience collection (13000 times) [2024-06-12 21:20:00,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 1355284480. Throughput: 0: 49240.0. Samples: 884146760. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-12 21:20:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:20:02,184][71000] Updated weights for policy 0, policy_version 82724 (0.0026) [2024-06-12 21:20:05,605][71000] Updated weights for policy 0, policy_version 82734 (0.0027) [2024-06-12 21:20:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 1355513856. Throughput: 0: 49166.7. Samples: 884293820. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-12 21:20:05,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:20:08,761][71000] Updated weights for policy 0, policy_version 82744 (0.0022) [2024-06-12 21:20:10,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 1355743232. Throughput: 0: 49099.8. Samples: 884591960. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-12 21:20:10,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:20:12,045][71000] Updated weights for policy 0, policy_version 82754 (0.0033) [2024-06-12 21:20:15,652][71000] Updated weights for policy 0, policy_version 82764 (0.0033) [2024-06-12 21:20:15,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1356021760. Throughput: 0: 48859.5. Samples: 884876080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 21:20:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:20:18,999][71000] Updated weights for policy 0, policy_version 82774 (0.0033) [2024-06-12 21:20:20,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 1356267520. Throughput: 0: 49057.3. Samples: 885027600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 21:20:20,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:20:22,357][71000] Updated weights for policy 0, policy_version 82784 (0.0030) [2024-06-12 21:20:25,636][71000] Updated weights for policy 0, policy_version 82794 (0.0031) [2024-06-12 21:20:25,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 1356496896. Throughput: 0: 49112.0. Samples: 885320880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 21:20:25,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 21:20:28,863][71000] Updated weights for policy 0, policy_version 82804 (0.0024) [2024-06-12 21:20:30,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49151.9, 300 sec: 48985.4). 
Total num frames: 1356742656. Throughput: 0: 48996.6. Samples: 885619200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 21:20:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:20:32,177][71000] Updated weights for policy 0, policy_version 82814 (0.0037) [2024-06-12 21:20:35,592][71000] Updated weights for policy 0, policy_version 82824 (0.0026) [2024-06-12 21:20:35,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.8, 300 sec: 49040.9). Total num frames: 1357004800. Throughput: 0: 49214.6. Samples: 885769160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 21:20:35,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:20:38,994][71000] Updated weights for policy 0, policy_version 82834 (0.0035) [2024-06-12 21:20:40,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 1357234176. Throughput: 0: 49164.4. Samples: 886059520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 21:20:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:20:42,172][71000] Updated weights for policy 0, policy_version 82844 (0.0024) [2024-06-12 21:20:45,775][71000] Updated weights for policy 0, policy_version 82854 (0.0020) [2024-06-12 21:20:45,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48606.0, 300 sec: 49040.9). Total num frames: 1357479936. Throughput: 0: 49069.3. Samples: 886354880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 21:20:45,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:20:49,095][71000] Updated weights for policy 0, policy_version 82864 (0.0039) [2024-06-12 21:20:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.1, 300 sec: 48985.5). Total num frames: 1357725696. Throughput: 0: 49012.3. Samples: 886499380. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 21:20:50,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:20:52,354][71000] Updated weights for policy 0, policy_version 82874 (0.0023) [2024-06-12 21:20:55,382][71000] Updated weights for policy 0, policy_version 82884 (0.0024) [2024-06-12 21:20:55,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 1357987840. Throughput: 0: 49149.9. Samples: 886803700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 21:20:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:20:59,066][71000] Updated weights for policy 0, policy_version 82894 (0.0029) [2024-06-12 21:21:00,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48605.8, 300 sec: 48985.3). Total num frames: 1358200832. Throughput: 0: 49407.4. Samples: 887099420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-12 21:21:00,941][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:21:01,867][71000] Updated weights for policy 0, policy_version 82904 (0.0031) [2024-06-12 21:21:05,652][71000] Updated weights for policy 0, policy_version 82914 (0.0030) [2024-06-12 21:21:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1358462976. Throughput: 0: 49141.8. Samples: 887238980. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 21:21:05,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:21:08,867][71000] Updated weights for policy 0, policy_version 82924 (0.0027) [2024-06-12 21:21:10,940][70768] Fps is (10 sec: 50791.4, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 1358708736. Throughput: 0: 49081.3. Samples: 887529540. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 21:21:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:21:12,399][71000] Updated weights for policy 0, policy_version 82934 (0.0027) [2024-06-12 21:21:15,106][70980] Signal inference workers to stop experience collection... 
(13050 times) [2024-06-12 21:21:15,150][71000] InferenceWorker_p0-w0: stopping experience collection (13050 times) [2024-06-12 21:21:15,216][70980] Signal inference workers to resume experience collection... (13050 times) [2024-06-12 21:21:15,217][71000] InferenceWorker_p0-w0: resuming experience collection (13050 times) [2024-06-12 21:21:15,598][71000] Updated weights for policy 0, policy_version 82944 (0.0023) [2024-06-12 21:21:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1358970880. Throughput: 0: 49094.8. Samples: 887828460. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 21:21:15,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:21:18,881][71000] Updated weights for policy 0, policy_version 82954 (0.0028) [2024-06-12 21:21:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 1359200256. Throughput: 0: 49070.8. Samples: 887977340. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 21:21:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:21:22,277][71000] Updated weights for policy 0, policy_version 82964 (0.0032) [2024-06-12 21:21:25,566][71000] Updated weights for policy 0, policy_version 82974 (0.0037) [2024-06-12 21:21:25,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 1359446016. Throughput: 0: 49225.7. Samples: 888274680. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 21:21:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:21:28,827][71000] Updated weights for policy 0, policy_version 82984 (0.0030) [2024-06-12 21:21:30,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 1359691776. Throughput: 0: 48956.1. Samples: 888557900. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 21:21:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:21:32,299][71000] Updated weights for policy 0, policy_version 82994 (0.0032) [2024-06-12 21:21:35,632][71000] Updated weights for policy 0, policy_version 83004 (0.0032) [2024-06-12 21:21:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.1, 300 sec: 49041.0). Total num frames: 1359953920. Throughput: 0: 49119.2. Samples: 888709740. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 21:21:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:21:39,048][71000] Updated weights for policy 0, policy_version 83014 (0.0032) [2024-06-12 21:21:40,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 1360166912. Throughput: 0: 48896.7. Samples: 889004060. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 21:21:40,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 21:21:40,987][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000083019_1360183296.pth... [2024-06-12 21:21:41,027][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000082301_1348419584.pth [2024-06-12 21:21:42,427][71000] Updated weights for policy 0, policy_version 83024 (0.0029) [2024-06-12 21:21:45,477][71000] Updated weights for policy 0, policy_version 83034 (0.0023) [2024-06-12 21:21:45,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 1360429056. Throughput: 0: 49170.8. Samples: 889312100. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 21:21:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:21:48,948][71000] Updated weights for policy 0, policy_version 83044 (0.0024) [2024-06-12 21:21:50,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 1360674816. Throughput: 0: 49295.1. Samples: 889457260. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-12 21:21:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:21:52,315][71000] Updated weights for policy 0, policy_version 83054 (0.0031) [2024-06-12 21:21:55,438][71000] Updated weights for policy 0, policy_version 83064 (0.0027) [2024-06-12 21:21:55,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 1360936960. Throughput: 0: 49504.5. Samples: 889757240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 21:21:55,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:21:58,554][71000] Updated weights for policy 0, policy_version 83074 (0.0033) [2024-06-12 21:22:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.3, 300 sec: 49152.0). Total num frames: 1361166336. Throughput: 0: 49369.4. Samples: 890050080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 21:22:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:22:02,405][71000] Updated weights for policy 0, policy_version 83084 (0.0023) [2024-06-12 21:22:05,378][71000] Updated weights for policy 0, policy_version 83094 (0.0029) [2024-06-12 21:22:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 1361428480. Throughput: 0: 49213.9. Samples: 890191960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 21:22:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:22:09,082][71000] Updated weights for policy 0, policy_version 83104 (0.0027) [2024-06-12 21:22:10,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 1361657856. Throughput: 0: 49174.8. Samples: 890487540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 21:22:10,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:22:12,427][71000] Updated weights for policy 0, policy_version 83114 (0.0032) [2024-06-12 21:22:15,939][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.9, 300 sec: 49040.9). 
Total num frames: 1361887232. Throughput: 0: 49379.2. Samples: 890779960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 21:22:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:22:15,972][71000] Updated weights for policy 0, policy_version 83124 (0.0027) [2024-06-12 21:22:19,336][71000] Updated weights for policy 0, policy_version 83134 (0.0025) [2024-06-12 21:22:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 1362132992. Throughput: 0: 48962.7. Samples: 890913060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 21:22:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:22:22,803][71000] Updated weights for policy 0, policy_version 83144 (0.0032) [2024-06-12 21:22:25,748][71000] Updated weights for policy 0, policy_version 83154 (0.0028) [2024-06-12 21:22:25,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 1362395136. Throughput: 0: 48903.1. Samples: 891204700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 21:22:25,942][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:22:29,451][71000] Updated weights for policy 0, policy_version 83164 (0.0021) [2024-06-12 21:22:30,101][70980] Signal inference workers to stop experience collection... (13100 times) [2024-06-12 21:22:30,103][70980] Signal inference workers to resume experience collection... (13100 times) [2024-06-12 21:22:30,142][71000] InferenceWorker_p0-w0: stopping experience collection (13100 times) [2024-06-12 21:22:30,142][71000] InferenceWorker_p0-w0: resuming experience collection (13100 times) [2024-06-12 21:22:30,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 1362640896. Throughput: 0: 48722.4. Samples: 891504600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 21:22:30,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:22:32,091][71000] Updated weights for policy 0, policy_version 83174 (0.0027) [2024-06-12 21:22:35,810][71000] Updated weights for policy 0, policy_version 83184 (0.0029) [2024-06-12 21:22:35,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 1362886656. Throughput: 0: 48813.2. Samples: 891653860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 21:22:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:22:38,904][71000] Updated weights for policy 0, policy_version 83194 (0.0031) [2024-06-12 21:22:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 1363132416. Throughput: 0: 48922.2. Samples: 891958740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-12 21:22:40,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:22:42,562][71000] Updated weights for policy 0, policy_version 83204 (0.0023) [2024-06-12 21:22:45,540][71000] Updated weights for policy 0, policy_version 83214 (0.0027) [2024-06-12 21:22:45,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 1363394560. Throughput: 0: 48875.5. Samples: 892249480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 21:22:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:22:49,389][71000] Updated weights for policy 0, policy_version 83224 (0.0027) [2024-06-12 21:22:50,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 1363656704. Throughput: 0: 49155.0. Samples: 892403940. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 21:22:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:22:51,839][71000] Updated weights for policy 0, policy_version 83234 (0.0019) [2024-06-12 21:22:55,728][71000] Updated weights for policy 0, policy_version 83244 (0.0030) [2024-06-12 21:22:55,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 1363869696. Throughput: 0: 49280.2. Samples: 892705160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 21:22:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:22:58,570][71000] Updated weights for policy 0, policy_version 83254 (0.0031) [2024-06-12 21:23:00,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 1364115456. Throughput: 0: 49362.9. Samples: 893001300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 21:23:00,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:23:02,194][71000] Updated weights for policy 0, policy_version 83264 (0.0036) [2024-06-12 21:23:05,194][71000] Updated weights for policy 0, policy_version 83274 (0.0026) [2024-06-12 21:23:05,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 1364393984. Throughput: 0: 49828.6. Samples: 893155360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 21:23:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:23:09,379][71000] Updated weights for policy 0, policy_version 83284 (0.0026) [2024-06-12 21:23:10,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49698.1, 300 sec: 49207.6). Total num frames: 1364639744. Throughput: 0: 50001.0. Samples: 893454740. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 21:23:10,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:23:11,675][71000] Updated weights for policy 0, policy_version 83294 (0.0027) [2024-06-12 21:23:15,841][71000] Updated weights for policy 0, policy_version 83304 (0.0031) [2024-06-12 21:23:15,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 1364852736. Throughput: 0: 49815.0. Samples: 893746280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 21:23:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:23:18,604][71000] Updated weights for policy 0, policy_version 83314 (0.0024) [2024-06-12 21:23:20,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 1365114880. Throughput: 0: 49498.0. Samples: 893881260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 21:23:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:23:22,440][71000] Updated weights for policy 0, policy_version 83324 (0.0029) [2024-06-12 21:23:25,309][71000] Updated weights for policy 0, policy_version 83334 (0.0023) [2024-06-12 21:23:25,730][70980] Signal inference workers to stop experience collection... (13150 times) [2024-06-12 21:23:25,760][71000] InferenceWorker_p0-w0: stopping experience collection (13150 times) [2024-06-12 21:23:25,787][70980] Signal inference workers to resume experience collection... (13150 times) [2024-06-12 21:23:25,787][71000] InferenceWorker_p0-w0: resuming experience collection (13150 times) [2024-06-12 21:23:25,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49971.3, 300 sec: 49374.1). Total num frames: 1365393408. Throughput: 0: 49400.8. Samples: 894181780. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 21:23:25,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 21:23:29,423][71000] Updated weights for policy 0, policy_version 83344 (0.0033) [2024-06-12 21:23:30,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49698.0, 300 sec: 49207.5). Total num frames: 1365622784. Throughput: 0: 49477.7. Samples: 894475980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-12 21:23:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:23:31,798][71000] Updated weights for policy 0, policy_version 83354 (0.0026) [2024-06-12 21:23:35,942][70768] Fps is (10 sec: 40949.1, 60 sec: 48603.8, 300 sec: 49096.0). Total num frames: 1365803008. Throughput: 0: 49281.1. Samples: 894621720. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-12 21:23:35,943][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:23:36,227][71000] Updated weights for policy 0, policy_version 83364 (0.0027) [2024-06-12 21:23:38,594][71000] Updated weights for policy 0, policy_version 83374 (0.0026) [2024-06-12 21:23:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 1366097920. Throughput: 0: 49149.3. Samples: 894916880. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-12 21:23:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:23:40,956][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000083380_1366097920.pth... [2024-06-12 21:23:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000082660_1354301440.pth [2024-06-12 21:23:42,647][71000] Updated weights for policy 0, policy_version 83384 (0.0031) [2024-06-12 21:23:45,279][71000] Updated weights for policy 0, policy_version 83394 (0.0030) [2024-06-12 21:23:45,940][70768] Fps is (10 sec: 55719.8, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1366360064. Throughput: 0: 49183.6. Samples: 895214560. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-12 21:23:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:23:49,115][71000] Updated weights for policy 0, policy_version 83404 (0.0028) [2024-06-12 21:23:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 1366589440. Throughput: 0: 49133.8. Samples: 895366380. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-12 21:23:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:23:51,981][71000] Updated weights for policy 0, policy_version 83414 (0.0021) [2024-06-12 21:23:55,880][71000] Updated weights for policy 0, policy_version 83424 (0.0030) [2024-06-12 21:23:55,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 1366818816. Throughput: 0: 48939.4. Samples: 895657020. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-12 21:23:55,944][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:23:58,417][71000] Updated weights for policy 0, policy_version 83434 (0.0032) [2024-06-12 21:24:00,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49425.3, 300 sec: 49152.0). Total num frames: 1367080960. Throughput: 0: 48803.2. Samples: 895942420. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-12 21:24:00,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 21:24:02,564][71000] Updated weights for policy 0, policy_version 83444 (0.0038) [2024-06-12 21:24:05,080][71000] Updated weights for policy 0, policy_version 83454 (0.0033) [2024-06-12 21:24:05,939][70768] Fps is (10 sec: 50791.3, 60 sec: 48879.1, 300 sec: 49207.5). Total num frames: 1367326720. Throughput: 0: 49342.3. Samples: 896101660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-12 21:24:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:24:09,273][71000] Updated weights for policy 0, policy_version 83464 (0.0033) [2024-06-12 21:24:10,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48878.8, 300 sec: 49152.0). 
Total num frames: 1367572480. Throughput: 0: 49239.5. Samples: 896397560. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-12 21:24:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:24:11,776][71000] Updated weights for policy 0, policy_version 83474 (0.0028) [2024-06-12 21:24:15,865][71000] Updated weights for policy 0, policy_version 83484 (0.0024) [2024-06-12 21:24:15,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 1367801856. Throughput: 0: 49330.0. Samples: 896695820. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-12 21:24:15,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:24:18,441][71000] Updated weights for policy 0, policy_version 83494 (0.0026) [2024-06-12 21:24:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49151.8, 300 sec: 49152.0). Total num frames: 1368064000. Throughput: 0: 49212.1. Samples: 896836140. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-12 21:24:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:24:22,470][71000] Updated weights for policy 0, policy_version 83504 (0.0024) [2024-06-12 21:24:25,151][71000] Updated weights for policy 0, policy_version 83514 (0.0030) [2024-06-12 21:24:25,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48606.0, 300 sec: 49207.6). Total num frames: 1368309760. Throughput: 0: 49215.0. Samples: 897131540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 21:24:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:24:28,792][70980] Signal inference workers to stop experience collection... (13200 times) [2024-06-12 21:24:28,793][70980] Signal inference workers to resume experience collection... 
(13200 times) [2024-06-12 21:24:28,802][71000] InferenceWorker_p0-w0: stopping experience collection (13200 times) [2024-06-12 21:24:28,822][71000] InferenceWorker_p0-w0: resuming experience collection (13200 times) [2024-06-12 21:24:28,934][71000] Updated weights for policy 0, policy_version 83524 (0.0026) [2024-06-12 21:24:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 1368555520. Throughput: 0: 49262.6. Samples: 897431380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 21:24:30,941][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:24:31,717][71000] Updated weights for policy 0, policy_version 83534 (0.0025) [2024-06-12 21:24:35,940][70768] Fps is (10 sec: 45874.2, 60 sec: 49427.2, 300 sec: 49096.4). Total num frames: 1368768512. Throughput: 0: 48961.4. Samples: 897569640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 21:24:35,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 21:24:36,046][71000] Updated weights for policy 0, policy_version 83544 (0.0031) [2024-06-12 21:24:38,467][71000] Updated weights for policy 0, policy_version 83554 (0.0032) [2024-06-12 21:24:40,939][70768] Fps is (10 sec: 49153.3, 60 sec: 49152.2, 300 sec: 49096.5). Total num frames: 1369047040. Throughput: 0: 49047.7. Samples: 897864160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 21:24:40,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:24:42,447][71000] Updated weights for policy 0, policy_version 83564 (0.0037) [2024-06-12 21:24:45,029][71000] Updated weights for policy 0, policy_version 83574 (0.0030) [2024-06-12 21:24:45,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1369309184. Throughput: 0: 49191.7. Samples: 898156060. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 21:24:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:24:48,842][71000] Updated weights for policy 0, policy_version 83584 (0.0026) [2024-06-12 21:24:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 1369538560. Throughput: 0: 49194.1. Samples: 898315400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 21:24:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:24:51,803][71000] Updated weights for policy 0, policy_version 83594 (0.0036) [2024-06-12 21:24:55,514][71000] Updated weights for policy 0, policy_version 83604 (0.0031) [2024-06-12 21:24:55,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 1369767936. Throughput: 0: 49067.1. Samples: 898605580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 21:24:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:24:58,268][71000] Updated weights for policy 0, policy_version 83614 (0.0020) [2024-06-12 21:25:00,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 1370030080. Throughput: 0: 48945.2. Samples: 898898360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 21:25:00,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:25:02,512][71000] Updated weights for policy 0, policy_version 83624 (0.0026) [2024-06-12 21:25:04,973][71000] Updated weights for policy 0, policy_version 83634 (0.0039) [2024-06-12 21:25:05,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 1370275840. Throughput: 0: 49254.4. Samples: 899052580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 21:25:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:25:08,983][71000] Updated weights for policy 0, policy_version 83644 (0.0027) [2024-06-12 21:25:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49152.0). 
Total num frames: 1370521600. Throughput: 0: 49391.8. Samples: 899354180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-12 21:25:10,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 21:25:11,838][71000] Updated weights for policy 0, policy_version 83654 (0.0038) [2024-06-12 21:25:15,394][71000] Updated weights for policy 0, policy_version 83664 (0.0029) [2024-06-12 21:25:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 1370767360. Throughput: 0: 49218.8. Samples: 899646220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 21:25:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:25:18,356][71000] Updated weights for policy 0, policy_version 83674 (0.0031) [2024-06-12 21:25:20,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 1371013120. Throughput: 0: 49416.6. Samples: 899793380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 21:25:20,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:25:22,258][71000] Updated weights for policy 0, policy_version 83684 (0.0035) [2024-06-12 21:25:24,835][71000] Updated weights for policy 0, policy_version 83694 (0.0026) [2024-06-12 21:25:25,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49424.9, 300 sec: 49263.1). Total num frames: 1371275264. Throughput: 0: 49526.1. Samples: 900092840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 21:25:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:25:28,571][71000] Updated weights for policy 0, policy_version 83704 (0.0032) [2024-06-12 21:25:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.3, 300 sec: 49207.6). Total num frames: 1371521024. Throughput: 0: 49794.4. Samples: 900396800. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 21:25:30,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 21:25:31,618][71000] Updated weights for policy 0, policy_version 83714 (0.0022) [2024-06-12 21:25:35,216][71000] Updated weights for policy 0, policy_version 83724 (0.0032) [2024-06-12 21:25:35,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49971.3, 300 sec: 49263.1). Total num frames: 1371766784. Throughput: 0: 49483.7. Samples: 900542160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 21:25:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:25:38,279][71000] Updated weights for policy 0, policy_version 83734 (0.0027) [2024-06-12 21:25:40,613][70980] Signal inference workers to stop experience collection... (13250 times) [2024-06-12 21:25:40,662][71000] InferenceWorker_p0-w0: stopping experience collection (13250 times) [2024-06-12 21:25:40,669][70980] Signal inference workers to resume experience collection... (13250 times) [2024-06-12 21:25:40,675][71000] InferenceWorker_p0-w0: resuming experience collection (13250 times) [2024-06-12 21:25:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 1372028928. Throughput: 0: 49618.8. Samples: 900838420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 21:25:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:25:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000083742_1372028928.pth... [2024-06-12 21:25:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000083019_1360183296.pth [2024-06-12 21:25:41,627][71000] Updated weights for policy 0, policy_version 83744 (0.0025) [2024-06-12 21:25:44,725][71000] Updated weights for policy 0, policy_version 83754 (0.0030) [2024-06-12 21:25:45,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1372274688. Throughput: 0: 49640.9. Samples: 901132200. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 21:25:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:25:48,506][71000] Updated weights for policy 0, policy_version 83764 (0.0038) [2024-06-12 21:25:50,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49698.0, 300 sec: 49263.0). Total num frames: 1372520448. Throughput: 0: 49684.7. Samples: 901288400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 21:25:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:25:51,445][71000] Updated weights for policy 0, policy_version 83774 (0.0031) [2024-06-12 21:25:54,887][71000] Updated weights for policy 0, policy_version 83784 (0.0027) [2024-06-12 21:25:55,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49971.3, 300 sec: 49374.2). Total num frames: 1372766208. Throughput: 0: 49448.1. Samples: 901579340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 21:25:55,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 21:25:58,573][71000] Updated weights for policy 0, policy_version 83794 (0.0031) [2024-06-12 21:26:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 1373011968. Throughput: 0: 49643.1. Samples: 901880160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 21:26:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:26:01,490][71000] Updated weights for policy 0, policy_version 83804 (0.0035) [2024-06-12 21:26:05,107][71000] Updated weights for policy 0, policy_version 83814 (0.0029) [2024-06-12 21:26:05,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 1373257728. Throughput: 0: 49476.2. Samples: 902019820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 21:26:05,941][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:26:08,423][71000] Updated weights for policy 0, policy_version 83824 (0.0032) [2024-06-12 21:26:10,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.2, 300 sec: 49207.5). 
Total num frames: 1373487104. Throughput: 0: 49337.0. Samples: 902313000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 21:26:10,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:26:11,645][71000] Updated weights for policy 0, policy_version 83834 (0.0031) [2024-06-12 21:26:14,878][71000] Updated weights for policy 0, policy_version 83844 (0.0022) [2024-06-12 21:26:15,939][70768] Fps is (10 sec: 47514.9, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 1373732864. Throughput: 0: 49111.6. Samples: 902606820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 21:26:15,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:26:18,244][71000] Updated weights for policy 0, policy_version 83854 (0.0037) [2024-06-12 21:26:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 1373978624. Throughput: 0: 49207.0. Samples: 902756480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 21:26:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:26:21,482][71000] Updated weights for policy 0, policy_version 83864 (0.0029) [2024-06-12 21:26:25,168][71000] Updated weights for policy 0, policy_version 83874 (0.0034) [2024-06-12 21:26:25,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1374240768. Throughput: 0: 49491.2. Samples: 903065520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 21:26:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:26:27,866][71000] Updated weights for policy 0, policy_version 83884 (0.0034) [2024-06-12 21:26:30,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49424.9, 300 sec: 49263.1). Total num frames: 1374486528. Throughput: 0: 49646.1. Samples: 903366280. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 21:26:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:26:31,714][71000] Updated weights for policy 0, policy_version 83894 (0.0026) [2024-06-12 21:26:34,502][71000] Updated weights for policy 0, policy_version 83904 (0.0026) [2024-06-12 21:26:35,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49424.9, 300 sec: 49374.2). Total num frames: 1374732288. Throughput: 0: 49403.6. Samples: 903511560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 21:26:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:26:37,998][71000] Updated weights for policy 0, policy_version 83914 (0.0025) [2024-06-12 21:26:40,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 1374994432. Throughput: 0: 49517.3. Samples: 903807620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 21:26:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:26:41,095][71000] Updated weights for policy 0, policy_version 83924 (0.0026) [2024-06-12 21:26:44,986][71000] Updated weights for policy 0, policy_version 83934 (0.0022) [2024-06-12 21:26:45,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1375240192. Throughput: 0: 49465.9. Samples: 904106120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 21:26:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:26:47,719][71000] Updated weights for policy 0, policy_version 83944 (0.0024) [2024-06-12 21:26:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 1375469568. Throughput: 0: 49586.9. Samples: 904251220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 21:26:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 21:26:51,719][70980] Signal inference workers to stop experience collection... 
(13300 times) [2024-06-12 21:26:51,764][71000] InferenceWorker_p0-w0: stopping experience collection (13300 times) [2024-06-12 21:26:51,772][70980] Signal inference workers to resume experience collection... (13300 times) [2024-06-12 21:26:51,786][71000] InferenceWorker_p0-w0: resuming experience collection (13300 times) [2024-06-12 21:26:51,788][71000] Updated weights for policy 0, policy_version 83954 (0.0017) [2024-06-12 21:26:54,297][71000] Updated weights for policy 0, policy_version 83964 (0.0024) [2024-06-12 21:26:55,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 1375731712. Throughput: 0: 49604.3. Samples: 904545200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-12 21:26:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:26:58,263][71000] Updated weights for policy 0, policy_version 83974 (0.0031) [2024-06-12 21:27:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1375977472. Throughput: 0: 49683.9. Samples: 904842600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 21:27:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:27:00,983][71000] Updated weights for policy 0, policy_version 83984 (0.0025) [2024-06-12 21:27:04,870][71000] Updated weights for policy 0, policy_version 83994 (0.0033) [2024-06-12 21:27:05,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1376206848. Throughput: 0: 49555.5. Samples: 904986480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 21:27:05,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:27:07,906][71000] Updated weights for policy 0, policy_version 84004 (0.0028) [2024-06-12 21:27:10,941][70768] Fps is (10 sec: 47507.9, 60 sec: 49424.0, 300 sec: 49373.9). Total num frames: 1376452608. Throughput: 0: 49227.5. Samples: 905280820. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 21:27:10,941][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 21:27:11,401][71000] Updated weights for policy 0, policy_version 84014 (0.0024) [2024-06-12 21:27:14,361][71000] Updated weights for policy 0, policy_version 84024 (0.0033) [2024-06-12 21:27:15,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1376714752. Throughput: 0: 49149.1. Samples: 905577980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 21:27:15,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 21:27:18,280][71000] Updated weights for policy 0, policy_version 84034 (0.0028) [2024-06-12 21:27:20,840][71000] Updated weights for policy 0, policy_version 84044 (0.0030) [2024-06-12 21:27:20,940][70768] Fps is (10 sec: 52433.9, 60 sec: 49971.0, 300 sec: 49429.7). Total num frames: 1376976896. Throughput: 0: 49445.2. Samples: 905736600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 21:27:20,941][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:27:24,812][71000] Updated weights for policy 0, policy_version 84054 (0.0034) [2024-06-12 21:27:25,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49151.8, 300 sec: 49318.6). Total num frames: 1377189888. Throughput: 0: 49178.1. Samples: 906020640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 21:27:25,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:27:27,675][71000] Updated weights for policy 0, policy_version 84064 (0.0024) [2024-06-12 21:27:30,940][70768] Fps is (10 sec: 45876.3, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1377435648. Throughput: 0: 49025.3. Samples: 906312260. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 21:27:30,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 21:27:31,167][71000] Updated weights for policy 0, policy_version 84074 (0.0023) [2024-06-12 21:27:34,566][71000] Updated weights for policy 0, policy_version 84084 (0.0020) [2024-06-12 21:27:35,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1377697792. Throughput: 0: 49169.0. Samples: 906463820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 21:27:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:27:37,998][71000] Updated weights for policy 0, policy_version 84094 (0.0025) [2024-06-12 21:27:40,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1377943552. Throughput: 0: 49292.9. Samples: 906763380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 21:27:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:27:40,959][71000] Updated weights for policy 0, policy_version 84104 (0.0025) [2024-06-12 21:27:40,960][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000084104_1377959936.pth... [2024-06-12 21:27:41,022][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000083380_1366097920.pth [2024-06-12 21:27:44,917][71000] Updated weights for policy 0, policy_version 84114 (0.0022) [2024-06-12 21:27:45,940][70768] Fps is (10 sec: 47512.5, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 1378172928. Throughput: 0: 49095.4. Samples: 907051900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 21:27:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 21:27:47,496][71000] Updated weights for policy 0, policy_version 84124 (0.0029) [2024-06-12 21:27:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1378418688. Throughput: 0: 49080.9. Samples: 907195120. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:27:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:27:51,416][71000] Updated weights for policy 0, policy_version 84134 (0.0033) [2024-06-12 21:27:54,345][71000] Updated weights for policy 0, policy_version 84144 (0.0031) [2024-06-12 21:27:55,940][70768] Fps is (10 sec: 49153.0, 60 sec: 48879.1, 300 sec: 49318.6). Total num frames: 1378664448. Throughput: 0: 48994.2. Samples: 907485500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:27:55,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 21:27:57,953][71000] Updated weights for policy 0, policy_version 84154 (0.0027) [2024-06-12 21:27:59,660][70980] Signal inference workers to stop experience collection... (13350 times) [2024-06-12 21:27:59,701][71000] InferenceWorker_p0-w0: stopping experience collection (13350 times) [2024-06-12 21:27:59,709][70980] Signal inference workers to resume experience collection... (13350 times) [2024-06-12 21:27:59,718][71000] InferenceWorker_p0-w0: resuming experience collection (13350 times) [2024-06-12 21:28:00,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48878.9, 300 sec: 49207.6). Total num frames: 1378910208. Throughput: 0: 49040.4. Samples: 907784800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:28:00,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:28:01,286][71000] Updated weights for policy 0, policy_version 84164 (0.0024) [2024-06-12 21:28:04,661][71000] Updated weights for policy 0, policy_version 84174 (0.0030) [2024-06-12 21:28:05,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 1379139584. Throughput: 0: 48715.0. Samples: 907928760. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:28:05,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:28:07,686][71000] Updated weights for policy 0, policy_version 84184 (0.0023) [2024-06-12 21:28:10,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49152.8, 300 sec: 49318.6). Total num frames: 1379401728. Throughput: 0: 49057.2. Samples: 908228220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:28:10,941][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:28:11,639][71000] Updated weights for policy 0, policy_version 84194 (0.0033) [2024-06-12 21:28:14,652][71000] Updated weights for policy 0, policy_version 84204 (0.0037) [2024-06-12 21:28:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 1379647488. Throughput: 0: 48797.8. Samples: 908508160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:28:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:28:18,319][71000] Updated weights for policy 0, policy_version 84214 (0.0028) [2024-06-12 21:28:20,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 1379893248. Throughput: 0: 48874.4. Samples: 908663180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:28:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:28:21,229][71000] Updated weights for policy 0, policy_version 84224 (0.0024) [2024-06-12 21:28:24,621][71000] Updated weights for policy 0, policy_version 84234 (0.0032) [2024-06-12 21:28:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 1380139008. Throughput: 0: 48811.3. Samples: 908959880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:28:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:28:27,639][71000] Updated weights for policy 0, policy_version 84244 (0.0030) [2024-06-12 21:28:30,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 49430.2). 
Total num frames: 1380384768. Throughput: 0: 48962.0. Samples: 909255180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:28:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:28:31,558][71000] Updated weights for policy 0, policy_version 84254 (0.0031) [2024-06-12 21:28:34,533][71000] Updated weights for policy 0, policy_version 84264 (0.0040) [2024-06-12 21:28:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.8, 300 sec: 49263.1). Total num frames: 1380630528. Throughput: 0: 48998.7. Samples: 909400060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:28:35,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:28:38,178][71000] Updated weights for policy 0, policy_version 84274 (0.0024) [2024-06-12 21:28:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 1380892672. Throughput: 0: 49209.3. Samples: 909699920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:28:40,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:28:41,221][71000] Updated weights for policy 0, policy_version 84284 (0.0034) [2024-06-12 21:28:44,603][71000] Updated weights for policy 0, policy_version 84294 (0.0025) [2024-06-12 21:28:45,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48879.1, 300 sec: 49207.6). Total num frames: 1381105664. Throughput: 0: 49285.4. Samples: 910002640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:28:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:28:47,676][71000] Updated weights for policy 0, policy_version 84304 (0.0029) [2024-06-12 21:28:50,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1381384192. Throughput: 0: 49228.3. Samples: 910144040. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:28:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:28:51,029][71000] Updated weights for policy 0, policy_version 84314 (0.0026) [2024-06-12 21:28:54,503][71000] Updated weights for policy 0, policy_version 84324 (0.0032) [2024-06-12 21:28:55,942][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49263.0). Total num frames: 1381613568. Throughput: 0: 49181.5. Samples: 910441380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:28:55,942][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:28:57,027][70980] Signal inference workers to stop experience collection... (13400 times) [2024-06-12 21:28:57,029][70980] Signal inference workers to resume experience collection... (13400 times) [2024-06-12 21:28:57,038][71000] InferenceWorker_p0-w0: stopping experience collection (13400 times) [2024-06-12 21:28:57,060][71000] InferenceWorker_p0-w0: resuming experience collection (13400 times) [2024-06-12 21:28:57,913][71000] Updated weights for policy 0, policy_version 84334 (0.0038) [2024-06-12 21:29:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1381875712. Throughput: 0: 49306.9. Samples: 910726980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:29:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:29:01,109][71000] Updated weights for policy 0, policy_version 84344 (0.0024) [2024-06-12 21:29:04,478][71000] Updated weights for policy 0, policy_version 84354 (0.0026) [2024-06-12 21:29:05,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 1382105088. Throughput: 0: 49426.1. Samples: 910887340. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:29:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:29:07,995][71000] Updated weights for policy 0, policy_version 84364 (0.0030) [2024-06-12 21:29:10,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.2, 300 sec: 49318.6). Total num frames: 1382350848. Throughput: 0: 49392.4. Samples: 911182540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:29:10,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:29:11,235][71000] Updated weights for policy 0, policy_version 84374 (0.0034) [2024-06-12 21:29:14,501][71000] Updated weights for policy 0, policy_version 84384 (0.0029) [2024-06-12 21:29:15,939][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1382596608. Throughput: 0: 49380.0. Samples: 911477280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:29:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:29:17,885][71000] Updated weights for policy 0, policy_version 84394 (0.0036) [2024-06-12 21:29:20,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1382858752. Throughput: 0: 49378.7. Samples: 911622100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:29:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:29:21,227][71000] Updated weights for policy 0, policy_version 84404 (0.0024) [2024-06-12 21:29:24,377][71000] Updated weights for policy 0, policy_version 84414 (0.0027) [2024-06-12 21:29:25,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 1383120896. Throughput: 0: 49415.0. Samples: 911923600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:29:25,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 21:29:27,652][71000] Updated weights for policy 0, policy_version 84424 (0.0035) [2024-06-12 21:29:30,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49424.8, 300 sec: 49429.7). 
Total num frames: 1383350272. Throughput: 0: 49495.2. Samples: 912229940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:29:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:29:31,064][71000] Updated weights for policy 0, policy_version 84434 (0.0024) [2024-06-12 21:29:34,296][71000] Updated weights for policy 0, policy_version 84444 (0.0023) [2024-06-12 21:29:35,939][70768] Fps is (10 sec: 47514.4, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1383596032. Throughput: 0: 49595.2. Samples: 912375820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:29:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 21:29:37,441][71000] Updated weights for policy 0, policy_version 84454 (0.0034) [2024-06-12 21:29:40,940][70768] Fps is (10 sec: 49153.2, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1383841792. Throughput: 0: 49449.9. Samples: 912666620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:29:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:29:41,066][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000084464_1383858176.pth... [2024-06-12 21:29:41,067][71000] Updated weights for policy 0, policy_version 84464 (0.0035) [2024-06-12 21:29:41,122][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000083742_1372028928.pth [2024-06-12 21:29:44,187][71000] Updated weights for policy 0, policy_version 84474 (0.0030) [2024-06-12 21:29:45,940][70768] Fps is (10 sec: 50789.2, 60 sec: 49971.1, 300 sec: 49374.1). Total num frames: 1384103936. Throughput: 0: 49776.9. Samples: 912966940. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:29:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:29:47,417][71000] Updated weights for policy 0, policy_version 84484 (0.0025) [2024-06-12 21:29:50,902][71000] Updated weights for policy 0, policy_version 84494 (0.0025) [2024-06-12 21:29:50,944][70768] Fps is (10 sec: 50769.0, 60 sec: 49421.6, 300 sec: 49429.0). Total num frames: 1384349696. Throughput: 0: 49481.9. Samples: 913114240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:29:50,944][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:29:54,056][71000] Updated weights for policy 0, policy_version 84504 (0.0033) [2024-06-12 21:29:55,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49698.3, 300 sec: 49374.2). Total num frames: 1384595456. Throughput: 0: 49714.3. Samples: 913419680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:29:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:29:57,188][71000] Updated weights for policy 0, policy_version 84514 (0.0022) [2024-06-12 21:30:00,707][71000] Updated weights for policy 0, policy_version 84524 (0.0029) [2024-06-12 21:30:00,940][70768] Fps is (10 sec: 50811.1, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1384857600. Throughput: 0: 49891.8. Samples: 913722420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:30:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:30:04,018][71000] Updated weights for policy 0, policy_version 84534 (0.0027) [2024-06-12 21:30:04,034][70980] Signal inference workers to stop experience collection... (13450 times) [2024-06-12 21:30:04,034][70980] Signal inference workers to resume experience collection... 
(13450 times) [2024-06-12 21:30:04,070][71000] InferenceWorker_p0-w0: stopping experience collection (13450 times) [2024-06-12 21:30:04,070][71000] InferenceWorker_p0-w0: resuming experience collection (13450 times) [2024-06-12 21:30:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 1385103360. Throughput: 0: 49985.9. Samples: 913871460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:30:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:30:07,088][71000] Updated weights for policy 0, policy_version 84544 (0.0029) [2024-06-12 21:30:10,722][71000] Updated weights for policy 0, policy_version 84554 (0.0024) [2024-06-12 21:30:10,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1385349120. Throughput: 0: 49922.7. Samples: 914170120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:30:10,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:30:13,786][71000] Updated weights for policy 0, policy_version 84564 (0.0025) [2024-06-12 21:30:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 1385578496. Throughput: 0: 49770.0. Samples: 914469580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:30:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:30:17,167][71000] Updated weights for policy 0, policy_version 84574 (0.0023) [2024-06-12 21:30:20,672][71000] Updated weights for policy 0, policy_version 84584 (0.0035) [2024-06-12 21:30:20,942][70768] Fps is (10 sec: 47500.4, 60 sec: 49422.8, 300 sec: 49318.2). Total num frames: 1385824256. Throughput: 0: 49557.2. Samples: 914606040. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 21:30:20,943][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:30:23,675][71000] Updated weights for policy 0, policy_version 84594 (0.0022) [2024-06-12 21:30:25,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 1386086400. Throughput: 0: 49791.4. Samples: 914907240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 21:30:25,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:30:27,007][71000] Updated weights for policy 0, policy_version 84604 (0.0020) [2024-06-12 21:30:30,414][71000] Updated weights for policy 0, policy_version 84614 (0.0029) [2024-06-12 21:30:30,940][70768] Fps is (10 sec: 50804.6, 60 sec: 49698.3, 300 sec: 49374.1). Total num frames: 1386332160. Throughput: 0: 49761.0. Samples: 915206180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 21:30:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:30:33,719][71000] Updated weights for policy 0, policy_version 84624 (0.0026) [2024-06-12 21:30:35,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49697.9, 300 sec: 49318.6). Total num frames: 1386577920. Throughput: 0: 49743.6. Samples: 915352500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 21:30:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:30:36,885][71000] Updated weights for policy 0, policy_version 84634 (0.0030) [2024-06-12 21:30:40,070][71000] Updated weights for policy 0, policy_version 84644 (0.0034) [2024-06-12 21:30:40,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 1386823680. Throughput: 0: 49706.4. Samples: 915656480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 21:30:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:30:43,675][71000] Updated weights for policy 0, policy_version 84654 (0.0029) [2024-06-12 21:30:45,941][70768] Fps is (10 sec: 50783.4, 60 sec: 49697.0, 300 sec: 49373.9). 
Total num frames: 1387085824. Throughput: 0: 49435.4. Samples: 915947080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 21:30:45,942][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:30:46,713][71000] Updated weights for policy 0, policy_version 84664 (0.0039) [2024-06-12 21:30:50,037][71000] Updated weights for policy 0, policy_version 84674 (0.0030) [2024-06-12 21:30:50,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49974.6, 300 sec: 49429.7). Total num frames: 1387347968. Throughput: 0: 49799.4. Samples: 916112440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 21:30:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:30:53,581][71000] Updated weights for policy 0, policy_version 84684 (0.0031) [2024-06-12 21:30:55,940][70768] Fps is (10 sec: 49159.5, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 1387577344. Throughput: 0: 49709.4. Samples: 916407040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 21:30:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:30:56,893][71000] Updated weights for policy 0, policy_version 84694 (0.0027) [2024-06-12 21:31:00,172][71000] Updated weights for policy 0, policy_version 84704 (0.0026) [2024-06-12 21:31:00,939][70768] Fps is (10 sec: 45876.2, 60 sec: 49152.2, 300 sec: 49318.7). Total num frames: 1387806720. Throughput: 0: 49541.5. Samples: 916698940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 21:31:00,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:31:03,511][71000] Updated weights for policy 0, policy_version 84714 (0.0027) [2024-06-12 21:31:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1388085248. Throughput: 0: 50054.1. Samples: 916858340. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 21:31:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:31:06,594][71000] Updated weights for policy 0, policy_version 84724 (0.0027) [2024-06-12 21:31:09,989][71000] Updated weights for policy 0, policy_version 84734 (0.0024) [2024-06-12 21:31:10,942][70768] Fps is (10 sec: 52415.8, 60 sec: 49696.1, 300 sec: 49484.8). Total num frames: 1388331008. Throughput: 0: 49814.3. Samples: 917149000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-12 21:31:10,943][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:31:13,017][71000] Updated weights for policy 0, policy_version 84744 (0.0028) [2024-06-12 21:31:15,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1388544000. Throughput: 0: 49762.2. Samples: 917445480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:31:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:31:15,961][70980] Signal inference workers to stop experience collection... (13500 times) [2024-06-12 21:31:15,962][70980] Signal inference workers to resume experience collection... (13500 times) [2024-06-12 21:31:16,010][71000] InferenceWorker_p0-w0: stopping experience collection (13500 times) [2024-06-12 21:31:16,010][71000] InferenceWorker_p0-w0: resuming experience collection (13500 times) [2024-06-12 21:31:16,598][71000] Updated weights for policy 0, policy_version 84754 (0.0029) [2024-06-12 21:31:19,999][71000] Updated weights for policy 0, policy_version 84764 (0.0030) [2024-06-12 21:31:20,939][70768] Fps is (10 sec: 47525.3, 60 sec: 49700.5, 300 sec: 49374.2). Total num frames: 1388806144. Throughput: 0: 49473.1. Samples: 917578780. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:31:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:31:23,375][71000] Updated weights for policy 0, policy_version 84774 (0.0036) [2024-06-12 21:31:25,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1389068288. Throughput: 0: 49328.2. Samples: 917876240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:31:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:31:26,700][71000] Updated weights for policy 0, policy_version 84784 (0.0026) [2024-06-12 21:31:30,368][71000] Updated weights for policy 0, policy_version 84794 (0.0036) [2024-06-12 21:31:30,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1389297664. Throughput: 0: 49444.8. Samples: 918172020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:31:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:31:33,295][71000] Updated weights for policy 0, policy_version 84804 (0.0027) [2024-06-12 21:31:35,940][70768] Fps is (10 sec: 45874.5, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1389527040. Throughput: 0: 48829.4. Samples: 918309760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:31:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 21:31:37,060][71000] Updated weights for policy 0, policy_version 84814 (0.0029) [2024-06-12 21:31:39,729][71000] Updated weights for policy 0, policy_version 84824 (0.0030) [2024-06-12 21:31:40,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1389789184. Throughput: 0: 48786.0. Samples: 918602420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:31:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:31:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000084826_1389789184.pth... 
[2024-06-12 21:31:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000084104_1377959936.pth [2024-06-12 21:31:43,537][71000] Updated weights for policy 0, policy_version 84834 (0.0026) [2024-06-12 21:31:45,939][70768] Fps is (10 sec: 52429.8, 60 sec: 49426.4, 300 sec: 49429.7). Total num frames: 1390051328. Throughput: 0: 48968.4. Samples: 918902520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:31:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:31:46,864][71000] Updated weights for policy 0, policy_version 84844 (0.0026) [2024-06-12 21:31:50,158][71000] Updated weights for policy 0, policy_version 84854 (0.0038) [2024-06-12 21:31:50,939][70768] Fps is (10 sec: 49153.3, 60 sec: 48879.1, 300 sec: 49318.6). Total num frames: 1390280704. Throughput: 0: 48853.9. Samples: 919056760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:31:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:31:53,562][71000] Updated weights for policy 0, policy_version 84864 (0.0024) [2024-06-12 21:31:55,940][70768] Fps is (10 sec: 45874.5, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 1390510080. Throughput: 0: 48845.2. Samples: 919346920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:31:55,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:31:56,884][71000] Updated weights for policy 0, policy_version 84874 (0.0034) [2024-06-12 21:32:00,256][71000] Updated weights for policy 0, policy_version 84884 (0.0040) [2024-06-12 21:32:00,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1390755840. Throughput: 0: 48840.0. Samples: 919643280. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-12 21:32:00,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:32:03,663][71000] Updated weights for policy 0, policy_version 84894 (0.0031) [2024-06-12 21:32:05,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49152.1, 300 sec: 49429.9). Total num frames: 1391034368. Throughput: 0: 49152.9. Samples: 919790660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 21:32:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:32:06,874][71000] Updated weights for policy 0, policy_version 84904 (0.0030) [2024-06-12 21:32:10,312][71000] Updated weights for policy 0, policy_version 84914 (0.0035) [2024-06-12 21:32:10,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48880.8, 300 sec: 49318.6). Total num frames: 1391263744. Throughput: 0: 49147.4. Samples: 920087880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 21:32:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 21:32:13,174][71000] Updated weights for policy 0, policy_version 84924 (0.0026) [2024-06-12 21:32:15,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 1391493120. Throughput: 0: 48962.6. Samples: 920375340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 21:32:15,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:32:16,917][71000] Updated weights for policy 0, policy_version 84934 (0.0034) [2024-06-12 21:32:20,461][71000] Updated weights for policy 0, policy_version 84944 (0.0032) [2024-06-12 21:32:20,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 1391755264. Throughput: 0: 49117.9. Samples: 920520060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 21:32:20,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:32:23,511][71000] Updated weights for policy 0, policy_version 84954 (0.0029) [2024-06-12 21:32:25,942][70768] Fps is (10 sec: 52413.9, 60 sec: 49149.6, 300 sec: 49429.2). 
Total num frames: 1392017408. Throughput: 0: 49254.4. Samples: 920819000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 21:32:25,943][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:32:26,814][71000] Updated weights for policy 0, policy_version 84964 (0.0036) [2024-06-12 21:32:30,323][71000] Updated weights for policy 0, policy_version 84974 (0.0027) [2024-06-12 21:32:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1392246784. Throughput: 0: 49196.7. Samples: 921116380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 21:32:30,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 21:32:33,535][71000] Updated weights for policy 0, policy_version 84984 (0.0030) [2024-06-12 21:32:35,940][70768] Fps is (10 sec: 45888.5, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 1392476160. Throughput: 0: 49033.3. Samples: 921263260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 21:32:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:32:36,745][70980] Signal inference workers to stop experience collection... (13550 times) [2024-06-12 21:32:36,780][71000] InferenceWorker_p0-w0: stopping experience collection (13550 times) [2024-06-12 21:32:36,799][70980] Signal inference workers to resume experience collection... (13550 times) [2024-06-12 21:32:36,804][71000] InferenceWorker_p0-w0: resuming experience collection (13550 times) [2024-06-12 21:32:36,940][71000] Updated weights for policy 0, policy_version 84994 (0.0028) [2024-06-12 21:32:40,073][71000] Updated weights for policy 0, policy_version 85004 (0.0025) [2024-06-12 21:32:40,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48879.1, 300 sec: 49318.6). Total num frames: 1392721920. Throughput: 0: 48936.1. Samples: 921549040. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 21:32:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:32:43,371][71000] Updated weights for policy 0, policy_version 85014 (0.0034) [2024-06-12 21:32:45,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1393000448. Throughput: 0: 48983.2. Samples: 921847520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 21:32:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:32:46,919][71000] Updated weights for policy 0, policy_version 85024 (0.0026) [2024-06-12 21:32:49,903][71000] Updated weights for policy 0, policy_version 85034 (0.0039) [2024-06-12 21:32:50,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.8, 300 sec: 49318.6). Total num frames: 1393213440. Throughput: 0: 49120.7. Samples: 922001100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 21:32:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:32:53,265][71000] Updated weights for policy 0, policy_version 85044 (0.0034) [2024-06-12 21:32:55,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1393459200. Throughput: 0: 49241.9. Samples: 922303760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-12 21:32:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:32:56,679][71000] Updated weights for policy 0, policy_version 85054 (0.0031) [2024-06-12 21:33:00,118][71000] Updated weights for policy 0, policy_version 85064 (0.0033) [2024-06-12 21:33:00,939][70768] Fps is (10 sec: 52429.7, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1393737728. Throughput: 0: 49248.6. Samples: 922591520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 21:33:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:33:03,587][71000] Updated weights for policy 0, policy_version 85074 (0.0032) [2024-06-12 21:33:05,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49151.9, 300 sec: 49429.7). 
Total num frames: 1393983488. Throughput: 0: 49415.0. Samples: 922743740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 21:33:05,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 21:33:06,674][71000] Updated weights for policy 0, policy_version 85084 (0.0035) [2024-06-12 21:33:09,905][71000] Updated weights for policy 0, policy_version 85094 (0.0029) [2024-06-12 21:33:10,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 1394212864. Throughput: 0: 49405.7. Samples: 923042120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 21:33:10,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:33:13,467][71000] Updated weights for policy 0, policy_version 85104 (0.0026) [2024-06-12 21:33:15,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1394458624. Throughput: 0: 49330.4. Samples: 923336240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 21:33:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:33:16,642][71000] Updated weights for policy 0, policy_version 85114 (0.0032) [2024-06-12 21:33:20,047][71000] Updated weights for policy 0, policy_version 85124 (0.0035) [2024-06-12 21:33:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1394720768. Throughput: 0: 49455.4. Samples: 923488760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 21:33:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:33:23,466][71000] Updated weights for policy 0, policy_version 85134 (0.0028) [2024-06-12 21:33:25,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49154.3, 300 sec: 49429.7). Total num frames: 1394966528. Throughput: 0: 49531.9. Samples: 923777980. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 21:33:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:33:26,810][71000] Updated weights for policy 0, policy_version 85144 (0.0038) [2024-06-12 21:33:29,947][71000] Updated weights for policy 0, policy_version 85154 (0.0028) [2024-06-12 21:33:30,939][70768] Fps is (10 sec: 45876.1, 60 sec: 48879.1, 300 sec: 49318.6). Total num frames: 1395179520. Throughput: 0: 49339.1. Samples: 924067780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 21:33:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:33:33,373][71000] Updated weights for policy 0, policy_version 85164 (0.0029) [2024-06-12 21:33:35,940][70768] Fps is (10 sec: 49148.9, 60 sec: 49697.6, 300 sec: 49374.0). Total num frames: 1395458048. Throughput: 0: 49198.1. Samples: 924215040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 21:33:35,941][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:33:36,320][71000] Updated weights for policy 0, policy_version 85174 (0.0029) [2024-06-12 21:33:39,793][71000] Updated weights for policy 0, policy_version 85184 (0.0030) [2024-06-12 21:33:40,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1395687424. Throughput: 0: 49109.7. Samples: 924513700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 21:33:40,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:33:40,945][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000085186_1395687424.pth... [2024-06-12 21:33:41,023][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000084464_1383858176.pth [2024-06-12 21:33:43,033][71000] Updated weights for policy 0, policy_version 85194 (0.0032) [2024-06-12 21:33:45,742][70980] Signal inference workers to stop experience collection... 
(13600 times) [2024-06-12 21:33:45,771][71000] InferenceWorker_p0-w0: stopping experience collection (13600 times) [2024-06-12 21:33:45,800][70980] Signal inference workers to resume experience collection... (13600 times) [2024-06-12 21:33:45,801][71000] InferenceWorker_p0-w0: resuming experience collection (13600 times) [2024-06-12 21:33:45,939][70768] Fps is (10 sec: 49155.6, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1395949568. Throughput: 0: 49362.7. Samples: 924812840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 21:33:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:33:46,543][71000] Updated weights for policy 0, policy_version 85204 (0.0029) [2024-06-12 21:33:49,750][71000] Updated weights for policy 0, policy_version 85214 (0.0024) [2024-06-12 21:33:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1396178944. Throughput: 0: 49061.3. Samples: 924951500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:33:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:33:53,110][71000] Updated weights for policy 0, policy_version 85224 (0.0024) [2024-06-12 21:33:55,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1396457472. Throughput: 0: 49194.8. Samples: 925255880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:33:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:33:56,023][71000] Updated weights for policy 0, policy_version 85234 (0.0023) [2024-06-12 21:33:59,627][71000] Updated weights for policy 0, policy_version 85244 (0.0027) [2024-06-12 21:34:00,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1396686848. Throughput: 0: 49262.6. Samples: 925553060. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:34:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:34:02,788][71000] Updated weights for policy 0, policy_version 85254 (0.0027) [2024-06-12 21:34:05,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1396932608. Throughput: 0: 49176.2. Samples: 925701680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:34:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:34:06,167][71000] Updated weights for policy 0, policy_version 85264 (0.0027) [2024-06-12 21:34:09,630][71000] Updated weights for policy 0, policy_version 85274 (0.0027) [2024-06-12 21:34:10,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1397178368. Throughput: 0: 49271.2. Samples: 925995180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:34:10,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:34:12,786][71000] Updated weights for policy 0, policy_version 85284 (0.0030) [2024-06-12 21:34:15,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1397440512. Throughput: 0: 49494.7. Samples: 926295040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:34:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:34:16,129][71000] Updated weights for policy 0, policy_version 85294 (0.0031) [2024-06-12 21:34:19,456][71000] Updated weights for policy 0, policy_version 85304 (0.0029) [2024-06-12 21:34:20,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1397686272. Throughput: 0: 49802.0. Samples: 926456100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:34:20,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:34:22,562][71000] Updated weights for policy 0, policy_version 85314 (0.0024) [2024-06-12 21:34:25,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.1, 300 sec: 49374.2). 
Total num frames: 1397915648. Throughput: 0: 49549.4. Samples: 926743420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:34:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:34:26,177][71000] Updated weights for policy 0, policy_version 85324 (0.0035) [2024-06-12 21:34:29,301][71000] Updated weights for policy 0, policy_version 85334 (0.0026) [2024-06-12 21:34:30,939][70768] Fps is (10 sec: 47514.5, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 1398161408. Throughput: 0: 49500.9. Samples: 927040380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:34:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:34:32,893][71000] Updated weights for policy 0, policy_version 85344 (0.0029) [2024-06-12 21:34:35,939][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.7, 300 sec: 49429.7). Total num frames: 1398423552. Throughput: 0: 49574.0. Samples: 927182320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:34:35,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:34:36,026][71000] Updated weights for policy 0, policy_version 85354 (0.0031) [2024-06-12 21:34:39,712][71000] Updated weights for policy 0, policy_version 85364 (0.0030) [2024-06-12 21:34:40,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 1398669312. Throughput: 0: 49708.3. Samples: 927492760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:34:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:34:42,439][71000] Updated weights for policy 0, policy_version 85374 (0.0029) [2024-06-12 21:34:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49151.9, 300 sec: 49319.3). Total num frames: 1398898688. Throughput: 0: 49872.0. Samples: 927797300. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 21:34:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:34:46,162][71000] Updated weights for policy 0, policy_version 85384 (0.0028) [2024-06-12 21:34:48,639][71000] Updated weights for policy 0, policy_version 85394 (0.0021) [2024-06-12 21:34:50,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1399144448. Throughput: 0: 49641.8. Samples: 927935560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 21:34:50,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:34:52,897][71000] Updated weights for policy 0, policy_version 85404 (0.0034) [2024-06-12 21:34:55,590][71000] Updated weights for policy 0, policy_version 85414 (0.0028) [2024-06-12 21:34:55,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1399422976. Throughput: 0: 49690.2. Samples: 928231240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 21:34:55,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:34:58,027][70980] Signal inference workers to stop experience collection... (13650 times) [2024-06-12 21:34:58,065][71000] InferenceWorker_p0-w0: stopping experience collection (13650 times) [2024-06-12 21:34:58,072][70980] Signal inference workers to resume experience collection... (13650 times) [2024-06-12 21:34:58,080][71000] InferenceWorker_p0-w0: resuming experience collection (13650 times) [2024-06-12 21:34:59,698][71000] Updated weights for policy 0, policy_version 85424 (0.0031) [2024-06-12 21:35:00,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1399685120. Throughput: 0: 49620.3. Samples: 928527960. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 21:35:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:35:02,178][71000] Updated weights for policy 0, policy_version 85434 (0.0028) [2024-06-12 21:35:05,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 1399898112. Throughput: 0: 49428.4. Samples: 928680380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 21:35:05,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:35:06,207][71000] Updated weights for policy 0, policy_version 85444 (0.0028) [2024-06-12 21:35:08,614][71000] Updated weights for policy 0, policy_version 85454 (0.0026) [2024-06-12 21:35:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 1400160256. Throughput: 0: 49613.6. Samples: 928976040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 21:35:10,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:35:12,865][71000] Updated weights for policy 0, policy_version 85464 (0.0027) [2024-06-12 21:35:15,214][71000] Updated weights for policy 0, policy_version 85474 (0.0027) [2024-06-12 21:35:15,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.0, 300 sec: 49430.2). Total num frames: 1400406016. Throughput: 0: 49326.1. Samples: 929260060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 21:35:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:35:19,673][71000] Updated weights for policy 0, policy_version 85484 (0.0037) [2024-06-12 21:35:20,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1400651776. Throughput: 0: 49537.3. Samples: 929411500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 21:35:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:35:22,425][71000] Updated weights for policy 0, policy_version 85494 (0.0024) [2024-06-12 21:35:25,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49263.1). 
Total num frames: 1400864768. Throughput: 0: 49209.0. Samples: 929707160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 21:35:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:35:26,293][71000] Updated weights for policy 0, policy_version 85504 (0.0028) [2024-06-12 21:35:29,123][71000] Updated weights for policy 0, policy_version 85514 (0.0022) [2024-06-12 21:35:30,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.0, 300 sec: 49374.2). Total num frames: 1401143296. Throughput: 0: 49063.9. Samples: 930005180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 21:35:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:35:32,649][71000] Updated weights for policy 0, policy_version 85524 (0.0025) [2024-06-12 21:35:35,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1401372672. Throughput: 0: 49474.2. Samples: 930161900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 21:35:35,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:35:35,992][71000] Updated weights for policy 0, policy_version 85534 (0.0030) [2024-06-12 21:35:39,401][71000] Updated weights for policy 0, policy_version 85544 (0.0027) [2024-06-12 21:35:40,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.1, 300 sec: 49263.3). Total num frames: 1401618432. Throughput: 0: 49366.7. Samples: 930452740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 21:35:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:35:41,040][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000085549_1401634816.pth... [2024-06-12 21:35:41,088][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000084826_1389789184.pth [2024-06-12 21:35:42,566][71000] Updated weights for policy 0, policy_version 85554 (0.0037) [2024-06-12 21:35:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 1401847808. 
Throughput: 0: 49282.8. Samples: 930745680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 21:35:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:35:46,291][71000] Updated weights for policy 0, policy_version 85564 (0.0025) [2024-06-12 21:35:49,586][71000] Updated weights for policy 0, policy_version 85574 (0.0027) [2024-06-12 21:35:50,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49971.1, 300 sec: 49374.2). Total num frames: 1402142720. Throughput: 0: 48976.1. Samples: 930884300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 21:35:50,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:35:53,153][71000] Updated weights for policy 0, policy_version 85584 (0.0025) [2024-06-12 21:35:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 1402355712. Throughput: 0: 48965.0. Samples: 931179460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 21:35:55,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:35:56,260][71000] Updated weights for policy 0, policy_version 85594 (0.0026) [2024-06-12 21:35:56,679][70980] Signal inference workers to stop experience collection... (13700 times) [2024-06-12 21:35:56,679][70980] Signal inference workers to resume experience collection... (13700 times) [2024-06-12 21:35:56,710][71000] InferenceWorker_p0-w0: stopping experience collection (13700 times) [2024-06-12 21:35:56,710][71000] InferenceWorker_p0-w0: resuming experience collection (13700 times) [2024-06-12 21:35:59,668][71000] Updated weights for policy 0, policy_version 85604 (0.0029) [2024-06-12 21:36:00,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48605.9, 300 sec: 49207.5). Total num frames: 1402601472. Throughput: 0: 49139.5. Samples: 931471340. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 21:36:00,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:36:02,719][71000] Updated weights for policy 0, policy_version 85614 (0.0023) [2024-06-12 21:36:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49207.9). Total num frames: 1402847232. Throughput: 0: 49108.0. Samples: 931621360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 21:36:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:36:06,034][71000] Updated weights for policy 0, policy_version 85624 (0.0029) [2024-06-12 21:36:09,286][71000] Updated weights for policy 0, policy_version 85634 (0.0028) [2024-06-12 21:36:10,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1403125760. Throughput: 0: 49226.3. Samples: 931922340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 21:36:10,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:36:12,845][71000] Updated weights for policy 0, policy_version 85644 (0.0029) [2024-06-12 21:36:15,880][71000] Updated weights for policy 0, policy_version 85654 (0.0028) [2024-06-12 21:36:15,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1403355136. Throughput: 0: 49317.0. Samples: 932224440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 21:36:15,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 21:36:19,418][71000] Updated weights for policy 0, policy_version 85664 (0.0025) [2024-06-12 21:36:20,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49151.9, 300 sec: 49263.0). Total num frames: 1403600896. Throughput: 0: 49183.4. Samples: 932375160. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 21:36:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:36:22,485][71000] Updated weights for policy 0, policy_version 85674 (0.0023) [2024-06-12 21:36:25,901][71000] Updated weights for policy 0, policy_version 85684 (0.0028) [2024-06-12 21:36:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 1403846656. Throughput: 0: 49046.5. Samples: 932659840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:36:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:36:29,173][71000] Updated weights for policy 0, policy_version 85694 (0.0029) [2024-06-12 21:36:30,941][70768] Fps is (10 sec: 50784.4, 60 sec: 49424.0, 300 sec: 49429.5). Total num frames: 1404108800. Throughput: 0: 49143.4. Samples: 932957200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:36:30,941][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:36:32,787][71000] Updated weights for policy 0, policy_version 85704 (0.0023) [2024-06-12 21:36:35,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1404321792. Throughput: 0: 49349.3. Samples: 933105020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:36:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:36:35,975][71000] Updated weights for policy 0, policy_version 85714 (0.0035) [2024-06-12 21:36:39,411][71000] Updated weights for policy 0, policy_version 85724 (0.0025) [2024-06-12 21:36:40,940][70768] Fps is (10 sec: 47520.0, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 1404583936. Throughput: 0: 49266.7. Samples: 933396460. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:36:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:36:42,751][71000] Updated weights for policy 0, policy_version 85734 (0.0027) [2024-06-12 21:36:45,863][71000] Updated weights for policy 0, policy_version 85744 (0.0029) [2024-06-12 21:36:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 1404829696. Throughput: 0: 49376.4. Samples: 933693280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:36:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:36:49,474][71000] Updated weights for policy 0, policy_version 85754 (0.0025) [2024-06-12 21:36:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1405091840. Throughput: 0: 49387.1. Samples: 933843780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:36:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:36:52,515][71000] Updated weights for policy 0, policy_version 85764 (0.0036) [2024-06-12 21:36:55,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1405304832. Throughput: 0: 49271.4. Samples: 934139560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:36:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:36:55,952][71000] Updated weights for policy 0, policy_version 85774 (0.0033) [2024-06-12 21:36:58,834][70980] Signal inference workers to stop experience collection... (13750 times) [2024-06-12 21:36:58,834][70980] Signal inference workers to resume experience collection... 
(13750 times) [2024-06-12 21:36:58,879][71000] InferenceWorker_p0-w0: stopping experience collection (13750 times) [2024-06-12 21:36:58,879][71000] InferenceWorker_p0-w0: resuming experience collection (13750 times) [2024-06-12 21:36:58,966][71000] Updated weights for policy 0, policy_version 85784 (0.0036) [2024-06-12 21:37:00,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 1405550592. Throughput: 0: 48971.6. Samples: 934428160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:37:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:37:02,488][71000] Updated weights for policy 0, policy_version 85794 (0.0028) [2024-06-12 21:37:05,665][71000] Updated weights for policy 0, policy_version 85804 (0.0039) [2024-06-12 21:37:05,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1405812736. Throughput: 0: 49008.0. Samples: 934580520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:37:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:37:09,203][71000] Updated weights for policy 0, policy_version 85814 (0.0025) [2024-06-12 21:37:10,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.8, 300 sec: 49374.2). Total num frames: 1406058496. Throughput: 0: 49382.7. Samples: 934882060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:37:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:37:12,254][71000] Updated weights for policy 0, policy_version 85824 (0.0023) [2024-06-12 21:37:15,654][71000] Updated weights for policy 0, policy_version 85834 (0.0025) [2024-06-12 21:37:15,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1406304256. Throughput: 0: 49347.3. Samples: 935177760. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:37:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:37:19,118][71000] Updated weights for policy 0, policy_version 85844 (0.0032) [2024-06-12 21:37:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49263.5). Total num frames: 1406550016. Throughput: 0: 49258.6. Samples: 935321660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 21:37:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:37:22,607][71000] Updated weights for policy 0, policy_version 85854 (0.0033) [2024-06-12 21:37:25,643][71000] Updated weights for policy 0, policy_version 85864 (0.0031) [2024-06-12 21:37:25,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1406812160. Throughput: 0: 49220.0. Samples: 935611360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 21:37:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:37:29,118][71000] Updated weights for policy 0, policy_version 85874 (0.0020) [2024-06-12 21:37:30,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49153.1, 300 sec: 49429.7). Total num frames: 1407057920. Throughput: 0: 49375.7. Samples: 935915180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 21:37:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:37:32,098][71000] Updated weights for policy 0, policy_version 85884 (0.0027) [2024-06-12 21:37:35,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1407270912. Throughput: 0: 49059.1. Samples: 936051440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 21:37:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:37:36,026][71000] Updated weights for policy 0, policy_version 85894 (0.0027) [2024-06-12 21:37:38,787][71000] Updated weights for policy 0, policy_version 85904 (0.0023) [2024-06-12 21:37:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 49263.1). 
Total num frames: 1407533056. Throughput: 0: 49034.7. Samples: 936346120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 21:37:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:37:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000085909_1407533056.pth... [2024-06-12 21:37:41,015][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000085186_1395687424.pth [2024-06-12 21:37:42,426][71000] Updated weights for policy 0, policy_version 85914 (0.0027) [2024-06-12 21:37:45,400][71000] Updated weights for policy 0, policy_version 85924 (0.0029) [2024-06-12 21:37:45,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1407795200. Throughput: 0: 49342.2. Samples: 936648560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 21:37:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:37:49,167][71000] Updated weights for policy 0, policy_version 85934 (0.0031) [2024-06-12 21:37:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 49374.1). Total num frames: 1408024576. Throughput: 0: 49413.4. Samples: 936804120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 21:37:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:37:52,007][71000] Updated weights for policy 0, policy_version 85944 (0.0029) [2024-06-12 21:37:55,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 1408253952. Throughput: 0: 49168.6. Samples: 937094640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 21:37:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:37:56,007][71000] Updated weights for policy 0, policy_version 85954 (0.0027) [2024-06-12 21:37:58,520][71000] Updated weights for policy 0, policy_version 85964 (0.0025) [2024-06-12 21:38:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49424.9, 300 sec: 49263.1). Total num frames: 1408516096. 
Throughput: 0: 49154.4. Samples: 937389720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 21:38:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:38:02,653][71000] Updated weights for policy 0, policy_version 85974 (0.0030) [2024-06-12 21:38:05,340][71000] Updated weights for policy 0, policy_version 85984 (0.0020) [2024-06-12 21:38:05,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1408778240. Throughput: 0: 49364.9. Samples: 937543080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-12 21:38:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:38:09,072][71000] Updated weights for policy 0, policy_version 85994 (0.0028) [2024-06-12 21:38:10,940][70768] Fps is (10 sec: 47514.6, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 1408991232. Throughput: 0: 49340.0. Samples: 937831660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 21:38:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:38:11,835][71000] Updated weights for policy 0, policy_version 86004 (0.0032) [2024-06-12 21:38:12,970][70980] Signal inference workers to stop experience collection... (13800 times) [2024-06-12 21:38:13,011][71000] InferenceWorker_p0-w0: stopping experience collection (13800 times) [2024-06-12 21:38:13,015][70980] Signal inference workers to resume experience collection... (13800 times) [2024-06-12 21:38:13,025][71000] InferenceWorker_p0-w0: resuming experience collection (13800 times) [2024-06-12 21:38:15,747][71000] Updated weights for policy 0, policy_version 86014 (0.0033) [2024-06-12 21:38:15,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 1409253376. Throughput: 0: 49343.4. Samples: 938135640. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 21:38:15,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:38:18,450][71000] Updated weights for policy 0, policy_version 86024 (0.0029) [2024-06-12 21:38:20,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 1409531904. Throughput: 0: 49572.7. Samples: 938282220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 21:38:20,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:38:22,376][71000] Updated weights for policy 0, policy_version 86034 (0.0033) [2024-06-12 21:38:24,956][71000] Updated weights for policy 0, policy_version 86044 (0.0022) [2024-06-12 21:38:25,939][70768] Fps is (10 sec: 54068.2, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1409794048. Throughput: 0: 49797.9. Samples: 938587020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 21:38:25,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:38:28,815][71000] Updated weights for policy 0, policy_version 86054 (0.0023) [2024-06-12 21:38:30,939][70768] Fps is (10 sec: 47514.6, 60 sec: 49152.1, 300 sec: 49318.7). Total num frames: 1410007040. Throughput: 0: 49839.6. Samples: 938891340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 21:38:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:38:31,622][71000] Updated weights for policy 0, policy_version 86064 (0.0022) [2024-06-12 21:38:35,704][71000] Updated weights for policy 0, policy_version 86074 (0.0039) [2024-06-12 21:38:35,940][70768] Fps is (10 sec: 44236.4, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1410236416. Throughput: 0: 49331.2. Samples: 939024020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 21:38:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 21:38:38,528][71000] Updated weights for policy 0, policy_version 86084 (0.0028) [2024-06-12 21:38:40,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49698.1, 300 sec: 49374.1). 
Total num frames: 1410514944. Throughput: 0: 49581.6. Samples: 939325820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 21:38:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:38:42,303][71000] Updated weights for policy 0, policy_version 86094 (0.0029) [2024-06-12 21:38:45,068][71000] Updated weights for policy 0, policy_version 86104 (0.0028) [2024-06-12 21:38:45,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1410760704. Throughput: 0: 49546.0. Samples: 939619280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 21:38:45,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:38:48,703][71000] Updated weights for policy 0, policy_version 86114 (0.0030) [2024-06-12 21:38:50,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 1410990080. Throughput: 0: 49429.9. Samples: 939767420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 21:38:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:38:51,778][71000] Updated weights for policy 0, policy_version 86124 (0.0030) [2024-06-12 21:38:55,248][71000] Updated weights for policy 0, policy_version 86134 (0.0025) [2024-06-12 21:38:55,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 1411235840. Throughput: 0: 49623.7. Samples: 940064720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 21:38:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:38:58,464][71000] Updated weights for policy 0, policy_version 86144 (0.0033) [2024-06-12 21:39:00,940][70768] Fps is (10 sec: 54065.8, 60 sec: 50244.2, 300 sec: 49485.2). Total num frames: 1411530752. Throughput: 0: 49443.9. Samples: 940360620. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 21:39:00,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 21:39:01,656][71000] Updated weights for policy 0, policy_version 86154 (0.0027) [2024-06-12 21:39:04,920][71000] Updated weights for policy 0, policy_version 86164 (0.0028) [2024-06-12 21:39:05,940][70768] Fps is (10 sec: 50789.0, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 1411743744. Throughput: 0: 49769.3. Samples: 940521840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 21:39:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:39:08,422][71000] Updated weights for policy 0, policy_version 86174 (0.0029) [2024-06-12 21:39:10,939][70768] Fps is (10 sec: 45876.5, 60 sec: 49971.2, 300 sec: 49318.6). Total num frames: 1411989504. Throughput: 0: 49624.0. Samples: 940820100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 21:39:10,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:39:11,679][71000] Updated weights for policy 0, policy_version 86184 (0.0025) [2024-06-12 21:39:14,770][71000] Updated weights for policy 0, policy_version 86194 (0.0021) [2024-06-12 21:39:15,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 1412218880. Throughput: 0: 49462.2. Samples: 941117140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 21:39:15,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:39:18,357][71000] Updated weights for policy 0, policy_version 86204 (0.0025) [2024-06-12 21:39:18,369][70980] Signal inference workers to stop experience collection... (13850 times) [2024-06-12 21:39:18,369][70980] Signal inference workers to resume experience collection... 
(13850 times) [2024-06-12 21:39:18,387][71000] InferenceWorker_p0-w0: stopping experience collection (13850 times) [2024-06-12 21:39:18,387][71000] InferenceWorker_p0-w0: resuming experience collection (13850 times) [2024-06-12 21:39:20,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1412513792. Throughput: 0: 49789.3. Samples: 941264540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 21:39:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:39:21,432][71000] Updated weights for policy 0, policy_version 86214 (0.0037) [2024-06-12 21:39:25,030][71000] Updated weights for policy 0, policy_version 86224 (0.0026) [2024-06-12 21:39:25,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1412743168. Throughput: 0: 49682.8. Samples: 941561540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 21:39:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:39:28,251][71000] Updated weights for policy 0, policy_version 86234 (0.0030) [2024-06-12 21:39:30,940][70768] Fps is (10 sec: 44237.1, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1412956160. Throughput: 0: 49692.5. Samples: 941855440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 21:39:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:39:31,701][71000] Updated weights for policy 0, policy_version 86244 (0.0027) [2024-06-12 21:39:34,644][71000] Updated weights for policy 0, policy_version 86254 (0.0033) [2024-06-12 21:39:35,939][70768] Fps is (10 sec: 45875.6, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 1413201920. Throughput: 0: 49403.2. Samples: 941990560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 21:39:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:39:38,723][71000] Updated weights for policy 0, policy_version 86264 (0.0025) [2024-06-12 21:39:40,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49425.0, 300 sec: 49429.7). 
Total num frames: 1413480448. Throughput: 0: 49367.3. Samples: 942286260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 21:39:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:39:41,089][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000086274_1413513216.pth... [2024-06-12 21:39:41,094][71000] Updated weights for policy 0, policy_version 86274 (0.0021) [2024-06-12 21:39:41,125][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000085549_1401634816.pth [2024-06-12 21:39:45,124][71000] Updated weights for policy 0, policy_version 86284 (0.0027) [2024-06-12 21:39:45,940][70768] Fps is (10 sec: 52427.4, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1413726208. Throughput: 0: 49408.5. Samples: 942584000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 21:39:45,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:39:47,635][71000] Updated weights for policy 0, policy_version 86294 (0.0027) [2024-06-12 21:39:50,940][70768] Fps is (10 sec: 44237.4, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 1413922816. Throughput: 0: 49059.3. Samples: 942729500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-12 21:39:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:39:51,717][71000] Updated weights for policy 0, policy_version 86304 (0.0029) [2024-06-12 21:39:54,562][71000] Updated weights for policy 0, policy_version 86314 (0.0030) [2024-06-12 21:39:55,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 1414201344. Throughput: 0: 48986.1. Samples: 943024480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 21:39:55,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 21:39:58,245][71000] Updated weights for policy 0, policy_version 86324 (0.0024) [2024-06-12 21:40:00,940][70768] Fps is (10 sec: 55705.6, 60 sec: 49152.2, 300 sec: 49429.7). Total num frames: 1414479872. 
Throughput: 0: 49054.7. Samples: 943324600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 21:40:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:40:01,009][71000] Updated weights for policy 0, policy_version 86334 (0.0029) [2024-06-12 21:40:05,431][71000] Updated weights for policy 0, policy_version 86344 (0.0032) [2024-06-12 21:40:05,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1414709248. Throughput: 0: 49245.0. Samples: 943480560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 21:40:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:40:08,166][71000] Updated weights for policy 0, policy_version 86354 (0.0029) [2024-06-12 21:40:10,940][70768] Fps is (10 sec: 44236.3, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 1414922240. Throughput: 0: 49087.8. Samples: 943770500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 21:40:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:40:11,897][71000] Updated weights for policy 0, policy_version 86364 (0.0031) [2024-06-12 21:40:14,625][71000] Updated weights for policy 0, policy_version 86374 (0.0033) [2024-06-12 21:40:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 1415184384. Throughput: 0: 48995.0. Samples: 944060220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 21:40:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:40:17,618][70980] Signal inference workers to stop experience collection... (13900 times) [2024-06-12 21:40:17,618][70980] Signal inference workers to resume experience collection... 
(13900 times) [2024-06-12 21:40:17,637][71000] InferenceWorker_p0-w0: stopping experience collection (13900 times) [2024-06-12 21:40:17,638][71000] InferenceWorker_p0-w0: resuming experience collection (13900 times) [2024-06-12 21:40:18,573][71000] Updated weights for policy 0, policy_version 86384 (0.0029) [2024-06-12 21:40:20,940][70768] Fps is (10 sec: 52429.2, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 1415446528. Throughput: 0: 49480.3. Samples: 944217180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 21:40:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:40:21,265][71000] Updated weights for policy 0, policy_version 86394 (0.0029) [2024-06-12 21:40:25,114][71000] Updated weights for policy 0, policy_version 86404 (0.0027) [2024-06-12 21:40:25,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1415692288. Throughput: 0: 49589.1. Samples: 944517760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 21:40:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:40:27,893][71000] Updated weights for policy 0, policy_version 86414 (0.0028) [2024-06-12 21:40:30,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1415905280. Throughput: 0: 49571.3. Samples: 944814700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 21:40:30,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 21:40:31,778][71000] Updated weights for policy 0, policy_version 86424 (0.0026) [2024-06-12 21:40:34,249][71000] Updated weights for policy 0, policy_version 86434 (0.0032) [2024-06-12 21:40:35,940][70768] Fps is (10 sec: 49149.8, 60 sec: 49697.7, 300 sec: 49374.1). Total num frames: 1416183808. Throughput: 0: 49440.9. Samples: 944954360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 21:40:35,941][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:40:38,541][71000] Updated weights for policy 0, policy_version 86444 (0.0026) [2024-06-12 21:40:40,942][70768] Fps is (10 sec: 54056.6, 60 sec: 49423.6, 300 sec: 49484.9). Total num frames: 1416445952. Throughput: 0: 49413.5. Samples: 945248180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 21:40:40,942][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:40:41,070][71000] Updated weights for policy 0, policy_version 86454 (0.0035) [2024-06-12 21:40:44,944][71000] Updated weights for policy 0, policy_version 86464 (0.0026) [2024-06-12 21:40:45,940][70768] Fps is (10 sec: 49153.8, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 1416675328. Throughput: 0: 49265.7. Samples: 945541560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-12 21:40:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:40:47,902][71000] Updated weights for policy 0, policy_version 86474 (0.0030) [2024-06-12 21:40:50,940][70768] Fps is (10 sec: 45883.1, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 1416904704. Throughput: 0: 49029.9. Samples: 945686920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:40:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:40:51,543][71000] Updated weights for policy 0, policy_version 86484 (0.0027) [2024-06-12 21:40:54,406][71000] Updated weights for policy 0, policy_version 86494 (0.0026) [2024-06-12 21:40:55,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1417183232. Throughput: 0: 49277.9. Samples: 945988000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:40:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:40:58,215][71000] Updated weights for policy 0, policy_version 86504 (0.0021) [2024-06-12 21:41:00,940][70768] Fps is (10 sec: 52430.1, 60 sec: 49152.0, 300 sec: 49429.7). 
Total num frames: 1417428992. Throughput: 0: 49361.4. Samples: 946281480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:41:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:41:01,014][71000] Updated weights for policy 0, policy_version 86514 (0.0033) [2024-06-12 21:41:04,666][71000] Updated weights for policy 0, policy_version 86524 (0.0030) [2024-06-12 21:41:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1417674752. Throughput: 0: 49333.4. Samples: 946437180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:41:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:41:07,700][71000] Updated weights for policy 0, policy_version 86534 (0.0034) [2024-06-12 21:41:10,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 1417904128. Throughput: 0: 49182.2. Samples: 946730960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:41:10,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:41:11,358][71000] Updated weights for policy 0, policy_version 86544 (0.0030) [2024-06-12 21:41:14,318][71000] Updated weights for policy 0, policy_version 86554 (0.0023) [2024-06-12 21:41:15,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49971.3, 300 sec: 49429.7). Total num frames: 1418182656. Throughput: 0: 49288.1. Samples: 947032660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:41:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:41:18,140][71000] Updated weights for policy 0, policy_version 86564 (0.0026) [2024-06-12 21:41:20,744][71000] Updated weights for policy 0, policy_version 86574 (0.0031) [2024-06-12 21:41:20,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1418428416. Throughput: 0: 49584.1. Samples: 947185620. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:41:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:41:24,663][71000] Updated weights for policy 0, policy_version 86584 (0.0024) [2024-06-12 21:41:25,939][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.0, 300 sec: 49263.3). Total num frames: 1418641408. Throughput: 0: 49690.2. Samples: 947484140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:41:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:41:27,642][71000] Updated weights for policy 0, policy_version 86594 (0.0035) [2024-06-12 21:41:30,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 1418887168. Throughput: 0: 49741.9. Samples: 947779940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:41:30,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:41:31,176][71000] Updated weights for policy 0, policy_version 86604 (0.0032) [2024-06-12 21:41:34,414][70980] Signal inference workers to stop experience collection... (13950 times) [2024-06-12 21:41:34,446][71000] InferenceWorker_p0-w0: stopping experience collection (13950 times) [2024-06-12 21:41:34,470][70980] Signal inference workers to resume experience collection... (13950 times) [2024-06-12 21:41:34,470][71000] InferenceWorker_p0-w0: resuming experience collection (13950 times) [2024-06-12 21:41:34,473][71000] Updated weights for policy 0, policy_version 86614 (0.0027) [2024-06-12 21:41:35,939][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.5, 300 sec: 49429.7). Total num frames: 1419165696. Throughput: 0: 49738.1. Samples: 947925120. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:41:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:41:37,787][71000] Updated weights for policy 0, policy_version 86624 (0.0028) [2024-06-12 21:41:40,891][71000] Updated weights for policy 0, policy_version 86634 (0.0034) [2024-06-12 21:41:40,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49426.6, 300 sec: 49429.7). Total num frames: 1419411456. Throughput: 0: 49613.8. Samples: 948220620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 21:41:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:41:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000086634_1419411456.pth... [2024-06-12 21:41:41,000][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000085909_1407533056.pth [2024-06-12 21:41:44,320][71000] Updated weights for policy 0, policy_version 86644 (0.0022) [2024-06-12 21:41:45,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1419640832. Throughput: 0: 49702.1. Samples: 948518080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 21:41:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:41:47,570][71000] Updated weights for policy 0, policy_version 86654 (0.0032) [2024-06-12 21:41:50,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49698.3, 300 sec: 49429.7). Total num frames: 1419886592. Throughput: 0: 49389.4. Samples: 948659700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 21:41:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:41:51,038][71000] Updated weights for policy 0, policy_version 86664 (0.0028) [2024-06-12 21:41:54,178][71000] Updated weights for policy 0, policy_version 86674 (0.0030) [2024-06-12 21:41:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1420148736. Throughput: 0: 49541.3. Samples: 948960320. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 21:41:55,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:41:57,399][71000] Updated weights for policy 0, policy_version 86684 (0.0032) [2024-06-12 21:42:00,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1420378112. Throughput: 0: 49703.1. Samples: 949269300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 21:42:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:42:00,949][71000] Updated weights for policy 0, policy_version 86694 (0.0025) [2024-06-12 21:42:03,986][71000] Updated weights for policy 0, policy_version 86704 (0.0028) [2024-06-12 21:42:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1420640256. Throughput: 0: 49598.1. Samples: 949417540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 21:42:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:42:07,445][71000] Updated weights for policy 0, policy_version 86714 (0.0032) [2024-06-12 21:42:10,564][71000] Updated weights for policy 0, policy_version 86724 (0.0035) [2024-06-12 21:42:10,940][70768] Fps is (10 sec: 50789.1, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 1420886016. Throughput: 0: 49364.6. Samples: 949705560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 21:42:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:42:14,179][71000] Updated weights for policy 0, policy_version 86734 (0.0024) [2024-06-12 21:42:15,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1421131776. Throughput: 0: 49394.6. Samples: 950002700. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 21:42:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:42:17,210][71000] Updated weights for policy 0, policy_version 86744 (0.0023) [2024-06-12 21:42:20,873][71000] Updated weights for policy 0, policy_version 86754 (0.0024) [2024-06-12 21:42:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.8, 300 sec: 49374.1). Total num frames: 1421377536. Throughput: 0: 49474.4. Samples: 950151480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 21:42:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:42:23,928][71000] Updated weights for policy 0, policy_version 86764 (0.0032) [2024-06-12 21:42:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 1421623296. Throughput: 0: 49664.8. Samples: 950455540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 21:42:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:42:27,400][71000] Updated weights for policy 0, policy_version 86774 (0.0035) [2024-06-12 21:42:30,327][71000] Updated weights for policy 0, policy_version 86784 (0.0030) [2024-06-12 21:42:30,939][70768] Fps is (10 sec: 52430.0, 60 sec: 50244.2, 300 sec: 49596.3). Total num frames: 1421901824. Throughput: 0: 49469.1. Samples: 950744180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-12 21:42:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:42:34,050][71000] Updated weights for policy 0, policy_version 86794 (0.0027) [2024-06-12 21:42:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1422131200. Throughput: 0: 49915.7. Samples: 950905920. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:42:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:42:37,153][71000] Updated weights for policy 0, policy_version 86804 (0.0028) [2024-06-12 21:42:40,690][71000] Updated weights for policy 0, policy_version 86814 (0.0035) [2024-06-12 21:42:40,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1422360576. Throughput: 0: 49723.7. Samples: 951197880. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:42:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:42:43,467][71000] Updated weights for policy 0, policy_version 86824 (0.0026) [2024-06-12 21:42:45,940][70768] Fps is (10 sec: 49153.0, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 1422622720. Throughput: 0: 49441.3. Samples: 951494160. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:42:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:42:47,259][71000] Updated weights for policy 0, policy_version 86834 (0.0033) [2024-06-12 21:42:50,273][71000] Updated weights for policy 0, policy_version 86844 (0.0028) [2024-06-12 21:42:50,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49971.0, 300 sec: 49596.3). Total num frames: 1422884864. Throughput: 0: 49453.7. Samples: 951642960. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:42:50,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 21:42:53,780][71000] Updated weights for policy 0, policy_version 86854 (0.0028) [2024-06-12 21:42:54,761][70980] Signal inference workers to stop experience collection... (14000 times) [2024-06-12 21:42:54,794][71000] InferenceWorker_p0-w0: stopping experience collection (14000 times) [2024-06-12 21:42:54,818][70980] Signal inference workers to resume experience collection... 
(14000 times) [2024-06-12 21:42:54,819][71000] InferenceWorker_p0-w0: resuming experience collection (14000 times) [2024-06-12 21:42:55,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1423147008. Throughput: 0: 49842.9. Samples: 951948480. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:42:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:42:56,606][71000] Updated weights for policy 0, policy_version 86864 (0.0027) [2024-06-12 21:43:00,475][71000] Updated weights for policy 0, policy_version 86874 (0.0031) [2024-06-12 21:43:00,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1423360000. Throughput: 0: 50037.3. Samples: 952254380. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:43:00,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 21:43:03,522][71000] Updated weights for policy 0, policy_version 86884 (0.0031) [2024-06-12 21:43:05,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1423605760. Throughput: 0: 49906.3. Samples: 952397260. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:43:05,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:43:06,883][71000] Updated weights for policy 0, policy_version 86894 (0.0027) [2024-06-12 21:43:09,886][71000] Updated weights for policy 0, policy_version 86904 (0.0024) [2024-06-12 21:43:10,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.3, 300 sec: 49540.8). Total num frames: 1423867904. Throughput: 0: 49599.2. Samples: 952687500. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:43:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:43:13,696][71000] Updated weights for policy 0, policy_version 86914 (0.0025) [2024-06-12 21:43:15,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1424130048. Throughput: 0: 49792.3. Samples: 952984840. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:43:15,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:43:16,483][71000] Updated weights for policy 0, policy_version 86924 (0.0031) [2024-06-12 21:43:20,342][71000] Updated weights for policy 0, policy_version 86934 (0.0022) [2024-06-12 21:43:20,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.3, 300 sec: 49318.6). Total num frames: 1424343040. Throughput: 0: 49564.7. Samples: 953136320. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:43:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:43:23,219][71000] Updated weights for policy 0, policy_version 86944 (0.0031) [2024-06-12 21:43:25,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1424588800. Throughput: 0: 49564.4. Samples: 953428280. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-12 21:43:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:43:26,951][71000] Updated weights for policy 0, policy_version 86954 (0.0024) [2024-06-12 21:43:29,722][71000] Updated weights for policy 0, policy_version 86964 (0.0024) [2024-06-12 21:43:30,940][70768] Fps is (10 sec: 50789.1, 60 sec: 49151.8, 300 sec: 49540.7). Total num frames: 1424850944. Throughput: 0: 49378.0. Samples: 953716180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 21:43:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:43:33,669][71000] Updated weights for policy 0, policy_version 86974 (0.0030) [2024-06-12 21:43:35,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1425129472. Throughput: 0: 49641.3. Samples: 953876820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 21:43:35,941][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:43:36,096][71000] Updated weights for policy 0, policy_version 86984 (0.0023) [2024-06-12 21:43:40,326][71000] Updated weights for policy 0, policy_version 86994 (0.0030) [2024-06-12 21:43:40,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1425342464. Throughput: 0: 49605.7. Samples: 954180740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 21:43:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:43:41,000][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000086997_1425358848.pth... [2024-06-12 21:43:41,038][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000086274_1413513216.pth [2024-06-12 21:43:42,676][71000] Updated weights for policy 0, policy_version 87004 (0.0024) [2024-06-12 21:43:45,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1425588224. Throughput: 0: 49320.9. Samples: 954473820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 21:43:45,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:43:46,772][71000] Updated weights for policy 0, policy_version 87014 (0.0027) [2024-06-12 21:43:49,504][71000] Updated weights for policy 0, policy_version 87024 (0.0029) [2024-06-12 21:43:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1425850368. Throughput: 0: 49397.0. Samples: 954620120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 21:43:50,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 21:43:53,740][71000] Updated weights for policy 0, policy_version 87034 (0.0027) [2024-06-12 21:43:55,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 1426096128. Throughput: 0: 49555.4. Samples: 954917500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 21:43:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:43:55,998][70980] Signal inference workers to stop experience collection... (14050 times) [2024-06-12 21:43:55,998][70980] Signal inference workers to resume experience collection... (14050 times) [2024-06-12 21:43:56,041][71000] InferenceWorker_p0-w0: stopping experience collection (14050 times) [2024-06-12 21:43:56,042][71000] InferenceWorker_p0-w0: resuming experience collection (14050 times) [2024-06-12 21:43:56,129][71000] Updated weights for policy 0, policy_version 87044 (0.0023) [2024-06-12 21:44:00,103][71000] Updated weights for policy 0, policy_version 87054 (0.0020) [2024-06-12 21:44:00,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1426309120. Throughput: 0: 49662.8. Samples: 955219660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 21:44:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:44:02,447][71000] Updated weights for policy 0, policy_version 87064 (0.0025) [2024-06-12 21:44:05,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1426587648. Throughput: 0: 49362.1. Samples: 955357620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 21:44:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:44:06,838][71000] Updated weights for policy 0, policy_version 87074 (0.0030) [2024-06-12 21:44:09,231][71000] Updated weights for policy 0, policy_version 87084 (0.0031) [2024-06-12 21:44:10,939][70768] Fps is (10 sec: 54067.4, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1426849792. Throughput: 0: 49529.4. Samples: 955657100. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 21:44:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:44:13,323][71000] Updated weights for policy 0, policy_version 87094 (0.0032) [2024-06-12 21:44:15,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1427095552. Throughput: 0: 49724.2. Samples: 955953760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 21:44:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:44:15,997][71000] Updated weights for policy 0, policy_version 87104 (0.0029) [2024-06-12 21:44:19,714][71000] Updated weights for policy 0, policy_version 87114 (0.0030) [2024-06-12 21:44:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 1427341312. Throughput: 0: 49609.0. Samples: 956109220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 21:44:20,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 21:44:22,693][71000] Updated weights for policy 0, policy_version 87124 (0.0028) [2024-06-12 21:44:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1427587072. Throughput: 0: 49555.9. Samples: 956410760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 21:44:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:44:26,146][71000] Updated weights for policy 0, policy_version 87134 (0.0030) [2024-06-12 21:44:29,076][71000] Updated weights for policy 0, policy_version 87144 (0.0029) [2024-06-12 21:44:30,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1427832832. Throughput: 0: 49587.4. Samples: 956705260. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 21:44:30,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:44:32,894][71000] Updated weights for policy 0, policy_version 87154 (0.0028) [2024-06-12 21:44:35,545][71000] Updated weights for policy 0, policy_version 87164 (0.0030) [2024-06-12 21:44:35,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1428094976. Throughput: 0: 49724.9. Samples: 956857740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 21:44:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:44:39,503][71000] Updated weights for policy 0, policy_version 87174 (0.0026) [2024-06-12 21:44:40,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1428340736. Throughput: 0: 49693.0. Samples: 957153680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 21:44:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:44:42,417][71000] Updated weights for policy 0, policy_version 87184 (0.0028) [2024-06-12 21:44:45,895][71000] Updated weights for policy 0, policy_version 87194 (0.0031) [2024-06-12 21:44:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49971.2, 300 sec: 49707.4). Total num frames: 1428586496. Throughput: 0: 49515.1. Samples: 957447840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 21:44:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 21:44:49,369][71000] Updated weights for policy 0, policy_version 87204 (0.0036) [2024-06-12 21:44:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1428815872. Throughput: 0: 49677.3. Samples: 957593100. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 21:44:50,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:44:52,580][71000] Updated weights for policy 0, policy_version 87214 (0.0028) [2024-06-12 21:44:55,781][71000] Updated weights for policy 0, policy_version 87224 (0.0026) [2024-06-12 21:44:55,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1429078016. Throughput: 0: 49730.1. Samples: 957894960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 21:44:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:44:59,034][71000] Updated weights for policy 0, policy_version 87234 (0.0028) [2024-06-12 21:45:00,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1429307392. Throughput: 0: 49626.6. Samples: 958186960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 21:45:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:45:02,692][71000] Updated weights for policy 0, policy_version 87244 (0.0029) [2024-06-12 21:45:04,208][70980] Signal inference workers to stop experience collection... (14100 times) [2024-06-12 21:45:04,262][70980] Signal inference workers to resume experience collection... (14100 times) [2024-06-12 21:45:04,262][71000] InferenceWorker_p0-w0: stopping experience collection (14100 times) [2024-06-12 21:45:04,284][71000] InferenceWorker_p0-w0: resuming experience collection (14100 times) [2024-06-12 21:45:05,588][71000] Updated weights for policy 0, policy_version 87254 (0.0026) [2024-06-12 21:45:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 49651.9). Total num frames: 1429569536. Throughput: 0: 49452.0. Samples: 958334560. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 21:45:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:45:09,098][71000] Updated weights for policy 0, policy_version 87264 (0.0034) [2024-06-12 21:45:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1429815296. Throughput: 0: 49483.6. Samples: 958637520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-12 21:45:10,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:45:12,007][71000] Updated weights for policy 0, policy_version 87274 (0.0033) [2024-06-12 21:45:15,793][71000] Updated weights for policy 0, policy_version 87284 (0.0025) [2024-06-12 21:45:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1430077440. Throughput: 0: 49633.0. Samples: 958938740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:45:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:45:18,757][71000] Updated weights for policy 0, policy_version 87294 (0.0040) [2024-06-12 21:45:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1430306816. Throughput: 0: 49563.0. Samples: 959088080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:45:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:45:22,364][71000] Updated weights for policy 0, policy_version 87304 (0.0028) [2024-06-12 21:45:25,219][71000] Updated weights for policy 0, policy_version 87314 (0.0024) [2024-06-12 21:45:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.1, 300 sec: 49707.4). Total num frames: 1430568960. Throughput: 0: 49459.5. Samples: 959379360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:45:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:45:28,841][71000] Updated weights for policy 0, policy_version 87324 (0.0027) [2024-06-12 21:45:30,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49698.3, 300 sec: 49596.4). 
Total num frames: 1430814720. Throughput: 0: 49654.8. Samples: 959682300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:45:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:45:32,160][71000] Updated weights for policy 0, policy_version 87334 (0.0028) [2024-06-12 21:45:35,802][71000] Updated weights for policy 0, policy_version 87344 (0.0026) [2024-06-12 21:45:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49151.9, 300 sec: 49485.5). Total num frames: 1431044096. Throughput: 0: 49733.3. Samples: 959831100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:45:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:45:38,539][71000] Updated weights for policy 0, policy_version 87354 (0.0027) [2024-06-12 21:45:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1431306240. Throughput: 0: 49442.7. Samples: 960119880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:45:40,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:45:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000087360_1431306240.pth... [2024-06-12 21:45:40,997][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000086634_1419411456.pth [2024-06-12 21:45:42,525][71000] Updated weights for policy 0, policy_version 87364 (0.0029) [2024-06-12 21:45:45,121][71000] Updated weights for policy 0, policy_version 87374 (0.0031) [2024-06-12 21:45:45,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49698.0, 300 sec: 49707.4). Total num frames: 1431568384. Throughput: 0: 49649.1. Samples: 960421180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:45:45,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:45:49,108][71000] Updated weights for policy 0, policy_version 87384 (0.0033) [2024-06-12 21:45:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1431797760. 
Throughput: 0: 49794.7. Samples: 960575320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:45:50,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:45:51,936][71000] Updated weights for policy 0, policy_version 87394 (0.0024) [2024-06-12 21:45:55,581][71000] Updated weights for policy 0, policy_version 87404 (0.0023) [2024-06-12 21:45:55,939][70768] Fps is (10 sec: 47515.1, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1432043520. Throughput: 0: 49635.6. Samples: 960871120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:45:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:45:58,411][71000] Updated weights for policy 0, policy_version 87414 (0.0026) [2024-06-12 21:46:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1432289280. Throughput: 0: 49672.8. Samples: 961174020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 21:46:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:46:02,191][71000] Updated weights for policy 0, policy_version 87424 (0.0025) [2024-06-12 21:46:04,877][71000] Updated weights for policy 0, policy_version 87434 (0.0024) [2024-06-12 21:46:05,940][70768] Fps is (10 sec: 52427.6, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 1432567808. Throughput: 0: 49607.9. Samples: 961320440. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-06-12 21:46:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:46:08,676][71000] Updated weights for policy 0, policy_version 87444 (0.0030) [2024-06-12 21:46:09,333][70980] Signal inference workers to stop experience collection... (14150 times) [2024-06-12 21:46:09,335][70980] Signal inference workers to resume experience collection... 
(14150 times) [2024-06-12 21:46:09,344][71000] InferenceWorker_p0-w0: stopping experience collection (14150 times) [2024-06-12 21:46:09,344][71000] InferenceWorker_p0-w0: resuming experience collection (14150 times) [2024-06-12 21:46:10,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1432797184. Throughput: 0: 49822.3. Samples: 961621360. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-06-12 21:46:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:46:11,594][71000] Updated weights for policy 0, policy_version 87454 (0.0028) [2024-06-12 21:46:15,578][71000] Updated weights for policy 0, policy_version 87464 (0.0023) [2024-06-12 21:46:15,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1433026560. Throughput: 0: 49662.2. Samples: 961917100. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-06-12 21:46:15,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:46:18,475][71000] Updated weights for policy 0, policy_version 87474 (0.0028) [2024-06-12 21:46:20,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1433272320. Throughput: 0: 49491.7. Samples: 962058220. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-06-12 21:46:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:46:22,071][71000] Updated weights for policy 0, policy_version 87484 (0.0026) [2024-06-12 21:46:24,658][71000] Updated weights for policy 0, policy_version 87494 (0.0020) [2024-06-12 21:46:25,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49971.3, 300 sec: 49762.9). Total num frames: 1433567232. Throughput: 0: 49957.3. Samples: 962367960. 
Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-06-12 21:46:25,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 21:46:28,548][71000] Updated weights for policy 0, policy_version 87504 (0.0028) [2024-06-12 21:46:30,938][71000] Updated weights for policy 0, policy_version 87514 (0.0030) [2024-06-12 21:46:30,940][70768] Fps is (10 sec: 55705.2, 60 sec: 50244.2, 300 sec: 49707.4). Total num frames: 1433829376. Throughput: 0: 49844.3. Samples: 962664160. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-06-12 21:46:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:46:35,042][71000] Updated weights for policy 0, policy_version 87524 (0.0030) [2024-06-12 21:46:35,940][70768] Fps is (10 sec: 44237.0, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1434009600. Throughput: 0: 49705.4. Samples: 962812060. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-06-12 21:46:35,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:46:37,835][71000] Updated weights for policy 0, policy_version 87534 (0.0032) [2024-06-12 21:46:40,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49698.0, 300 sec: 49651.8). Total num frames: 1434288128. Throughput: 0: 49606.4. Samples: 963103420. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-06-12 21:46:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:46:41,638][71000] Updated weights for policy 0, policy_version 87544 (0.0021) [2024-06-12 21:46:44,665][71000] Updated weights for policy 0, policy_version 87554 (0.0030) [2024-06-12 21:46:45,940][70768] Fps is (10 sec: 54066.8, 60 sec: 49698.3, 300 sec: 49707.4). Total num frames: 1434550272. Throughput: 0: 49480.5. Samples: 963400640. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-06-12 21:46:45,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:46:48,463][71000] Updated weights for policy 0, policy_version 87564 (0.0035) [2024-06-12 21:46:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49971.1, 300 sec: 49651.8). 
Total num frames: 1434796032. Throughput: 0: 49740.4. Samples: 963558760. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-06-12 21:46:50,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:46:51,102][71000] Updated weights for policy 0, policy_version 87574 (0.0028) [2024-06-12 21:46:55,086][71000] Updated weights for policy 0, policy_version 87584 (0.0024) [2024-06-12 21:46:55,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1435009024. Throughput: 0: 49551.5. Samples: 963851180. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-06-12 21:46:55,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:46:57,622][71000] Updated weights for policy 0, policy_version 87594 (0.0030) [2024-06-12 21:47:00,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49971.3, 300 sec: 49651.9). Total num frames: 1435287552. Throughput: 0: 49609.9. Samples: 964149540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:47:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:47:01,287][71000] Updated weights for policy 0, policy_version 87604 (0.0029) [2024-06-12 21:47:04,343][71000] Updated weights for policy 0, policy_version 87614 (0.0030) [2024-06-12 21:47:05,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49698.1, 300 sec: 49707.4). Total num frames: 1435549696. Throughput: 0: 49856.7. Samples: 964301780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:47:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:47:08,172][71000] Updated weights for policy 0, policy_version 87624 (0.0033) [2024-06-12 21:47:10,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 1435779072. Throughput: 0: 49627.5. Samples: 964601200. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:47:10,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:47:11,206][71000] Updated weights for policy 0, policy_version 87634 (0.0023) [2024-06-12 21:47:14,588][71000] Updated weights for policy 0, policy_version 87644 (0.0028) [2024-06-12 21:47:15,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49971.2, 300 sec: 49651.9). Total num frames: 1436024832. Throughput: 0: 49616.0. Samples: 964896880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:47:15,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:47:17,754][71000] Updated weights for policy 0, policy_version 87654 (0.0037) [2024-06-12 21:47:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.0, 300 sec: 49596.3). Total num frames: 1436254208. Throughput: 0: 49482.9. Samples: 965038800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:47:20,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:47:21,315][71000] Updated weights for policy 0, policy_version 87664 (0.0034) [2024-06-12 21:47:24,564][71000] Updated weights for policy 0, policy_version 87674 (0.0022) [2024-06-12 21:47:25,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49424.9, 300 sec: 49596.3). Total num frames: 1436532736. Throughput: 0: 49531.1. Samples: 965332320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:47:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:47:28,129][71000] Updated weights for policy 0, policy_version 87684 (0.0031) [2024-06-12 21:47:30,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48605.9, 300 sec: 49540.8). Total num frames: 1436745728. Throughput: 0: 49329.4. Samples: 965620460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:47:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 21:47:31,251][71000] Updated weights for policy 0, policy_version 87694 (0.0021) [2024-06-12 21:47:31,819][70980] Signal inference workers to stop experience collection... 
(14200 times) [2024-06-12 21:47:31,866][71000] InferenceWorker_p0-w0: stopping experience collection (14200 times) [2024-06-12 21:47:31,926][70980] Signal inference workers to resume experience collection... (14200 times) [2024-06-12 21:47:31,926][71000] InferenceWorker_p0-w0: resuming experience collection (14200 times) [2024-06-12 21:47:34,580][71000] Updated weights for policy 0, policy_version 87704 (0.0028) [2024-06-12 21:47:35,940][70768] Fps is (10 sec: 45876.1, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1436991488. Throughput: 0: 49178.4. Samples: 965771780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:47:35,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 21:47:38,015][71000] Updated weights for policy 0, policy_version 87714 (0.0031) [2024-06-12 21:47:40,933][71000] Updated weights for policy 0, policy_version 87724 (0.0015) [2024-06-12 21:47:40,940][70768] Fps is (10 sec: 52426.9, 60 sec: 49697.9, 300 sec: 49651.8). Total num frames: 1437270016. Throughput: 0: 49291.6. Samples: 966069320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:47:40,941][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:47:40,960][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000087724_1437270016.pth... [2024-06-12 21:47:41,013][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000086997_1425358848.pth [2024-06-12 21:47:44,447][71000] Updated weights for policy 0, policy_version 87734 (0.0023) [2024-06-12 21:47:45,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1437515776. Throughput: 0: 49300.4. Samples: 966368060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:47:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:47:47,689][71000] Updated weights for policy 0, policy_version 87744 (0.0035) [2024-06-12 21:47:50,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49152.0, 300 sec: 49485.2). 
Total num frames: 1437745152. Throughput: 0: 49362.6. Samples: 966523100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-12 21:47:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 21:47:51,164][71000] Updated weights for policy 0, policy_version 87754 (0.0028) [2024-06-12 21:47:54,474][71000] Updated weights for policy 0, policy_version 87764 (0.0032) [2024-06-12 21:47:55,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1437990912. Throughput: 0: 49293.4. Samples: 966819400. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-12 21:47:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:47:57,683][71000] Updated weights for policy 0, policy_version 87774 (0.0026) [2024-06-12 21:48:00,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49151.9, 300 sec: 49596.3). Total num frames: 1438236672. Throughput: 0: 49535.1. Samples: 967125960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-12 21:48:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 21:48:01,210][71000] Updated weights for policy 0, policy_version 87784 (0.0023) [2024-06-12 21:48:04,307][71000] Updated weights for policy 0, policy_version 87794 (0.0027) [2024-06-12 21:48:05,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.1, 300 sec: 49596.3). Total num frames: 1438498816. Throughput: 0: 49438.4. Samples: 967263520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-12 21:48:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:48:07,551][71000] Updated weights for policy 0, policy_version 87804 (0.0025) [2024-06-12 21:48:10,883][71000] Updated weights for policy 0, policy_version 87814 (0.0025) [2024-06-12 21:48:10,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1438744576. Throughput: 0: 49613.5. Samples: 967564920. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-12 21:48:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:48:14,335][71000] Updated weights for policy 0, policy_version 87824 (0.0038) [2024-06-12 21:48:15,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.1, 300 sec: 49596.3). Total num frames: 1438973952. Throughput: 0: 49767.2. Samples: 967859980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-12 21:48:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:48:17,522][71000] Updated weights for policy 0, policy_version 87834 (0.0028) [2024-06-12 21:48:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1439219712. Throughput: 0: 49715.5. Samples: 968008980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-12 21:48:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:48:20,997][71000] Updated weights for policy 0, policy_version 87844 (0.0027) [2024-06-12 21:48:24,058][71000] Updated weights for policy 0, policy_version 87854 (0.0028) [2024-06-12 21:48:25,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.2, 300 sec: 49596.3). Total num frames: 1439481856. Throughput: 0: 49596.9. Samples: 968301160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-12 21:48:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:48:27,686][71000] Updated weights for policy 0, policy_version 87864 (0.0028) [2024-06-12 21:48:30,778][71000] Updated weights for policy 0, policy_version 87874 (0.0021) [2024-06-12 21:48:30,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1439727616. Throughput: 0: 49558.9. Samples: 968598220. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-12 21:48:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:48:34,295][71000] Updated weights for policy 0, policy_version 87884 (0.0023) [2024-06-12 21:48:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49698.1, 300 sec: 49596.3). 
Total num frames: 1439973376. Throughput: 0: 49468.6. Samples: 968749180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-12 21:48:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:48:37,750][71000] Updated weights for policy 0, policy_version 87894 (0.0028) [2024-06-12 21:48:40,805][71000] Updated weights for policy 0, policy_version 87904 (0.0030) [2024-06-12 21:48:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.2, 300 sec: 49596.3). Total num frames: 1440219136. Throughput: 0: 49177.2. Samples: 969032380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-12 21:48:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:48:44,310][71000] Updated weights for policy 0, policy_version 87914 (0.0032) [2024-06-12 21:48:45,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1440464896. Throughput: 0: 49047.7. Samples: 969333100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:48:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:48:47,618][71000] Updated weights for policy 0, policy_version 87924 (0.0029) [2024-06-12 21:48:50,624][71000] Updated weights for policy 0, policy_version 87934 (0.0037) [2024-06-12 21:48:50,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1440710656. Throughput: 0: 49260.4. Samples: 969480240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:48:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 21:48:50,952][70980] Saving new best policy, reward=0.282! [2024-06-12 21:48:54,162][71000] Updated weights for policy 0, policy_version 87944 (0.0032) [2024-06-12 21:48:54,746][70980] Signal inference workers to stop experience collection... (14250 times) [2024-06-12 21:48:54,747][70980] Signal inference workers to resume experience collection... 
(14250 times) [2024-06-12 21:48:54,784][71000] InferenceWorker_p0-w0: stopping experience collection (14250 times) [2024-06-12 21:48:54,785][71000] InferenceWorker_p0-w0: resuming experience collection (14250 times) [2024-06-12 21:48:55,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49425.0, 300 sec: 49651.8). Total num frames: 1440956416. Throughput: 0: 49165.2. Samples: 969777360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:48:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:48:57,495][71000] Updated weights for policy 0, policy_version 87954 (0.0030) [2024-06-12 21:49:00,738][71000] Updated weights for policy 0, policy_version 87964 (0.0034) [2024-06-12 21:49:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1441202176. Throughput: 0: 49187.9. Samples: 970073440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:49:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:49:04,332][71000] Updated weights for policy 0, policy_version 87974 (0.0031) [2024-06-12 21:49:05,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1441447936. Throughput: 0: 49112.9. Samples: 970219060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:49:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:49:07,488][71000] Updated weights for policy 0, policy_version 87984 (0.0031) [2024-06-12 21:49:10,820][71000] Updated weights for policy 0, policy_version 87994 (0.0030) [2024-06-12 21:49:10,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1441693696. Throughput: 0: 49026.1. Samples: 970507340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:49:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 21:49:14,086][71000] Updated weights for policy 0, policy_version 88004 (0.0029) [2024-06-12 21:49:15,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49698.0, 300 sec: 49540.8). 
Total num frames: 1441955840. Throughput: 0: 49383.6. Samples: 970820480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:49:15,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 21:49:17,402][71000] Updated weights for policy 0, policy_version 88014 (0.0031) [2024-06-12 21:49:20,357][71000] Updated weights for policy 0, policy_version 88024 (0.0046) [2024-06-12 21:49:20,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1442201600. Throughput: 0: 49362.7. Samples: 970970500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:49:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:49:23,703][71000] Updated weights for policy 0, policy_version 88034 (0.0032) [2024-06-12 21:49:25,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1442447360. Throughput: 0: 49722.9. Samples: 971269900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:49:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:49:27,066][71000] Updated weights for policy 0, policy_version 88044 (0.0028) [2024-06-12 21:49:30,394][71000] Updated weights for policy 0, policy_version 88054 (0.0034) [2024-06-12 21:49:30,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.2, 300 sec: 49429.7). Total num frames: 1442676736. Throughput: 0: 49596.0. Samples: 971564920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:49:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:49:33,553][71000] Updated weights for policy 0, policy_version 88064 (0.0037) [2024-06-12 21:49:35,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1442971648. Throughput: 0: 49601.5. Samples: 971712300. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 21:49:35,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:49:36,698][71000] Updated weights for policy 0, policy_version 88074 (0.0034) [2024-06-12 21:49:39,821][71000] Updated weights for policy 0, policy_version 88084 (0.0020) [2024-06-12 21:49:40,940][70768] Fps is (10 sec: 52427.3, 60 sec: 49698.1, 300 sec: 49540.7). Total num frames: 1443201024. Throughput: 0: 50005.7. Samples: 972027620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 21:49:40,941][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:49:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000088086_1443201024.pth... [2024-06-12 21:49:40,997][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000087360_1431306240.pth [2024-06-12 21:49:43,110][71000] Updated weights for policy 0, policy_version 88094 (0.0027) [2024-06-12 21:49:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49971.2, 300 sec: 49651.9). Total num frames: 1443463168. Throughput: 0: 50046.7. Samples: 972325540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 21:49:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:49:46,282][71000] Updated weights for policy 0, policy_version 88104 (0.0023) [2024-06-12 21:49:50,125][71000] Updated weights for policy 0, policy_version 88114 (0.0029) [2024-06-12 21:49:50,940][70768] Fps is (10 sec: 47514.6, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1443676160. Throughput: 0: 50002.2. Samples: 972469160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 21:49:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:49:53,575][71000] Updated weights for policy 0, policy_version 88124 (0.0022) [2024-06-12 21:49:55,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49971.4, 300 sec: 49651.9). Total num frames: 1443954688. Throughput: 0: 50040.6. Samples: 972759160. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 21:49:55,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:49:56,948][71000] Updated weights for policy 0, policy_version 88134 (0.0021) [2024-06-12 21:49:59,962][71000] Updated weights for policy 0, policy_version 88144 (0.0025) [2024-06-12 21:50:00,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1444184064. Throughput: 0: 49694.9. Samples: 973056740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 21:50:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:50:03,294][71000] Updated weights for policy 0, policy_version 88154 (0.0029) [2024-06-12 21:50:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1444446208. Throughput: 0: 49948.5. Samples: 973218180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 21:50:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:50:06,217][71000] Updated weights for policy 0, policy_version 88164 (0.0025) [2024-06-12 21:50:09,893][71000] Updated weights for policy 0, policy_version 88174 (0.0025) [2024-06-12 21:50:10,941][70768] Fps is (10 sec: 50781.6, 60 sec: 49969.8, 300 sec: 49540.5). Total num frames: 1444691968. Throughput: 0: 49926.5. Samples: 973516680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 21:50:10,942][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:50:13,000][71000] Updated weights for policy 0, policy_version 88184 (0.0028) [2024-06-12 21:50:13,000][70980] Signal inference workers to stop experience collection... (14300 times) [2024-06-12 21:50:13,000][70980] Signal inference workers to resume experience collection... 
(14300 times) [2024-06-12 21:50:13,043][71000] InferenceWorker_p0-w0: stopping experience collection (14300 times) [2024-06-12 21:50:13,043][71000] InferenceWorker_p0-w0: resuming experience collection (14300 times) [2024-06-12 21:50:15,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1444921344. Throughput: 0: 49716.4. Samples: 973802160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 21:50:15,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:50:16,572][71000] Updated weights for policy 0, policy_version 88194 (0.0029) [2024-06-12 21:50:19,799][71000] Updated weights for policy 0, policy_version 88204 (0.0033) [2024-06-12 21:50:20,940][70768] Fps is (10 sec: 49160.2, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1445183488. Throughput: 0: 49921.3. Samples: 973958760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 21:50:20,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:50:23,142][71000] Updated weights for policy 0, policy_version 88214 (0.0021) [2024-06-12 21:50:25,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1445429248. Throughput: 0: 49662.9. Samples: 974262440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 21:50:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:50:26,388][71000] Updated weights for policy 0, policy_version 88224 (0.0031) [2024-06-12 21:50:29,592][71000] Updated weights for policy 0, policy_version 88234 (0.0031) [2024-06-12 21:50:30,940][70768] Fps is (10 sec: 52428.6, 60 sec: 50517.2, 300 sec: 49707.4). Total num frames: 1445707776. Throughput: 0: 49582.6. Samples: 974556760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 21:50:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:50:32,990][71000] Updated weights for policy 0, policy_version 88244 (0.0028) [2024-06-12 21:50:35,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.0, 300 sec: 49596.3). 
Total num frames: 1445937152. Throughput: 0: 49793.8. Samples: 974709880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 21:50:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:50:36,048][71000] Updated weights for policy 0, policy_version 88254 (0.0032) [2024-06-12 21:50:39,287][71000] Updated weights for policy 0, policy_version 88264 (0.0023) [2024-06-12 21:50:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1446199296. Throughput: 0: 50005.6. Samples: 975009420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 21:50:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:50:42,417][71000] Updated weights for policy 0, policy_version 88274 (0.0037) [2024-06-12 21:50:45,905][71000] Updated weights for policy 0, policy_version 88284 (0.0031) [2024-06-12 21:50:45,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 1446445056. Throughput: 0: 50119.0. Samples: 975312100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 21:50:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:50:49,268][71000] Updated weights for policy 0, policy_version 88294 (0.0033) [2024-06-12 21:50:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 49651.8). Total num frames: 1446690816. Throughput: 0: 50022.6. Samples: 975469200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 21:50:50,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:50:52,701][71000] Updated weights for policy 0, policy_version 88304 (0.0030) [2024-06-12 21:50:55,875][71000] Updated weights for policy 0, policy_version 88314 (0.0030) [2024-06-12 21:50:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.0, 300 sec: 49651.9). Total num frames: 1446936576. Throughput: 0: 49711.5. Samples: 975753620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 21:50:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:50:59,229][71000] Updated weights for policy 0, policy_version 88324 (0.0032) [2024-06-12 21:51:00,940][70768] Fps is (10 sec: 50790.8, 60 sec: 50244.2, 300 sec: 49596.3). Total num frames: 1447198720. Throughput: 0: 50305.3. Samples: 976065900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 21:51:00,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 21:51:02,143][71000] Updated weights for policy 0, policy_version 88334 (0.0031) [2024-06-12 21:51:05,387][71000] Updated weights for policy 0, policy_version 88344 (0.0029) [2024-06-12 21:51:05,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49971.1, 300 sec: 49651.8). Total num frames: 1447444480. Throughput: 0: 50133.6. Samples: 976214780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 21:51:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:51:08,381][71000] Updated weights for policy 0, policy_version 88354 (0.0029) [2024-06-12 21:51:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 50245.6, 300 sec: 49762.9). Total num frames: 1447706624. Throughput: 0: 50073.3. Samples: 976515740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 21:51:10,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:51:12,167][71000] Updated weights for policy 0, policy_version 88364 (0.0026) [2024-06-12 21:51:15,586][71000] Updated weights for policy 0, policy_version 88374 (0.0029) [2024-06-12 21:51:15,940][70768] Fps is (10 sec: 49152.6, 60 sec: 50244.2, 300 sec: 49707.4). Total num frames: 1447936000. Throughput: 0: 50099.2. Samples: 976811220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 21:51:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:51:19,030][71000] Updated weights for policy 0, policy_version 88384 (0.0033) [2024-06-12 21:51:20,940][70768] Fps is (10 sec: 45874.4, 60 sec: 49697.9, 300 sec: 49485.2). 
Total num frames: 1448165376. Throughput: 0: 49789.5. Samples: 976950420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 21:51:20,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:51:22,063][71000] Updated weights for policy 0, policy_version 88394 (0.0036) [2024-06-12 21:51:25,689][71000] Updated weights for policy 0, policy_version 88404 (0.0032) [2024-06-12 21:51:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1448411136. Throughput: 0: 49633.4. Samples: 977242920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:51:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:51:26,117][70980] Signal inference workers to stop experience collection... (14350 times) [2024-06-12 21:51:26,117][70980] Signal inference workers to resume experience collection... (14350 times) [2024-06-12 21:51:26,159][71000] InferenceWorker_p0-w0: stopping experience collection (14350 times) [2024-06-12 21:51:26,159][71000] InferenceWorker_p0-w0: resuming experience collection (14350 times) [2024-06-12 21:51:28,823][71000] Updated weights for policy 0, policy_version 88414 (0.0030) [2024-06-12 21:51:30,940][70768] Fps is (10 sec: 50791.6, 60 sec: 49425.1, 300 sec: 49707.4). Total num frames: 1448673280. Throughput: 0: 49579.1. Samples: 977543160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:51:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:51:32,163][71000] Updated weights for policy 0, policy_version 88424 (0.0032) [2024-06-12 21:51:35,292][71000] Updated weights for policy 0, policy_version 88434 (0.0024) [2024-06-12 21:51:35,940][70768] Fps is (10 sec: 50788.4, 60 sec: 49697.8, 300 sec: 49596.3). Total num frames: 1448919040. Throughput: 0: 49382.3. Samples: 977691420. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:51:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:51:38,919][71000] Updated weights for policy 0, policy_version 88444 (0.0033) [2024-06-12 21:51:40,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 1449148416. Throughput: 0: 49760.4. Samples: 977992840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:51:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:51:40,967][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000088450_1449164800.pth... [2024-06-12 21:51:41,019][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000087724_1437270016.pth [2024-06-12 21:51:41,960][71000] Updated weights for policy 0, policy_version 88454 (0.0021) [2024-06-12 21:51:45,463][71000] Updated weights for policy 0, policy_version 88464 (0.0026) [2024-06-12 21:51:45,940][70768] Fps is (10 sec: 49154.2, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1449410560. Throughput: 0: 49351.1. Samples: 978286700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:51:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:51:48,401][71000] Updated weights for policy 0, policy_version 88474 (0.0022) [2024-06-12 21:51:50,939][70768] Fps is (10 sec: 52429.8, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 1449672704. Throughput: 0: 49383.7. Samples: 978437040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:51:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:51:52,018][71000] Updated weights for policy 0, policy_version 88484 (0.0034) [2024-06-12 21:51:55,090][71000] Updated weights for policy 0, policy_version 88494 (0.0029) [2024-06-12 21:51:55,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1449902080. Throughput: 0: 49375.2. Samples: 978737620. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:51:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:51:58,419][71000] Updated weights for policy 0, policy_version 88504 (0.0022) [2024-06-12 21:52:00,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 1450147840. Throughput: 0: 49397.3. Samples: 979034100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:52:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:52:01,566][71000] Updated weights for policy 0, policy_version 88514 (0.0039) [2024-06-12 21:52:05,130][71000] Updated weights for policy 0, policy_version 88524 (0.0022) [2024-06-12 21:52:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49540.8). Total num frames: 1450393600. Throughput: 0: 49445.6. Samples: 979175460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:52:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:52:08,378][71000] Updated weights for policy 0, policy_version 88534 (0.0034) [2024-06-12 21:52:10,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49425.1, 300 sec: 49651.8). Total num frames: 1450672128. Throughput: 0: 49564.9. Samples: 979473340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:52:10,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:52:11,914][71000] Updated weights for policy 0, policy_version 88544 (0.0029) [2024-06-12 21:52:15,089][71000] Updated weights for policy 0, policy_version 88554 (0.0033) [2024-06-12 21:52:15,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.2, 300 sec: 49651.9). Total num frames: 1450901504. Throughput: 0: 49267.3. Samples: 979760180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 21:52:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:52:18,471][71000] Updated weights for policy 0, policy_version 88564 (0.0028) [2024-06-12 21:52:20,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49425.3, 300 sec: 49485.3). 
Total num frames: 1451130880. Throughput: 0: 49416.5. Samples: 979915140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 21:52:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:52:21,690][71000] Updated weights for policy 0, policy_version 88574 (0.0026) [2024-06-12 21:52:24,911][71000] Updated weights for policy 0, policy_version 88584 (0.0039) [2024-06-12 21:52:25,940][70768] Fps is (10 sec: 47512.3, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1451376640. Throughput: 0: 49162.3. Samples: 980205140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 21:52:25,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:52:28,386][71000] Updated weights for policy 0, policy_version 88594 (0.0027) [2024-06-12 21:52:30,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 1451622400. Throughput: 0: 49300.4. Samples: 980505220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 21:52:30,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:52:31,463][70980] Signal inference workers to stop experience collection... (14400 times) [2024-06-12 21:52:31,508][71000] InferenceWorker_p0-w0: stopping experience collection (14400 times) [2024-06-12 21:52:31,520][70980] Signal inference workers to resume experience collection... (14400 times) [2024-06-12 21:52:31,521][71000] InferenceWorker_p0-w0: resuming experience collection (14400 times) [2024-06-12 21:52:31,649][71000] Updated weights for policy 0, policy_version 88604 (0.0023) [2024-06-12 21:52:34,982][71000] Updated weights for policy 0, policy_version 88614 (0.0033) [2024-06-12 21:52:35,939][70768] Fps is (10 sec: 50791.5, 60 sec: 49425.5, 300 sec: 49540.9). Total num frames: 1451884544. Throughput: 0: 49379.6. Samples: 980659120. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 21:52:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:52:38,346][71000] Updated weights for policy 0, policy_version 88624 (0.0037) [2024-06-12 21:52:40,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49698.1, 300 sec: 49540.7). Total num frames: 1452130304. Throughput: 0: 49207.8. Samples: 980951980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 21:52:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:52:41,709][71000] Updated weights for policy 0, policy_version 88634 (0.0026) [2024-06-12 21:52:44,637][71000] Updated weights for policy 0, policy_version 88644 (0.0024) [2024-06-12 21:52:45,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49596.4). Total num frames: 1452376064. Throughput: 0: 49214.9. Samples: 981248760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 21:52:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:52:48,394][71000] Updated weights for policy 0, policy_version 88654 (0.0035) [2024-06-12 21:52:50,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 1452621824. Throughput: 0: 49418.2. Samples: 981399280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 21:52:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 21:52:51,461][71000] Updated weights for policy 0, policy_version 88664 (0.0031) [2024-06-12 21:52:54,982][71000] Updated weights for policy 0, policy_version 88674 (0.0033) [2024-06-12 21:52:55,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1452867584. Throughput: 0: 49310.7. Samples: 981692320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 21:52:55,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:52:58,223][71000] Updated weights for policy 0, policy_version 88684 (0.0032) [2024-06-12 21:53:00,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49425.0, 300 sec: 49540.7). 
Total num frames: 1453113344. Throughput: 0: 49707.7. Samples: 981997040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 21:53:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:53:01,633][71000] Updated weights for policy 0, policy_version 88694 (0.0026) [2024-06-12 21:53:04,980][71000] Updated weights for policy 0, policy_version 88704 (0.0022) [2024-06-12 21:53:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.0, 300 sec: 49596.3). Total num frames: 1453375488. Throughput: 0: 49346.1. Samples: 982135720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 21:53:05,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:53:08,353][71000] Updated weights for policy 0, policy_version 88714 (0.0035) [2024-06-12 21:53:10,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.0, 300 sec: 49707.4). Total num frames: 1453637632. Throughput: 0: 49736.4. Samples: 982443280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 21:53:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:53:11,232][71000] Updated weights for policy 0, policy_version 88724 (0.0031) [2024-06-12 21:53:15,022][71000] Updated weights for policy 0, policy_version 88734 (0.0028) [2024-06-12 21:53:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49424.8, 300 sec: 49651.8). Total num frames: 1453867008. Throughput: 0: 49693.7. Samples: 982741440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:53:15,949][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:53:17,955][71000] Updated weights for policy 0, policy_version 88744 (0.0030) [2024-06-12 21:53:20,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1454112768. Throughput: 0: 49540.8. Samples: 982888460. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:53:20,948][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:53:21,595][71000] Updated weights for policy 0, policy_version 88754 (0.0037) [2024-06-12 21:53:24,927][71000] Updated weights for policy 0, policy_version 88764 (0.0036) [2024-06-12 21:53:25,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49971.2, 300 sec: 49651.8). Total num frames: 1454374912. Throughput: 0: 49504.9. Samples: 983179700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:53:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:53:28,323][71000] Updated weights for policy 0, policy_version 88774 (0.0030) [2024-06-12 21:53:30,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1454587904. Throughput: 0: 49332.2. Samples: 983468720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:53:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:53:31,758][71000] Updated weights for policy 0, policy_version 88784 (0.0020) [2024-06-12 21:53:35,025][71000] Updated weights for policy 0, policy_version 88794 (0.0025) [2024-06-12 21:53:35,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1454850048. Throughput: 0: 49268.9. Samples: 983616380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:53:35,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:53:38,022][71000] Updated weights for policy 0, policy_version 88804 (0.0032) [2024-06-12 21:53:40,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1455095808. Throughput: 0: 49408.8. Samples: 983915720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:53:40,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:53:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000088812_1455095808.pth... 
[2024-06-12 21:53:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000088086_1443201024.pth [2024-06-12 21:53:41,664][71000] Updated weights for policy 0, policy_version 88814 (0.0032) [2024-06-12 21:53:44,894][71000] Updated weights for policy 0, policy_version 88824 (0.0034) [2024-06-12 21:53:45,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49971.0, 300 sec: 49707.4). Total num frames: 1455374336. Throughput: 0: 49289.4. Samples: 984215060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:53:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:53:48,348][71000] Updated weights for policy 0, policy_version 88834 (0.0037) [2024-06-12 21:53:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1455587328. Throughput: 0: 49650.7. Samples: 984370000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:53:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:53:51,411][71000] Updated weights for policy 0, policy_version 88844 (0.0021) [2024-06-12 21:53:51,545][70980] Signal inference workers to stop experience collection... (14450 times) [2024-06-12 21:53:51,547][70980] Signal inference workers to resume experience collection... (14450 times) [2024-06-12 21:53:51,567][71000] InferenceWorker_p0-w0: stopping experience collection (14450 times) [2024-06-12 21:53:51,567][71000] InferenceWorker_p0-w0: resuming experience collection (14450 times) [2024-06-12 21:53:54,849][71000] Updated weights for policy 0, policy_version 88854 (0.0031) [2024-06-12 21:53:55,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1455833088. Throughput: 0: 49391.7. Samples: 984665900. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:53:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:53:57,904][71000] Updated weights for policy 0, policy_version 88864 (0.0024) [2024-06-12 21:54:00,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.2, 300 sec: 49651.8). Total num frames: 1456095232. Throughput: 0: 49264.0. Samples: 984958320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:54:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:54:01,299][71000] Updated weights for policy 0, policy_version 88874 (0.0033) [2024-06-12 21:54:04,214][71000] Updated weights for policy 0, policy_version 88884 (0.0020) [2024-06-12 21:54:05,940][70768] Fps is (10 sec: 54067.3, 60 sec: 49971.3, 300 sec: 49762.9). Total num frames: 1456373760. Throughput: 0: 49656.4. Samples: 985123000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 21:54:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:54:07,707][71000] Updated weights for policy 0, policy_version 88894 (0.0029) [2024-06-12 21:54:10,875][71000] Updated weights for policy 0, policy_version 88904 (0.0031) [2024-06-12 21:54:10,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.2, 300 sec: 49651.9). Total num frames: 1456603136. Throughput: 0: 49805.5. Samples: 985420940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 21:54:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:54:14,467][71000] Updated weights for policy 0, policy_version 88914 (0.0032) [2024-06-12 21:54:15,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1456832512. Throughput: 0: 49976.6. Samples: 985717660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 21:54:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:54:17,447][71000] Updated weights for policy 0, policy_version 88924 (0.0026) [2024-06-12 21:54:20,944][70768] Fps is (10 sec: 47492.9, 60 sec: 49421.5, 300 sec: 49595.6). 
Total num frames: 1457078272. Throughput: 0: 49849.4. Samples: 985859820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 21:54:20,944][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:54:21,078][71000] Updated weights for policy 0, policy_version 88934 (0.0039) [2024-06-12 21:54:24,102][71000] Updated weights for policy 0, policy_version 88944 (0.0021) [2024-06-12 21:54:25,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.2, 300 sec: 49762.9). Total num frames: 1457356800. Throughput: 0: 49684.1. Samples: 986151500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 21:54:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:54:27,667][71000] Updated weights for policy 0, policy_version 88954 (0.0027) [2024-06-12 21:54:30,697][71000] Updated weights for policy 0, policy_version 88964 (0.0033) [2024-06-12 21:54:30,940][70768] Fps is (10 sec: 52451.6, 60 sec: 50244.4, 300 sec: 49596.3). Total num frames: 1457602560. Throughput: 0: 49898.8. Samples: 986460500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 21:54:30,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:54:34,006][71000] Updated weights for policy 0, policy_version 88974 (0.0022) [2024-06-12 21:54:35,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1457815552. Throughput: 0: 49655.6. Samples: 986604500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 21:54:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:54:37,325][71000] Updated weights for policy 0, policy_version 88984 (0.0025) [2024-06-12 21:54:40,711][71000] Updated weights for policy 0, policy_version 88994 (0.0028) [2024-06-12 21:54:40,940][70768] Fps is (10 sec: 47512.5, 60 sec: 49698.1, 300 sec: 49540.7). Total num frames: 1458077696. Throughput: 0: 49710.0. Samples: 986902860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 21:54:40,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:54:43,974][71000] Updated weights for policy 0, policy_version 89004 (0.0045) [2024-06-12 21:54:45,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49425.1, 300 sec: 49707.4). Total num frames: 1458339840. Throughput: 0: 49607.5. Samples: 987190660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 21:54:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:54:47,547][71000] Updated weights for policy 0, policy_version 89014 (0.0029) [2024-06-12 21:54:50,919][71000] Updated weights for policy 0, policy_version 89024 (0.0037) [2024-06-12 21:54:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.1, 300 sec: 49540.7). Total num frames: 1458569216. Throughput: 0: 49354.1. Samples: 987343940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 21:54:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:54:54,046][71000] Updated weights for policy 0, policy_version 89034 (0.0023) [2024-06-12 21:54:55,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1458798592. Throughput: 0: 49194.6. Samples: 987634700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 21:54:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:54:57,510][71000] Updated weights for policy 0, policy_version 89044 (0.0032) [2024-06-12 21:54:58,125][70980] Signal inference workers to stop experience collection... (14500 times) [2024-06-12 21:54:58,125][70980] Signal inference workers to resume experience collection... 
(14500 times) [2024-06-12 21:54:58,172][71000] InferenceWorker_p0-w0: stopping experience collection (14500 times) [2024-06-12 21:54:58,172][71000] InferenceWorker_p0-w0: resuming experience collection (14500 times) [2024-06-12 21:55:00,671][71000] Updated weights for policy 0, policy_version 89054 (0.0025) [2024-06-12 21:55:00,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1459060736. Throughput: 0: 49208.0. Samples: 987932020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 21:55:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:55:04,340][71000] Updated weights for policy 0, policy_version 89064 (0.0023) [2024-06-12 21:55:05,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.0, 300 sec: 49596.6). Total num frames: 1459322880. Throughput: 0: 49556.8. Samples: 988089660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 21:55:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 21:55:07,225][71000] Updated weights for policy 0, policy_version 89074 (0.0040) [2024-06-12 21:55:10,826][71000] Updated weights for policy 0, policy_version 89084 (0.0026) [2024-06-12 21:55:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 1459552256. Throughput: 0: 49482.3. Samples: 988378200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 21:55:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:55:14,239][71000] Updated weights for policy 0, policy_version 89094 (0.0033) [2024-06-12 21:55:15,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49424.9, 300 sec: 49540.7). Total num frames: 1459798016. Throughput: 0: 49086.5. Samples: 988669400. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 21:55:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:55:17,451][71000] Updated weights for policy 0, policy_version 89104 (0.0029) [2024-06-12 21:55:20,587][71000] Updated weights for policy 0, policy_version 89114 (0.0036) [2024-06-12 21:55:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49701.7, 300 sec: 49596.3). Total num frames: 1460060160. Throughput: 0: 49159.6. Samples: 988816680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 21:55:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:55:24,056][71000] Updated weights for policy 0, policy_version 89124 (0.0036) [2024-06-12 21:55:25,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1460305920. Throughput: 0: 49259.3. Samples: 989119520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 21:55:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:55:27,134][71000] Updated weights for policy 0, policy_version 89134 (0.0023) [2024-06-12 21:55:30,369][71000] Updated weights for policy 0, policy_version 89144 (0.0034) [2024-06-12 21:55:30,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1460551680. Throughput: 0: 49575.2. Samples: 989421540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 21:55:30,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 21:55:34,007][71000] Updated weights for policy 0, policy_version 89154 (0.0024) [2024-06-12 21:55:35,943][70768] Fps is (10 sec: 47496.8, 60 sec: 49422.1, 300 sec: 49429.1). Total num frames: 1460781056. Throughput: 0: 49424.2. Samples: 989568200. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 21:55:35,944][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:55:37,392][71000] Updated weights for policy 0, policy_version 89164 (0.0035) [2024-06-12 21:55:40,363][71000] Updated weights for policy 0, policy_version 89174 (0.0037) [2024-06-12 21:55:40,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1461043200. Throughput: 0: 49550.9. Samples: 989864500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 21:55:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:55:41,067][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000089176_1461059584.pth... [2024-06-12 21:55:41,112][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000088450_1449164800.pth [2024-06-12 21:55:44,246][71000] Updated weights for policy 0, policy_version 89184 (0.0034) [2024-06-12 21:55:45,940][70768] Fps is (10 sec: 50808.0, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1461288960. Throughput: 0: 49220.3. Samples: 990146940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 21:55:45,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 21:55:47,103][71000] Updated weights for policy 0, policy_version 89194 (0.0022) [2024-06-12 21:55:50,654][71000] Updated weights for policy 0, policy_version 89204 (0.0027) [2024-06-12 21:55:50,944][70768] Fps is (10 sec: 49132.0, 60 sec: 49421.7, 300 sec: 49484.5). Total num frames: 1461534720. Throughput: 0: 49212.6. Samples: 990304440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-12 21:55:50,944][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:55:53,635][71000] Updated weights for policy 0, policy_version 89214 (0.0031) [2024-06-12 21:55:55,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1461780480. Throughput: 0: 49300.9. Samples: 990596740. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-12 21:55:55,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:55:57,152][71000] Updated weights for policy 0, policy_version 89224 (0.0032) [2024-06-12 21:56:00,381][71000] Updated weights for policy 0, policy_version 89234 (0.0023) [2024-06-12 21:56:00,940][70768] Fps is (10 sec: 49173.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1462026240. Throughput: 0: 49347.3. Samples: 990890020. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-12 21:56:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:56:04,085][71000] Updated weights for policy 0, policy_version 89244 (0.0033) [2024-06-12 21:56:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1462272000. Throughput: 0: 49484.0. Samples: 991043460. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-12 21:56:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 21:56:07,041][71000] Updated weights for policy 0, policy_version 89254 (0.0029) [2024-06-12 21:56:08,547][70980] Signal inference workers to stop experience collection... (14550 times) [2024-06-12 21:56:08,592][71000] InferenceWorker_p0-w0: stopping experience collection (14550 times) [2024-06-12 21:56:08,602][70980] Signal inference workers to resume experience collection... (14550 times) [2024-06-12 21:56:08,607][71000] InferenceWorker_p0-w0: resuming experience collection (14550 times) [2024-06-12 21:56:10,522][71000] Updated weights for policy 0, policy_version 89264 (0.0030) [2024-06-12 21:56:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1462517760. Throughput: 0: 49297.3. Samples: 991337900. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-12 21:56:10,949][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:56:13,812][71000] Updated weights for policy 0, policy_version 89274 (0.0033) [2024-06-12 21:56:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1462747136. Throughput: 0: 48903.5. Samples: 991622200. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-12 21:56:15,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:56:17,179][71000] Updated weights for policy 0, policy_version 89284 (0.0029) [2024-06-12 21:56:20,286][71000] Updated weights for policy 0, policy_version 89294 (0.0034) [2024-06-12 21:56:20,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1463009280. Throughput: 0: 48963.0. Samples: 991771360. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-12 21:56:20,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 21:56:23,976][71000] Updated weights for policy 0, policy_version 89304 (0.0033) [2024-06-12 21:56:25,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 49374.2). Total num frames: 1463238656. Throughput: 0: 49050.8. Samples: 992071780. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-12 21:56:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:56:26,994][71000] Updated weights for policy 0, policy_version 89314 (0.0034) [2024-06-12 21:56:30,549][71000] Updated weights for policy 0, policy_version 89324 (0.0027) [2024-06-12 21:56:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1463500800. Throughput: 0: 49493.3. Samples: 992374140. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-12 21:56:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:56:33,658][71000] Updated weights for policy 0, policy_version 89334 (0.0028) [2024-06-12 21:56:35,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49154.8, 300 sec: 49429.7). 
Total num frames: 1463730176. Throughput: 0: 49146.7. Samples: 992515840. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-12 21:56:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:56:37,057][71000] Updated weights for policy 0, policy_version 89344 (0.0042) [2024-06-12 21:56:40,350][71000] Updated weights for policy 0, policy_version 89354 (0.0032) [2024-06-12 21:56:40,939][70768] Fps is (10 sec: 47514.4, 60 sec: 48879.1, 300 sec: 49374.2). Total num frames: 1463975936. Throughput: 0: 49260.0. Samples: 992813440. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-12 21:56:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 21:56:43,818][71000] Updated weights for policy 0, policy_version 89364 (0.0034) [2024-06-12 21:56:45,939][70768] Fps is (10 sec: 49153.1, 60 sec: 48879.1, 300 sec: 49318.6). Total num frames: 1464221696. Throughput: 0: 49103.1. Samples: 993099660. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-12 21:56:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:56:47,241][71000] Updated weights for policy 0, policy_version 89374 (0.0030) [2024-06-12 21:56:50,401][71000] Updated weights for policy 0, policy_version 89384 (0.0026) [2024-06-12 21:56:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49155.5, 300 sec: 49429.7). Total num frames: 1464483840. Throughput: 0: 49091.1. Samples: 993252560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 21:56:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:56:53,765][71000] Updated weights for policy 0, policy_version 89394 (0.0026) [2024-06-12 21:56:55,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1464729600. Throughput: 0: 49289.0. Samples: 993555900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 21:56:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:56:56,849][71000] Updated weights for policy 0, policy_version 89404 (0.0032) [2024-06-12 21:57:00,570][71000] Updated weights for policy 0, policy_version 89414 (0.0027) [2024-06-12 21:57:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1464975360. Throughput: 0: 49495.2. Samples: 993849480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 21:57:00,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:57:03,439][71000] Updated weights for policy 0, policy_version 89424 (0.0027) [2024-06-12 21:57:05,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.8, 300 sec: 49263.1). Total num frames: 1465204736. Throughput: 0: 49300.4. Samples: 993989880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 21:57:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:57:07,449][71000] Updated weights for policy 0, policy_version 89434 (0.0027) [2024-06-12 21:57:10,538][71000] Updated weights for policy 0, policy_version 89444 (0.0028) [2024-06-12 21:57:10,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 1465466880. Throughput: 0: 48827.8. Samples: 994269040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 21:57:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:57:14,131][71000] Updated weights for policy 0, policy_version 89454 (0.0020) [2024-06-12 21:57:15,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1465696256. Throughput: 0: 48771.3. Samples: 994568840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 21:57:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:57:17,077][71000] Updated weights for policy 0, policy_version 89464 (0.0030) [2024-06-12 21:57:20,726][71000] Updated weights for policy 0, policy_version 89474 (0.0032) [2024-06-12 21:57:20,940][70768] Fps is (10 sec: 47514.8, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 1465942016. Throughput: 0: 48891.3. Samples: 994715940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 21:57:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:57:22,898][70980] Signal inference workers to stop experience collection... (14600 times) [2024-06-12 21:57:22,944][71000] InferenceWorker_p0-w0: stopping experience collection (14600 times) [2024-06-12 21:57:22,947][70980] Signal inference workers to resume experience collection... (14600 times) [2024-06-12 21:57:22,958][71000] InferenceWorker_p0-w0: resuming experience collection (14600 times) [2024-06-12 21:57:23,489][71000] Updated weights for policy 0, policy_version 89484 (0.0030) [2024-06-12 21:57:25,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 1466171392. Throughput: 0: 48848.7. Samples: 995011640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 21:57:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:57:27,119][71000] Updated weights for policy 0, policy_version 89494 (0.0034) [2024-06-12 21:57:30,221][71000] Updated weights for policy 0, policy_version 89504 (0.0030) [2024-06-12 21:57:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.1, 300 sec: 49374.1). Total num frames: 1466449920. Throughput: 0: 48975.9. Samples: 995303580. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 21:57:30,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 21:57:33,926][71000] Updated weights for policy 0, policy_version 89514 (0.0031) [2024-06-12 21:57:35,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1466712064. Throughput: 0: 49162.0. Samples: 995464860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 21:57:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 21:57:35,941][70980] Saving new best policy, reward=0.283! [2024-06-12 21:57:36,732][71000] Updated weights for policy 0, policy_version 89524 (0.0024) [2024-06-12 21:57:40,676][71000] Updated weights for policy 0, policy_version 89534 (0.0029) [2024-06-12 21:57:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1466925056. Throughput: 0: 49178.7. Samples: 995768940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-12 21:57:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 21:57:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000089534_1466925056.pth... [2024-06-12 21:57:41,004][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000088812_1455095808.pth [2024-06-12 21:57:43,041][71000] Updated weights for policy 0, policy_version 89544 (0.0037) [2024-06-12 21:57:45,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 1467187200. Throughput: 0: 49359.9. Samples: 996070680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:57:45,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 21:57:47,077][71000] Updated weights for policy 0, policy_version 89554 (0.0027) [2024-06-12 21:57:49,729][71000] Updated weights for policy 0, policy_version 89564 (0.0031) [2024-06-12 21:57:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1467432960. 
Throughput: 0: 49357.5. Samples: 996210960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:57:50,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:57:53,902][71000] Updated weights for policy 0, policy_version 89574 (0.0028) [2024-06-12 21:57:55,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1467695104. Throughput: 0: 49853.6. Samples: 996512440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:57:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:57:56,269][71000] Updated weights for policy 0, policy_version 89584 (0.0030) [2024-06-12 21:58:00,164][71000] Updated weights for policy 0, policy_version 89594 (0.0031) [2024-06-12 21:58:00,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49318.7). Total num frames: 1467924480. Throughput: 0: 49804.1. Samples: 996810020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:58:00,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 21:58:03,436][71000] Updated weights for policy 0, policy_version 89604 (0.0026) [2024-06-12 21:58:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 1468186624. Throughput: 0: 49936.9. Samples: 996963100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:58:05,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:58:07,035][71000] Updated weights for policy 0, policy_version 89614 (0.0029) [2024-06-12 21:58:09,989][71000] Updated weights for policy 0, policy_version 89624 (0.0032) [2024-06-12 21:58:10,939][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.3, 300 sec: 49374.2). Total num frames: 1468432384. Throughput: 0: 49657.5. Samples: 997246220. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:58:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 21:58:13,536][71000] Updated weights for policy 0, policy_version 89634 (0.0027) [2024-06-12 21:58:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 1468678144. Throughput: 0: 49692.8. Samples: 997539760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:58:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:58:16,492][70980] Signal inference workers to stop experience collection... (14650 times) [2024-06-12 21:58:16,492][70980] Signal inference workers to resume experience collection... (14650 times) [2024-06-12 21:58:16,529][71000] InferenceWorker_p0-w0: stopping experience collection (14650 times) [2024-06-12 21:58:16,529][71000] InferenceWorker_p0-w0: resuming experience collection (14650 times) [2024-06-12 21:58:16,621][71000] Updated weights for policy 0, policy_version 89644 (0.0030) [2024-06-12 21:58:20,204][71000] Updated weights for policy 0, policy_version 89654 (0.0025) [2024-06-12 21:58:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 1468907520. Throughput: 0: 49568.6. Samples: 997695440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:58:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 21:58:23,091][71000] Updated weights for policy 0, policy_version 89664 (0.0028) [2024-06-12 21:58:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 1469153280. Throughput: 0: 49403.1. Samples: 997992080. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:58:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 21:58:26,820][71000] Updated weights for policy 0, policy_version 89674 (0.0028) [2024-06-12 21:58:29,716][71000] Updated weights for policy 0, policy_version 89684 (0.0034) [2024-06-12 21:58:30,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1469415424. Throughput: 0: 49250.3. Samples: 998286940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:58:30,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 21:58:33,461][71000] Updated weights for policy 0, policy_version 89694 (0.0036) [2024-06-12 21:58:35,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1469661184. Throughput: 0: 49443.9. Samples: 998435940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 21:58:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 21:58:37,037][71000] Updated weights for policy 0, policy_version 89704 (0.0034) [2024-06-12 21:58:40,402][71000] Updated weights for policy 0, policy_version 89714 (0.0030) [2024-06-12 21:58:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 1469906944. Throughput: 0: 49035.5. Samples: 998719040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 21:58:40,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:58:43,454][71000] Updated weights for policy 0, policy_version 89724 (0.0031) [2024-06-12 21:58:45,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 1470152704. Throughput: 0: 49225.5. Samples: 999025180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 21:58:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:58:46,791][71000] Updated weights for policy 0, policy_version 89734 (0.0020) [2024-06-12 21:58:49,999][71000] Updated weights for policy 0, policy_version 89744 (0.0030) [2024-06-12 21:58:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 1470398464. Throughput: 0: 49280.9. Samples: 999180740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 21:58:50,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 21:58:53,105][71000] Updated weights for policy 0, policy_version 89754 (0.0025) [2024-06-12 21:58:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1470644224. Throughput: 0: 49483.3. Samples: 999472980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 21:58:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 21:58:56,561][71000] Updated weights for policy 0, policy_version 89764 (0.0027) [2024-06-12 21:58:59,937][71000] Updated weights for policy 0, policy_version 89774 (0.0030) [2024-06-12 21:59:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.0, 300 sec: 49263.1). Total num frames: 1470906368. Throughput: 0: 49366.3. Samples: 999761240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 21:59:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:59:03,454][71000] Updated weights for policy 0, policy_version 89784 (0.0028) [2024-06-12 21:59:05,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1471135744. Throughput: 0: 49371.1. Samples: 999917140. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 21:59:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 21:59:06,645][71000] Updated weights for policy 0, policy_version 89794 (0.0033) [2024-06-12 21:59:09,912][71000] Updated weights for policy 0, policy_version 89804 (0.0025) [2024-06-12 21:59:10,939][70768] Fps is (10 sec: 47514.5, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1471381504. Throughput: 0: 49385.5. Samples: 1000214420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 21:59:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:59:13,070][71000] Updated weights for policy 0, policy_version 89814 (0.0022) [2024-06-12 21:59:15,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49319.3). Total num frames: 1471627264. Throughput: 0: 49373.4. Samples: 1000508740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 21:59:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 21:59:16,607][71000] Updated weights for policy 0, policy_version 89824 (0.0023) [2024-06-12 21:59:19,633][71000] Updated weights for policy 0, policy_version 89834 (0.0030) [2024-06-12 21:59:20,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 1471889408. Throughput: 0: 49262.2. Samples: 1000652740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 21:59:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:59:23,539][71000] Updated weights for policy 0, policy_version 89844 (0.0025) [2024-06-12 21:59:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 1472118784. Throughput: 0: 49592.0. Samples: 1000950680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 21:59:25,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 21:59:26,499][71000] Updated weights for policy 0, policy_version 89854 (0.0029) [2024-06-12 21:59:29,942][71000] Updated weights for policy 0, policy_version 89864 (0.0024) [2024-06-12 21:59:30,931][70980] Signal inference workers to stop experience collection... (14700 times) [2024-06-12 21:59:30,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1472364544. Throughput: 0: 49391.7. Samples: 1001247800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 21:59:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:59:30,978][71000] InferenceWorker_p0-w0: stopping experience collection (14700 times) [2024-06-12 21:59:30,984][70980] Signal inference workers to resume experience collection... (14700 times) [2024-06-12 21:59:31,002][71000] InferenceWorker_p0-w0: resuming experience collection (14700 times) [2024-06-12 21:59:33,171][71000] Updated weights for policy 0, policy_version 89874 (0.0029) [2024-06-12 21:59:35,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 1472610304. Throughput: 0: 49141.0. Samples: 1001392080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 21:59:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:59:36,630][71000] Updated weights for policy 0, policy_version 89884 (0.0031) [2024-06-12 21:59:39,547][71000] Updated weights for policy 0, policy_version 89894 (0.0029) [2024-06-12 21:59:40,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 1472888832. Throughput: 0: 49444.1. Samples: 1001697960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 21:59:40,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:59:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000089898_1472888832.pth... 
[2024-06-12 21:59:40,995][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000089176_1461059584.pth [2024-06-12 21:59:43,441][71000] Updated weights for policy 0, policy_version 89904 (0.0028) [2024-06-12 21:59:45,939][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1473118208. Throughput: 0: 49425.0. Samples: 1001985360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 21:59:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 21:59:46,138][71000] Updated weights for policy 0, policy_version 89914 (0.0023) [2024-06-12 21:59:50,018][71000] Updated weights for policy 0, policy_version 89924 (0.0026) [2024-06-12 21:59:50,939][70768] Fps is (10 sec: 47514.4, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1473363968. Throughput: 0: 49240.5. Samples: 1002132960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 21:59:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 21:59:52,850][71000] Updated weights for policy 0, policy_version 89934 (0.0031) [2024-06-12 21:59:55,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 1473593344. Throughput: 0: 49252.7. Samples: 1002430800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 21:59:55,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 21:59:56,682][71000] Updated weights for policy 0, policy_version 89944 (0.0033) [2024-06-12 21:59:59,470][71000] Updated weights for policy 0, policy_version 89954 (0.0022) [2024-06-12 22:00:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1473855488. Throughput: 0: 49218.2. Samples: 1002723560. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 22:00:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:00:03,237][71000] Updated weights for policy 0, policy_version 89964 (0.0033) [2024-06-12 22:00:05,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1474101248. Throughput: 0: 49393.5. Samples: 1002875440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 22:00:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:00:06,116][71000] Updated weights for policy 0, policy_version 89974 (0.0030) [2024-06-12 22:00:09,997][71000] Updated weights for policy 0, policy_version 89984 (0.0022) [2024-06-12 22:00:10,940][70768] Fps is (10 sec: 50789.0, 60 sec: 49697.8, 300 sec: 49374.1). Total num frames: 1474363392. Throughput: 0: 49522.4. Samples: 1003179200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 22:00:10,941][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:00:12,770][71000] Updated weights for policy 0, policy_version 89994 (0.0031) [2024-06-12 22:00:15,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 1474592768. Throughput: 0: 49467.6. Samples: 1003473840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 22:00:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:00:16,620][71000] Updated weights for policy 0, policy_version 90004 (0.0027) [2024-06-12 22:00:19,179][71000] Updated weights for policy 0, policy_version 90014 (0.0026) [2024-06-12 22:00:20,940][70768] Fps is (10 sec: 49153.1, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1474854912. Throughput: 0: 49514.1. Samples: 1003620220. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 22:00:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:00:23,326][71000] Updated weights for policy 0, policy_version 90024 (0.0033) [2024-06-12 22:00:25,863][71000] Updated weights for policy 0, policy_version 90034 (0.0034) [2024-06-12 22:00:25,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49971.1, 300 sec: 49374.1). Total num frames: 1475117056. Throughput: 0: 49135.5. Samples: 1003909060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 22:00:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:00:29,881][71000] Updated weights for policy 0, policy_version 90044 (0.0022) [2024-06-12 22:00:30,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 49319.2). Total num frames: 1475330048. Throughput: 0: 49456.8. Samples: 1004210920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:00:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:00:32,557][71000] Updated weights for policy 0, policy_version 90054 (0.0025) [2024-06-12 22:00:35,940][70768] Fps is (10 sec: 42598.0, 60 sec: 48878.7, 300 sec: 49152.0). Total num frames: 1475543040. Throughput: 0: 49264.1. Samples: 1004349860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:00:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:00:36,597][71000] Updated weights for policy 0, policy_version 90064 (0.0023) [2024-06-12 22:00:38,015][70980] Signal inference workers to stop experience collection... (14750 times) [2024-06-12 22:00:38,016][70980] Signal inference workers to resume experience collection... 
(14750 times) [2024-06-12 22:00:38,064][71000] InferenceWorker_p0-w0: stopping experience collection (14750 times) [2024-06-12 22:00:38,064][71000] InferenceWorker_p0-w0: resuming experience collection (14750 times) [2024-06-12 22:00:39,236][71000] Updated weights for policy 0, policy_version 90074 (0.0024) [2024-06-12 22:00:40,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1475837952. Throughput: 0: 49127.6. Samples: 1004641540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:00:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:00:43,381][71000] Updated weights for policy 0, policy_version 90084 (0.0031) [2024-06-12 22:00:45,935][71000] Updated weights for policy 0, policy_version 90094 (0.0028) [2024-06-12 22:00:45,940][70768] Fps is (10 sec: 55706.6, 60 sec: 49698.1, 300 sec: 49374.9). Total num frames: 1476100096. Throughput: 0: 49289.8. Samples: 1004941600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:00:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:00:50,074][71000] Updated weights for policy 0, policy_version 90104 (0.0024) [2024-06-12 22:00:50,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 1476296704. Throughput: 0: 48986.5. Samples: 1005079840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:00:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:00:52,678][71000] Updated weights for policy 0, policy_version 90114 (0.0027) [2024-06-12 22:00:55,940][70768] Fps is (10 sec: 44236.9, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 1476542464. Throughput: 0: 48752.3. Samples: 1005373040. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:00:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:00:56,746][71000] Updated weights for policy 0, policy_version 90124 (0.0028) [2024-06-12 22:00:59,601][71000] Updated weights for policy 0, policy_version 90134 (0.0029) [2024-06-12 22:01:00,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1476820992. Throughput: 0: 48826.1. Samples: 1005671020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:01:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:01:03,410][71000] Updated weights for policy 0, policy_version 90144 (0.0029) [2024-06-12 22:01:05,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 1477066752. Throughput: 0: 49107.0. Samples: 1005830040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:01:05,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:01:06,266][71000] Updated weights for policy 0, policy_version 90154 (0.0025) [2024-06-12 22:01:10,185][71000] Updated weights for policy 0, policy_version 90164 (0.0038) [2024-06-12 22:01:10,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48606.0, 300 sec: 49263.1). Total num frames: 1477279744. Throughput: 0: 49219.6. Samples: 1006123940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:01:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:01:12,779][71000] Updated weights for policy 0, policy_version 90174 (0.0024) [2024-06-12 22:01:15,940][70768] Fps is (10 sec: 44236.8, 60 sec: 48605.7, 300 sec: 49152.0). Total num frames: 1477509120. Throughput: 0: 48672.3. Samples: 1006401180. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:01:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:01:16,816][71000] Updated weights for policy 0, policy_version 90184 (0.0023) [2024-06-12 22:01:19,399][71000] Updated weights for policy 0, policy_version 90194 (0.0031) [2024-06-12 22:01:20,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1477804032. Throughput: 0: 49009.5. Samples: 1006555280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:01:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:01:23,205][71000] Updated weights for policy 0, policy_version 90204 (0.0028) [2024-06-12 22:01:25,940][70768] Fps is (10 sec: 54067.9, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 1478049792. Throughput: 0: 49244.0. Samples: 1006857520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 22:01:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 22:01:26,085][71000] Updated weights for policy 0, policy_version 90214 (0.0026) [2024-06-12 22:01:30,271][71000] Updated weights for policy 0, policy_version 90224 (0.0021) [2024-06-12 22:01:30,940][70768] Fps is (10 sec: 44237.0, 60 sec: 48605.9, 300 sec: 49207.6). Total num frames: 1478246400. Throughput: 0: 49109.8. Samples: 1007151540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 22:01:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:01:31,148][70980] Signal inference workers to stop experience collection... (14800 times) [2024-06-12 22:01:31,148][70980] Signal inference workers to resume experience collection... 
(14800 times) [2024-06-12 22:01:31,179][71000] InferenceWorker_p0-w0: stopping experience collection (14800 times) [2024-06-12 22:01:31,179][71000] InferenceWorker_p0-w0: resuming experience collection (14800 times) [2024-06-12 22:01:33,081][71000] Updated weights for policy 0, policy_version 90234 (0.0032) [2024-06-12 22:01:35,940][70768] Fps is (10 sec: 45874.4, 60 sec: 49425.1, 300 sec: 49263.0). Total num frames: 1478508544. Throughput: 0: 49102.1. Samples: 1007289440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 22:01:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:01:36,874][71000] Updated weights for policy 0, policy_version 90244 (0.0038) [2024-06-12 22:01:39,680][71000] Updated weights for policy 0, policy_version 90254 (0.0031) [2024-06-12 22:01:40,942][70768] Fps is (10 sec: 52414.9, 60 sec: 48876.8, 300 sec: 49318.2). Total num frames: 1478770688. Throughput: 0: 49177.1. Samples: 1007586140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 22:01:40,943][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:01:41,005][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000090258_1478787072.pth... [2024-06-12 22:01:41,058][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000089534_1466925056.pth [2024-06-12 22:01:43,685][71000] Updated weights for policy 0, policy_version 90264 (0.0028) [2024-06-12 22:01:45,939][70768] Fps is (10 sec: 50791.4, 60 sec: 48605.9, 300 sec: 49263.1). Total num frames: 1479016448. Throughput: 0: 48962.7. Samples: 1007874340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 22:01:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:01:46,392][71000] Updated weights for policy 0, policy_version 90274 (0.0033) [2024-06-12 22:01:50,375][71000] Updated weights for policy 0, policy_version 90284 (0.0028) [2024-06-12 22:01:50,940][70768] Fps is (10 sec: 47525.5, 60 sec: 49151.9, 300 sec: 49207.5). 
Total num frames: 1479245824. Throughput: 0: 48708.4. Samples: 1008021920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 22:01:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:01:53,122][71000] Updated weights for policy 0, policy_version 90294 (0.0029) [2024-06-12 22:01:55,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 1479491584. Throughput: 0: 48708.5. Samples: 1008315820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 22:01:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:01:57,150][71000] Updated weights for policy 0, policy_version 90304 (0.0030) [2024-06-12 22:01:59,799][71000] Updated weights for policy 0, policy_version 90314 (0.0030) [2024-06-12 22:02:00,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 1479753728. Throughput: 0: 49032.1. Samples: 1008607620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 22:02:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:02:03,730][71000] Updated weights for policy 0, policy_version 90324 (0.0030) [2024-06-12 22:02:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48606.0, 300 sec: 49207.6). Total num frames: 1479983104. Throughput: 0: 49221.4. Samples: 1008770240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 22:02:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:02:06,541][71000] Updated weights for policy 0, policy_version 90334 (0.0023) [2024-06-12 22:02:09,992][71000] Updated weights for policy 0, policy_version 90344 (0.0021) [2024-06-12 22:02:10,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1480228864. Throughput: 0: 49099.4. Samples: 1009067000. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 22:02:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:02:13,057][71000] Updated weights for policy 0, policy_version 90354 (0.0032) [2024-06-12 22:02:15,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 1480474624. Throughput: 0: 49122.2. Samples: 1009362040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-12 22:02:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:02:16,590][71000] Updated weights for policy 0, policy_version 90364 (0.0027) [2024-06-12 22:02:19,655][71000] Updated weights for policy 0, policy_version 90374 (0.0030) [2024-06-12 22:02:20,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49151.8, 300 sec: 49429.7). Total num frames: 1480753152. Throughput: 0: 49336.0. Samples: 1009509560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 22:02:20,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:02:23,501][71000] Updated weights for policy 0, policy_version 90384 (0.0028) [2024-06-12 22:02:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.9, 300 sec: 49207.5). Total num frames: 1480966144. Throughput: 0: 49194.5. Samples: 1009799760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 22:02:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:02:26,434][71000] Updated weights for policy 0, policy_version 90394 (0.0030) [2024-06-12 22:02:29,955][71000] Updated weights for policy 0, policy_version 90404 (0.0037) [2024-06-12 22:02:30,940][70768] Fps is (10 sec: 45876.1, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 1481211904. Throughput: 0: 49291.5. Samples: 1010092460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 22:02:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:02:33,065][71000] Updated weights for policy 0, policy_version 90414 (0.0034) [2024-06-12 22:02:35,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.2, 300 sec: 49263.1). 
Total num frames: 1481457664. Throughput: 0: 49370.0. Samples: 1010243560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 22:02:35,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:02:36,375][71000] Updated weights for policy 0, policy_version 90424 (0.0027) [2024-06-12 22:02:39,690][71000] Updated weights for policy 0, policy_version 90434 (0.0033) [2024-06-12 22:02:39,939][70980] Signal inference workers to stop experience collection... (14850 times) [2024-06-12 22:02:39,986][71000] InferenceWorker_p0-w0: stopping experience collection (14850 times) [2024-06-12 22:02:39,986][70980] Signal inference workers to resume experience collection... (14850 times) [2024-06-12 22:02:40,002][71000] InferenceWorker_p0-w0: resuming experience collection (14850 times) [2024-06-12 22:02:40,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49427.3, 300 sec: 49318.6). Total num frames: 1481736192. Throughput: 0: 49679.6. Samples: 1010551400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 22:02:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:02:43,032][71000] Updated weights for policy 0, policy_version 90444 (0.0032) [2024-06-12 22:02:45,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 1481965568. Throughput: 0: 49707.1. Samples: 1010844440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 22:02:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:02:46,195][71000] Updated weights for policy 0, policy_version 90454 (0.0032) [2024-06-12 22:02:49,591][71000] Updated weights for policy 0, policy_version 90464 (0.0022) [2024-06-12 22:02:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.2, 300 sec: 49207.5). Total num frames: 1482211328. Throughput: 0: 49335.1. Samples: 1010990320. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 22:02:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:02:52,883][71000] Updated weights for policy 0, policy_version 90474 (0.0038) [2024-06-12 22:02:55,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 1482473472. Throughput: 0: 49328.6. Samples: 1011286780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 22:02:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:02:56,130][71000] Updated weights for policy 0, policy_version 90484 (0.0021) [2024-06-12 22:02:59,465][71000] Updated weights for policy 0, policy_version 90494 (0.0032) [2024-06-12 22:03:00,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 1482735616. Throughput: 0: 49584.0. Samples: 1011593320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 22:03:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:03:02,635][71000] Updated weights for policy 0, policy_version 90504 (0.0023) [2024-06-12 22:03:05,791][71000] Updated weights for policy 0, policy_version 90514 (0.0029) [2024-06-12 22:03:05,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49971.1, 300 sec: 49318.6). Total num frames: 1482981376. Throughput: 0: 49703.7. Samples: 1011746220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 22:03:05,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:03:09,185][71000] Updated weights for policy 0, policy_version 90524 (0.0028) [2024-06-12 22:03:10,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49971.2, 300 sec: 49318.6). Total num frames: 1483227136. Throughput: 0: 49867.0. Samples: 1012043780. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-12 22:03:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:03:12,621][71000] Updated weights for policy 0, policy_version 90534 (0.0025) [2024-06-12 22:03:15,869][71000] Updated weights for policy 0, policy_version 90544 (0.0024) [2024-06-12 22:03:15,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 1483472896. Throughput: 0: 49812.5. Samples: 1012334020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 22:03:15,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:03:19,352][71000] Updated weights for policy 0, policy_version 90554 (0.0025) [2024-06-12 22:03:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.2, 300 sec: 49374.1). Total num frames: 1483718656. Throughput: 0: 49894.9. Samples: 1012488840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 22:03:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:03:22,219][71000] Updated weights for policy 0, policy_version 90564 (0.0028) [2024-06-12 22:03:25,729][71000] Updated weights for policy 0, policy_version 90574 (0.0028) [2024-06-12 22:03:25,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49971.1, 300 sec: 49318.6). Total num frames: 1483964416. Throughput: 0: 49762.1. Samples: 1012790700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 22:03:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:03:28,710][71000] Updated weights for policy 0, policy_version 90584 (0.0025) [2024-06-12 22:03:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 50244.2, 300 sec: 49374.1). Total num frames: 1484226560. Throughput: 0: 49879.0. Samples: 1013089000. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 22:03:30,952][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 22:03:32,221][71000] Updated weights for policy 0, policy_version 90594 (0.0037) [2024-06-12 22:03:35,648][71000] Updated weights for policy 0, policy_version 90604 (0.0032) [2024-06-12 22:03:35,942][70768] Fps is (10 sec: 49141.5, 60 sec: 49969.2, 300 sec: 49318.2). Total num frames: 1484455936. Throughput: 0: 49962.4. Samples: 1013238740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 22:03:35,942][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:03:38,778][71000] Updated weights for policy 0, policy_version 90614 (0.0023) [2024-06-12 22:03:40,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 1484718080. Throughput: 0: 49941.3. Samples: 1013534140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 22:03:40,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 22:03:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000090620_1484718080.pth... [2024-06-12 22:03:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000089898_1472888832.pth [2024-06-12 22:03:42,273][71000] Updated weights for policy 0, policy_version 90624 (0.0036) [2024-06-12 22:03:45,502][71000] Updated weights for policy 0, policy_version 90634 (0.0041) [2024-06-12 22:03:45,939][70768] Fps is (10 sec: 49163.3, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 1484947456. Throughput: 0: 49622.2. Samples: 1013826320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 22:03:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:03:48,861][71000] Updated weights for policy 0, policy_version 90644 (0.0028) [2024-06-12 22:03:50,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 1485193216. Throughput: 0: 49519.1. Samples: 1013974580. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 22:03:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:03:52,276][71000] Updated weights for policy 0, policy_version 90654 (0.0032) [2024-06-12 22:03:55,314][71000] Updated weights for policy 0, policy_version 90664 (0.0028) [2024-06-12 22:03:55,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 1485455360. Throughput: 0: 49564.8. Samples: 1014274200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 22:03:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:03:58,884][71000] Updated weights for policy 0, policy_version 90674 (0.0029) [2024-06-12 22:04:00,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1485684736. Throughput: 0: 49539.1. Samples: 1014563280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 22:04:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:04:02,443][71000] Updated weights for policy 0, policy_version 90684 (0.0039) [2024-06-12 22:04:05,697][71000] Updated weights for policy 0, policy_version 90694 (0.0036) [2024-06-12 22:04:05,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1485930496. Throughput: 0: 49267.6. Samples: 1014705880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-12 22:04:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:04:08,767][70980] Signal inference workers to stop experience collection... (14900 times) [2024-06-12 22:04:08,767][70980] Signal inference workers to resume experience collection... 
(14900 times) [2024-06-12 22:04:08,788][71000] InferenceWorker_p0-w0: stopping experience collection (14900 times) [2024-06-12 22:04:08,788][71000] InferenceWorker_p0-w0: resuming experience collection (14900 times) [2024-06-12 22:04:08,909][71000] Updated weights for policy 0, policy_version 90704 (0.0032) [2024-06-12 22:04:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1486176256. Throughput: 0: 49066.3. Samples: 1014998680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 22:04:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:04:12,630][71000] Updated weights for policy 0, policy_version 90714 (0.0030) [2024-06-12 22:04:15,425][71000] Updated weights for policy 0, policy_version 90724 (0.0019) [2024-06-12 22:04:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1486438400. Throughput: 0: 49079.2. Samples: 1015297560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 22:04:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:04:19,434][71000] Updated weights for policy 0, policy_version 90734 (0.0024) [2024-06-12 22:04:20,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49698.3, 300 sec: 49429.7). Total num frames: 1486700544. Throughput: 0: 49090.5. Samples: 1015447700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 22:04:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:04:22,145][71000] Updated weights for policy 0, policy_version 90744 (0.0024) [2024-06-12 22:04:25,750][71000] Updated weights for policy 0, policy_version 90754 (0.0031) [2024-06-12 22:04:25,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1486913536. Throughput: 0: 49200.4. Samples: 1015748160. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 22:04:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:04:28,955][71000] Updated weights for policy 0, policy_version 90764 (0.0032) [2024-06-12 22:04:30,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 1487175680. Throughput: 0: 49266.9. Samples: 1016043340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 22:04:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:04:32,458][71000] Updated weights for policy 0, policy_version 90774 (0.0028) [2024-06-12 22:04:35,347][71000] Updated weights for policy 0, policy_version 90784 (0.0024) [2024-06-12 22:04:35,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49700.0, 300 sec: 49318.6). Total num frames: 1487437824. Throughput: 0: 49361.0. Samples: 1016195820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 22:04:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:04:38,986][71000] Updated weights for policy 0, policy_version 90794 (0.0023) [2024-06-12 22:04:40,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 1487683584. Throughput: 0: 49526.2. Samples: 1016502880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 22:04:40,942][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:04:41,881][71000] Updated weights for policy 0, policy_version 90804 (0.0031) [2024-06-12 22:04:45,683][71000] Updated weights for policy 0, policy_version 90814 (0.0032) [2024-06-12 22:04:45,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49151.8, 300 sec: 49263.0). Total num frames: 1487896576. Throughput: 0: 49791.9. Samples: 1016803920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 22:04:45,949][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:04:48,431][71000] Updated weights for policy 0, policy_version 90824 (0.0022) [2024-06-12 22:04:50,940][70768] Fps is (10 sec: 49153.0, 60 sec: 49698.3, 300 sec: 49429.7). 
Total num frames: 1488175104. Throughput: 0: 49833.4. Samples: 1016948380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 22:04:50,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:04:52,115][71000] Updated weights for policy 0, policy_version 90834 (0.0026) [2024-06-12 22:04:55,207][71000] Updated weights for policy 0, policy_version 90844 (0.0024) [2024-06-12 22:04:55,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1488437248. Throughput: 0: 49915.8. Samples: 1017244900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 22:04:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 22:04:58,827][71000] Updated weights for policy 0, policy_version 90854 (0.0028) [2024-06-12 22:05:00,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1488683008. Throughput: 0: 49923.0. Samples: 1017544100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 22:05:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:05:01,701][71000] Updated weights for policy 0, policy_version 90864 (0.0035) [2024-06-12 22:05:05,244][71000] Updated weights for policy 0, policy_version 90874 (0.0029) [2024-06-12 22:05:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 1488912384. Throughput: 0: 49718.4. Samples: 1017685040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:05:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:05:08,165][71000] Updated weights for policy 0, policy_version 90884 (0.0023) [2024-06-12 22:05:10,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1489174528. Throughput: 0: 49783.2. Samples: 1017988400. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:05:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:05:11,969][71000] Updated weights for policy 0, policy_version 90894 (0.0027) [2024-06-12 22:05:14,696][71000] Updated weights for policy 0, policy_version 90904 (0.0024) [2024-06-12 22:05:15,939][70768] Fps is (10 sec: 52429.9, 60 sec: 49971.3, 300 sec: 49429.7). Total num frames: 1489436672. Throughput: 0: 49931.3. Samples: 1018290240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:05:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:05:18,554][71000] Updated weights for policy 0, policy_version 90914 (0.0029) [2024-06-12 22:05:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 1489649664. Throughput: 0: 50097.8. Samples: 1018450220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:05:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:05:21,415][71000] Updated weights for policy 0, policy_version 90924 (0.0031) [2024-06-12 22:05:25,119][71000] Updated weights for policy 0, policy_version 90934 (0.0034) [2024-06-12 22:05:25,939][70768] Fps is (10 sec: 45875.3, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 1489895424. Throughput: 0: 49575.0. Samples: 1018733740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:05:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:05:27,864][71000] Updated weights for policy 0, policy_version 90944 (0.0021) [2024-06-12 22:05:28,106][70980] Signal inference workers to stop experience collection... (14950 times) [2024-06-12 22:05:28,106][70980] Signal inference workers to resume experience collection... 
(14950 times) [2024-06-12 22:05:28,137][71000] InferenceWorker_p0-w0: stopping experience collection (14950 times) [2024-06-12 22:05:28,137][71000] InferenceWorker_p0-w0: resuming experience collection (14950 times) [2024-06-12 22:05:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1490157568. Throughput: 0: 49661.1. Samples: 1019038660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:05:30,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 22:05:31,519][71000] Updated weights for policy 0, policy_version 90954 (0.0019) [2024-06-12 22:05:34,476][71000] Updated weights for policy 0, policy_version 90964 (0.0026) [2024-06-12 22:05:35,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1490436096. Throughput: 0: 49848.4. Samples: 1019191560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:05:35,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:05:38,137][71000] Updated weights for policy 0, policy_version 90974 (0.0028) [2024-06-12 22:05:40,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 1490665472. Throughput: 0: 49898.2. Samples: 1019490320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:05:40,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:05:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000090984_1490681856.pth... [2024-06-12 22:05:40,951][71000] Updated weights for policy 0, policy_version 90984 (0.0021) [2024-06-12 22:05:40,989][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000090258_1478787072.pth [2024-06-12 22:05:44,647][71000] Updated weights for policy 0, policy_version 90994 (0.0024) [2024-06-12 22:05:45,940][70768] Fps is (10 sec: 47513.8, 60 sec: 50244.4, 300 sec: 49540.8). Total num frames: 1490911232. Throughput: 0: 49846.3. Samples: 1019787180. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:05:45,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:05:47,861][71000] Updated weights for policy 0, policy_version 91004 (0.0038) [2024-06-12 22:05:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.0, 300 sec: 49540.8). Total num frames: 1491156992. Throughput: 0: 49751.1. Samples: 1019923840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:05:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:05:51,248][71000] Updated weights for policy 0, policy_version 91014 (0.0029) [2024-06-12 22:05:54,165][71000] Updated weights for policy 0, policy_version 91024 (0.0027) [2024-06-12 22:05:55,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.3, 300 sec: 49485.2). Total num frames: 1491419136. Throughput: 0: 49801.7. Samples: 1020229480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:05:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:05:57,641][71000] Updated weights for policy 0, policy_version 91034 (0.0027) [2024-06-12 22:06:00,747][71000] Updated weights for policy 0, policy_version 91044 (0.0027) [2024-06-12 22:06:00,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 1491664896. Throughput: 0: 50003.5. Samples: 1020540400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 22:06:00,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 22:06:04,054][71000] Updated weights for policy 0, policy_version 91054 (0.0028) [2024-06-12 22:06:05,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1491910656. Throughput: 0: 49690.5. Samples: 1020686300. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 22:06:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:06:07,380][71000] Updated weights for policy 0, policy_version 91064 (0.0032) [2024-06-12 22:06:10,808][71000] Updated weights for policy 0, policy_version 91074 (0.0026) [2024-06-12 22:06:10,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49698.0, 300 sec: 49651.8). Total num frames: 1492156416. Throughput: 0: 49893.1. Samples: 1020978940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 22:06:10,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:06:14,187][71000] Updated weights for policy 0, policy_version 91084 (0.0021) [2024-06-12 22:06:15,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1492418560. Throughput: 0: 49592.9. Samples: 1021270340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 22:06:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:06:17,334][71000] Updated weights for policy 0, policy_version 91094 (0.0028) [2024-06-12 22:06:20,785][71000] Updated weights for policy 0, policy_version 91104 (0.0031) [2024-06-12 22:06:20,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1492647936. Throughput: 0: 49757.8. Samples: 1021430660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 22:06:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:06:23,799][71000] Updated weights for policy 0, policy_version 91114 (0.0029) [2024-06-12 22:06:25,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49698.0, 300 sec: 49596.3). Total num frames: 1492877312. Throughput: 0: 49529.9. Samples: 1021719160. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 22:06:25,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:06:27,234][71000] Updated weights for policy 0, policy_version 91124 (0.0029) [2024-06-12 22:06:30,680][71000] Updated weights for policy 0, policy_version 91134 (0.0029) [2024-06-12 22:06:30,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.0, 300 sec: 49596.3). Total num frames: 1493139456. Throughput: 0: 49624.7. Samples: 1022020300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 22:06:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:06:33,689][71000] Updated weights for policy 0, policy_version 91144 (0.0021) [2024-06-12 22:06:35,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49425.1, 300 sec: 49596.8). Total num frames: 1493401600. Throughput: 0: 50105.0. Samples: 1022178560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 22:06:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:06:37,206][71000] Updated weights for policy 0, policy_version 91154 (0.0028) [2024-06-12 22:06:40,187][71000] Updated weights for policy 0, policy_version 91164 (0.0027) [2024-06-12 22:06:40,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49971.2, 300 sec: 49651.8). Total num frames: 1493663744. Throughput: 0: 50016.3. Samples: 1022480220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 22:06:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:06:41,781][70980] Signal inference workers to stop experience collection... (15000 times) [2024-06-12 22:06:41,828][71000] InferenceWorker_p0-w0: stopping experience collection (15000 times) [2024-06-12 22:06:41,834][70980] Signal inference workers to resume experience collection... 
(15000 times) [2024-06-12 22:06:41,843][71000] InferenceWorker_p0-w0: resuming experience collection (15000 times) [2024-06-12 22:06:43,551][71000] Updated weights for policy 0, policy_version 91174 (0.0026) [2024-06-12 22:06:45,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49151.9, 300 sec: 49540.8). Total num frames: 1493860352. Throughput: 0: 49666.5. Samples: 1022775400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 22:06:45,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 22:06:46,966][71000] Updated weights for policy 0, policy_version 91184 (0.0030) [2024-06-12 22:06:50,319][71000] Updated weights for policy 0, policy_version 91194 (0.0030) [2024-06-12 22:06:50,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49971.3, 300 sec: 49707.4). Total num frames: 1494155264. Throughput: 0: 49458.4. Samples: 1022911920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-12 22:06:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:06:53,440][71000] Updated weights for policy 0, policy_version 91204 (0.0026) [2024-06-12 22:06:55,939][70768] Fps is (10 sec: 52429.8, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1494384640. Throughput: 0: 49600.7. Samples: 1023210960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 22:06:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:06:56,967][71000] Updated weights for policy 0, policy_version 91214 (0.0024) [2024-06-12 22:07:00,159][71000] Updated weights for policy 0, policy_version 91224 (0.0033) [2024-06-12 22:07:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 49707.4). Total num frames: 1494646784. Throughput: 0: 49785.3. Samples: 1023510680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 22:07:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:07:03,275][71000] Updated weights for policy 0, policy_version 91234 (0.0026) [2024-06-12 22:07:05,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.2, 300 sec: 49651.9). 
Total num frames: 1494876160. Throughput: 0: 49519.6. Samples: 1023659040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 22:07:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:07:06,583][71000] Updated weights for policy 0, policy_version 91244 (0.0031) [2024-06-12 22:07:10,006][71000] Updated weights for policy 0, policy_version 91254 (0.0030) [2024-06-12 22:07:10,941][70768] Fps is (10 sec: 49147.4, 60 sec: 49697.5, 300 sec: 49707.2). Total num frames: 1495138304. Throughput: 0: 49873.2. Samples: 1023963500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 22:07:10,941][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:07:13,058][71000] Updated weights for policy 0, policy_version 91264 (0.0025) [2024-06-12 22:07:15,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.1, 300 sec: 49540.8). Total num frames: 1495367680. Throughput: 0: 49619.8. Samples: 1024253180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 22:07:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:07:16,522][71000] Updated weights for policy 0, policy_version 91274 (0.0027) [2024-06-12 22:07:20,019][71000] Updated weights for policy 0, policy_version 91284 (0.0026) [2024-06-12 22:07:20,940][70768] Fps is (10 sec: 49156.6, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 1495629824. Throughput: 0: 49373.3. Samples: 1024400360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 22:07:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:07:23,247][71000] Updated weights for policy 0, policy_version 91294 (0.0022) [2024-06-12 22:07:25,939][70768] Fps is (10 sec: 50790.2, 60 sec: 49971.3, 300 sec: 49707.4). Total num frames: 1495875584. Throughput: 0: 49358.5. Samples: 1024701340. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 22:07:25,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 22:07:26,299][71000] Updated weights for policy 0, policy_version 91304 (0.0025) [2024-06-12 22:07:29,973][71000] Updated weights for policy 0, policy_version 91314 (0.0028) [2024-06-12 22:07:30,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.3, 300 sec: 49707.4). Total num frames: 1496121344. Throughput: 0: 49233.9. Samples: 1024990920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 22:07:30,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:07:33,132][71000] Updated weights for policy 0, policy_version 91324 (0.0030) [2024-06-12 22:07:35,939][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1496350720. Throughput: 0: 49496.6. Samples: 1025139260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 22:07:35,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:07:36,526][71000] Updated weights for policy 0, policy_version 91334 (0.0030) [2024-06-12 22:07:39,742][71000] Updated weights for policy 0, policy_version 91344 (0.0027) [2024-06-12 22:07:40,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.2, 300 sec: 49651.9). Total num frames: 1496612864. Throughput: 0: 49473.8. Samples: 1025437280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 22:07:40,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 22:07:40,965][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000091347_1496629248.pth... [2024-06-12 22:07:41,012][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000090620_1484718080.pth [2024-06-12 22:07:43,149][71000] Updated weights for policy 0, policy_version 91354 (0.0026) [2024-06-12 22:07:45,939][70768] Fps is (10 sec: 52428.8, 60 sec: 50244.4, 300 sec: 49707.4). Total num frames: 1496875008. Throughput: 0: 49408.1. Samples: 1025734040. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-12 22:07:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:07:46,595][71000] Updated weights for policy 0, policy_version 91364 (0.0031) [2024-06-12 22:07:49,754][71000] Updated weights for policy 0, policy_version 91374 (0.0030) [2024-06-12 22:07:50,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49596.3). Total num frames: 1497104384. Throughput: 0: 49384.9. Samples: 1025881360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:07:50,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:07:52,963][70980] Signal inference workers to stop experience collection... (15050 times) [2024-06-12 22:07:52,964][70980] Signal inference workers to resume experience collection... (15050 times) [2024-06-12 22:07:53,004][71000] InferenceWorker_p0-w0: stopping experience collection (15050 times) [2024-06-12 22:07:53,004][71000] InferenceWorker_p0-w0: resuming experience collection (15050 times) [2024-06-12 22:07:53,097][71000] Updated weights for policy 0, policy_version 91384 (0.0031) [2024-06-12 22:07:55,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1497350144. Throughput: 0: 49082.0. Samples: 1026172140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:07:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:07:56,591][71000] Updated weights for policy 0, policy_version 91394 (0.0018) [2024-06-12 22:07:59,913][71000] Updated weights for policy 0, policy_version 91404 (0.0030) [2024-06-12 22:08:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1497612288. Throughput: 0: 49251.0. Samples: 1026469480. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:08:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:08:03,189][71000] Updated weights for policy 0, policy_version 91414 (0.0024) [2024-06-12 22:08:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1497858048. Throughput: 0: 49251.1. Samples: 1026616660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:08:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:08:06,469][71000] Updated weights for policy 0, policy_version 91424 (0.0027) [2024-06-12 22:08:10,126][71000] Updated weights for policy 0, policy_version 91434 (0.0036) [2024-06-12 22:08:10,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.8, 300 sec: 49596.3). Total num frames: 1498103808. Throughput: 0: 49393.7. Samples: 1026924060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:08:10,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:08:13,337][71000] Updated weights for policy 0, policy_version 91444 (0.0031) [2024-06-12 22:08:15,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1498349568. Throughput: 0: 49429.8. Samples: 1027215260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:08:15,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 22:08:16,637][71000] Updated weights for policy 0, policy_version 91454 (0.0024) [2024-06-12 22:08:19,703][71000] Updated weights for policy 0, policy_version 91464 (0.0036) [2024-06-12 22:08:20,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.1, 300 sec: 49651.9). Total num frames: 1498611712. Throughput: 0: 49585.7. Samples: 1027370620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:08:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:08:23,153][71000] Updated weights for policy 0, policy_version 91474 (0.0031) [2024-06-12 22:08:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.0, 300 sec: 49540.8). 
Total num frames: 1498841088. Throughput: 0: 49588.8. Samples: 1027668780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:08:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:08:26,427][71000] Updated weights for policy 0, policy_version 91484 (0.0027) [2024-06-12 22:08:29,877][71000] Updated weights for policy 0, policy_version 91494 (0.0035) [2024-06-12 22:08:30,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 49541.2). Total num frames: 1499070464. Throughput: 0: 49558.6. Samples: 1027964180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:08:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:08:32,966][71000] Updated weights for policy 0, policy_version 91504 (0.0029) [2024-06-12 22:08:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1499332608. Throughput: 0: 49450.6. Samples: 1028106640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:08:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:08:36,471][71000] Updated weights for policy 0, policy_version 91514 (0.0028) [2024-06-12 22:08:39,567][71000] Updated weights for policy 0, policy_version 91524 (0.0019) [2024-06-12 22:08:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1499578368. Throughput: 0: 49638.6. Samples: 1028405880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:08:40,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:08:42,954][71000] Updated weights for policy 0, policy_version 91534 (0.0026) [2024-06-12 22:08:45,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49151.9, 300 sec: 49596.3). Total num frames: 1499824128. Throughput: 0: 49444.0. Samples: 1028694460. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 22:08:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:08:46,339][71000] Updated weights for policy 0, policy_version 91544 (0.0025) [2024-06-12 22:08:49,825][71000] Updated weights for policy 0, policy_version 91554 (0.0023) [2024-06-12 22:08:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49424.9, 300 sec: 49540.8). Total num frames: 1500069888. Throughput: 0: 49347.0. Samples: 1028837280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 22:08:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:08:53,000][71000] Updated weights for policy 0, policy_version 91564 (0.0032) [2024-06-12 22:08:55,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49540.8). Total num frames: 1500299264. Throughput: 0: 49025.7. Samples: 1029130220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 22:08:55,943][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:08:56,775][71000] Updated weights for policy 0, policy_version 91574 (0.0030) [2024-06-12 22:08:59,535][71000] Updated weights for policy 0, policy_version 91584 (0.0031) [2024-06-12 22:09:00,942][70768] Fps is (10 sec: 49142.2, 60 sec: 49150.3, 300 sec: 49596.0). Total num frames: 1500561408. Throughput: 0: 49285.6. Samples: 1029433220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 22:09:00,942][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:09:01,238][70980] Signal inference workers to stop experience collection... (15100 times) [2024-06-12 22:09:01,269][71000] InferenceWorker_p0-w0: stopping experience collection (15100 times) [2024-06-12 22:09:01,296][70980] Signal inference workers to resume experience collection... 
(15100 times) [2024-06-12 22:09:01,297][71000] InferenceWorker_p0-w0: resuming experience collection (15100 times) [2024-06-12 22:09:03,246][71000] Updated weights for policy 0, policy_version 91594 (0.0033) [2024-06-12 22:09:05,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 1500807168. Throughput: 0: 49101.3. Samples: 1029580180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 22:09:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:09:06,193][71000] Updated weights for policy 0, policy_version 91604 (0.0030) [2024-06-12 22:09:09,726][71000] Updated weights for policy 0, policy_version 91614 (0.0029) [2024-06-12 22:09:10,940][70768] Fps is (10 sec: 49162.3, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1501052928. Throughput: 0: 49216.5. Samples: 1029883520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 22:09:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:09:12,852][71000] Updated weights for policy 0, policy_version 91624 (0.0031) [2024-06-12 22:09:15,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 1501282304. Throughput: 0: 48972.0. Samples: 1030167920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 22:09:15,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 22:09:16,363][71000] Updated weights for policy 0, policy_version 91634 (0.0022) [2024-06-12 22:09:19,539][71000] Updated weights for policy 0, policy_version 91644 (0.0038) [2024-06-12 22:09:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 49596.3). Total num frames: 1501544448. Throughput: 0: 49121.8. Samples: 1030317120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 22:09:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:09:22,959][71000] Updated weights for policy 0, policy_version 91654 (0.0031) [2024-06-12 22:09:25,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.0, 300 sec: 49540.8). 
Total num frames: 1501790208. Throughput: 0: 49150.2. Samples: 1030617640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 22:09:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:09:26,311][71000] Updated weights for policy 0, policy_version 91664 (0.0028) [2024-06-12 22:09:29,991][71000] Updated weights for policy 0, policy_version 91674 (0.0032) [2024-06-12 22:09:30,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1502035968. Throughput: 0: 49488.7. Samples: 1030921460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 22:09:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:09:32,715][71000] Updated weights for policy 0, policy_version 91684 (0.0023) [2024-06-12 22:09:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 1502281728. Throughput: 0: 49477.3. Samples: 1031063760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 22:09:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:09:36,315][71000] Updated weights for policy 0, policy_version 91694 (0.0026) [2024-06-12 22:09:39,453][71000] Updated weights for policy 0, policy_version 91704 (0.0032) [2024-06-12 22:09:40,940][70768] Fps is (10 sec: 50791.6, 60 sec: 49425.1, 300 sec: 49651.9). Total num frames: 1502543872. Throughput: 0: 49409.9. Samples: 1031353660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 22:09:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:09:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000091708_1502543872.pth... [2024-06-12 22:09:40,990][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000090984_1490681856.pth [2024-06-12 22:09:42,950][71000] Updated weights for policy 0, policy_version 91714 (0.0026) [2024-06-12 22:09:45,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49540.7). Total num frames: 1502789632. 
Throughput: 0: 49285.3. Samples: 1031650960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 22:09:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:09:46,202][71000] Updated weights for policy 0, policy_version 91724 (0.0030) [2024-06-12 22:09:49,763][71000] Updated weights for policy 0, policy_version 91734 (0.0024) [2024-06-12 22:09:50,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1503019008. Throughput: 0: 49316.0. Samples: 1031799400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 22:09:50,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:09:52,846][71000] Updated weights for policy 0, policy_version 91744 (0.0026) [2024-06-12 22:09:55,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1503281152. Throughput: 0: 49317.7. Samples: 1032102820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 22:09:55,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:09:56,379][71000] Updated weights for policy 0, policy_version 91754 (0.0026) [2024-06-12 22:09:59,619][71000] Updated weights for policy 0, policy_version 91764 (0.0024) [2024-06-12 22:10:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49426.8, 300 sec: 49540.8). Total num frames: 1503526912. Throughput: 0: 49660.8. Samples: 1032402660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 22:10:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:10:02,787][71000] Updated weights for policy 0, policy_version 91774 (0.0024) [2024-06-12 22:10:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1503772672. Throughput: 0: 49567.0. Samples: 1032547640. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 22:10:05,940][70768] Avg episode reward: [(0, '0.258')] [2024-06-12 22:10:06,363][71000] Updated weights for policy 0, policy_version 91784 (0.0035) [2024-06-12 22:10:09,491][71000] Updated weights for policy 0, policy_version 91794 (0.0028) [2024-06-12 22:10:10,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1504034816. Throughput: 0: 49712.3. Samples: 1032854700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 22:10:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:10:12,843][71000] Updated weights for policy 0, policy_version 91804 (0.0035) [2024-06-12 22:10:15,862][71000] Updated weights for policy 0, policy_version 91814 (0.0027) [2024-06-12 22:10:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49971.1, 300 sec: 49596.3). Total num frames: 1504280576. Throughput: 0: 49275.6. Samples: 1033138860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 22:10:15,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:10:17,318][70980] Signal inference workers to stop experience collection... (15150 times) [2024-06-12 22:10:17,320][70980] Signal inference workers to resume experience collection... (15150 times) [2024-06-12 22:10:17,330][71000] InferenceWorker_p0-w0: stopping experience collection (15150 times) [2024-06-12 22:10:17,361][71000] InferenceWorker_p0-w0: resuming experience collection (15150 times) [2024-06-12 22:10:19,333][71000] Updated weights for policy 0, policy_version 91824 (0.0033) [2024-06-12 22:10:20,940][70768] Fps is (10 sec: 49153.0, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1504526336. Throughput: 0: 49601.9. Samples: 1033295840. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 22:10:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:10:22,841][71000] Updated weights for policy 0, policy_version 91834 (0.0017) [2024-06-12 22:10:25,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1504755712. Throughput: 0: 49682.6. Samples: 1033589380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 22:10:25,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 22:10:26,001][71000] Updated weights for policy 0, policy_version 91844 (0.0026) [2024-06-12 22:10:29,235][71000] Updated weights for policy 0, policy_version 91854 (0.0035) [2024-06-12 22:10:30,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.3, 300 sec: 49429.7). Total num frames: 1505017856. Throughput: 0: 49825.9. Samples: 1033893120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-12 22:10:30,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 22:10:32,670][71000] Updated weights for policy 0, policy_version 91864 (0.0025) [2024-06-12 22:10:35,743][71000] Updated weights for policy 0, policy_version 91874 (0.0030) [2024-06-12 22:10:35,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1505280000. Throughput: 0: 49935.5. Samples: 1034046500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-12 22:10:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:10:38,975][71000] Updated weights for policy 0, policy_version 91884 (0.0033) [2024-06-12 22:10:40,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49698.0, 300 sec: 49540.7). Total num frames: 1505525760. Throughput: 0: 49769.7. Samples: 1034342460. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-12 22:10:40,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 22:10:42,102][71000] Updated weights for policy 0, policy_version 91894 (0.0025) [2024-06-12 22:10:45,292][71000] Updated weights for policy 0, policy_version 91904 (0.0026) [2024-06-12 22:10:45,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1505755136. Throughput: 0: 49605.2. Samples: 1034634900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-12 22:10:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:10:48,924][71000] Updated weights for policy 0, policy_version 91914 (0.0022) [2024-06-12 22:10:50,940][70768] Fps is (10 sec: 49153.0, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1506017280. Throughput: 0: 49725.4. Samples: 1034785280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-12 22:10:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:10:52,397][71000] Updated weights for policy 0, policy_version 91924 (0.0024) [2024-06-12 22:10:55,637][71000] Updated weights for policy 0, policy_version 91934 (0.0033) [2024-06-12 22:10:55,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1506279424. Throughput: 0: 49642.4. Samples: 1035088600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-12 22:10:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:10:58,722][71000] Updated weights for policy 0, policy_version 91944 (0.0023) [2024-06-12 22:11:00,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49971.3, 300 sec: 49540.8). Total num frames: 1506525184. Throughput: 0: 50020.2. Samples: 1035389760. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-12 22:11:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:11:02,059][71000] Updated weights for policy 0, policy_version 91954 (0.0023) [2024-06-12 22:11:05,455][71000] Updated weights for policy 0, policy_version 91964 (0.0029) [2024-06-12 22:11:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1506754560. Throughput: 0: 49758.1. Samples: 1035534960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-12 22:11:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:11:08,749][71000] Updated weights for policy 0, policy_version 91974 (0.0040) [2024-06-12 22:11:10,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1507000320. Throughput: 0: 49684.4. Samples: 1035825180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-12 22:11:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:11:12,257][71000] Updated weights for policy 0, policy_version 91984 (0.0035) [2024-06-12 22:11:15,167][71000] Updated weights for policy 0, policy_version 91994 (0.0025) [2024-06-12 22:11:15,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1507262464. Throughput: 0: 49559.5. Samples: 1036123300. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-12 22:11:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:11:18,549][71000] Updated weights for policy 0, policy_version 92004 (0.0032) [2024-06-12 22:11:20,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49971.0, 300 sec: 49651.8). Total num frames: 1507524608. Throughput: 0: 49761.2. Samples: 1036285760. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-12 22:11:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:11:21,524][71000] Updated weights for policy 0, policy_version 92014 (0.0030) [2024-06-12 22:11:25,015][71000] Updated weights for policy 0, policy_version 92024 (0.0026) [2024-06-12 22:11:25,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 1507737600. Throughput: 0: 49833.1. Samples: 1036584940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-12 22:11:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:11:27,082][70980] Signal inference workers to stop experience collection... (15200 times) [2024-06-12 22:11:27,082][70980] Signal inference workers to resume experience collection... (15200 times) [2024-06-12 22:11:27,122][71000] InferenceWorker_p0-w0: stopping experience collection (15200 times) [2024-06-12 22:11:27,123][71000] InferenceWorker_p0-w0: resuming experience collection (15200 times) [2024-06-12 22:11:28,137][71000] Updated weights for policy 0, policy_version 92034 (0.0024) [2024-06-12 22:11:30,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1508016128. Throughput: 0: 50006.4. Samples: 1036885180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 22:11:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:11:31,736][71000] Updated weights for policy 0, policy_version 92044 (0.0034) [2024-06-12 22:11:34,633][71000] Updated weights for policy 0, policy_version 92054 (0.0030) [2024-06-12 22:11:35,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 1508261888. Throughput: 0: 50018.2. Samples: 1037036100. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 22:11:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:11:38,314][71000] Updated weights for policy 0, policy_version 92064 (0.0026) [2024-06-12 22:11:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.2, 300 sec: 49651.9). Total num frames: 1508507648. Throughput: 0: 49838.3. Samples: 1037331320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 22:11:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:11:40,945][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000092072_1508507648.pth... [2024-06-12 22:11:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000091347_1496629248.pth [2024-06-12 22:11:41,330][71000] Updated weights for policy 0, policy_version 92074 (0.0030) [2024-06-12 22:11:44,763][71000] Updated weights for policy 0, policy_version 92084 (0.0027) [2024-06-12 22:11:45,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1508720640. Throughput: 0: 49611.0. Samples: 1037622260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 22:11:45,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 22:11:47,902][71000] Updated weights for policy 0, policy_version 92094 (0.0028) [2024-06-12 22:11:50,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1508999168. Throughput: 0: 49504.6. Samples: 1037762660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 22:11:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:11:51,478][71000] Updated weights for policy 0, policy_version 92104 (0.0034) [2024-06-12 22:11:54,596][71000] Updated weights for policy 0, policy_version 92114 (0.0035) [2024-06-12 22:11:55,939][70768] Fps is (10 sec: 54067.5, 60 sec: 49698.3, 300 sec: 49540.8). Total num frames: 1509261312. Throughput: 0: 49754.4. Samples: 1038064120. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 22:11:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:11:58,344][71000] Updated weights for policy 0, policy_version 92124 (0.0026) [2024-06-12 22:12:00,942][70768] Fps is (10 sec: 50775.7, 60 sec: 49695.7, 300 sec: 49595.8). Total num frames: 1509507072. Throughput: 0: 49712.4. Samples: 1038360500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 22:12:00,943][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:12:01,127][71000] Updated weights for policy 0, policy_version 92134 (0.0022) [2024-06-12 22:12:04,717][71000] Updated weights for policy 0, policy_version 92144 (0.0025) [2024-06-12 22:12:05,940][70768] Fps is (10 sec: 44236.5, 60 sec: 49152.1, 300 sec: 49374.3). Total num frames: 1509703680. Throughput: 0: 49246.4. Samples: 1038501840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 22:12:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:12:07,573][71000] Updated weights for policy 0, policy_version 92154 (0.0029) [2024-06-12 22:12:10,940][70768] Fps is (10 sec: 49165.9, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1509998592. Throughput: 0: 49508.8. Samples: 1038812840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 22:12:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:12:11,122][71000] Updated weights for policy 0, policy_version 92164 (0.0026) [2024-06-12 22:12:14,398][71000] Updated weights for policy 0, policy_version 92174 (0.0023) [2024-06-12 22:12:15,939][70768] Fps is (10 sec: 55705.9, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1510260736. Throughput: 0: 49264.1. Samples: 1039102060. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 22:12:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:12:18,161][71000] Updated weights for policy 0, policy_version 92184 (0.0021) [2024-06-12 22:12:20,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.3, 300 sec: 49540.8). Total num frames: 1510490112. Throughput: 0: 49235.3. Samples: 1039251680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-12 22:12:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:12:20,947][71000] Updated weights for policy 0, policy_version 92194 (0.0031) [2024-06-12 22:12:24,654][71000] Updated weights for policy 0, policy_version 92204 (0.0031) [2024-06-12 22:12:25,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1510719488. Throughput: 0: 49466.3. Samples: 1039557300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 22:12:25,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:12:27,586][71000] Updated weights for policy 0, policy_version 92214 (0.0031) [2024-06-12 22:12:30,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1510981632. Throughput: 0: 49484.3. Samples: 1039849060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 22:12:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:12:31,058][71000] Updated weights for policy 0, policy_version 92224 (0.0028) [2024-06-12 22:12:34,225][71000] Updated weights for policy 0, policy_version 92234 (0.0027) [2024-06-12 22:12:35,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1511243776. Throughput: 0: 49643.9. Samples: 1039996640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 22:12:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:12:38,015][71000] Updated weights for policy 0, policy_version 92244 (0.0032) [2024-06-12 22:12:40,938][70980] Signal inference workers to stop experience collection... 
(15250 times) [2024-06-12 22:12:40,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1511456768. Throughput: 0: 49521.3. Samples: 1040292580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 22:12:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:12:40,973][71000] InferenceWorker_p0-w0: stopping experience collection (15250 times) [2024-06-12 22:12:40,994][70980] Signal inference workers to resume experience collection... (15250 times) [2024-06-12 22:12:40,995][71000] InferenceWorker_p0-w0: resuming experience collection (15250 times) [2024-06-12 22:12:41,146][71000] Updated weights for policy 0, policy_version 92254 (0.0020) [2024-06-12 22:12:44,863][71000] Updated weights for policy 0, policy_version 92264 (0.0025) [2024-06-12 22:12:45,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1511702528. Throughput: 0: 49676.9. Samples: 1040595820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 22:12:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:12:47,578][71000] Updated weights for policy 0, policy_version 92274 (0.0031) [2024-06-12 22:12:50,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 1511948288. Throughput: 0: 49561.7. Samples: 1040732120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 22:12:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:12:51,342][71000] Updated weights for policy 0, policy_version 92284 (0.0026) [2024-06-12 22:12:54,412][71000] Updated weights for policy 0, policy_version 92294 (0.0035) [2024-06-12 22:12:55,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1512226816. Throughput: 0: 49279.1. Samples: 1041030400. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 22:12:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:12:58,268][71000] Updated weights for policy 0, policy_version 92304 (0.0031) [2024-06-12 22:13:00,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49154.3, 300 sec: 49485.2). Total num frames: 1512456192. Throughput: 0: 49352.0. Samples: 1041322900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 22:13:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:13:01,057][71000] Updated weights for policy 0, policy_version 92314 (0.0026) [2024-06-12 22:13:04,903][71000] Updated weights for policy 0, policy_version 92324 (0.0029) [2024-06-12 22:13:05,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1512685568. Throughput: 0: 49199.4. Samples: 1041465660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 22:13:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:13:07,784][71000] Updated weights for policy 0, policy_version 92334 (0.0032) [2024-06-12 22:13:10,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 1512931328. Throughput: 0: 48954.6. Samples: 1041760260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 22:13:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:13:11,415][71000] Updated weights for policy 0, policy_version 92344 (0.0027) [2024-06-12 22:13:14,352][71000] Updated weights for policy 0, policy_version 92354 (0.0025) [2024-06-12 22:13:15,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.0, 300 sec: 49485.3). Total num frames: 1513209856. Throughput: 0: 49165.1. Samples: 1042061480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 22:13:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:13:18,117][71000] Updated weights for policy 0, policy_version 92364 (0.0028) [2024-06-12 22:13:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49151.8, 300 sec: 49485.2). 
Total num frames: 1513439232. Throughput: 0: 49281.7. Samples: 1042214320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 22:13:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:13:21,082][71000] Updated weights for policy 0, policy_version 92374 (0.0026) [2024-06-12 22:13:24,785][71000] Updated weights for policy 0, policy_version 92384 (0.0039) [2024-06-12 22:13:25,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1513684992. Throughput: 0: 49319.4. Samples: 1042511960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 22:13:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:13:27,628][71000] Updated weights for policy 0, policy_version 92394 (0.0023) [2024-06-12 22:13:30,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 1513914368. Throughput: 0: 48884.2. Samples: 1042795620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 22:13:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:13:31,534][71000] Updated weights for policy 0, policy_version 92404 (0.0031) [2024-06-12 22:13:34,431][71000] Updated weights for policy 0, policy_version 92414 (0.0031) [2024-06-12 22:13:35,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1514192896. Throughput: 0: 49191.1. Samples: 1042945720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 22:13:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:13:37,911][71000] Updated weights for policy 0, policy_version 92424 (0.0033) [2024-06-12 22:13:40,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1514405888. Throughput: 0: 49188.1. Samples: 1043243860. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 22:13:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:13:41,063][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000092433_1514422272.pth... [2024-06-12 22:13:41,105][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000091708_1502543872.pth [2024-06-12 22:13:41,240][71000] Updated weights for policy 0, policy_version 92434 (0.0031) [2024-06-12 22:13:44,691][71000] Updated weights for policy 0, policy_version 92444 (0.0030) [2024-06-12 22:13:45,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1514651648. Throughput: 0: 49267.4. Samples: 1043539940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 22:13:45,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:13:47,708][71000] Updated weights for policy 0, policy_version 92454 (0.0037) [2024-06-12 22:13:50,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 1514897408. Throughput: 0: 49175.2. Samples: 1043678540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 22:13:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:13:51,425][71000] Updated weights for policy 0, policy_version 92464 (0.0027) [2024-06-12 22:13:54,301][70980] Signal inference workers to stop experience collection... (15300 times) [2024-06-12 22:13:54,350][71000] InferenceWorker_p0-w0: stopping experience collection (15300 times) [2024-06-12 22:13:54,351][70980] Signal inference workers to resume experience collection... (15300 times) [2024-06-12 22:13:54,355][71000] Updated weights for policy 0, policy_version 92474 (0.0025) [2024-06-12 22:13:54,361][71000] InferenceWorker_p0-w0: resuming experience collection (15300 times) [2024-06-12 22:13:55,940][70768] Fps is (10 sec: 52429.7, 60 sec: 49152.0, 300 sec: 49541.1). Total num frames: 1515175936. Throughput: 0: 49374.4. Samples: 1043982100. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 22:13:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:13:57,879][71000] Updated weights for policy 0, policy_version 92484 (0.0030) [2024-06-12 22:14:00,896][71000] Updated weights for policy 0, policy_version 92494 (0.0026) [2024-06-12 22:14:00,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1515421696. Throughput: 0: 49345.6. Samples: 1044282040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 22:14:00,941][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:14:04,136][71000] Updated weights for policy 0, policy_version 92504 (0.0032) [2024-06-12 22:14:05,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1515634688. Throughput: 0: 49266.4. Samples: 1044431300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 22:14:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:14:07,493][71000] Updated weights for policy 0, policy_version 92514 (0.0036) [2024-06-12 22:14:10,634][71000] Updated weights for policy 0, policy_version 92524 (0.0022) [2024-06-12 22:14:10,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1515913216. Throughput: 0: 49112.1. Samples: 1044722000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 22:14:10,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:14:14,287][71000] Updated weights for policy 0, policy_version 92534 (0.0033) [2024-06-12 22:14:15,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49424.9, 300 sec: 49596.3). Total num frames: 1516175360. Throughput: 0: 49405.0. Samples: 1045018840. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-12 22:14:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:14:17,730][71000] Updated weights for policy 0, policy_version 92544 (0.0034) [2024-06-12 22:14:20,940][70768] Fps is (10 sec: 45874.5, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 1516371968. Throughput: 0: 49414.2. Samples: 1045169360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 22:14:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:14:21,163][71000] Updated weights for policy 0, policy_version 92554 (0.0035) [2024-06-12 22:14:24,261][71000] Updated weights for policy 0, policy_version 92564 (0.0026) [2024-06-12 22:14:25,939][70768] Fps is (10 sec: 45876.0, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 1516634112. Throughput: 0: 49440.5. Samples: 1045468680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 22:14:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:14:27,441][71000] Updated weights for policy 0, policy_version 92574 (0.0030) [2024-06-12 22:14:30,799][71000] Updated weights for policy 0, policy_version 92584 (0.0030) [2024-06-12 22:14:30,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49698.3, 300 sec: 49540.8). Total num frames: 1516896256. Throughput: 0: 49417.5. Samples: 1045763720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 22:14:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:14:34,200][71000] Updated weights for policy 0, policy_version 92594 (0.0034) [2024-06-12 22:14:35,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1517142016. Throughput: 0: 49610.5. Samples: 1045911020. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 22:14:35,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:14:37,586][71000] Updated weights for policy 0, policy_version 92604 (0.0033) [2024-06-12 22:14:40,715][71000] Updated weights for policy 0, policy_version 92614 (0.0028) [2024-06-12 22:14:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 49485.3). Total num frames: 1517387776. Throughput: 0: 49491.1. Samples: 1046209200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 22:14:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:14:43,735][71000] Updated weights for policy 0, policy_version 92624 (0.0024) [2024-06-12 22:14:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1517633536. Throughput: 0: 49599.5. Samples: 1046514020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 22:14:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:14:47,196][71000] Updated weights for policy 0, policy_version 92634 (0.0022) [2024-06-12 22:14:50,529][71000] Updated weights for policy 0, policy_version 92644 (0.0031) [2024-06-12 22:14:50,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49971.0, 300 sec: 49540.8). Total num frames: 1517895680. Throughput: 0: 49426.9. Samples: 1046655520. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 22:14:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:14:53,865][71000] Updated weights for policy 0, policy_version 92654 (0.0036) [2024-06-12 22:14:55,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1518125056. Throughput: 0: 49612.9. Samples: 1046954580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 22:14:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:14:57,240][71000] Updated weights for policy 0, policy_version 92664 (0.0024) [2024-06-12 22:14:59,420][70980] Signal inference workers to stop experience collection... 
(15350 times) [2024-06-12 22:14:59,456][71000] InferenceWorker_p0-w0: stopping experience collection (15350 times) [2024-06-12 22:14:59,478][70980] Signal inference workers to resume experience collection... (15350 times) [2024-06-12 22:14:59,481][71000] InferenceWorker_p0-w0: resuming experience collection (15350 times) [2024-06-12 22:15:00,778][71000] Updated weights for policy 0, policy_version 92674 (0.0031) [2024-06-12 22:15:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1518387200. Throughput: 0: 49476.8. Samples: 1047245300. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 22:15:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:15:03,980][71000] Updated weights for policy 0, policy_version 92684 (0.0021) [2024-06-12 22:15:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1518616576. Throughput: 0: 49576.1. Samples: 1047400280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 22:15:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:15:07,132][71000] Updated weights for policy 0, policy_version 92694 (0.0024) [2024-06-12 22:15:10,396][71000] Updated weights for policy 0, policy_version 92704 (0.0033) [2024-06-12 22:15:10,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1518878720. Throughput: 0: 49570.2. Samples: 1047699340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-12 22:15:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:15:13,887][71000] Updated weights for policy 0, policy_version 92714 (0.0028) [2024-06-12 22:15:15,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 1519108096. Throughput: 0: 49520.5. Samples: 1047992140. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:15:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:15:17,368][71000] Updated weights for policy 0, policy_version 92724 (0.0031) [2024-06-12 22:15:20,225][71000] Updated weights for policy 0, policy_version 92734 (0.0031) [2024-06-12 22:15:20,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49971.4, 300 sec: 49540.8). Total num frames: 1519370240. Throughput: 0: 49489.6. Samples: 1048138040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:15:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:15:23,850][71000] Updated weights for policy 0, policy_version 92744 (0.0028) [2024-06-12 22:15:25,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49697.9, 300 sec: 49485.2). Total num frames: 1519616000. Throughput: 0: 49389.6. Samples: 1048431740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:15:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:15:27,034][71000] Updated weights for policy 0, policy_version 92754 (0.0032) [2024-06-12 22:15:30,232][71000] Updated weights for policy 0, policy_version 92764 (0.0037) [2024-06-12 22:15:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1519861760. Throughput: 0: 49228.6. Samples: 1048729300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:15:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:15:33,621][71000] Updated weights for policy 0, policy_version 92774 (0.0024) [2024-06-12 22:15:35,939][70768] Fps is (10 sec: 49153.3, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1520107520. Throughput: 0: 49535.9. Samples: 1048884620. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:15:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:15:37,167][71000] Updated weights for policy 0, policy_version 92784 (0.0027) [2024-06-12 22:15:40,020][71000] Updated weights for policy 0, policy_version 92794 (0.0030) [2024-06-12 22:15:40,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1520353280. Throughput: 0: 49445.6. Samples: 1049179640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:15:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:15:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000092795_1520353280.pth... [2024-06-12 22:15:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000092072_1508507648.pth [2024-06-12 22:15:43,707][71000] Updated weights for policy 0, policy_version 92804 (0.0029) [2024-06-12 22:15:45,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.3, 300 sec: 49429.7). Total num frames: 1520599040. Throughput: 0: 49573.1. Samples: 1049476080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:15:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:15:46,794][71000] Updated weights for policy 0, policy_version 92814 (0.0025) [2024-06-12 22:15:50,206][71000] Updated weights for policy 0, policy_version 92824 (0.0026) [2024-06-12 22:15:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1520844800. Throughput: 0: 49499.9. Samples: 1049627780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:15:50,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 22:15:53,454][71000] Updated weights for policy 0, policy_version 92834 (0.0030) [2024-06-12 22:15:55,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1521090560. Throughput: 0: 49318.2. Samples: 1049918660. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:15:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:15:57,183][71000] Updated weights for policy 0, policy_version 92844 (0.0026) [2024-06-12 22:16:00,179][71000] Updated weights for policy 0, policy_version 92854 (0.0026) [2024-06-12 22:16:00,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1521336320. Throughput: 0: 49243.4. Samples: 1050208100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:16:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:16:03,500][71000] Updated weights for policy 0, policy_version 92864 (0.0026) [2024-06-12 22:16:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1521598464. Throughput: 0: 49471.9. Samples: 1050364280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:16:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:16:06,809][71000] Updated weights for policy 0, policy_version 92874 (0.0036) [2024-06-12 22:16:10,385][71000] Updated weights for policy 0, policy_version 92884 (0.0034) [2024-06-12 22:16:10,939][70768] Fps is (10 sec: 47514.4, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 1521811456. Throughput: 0: 49510.9. Samples: 1050659720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 22:16:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:16:13,593][71000] Updated weights for policy 0, policy_version 92894 (0.0036) [2024-06-12 22:16:15,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 49318.7). Total num frames: 1522073600. Throughput: 0: 49536.9. Samples: 1050958460. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 22:16:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:16:16,821][71000] Updated weights for policy 0, policy_version 92904 (0.0030) [2024-06-12 22:16:20,308][71000] Updated weights for policy 0, policy_version 92914 (0.0040) [2024-06-12 22:16:20,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1522319360. Throughput: 0: 49192.4. Samples: 1051098280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 22:16:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:16:23,406][71000] Updated weights for policy 0, policy_version 92924 (0.0032) [2024-06-12 22:16:24,687][70980] Signal inference workers to stop experience collection... (15400 times) [2024-06-12 22:16:24,687][70980] Signal inference workers to resume experience collection... (15400 times) [2024-06-12 22:16:24,795][71000] InferenceWorker_p0-w0: stopping experience collection (15400 times) [2024-06-12 22:16:24,795][71000] InferenceWorker_p0-w0: resuming experience collection (15400 times) [2024-06-12 22:16:25,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1522581504. Throughput: 0: 49532.2. Samples: 1051408580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 22:16:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:16:27,045][71000] Updated weights for policy 0, policy_version 92934 (0.0030) [2024-06-12 22:16:30,229][71000] Updated weights for policy 0, policy_version 92944 (0.0033) [2024-06-12 22:16:30,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1522810880. Throughput: 0: 49259.8. Samples: 1051692780. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 22:16:30,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:16:33,549][71000] Updated weights for policy 0, policy_version 92954 (0.0026) [2024-06-12 22:16:35,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49151.8, 300 sec: 49318.6). Total num frames: 1523056640. Throughput: 0: 49194.2. Samples: 1051841520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 22:16:35,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:16:36,921][71000] Updated weights for policy 0, policy_version 92964 (0.0039) [2024-06-12 22:16:40,304][71000] Updated weights for policy 0, policy_version 92974 (0.0037) [2024-06-12 22:16:40,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1523318784. Throughput: 0: 49164.4. Samples: 1052131060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 22:16:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:16:43,451][71000] Updated weights for policy 0, policy_version 92984 (0.0024) [2024-06-12 22:16:45,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1523564544. Throughput: 0: 49548.6. Samples: 1052437780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 22:16:45,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 22:16:46,691][71000] Updated weights for policy 0, policy_version 92994 (0.0031) [2024-06-12 22:16:49,974][71000] Updated weights for policy 0, policy_version 93004 (0.0026) [2024-06-12 22:16:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1523810304. Throughput: 0: 49538.7. Samples: 1052593520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 22:16:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:16:53,119][71000] Updated weights for policy 0, policy_version 93014 (0.0029) [2024-06-12 22:16:55,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49374.6). 
Total num frames: 1524072448. Throughput: 0: 49588.9. Samples: 1052891220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 22:16:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:16:56,666][71000] Updated weights for policy 0, policy_version 93024 (0.0032) [2024-06-12 22:16:59,913][71000] Updated weights for policy 0, policy_version 93034 (0.0026) [2024-06-12 22:17:00,941][70768] Fps is (10 sec: 50781.0, 60 sec: 49696.7, 300 sec: 49540.5). Total num frames: 1524318208. Throughput: 0: 49624.2. Samples: 1053191640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-12 22:17:00,942][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 22:17:03,072][71000] Updated weights for policy 0, policy_version 93044 (0.0027) [2024-06-12 22:17:05,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1524580352. Throughput: 0: 49767.4. Samples: 1053337820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:17:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:17:06,246][71000] Updated weights for policy 0, policy_version 93054 (0.0023) [2024-06-12 22:17:09,414][71000] Updated weights for policy 0, policy_version 93064 (0.0034) [2024-06-12 22:17:10,940][70768] Fps is (10 sec: 50799.7, 60 sec: 50244.2, 300 sec: 49374.1). Total num frames: 1524826112. Throughput: 0: 49798.6. Samples: 1053649520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:17:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:17:12,749][71000] Updated weights for policy 0, policy_version 93074 (0.0027) [2024-06-12 22:17:15,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1525039104. Throughput: 0: 49998.4. Samples: 1053942700. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:17:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:17:16,332][71000] Updated weights for policy 0, policy_version 93084 (0.0040) [2024-06-12 22:17:19,717][71000] Updated weights for policy 0, policy_version 93094 (0.0036) [2024-06-12 22:17:20,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1525317632. Throughput: 0: 49799.7. Samples: 1054082500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:17:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:17:23,194][71000] Updated weights for policy 0, policy_version 93104 (0.0022) [2024-06-12 22:17:25,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1525547008. Throughput: 0: 49844.1. Samples: 1054374040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:17:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:17:26,214][71000] Updated weights for policy 0, policy_version 93114 (0.0028) [2024-06-12 22:17:29,648][71000] Updated weights for policy 0, policy_version 93124 (0.0026) [2024-06-12 22:17:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 50244.4, 300 sec: 49429.7). Total num frames: 1525825536. Throughput: 0: 49762.6. Samples: 1054677100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:17:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:17:32,487][71000] Updated weights for policy 0, policy_version 93134 (0.0027) [2024-06-12 22:17:33,998][70980] Signal inference workers to stop experience collection... (15450 times) [2024-06-12 22:17:34,005][70980] Signal inference workers to resume experience collection... 
(15450 times) [2024-06-12 22:17:34,009][71000] InferenceWorker_p0-w0: stopping experience collection (15450 times) [2024-06-12 22:17:34,039][71000] InferenceWorker_p0-w0: resuming experience collection (15450 times) [2024-06-12 22:17:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49971.3, 300 sec: 49485.2). Total num frames: 1526054912. Throughput: 0: 49867.1. Samples: 1054837540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:17:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:17:36,035][71000] Updated weights for policy 0, policy_version 93144 (0.0035) [2024-06-12 22:17:39,456][71000] Updated weights for policy 0, policy_version 93154 (0.0027) [2024-06-12 22:17:40,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49971.1, 300 sec: 49540.7). Total num frames: 1526317056. Throughput: 0: 49728.7. Samples: 1055129020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:17:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:17:40,958][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000093159_1526317056.pth... [2024-06-12 22:17:41,005][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000092433_1514422272.pth [2024-06-12 22:17:42,774][71000] Updated weights for policy 0, policy_version 93164 (0.0027) [2024-06-12 22:17:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1526546432. Throughput: 0: 49515.4. Samples: 1055419740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:17:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:17:46,026][71000] Updated weights for policy 0, policy_version 93174 (0.0035) [2024-06-12 22:17:49,513][71000] Updated weights for policy 0, policy_version 93184 (0.0033) [2024-06-12 22:17:50,940][70768] Fps is (10 sec: 50791.2, 60 sec: 50244.3, 300 sec: 49485.2). Total num frames: 1526824960. Throughput: 0: 49572.6. Samples: 1055568580. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:17:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:17:52,674][71000] Updated weights for policy 0, policy_version 93194 (0.0024) [2024-06-12 22:17:55,792][71000] Updated weights for policy 0, policy_version 93204 (0.0028) [2024-06-12 22:17:55,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1527054336. Throughput: 0: 49481.3. Samples: 1055876180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:17:55,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 22:17:58,891][71000] Updated weights for policy 0, policy_version 93214 (0.0031) [2024-06-12 22:18:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49699.7, 300 sec: 49540.8). Total num frames: 1527300096. Throughput: 0: 49746.2. Samples: 1056181280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-12 22:18:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:18:02,268][71000] Updated weights for policy 0, policy_version 93224 (0.0031) [2024-06-12 22:18:05,910][71000] Updated weights for policy 0, policy_version 93234 (0.0026) [2024-06-12 22:18:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1527545856. Throughput: 0: 49691.0. Samples: 1056318600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 22:18:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:18:08,979][71000] Updated weights for policy 0, policy_version 93244 (0.0029) [2024-06-12 22:18:10,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1527808000. Throughput: 0: 49875.8. Samples: 1056618460. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 22:18:10,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 22:18:12,626][71000] Updated weights for policy 0, policy_version 93254 (0.0029) [2024-06-12 22:18:15,695][71000] Updated weights for policy 0, policy_version 93264 (0.0030) [2024-06-12 22:18:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 1528037376. Throughput: 0: 49748.3. Samples: 1056915780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 22:18:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:18:18,982][71000] Updated weights for policy 0, policy_version 93274 (0.0021) [2024-06-12 22:18:20,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1528299520. Throughput: 0: 49384.9. Samples: 1057059860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 22:18:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:18:22,424][71000] Updated weights for policy 0, policy_version 93284 (0.0032) [2024-06-12 22:18:25,574][71000] Updated weights for policy 0, policy_version 93294 (0.0023) [2024-06-12 22:18:25,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49971.2, 300 sec: 49596.4). Total num frames: 1528545280. Throughput: 0: 49635.3. Samples: 1057362600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 22:18:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:18:28,670][71000] Updated weights for policy 0, policy_version 93304 (0.0036) [2024-06-12 22:18:30,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1528807424. Throughput: 0: 49999.0. Samples: 1057669700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 22:18:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:18:31,802][71000] Updated weights for policy 0, policy_version 93314 (0.0024) [2024-06-12 22:18:35,094][71000] Updated weights for policy 0, policy_version 93324 (0.0033) [2024-06-12 22:18:35,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49971.2, 300 sec: 49651.8). Total num frames: 1529053184. Throughput: 0: 50006.2. Samples: 1057818860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 22:18:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:18:38,581][71000] Updated weights for policy 0, policy_version 93334 (0.0028) [2024-06-12 22:18:40,943][70768] Fps is (10 sec: 49133.0, 60 sec: 49695.0, 300 sec: 49651.2). Total num frames: 1529298944. Throughput: 0: 49958.8. Samples: 1058124520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 22:18:40,944][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:18:41,694][71000] Updated weights for policy 0, policy_version 93344 (0.0020) [2024-06-12 22:18:45,215][71000] Updated weights for policy 0, policy_version 93354 (0.0029) [2024-06-12 22:18:45,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1529528320. Throughput: 0: 49779.1. Samples: 1058421340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 22:18:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:18:48,160][71000] Updated weights for policy 0, policy_version 93364 (0.0031) [2024-06-12 22:18:50,939][70768] Fps is (10 sec: 50810.6, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1529806848. Throughput: 0: 49997.0. Samples: 1058568460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 22:18:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:18:51,388][70980] Signal inference workers to stop experience collection... 
(15500 times) [2024-06-12 22:18:51,441][70980] Signal inference workers to resume experience collection... (15500 times) [2024-06-12 22:18:51,443][71000] InferenceWorker_p0-w0: stopping experience collection (15500 times) [2024-06-12 22:18:51,457][71000] InferenceWorker_p0-w0: resuming experience collection (15500 times) [2024-06-12 22:18:51,587][71000] Updated weights for policy 0, policy_version 93374 (0.0032) [2024-06-12 22:18:54,580][71000] Updated weights for policy 0, policy_version 93384 (0.0029) [2024-06-12 22:18:55,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1530052608. Throughput: 0: 50121.0. Samples: 1058873900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-12 22:18:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:18:58,054][71000] Updated weights for policy 0, policy_version 93394 (0.0028) [2024-06-12 22:19:00,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 1530298368. Throughput: 0: 50107.6. Samples: 1059170620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 22:19:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:19:01,483][71000] Updated weights for policy 0, policy_version 93404 (0.0035) [2024-06-12 22:19:04,924][71000] Updated weights for policy 0, policy_version 93414 (0.0028) [2024-06-12 22:19:05,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1530527744. Throughput: 0: 50135.6. Samples: 1059315960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 22:19:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 22:19:07,854][71000] Updated weights for policy 0, policy_version 93424 (0.0021) [2024-06-12 22:19:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1530789888. Throughput: 0: 50025.1. Samples: 1059613740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 22:19:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:19:11,095][71000] Updated weights for policy 0, policy_version 93434 (0.0029) [2024-06-12 22:19:14,401][71000] Updated weights for policy 0, policy_version 93444 (0.0026) [2024-06-12 22:19:15,940][70768] Fps is (10 sec: 54067.0, 60 sec: 50517.4, 300 sec: 49818.5). Total num frames: 1531068416. Throughput: 0: 49918.3. Samples: 1059916020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 22:19:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 22:19:17,787][71000] Updated weights for policy 0, policy_version 93454 (0.0027) [2024-06-12 22:19:20,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 1531297792. Throughput: 0: 50165.7. Samples: 1060076320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 22:19:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:19:21,118][71000] Updated weights for policy 0, policy_version 93464 (0.0026) [2024-06-12 22:19:24,536][71000] Updated weights for policy 0, policy_version 93474 (0.0021) [2024-06-12 22:19:25,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49698.0, 300 sec: 49596.3). Total num frames: 1531527168. Throughput: 0: 49670.1. Samples: 1060359480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 22:19:25,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 22:19:27,767][71000] Updated weights for policy 0, policy_version 93484 (0.0029) [2024-06-12 22:19:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1531772928. Throughput: 0: 49699.8. Samples: 1060657840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 22:19:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:19:31,287][71000] Updated weights for policy 0, policy_version 93494 (0.0028) [2024-06-12 22:19:34,647][71000] Updated weights for policy 0, policy_version 93504 (0.0032) [2024-06-12 22:19:35,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49971.3, 300 sec: 49707.4). Total num frames: 1532051456. Throughput: 0: 49647.5. Samples: 1060802600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 22:19:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:19:37,770][71000] Updated weights for policy 0, policy_version 93514 (0.0025) [2024-06-12 22:19:40,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49428.3, 300 sec: 49596.3). Total num frames: 1532264448. Throughput: 0: 49576.9. Samples: 1061104860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 22:19:40,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 22:19:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000093522_1532264448.pth... [2024-06-12 22:19:41,021][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000092795_1520353280.pth [2024-06-12 22:19:41,314][71000] Updated weights for policy 0, policy_version 93524 (0.0029) [2024-06-12 22:19:44,812][71000] Updated weights for policy 0, policy_version 93534 (0.0029) [2024-06-12 22:19:45,939][70768] Fps is (10 sec: 45875.3, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1532510208. Throughput: 0: 49529.5. Samples: 1061399440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 22:19:45,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:19:47,970][71000] Updated weights for policy 0, policy_version 93544 (0.0024) [2024-06-12 22:19:48,864][70980] Signal inference workers to stop experience collection... 
(15550 times) [2024-06-12 22:19:48,896][71000] InferenceWorker_p0-w0: stopping experience collection (15550 times) [2024-06-12 22:19:48,973][70980] Signal inference workers to resume experience collection... (15550 times) [2024-06-12 22:19:48,973][71000] InferenceWorker_p0-w0: resuming experience collection (15550 times) [2024-06-12 22:19:50,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49424.9, 300 sec: 49651.8). Total num frames: 1532772352. Throughput: 0: 49546.0. Samples: 1061545540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 22:19:50,942][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:19:51,226][71000] Updated weights for policy 0, policy_version 93554 (0.0030) [2024-06-12 22:19:54,394][71000] Updated weights for policy 0, policy_version 93564 (0.0027) [2024-06-12 22:19:55,940][70768] Fps is (10 sec: 52427.5, 60 sec: 49698.0, 300 sec: 49651.8). Total num frames: 1533034496. Throughput: 0: 49464.4. Samples: 1061839640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 22:19:55,941][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:19:58,051][71000] Updated weights for policy 0, policy_version 93574 (0.0029) [2024-06-12 22:20:00,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.2, 300 sec: 49651.9). Total num frames: 1533263872. Throughput: 0: 49375.1. Samples: 1062137900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 22:20:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:20:00,982][71000] Updated weights for policy 0, policy_version 93584 (0.0036) [2024-06-12 22:20:04,390][71000] Updated weights for policy 0, policy_version 93594 (0.0024) [2024-06-12 22:20:05,940][70768] Fps is (10 sec: 47514.7, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1533509632. Throughput: 0: 49100.6. Samples: 1062285840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 22:20:05,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 22:20:07,974][71000] Updated weights for policy 0, policy_version 93604 (0.0043) [2024-06-12 22:20:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.2, 300 sec: 49651.9). Total num frames: 1533755392. Throughput: 0: 49139.2. Samples: 1062570740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 22:20:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:20:11,263][71000] Updated weights for policy 0, policy_version 93614 (0.0032) [2024-06-12 22:20:14,735][71000] Updated weights for policy 0, policy_version 93624 (0.0030) [2024-06-12 22:20:15,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.9, 300 sec: 49596.3). Total num frames: 1534001152. Throughput: 0: 49029.4. Samples: 1062864160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 22:20:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:20:18,026][71000] Updated weights for policy 0, policy_version 93634 (0.0028) [2024-06-12 22:20:20,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.9, 300 sec: 49540.8). Total num frames: 1534230528. Throughput: 0: 49119.0. Samples: 1063012960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 22:20:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:20:21,359][71000] Updated weights for policy 0, policy_version 93644 (0.0025) [2024-06-12 22:20:24,591][71000] Updated weights for policy 0, policy_version 93654 (0.0026) [2024-06-12 22:20:25,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1534492672. Throughput: 0: 49125.6. Samples: 1063315520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 22:20:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:20:27,894][71000] Updated weights for policy 0, policy_version 93664 (0.0028) [2024-06-12 22:20:30,784][71000] Updated weights for policy 0, policy_version 93674 (0.0027) [2024-06-12 22:20:30,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.3, 300 sec: 49651.8). Total num frames: 1534754816. Throughput: 0: 49214.6. Samples: 1063614100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 22:20:30,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:20:34,317][71000] Updated weights for policy 0, policy_version 93684 (0.0032) [2024-06-12 22:20:35,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49151.9, 300 sec: 49651.9). Total num frames: 1535000576. Throughput: 0: 49358.2. Samples: 1063766660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 22:20:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:20:37,519][71000] Updated weights for policy 0, policy_version 93694 (0.0041) [2024-06-12 22:20:40,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1535229952. Throughput: 0: 49328.9. Samples: 1064059440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 22:20:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:20:41,072][71000] Updated weights for policy 0, policy_version 93704 (0.0037) [2024-06-12 22:20:44,234][71000] Updated weights for policy 0, policy_version 93714 (0.0031) [2024-06-12 22:20:45,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49698.1, 300 sec: 49651.9). Total num frames: 1535492096. Throughput: 0: 49250.7. Samples: 1064354180. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-12 22:20:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:20:47,619][71000] Updated weights for policy 0, policy_version 93724 (0.0027) [2024-06-12 22:20:50,546][71000] Updated weights for policy 0, policy_version 93734 (0.0027) [2024-06-12 22:20:50,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 1535754240. Throughput: 0: 49384.7. Samples: 1064508160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 22:20:50,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:20:54,104][71000] Updated weights for policy 0, policy_version 93744 (0.0028) [2024-06-12 22:20:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49651.9). Total num frames: 1535983616. Throughput: 0: 49668.9. Samples: 1064805840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 22:20:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:20:57,316][71000] Updated weights for policy 0, policy_version 93754 (0.0034) [2024-06-12 22:21:00,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1536212992. Throughput: 0: 49756.5. Samples: 1065103200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 22:21:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:21:00,951][71000] Updated weights for policy 0, policy_version 93764 (0.0026) [2024-06-12 22:21:04,019][71000] Updated weights for policy 0, policy_version 93774 (0.0039) [2024-06-12 22:21:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49707.4). Total num frames: 1536475136. Throughput: 0: 49662.7. Samples: 1065247780. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 22:21:05,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:21:07,378][71000] Updated weights for policy 0, policy_version 93784 (0.0039) [2024-06-12 22:21:10,403][71000] Updated weights for policy 0, policy_version 93794 (0.0023) [2024-06-12 22:21:10,899][70980] Signal inference workers to stop experience collection... (15600 times) [2024-06-12 22:21:10,900][70980] Signal inference workers to resume experience collection... (15600 times) [2024-06-12 22:21:10,915][71000] InferenceWorker_p0-w0: stopping experience collection (15600 times) [2024-06-12 22:21:10,915][71000] InferenceWorker_p0-w0: resuming experience collection (15600 times) [2024-06-12 22:21:10,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 1536737280. Throughput: 0: 49645.1. Samples: 1065549540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 22:21:10,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 22:21:14,110][71000] Updated weights for policy 0, policy_version 93804 (0.0031) [2024-06-12 22:21:15,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49651.8). Total num frames: 1536966656. Throughput: 0: 49654.2. Samples: 1065848540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 22:21:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:21:16,762][71000] Updated weights for policy 0, policy_version 93814 (0.0023) [2024-06-12 22:21:20,500][71000] Updated weights for policy 0, policy_version 93824 (0.0025) [2024-06-12 22:21:20,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49971.2, 300 sec: 49651.8). Total num frames: 1537228800. Throughput: 0: 49478.2. Samples: 1065993180. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 22:21:20,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:21:23,587][71000] Updated weights for policy 0, policy_version 93834 (0.0031) [2024-06-12 22:21:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.2, 300 sec: 49651.9). Total num frames: 1537458176. Throughput: 0: 49609.5. Samples: 1066291860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 22:21:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:21:27,273][71000] Updated weights for policy 0, policy_version 93844 (0.0023) [2024-06-12 22:21:30,587][71000] Updated weights for policy 0, policy_version 93854 (0.0032) [2024-06-12 22:21:30,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.1, 300 sec: 49707.4). Total num frames: 1537720320. Throughput: 0: 49502.7. Samples: 1066581800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 22:21:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:21:33,790][71000] Updated weights for policy 0, policy_version 93864 (0.0029) [2024-06-12 22:21:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.1, 300 sec: 49540.8). Total num frames: 1537933312. Throughput: 0: 49518.4. Samples: 1066736480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 22:21:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:21:36,937][71000] Updated weights for policy 0, policy_version 93874 (0.0024) [2024-06-12 22:21:40,449][71000] Updated weights for policy 0, policy_version 93884 (0.0027) [2024-06-12 22:21:40,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49971.3, 300 sec: 49707.4). Total num frames: 1538228224. Throughput: 0: 49643.5. Samples: 1067039800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 22:21:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:21:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000093886_1538228224.pth... 
[2024-06-12 22:21:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000093159_1526317056.pth [2024-06-12 22:21:43,497][71000] Updated weights for policy 0, policy_version 93894 (0.0034) [2024-06-12 22:21:45,939][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.1, 300 sec: 49651.9). Total num frames: 1538457600. Throughput: 0: 49497.0. Samples: 1067330560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 22:21:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:21:46,955][71000] Updated weights for policy 0, policy_version 93904 (0.0024) [2024-06-12 22:21:50,130][71000] Updated weights for policy 0, policy_version 93914 (0.0031) [2024-06-12 22:21:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49651.8). Total num frames: 1538719744. Throughput: 0: 49700.1. Samples: 1067484280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 22:21:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:21:53,730][71000] Updated weights for policy 0, policy_version 93924 (0.0025) [2024-06-12 22:21:55,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49425.0, 300 sec: 49596.6). Total num frames: 1538949120. Throughput: 0: 49446.0. Samples: 1067774620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 22:21:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:21:56,965][71000] Updated weights for policy 0, policy_version 93934 (0.0035) [2024-06-12 22:22:00,244][71000] Updated weights for policy 0, policy_version 93944 (0.0026) [2024-06-12 22:22:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1539211264. Throughput: 0: 49658.2. Samples: 1068083160. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 22:22:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:22:03,266][71000] Updated weights for policy 0, policy_version 93954 (0.0022) [2024-06-12 22:22:03,972][70980] Signal inference workers to stop experience collection... (15650 times) [2024-06-12 22:22:04,021][71000] InferenceWorker_p0-w0: stopping experience collection (15650 times) [2024-06-12 22:22:04,081][70980] Signal inference workers to resume experience collection... (15650 times) [2024-06-12 22:22:04,081][71000] InferenceWorker_p0-w0: resuming experience collection (15650 times) [2024-06-12 22:22:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1539440640. Throughput: 0: 49640.5. Samples: 1068227000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 22:22:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 22:22:06,889][71000] Updated weights for policy 0, policy_version 93964 (0.0021) [2024-06-12 22:22:10,183][71000] Updated weights for policy 0, policy_version 93974 (0.0033) [2024-06-12 22:22:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49762.9). Total num frames: 1539719168. Throughput: 0: 49664.4. Samples: 1068526760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 22:22:10,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:22:13,495][71000] Updated weights for policy 0, policy_version 93984 (0.0025) [2024-06-12 22:22:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1539932160. Throughput: 0: 49842.6. Samples: 1068824720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 22:22:15,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 22:22:16,709][71000] Updated weights for policy 0, policy_version 93994 (0.0030) [2024-06-12 22:22:20,030][71000] Updated weights for policy 0, policy_version 94004 (0.0026) [2024-06-12 22:22:20,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49425.0, 300 sec: 49651.8). Total num frames: 1540194304. Throughput: 0: 49470.5. Samples: 1068962660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 22:22:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:22:23,429][71000] Updated weights for policy 0, policy_version 94014 (0.0027) [2024-06-12 22:22:25,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1540440064. Throughput: 0: 49297.8. Samples: 1069258200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 22:22:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:22:26,935][71000] Updated weights for policy 0, policy_version 94024 (0.0026) [2024-06-12 22:22:29,879][71000] Updated weights for policy 0, policy_version 94034 (0.0027) [2024-06-12 22:22:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.0, 300 sec: 49651.8). Total num frames: 1540702208. Throughput: 0: 49455.8. Samples: 1069556080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 22:22:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:22:33,747][71000] Updated weights for policy 0, policy_version 94044 (0.0038) [2024-06-12 22:22:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49971.1, 300 sec: 49540.8). Total num frames: 1540931584. Throughput: 0: 49339.1. Samples: 1069704540. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 22:22:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:22:36,681][71000] Updated weights for policy 0, policy_version 94054 (0.0030) [2024-06-12 22:22:40,285][71000] Updated weights for policy 0, policy_version 94064 (0.0033) [2024-06-12 22:22:40,940][70768] Fps is (10 sec: 45875.9, 60 sec: 48879.0, 300 sec: 49540.8). Total num frames: 1541160960. Throughput: 0: 49381.5. Samples: 1069996780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-12 22:22:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:22:43,235][71000] Updated weights for policy 0, policy_version 94074 (0.0029) [2024-06-12 22:22:45,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1541423104. Throughput: 0: 49348.5. Samples: 1070303840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 22:22:45,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:22:46,686][71000] Updated weights for policy 0, policy_version 94084 (0.0023) [2024-06-12 22:22:49,899][71000] Updated weights for policy 0, policy_version 94094 (0.0025) [2024-06-12 22:22:50,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49698.1, 300 sec: 49651.9). Total num frames: 1541701632. Throughput: 0: 49503.6. Samples: 1070454660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 22:22:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:22:53,195][71000] Updated weights for policy 0, policy_version 94104 (0.0025) [2024-06-12 22:22:55,940][70768] Fps is (10 sec: 52427.6, 60 sec: 49971.1, 300 sec: 49651.8). Total num frames: 1541947392. Throughput: 0: 49714.0. Samples: 1070763900. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 22:22:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:22:56,310][71000] Updated weights for policy 0, policy_version 94114 (0.0028) [2024-06-12 22:22:59,500][71000] Updated weights for policy 0, policy_version 94124 (0.0029) [2024-06-12 22:23:00,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49424.9, 300 sec: 49596.3). Total num frames: 1542176768. Throughput: 0: 49926.1. Samples: 1071071400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 22:23:00,941][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:23:01,928][70980] Signal inference workers to stop experience collection... (15700 times) [2024-06-12 22:23:01,980][71000] InferenceWorker_p0-w0: stopping experience collection (15700 times) [2024-06-12 22:23:01,981][70980] Signal inference workers to resume experience collection... (15700 times) [2024-06-12 22:23:01,995][71000] InferenceWorker_p0-w0: resuming experience collection (15700 times) [2024-06-12 22:23:02,979][71000] Updated weights for policy 0, policy_version 94134 (0.0038) [2024-06-12 22:23:05,940][70768] Fps is (10 sec: 49153.0, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1542438912. Throughput: 0: 49905.1. Samples: 1071208380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 22:23:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:23:06,296][71000] Updated weights for policy 0, policy_version 94144 (0.0031) [2024-06-12 22:23:09,431][71000] Updated weights for policy 0, policy_version 94154 (0.0026) [2024-06-12 22:23:10,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49698.1, 300 sec: 49707.4). Total num frames: 1542701056. Throughput: 0: 50062.2. Samples: 1071511000. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 22:23:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:23:12,550][71000] Updated weights for policy 0, policy_version 94164 (0.0036) [2024-06-12 22:23:15,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1542930432. Throughput: 0: 50006.9. Samples: 1071806380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 22:23:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:23:15,984][71000] Updated weights for policy 0, policy_version 94174 (0.0028) [2024-06-12 22:23:19,271][71000] Updated weights for policy 0, policy_version 94184 (0.0022) [2024-06-12 22:23:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49971.2, 300 sec: 49651.8). Total num frames: 1543192576. Throughput: 0: 49982.1. Samples: 1071953740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 22:23:20,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 22:23:22,557][71000] Updated weights for policy 0, policy_version 94194 (0.0028) [2024-06-12 22:23:25,905][71000] Updated weights for policy 0, policy_version 94204 (0.0042) [2024-06-12 22:23:25,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1543438336. Throughput: 0: 50163.1. Samples: 1072254120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 22:23:25,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:23:28,915][71000] Updated weights for policy 0, policy_version 94214 (0.0026) [2024-06-12 22:23:30,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1543684096. Throughput: 0: 50074.1. Samples: 1072557180. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 22:23:30,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:23:32,317][71000] Updated weights for policy 0, policy_version 94224 (0.0026) [2024-06-12 22:23:35,311][71000] Updated weights for policy 0, policy_version 94234 (0.0027) [2024-06-12 22:23:35,940][70768] Fps is (10 sec: 50789.7, 60 sec: 50244.2, 300 sec: 49652.5). Total num frames: 1543946240. Throughput: 0: 50195.9. Samples: 1072713480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 22:23:35,949][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:23:39,022][71000] Updated weights for policy 0, policy_version 94244 (0.0032) [2024-06-12 22:23:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 49651.8). Total num frames: 1544175616. Throughput: 0: 49818.4. Samples: 1073005720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-12 22:23:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:23:40,958][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000094250_1544192000.pth... [2024-06-12 22:23:40,997][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000093522_1532264448.pth [2024-06-12 22:23:42,434][71000] Updated weights for policy 0, policy_version 94254 (0.0027) [2024-06-12 22:23:45,520][71000] Updated weights for policy 0, policy_version 94264 (0.0028) [2024-06-12 22:23:45,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1544421376. Throughput: 0: 49486.0. Samples: 1073298260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:23:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 22:23:48,769][71000] Updated weights for policy 0, policy_version 94274 (0.0029) [2024-06-12 22:23:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1544667136. Throughput: 0: 49785.3. Samples: 1073448720. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:23:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:23:52,204][71000] Updated weights for policy 0, policy_version 94284 (0.0029) [2024-06-12 22:23:55,228][71000] Updated weights for policy 0, policy_version 94294 (0.0031) [2024-06-12 22:23:55,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.3, 300 sec: 49596.3). Total num frames: 1544929280. Throughput: 0: 49652.6. Samples: 1073745360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:23:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:23:59,141][71000] Updated weights for policy 0, policy_version 94304 (0.0031) [2024-06-12 22:24:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49971.3, 300 sec: 49651.8). Total num frames: 1545175040. Throughput: 0: 49726.5. Samples: 1074044080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:24:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:24:01,187][70980] Signal inference workers to stop experience collection... (15750 times) [2024-06-12 22:24:01,235][71000] InferenceWorker_p0-w0: stopping experience collection (15750 times) [2024-06-12 22:24:01,239][70980] Signal inference workers to resume experience collection... (15750 times) [2024-06-12 22:24:01,249][71000] InferenceWorker_p0-w0: resuming experience collection (15750 times) [2024-06-12 22:24:02,152][71000] Updated weights for policy 0, policy_version 94314 (0.0037) [2024-06-12 22:24:05,763][71000] Updated weights for policy 0, policy_version 94324 (0.0034) [2024-06-12 22:24:05,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1545404416. Throughput: 0: 49567.6. Samples: 1074184280. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:24:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:24:08,794][71000] Updated weights for policy 0, policy_version 94334 (0.0025) [2024-06-12 22:24:10,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1545666560. Throughput: 0: 49514.3. Samples: 1074482260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:24:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:24:12,109][71000] Updated weights for policy 0, policy_version 94344 (0.0032) [2024-06-12 22:24:15,331][71000] Updated weights for policy 0, policy_version 94354 (0.0025) [2024-06-12 22:24:15,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1545895936. Throughput: 0: 49423.3. Samples: 1074781220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:24:15,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:24:19,024][71000] Updated weights for policy 0, policy_version 94364 (0.0027) [2024-06-12 22:24:20,941][70768] Fps is (10 sec: 47507.9, 60 sec: 49151.2, 300 sec: 49540.6). Total num frames: 1546141696. Throughput: 0: 49385.1. Samples: 1074935860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:24:20,941][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:24:21,891][71000] Updated weights for policy 0, policy_version 94374 (0.0033) [2024-06-12 22:24:25,445][71000] Updated weights for policy 0, policy_version 94384 (0.0038) [2024-06-12 22:24:25,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1546403840. Throughput: 0: 49269.8. Samples: 1075222860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:24:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:24:28,898][71000] Updated weights for policy 0, policy_version 94394 (0.0025) [2024-06-12 22:24:30,940][70768] Fps is (10 sec: 50795.9, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1546649600. Throughput: 0: 49352.7. Samples: 1075519140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:24:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:24:31,962][71000] Updated weights for policy 0, policy_version 94404 (0.0026) [2024-06-12 22:24:35,326][71000] Updated weights for policy 0, policy_version 94414 (0.0033) [2024-06-12 22:24:35,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.2, 300 sec: 49596.3). Total num frames: 1546895360. Throughput: 0: 49400.1. Samples: 1075671720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:24:35,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 22:24:38,602][71000] Updated weights for policy 0, policy_version 94424 (0.0031) [2024-06-12 22:24:40,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.2, 300 sec: 49651.8). Total num frames: 1547157504. Throughput: 0: 49424.9. Samples: 1075969480. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 22:24:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:24:41,813][71000] Updated weights for policy 0, policy_version 94434 (0.0025) [2024-06-12 22:24:45,392][71000] Updated weights for policy 0, policy_version 94444 (0.0027) [2024-06-12 22:24:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1547403264. Throughput: 0: 49404.5. Samples: 1076267280. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 22:24:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:24:48,794][71000] Updated weights for policy 0, policy_version 94454 (0.0041) [2024-06-12 22:24:50,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1547632640. Throughput: 0: 49506.1. Samples: 1076412060. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 22:24:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:24:51,970][71000] Updated weights for policy 0, policy_version 94464 (0.0032) [2024-06-12 22:24:55,421][71000] Updated weights for policy 0, policy_version 94474 (0.0025) [2024-06-12 22:24:55,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1547894784. Throughput: 0: 49645.6. Samples: 1076716320. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 22:24:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:24:58,402][71000] Updated weights for policy 0, policy_version 94484 (0.0023) [2024-06-12 22:25:00,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1548124160. Throughput: 0: 49647.8. Samples: 1077015380. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 22:25:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:25:02,013][71000] Updated weights for policy 0, policy_version 94494 (0.0031) [2024-06-12 22:25:05,266][71000] Updated weights for policy 0, policy_version 94504 (0.0036) [2024-06-12 22:25:05,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1548386304. Throughput: 0: 49355.6. Samples: 1077156800. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 22:25:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:25:08,558][71000] Updated weights for policy 0, policy_version 94514 (0.0033) [2024-06-12 22:25:10,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1548615680. Throughput: 0: 49512.5. Samples: 1077450920. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 22:25:10,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:25:11,522][70980] Signal inference workers to stop experience collection... (15800 times) [2024-06-12 22:25:11,566][71000] InferenceWorker_p0-w0: stopping experience collection (15800 times) [2024-06-12 22:25:11,574][70980] Signal inference workers to resume experience collection... (15800 times) [2024-06-12 22:25:11,586][71000] InferenceWorker_p0-w0: resuming experience collection (15800 times) [2024-06-12 22:25:11,706][71000] Updated weights for policy 0, policy_version 94524 (0.0029) [2024-06-12 22:25:15,201][71000] Updated weights for policy 0, policy_version 94534 (0.0019) [2024-06-12 22:25:15,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49424.9, 300 sec: 49596.3). Total num frames: 1548861440. Throughput: 0: 49470.2. Samples: 1077745300. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 22:25:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:25:18,543][71000] Updated weights for policy 0, policy_version 94544 (0.0033) [2024-06-12 22:25:20,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49426.1, 300 sec: 49540.8). Total num frames: 1549107200. Throughput: 0: 49375.1. Samples: 1077893600. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 22:25:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:25:21,898][71000] Updated weights for policy 0, policy_version 94554 (0.0029) [2024-06-12 22:25:24,906][71000] Updated weights for policy 0, policy_version 94564 (0.0032) [2024-06-12 22:25:25,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1549385728. Throughput: 0: 49451.6. Samples: 1078194800. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 22:25:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 22:25:28,417][71000] Updated weights for policy 0, policy_version 94574 (0.0026) [2024-06-12 22:25:30,940][70768] Fps is (10 sec: 50789.1, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1549615104. Throughput: 0: 49409.1. Samples: 1078490700. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 22:25:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:25:31,802][71000] Updated weights for policy 0, policy_version 94584 (0.0034) [2024-06-12 22:25:34,970][71000] Updated weights for policy 0, policy_version 94594 (0.0029) [2024-06-12 22:25:35,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49151.9, 300 sec: 49540.8). Total num frames: 1549844480. Throughput: 0: 49486.9. Samples: 1078638960. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-12 22:25:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:25:38,227][71000] Updated weights for policy 0, policy_version 94604 (0.0030) [2024-06-12 22:25:40,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 1550090240. Throughput: 0: 49102.3. Samples: 1078925920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 22:25:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:25:40,958][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000094611_1550106624.pth... 
[2024-06-12 22:25:41,006][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000093886_1538228224.pth [2024-06-12 22:25:41,712][71000] Updated weights for policy 0, policy_version 94614 (0.0040) [2024-06-12 22:25:44,806][71000] Updated weights for policy 0, policy_version 94624 (0.0026) [2024-06-12 22:25:45,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1550368768. Throughput: 0: 49019.6. Samples: 1079221260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 22:25:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 22:25:48,227][71000] Updated weights for policy 0, policy_version 94634 (0.0025) [2024-06-12 22:25:50,939][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.3, 300 sec: 49596.3). Total num frames: 1550614528. Throughput: 0: 49329.8. Samples: 1079376640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 22:25:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:25:51,692][71000] Updated weights for policy 0, policy_version 94644 (0.0028) [2024-06-12 22:25:54,582][71000] Updated weights for policy 0, policy_version 94654 (0.0033) [2024-06-12 22:25:55,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48879.1, 300 sec: 49540.8). Total num frames: 1550827520. Throughput: 0: 49407.1. Samples: 1079674240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 22:25:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:25:58,099][71000] Updated weights for policy 0, policy_version 94664 (0.0030) [2024-06-12 22:26:00,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1551089664. Throughput: 0: 49513.1. Samples: 1079973380. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 22:26:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:26:01,450][71000] Updated weights for policy 0, policy_version 94674 (0.0026) [2024-06-12 22:26:04,502][71000] Updated weights for policy 0, policy_version 94684 (0.0024) [2024-06-12 22:26:05,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1551368192. Throughput: 0: 49617.7. Samples: 1080126400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 22:26:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 22:26:05,940][70980] Saving new best policy, reward=0.285! [2024-06-12 22:26:08,070][71000] Updated weights for policy 0, policy_version 94694 (0.0025) [2024-06-12 22:26:10,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49971.1, 300 sec: 49651.8). Total num frames: 1551613952. Throughput: 0: 49812.7. Samples: 1080436380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 22:26:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:26:10,952][71000] Updated weights for policy 0, policy_version 94704 (0.0031) [2024-06-12 22:26:12,444][70980] Signal inference workers to stop experience collection... (15850 times) [2024-06-12 22:26:12,444][70980] Signal inference workers to resume experience collection... (15850 times) [2024-06-12 22:26:12,461][71000] InferenceWorker_p0-w0: stopping experience collection (15850 times) [2024-06-12 22:26:12,461][71000] InferenceWorker_p0-w0: resuming experience collection (15850 times) [2024-06-12 22:26:14,431][71000] Updated weights for policy 0, policy_version 94714 (0.0023) [2024-06-12 22:26:15,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1551843328. Throughput: 0: 49784.2. Samples: 1080730980. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 22:26:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:26:17,591][71000] Updated weights for policy 0, policy_version 94724 (0.0026) [2024-06-12 22:26:20,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49971.2, 300 sec: 49651.9). Total num frames: 1552105472. Throughput: 0: 49721.0. Samples: 1080876400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 22:26:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:26:20,967][71000] Updated weights for policy 0, policy_version 94734 (0.0028) [2024-06-12 22:26:24,171][71000] Updated weights for policy 0, policy_version 94744 (0.0028) [2024-06-12 22:26:25,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1552351232. Throughput: 0: 49728.0. Samples: 1081163680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 22:26:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:26:27,779][71000] Updated weights for policy 0, policy_version 94754 (0.0040) [2024-06-12 22:26:30,714][71000] Updated weights for policy 0, policy_version 94764 (0.0026) [2024-06-12 22:26:30,940][70768] Fps is (10 sec: 52428.5, 60 sec: 50244.4, 300 sec: 49818.5). Total num frames: 1552629760. Throughput: 0: 49884.0. Samples: 1081466040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-12 22:26:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:26:34,338][71000] Updated weights for policy 0, policy_version 94774 (0.0034) [2024-06-12 22:26:35,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1552826368. Throughput: 0: 49945.1. Samples: 1081624180. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 22:26:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:26:37,326][71000] Updated weights for policy 0, policy_version 94784 (0.0031) [2024-06-12 22:26:40,742][71000] Updated weights for policy 0, policy_version 94794 (0.0033) [2024-06-12 22:26:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 49651.8). Total num frames: 1553104896. Throughput: 0: 49840.4. Samples: 1081917060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 22:26:40,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:26:43,886][71000] Updated weights for policy 0, policy_version 94804 (0.0037) [2024-06-12 22:26:45,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1553334272. Throughput: 0: 49727.8. Samples: 1082211140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 22:26:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:26:47,816][71000] Updated weights for policy 0, policy_version 94814 (0.0029) [2024-06-12 22:26:50,574][71000] Updated weights for policy 0, policy_version 94824 (0.0032) [2024-06-12 22:26:50,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49698.1, 300 sec: 49651.9). Total num frames: 1553596416. Throughput: 0: 49753.8. Samples: 1082365320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 22:26:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:26:54,524][71000] Updated weights for policy 0, policy_version 94834 (0.0029) [2024-06-12 22:26:55,940][70768] Fps is (10 sec: 50790.9, 60 sec: 50244.2, 300 sec: 49596.3). Total num frames: 1553842176. Throughput: 0: 49512.1. Samples: 1082664420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 22:26:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:26:57,029][71000] Updated weights for policy 0, policy_version 94844 (0.0027) [2024-06-12 22:27:00,940][70768] Fps is (10 sec: 47512.5, 60 sec: 49697.9, 300 sec: 49596.3). 
Total num frames: 1554071552. Throughput: 0: 49464.2. Samples: 1082956880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 22:27:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:27:01,272][71000] Updated weights for policy 0, policy_version 94854 (0.0029) [2024-06-12 22:27:03,716][71000] Updated weights for policy 0, policy_version 94864 (0.0032) [2024-06-12 22:27:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1554333696. Throughput: 0: 49560.4. Samples: 1083106620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 22:27:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:27:07,574][71000] Updated weights for policy 0, policy_version 94874 (0.0030) [2024-06-12 22:27:09,806][70980] Signal inference workers to stop experience collection... (15900 times) [2024-06-12 22:27:09,808][70980] Signal inference workers to resume experience collection... (15900 times) [2024-06-12 22:27:09,850][71000] InferenceWorker_p0-w0: stopping experience collection (15900 times) [2024-06-12 22:27:09,850][71000] InferenceWorker_p0-w0: resuming experience collection (15900 times) [2024-06-12 22:27:10,612][71000] Updated weights for policy 0, policy_version 94884 (0.0024) [2024-06-12 22:27:10,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 1554595840. Throughput: 0: 49762.6. Samples: 1083403000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 22:27:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:27:14,473][71000] Updated weights for policy 0, policy_version 94894 (0.0033) [2024-06-12 22:27:15,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49971.1, 300 sec: 49651.9). Total num frames: 1554841600. Throughput: 0: 49765.7. Samples: 1083705500. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 22:27:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:27:17,049][71000] Updated weights for policy 0, policy_version 94904 (0.0029) [2024-06-12 22:27:20,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1555054592. Throughput: 0: 49467.7. Samples: 1083850220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 22:27:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:27:20,998][71000] Updated weights for policy 0, policy_version 94914 (0.0035) [2024-06-12 22:27:23,393][71000] Updated weights for policy 0, policy_version 94924 (0.0030) [2024-06-12 22:27:25,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49971.2, 300 sec: 49651.9). Total num frames: 1555349504. Throughput: 0: 49707.6. Samples: 1084153900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 22:27:25,949][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 22:27:27,519][71000] Updated weights for policy 0, policy_version 94934 (0.0025) [2024-06-12 22:27:30,206][71000] Updated weights for policy 0, policy_version 94944 (0.0028) [2024-06-12 22:27:30,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 49651.9). Total num frames: 1555578880. Throughput: 0: 49631.2. Samples: 1084444540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 22:27:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:27:33,997][71000] Updated weights for policy 0, policy_version 94954 (0.0037) [2024-06-12 22:27:35,939][70768] Fps is (10 sec: 49152.4, 60 sec: 50244.5, 300 sec: 49762.9). Total num frames: 1555841024. Throughput: 0: 49837.8. Samples: 1084608020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 22:27:35,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:27:36,866][71000] Updated weights for policy 0, policy_version 94964 (0.0023) [2024-06-12 22:27:40,575][71000] Updated weights for policy 0, policy_version 94974 (0.0025) [2024-06-12 22:27:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 1556054016. Throughput: 0: 49671.6. Samples: 1084899640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 22:27:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:27:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000094974_1556054016.pth... [2024-06-12 22:27:41,010][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000094250_1544192000.pth [2024-06-12 22:27:43,253][71000] Updated weights for policy 0, policy_version 94984 (0.0022) [2024-06-12 22:27:45,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1556316160. Throughput: 0: 49593.5. Samples: 1085188580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 22:27:45,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:27:47,407][71000] Updated weights for policy 0, policy_version 94994 (0.0028) [2024-06-12 22:27:50,155][71000] Updated weights for policy 0, policy_version 95004 (0.0029) [2024-06-12 22:27:50,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1556578304. Throughput: 0: 49717.8. Samples: 1085343920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 22:27:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:27:53,923][71000] Updated weights for policy 0, policy_version 95014 (0.0034) [2024-06-12 22:27:55,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49651.9). Total num frames: 1556824064. Throughput: 0: 49711.5. Samples: 1085640020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 22:27:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:27:56,997][71000] Updated weights for policy 0, policy_version 95024 (0.0024) [2024-06-12 22:28:00,844][71000] Updated weights for policy 0, policy_version 95034 (0.0030) [2024-06-12 22:28:00,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1557037056. Throughput: 0: 49453.0. Samples: 1085930880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 22:28:00,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 22:28:03,727][71000] Updated weights for policy 0, policy_version 95044 (0.0030) [2024-06-12 22:28:05,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1557299200. Throughput: 0: 49309.4. Samples: 1086069140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 22:28:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:28:07,722][71000] Updated weights for policy 0, policy_version 95054 (0.0028) [2024-06-12 22:28:10,217][71000] Updated weights for policy 0, policy_version 95064 (0.0028) [2024-06-12 22:28:10,940][70768] Fps is (10 sec: 54067.5, 60 sec: 49698.2, 300 sec: 49651.8). Total num frames: 1557577728. Throughput: 0: 49107.6. Samples: 1086363740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 22:28:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:28:14,127][71000] Updated weights for policy 0, policy_version 95074 (0.0024) [2024-06-12 22:28:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1557807104. Throughput: 0: 49377.3. Samples: 1086666520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 22:28:15,944][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:28:16,796][71000] Updated weights for policy 0, policy_version 95084 (0.0024) [2024-06-12 22:28:20,508][71000] Updated weights for policy 0, policy_version 95094 (0.0036) [2024-06-12 22:28:20,940][70768] Fps is (10 sec: 44236.0, 60 sec: 49424.9, 300 sec: 49429.7). Total num frames: 1558020096. Throughput: 0: 49078.4. Samples: 1086816560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 22:28:20,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 22:28:23,104][71000] Updated weights for policy 0, policy_version 95104 (0.0033) [2024-06-12 22:28:24,647][70980] Signal inference workers to stop experience collection... (15950 times) [2024-06-12 22:28:24,649][70980] Signal inference workers to resume experience collection... (15950 times) [2024-06-12 22:28:24,688][71000] InferenceWorker_p0-w0: stopping experience collection (15950 times) [2024-06-12 22:28:24,688][71000] InferenceWorker_p0-w0: resuming experience collection (15950 times) [2024-06-12 22:28:25,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48879.0, 300 sec: 49485.3). Total num frames: 1558282240. Throughput: 0: 49185.9. Samples: 1087113000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-12 22:28:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:28:27,563][71000] Updated weights for policy 0, policy_version 95114 (0.0031) [2024-06-12 22:28:30,006][71000] Updated weights for policy 0, policy_version 95124 (0.0023) [2024-06-12 22:28:30,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49425.0, 300 sec: 49485.3). Total num frames: 1558544384. Throughput: 0: 49298.7. Samples: 1087407020. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 22:28:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:28:34,065][71000] Updated weights for policy 0, policy_version 95134 (0.0029) [2024-06-12 22:28:35,939][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1558790144. Throughput: 0: 49516.9. Samples: 1087572180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 22:28:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:28:36,785][71000] Updated weights for policy 0, policy_version 95144 (0.0035) [2024-06-12 22:28:40,510][71000] Updated weights for policy 0, policy_version 95154 (0.0028) [2024-06-12 22:28:40,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1559003136. Throughput: 0: 49258.3. Samples: 1087856640. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 22:28:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:28:43,392][71000] Updated weights for policy 0, policy_version 95164 (0.0027) [2024-06-12 22:28:45,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1559265280. Throughput: 0: 49186.7. Samples: 1088144280. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 22:28:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:28:47,645][71000] Updated weights for policy 0, policy_version 95174 (0.0027) [2024-06-12 22:28:49,715][71000] Updated weights for policy 0, policy_version 95184 (0.0029) [2024-06-12 22:28:50,940][70768] Fps is (10 sec: 54066.3, 60 sec: 49424.9, 300 sec: 49540.7). Total num frames: 1559543808. Throughput: 0: 49563.7. Samples: 1088299520. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 22:28:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:28:53,991][71000] Updated weights for policy 0, policy_version 95194 (0.0024) [2024-06-12 22:28:55,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1559789568. Throughput: 0: 49854.7. Samples: 1088607200. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 22:28:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:28:56,477][71000] Updated weights for policy 0, policy_version 95204 (0.0027) [2024-06-12 22:29:00,480][71000] Updated weights for policy 0, policy_version 95214 (0.0026) [2024-06-12 22:29:00,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1560002560. Throughput: 0: 49715.0. Samples: 1088903700. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 22:29:00,949][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:29:03,307][71000] Updated weights for policy 0, policy_version 95224 (0.0033) [2024-06-12 22:29:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1560264704. Throughput: 0: 49347.3. Samples: 1089037180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 22:29:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:29:07,290][71000] Updated weights for policy 0, policy_version 95234 (0.0030) [2024-06-12 22:29:09,675][71000] Updated weights for policy 0, policy_version 95244 (0.0030) [2024-06-12 22:29:10,939][70768] Fps is (10 sec: 54068.2, 60 sec: 49425.1, 300 sec: 49651.8). Total num frames: 1560543232. Throughput: 0: 49327.5. Samples: 1089332740. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 22:29:10,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:29:13,770][71000] Updated weights for policy 0, policy_version 95254 (0.0024) [2024-06-12 22:29:15,583][70980] Signal inference workers to stop experience collection... 
(16000 times) [2024-06-12 22:29:15,586][70980] Signal inference workers to resume experience collection... (16000 times) [2024-06-12 22:29:15,592][71000] InferenceWorker_p0-w0: stopping experience collection (16000 times) [2024-06-12 22:29:15,624][71000] InferenceWorker_p0-w0: resuming experience collection (16000 times) [2024-06-12 22:29:15,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.1, 300 sec: 49596.5). Total num frames: 1560772608. Throughput: 0: 49516.0. Samples: 1089635240. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 22:29:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:29:16,406][71000] Updated weights for policy 0, policy_version 95264 (0.0028) [2024-06-12 22:29:20,387][71000] Updated weights for policy 0, policy_version 95274 (0.0033) [2024-06-12 22:29:20,940][70768] Fps is (10 sec: 44236.7, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1560985600. Throughput: 0: 48919.5. Samples: 1089773560. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 22:29:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:29:23,418][71000] Updated weights for policy 0, policy_version 95284 (0.0031) [2024-06-12 22:29:25,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1561247744. Throughput: 0: 49139.4. Samples: 1090067920. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-12 22:29:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:29:27,175][71000] Updated weights for policy 0, policy_version 95294 (0.0026) [2024-06-12 22:29:29,655][71000] Updated weights for policy 0, policy_version 95304 (0.0034) [2024-06-12 22:29:30,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1561509888. Throughput: 0: 49506.2. Samples: 1090372060. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:29:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:29:33,558][71000] Updated weights for policy 0, policy_version 95314 (0.0031) [2024-06-12 22:29:35,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1561772032. Throughput: 0: 49685.0. Samples: 1090535340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:29:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:29:36,106][71000] Updated weights for policy 0, policy_version 95324 (0.0029) [2024-06-12 22:29:40,095][71000] Updated weights for policy 0, policy_version 95334 (0.0029) [2024-06-12 22:29:40,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 1562001408. Throughput: 0: 49314.5. Samples: 1090826360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:29:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:29:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000095337_1562001408.pth... [2024-06-12 22:29:40,989][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000094611_1550106624.pth [2024-06-12 22:29:43,047][71000] Updated weights for policy 0, policy_version 95344 (0.0030) [2024-06-12 22:29:45,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1562230784. Throughput: 0: 49137.1. Samples: 1091114860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:29:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:29:46,617][71000] Updated weights for policy 0, policy_version 95354 (0.0027) [2024-06-12 22:29:49,696][71000] Updated weights for policy 0, policy_version 95364 (0.0028) [2024-06-12 22:29:50,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1562509312. Throughput: 0: 49408.9. Samples: 1091260580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:29:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:29:53,361][71000] Updated weights for policy 0, policy_version 95374 (0.0028) [2024-06-12 22:29:55,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1562738688. Throughput: 0: 49612.9. Samples: 1091565320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:29:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:29:56,204][71000] Updated weights for policy 0, policy_version 95384 (0.0023) [2024-06-12 22:29:59,809][71000] Updated weights for policy 0, policy_version 95394 (0.0026) [2024-06-12 22:30:00,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1562968064. Throughput: 0: 49392.1. Samples: 1091857880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:30:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:30:02,650][71000] Updated weights for policy 0, policy_version 95404 (0.0027) [2024-06-12 22:30:05,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1563230208. Throughput: 0: 49400.9. Samples: 1091996600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:30:05,949][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:30:06,310][71000] Updated weights for policy 0, policy_version 95414 (0.0031) [2024-06-12 22:30:09,363][71000] Updated weights for policy 0, policy_version 95424 (0.0030) [2024-06-12 22:30:10,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 1563492352. Throughput: 0: 49643.3. Samples: 1092301860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:30:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:30:12,766][71000] Updated weights for policy 0, policy_version 95434 (0.0030) [2024-06-12 22:30:15,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1563738112. Throughput: 0: 49675.4. Samples: 1092607460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:30:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:30:16,002][71000] Updated weights for policy 0, policy_version 95444 (0.0022) [2024-06-12 22:30:19,310][70980] Signal inference workers to stop experience collection... (16050 times) [2024-06-12 22:30:19,313][70980] Signal inference workers to resume experience collection... (16050 times) [2024-06-12 22:30:19,328][71000] InferenceWorker_p0-w0: stopping experience collection (16050 times) [2024-06-12 22:30:19,328][71000] InferenceWorker_p0-w0: resuming experience collection (16050 times) [2024-06-12 22:30:19,444][71000] Updated weights for policy 0, policy_version 95454 (0.0038) [2024-06-12 22:30:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1563983872. Throughput: 0: 49404.9. Samples: 1092758560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:30:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:30:22,442][71000] Updated weights for policy 0, policy_version 95464 (0.0030) [2024-06-12 22:30:25,792][71000] Updated weights for policy 0, policy_version 95474 (0.0033) [2024-06-12 22:30:25,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1564246016. Throughput: 0: 49766.8. Samples: 1093065860. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 22:30:25,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 22:30:28,909][71000] Updated weights for policy 0, policy_version 95484 (0.0028) [2024-06-12 22:30:30,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49651.9). Total num frames: 1564491776. Throughput: 0: 49844.0. Samples: 1093357840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 22:30:30,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:30:32,158][71000] Updated weights for policy 0, policy_version 95494 (0.0025) [2024-06-12 22:30:35,599][71000] Updated weights for policy 0, policy_version 95504 (0.0029) [2024-06-12 22:30:35,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49651.9). Total num frames: 1564737536. Throughput: 0: 50014.8. Samples: 1093511240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 22:30:35,940][70768] Avg episode reward: [(0, '0.256')] [2024-06-12 22:30:38,978][71000] Updated weights for policy 0, policy_version 95514 (0.0028) [2024-06-12 22:30:40,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1564983296. Throughput: 0: 49806.0. Samples: 1093806600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 22:30:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:30:41,878][71000] Updated weights for policy 0, policy_version 95524 (0.0032) [2024-06-12 22:30:45,390][71000] Updated weights for policy 0, policy_version 95534 (0.0027) [2024-06-12 22:30:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 50244.3, 300 sec: 49596.3). Total num frames: 1565245440. Throughput: 0: 50066.6. Samples: 1094110880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 22:30:45,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 22:30:48,662][71000] Updated weights for policy 0, policy_version 95544 (0.0028) [2024-06-12 22:30:50,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 49651.8). 
Total num frames: 1565474816. Throughput: 0: 50275.8. Samples: 1094259020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 22:30:50,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:30:51,997][71000] Updated weights for policy 0, policy_version 95554 (0.0038) [2024-06-12 22:30:55,229][71000] Updated weights for policy 0, policy_version 95564 (0.0023) [2024-06-12 22:30:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49971.1, 300 sec: 49651.8). Total num frames: 1565736960. Throughput: 0: 50058.6. Samples: 1094554500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 22:30:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:30:58,592][71000] Updated weights for policy 0, policy_version 95574 (0.0031) [2024-06-12 22:31:00,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1565966336. Throughput: 0: 49849.1. Samples: 1094850660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 22:31:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 22:31:02,027][71000] Updated weights for policy 0, policy_version 95584 (0.0025) [2024-06-12 22:31:05,264][71000] Updated weights for policy 0, policy_version 95594 (0.0025) [2024-06-12 22:31:05,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1566228480. Throughput: 0: 49730.3. Samples: 1094996420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 22:31:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:31:08,607][71000] Updated weights for policy 0, policy_version 95604 (0.0030) [2024-06-12 22:31:10,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49424.9, 300 sec: 49540.7). Total num frames: 1566457856. Throughput: 0: 49564.3. Samples: 1095296260. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 22:31:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:31:11,920][71000] Updated weights for policy 0, policy_version 95614 (0.0022) [2024-06-12 22:31:15,373][71000] Updated weights for policy 0, policy_version 95624 (0.0029) [2024-06-12 22:31:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1566720000. Throughput: 0: 49719.1. Samples: 1095595200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 22:31:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:31:18,386][71000] Updated weights for policy 0, policy_version 95634 (0.0024) [2024-06-12 22:31:20,755][70980] Signal inference workers to stop experience collection... (16100 times) [2024-06-12 22:31:20,757][70980] Signal inference workers to resume experience collection... (16100 times) [2024-06-12 22:31:20,801][71000] InferenceWorker_p0-w0: stopping experience collection (16100 times) [2024-06-12 22:31:20,802][71000] InferenceWorker_p0-w0: resuming experience collection (16100 times) [2024-06-12 22:31:20,939][70768] Fps is (10 sec: 52430.5, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1566982144. Throughput: 0: 49620.5. Samples: 1095744160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-12 22:31:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 22:31:21,750][71000] Updated weights for policy 0, policy_version 95644 (0.0043) [2024-06-12 22:31:25,065][71000] Updated weights for policy 0, policy_version 95654 (0.0024) [2024-06-12 22:31:25,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1567227904. Throughput: 0: 49816.6. Samples: 1096048340. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 22:31:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:31:28,455][71000] Updated weights for policy 0, policy_version 95664 (0.0036) [2024-06-12 22:31:30,939][70768] Fps is (10 sec: 45874.7, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1567440896. Throughput: 0: 49588.5. Samples: 1096342360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 22:31:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:31:31,652][71000] Updated weights for policy 0, policy_version 95674 (0.0030) [2024-06-12 22:31:34,926][71000] Updated weights for policy 0, policy_version 95684 (0.0030) [2024-06-12 22:31:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49698.0, 300 sec: 49540.8). Total num frames: 1567719424. Throughput: 0: 49522.3. Samples: 1096487520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 22:31:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:31:38,162][71000] Updated weights for policy 0, policy_version 95694 (0.0028) [2024-06-12 22:31:40,940][70768] Fps is (10 sec: 54066.8, 60 sec: 49971.3, 300 sec: 49651.9). Total num frames: 1567981568. Throughput: 0: 49762.7. Samples: 1096793820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 22:31:40,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:31:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000095702_1567981568.pth... [2024-06-12 22:31:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000094974_1556054016.pth [2024-06-12 22:31:41,528][71000] Updated weights for policy 0, policy_version 95704 (0.0024) [2024-06-12 22:31:44,438][71000] Updated weights for policy 0, policy_version 95714 (0.0024) [2024-06-12 22:31:45,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49971.1, 300 sec: 49651.8). Total num frames: 1568243712. Throughput: 0: 49964.7. Samples: 1097099080. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 22:31:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:31:47,779][71000] Updated weights for policy 0, policy_version 95724 (0.0031) [2024-06-12 22:31:50,929][71000] Updated weights for policy 0, policy_version 95734 (0.0036) [2024-06-12 22:31:50,939][70768] Fps is (10 sec: 52429.5, 60 sec: 50517.6, 300 sec: 49707.4). Total num frames: 1568505856. Throughput: 0: 50223.6. Samples: 1097256480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 22:31:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:31:54,247][71000] Updated weights for policy 0, policy_version 95744 (0.0026) [2024-06-12 22:31:55,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49424.9, 300 sec: 49596.3). Total num frames: 1568702464. Throughput: 0: 50086.6. Samples: 1097550160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 22:31:55,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:31:57,604][71000] Updated weights for policy 0, policy_version 95754 (0.0036) [2024-06-12 22:32:00,939][70768] Fps is (10 sec: 47513.3, 60 sec: 50244.3, 300 sec: 49651.9). Total num frames: 1568980992. Throughput: 0: 50066.3. Samples: 1097848180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 22:32:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:32:01,130][71000] Updated weights for policy 0, policy_version 95764 (0.0030) [2024-06-12 22:32:04,493][71000] Updated weights for policy 0, policy_version 95774 (0.0026) [2024-06-12 22:32:05,939][70768] Fps is (10 sec: 52430.3, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1569226752. Throughput: 0: 50060.8. Samples: 1097996900. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 22:32:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:32:07,843][71000] Updated weights for policy 0, policy_version 95784 (0.0025) [2024-06-12 22:32:10,893][71000] Updated weights for policy 0, policy_version 95794 (0.0033) [2024-06-12 22:32:10,940][70768] Fps is (10 sec: 50789.9, 60 sec: 50517.4, 300 sec: 49651.9). Total num frames: 1569488896. Throughput: 0: 50023.9. Samples: 1098299420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 22:32:10,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:32:14,354][71000] Updated weights for policy 0, policy_version 95804 (0.0029) [2024-06-12 22:32:15,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49698.0, 300 sec: 49651.8). Total num frames: 1569701888. Throughput: 0: 50088.7. Samples: 1098596360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 22:32:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:32:17,733][71000] Updated weights for policy 0, policy_version 95814 (0.0023) [2024-06-12 22:32:20,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49697.9, 300 sec: 49540.7). Total num frames: 1569964032. Throughput: 0: 49920.3. Samples: 1098733940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 22:32:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:32:21,084][71000] Updated weights for policy 0, policy_version 95824 (0.0036) [2024-06-12 22:32:24,175][71000] Updated weights for policy 0, policy_version 95834 (0.0021) [2024-06-12 22:32:25,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49971.2, 300 sec: 49651.9). Total num frames: 1570226176. Throughput: 0: 49762.7. Samples: 1099033140. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 22:32:25,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:32:27,813][71000] Updated weights for policy 0, policy_version 95844 (0.0027) [2024-06-12 22:32:30,660][71000] Updated weights for policy 0, policy_version 95854 (0.0022) [2024-06-12 22:32:30,939][70768] Fps is (10 sec: 52429.9, 60 sec: 50790.4, 300 sec: 49651.8). Total num frames: 1570488320. Throughput: 0: 49898.8. Samples: 1099344520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 22:32:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:32:34,570][71000] Updated weights for policy 0, policy_version 95864 (0.0021) [2024-06-12 22:32:35,942][70768] Fps is (10 sec: 45864.2, 60 sec: 49423.1, 300 sec: 49595.9). Total num frames: 1570684928. Throughput: 0: 49620.8. Samples: 1099489540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 22:32:35,943][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:32:35,952][70980] Signal inference workers to stop experience collection... (16150 times) [2024-06-12 22:32:35,953][70980] Signal inference workers to resume experience collection... (16150 times) [2024-06-12 22:32:36,003][71000] InferenceWorker_p0-w0: stopping experience collection (16150 times) [2024-06-12 22:32:36,003][71000] InferenceWorker_p0-w0: resuming experience collection (16150 times) [2024-06-12 22:32:37,303][71000] Updated weights for policy 0, policy_version 95874 (0.0032) [2024-06-12 22:32:40,940][70768] Fps is (10 sec: 45874.2, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1570947072. Throughput: 0: 49607.6. Samples: 1099782500. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 22:32:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:32:41,197][71000] Updated weights for policy 0, policy_version 95884 (0.0027) [2024-06-12 22:32:44,183][71000] Updated weights for policy 0, policy_version 95894 (0.0036) [2024-06-12 22:32:45,940][70768] Fps is (10 sec: 54079.5, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 1571225600. Throughput: 0: 49450.5. Samples: 1100073460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 22:32:45,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 22:32:47,769][71000] Updated weights for policy 0, policy_version 95904 (0.0025) [2024-06-12 22:32:50,495][71000] Updated weights for policy 0, policy_version 95914 (0.0023) [2024-06-12 22:32:50,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49425.0, 300 sec: 49651.9). Total num frames: 1571471360. Throughput: 0: 49787.5. Samples: 1100237340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 22:32:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:32:54,565][71000] Updated weights for policy 0, policy_version 95924 (0.0028) [2024-06-12 22:32:55,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49971.4, 300 sec: 49707.4). Total num frames: 1571700736. Throughput: 0: 49828.1. Samples: 1100541680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 22:32:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:32:56,971][71000] Updated weights for policy 0, policy_version 95934 (0.0023) [2024-06-12 22:33:00,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49151.9, 300 sec: 49596.3). Total num frames: 1571930112. Throughput: 0: 49609.4. Samples: 1100828780. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 22:33:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:33:01,007][71000] Updated weights for policy 0, policy_version 95944 (0.0030) [2024-06-12 22:33:03,928][71000] Updated weights for policy 0, policy_version 95954 (0.0031) [2024-06-12 22:33:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1572208640. Throughput: 0: 49658.8. Samples: 1100968580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 22:33:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:33:07,782][71000] Updated weights for policy 0, policy_version 95964 (0.0033) [2024-06-12 22:33:10,442][71000] Updated weights for policy 0, policy_version 95974 (0.0031) [2024-06-12 22:33:10,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49425.0, 300 sec: 49651.8). Total num frames: 1572454400. Throughput: 0: 49648.7. Samples: 1101267340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 22:33:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:33:14,446][71000] Updated weights for policy 0, policy_version 95984 (0.0024) [2024-06-12 22:33:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49971.2, 300 sec: 49762.9). Total num frames: 1572700160. Throughput: 0: 49334.5. Samples: 1101564580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-12 22:33:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:33:16,790][71000] Updated weights for policy 0, policy_version 95994 (0.0027) [2024-06-12 22:33:20,911][71000] Updated weights for policy 0, policy_version 96004 (0.0032) [2024-06-12 22:33:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49651.8). Total num frames: 1572929536. Throughput: 0: 49471.8. Samples: 1101715660. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:33:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:33:23,874][71000] Updated weights for policy 0, policy_version 96014 (0.0025) [2024-06-12 22:33:25,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.1, 300 sec: 49651.9). Total num frames: 1573191680. Throughput: 0: 49385.1. Samples: 1102004820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:33:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:33:27,571][71000] Updated weights for policy 0, policy_version 96024 (0.0035) [2024-06-12 22:33:30,378][71000] Updated weights for policy 0, policy_version 96034 (0.0030) [2024-06-12 22:33:30,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49152.0, 300 sec: 49651.8). Total num frames: 1573437440. Throughput: 0: 49563.7. Samples: 1102303820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:33:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:33:33,991][71000] Updated weights for policy 0, policy_version 96044 (0.0028) [2024-06-12 22:33:35,940][70768] Fps is (10 sec: 50790.0, 60 sec: 50246.2, 300 sec: 49818.5). Total num frames: 1573699584. Throughput: 0: 49271.9. Samples: 1102454580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:33:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:33:36,870][71000] Updated weights for policy 0, policy_version 96054 (0.0028) [2024-06-12 22:33:40,523][71000] Updated weights for policy 0, policy_version 96064 (0.0031) [2024-06-12 22:33:40,940][70768] Fps is (10 sec: 47512.2, 60 sec: 49425.0, 300 sec: 49651.8). Total num frames: 1573912576. Throughput: 0: 49143.2. Samples: 1102753140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:33:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:33:41,002][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000096065_1573928960.pth... 
[2024-06-12 22:33:41,046][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000095337_1562001408.pth [2024-06-12 22:33:43,561][71000] Updated weights for policy 0, policy_version 96074 (0.0031) [2024-06-12 22:33:45,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.1, 300 sec: 49596.3). Total num frames: 1574174720. Throughput: 0: 49178.7. Samples: 1103041820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:33:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:33:47,327][71000] Updated weights for policy 0, policy_version 96084 (0.0030) [2024-06-12 22:33:48,951][70980] Signal inference workers to stop experience collection... (16200 times) [2024-06-12 22:33:48,951][70980] Signal inference workers to resume experience collection... (16200 times) [2024-06-12 22:33:48,973][71000] InferenceWorker_p0-w0: stopping experience collection (16200 times) [2024-06-12 22:33:48,973][71000] InferenceWorker_p0-w0: resuming experience collection (16200 times) [2024-06-12 22:33:50,403][71000] Updated weights for policy 0, policy_version 96094 (0.0029) [2024-06-12 22:33:50,940][70768] Fps is (10 sec: 50791.8, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 1574420480. Throughput: 0: 49516.0. Samples: 1103196800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:33:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:33:53,978][71000] Updated weights for policy 0, policy_version 96104 (0.0031) [2024-06-12 22:33:55,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 49651.9). Total num frames: 1574649856. Throughput: 0: 49164.2. Samples: 1103479720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:33:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:33:57,351][71000] Updated weights for policy 0, policy_version 96114 (0.0020) [2024-06-12 22:34:00,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1574879232. 
Throughput: 0: 49094.3. Samples: 1103773820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:34:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:34:00,995][71000] Updated weights for policy 0, policy_version 96124 (0.0032) [2024-06-12 22:34:03,954][71000] Updated weights for policy 0, policy_version 96134 (0.0025) [2024-06-12 22:34:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1575157760. Throughput: 0: 49094.4. Samples: 1103924900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:34:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:34:07,103][71000] Updated weights for policy 0, policy_version 96144 (0.0029) [2024-06-12 22:34:10,423][71000] Updated weights for policy 0, policy_version 96154 (0.0029) [2024-06-12 22:34:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48879.0, 300 sec: 49540.8). Total num frames: 1575387136. Throughput: 0: 49302.6. Samples: 1104223440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:34:10,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:34:14,312][71000] Updated weights for policy 0, policy_version 96164 (0.0032) [2024-06-12 22:34:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 49707.4). Total num frames: 1575649280. Throughput: 0: 48984.0. Samples: 1104508100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-12 22:34:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:34:17,113][71000] Updated weights for policy 0, policy_version 96174 (0.0026) [2024-06-12 22:34:20,898][71000] Updated weights for policy 0, policy_version 96184 (0.0025) [2024-06-12 22:34:20,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49596.3). Total num frames: 1575878656. Throughput: 0: 49017.8. Samples: 1104660380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:34:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:34:23,946][71000] Updated weights for policy 0, policy_version 96194 (0.0027) [2024-06-12 22:34:25,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48878.8, 300 sec: 49540.7). Total num frames: 1576124416. Throughput: 0: 48971.7. Samples: 1104956860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:34:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:34:27,501][71000] Updated weights for policy 0, policy_version 96204 (0.0033) [2024-06-12 22:34:30,591][71000] Updated weights for policy 0, policy_version 96214 (0.0024) [2024-06-12 22:34:30,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1576386560. Throughput: 0: 49282.2. Samples: 1105259520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:34:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:34:34,155][71000] Updated weights for policy 0, policy_version 96224 (0.0024) [2024-06-12 22:34:35,939][70768] Fps is (10 sec: 50791.6, 60 sec: 48879.0, 300 sec: 49596.3). Total num frames: 1576632320. Throughput: 0: 49095.6. Samples: 1105406100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:34:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:34:37,270][71000] Updated weights for policy 0, policy_version 96234 (0.0027) [2024-06-12 22:34:40,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48879.2, 300 sec: 49540.8). Total num frames: 1576845312. Throughput: 0: 49194.2. Samples: 1105693460. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:34:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:34:41,021][71000] Updated weights for policy 0, policy_version 96244 (0.0038) [2024-06-12 22:34:44,146][71000] Updated weights for policy 0, policy_version 96254 (0.0036) [2024-06-12 22:34:45,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 1577107456. Throughput: 0: 49089.8. Samples: 1105982860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:34:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:34:47,389][71000] Updated weights for policy 0, policy_version 96264 (0.0023) [2024-06-12 22:34:50,638][71000] Updated weights for policy 0, policy_version 96274 (0.0028) [2024-06-12 22:34:50,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49151.9, 300 sec: 49596.3). Total num frames: 1577369600. Throughput: 0: 49101.3. Samples: 1106134460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:34:50,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:34:54,034][71000] Updated weights for policy 0, policy_version 96284 (0.0028) [2024-06-12 22:34:55,944][70768] Fps is (10 sec: 50768.9, 60 sec: 49421.5, 300 sec: 49651.1). Total num frames: 1577615360. Throughput: 0: 49120.8. Samples: 1106434080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:34:55,944][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:34:57,209][71000] Updated weights for policy 0, policy_version 96294 (0.0026) [2024-06-12 22:35:00,571][71000] Updated weights for policy 0, policy_version 96304 (0.0019) [2024-06-12 22:35:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1577861120. Throughput: 0: 49507.4. Samples: 1106735940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:35:00,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 22:35:01,405][70980] Signal inference workers to stop experience collection... 
(16250 times) [2024-06-12 22:35:01,454][71000] InferenceWorker_p0-w0: stopping experience collection (16250 times) [2024-06-12 22:35:01,458][70980] Signal inference workers to resume experience collection... (16250 times) [2024-06-12 22:35:01,467][71000] InferenceWorker_p0-w0: resuming experience collection (16250 times) [2024-06-12 22:35:03,607][71000] Updated weights for policy 0, policy_version 96314 (0.0027) [2024-06-12 22:35:05,940][70768] Fps is (10 sec: 47533.4, 60 sec: 48878.8, 300 sec: 49485.2). Total num frames: 1578090496. Throughput: 0: 49277.3. Samples: 1106877860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:35:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:35:07,391][71000] Updated weights for policy 0, policy_version 96324 (0.0024) [2024-06-12 22:35:10,450][71000] Updated weights for policy 0, policy_version 96334 (0.0026) [2024-06-12 22:35:10,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1578352640. Throughput: 0: 49139.3. Samples: 1107168120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:35:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:35:13,827][71000] Updated weights for policy 0, policy_version 96344 (0.0028) [2024-06-12 22:35:15,941][70768] Fps is (10 sec: 50782.6, 60 sec: 49150.7, 300 sec: 49540.5). Total num frames: 1578598400. Throughput: 0: 49117.4. Samples: 1107469880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 22:35:15,942][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:35:17,094][71000] Updated weights for policy 0, policy_version 96354 (0.0034) [2024-06-12 22:35:20,446][71000] Updated weights for policy 0, policy_version 96364 (0.0022) [2024-06-12 22:35:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1578844160. Throughput: 0: 49119.5. Samples: 1107616480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 22:35:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:35:23,409][71000] Updated weights for policy 0, policy_version 96374 (0.0027) [2024-06-12 22:35:25,940][70768] Fps is (10 sec: 49159.6, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1579089920. Throughput: 0: 49276.3. Samples: 1107910900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 22:35:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:35:27,200][71000] Updated weights for policy 0, policy_version 96384 (0.0032) [2024-06-12 22:35:30,193][71000] Updated weights for policy 0, policy_version 96394 (0.0041) [2024-06-12 22:35:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1579352064. Throughput: 0: 49375.6. Samples: 1108204760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 22:35:30,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:35:33,716][71000] Updated weights for policy 0, policy_version 96404 (0.0024) [2024-06-12 22:35:35,940][70768] Fps is (10 sec: 49148.0, 60 sec: 49151.2, 300 sec: 49485.1). Total num frames: 1579581440. Throughput: 0: 49531.5. Samples: 1108363420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 22:35:35,941][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:35:36,676][71000] Updated weights for policy 0, policy_version 96414 (0.0022) [2024-06-12 22:35:40,227][71000] Updated weights for policy 0, policy_version 96424 (0.0031) [2024-06-12 22:35:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 50244.2, 300 sec: 49540.8). Total num frames: 1579859968. Throughput: 0: 49710.5. Samples: 1108670840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 22:35:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:35:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000096427_1579859968.pth... 
[2024-06-12 22:35:40,997][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000095702_1567981568.pth [2024-06-12 22:35:43,139][71000] Updated weights for policy 0, policy_version 96434 (0.0026) [2024-06-12 22:35:45,940][70768] Fps is (10 sec: 50794.3, 60 sec: 49698.0, 300 sec: 49540.8). Total num frames: 1580089344. Throughput: 0: 49501.3. Samples: 1108963500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 22:35:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:35:46,611][71000] Updated weights for policy 0, policy_version 96444 (0.0030) [2024-06-12 22:35:49,906][71000] Updated weights for policy 0, policy_version 96454 (0.0032) [2024-06-12 22:35:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1580335104. Throughput: 0: 49550.3. Samples: 1109107620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 22:35:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:35:53,512][71000] Updated weights for policy 0, policy_version 96464 (0.0034) [2024-06-12 22:35:55,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49155.5, 300 sec: 49485.2). Total num frames: 1580564480. Throughput: 0: 49528.4. Samples: 1109396900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 22:35:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:35:56,680][71000] Updated weights for policy 0, policy_version 96474 (0.0031) [2024-06-12 22:36:00,352][71000] Updated weights for policy 0, policy_version 96484 (0.0023) [2024-06-12 22:36:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1580826624. Throughput: 0: 49516.9. Samples: 1109698060. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 22:36:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:36:03,142][71000] Updated weights for policy 0, policy_version 96494 (0.0028) [2024-06-12 22:36:05,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1581088768. Throughput: 0: 49544.8. Samples: 1109846000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 22:36:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:36:06,764][71000] Updated weights for policy 0, policy_version 96504 (0.0027) [2024-06-12 22:36:07,177][70980] Signal inference workers to stop experience collection... (16300 times) [2024-06-12 22:36:07,177][70980] Signal inference workers to resume experience collection... (16300 times) [2024-06-12 22:36:07,204][71000] InferenceWorker_p0-w0: stopping experience collection (16300 times) [2024-06-12 22:36:07,204][71000] InferenceWorker_p0-w0: resuming experience collection (16300 times) [2024-06-12 22:36:09,767][71000] Updated weights for policy 0, policy_version 96514 (0.0021) [2024-06-12 22:36:10,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1581334528. Throughput: 0: 49857.0. Samples: 1110154460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 22:36:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 22:36:13,131][71000] Updated weights for policy 0, policy_version 96524 (0.0040) [2024-06-12 22:36:15,939][70768] Fps is (10 sec: 47514.5, 60 sec: 49426.5, 300 sec: 49429.7). Total num frames: 1581563904. Throughput: 0: 49762.3. Samples: 1110444060. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 22:36:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:36:16,674][71000] Updated weights for policy 0, policy_version 96534 (0.0029) [2024-06-12 22:36:19,942][71000] Updated weights for policy 0, policy_version 96544 (0.0033) [2024-06-12 22:36:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1581826048. Throughput: 0: 49449.9. Samples: 1110588620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 22:36:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:36:23,263][71000] Updated weights for policy 0, policy_version 96554 (0.0029) [2024-06-12 22:36:25,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1582071808. Throughput: 0: 49307.6. Samples: 1110889680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 22:36:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:36:26,336][71000] Updated weights for policy 0, policy_version 96564 (0.0027) [2024-06-12 22:36:29,991][71000] Updated weights for policy 0, policy_version 96574 (0.0032) [2024-06-12 22:36:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1582333952. Throughput: 0: 49549.0. Samples: 1111193200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 22:36:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:36:32,811][71000] Updated weights for policy 0, policy_version 96584 (0.0023) [2024-06-12 22:36:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.9, 300 sec: 49429.7). Total num frames: 1582563328. Throughput: 0: 49678.7. Samples: 1111343160. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 22:36:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:36:36,452][71000] Updated weights for policy 0, policy_version 96594 (0.0030) [2024-06-12 22:36:39,351][71000] Updated weights for policy 0, policy_version 96604 (0.0031) [2024-06-12 22:36:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1582841856. Throughput: 0: 49878.2. Samples: 1111641420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 22:36:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:36:43,024][71000] Updated weights for policy 0, policy_version 96614 (0.0022) [2024-06-12 22:36:45,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.2, 300 sec: 49374.1). Total num frames: 1583071232. Throughput: 0: 49785.3. Samples: 1111938400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 22:36:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:36:45,996][71000] Updated weights for policy 0, policy_version 96624 (0.0029) [2024-06-12 22:36:49,829][71000] Updated weights for policy 0, policy_version 96634 (0.0028) [2024-06-12 22:36:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1583316992. Throughput: 0: 49819.2. Samples: 1112087860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 22:36:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:36:52,691][71000] Updated weights for policy 0, policy_version 96644 (0.0036) [2024-06-12 22:36:55,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 1583546368. Throughput: 0: 49537.8. Samples: 1112383660. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 22:36:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:36:56,386][71000] Updated weights for policy 0, policy_version 96654 (0.0024) [2024-06-12 22:36:59,112][71000] Updated weights for policy 0, policy_version 96664 (0.0028) [2024-06-12 22:37:00,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 1583824896. Throughput: 0: 49720.2. Samples: 1112681480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 22:37:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:37:03,147][71000] Updated weights for policy 0, policy_version 96674 (0.0037) [2024-06-12 22:37:05,694][71000] Updated weights for policy 0, policy_version 96684 (0.0030) [2024-06-12 22:37:05,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1584070656. Throughput: 0: 50029.8. Samples: 1112839960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 22:37:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:37:09,619][71000] Updated weights for policy 0, policy_version 96694 (0.0032) [2024-06-12 22:37:10,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1584316416. Throughput: 0: 49817.7. Samples: 1113131480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-12 22:37:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:37:12,417][71000] Updated weights for policy 0, policy_version 96704 (0.0025) [2024-06-12 22:37:15,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1584545792. Throughput: 0: 49688.5. Samples: 1113429180. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 22:37:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:37:16,570][71000] Updated weights for policy 0, policy_version 96714 (0.0023) [2024-06-12 22:37:18,930][71000] Updated weights for policy 0, policy_version 96724 (0.0025) [2024-06-12 22:37:20,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49971.0, 300 sec: 49485.2). Total num frames: 1584824320. Throughput: 0: 49633.1. Samples: 1113576660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 22:37:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:37:22,984][71000] Updated weights for policy 0, policy_version 96734 (0.0035) [2024-06-12 22:37:23,600][70980] Signal inference workers to stop experience collection... (16350 times) [2024-06-12 22:37:23,602][70980] Signal inference workers to resume experience collection... (16350 times) [2024-06-12 22:37:23,640][71000] InferenceWorker_p0-w0: stopping experience collection (16350 times) [2024-06-12 22:37:23,640][71000] InferenceWorker_p0-w0: resuming experience collection (16350 times) [2024-06-12 22:37:25,575][71000] Updated weights for policy 0, policy_version 96744 (0.0029) [2024-06-12 22:37:25,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1585070080. Throughput: 0: 49751.1. Samples: 1113880220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 22:37:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:37:29,403][71000] Updated weights for policy 0, policy_version 96754 (0.0030) [2024-06-12 22:37:30,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49424.9, 300 sec: 49541.1). Total num frames: 1585299456. Throughput: 0: 49705.6. Samples: 1114175160. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 22:37:30,941][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:37:32,137][71000] Updated weights for policy 0, policy_version 96764 (0.0023) [2024-06-12 22:37:35,938][71000] Updated weights for policy 0, policy_version 96774 (0.0027) [2024-06-12 22:37:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49698.1, 300 sec: 49485.3). Total num frames: 1585545216. Throughput: 0: 49627.6. Samples: 1114321100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 22:37:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:37:38,673][71000] Updated weights for policy 0, policy_version 96784 (0.0031) [2024-06-12 22:37:40,940][70768] Fps is (10 sec: 50791.6, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1585807360. Throughput: 0: 49577.7. Samples: 1114614660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 22:37:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:37:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000096790_1585807360.pth... [2024-06-12 22:37:40,990][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000096065_1573928960.pth [2024-06-12 22:37:42,621][71000] Updated weights for policy 0, policy_version 96794 (0.0027) [2024-06-12 22:37:45,046][71000] Updated weights for policy 0, policy_version 96804 (0.0026) [2024-06-12 22:37:45,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1586069504. Throughput: 0: 49592.0. Samples: 1114913120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 22:37:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:37:49,028][71000] Updated weights for policy 0, policy_version 96814 (0.0025) [2024-06-12 22:37:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1586298880. Throughput: 0: 49593.3. Samples: 1115071660. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 22:37:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:37:51,769][71000] Updated weights for policy 0, policy_version 96824 (0.0032) [2024-06-12 22:37:55,619][71000] Updated weights for policy 0, policy_version 96834 (0.0038) [2024-06-12 22:37:55,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1586528256. Throughput: 0: 49501.7. Samples: 1115359060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 22:37:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:37:58,547][71000] Updated weights for policy 0, policy_version 96844 (0.0026) [2024-06-12 22:38:00,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1586806784. Throughput: 0: 49558.6. Samples: 1115659320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 22:38:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:38:02,235][71000] Updated weights for policy 0, policy_version 96854 (0.0029) [2024-06-12 22:38:04,997][71000] Updated weights for policy 0, policy_version 96864 (0.0030) [2024-06-12 22:38:05,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1587036160. Throughput: 0: 49837.5. Samples: 1115819340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-12 22:38:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:38:08,663][71000] Updated weights for policy 0, policy_version 96874 (0.0034) [2024-06-12 22:38:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1587298304. Throughput: 0: 49763.1. Samples: 1116119560. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-06-12 22:38:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:38:11,490][71000] Updated weights for policy 0, policy_version 96884 (0.0021) [2024-06-12 22:38:15,462][71000] Updated weights for policy 0, policy_version 96894 (0.0035) [2024-06-12 22:38:15,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1587511296. Throughput: 0: 49603.9. Samples: 1116407320. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-06-12 22:38:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 22:38:18,374][71000] Updated weights for policy 0, policy_version 96904 (0.0033) [2024-06-12 22:38:20,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1587806208. Throughput: 0: 49557.3. Samples: 1116551180. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-06-12 22:38:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:38:21,822][71000] Updated weights for policy 0, policy_version 96914 (0.0027) [2024-06-12 22:38:24,724][71000] Updated weights for policy 0, policy_version 96924 (0.0021) [2024-06-12 22:38:25,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1588035584. Throughput: 0: 49882.6. Samples: 1116859380. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-06-12 22:38:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:38:28,025][71000] Updated weights for policy 0, policy_version 96934 (0.0027) [2024-06-12 22:38:30,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49698.4, 300 sec: 49429.7). Total num frames: 1588281344. Throughput: 0: 50017.0. Samples: 1117163880. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-06-12 22:38:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:38:31,352][71000] Updated weights for policy 0, policy_version 96944 (0.0035) [2024-06-12 22:38:32,147][70980] Signal inference workers to stop experience collection... 
(16400 times) [2024-06-12 22:38:32,148][70980] Signal inference workers to resume experience collection... (16400 times) [2024-06-12 22:38:32,177][71000] InferenceWorker_p0-w0: stopping experience collection (16400 times) [2024-06-12 22:38:32,178][71000] InferenceWorker_p0-w0: resuming experience collection (16400 times) [2024-06-12 22:38:35,170][71000] Updated weights for policy 0, policy_version 96954 (0.0024) [2024-06-12 22:38:35,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1588494336. Throughput: 0: 49448.5. Samples: 1117296840. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-06-12 22:38:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:38:38,001][71000] Updated weights for policy 0, policy_version 96964 (0.0029) [2024-06-12 22:38:40,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49698.0, 300 sec: 49540.7). Total num frames: 1588789248. Throughput: 0: 49759.5. Samples: 1117598240. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-06-12 22:38:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:38:41,573][71000] Updated weights for policy 0, policy_version 96974 (0.0035) [2024-06-12 22:38:44,686][71000] Updated weights for policy 0, policy_version 96984 (0.0025) [2024-06-12 22:38:45,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1589035008. Throughput: 0: 49709.8. Samples: 1117896260. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-06-12 22:38:45,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:38:48,253][71000] Updated weights for policy 0, policy_version 96994 (0.0025) [2024-06-12 22:38:50,939][70768] Fps is (10 sec: 50791.8, 60 sec: 49971.3, 300 sec: 49651.9). Total num frames: 1589297152. Throughput: 0: 49581.0. Samples: 1118050480. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-06-12 22:38:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:38:51,036][71000] Updated weights for policy 0, policy_version 97004 (0.0026) [2024-06-12 22:38:54,590][71000] Updated weights for policy 0, policy_version 97014 (0.0025) [2024-06-12 22:38:55,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1589493760. Throughput: 0: 49647.1. Samples: 1118353680. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-06-12 22:38:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:38:57,851][71000] Updated weights for policy 0, policy_version 97024 (0.0028) [2024-06-12 22:39:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1589788672. Throughput: 0: 49840.4. Samples: 1118650140. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-06-12 22:39:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:39:01,308][71000] Updated weights for policy 0, policy_version 97034 (0.0024) [2024-06-12 22:39:04,280][71000] Updated weights for policy 0, policy_version 97044 (0.0020) [2024-06-12 22:39:05,940][70768] Fps is (10 sec: 55704.9, 60 sec: 50244.1, 300 sec: 49707.4). Total num frames: 1590050816. Throughput: 0: 50002.5. Samples: 1118801300. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-06-12 22:39:05,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:39:07,656][71000] Updated weights for policy 0, policy_version 97054 (0.0024) [2024-06-12 22:39:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1590280192. Throughput: 0: 49786.8. Samples: 1119099780. 
Policy #0 lag: (min: 0.0, avg: 6.6, max: 19.0) [2024-06-12 22:39:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:39:11,038][71000] Updated weights for policy 0, policy_version 97064 (0.0031) [2024-06-12 22:39:14,490][71000] Updated weights for policy 0, policy_version 97074 (0.0027) [2024-06-12 22:39:15,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1590509568. Throughput: 0: 49740.0. Samples: 1119402180. Policy #0 lag: (min: 0.0, avg: 6.6, max: 19.0) [2024-06-12 22:39:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:39:17,632][71000] Updated weights for policy 0, policy_version 97084 (0.0027) [2024-06-12 22:39:20,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49424.9, 300 sec: 49651.8). Total num frames: 1590771712. Throughput: 0: 49962.9. Samples: 1119545180. Policy #0 lag: (min: 0.0, avg: 6.6, max: 19.0) [2024-06-12 22:39:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:39:21,071][71000] Updated weights for policy 0, policy_version 97094 (0.0032) [2024-06-12 22:39:23,937][71000] Updated weights for policy 0, policy_version 97104 (0.0027) [2024-06-12 22:39:25,940][70768] Fps is (10 sec: 54066.9, 60 sec: 50244.3, 300 sec: 49707.4). Total num frames: 1591050240. Throughput: 0: 50087.7. Samples: 1119852180. Policy #0 lag: (min: 0.0, avg: 6.6, max: 19.0) [2024-06-12 22:39:25,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-12 22:39:27,277][71000] Updated weights for policy 0, policy_version 97114 (0.0025) [2024-06-12 22:39:30,869][71000] Updated weights for policy 0, policy_version 97124 (0.0028) [2024-06-12 22:39:30,940][70768] Fps is (10 sec: 50791.4, 60 sec: 49971.2, 300 sec: 49651.8). Total num frames: 1591279616. Throughput: 0: 50211.1. Samples: 1120155760. Policy #0 lag: (min: 0.0, avg: 6.6, max: 19.0) [2024-06-12 22:39:30,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:39:31,315][70980] Signal inference workers to stop experience collection... 
(16450 times) [2024-06-12 22:39:31,318][70980] Signal inference workers to resume experience collection... (16450 times) [2024-06-12 22:39:31,324][71000] InferenceWorker_p0-w0: stopping experience collection (16450 times) [2024-06-12 22:39:31,337][71000] InferenceWorker_p0-w0: resuming experience collection (16450 times) [2024-06-12 22:39:33,663][71000] Updated weights for policy 0, policy_version 97134 (0.0027) [2024-06-12 22:39:35,940][70768] Fps is (10 sec: 47513.3, 60 sec: 50517.2, 300 sec: 49762.9). Total num frames: 1591525376. Throughput: 0: 50128.2. Samples: 1120306260. Policy #0 lag: (min: 0.0, avg: 6.6, max: 19.0) [2024-06-12 22:39:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:39:37,440][71000] Updated weights for policy 0, policy_version 97144 (0.0033) [2024-06-12 22:39:40,815][71000] Updated weights for policy 0, policy_version 97154 (0.0025) [2024-06-12 22:39:40,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49698.1, 300 sec: 49707.4). Total num frames: 1591771136. Throughput: 0: 49770.5. Samples: 1120593360. Policy #0 lag: (min: 0.0, avg: 6.6, max: 19.0) [2024-06-12 22:39:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:39:40,956][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000097154_1591771136.pth... [2024-06-12 22:39:41,011][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000096427_1579859968.pth [2024-06-12 22:39:44,139][71000] Updated weights for policy 0, policy_version 97164 (0.0019) [2024-06-12 22:39:45,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 1592033280. Throughput: 0: 49730.2. Samples: 1120888000. 
Policy #0 lag: (min: 0.0, avg: 6.6, max: 19.0) [2024-06-12 22:39:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:39:47,154][71000] Updated weights for policy 0, policy_version 97174 (0.0032) [2024-06-12 22:39:50,719][71000] Updated weights for policy 0, policy_version 97184 (0.0034) [2024-06-12 22:39:50,940][70768] Fps is (10 sec: 50791.4, 60 sec: 49698.0, 300 sec: 49708.1). Total num frames: 1592279040. Throughput: 0: 49962.8. Samples: 1121049620. Policy #0 lag: (min: 0.0, avg: 6.6, max: 19.0) [2024-06-12 22:39:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:39:53,556][71000] Updated weights for policy 0, policy_version 97194 (0.0022) [2024-06-12 22:39:55,940][70768] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 49651.9). Total num frames: 1592508416. Throughput: 0: 49856.9. Samples: 1121343340. Policy #0 lag: (min: 0.0, avg: 6.6, max: 19.0) [2024-06-12 22:39:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:39:57,199][71000] Updated weights for policy 0, policy_version 97204 (0.0027) [2024-06-12 22:40:00,617][71000] Updated weights for policy 0, policy_version 97214 (0.0030) [2024-06-12 22:40:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49698.0, 300 sec: 49762.9). Total num frames: 1592770560. Throughput: 0: 49660.7. Samples: 1121636920. Policy #0 lag: (min: 0.0, avg: 6.6, max: 19.0) [2024-06-12 22:40:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:40:03,925][71000] Updated weights for policy 0, policy_version 97224 (0.0038) [2024-06-12 22:40:05,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.2, 300 sec: 49707.4). Total num frames: 1593016320. Throughput: 0: 49934.9. Samples: 1121792240. 
Policy #0 lag: (min: 0.0, avg: 6.6, max: 19.0) [2024-06-12 22:40:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:40:07,091][71000] Updated weights for policy 0, policy_version 97234 (0.0027) [2024-06-12 22:40:10,348][71000] Updated weights for policy 0, policy_version 97244 (0.0026) [2024-06-12 22:40:10,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49698.2, 300 sec: 49707.7). Total num frames: 1593262080. Throughput: 0: 49706.8. Samples: 1122088980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:40:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:40:13,612][71000] Updated weights for policy 0, policy_version 97254 (0.0023) [2024-06-12 22:40:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 49707.4). Total num frames: 1593507840. Throughput: 0: 49636.0. Samples: 1122389380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:40:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:40:17,101][71000] Updated weights for policy 0, policy_version 97264 (0.0029) [2024-06-12 22:40:20,091][71000] Updated weights for policy 0, policy_version 97274 (0.0035) [2024-06-12 22:40:20,940][70768] Fps is (10 sec: 50789.2, 60 sec: 49971.2, 300 sec: 49762.9). Total num frames: 1593769984. Throughput: 0: 49365.2. Samples: 1122527700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:40:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:40:23,659][71000] Updated weights for policy 0, policy_version 97284 (0.0036) [2024-06-12 22:40:25,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 49707.4). Total num frames: 1594015744. Throughput: 0: 49634.3. Samples: 1122826900. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:40:25,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-12 22:40:26,951][71000] Updated weights for policy 0, policy_version 97294 (0.0022) [2024-06-12 22:40:30,413][71000] Updated weights for policy 0, policy_version 97304 (0.0025) [2024-06-12 22:40:30,942][70768] Fps is (10 sec: 49142.9, 60 sec: 49696.4, 300 sec: 49762.7). Total num frames: 1594261504. Throughput: 0: 49717.3. Samples: 1123125380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:40:30,942][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:40:33,539][71000] Updated weights for policy 0, policy_version 97314 (0.0033) [2024-06-12 22:40:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1594490880. Throughput: 0: 49402.6. Samples: 1123272740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:40:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:40:36,763][71000] Updated weights for policy 0, policy_version 97324 (0.0027) [2024-06-12 22:40:40,027][71000] Updated weights for policy 0, policy_version 97334 (0.0020) [2024-06-12 22:40:40,939][70768] Fps is (10 sec: 49162.3, 60 sec: 49698.4, 300 sec: 49707.4). Total num frames: 1594753024. Throughput: 0: 49687.6. Samples: 1123579280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:40:40,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:40:43,420][71000] Updated weights for policy 0, policy_version 97344 (0.0027) [2024-06-12 22:40:45,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 49651.9). Total num frames: 1594982400. Throughput: 0: 49620.2. Samples: 1123869820. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:40:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:40:46,506][71000] Updated weights for policy 0, policy_version 97354 (0.0030) [2024-06-12 22:40:50,243][71000] Updated weights for policy 0, policy_version 97364 (0.0036) [2024-06-12 22:40:50,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49762.9). Total num frames: 1595244544. Throughput: 0: 49412.5. Samples: 1124015800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:40:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:40:53,406][71000] Updated weights for policy 0, policy_version 97374 (0.0034) [2024-06-12 22:40:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.1, 300 sec: 49651.9). Total num frames: 1595473920. Throughput: 0: 49328.0. Samples: 1124308740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:40:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 22:40:56,750][71000] Updated weights for policy 0, policy_version 97384 (0.0030) [2024-06-12 22:40:59,585][70980] Signal inference workers to stop experience collection... (16500 times) [2024-06-12 22:40:59,620][71000] InferenceWorker_p0-w0: stopping experience collection (16500 times) [2024-06-12 22:40:59,633][70980] Signal inference workers to resume experience collection... (16500 times) [2024-06-12 22:40:59,639][71000] InferenceWorker_p0-w0: resuming experience collection (16500 times) [2024-06-12 22:41:00,009][71000] Updated weights for policy 0, policy_version 97394 (0.0028) [2024-06-12 22:41:00,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.3, 300 sec: 49707.4). Total num frames: 1595752448. Throughput: 0: 49315.6. Samples: 1124608580. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:41:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:41:03,584][71000] Updated weights for policy 0, policy_version 97404 (0.0023) [2024-06-12 22:41:05,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 1595965440. Throughput: 0: 49566.8. Samples: 1124758200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-12 22:41:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:41:06,646][71000] Updated weights for policy 0, policy_version 97414 (0.0030) [2024-06-12 22:41:10,123][71000] Updated weights for policy 0, policy_version 97424 (0.0022) [2024-06-12 22:41:10,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49698.0, 300 sec: 49762.9). Total num frames: 1596243968. Throughput: 0: 49571.1. Samples: 1125057600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 22:41:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:41:13,084][71000] Updated weights for policy 0, policy_version 97434 (0.0031) [2024-06-12 22:41:15,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.1, 300 sec: 49651.9). Total num frames: 1596473344. Throughput: 0: 49345.9. Samples: 1125345840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 22:41:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:41:16,695][71000] Updated weights for policy 0, policy_version 97444 (0.0035) [2024-06-12 22:41:20,280][71000] Updated weights for policy 0, policy_version 97454 (0.0027) [2024-06-12 22:41:20,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.1, 300 sec: 49651.8). Total num frames: 1596719104. Throughput: 0: 49340.5. Samples: 1125493060. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 22:41:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:41:23,243][71000] Updated weights for policy 0, policy_version 97464 (0.0030) [2024-06-12 22:41:25,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48878.9, 300 sec: 49540.8). Total num frames: 1596948480. Throughput: 0: 49097.2. Samples: 1125788660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 22:41:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:41:26,719][71000] Updated weights for policy 0, policy_version 97474 (0.0032) [2024-06-12 22:41:29,853][71000] Updated weights for policy 0, policy_version 97484 (0.0032) [2024-06-12 22:41:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49426.8, 300 sec: 49707.4). Total num frames: 1597227008. Throughput: 0: 49330.7. Samples: 1126089700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 22:41:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:41:33,281][71000] Updated weights for policy 0, policy_version 97494 (0.0030) [2024-06-12 22:41:35,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1597456384. Throughput: 0: 49528.9. Samples: 1126244600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 22:41:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:41:36,555][71000] Updated weights for policy 0, policy_version 97504 (0.0028) [2024-06-12 22:41:40,155][71000] Updated weights for policy 0, policy_version 97514 (0.0021) [2024-06-12 22:41:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.0, 300 sec: 49651.9). Total num frames: 1597718528. Throughput: 0: 49447.0. Samples: 1126533860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 22:41:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:41:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000097517_1597718528.pth... 
[2024-06-12 22:41:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000096790_1585807360.pth [2024-06-12 22:41:43,107][71000] Updated weights for policy 0, policy_version 97524 (0.0039) [2024-06-12 22:41:45,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 1597964288. Throughput: 0: 49382.6. Samples: 1126830800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 22:41:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:41:46,666][71000] Updated weights for policy 0, policy_version 97534 (0.0025) [2024-06-12 22:41:49,684][71000] Updated weights for policy 0, policy_version 97544 (0.0024) [2024-06-12 22:41:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49707.4). Total num frames: 1598210048. Throughput: 0: 49389.7. Samples: 1126980740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 22:41:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:41:53,322][71000] Updated weights for policy 0, policy_version 97554 (0.0029) [2024-06-12 22:41:55,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1598455808. Throughput: 0: 49304.6. Samples: 1127276300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 22:41:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:41:56,458][71000] Updated weights for policy 0, policy_version 97564 (0.0036) [2024-06-12 22:42:00,207][71000] Updated weights for policy 0, policy_version 97574 (0.0035) [2024-06-12 22:42:00,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 1598701568. Throughput: 0: 49545.3. Samples: 1127575380. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-12 22:42:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 22:42:02,986][71000] Updated weights for policy 0, policy_version 97584 (0.0025) [2024-06-12 22:42:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1598947328. Throughput: 0: 49481.3. Samples: 1127719720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 22:42:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:42:06,944][71000] Updated weights for policy 0, policy_version 97594 (0.0023) [2024-06-12 22:42:09,914][71000] Updated weights for policy 0, policy_version 97604 (0.0023) [2024-06-12 22:42:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49651.9). Total num frames: 1599193088. Throughput: 0: 49327.3. Samples: 1128008380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 22:42:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:42:13,335][71000] Updated weights for policy 0, policy_version 97614 (0.0030) [2024-06-12 22:42:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1599438848. Throughput: 0: 49238.2. Samples: 1128305420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 22:42:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:42:16,317][71000] Updated weights for policy 0, policy_version 97624 (0.0024) [2024-06-12 22:42:19,950][71000] Updated weights for policy 0, policy_version 97634 (0.0023) [2024-06-12 22:42:20,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1599684608. Throughput: 0: 49180.7. Samples: 1128457740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 22:42:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:42:21,427][70980] Signal inference workers to stop experience collection... 
(16550 times) [2024-06-12 22:42:21,427][70980] Signal inference workers to resume experience collection... (16550 times) [2024-06-12 22:42:21,438][71000] InferenceWorker_p0-w0: stopping experience collection (16550 times) [2024-06-12 22:42:21,438][71000] InferenceWorker_p0-w0: resuming experience collection (16550 times) [2024-06-12 22:42:22,947][71000] Updated weights for policy 0, policy_version 97644 (0.0028) [2024-06-12 22:42:25,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.2, 300 sec: 49596.4). Total num frames: 1599930368. Throughput: 0: 49298.8. Samples: 1128752300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 22:42:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:42:26,926][71000] Updated weights for policy 0, policy_version 97654 (0.0040) [2024-06-12 22:42:29,847][71000] Updated weights for policy 0, policy_version 97664 (0.0028) [2024-06-12 22:42:30,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.9, 300 sec: 49540.8). Total num frames: 1600159744. Throughput: 0: 48937.8. Samples: 1129033000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 22:42:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 22:42:33,506][71000] Updated weights for policy 0, policy_version 97674 (0.0029) [2024-06-12 22:42:35,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1600421888. Throughput: 0: 48926.4. Samples: 1129182420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 22:42:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:42:36,318][71000] Updated weights for policy 0, policy_version 97684 (0.0023) [2024-06-12 22:42:40,188][71000] Updated weights for policy 0, policy_version 97694 (0.0028) [2024-06-12 22:42:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 1600651264. Throughput: 0: 49015.9. Samples: 1129482020. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 22:42:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:42:42,931][71000] Updated weights for policy 0, policy_version 97704 (0.0027) [2024-06-12 22:42:45,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 1600897024. Throughput: 0: 49047.9. Samples: 1129782540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 22:42:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:42:46,686][71000] Updated weights for policy 0, policy_version 97714 (0.0027) [2024-06-12 22:42:49,647][71000] Updated weights for policy 0, policy_version 97724 (0.0033) [2024-06-12 22:42:50,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.0, 300 sec: 49540.8). Total num frames: 1601142784. Throughput: 0: 49163.2. Samples: 1129932060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 22:42:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:42:53,129][71000] Updated weights for policy 0, policy_version 97734 (0.0027) [2024-06-12 22:42:55,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 1601388544. Throughput: 0: 49194.1. Samples: 1130222120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 22:42:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:42:56,466][71000] Updated weights for policy 0, policy_version 97744 (0.0026) [2024-06-12 22:42:59,616][71000] Updated weights for policy 0, policy_version 97754 (0.0024) [2024-06-12 22:43:00,939][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 1601634304. Throughput: 0: 49171.6. Samples: 1130518140. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-12 22:43:00,944][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:43:03,083][71000] Updated weights for policy 0, policy_version 97764 (0.0033) [2024-06-12 22:43:05,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1601896448. Throughput: 0: 49111.3. Samples: 1130667740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:43:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:43:06,494][71000] Updated weights for policy 0, policy_version 97774 (0.0036) [2024-06-12 22:43:09,343][71000] Updated weights for policy 0, policy_version 97784 (0.0030) [2024-06-12 22:43:10,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.1, 300 sec: 49651.9). Total num frames: 1602158592. Throughput: 0: 49354.7. Samples: 1130973260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:43:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:43:12,928][71000] Updated weights for policy 0, policy_version 97794 (0.0031) [2024-06-12 22:43:15,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.2, 300 sec: 49485.3). Total num frames: 1602404352. Throughput: 0: 49798.9. Samples: 1131273940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:43:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:43:16,138][71000] Updated weights for policy 0, policy_version 97804 (0.0031) [2024-06-12 22:43:19,447][71000] Updated weights for policy 0, policy_version 97814 (0.0032) [2024-06-12 22:43:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1602650112. Throughput: 0: 49675.0. Samples: 1131417800. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:43:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:43:23,119][71000] Updated weights for policy 0, policy_version 97824 (0.0039) [2024-06-12 22:43:25,939][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 1602879488. Throughput: 0: 49636.6. Samples: 1131715660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:43:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:43:26,105][71000] Updated weights for policy 0, policy_version 97834 (0.0028) [2024-06-12 22:43:29,611][71000] Updated weights for policy 0, policy_version 97844 (0.0034) [2024-06-12 22:43:30,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1603125248. Throughput: 0: 49447.7. Samples: 1132007680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:43:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:43:32,767][71000] Updated weights for policy 0, policy_version 97854 (0.0030) [2024-06-12 22:43:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1603387392. Throughput: 0: 49513.3. Samples: 1132160160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:43:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:43:36,127][71000] Updated weights for policy 0, policy_version 97864 (0.0030) [2024-06-12 22:43:39,543][71000] Updated weights for policy 0, policy_version 97874 (0.0027) [2024-06-12 22:43:40,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1603649536. Throughput: 0: 49657.4. Samples: 1132456700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:43:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:43:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000097879_1603649536.pth... 
[2024-06-12 22:43:40,997][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000097154_1591771136.pth [2024-06-12 22:43:42,823][71000] Updated weights for policy 0, policy_version 97884 (0.0030) [2024-06-12 22:43:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1603878912. Throughput: 0: 49549.3. Samples: 1132747860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:43:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:43:45,986][70980] Signal inference workers to stop experience collection... (16600 times) [2024-06-12 22:43:46,033][71000] InferenceWorker_p0-w0: stopping experience collection (16600 times) [2024-06-12 22:43:46,035][70980] Signal inference workers to resume experience collection... (16600 times) [2024-06-12 22:43:46,050][71000] InferenceWorker_p0-w0: resuming experience collection (16600 times) [2024-06-12 22:43:46,054][71000] Updated weights for policy 0, policy_version 97894 (0.0028) [2024-06-12 22:43:49,575][71000] Updated weights for policy 0, policy_version 97904 (0.0031) [2024-06-12 22:43:50,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1604124672. Throughput: 0: 49530.7. Samples: 1132896620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:43:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:43:52,375][71000] Updated weights for policy 0, policy_version 97914 (0.0023) [2024-06-12 22:43:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1604370432. Throughput: 0: 49372.4. Samples: 1133195020. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:43:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:43:56,235][71000] Updated weights for policy 0, policy_version 97924 (0.0034) [2024-06-12 22:43:59,192][71000] Updated weights for policy 0, policy_version 97934 (0.0028) [2024-06-12 22:44:00,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 1604632576. Throughput: 0: 49335.8. Samples: 1133494060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-12 22:44:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:44:02,882][71000] Updated weights for policy 0, policy_version 97944 (0.0029) [2024-06-12 22:44:05,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1604861952. Throughput: 0: 49599.0. Samples: 1133649760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 22:44:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:44:06,013][71000] Updated weights for policy 0, policy_version 97954 (0.0033) [2024-06-12 22:44:09,348][71000] Updated weights for policy 0, policy_version 97964 (0.0021) [2024-06-12 22:44:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1605107712. Throughput: 0: 49515.9. Samples: 1133943880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 22:44:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:44:12,460][71000] Updated weights for policy 0, policy_version 97974 (0.0030) [2024-06-12 22:44:15,819][71000] Updated weights for policy 0, policy_version 97984 (0.0025) [2024-06-12 22:44:15,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49424.9, 300 sec: 49485.3). Total num frames: 1605369856. Throughput: 0: 49571.1. Samples: 1134238380. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 22:44:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:44:18,771][71000] Updated weights for policy 0, policy_version 97994 (0.0025) [2024-06-12 22:44:20,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 1605648384. Throughput: 0: 49563.4. Samples: 1134390520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 22:44:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:44:22,191][71000] Updated weights for policy 0, policy_version 98004 (0.0030) [2024-06-12 22:44:25,531][71000] Updated weights for policy 0, policy_version 98014 (0.0022) [2024-06-12 22:44:25,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1605877760. Throughput: 0: 49728.9. Samples: 1134694500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 22:44:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:44:28,840][71000] Updated weights for policy 0, policy_version 98024 (0.0031) [2024-06-12 22:44:30,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1606107136. Throughput: 0: 49993.8. Samples: 1134997580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 22:44:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:44:32,137][71000] Updated weights for policy 0, policy_version 98034 (0.0026) [2024-06-12 22:44:35,533][71000] Updated weights for policy 0, policy_version 98044 (0.0033) [2024-06-12 22:44:35,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1606352896. Throughput: 0: 49775.1. Samples: 1135136500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 22:44:35,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 22:44:38,615][71000] Updated weights for policy 0, policy_version 98054 (0.0032) [2024-06-12 22:44:40,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.2, 300 sec: 49485.2). 
Total num frames: 1606631424. Throughput: 0: 49710.2. Samples: 1135431980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 22:44:40,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:44:42,487][71000] Updated weights for policy 0, policy_version 98064 (0.0030) [2024-06-12 22:44:45,157][71000] Updated weights for policy 0, policy_version 98074 (0.0023) [2024-06-12 22:44:45,939][70768] Fps is (10 sec: 52429.0, 60 sec: 49971.3, 300 sec: 49485.3). Total num frames: 1606877184. Throughput: 0: 49719.7. Samples: 1135731440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 22:44:45,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:44:48,925][71000] Updated weights for policy 0, policy_version 98084 (0.0034) [2024-06-12 22:44:50,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1607090176. Throughput: 0: 49504.1. Samples: 1135877440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 22:44:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:44:52,061][71000] Updated weights for policy 0, policy_version 98094 (0.0033) [2024-06-12 22:44:52,771][70980] Signal inference workers to stop experience collection... (16650 times) [2024-06-12 22:44:52,775][70980] Signal inference workers to resume experience collection... (16650 times) [2024-06-12 22:44:52,807][71000] InferenceWorker_p0-w0: stopping experience collection (16650 times) [2024-06-12 22:44:52,807][71000] InferenceWorker_p0-w0: resuming experience collection (16650 times) [2024-06-12 22:44:55,304][71000] Updated weights for policy 0, policy_version 98104 (0.0026) [2024-06-12 22:44:55,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 1607335936. Throughput: 0: 49474.2. Samples: 1136170220. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 22:44:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:44:58,488][71000] Updated weights for policy 0, policy_version 98114 (0.0034) [2024-06-12 22:45:00,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1607630848. Throughput: 0: 49498.6. Samples: 1136465820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 22:45:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:45:02,422][71000] Updated weights for policy 0, policy_version 98124 (0.0030) [2024-06-12 22:45:05,128][71000] Updated weights for policy 0, policy_version 98134 (0.0027) [2024-06-12 22:45:05,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49971.3, 300 sec: 49485.2). Total num frames: 1607860224. Throughput: 0: 49849.9. Samples: 1136633760. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-12 22:45:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:45:09,081][71000] Updated weights for policy 0, policy_version 98144 (0.0027) [2024-06-12 22:45:10,940][70768] Fps is (10 sec: 44236.7, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 1608073216. Throughput: 0: 49664.3. Samples: 1136929400. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-12 22:45:10,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:45:11,675][71000] Updated weights for policy 0, policy_version 98154 (0.0029) [2024-06-12 22:45:15,362][71000] Updated weights for policy 0, policy_version 98164 (0.0023) [2024-06-12 22:45:15,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1608318976. Throughput: 0: 49593.7. Samples: 1137229300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-12 22:45:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:45:18,076][71000] Updated weights for policy 0, policy_version 98174 (0.0031) [2024-06-12 22:45:20,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49425.1, 300 sec: 49485.2). 
Total num frames: 1608613888. Throughput: 0: 49745.6. Samples: 1137375060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-12 22:45:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:45:21,534][71000] Updated weights for policy 0, policy_version 98184 (0.0027) [2024-06-12 22:45:24,913][71000] Updated weights for policy 0, policy_version 98194 (0.0027) [2024-06-12 22:45:25,940][70768] Fps is (10 sec: 55706.1, 60 sec: 49971.2, 300 sec: 49541.1). Total num frames: 1608876032. Throughput: 0: 50000.4. Samples: 1137682000. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-12 22:45:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:45:28,363][71000] Updated weights for policy 0, policy_version 98204 (0.0027) [2024-06-12 22:45:30,939][70768] Fps is (10 sec: 45876.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1609072640. Throughput: 0: 49930.2. Samples: 1137978300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-12 22:45:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:45:31,448][71000] Updated weights for policy 0, policy_version 98214 (0.0035) [2024-06-12 22:45:34,854][71000] Updated weights for policy 0, policy_version 98224 (0.0034) [2024-06-12 22:45:35,940][70768] Fps is (10 sec: 44236.9, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1609318400. Throughput: 0: 49744.1. Samples: 1138115920. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-12 22:45:35,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:45:36,796][70980] Signal inference workers to stop experience collection... (16700 times) [2024-06-12 22:45:36,829][71000] InferenceWorker_p0-w0: stopping experience collection (16700 times) [2024-06-12 22:45:36,854][70980] Signal inference workers to resume experience collection... 
(16700 times) [2024-06-12 22:45:36,854][71000] InferenceWorker_p0-w0: resuming experience collection (16700 times) [2024-06-12 22:45:38,063][71000] Updated weights for policy 0, policy_version 98234 (0.0025) [2024-06-12 22:45:40,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49424.9, 300 sec: 49540.7). Total num frames: 1609596928. Throughput: 0: 49779.4. Samples: 1138410300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-12 22:45:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:45:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000098243_1609613312.pth... [2024-06-12 22:45:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000097517_1597718528.pth [2024-06-12 22:45:41,211][71000] Updated weights for policy 0, policy_version 98244 (0.0032) [2024-06-12 22:45:44,642][71000] Updated weights for policy 0, policy_version 98254 (0.0028) [2024-06-12 22:45:45,939][70768] Fps is (10 sec: 55705.9, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1609875456. Throughput: 0: 50014.4. Samples: 1138716460. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-12 22:45:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:45:47,457][71000] Updated weights for policy 0, policy_version 98264 (0.0034) [2024-06-12 22:45:50,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1610088448. Throughput: 0: 49763.6. Samples: 1138873120. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-12 22:45:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:45:51,302][71000] Updated weights for policy 0, policy_version 98274 (0.0034) [2024-06-12 22:45:54,332][71000] Updated weights for policy 0, policy_version 98284 (0.0034) [2024-06-12 22:45:55,940][70768] Fps is (10 sec: 42598.0, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1610301440. Throughput: 0: 49603.7. Samples: 1139161560. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-12 22:45:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:45:57,774][71000] Updated weights for policy 0, policy_version 98294 (0.0029) [2024-06-12 22:46:00,774][71000] Updated weights for policy 0, policy_version 98304 (0.0034) [2024-06-12 22:46:00,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.2, 300 sec: 49651.9). Total num frames: 1610612736. Throughput: 0: 49406.8. Samples: 1139452600. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-12 22:46:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:46:04,624][71000] Updated weights for policy 0, policy_version 98314 (0.0036) [2024-06-12 22:46:05,940][70768] Fps is (10 sec: 55704.9, 60 sec: 49971.1, 300 sec: 49540.8). Total num frames: 1610858496. Throughput: 0: 49771.1. Samples: 1139614760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 22:46:05,941][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:46:07,343][71000] Updated weights for policy 0, policy_version 98324 (0.0028) [2024-06-12 22:46:10,939][70768] Fps is (10 sec: 45875.3, 60 sec: 49971.3, 300 sec: 49485.2). Total num frames: 1611071488. Throughput: 0: 49455.6. Samples: 1139907500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 22:46:10,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:46:11,278][71000] Updated weights for policy 0, policy_version 98334 (0.0023) [2024-06-12 22:46:14,448][71000] Updated weights for policy 0, policy_version 98344 (0.0030) [2024-06-12 22:46:15,940][70768] Fps is (10 sec: 42598.6, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 1611284480. Throughput: 0: 49189.2. Samples: 1140191820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 22:46:15,940][70768] Avg episode reward: [(0, '0.257')] [2024-06-12 22:46:18,259][71000] Updated weights for policy 0, policy_version 98354 (0.0030) [2024-06-12 22:46:20,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.2, 300 sec: 49596.3). 
Total num frames: 1611579392. Throughput: 0: 49316.9. Samples: 1140335180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 22:46:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:46:21,048][71000] Updated weights for policy 0, policy_version 98364 (0.0033) [2024-06-12 22:46:24,927][71000] Updated weights for policy 0, policy_version 98374 (0.0026) [2024-06-12 22:46:25,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48878.8, 300 sec: 49429.7). Total num frames: 1611808768. Throughput: 0: 49210.2. Samples: 1140624760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 22:46:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:46:27,350][70980] Signal inference workers to stop experience collection... (16750 times) [2024-06-12 22:46:27,391][71000] InferenceWorker_p0-w0: stopping experience collection (16750 times) [2024-06-12 22:46:27,460][70980] Signal inference workers to resume experience collection... (16750 times) [2024-06-12 22:46:27,461][71000] InferenceWorker_p0-w0: resuming experience collection (16750 times) [2024-06-12 22:46:27,587][71000] Updated weights for policy 0, policy_version 98384 (0.0033) [2024-06-12 22:46:30,940][70768] Fps is (10 sec: 44236.1, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 1612021760. Throughput: 0: 49142.9. Samples: 1140927900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 22:46:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:46:31,822][71000] Updated weights for policy 0, policy_version 98394 (0.0026) [2024-06-12 22:46:34,657][71000] Updated weights for policy 0, policy_version 98404 (0.0025) [2024-06-12 22:46:35,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1612267520. Throughput: 0: 48503.9. Samples: 1141055800. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 22:46:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:46:38,453][71000] Updated weights for policy 0, policy_version 98414 (0.0037) [2024-06-12 22:46:40,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1612562432. Throughput: 0: 48811.8. Samples: 1141358100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 22:46:40,949][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:46:41,046][71000] Updated weights for policy 0, policy_version 98424 (0.0027) [2024-06-12 22:46:45,029][71000] Updated weights for policy 0, policy_version 98434 (0.0027) [2024-06-12 22:46:45,940][70768] Fps is (10 sec: 54067.4, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 1612808192. Throughput: 0: 49023.1. Samples: 1141658640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 22:46:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:46:47,508][71000] Updated weights for policy 0, policy_version 98444 (0.0030) [2024-06-12 22:46:50,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1613037568. Throughput: 0: 48670.8. Samples: 1141804940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 22:46:50,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:46:51,484][71000] Updated weights for policy 0, policy_version 98454 (0.0025) [2024-06-12 22:46:54,505][71000] Updated weights for policy 0, policy_version 98464 (0.0034) [2024-06-12 22:46:55,940][70768] Fps is (10 sec: 44236.3, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1613250560. Throughput: 0: 48649.6. Samples: 1142096740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 22:46:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:46:58,166][71000] Updated weights for policy 0, policy_version 98474 (0.0031) [2024-06-12 22:47:00,939][70768] Fps is (10 sec: 50790.8, 60 sec: 48879.0, 300 sec: 49485.2). 
Total num frames: 1613545472. Throughput: 0: 49068.2. Samples: 1142399880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-12 22:47:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:47:01,020][71000] Updated weights for policy 0, policy_version 98484 (0.0030) [2024-06-12 22:47:04,863][71000] Updated weights for policy 0, policy_version 98494 (0.0031) [2024-06-12 22:47:05,940][70768] Fps is (10 sec: 54068.0, 60 sec: 48879.1, 300 sec: 49485.2). Total num frames: 1613791232. Throughput: 0: 49351.1. Samples: 1142555980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-12 22:47:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:47:07,590][71000] Updated weights for policy 0, policy_version 98504 (0.0030) [2024-06-12 22:47:10,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48605.8, 300 sec: 49318.6). Total num frames: 1613987840. Throughput: 0: 49404.6. Samples: 1142847960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-12 22:47:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:47:11,466][71000] Updated weights for policy 0, policy_version 98514 (0.0022) [2024-06-12 22:47:14,435][71000] Updated weights for policy 0, policy_version 98524 (0.0036) [2024-06-12 22:47:15,940][70768] Fps is (10 sec: 44236.7, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1614233600. Throughput: 0: 48990.3. Samples: 1143132460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-12 22:47:15,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 22:47:18,363][71000] Updated weights for policy 0, policy_version 98534 (0.0033) [2024-06-12 22:47:20,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 1614528512. Throughput: 0: 49518.6. Samples: 1143284140. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-12 22:47:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:47:21,510][71000] Updated weights for policy 0, policy_version 98544 (0.0034) [2024-06-12 22:47:25,115][71000] Updated weights for policy 0, policy_version 98554 (0.0037) [2024-06-12 22:47:25,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1614757888. Throughput: 0: 49381.9. Samples: 1143580280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-12 22:47:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:47:27,952][71000] Updated weights for policy 0, policy_version 98564 (0.0030) [2024-06-12 22:47:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 1614987264. Throughput: 0: 49209.2. Samples: 1143873060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-12 22:47:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:47:31,395][70980] Signal inference workers to stop experience collection... (16800 times) [2024-06-12 22:47:31,396][70980] Signal inference workers to resume experience collection... (16800 times) [2024-06-12 22:47:31,435][71000] InferenceWorker_p0-w0: stopping experience collection (16800 times) [2024-06-12 22:47:31,435][71000] InferenceWorker_p0-w0: resuming experience collection (16800 times) [2024-06-12 22:47:31,532][71000] Updated weights for policy 0, policy_version 98574 (0.0028) [2024-06-12 22:47:34,425][71000] Updated weights for policy 0, policy_version 98584 (0.0029) [2024-06-12 22:47:35,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 1615216640. Throughput: 0: 49193.7. Samples: 1144018660. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-12 22:47:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:47:38,061][71000] Updated weights for policy 0, policy_version 98594 (0.0027) [2024-06-12 22:47:40,839][71000] Updated weights for policy 0, policy_version 98604 (0.0031) [2024-06-12 22:47:40,940][70768] Fps is (10 sec: 54067.5, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1615527936. Throughput: 0: 49245.8. Samples: 1144312800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-12 22:47:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:47:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000098604_1615527936.pth... [2024-06-12 22:47:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000097879_1603649536.pth [2024-06-12 22:47:44,769][71000] Updated weights for policy 0, policy_version 98614 (0.0027) [2024-06-12 22:47:45,940][70768] Fps is (10 sec: 52429.3, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 1615740928. Throughput: 0: 49315.1. Samples: 1144619060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-12 22:47:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:47:47,591][71000] Updated weights for policy 0, policy_version 98624 (0.0025) [2024-06-12 22:47:50,939][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 1615986688. Throughput: 0: 49212.4. Samples: 1144770540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-12 22:47:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:47:51,121][71000] Updated weights for policy 0, policy_version 98634 (0.0032) [2024-06-12 22:47:54,408][71000] Updated weights for policy 0, policy_version 98644 (0.0032) [2024-06-12 22:47:55,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1616216064. Throughput: 0: 49203.6. Samples: 1145062120. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-12 22:47:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:47:57,811][71000] Updated weights for policy 0, policy_version 98654 (0.0024) [2024-06-12 22:48:00,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1616494592. Throughput: 0: 49518.3. Samples: 1145360780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-12 22:48:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:48:00,981][71000] Updated weights for policy 0, policy_version 98664 (0.0027) [2024-06-12 22:48:04,213][71000] Updated weights for policy 0, policy_version 98674 (0.0024) [2024-06-12 22:48:05,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1616740352. Throughput: 0: 49572.0. Samples: 1145514880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-12 22:48:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:48:07,374][71000] Updated weights for policy 0, policy_version 98684 (0.0026) [2024-06-12 22:48:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 1616969728. Throughput: 0: 49550.8. Samples: 1145810060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-12 22:48:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:48:11,170][71000] Updated weights for policy 0, policy_version 98694 (0.0027) [2024-06-12 22:48:14,291][71000] Updated weights for policy 0, policy_version 98704 (0.0029) [2024-06-12 22:48:15,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 1617215488. Throughput: 0: 49670.0. Samples: 1146108200. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-12 22:48:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:48:17,702][71000] Updated weights for policy 0, policy_version 98714 (0.0026) [2024-06-12 22:48:20,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 1617477632. Throughput: 0: 49580.1. Samples: 1146249760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-12 22:48:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:48:21,049][71000] Updated weights for policy 0, policy_version 98724 (0.0027) [2024-06-12 22:48:24,133][71000] Updated weights for policy 0, policy_version 98734 (0.0030) [2024-06-12 22:48:25,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1617723392. Throughput: 0: 49678.7. Samples: 1146548340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-12 22:48:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:48:27,475][71000] Updated weights for policy 0, policy_version 98744 (0.0031) [2024-06-12 22:48:30,707][71000] Updated weights for policy 0, policy_version 98754 (0.0023) [2024-06-12 22:48:30,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49971.3, 300 sec: 49485.2). Total num frames: 1617985536. Throughput: 0: 49555.2. Samples: 1146849040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-12 22:48:30,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:48:33,961][71000] Updated weights for policy 0, policy_version 98764 (0.0026) [2024-06-12 22:48:35,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49971.3, 300 sec: 49374.2). Total num frames: 1618214912. Throughput: 0: 49448.4. Samples: 1146995720. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-12 22:48:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:48:37,245][71000] Updated weights for policy 0, policy_version 98774 (0.0022) [2024-06-12 22:48:40,740][71000] Updated weights for policy 0, policy_version 98784 (0.0039) [2024-06-12 22:48:40,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1618493440. Throughput: 0: 49588.4. Samples: 1147293600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-12 22:48:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:48:43,833][71000] Updated weights for policy 0, policy_version 98794 (0.0025) [2024-06-12 22:48:45,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1618739200. Throughput: 0: 49734.6. Samples: 1147598840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-12 22:48:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:48:47,227][71000] Updated weights for policy 0, policy_version 98804 (0.0028) [2024-06-12 22:48:48,609][70980] Signal inference workers to stop experience collection... (16850 times) [2024-06-12 22:48:48,655][71000] InferenceWorker_p0-w0: stopping experience collection (16850 times) [2024-06-12 22:48:48,659][70980] Signal inference workers to resume experience collection... (16850 times) [2024-06-12 22:48:48,663][71000] InferenceWorker_p0-w0: resuming experience collection (16850 times) [2024-06-12 22:48:50,323][71000] Updated weights for policy 0, policy_version 98814 (0.0023) [2024-06-12 22:48:50,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1618984960. Throughput: 0: 49830.8. Samples: 1147757260. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-12 22:48:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:48:54,109][71000] Updated weights for policy 0, policy_version 98824 (0.0022) [2024-06-12 22:48:55,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1619214336. Throughput: 0: 49600.0. Samples: 1148042060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-12 22:48:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:48:57,269][71000] Updated weights for policy 0, policy_version 98834 (0.0029) [2024-06-12 22:49:00,759][71000] Updated weights for policy 0, policy_version 98844 (0.0029) [2024-06-12 22:49:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1619460096. Throughput: 0: 49467.9. Samples: 1148334260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-12 22:49:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:49:03,785][71000] Updated weights for policy 0, policy_version 98854 (0.0040) [2024-06-12 22:49:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1619705856. Throughput: 0: 49489.7. Samples: 1148476800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 22:49:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:49:07,594][71000] Updated weights for policy 0, policy_version 98864 (0.0030) [2024-06-12 22:49:10,173][71000] Updated weights for policy 0, policy_version 98874 (0.0024) [2024-06-12 22:49:10,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1619968000. Throughput: 0: 49650.7. Samples: 1148782620. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 22:49:10,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:49:13,801][71000] Updated weights for policy 0, policy_version 98884 (0.0024) [2024-06-12 22:49:15,940][70768] Fps is (10 sec: 52429.3, 60 sec: 50244.2, 300 sec: 49429.7). Total num frames: 1620230144. Throughput: 0: 49780.4. Samples: 1149089160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 22:49:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:49:17,015][71000] Updated weights for policy 0, policy_version 98894 (0.0032) [2024-06-12 22:49:20,774][71000] Updated weights for policy 0, policy_version 98904 (0.0028) [2024-06-12 22:49:20,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 1620443136. Throughput: 0: 49642.9. Samples: 1149229660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 22:49:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:49:23,810][71000] Updated weights for policy 0, policy_version 98914 (0.0040) [2024-06-12 22:49:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1620705280. Throughput: 0: 49689.4. Samples: 1149529620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 22:49:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:49:27,360][71000] Updated weights for policy 0, policy_version 98924 (0.0039) [2024-06-12 22:49:30,181][71000] Updated weights for policy 0, policy_version 98934 (0.0025) [2024-06-12 22:49:30,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1620951040. Throughput: 0: 49275.1. Samples: 1149816220. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 22:49:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:49:34,045][71000] Updated weights for policy 0, policy_version 98944 (0.0023) [2024-06-12 22:49:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1621213184. Throughput: 0: 49368.9. Samples: 1149978860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 22:49:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:49:36,998][71000] Updated weights for policy 0, policy_version 98954 (0.0025) [2024-06-12 22:49:40,393][71000] Updated weights for policy 0, policy_version 98964 (0.0031) [2024-06-12 22:49:40,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1621442560. Throughput: 0: 49636.9. Samples: 1150275720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 22:49:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:49:41,043][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000098966_1621458944.pth... [2024-06-12 22:49:41,100][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000098243_1609613312.pth [2024-06-12 22:49:43,775][71000] Updated weights for policy 0, policy_version 98974 (0.0030) [2024-06-12 22:49:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1621688320. Throughput: 0: 49691.2. Samples: 1150570360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 22:49:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:49:47,034][71000] Updated weights for policy 0, policy_version 98984 (0.0026) [2024-06-12 22:49:50,294][71000] Updated weights for policy 0, policy_version 98994 (0.0021) [2024-06-12 22:49:50,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1621950464. Throughput: 0: 49771.2. Samples: 1150716500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 22:49:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:49:53,839][71000] Updated weights for policy 0, policy_version 99004 (0.0026) [2024-06-12 22:49:54,038][70980] Signal inference workers to stop experience collection... (16900 times) [2024-06-12 22:49:54,085][71000] InferenceWorker_p0-w0: stopping experience collection (16900 times) [2024-06-12 22:49:54,092][70980] Signal inference workers to resume experience collection... (16900 times) [2024-06-12 22:49:54,098][71000] InferenceWorker_p0-w0: resuming experience collection (16900 times) [2024-06-12 22:49:55,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1622212608. Throughput: 0: 49711.1. Samples: 1151019620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 22:49:55,940][70768] Avg episode reward: [(0, '0.254')] [2024-06-12 22:49:56,623][71000] Updated weights for policy 0, policy_version 99014 (0.0025) [2024-06-12 22:50:00,538][71000] Updated weights for policy 0, policy_version 99024 (0.0035) [2024-06-12 22:50:00,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1622441984. Throughput: 0: 49691.1. Samples: 1151325260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-12 22:50:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:50:03,457][71000] Updated weights for policy 0, policy_version 99034 (0.0022) [2024-06-12 22:50:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1622687744. Throughput: 0: 49838.4. Samples: 1151472380. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 22:50:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:50:06,994][71000] Updated weights for policy 0, policy_version 99044 (0.0025) [2024-06-12 22:50:10,097][71000] Updated weights for policy 0, policy_version 99054 (0.0032) [2024-06-12 22:50:10,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49424.9, 300 sec: 49540.8). Total num frames: 1622933504. Throughput: 0: 49690.4. Samples: 1151765700. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 22:50:10,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:50:13,819][71000] Updated weights for policy 0, policy_version 99064 (0.0025) [2024-06-12 22:50:15,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1623195648. Throughput: 0: 49601.4. Samples: 1152048280. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 22:50:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:50:16,997][71000] Updated weights for policy 0, policy_version 99074 (0.0019) [2024-06-12 22:50:20,179][71000] Updated weights for policy 0, policy_version 99084 (0.0036) [2024-06-12 22:50:20,939][70768] Fps is (10 sec: 50791.9, 60 sec: 49971.4, 300 sec: 49374.2). Total num frames: 1623441408. Throughput: 0: 49349.9. Samples: 1152199600. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 22:50:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:50:23,332][71000] Updated weights for policy 0, policy_version 99094 (0.0029) [2024-06-12 22:50:25,939][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1623654400. Throughput: 0: 49480.4. Samples: 1152502340. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 22:50:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:50:26,723][71000] Updated weights for policy 0, policy_version 99104 (0.0025) [2024-06-12 22:50:29,975][71000] Updated weights for policy 0, policy_version 99114 (0.0026) [2024-06-12 22:50:30,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1623900160. Throughput: 0: 49519.2. Samples: 1152798720. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 22:50:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:50:33,333][71000] Updated weights for policy 0, policy_version 99124 (0.0027) [2024-06-12 22:50:35,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1624178688. Throughput: 0: 49668.4. Samples: 1152951580. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 22:50:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:50:36,630][71000] Updated weights for policy 0, policy_version 99134 (0.0032) [2024-06-12 22:50:39,843][71000] Updated weights for policy 0, policy_version 99144 (0.0034) [2024-06-12 22:50:40,940][70768] Fps is (10 sec: 55704.6, 60 sec: 50244.1, 300 sec: 49429.7). Total num frames: 1624457216. Throughput: 0: 49622.5. Samples: 1153252640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 22:50:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:50:42,902][71000] Updated weights for policy 0, policy_version 99154 (0.0029) [2024-06-12 22:50:44,045][70980] Signal inference workers to stop experience collection... (16950 times) [2024-06-12 22:50:44,083][71000] InferenceWorker_p0-w0: stopping experience collection (16950 times) [2024-06-12 22:50:44,090][70980] Signal inference workers to resume experience collection... 
(16950 times) [2024-06-12 22:50:44,104][71000] InferenceWorker_p0-w0: resuming experience collection (16950 times) [2024-06-12 22:50:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1624670208. Throughput: 0: 49388.4. Samples: 1153547740. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 22:50:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:50:46,492][71000] Updated weights for policy 0, policy_version 99164 (0.0030) [2024-06-12 22:50:49,534][71000] Updated weights for policy 0, policy_version 99174 (0.0027) [2024-06-12 22:50:50,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49424.9, 300 sec: 49540.8). Total num frames: 1624915968. Throughput: 0: 49427.4. Samples: 1153696620. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 22:50:50,944][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:50:53,079][71000] Updated weights for policy 0, policy_version 99184 (0.0026) [2024-06-12 22:50:55,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1625161728. Throughput: 0: 49253.2. Samples: 1153982080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 22:50:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:50:56,422][71000] Updated weights for policy 0, policy_version 99194 (0.0031) [2024-06-12 22:50:59,844][71000] Updated weights for policy 0, policy_version 99204 (0.0032) [2024-06-12 22:51:00,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 1625423872. Throughput: 0: 49486.6. Samples: 1154275180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-12 22:51:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:51:02,928][71000] Updated weights for policy 0, policy_version 99214 (0.0026) [2024-06-12 22:51:05,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49424.9, 300 sec: 49429.7). Total num frames: 1625653248. Throughput: 0: 49563.7. Samples: 1154429980. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-12 22:51:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:51:06,405][71000] Updated weights for policy 0, policy_version 99224 (0.0020) [2024-06-12 22:51:09,552][71000] Updated weights for policy 0, policy_version 99234 (0.0039) [2024-06-12 22:51:10,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1625899008. Throughput: 0: 49513.5. Samples: 1154730460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-12 22:51:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:51:12,993][71000] Updated weights for policy 0, policy_version 99244 (0.0031) [2024-06-12 22:51:15,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1626144768. Throughput: 0: 49465.8. Samples: 1155024680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-12 22:51:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:51:16,322][71000] Updated weights for policy 0, policy_version 99254 (0.0027) [2024-06-12 22:51:19,555][71000] Updated weights for policy 0, policy_version 99264 (0.0023) [2024-06-12 22:51:20,940][70768] Fps is (10 sec: 50791.5, 60 sec: 49425.0, 300 sec: 49485.3). Total num frames: 1626406912. Throughput: 0: 49376.1. Samples: 1155173500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-12 22:51:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:51:22,749][71000] Updated weights for policy 0, policy_version 99274 (0.0031) [2024-06-12 22:51:25,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1626652672. Throughput: 0: 49498.8. Samples: 1155480080. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-12 22:51:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:51:26,063][71000] Updated weights for policy 0, policy_version 99284 (0.0024) [2024-06-12 22:51:29,594][71000] Updated weights for policy 0, policy_version 99294 (0.0037) [2024-06-12 22:51:30,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1626865664. Throughput: 0: 49193.9. Samples: 1155761460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-12 22:51:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:51:32,687][71000] Updated weights for policy 0, policy_version 99304 (0.0032) [2024-06-12 22:51:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1627144192. Throughput: 0: 49076.2. Samples: 1155905040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-12 22:51:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:51:36,082][71000] Updated weights for policy 0, policy_version 99314 (0.0029) [2024-06-12 22:51:39,073][70980] Signal inference workers to stop experience collection... (17000 times) [2024-06-12 22:51:39,095][71000] InferenceWorker_p0-w0: stopping experience collection (17000 times) [2024-06-12 22:51:39,130][70980] Signal inference workers to resume experience collection... (17000 times) [2024-06-12 22:51:39,131][71000] InferenceWorker_p0-w0: resuming experience collection (17000 times) [2024-06-12 22:51:39,285][71000] Updated weights for policy 0, policy_version 99324 (0.0031) [2024-06-12 22:51:40,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 1627406336. Throughput: 0: 49457.2. Samples: 1156207660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-12 22:51:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:51:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000099329_1627406336.pth... 
[2024-06-12 22:51:41,013][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000098604_1615527936.pth [2024-06-12 22:51:42,969][71000] Updated weights for policy 0, policy_version 99334 (0.0034) [2024-06-12 22:51:45,789][71000] Updated weights for policy 0, policy_version 99344 (0.0026) [2024-06-12 22:51:45,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1627652096. Throughput: 0: 49629.2. Samples: 1156508500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-12 22:51:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:51:49,197][71000] Updated weights for policy 0, policy_version 99354 (0.0038) [2024-06-12 22:51:50,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48879.0, 300 sec: 49485.2). Total num frames: 1627848704. Throughput: 0: 49554.4. Samples: 1156659920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-12 22:51:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:51:52,322][71000] Updated weights for policy 0, policy_version 99364 (0.0021) [2024-06-12 22:51:55,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49424.9, 300 sec: 49429.7). Total num frames: 1628127232. Throughput: 0: 49312.9. Samples: 1156949540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-12 22:51:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:51:56,146][71000] Updated weights for policy 0, policy_version 99374 (0.0029) [2024-06-12 22:51:59,112][71000] Updated weights for policy 0, policy_version 99384 (0.0030) [2024-06-12 22:52:00,940][70768] Fps is (10 sec: 54066.3, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1628389376. Throughput: 0: 49282.4. Samples: 1157242400. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-12 22:52:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:52:02,451][71000] Updated weights for policy 0, policy_version 99394 (0.0029) [2024-06-12 22:52:05,615][71000] Updated weights for policy 0, policy_version 99404 (0.0026) [2024-06-12 22:52:05,939][70768] Fps is (10 sec: 52429.9, 60 sec: 49971.4, 300 sec: 49707.4). Total num frames: 1628651520. Throughput: 0: 49624.1. Samples: 1157406580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 22:52:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:52:09,107][71000] Updated weights for policy 0, policy_version 99414 (0.0028) [2024-06-12 22:52:10,939][70768] Fps is (10 sec: 45876.2, 60 sec: 49152.2, 300 sec: 49540.8). Total num frames: 1628848128. Throughput: 0: 49289.8. Samples: 1157698120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 22:52:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:52:11,954][71000] Updated weights for policy 0, policy_version 99424 (0.0026) [2024-06-12 22:52:15,524][71000] Updated weights for policy 0, policy_version 99434 (0.0032) [2024-06-12 22:52:15,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49698.1, 300 sec: 49485.3). Total num frames: 1629126656. Throughput: 0: 49631.1. Samples: 1157994860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 22:52:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:52:18,728][71000] Updated weights for policy 0, policy_version 99444 (0.0028) [2024-06-12 22:52:20,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1629372416. Throughput: 0: 49726.6. Samples: 1158142740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 22:52:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-12 22:52:20,947][70980] Saving new best policy, reward=0.287! 
[2024-06-12 22:52:22,472][71000] Updated weights for policy 0, policy_version 99454 (0.0033) [2024-06-12 22:52:25,174][70980] Signal inference workers to stop experience collection... (17050 times) [2024-06-12 22:52:25,201][71000] InferenceWorker_p0-w0: stopping experience collection (17050 times) [2024-06-12 22:52:25,235][70980] Signal inference workers to resume experience collection... (17050 times) [2024-06-12 22:52:25,235][71000] InferenceWorker_p0-w0: resuming experience collection (17050 times) [2024-06-12 22:52:25,383][71000] Updated weights for policy 0, policy_version 99464 (0.0032) [2024-06-12 22:52:25,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49971.2, 300 sec: 49707.4). Total num frames: 1629650944. Throughput: 0: 49867.1. Samples: 1158451680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 22:52:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:52:28,748][71000] Updated weights for policy 0, policy_version 99474 (0.0026) [2024-06-12 22:52:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49971.0, 300 sec: 49651.8). Total num frames: 1629863936. Throughput: 0: 49889.8. Samples: 1158753540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 22:52:30,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:52:31,702][71000] Updated weights for policy 0, policy_version 99484 (0.0027) [2024-06-12 22:52:35,829][71000] Updated weights for policy 0, policy_version 99494 (0.0021) [2024-06-12 22:52:35,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1630109696. Throughput: 0: 49329.8. Samples: 1158879760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 22:52:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:52:38,794][71000] Updated weights for policy 0, policy_version 99504 (0.0035) [2024-06-12 22:52:40,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1630355456. Throughput: 0: 49483.2. Samples: 1159176280. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 22:52:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 22:52:42,689][71000] Updated weights for policy 0, policy_version 99514 (0.0026) [2024-06-12 22:52:45,336][71000] Updated weights for policy 0, policy_version 99524 (0.0038) [2024-06-12 22:52:45,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1630617600. Throughput: 0: 49446.5. Samples: 1159467480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 22:52:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 22:52:48,967][71000] Updated weights for policy 0, policy_version 99534 (0.0036) [2024-06-12 22:52:50,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1630830592. Throughput: 0: 49219.9. Samples: 1159621480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 22:52:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:52:51,958][71000] Updated weights for policy 0, policy_version 99544 (0.0029) [2024-06-12 22:52:55,735][71000] Updated weights for policy 0, policy_version 99554 (0.0028) [2024-06-12 22:52:55,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1631092736. Throughput: 0: 49545.7. Samples: 1159927680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 22:52:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:52:58,377][71000] Updated weights for policy 0, policy_version 99564 (0.0024) [2024-06-12 22:53:00,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.2, 300 sec: 49485.3). Total num frames: 1631338496. Throughput: 0: 49456.9. Samples: 1160220420. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 22:53:00,940][70768] Avg episode reward: [(0, '0.250')] [2024-06-12 22:53:02,565][71000] Updated weights for policy 0, policy_version 99574 (0.0028) [2024-06-12 22:53:05,165][71000] Updated weights for policy 0, policy_version 99584 (0.0031) [2024-06-12 22:53:05,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.0, 300 sec: 49651.9). Total num frames: 1631617024. Throughput: 0: 49824.6. Samples: 1160384840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 22:53:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:53:08,939][71000] Updated weights for policy 0, policy_version 99594 (0.0024) [2024-06-12 22:53:10,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1631830016. Throughput: 0: 49408.0. Samples: 1160675040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 22:53:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:53:11,755][71000] Updated weights for policy 0, policy_version 99604 (0.0035) [2024-06-12 22:53:15,531][71000] Updated weights for policy 0, policy_version 99614 (0.0030) [2024-06-12 22:53:15,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1632075776. Throughput: 0: 49092.6. Samples: 1160962700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 22:53:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:53:18,650][71000] Updated weights for policy 0, policy_version 99624 (0.0032) [2024-06-12 22:53:20,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1632354304. Throughput: 0: 49614.6. Samples: 1161112420. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 22:53:20,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 22:53:22,443][71000] Updated weights for policy 0, policy_version 99634 (0.0026) [2024-06-12 22:53:25,055][71000] Updated weights for policy 0, policy_version 99644 (0.0025) [2024-06-12 22:53:25,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1632600064. Throughput: 0: 49740.0. Samples: 1161414580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 22:53:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:53:29,040][71000] Updated weights for policy 0, policy_version 99654 (0.0035) [2024-06-12 22:53:30,939][70768] Fps is (10 sec: 45875.5, 60 sec: 49152.2, 300 sec: 49485.2). Total num frames: 1632813056. Throughput: 0: 49863.6. Samples: 1161711340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 22:53:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:53:31,028][70980] Signal inference workers to stop experience collection... (17100 times) [2024-06-12 22:53:31,029][70980] Signal inference workers to resume experience collection... (17100 times) [2024-06-12 22:53:31,048][71000] InferenceWorker_p0-w0: stopping experience collection (17100 times) [2024-06-12 22:53:31,049][71000] InferenceWorker_p0-w0: resuming experience collection (17100 times) [2024-06-12 22:53:31,733][71000] Updated weights for policy 0, policy_version 99664 (0.0028) [2024-06-12 22:53:35,937][71000] Updated weights for policy 0, policy_version 99674 (0.0037) [2024-06-12 22:53:35,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 1633058816. Throughput: 0: 49359.5. Samples: 1161842660. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 22:53:35,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 22:53:38,567][71000] Updated weights for policy 0, policy_version 99684 (0.0031) [2024-06-12 22:53:40,939][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1633320960. Throughput: 0: 49163.1. Samples: 1162140020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 22:53:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:53:40,987][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000099691_1633337344.pth... [2024-06-12 22:53:41,037][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000098966_1621458944.pth [2024-06-12 22:53:42,365][71000] Updated weights for policy 0, policy_version 99694 (0.0031) [2024-06-12 22:53:45,031][71000] Updated weights for policy 0, policy_version 99704 (0.0030) [2024-06-12 22:53:45,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1633583104. Throughput: 0: 49209.5. Samples: 1162434860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 22:53:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:53:48,787][71000] Updated weights for policy 0, policy_version 99714 (0.0028) [2024-06-12 22:53:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1633796096. Throughput: 0: 48955.1. Samples: 1162587820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 22:53:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:53:52,076][71000] Updated weights for policy 0, policy_version 99724 (0.0032) [2024-06-12 22:53:55,731][71000] Updated weights for policy 0, policy_version 99734 (0.0029) [2024-06-12 22:53:55,939][70768] Fps is (10 sec: 47515.0, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1634058240. Throughput: 0: 48908.1. Samples: 1162875900. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 22:53:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:53:58,668][71000] Updated weights for policy 0, policy_version 99744 (0.0020) [2024-06-12 22:54:00,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1634304000. Throughput: 0: 49050.4. Samples: 1163169980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-12 22:54:00,944][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:54:02,229][71000] Updated weights for policy 0, policy_version 99754 (0.0033) [2024-06-12 22:54:05,083][71000] Updated weights for policy 0, policy_version 99764 (0.0021) [2024-06-12 22:54:05,939][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1634566144. Throughput: 0: 49109.0. Samples: 1163322320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 22:54:05,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:54:08,423][71000] Updated weights for policy 0, policy_version 99774 (0.0025) [2024-06-12 22:54:10,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1634795520. Throughput: 0: 49176.0. Samples: 1163627500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 22:54:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:54:11,628][71000] Updated weights for policy 0, policy_version 99784 (0.0029) [2024-06-12 22:54:15,295][71000] Updated weights for policy 0, policy_version 99794 (0.0026) [2024-06-12 22:54:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1635057664. Throughput: 0: 49029.7. Samples: 1163917680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 22:54:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:54:18,643][71000] Updated weights for policy 0, policy_version 99804 (0.0029) [2024-06-12 22:54:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49485.2). 
Total num frames: 1635303424. Throughput: 0: 49454.8. Samples: 1164068120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 22:54:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:54:21,790][71000] Updated weights for policy 0, policy_version 99814 (0.0026) [2024-06-12 22:54:25,273][71000] Updated weights for policy 0, policy_version 99824 (0.0022) [2024-06-12 22:54:25,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 1635532800. Throughput: 0: 49474.3. Samples: 1164366360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 22:54:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:54:28,395][71000] Updated weights for policy 0, policy_version 99834 (0.0024) [2024-06-12 22:54:30,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 1635778560. Throughput: 0: 49555.8. Samples: 1164664860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 22:54:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:54:31,595][71000] Updated weights for policy 0, policy_version 99844 (0.0038) [2024-06-12 22:54:34,502][70980] Signal inference workers to stop experience collection... (17150 times) [2024-06-12 22:54:34,547][71000] InferenceWorker_p0-w0: stopping experience collection (17150 times) [2024-06-12 22:54:34,609][70980] Signal inference workers to resume experience collection... (17150 times) [2024-06-12 22:54:34,610][71000] InferenceWorker_p0-w0: resuming experience collection (17150 times) [2024-06-12 22:54:34,735][71000] Updated weights for policy 0, policy_version 99854 (0.0025) [2024-06-12 22:54:35,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1636040704. Throughput: 0: 49595.1. Samples: 1164819600. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 22:54:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:54:38,454][71000] Updated weights for policy 0, policy_version 99864 (0.0020) [2024-06-12 22:54:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1636270080. Throughput: 0: 49555.4. Samples: 1165105900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 22:54:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:54:41,503][71000] Updated weights for policy 0, policy_version 99874 (0.0029) [2024-06-12 22:54:45,053][71000] Updated weights for policy 0, policy_version 99884 (0.0027) [2024-06-12 22:54:45,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1636548608. Throughput: 0: 49762.7. Samples: 1165409300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 22:54:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:54:48,351][71000] Updated weights for policy 0, policy_version 99894 (0.0041) [2024-06-12 22:54:50,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 1636777984. Throughput: 0: 49736.0. Samples: 1165560440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 22:54:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 22:54:51,528][71000] Updated weights for policy 0, policy_version 99904 (0.0033) [2024-06-12 22:54:54,885][71000] Updated weights for policy 0, policy_version 99914 (0.0035) [2024-06-12 22:54:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49697.9, 300 sec: 49485.2). Total num frames: 1637040128. Throughput: 0: 49731.8. Samples: 1165865440. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 22:54:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:54:58,037][71000] Updated weights for policy 0, policy_version 99924 (0.0024) [2024-06-12 22:55:00,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1637285888. Throughput: 0: 49811.5. Samples: 1166159200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 22:55:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:55:01,472][71000] Updated weights for policy 0, policy_version 99934 (0.0022) [2024-06-12 22:55:05,044][71000] Updated weights for policy 0, policy_version 99944 (0.0026) [2024-06-12 22:55:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1637531648. Throughput: 0: 49569.6. Samples: 1166298760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:55:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:55:08,041][71000] Updated weights for policy 0, policy_version 99954 (0.0026) [2024-06-12 22:55:10,939][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1637744640. Throughput: 0: 49459.9. Samples: 1166592060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:55:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:55:11,662][71000] Updated weights for policy 0, policy_version 99964 (0.0032) [2024-06-12 22:55:14,746][71000] Updated weights for policy 0, policy_version 99974 (0.0033) [2024-06-12 22:55:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1638039552. Throughput: 0: 49537.1. Samples: 1166894040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:55:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:55:18,018][71000] Updated weights for policy 0, policy_version 99984 (0.0033) [2024-06-12 22:55:20,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1638268928. Throughput: 0: 49231.5. Samples: 1167035020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:55:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:55:21,278][71000] Updated weights for policy 0, policy_version 99994 (0.0028) [2024-06-12 22:55:24,551][71000] Updated weights for policy 0, policy_version 100004 (0.0026) [2024-06-12 22:55:25,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49698.0, 300 sec: 49540.8). Total num frames: 1638514688. Throughput: 0: 49724.5. Samples: 1167343500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:55:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:55:27,675][71000] Updated weights for policy 0, policy_version 100014 (0.0028) [2024-06-12 22:55:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1638744064. Throughput: 0: 49339.3. Samples: 1167629560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:55:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:55:31,606][71000] Updated weights for policy 0, policy_version 100024 (0.0027) [2024-06-12 22:55:34,026][70980] Signal inference workers to stop experience collection... (17200 times) [2024-06-12 22:55:34,026][70980] Signal inference workers to resume experience collection... 
(17200 times) [2024-06-12 22:55:34,046][71000] InferenceWorker_p0-w0: stopping experience collection (17200 times) [2024-06-12 22:55:34,046][71000] InferenceWorker_p0-w0: resuming experience collection (17200 times) [2024-06-12 22:55:34,756][71000] Updated weights for policy 0, policy_version 100034 (0.0033) [2024-06-12 22:55:35,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 1639006208. Throughput: 0: 49092.2. Samples: 1167769600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:55:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:55:38,231][71000] Updated weights for policy 0, policy_version 100044 (0.0028) [2024-06-12 22:55:40,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 1639251968. Throughput: 0: 48913.8. Samples: 1168066560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:55:40,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:55:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000100052_1639251968.pth... [2024-06-12 22:55:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000099329_1627406336.pth [2024-06-12 22:55:41,671][71000] Updated weights for policy 0, policy_version 100054 (0.0036) [2024-06-12 22:55:44,850][71000] Updated weights for policy 0, policy_version 100064 (0.0028) [2024-06-12 22:55:45,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1639497728. Throughput: 0: 49030.2. Samples: 1168365560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:55:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:55:48,291][71000] Updated weights for policy 0, policy_version 100074 (0.0032) [2024-06-12 22:55:50,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49424.9, 300 sec: 49429.7). Total num frames: 1639743488. Throughput: 0: 49444.1. Samples: 1168523740. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:55:50,940][70768] Avg episode reward: [(0, '0.255')] [2024-06-12 22:55:51,385][71000] Updated weights for policy 0, policy_version 100084 (0.0026) [2024-06-12 22:55:54,479][71000] Updated weights for policy 0, policy_version 100094 (0.0027) [2024-06-12 22:55:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1640005632. Throughput: 0: 49437.3. Samples: 1168816740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:55:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:55:57,868][71000] Updated weights for policy 0, policy_version 100104 (0.0029) [2024-06-12 22:56:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1640251392. Throughput: 0: 49304.1. Samples: 1169112720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-12 22:56:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:56:01,352][71000] Updated weights for policy 0, policy_version 100114 (0.0029) [2024-06-12 22:56:04,647][71000] Updated weights for policy 0, policy_version 100124 (0.0025) [2024-06-12 22:56:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.2, 300 sec: 49485.3). Total num frames: 1640497152. Throughput: 0: 49521.8. Samples: 1169263500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:56:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:56:07,771][71000] Updated weights for policy 0, policy_version 100134 (0.0029) [2024-06-12 22:56:10,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1640726528. Throughput: 0: 49171.5. Samples: 1169556220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:56:10,948][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:56:11,356][71000] Updated weights for policy 0, policy_version 100144 (0.0039) [2024-06-12 22:56:14,687][71000] Updated weights for policy 0, policy_version 100154 (0.0035) [2024-06-12 22:56:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1640988672. Throughput: 0: 49442.6. Samples: 1169854480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:56:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:56:17,739][71000] Updated weights for policy 0, policy_version 100164 (0.0029) [2024-06-12 22:56:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1641218048. Throughput: 0: 49569.5. Samples: 1170000220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:56:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:56:21,342][71000] Updated weights for policy 0, policy_version 100174 (0.0026) [2024-06-12 22:56:24,546][71000] Updated weights for policy 0, policy_version 100184 (0.0027) [2024-06-12 22:56:25,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 1641463808. Throughput: 0: 49513.0. Samples: 1170294640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:56:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:56:27,629][71000] Updated weights for policy 0, policy_version 100194 (0.0021) [2024-06-12 22:56:30,900][71000] Updated weights for policy 0, policy_version 100204 (0.0024) [2024-06-12 22:56:30,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 1641742336. Throughput: 0: 49676.7. Samples: 1170601020. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:56:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:56:34,391][71000] Updated weights for policy 0, policy_version 100214 (0.0024) [2024-06-12 22:56:35,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1641971712. Throughput: 0: 49396.6. Samples: 1170746580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:56:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:56:38,031][71000] Updated weights for policy 0, policy_version 100224 (0.0031) [2024-06-12 22:56:40,939][70768] Fps is (10 sec: 47514.5, 60 sec: 49425.3, 300 sec: 49374.2). Total num frames: 1642217472. Throughput: 0: 49469.4. Samples: 1171042860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:56:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:56:41,007][71000] Updated weights for policy 0, policy_version 100234 (0.0026) [2024-06-12 22:56:44,413][70980] Signal inference workers to stop experience collection... (17250 times) [2024-06-12 22:56:44,467][70980] Signal inference workers to resume experience collection... (17250 times) [2024-06-12 22:56:44,468][71000] InferenceWorker_p0-w0: stopping experience collection (17250 times) [2024-06-12 22:56:44,478][71000] InferenceWorker_p0-w0: resuming experience collection (17250 times) [2024-06-12 22:56:44,600][71000] Updated weights for policy 0, policy_version 100244 (0.0031) [2024-06-12 22:56:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1642463232. Throughput: 0: 49354.4. Samples: 1171333660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:56:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:56:47,733][71000] Updated weights for policy 0, policy_version 100254 (0.0023) [2024-06-12 22:56:50,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1642708992. Throughput: 0: 49172.7. 
Samples: 1171476280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:56:50,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:56:51,395][71000] Updated weights for policy 0, policy_version 100264 (0.0025) [2024-06-12 22:56:54,377][71000] Updated weights for policy 0, policy_version 100274 (0.0029) [2024-06-12 22:56:55,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48606.0, 300 sec: 49263.1). Total num frames: 1642921984. Throughput: 0: 49240.1. Samples: 1171772020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:56:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:56:57,925][71000] Updated weights for policy 0, policy_version 100284 (0.0031) [2024-06-12 22:57:00,874][71000] Updated weights for policy 0, policy_version 100294 (0.0021) [2024-06-12 22:57:00,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 1643216896. Throughput: 0: 49327.6. Samples: 1172074220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 22:57:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 22:57:04,559][71000] Updated weights for policy 0, policy_version 100304 (0.0019) [2024-06-12 22:57:05,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1643462656. Throughput: 0: 49240.8. Samples: 1172216060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 22:57:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:57:07,446][71000] Updated weights for policy 0, policy_version 100314 (0.0031) [2024-06-12 22:57:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 1643692032. Throughput: 0: 49366.3. Samples: 1172516120. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 22:57:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 22:57:10,963][71000] Updated weights for policy 0, policy_version 100324 (0.0031) [2024-06-12 22:57:14,311][71000] Updated weights for policy 0, policy_version 100334 (0.0033) [2024-06-12 22:57:15,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 1643921408. Throughput: 0: 49129.1. Samples: 1172811820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 22:57:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:57:17,646][71000] Updated weights for policy 0, policy_version 100344 (0.0032) [2024-06-12 22:57:20,566][71000] Updated weights for policy 0, policy_version 100354 (0.0027) [2024-06-12 22:57:20,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 1644216320. Throughput: 0: 49352.8. Samples: 1172967460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 22:57:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 22:57:24,563][71000] Updated weights for policy 0, policy_version 100364 (0.0029) [2024-06-12 22:57:25,939][70768] Fps is (10 sec: 52429.0, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1644445696. Throughput: 0: 49469.8. Samples: 1173269000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 22:57:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:57:27,322][71000] Updated weights for policy 0, policy_version 100374 (0.0028) [2024-06-12 22:57:30,895][71000] Updated weights for policy 0, policy_version 100384 (0.0024) [2024-06-12 22:57:30,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1644691456. Throughput: 0: 49605.7. Samples: 1173565920. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 22:57:30,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:57:33,795][71000] Updated weights for policy 0, policy_version 100394 (0.0041) [2024-06-12 22:57:35,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1644937216. Throughput: 0: 49765.1. Samples: 1173715700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 22:57:35,940][70768] Avg episode reward: [(0, '0.247')] [2024-06-12 22:57:37,183][71000] Updated weights for policy 0, policy_version 100404 (0.0026) [2024-06-12 22:57:40,430][71000] Updated weights for policy 0, policy_version 100414 (0.0026) [2024-06-12 22:57:40,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 1645215744. Throughput: 0: 49856.8. Samples: 1174015580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 22:57:40,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 22:57:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000100416_1645215744.pth... [2024-06-12 22:57:40,985][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000099691_1633337344.pth [2024-06-12 22:57:43,873][71000] Updated weights for policy 0, policy_version 100424 (0.0027) [2024-06-12 22:57:45,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49698.0, 300 sec: 49540.8). Total num frames: 1645445120. Throughput: 0: 49878.6. Samples: 1174318760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 22:57:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:57:46,867][71000] Updated weights for policy 0, policy_version 100434 (0.0029) [2024-06-12 22:57:50,567][71000] Updated weights for policy 0, policy_version 100444 (0.0034) [2024-06-12 22:57:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1645690880. Throughput: 0: 49988.4. Samples: 1174465540. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 22:57:50,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 22:57:53,362][71000] Updated weights for policy 0, policy_version 100454 (0.0033) [2024-06-12 22:57:55,940][70768] Fps is (10 sec: 49152.7, 60 sec: 50244.2, 300 sec: 49485.2). Total num frames: 1645936640. Throughput: 0: 49893.0. Samples: 1174761300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 22:57:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 22:57:57,154][71000] Updated weights for policy 0, policy_version 100464 (0.0024) [2024-06-12 22:57:57,357][70980] Signal inference workers to stop experience collection... (17300 times) [2024-06-12 22:57:57,396][71000] InferenceWorker_p0-w0: stopping experience collection (17300 times) [2024-06-12 22:57:57,405][70980] Signal inference workers to resume experience collection... (17300 times) [2024-06-12 22:57:57,412][71000] InferenceWorker_p0-w0: resuming experience collection (17300 times) [2024-06-12 22:58:00,042][71000] Updated weights for policy 0, policy_version 100474 (0.0028) [2024-06-12 22:58:00,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1646198784. Throughput: 0: 50008.1. Samples: 1175062180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 22:58:00,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 22:58:03,492][71000] Updated weights for policy 0, policy_version 100484 (0.0026) [2024-06-12 22:58:05,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1646460928. Throughput: 0: 50025.3. Samples: 1175218600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 22:58:05,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 22:58:06,651][71000] Updated weights for policy 0, policy_version 100494 (0.0029) [2024-06-12 22:58:09,932][71000] Updated weights for policy 0, policy_version 100504 (0.0030) [2024-06-12 22:58:10,940][70768] Fps is (10 sec: 47512.4, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1646673920. Throughput: 0: 49970.0. Samples: 1175517660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 22:58:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:58:12,955][71000] Updated weights for policy 0, policy_version 100514 (0.0029) [2024-06-12 22:58:15,939][70768] Fps is (10 sec: 45875.5, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 1646919680. Throughput: 0: 49901.9. Samples: 1175811500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 22:58:15,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 22:58:16,696][71000] Updated weights for policy 0, policy_version 100524 (0.0033) [2024-06-12 22:58:19,471][71000] Updated weights for policy 0, policy_version 100534 (0.0024) [2024-06-12 22:58:20,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1647198208. Throughput: 0: 50043.0. Samples: 1175967640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 22:58:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 22:58:23,515][71000] Updated weights for policy 0, policy_version 100544 (0.0024) [2024-06-12 22:58:25,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49971.1, 300 sec: 49596.3). Total num frames: 1647443968. Throughput: 0: 49811.0. Samples: 1176257080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 22:58:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:58:26,589][71000] Updated weights for policy 0, policy_version 100554 (0.0025) [2024-06-12 22:58:30,081][71000] Updated weights for policy 0, policy_version 100564 (0.0025) [2024-06-12 22:58:30,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1647656960. Throughput: 0: 49669.9. Samples: 1176553900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 22:58:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:58:33,414][71000] Updated weights for policy 0, policy_version 100574 (0.0026) [2024-06-12 22:58:35,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1647919104. Throughput: 0: 49550.2. Samples: 1176695300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 22:58:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:58:36,742][71000] Updated weights for policy 0, policy_version 100584 (0.0025) [2024-06-12 22:58:39,720][71000] Updated weights for policy 0, policy_version 100594 (0.0031) [2024-06-12 22:58:40,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1648181248. Throughput: 0: 49488.9. Samples: 1176988300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 22:58:40,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 22:58:43,591][71000] Updated weights for policy 0, policy_version 100604 (0.0024) [2024-06-12 22:58:45,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1648427008. Throughput: 0: 49381.2. Samples: 1177284340. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 22:58:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:58:46,452][71000] Updated weights for policy 0, policy_version 100614 (0.0030) [2024-06-12 22:58:50,089][71000] Updated weights for policy 0, policy_version 100624 (0.0028) [2024-06-12 22:58:50,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1648640000. Throughput: 0: 49162.7. Samples: 1177430920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 22:58:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:58:53,361][71000] Updated weights for policy 0, policy_version 100634 (0.0026) [2024-06-12 22:58:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1648918528. Throughput: 0: 49152.6. Samples: 1177729520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 22:58:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 22:58:56,765][71000] Updated weights for policy 0, policy_version 100644 (0.0036) [2024-06-12 22:58:59,790][71000] Updated weights for policy 0, policy_version 100654 (0.0031) [2024-06-12 22:59:00,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49424.8, 300 sec: 49485.2). Total num frames: 1649164288. Throughput: 0: 49273.1. Samples: 1178028800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-12 22:59:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 22:59:03,327][71000] Updated weights for policy 0, policy_version 100664 (0.0028) [2024-06-12 22:59:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1649410048. Throughput: 0: 49246.7. Samples: 1178183740. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 22:59:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 22:59:06,351][71000] Updated weights for policy 0, policy_version 100674 (0.0028) [2024-06-12 22:59:09,711][71000] Updated weights for policy 0, policy_version 100684 (0.0028) [2024-06-12 22:59:10,939][70768] Fps is (10 sec: 45876.6, 60 sec: 49152.2, 300 sec: 49374.2). Total num frames: 1649623040. Throughput: 0: 49366.9. Samples: 1178478580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 22:59:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 22:59:11,223][70980] Signal inference workers to stop experience collection... (17350 times) [2024-06-12 22:59:11,272][71000] InferenceWorker_p0-w0: stopping experience collection (17350 times) [2024-06-12 22:59:11,285][70980] Signal inference workers to resume experience collection... (17350 times) [2024-06-12 22:59:11,287][71000] InferenceWorker_p0-w0: resuming experience collection (17350 times) [2024-06-12 22:59:12,838][71000] Updated weights for policy 0, policy_version 100694 (0.0027) [2024-06-12 22:59:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1649901568. Throughput: 0: 49451.0. Samples: 1178779200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 22:59:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:59:16,439][71000] Updated weights for policy 0, policy_version 100704 (0.0032) [2024-06-12 22:59:19,673][71000] Updated weights for policy 0, policy_version 100714 (0.0028) [2024-06-12 22:59:20,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49152.0, 300 sec: 49540.7). Total num frames: 1650147328. Throughput: 0: 49606.7. Samples: 1178927600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 22:59:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:59:23,266][71000] Updated weights for policy 0, policy_version 100724 (0.0034) [2024-06-12 22:59:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49540.7). Total num frames: 1650393088. Throughput: 0: 49763.8. Samples: 1179227680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 22:59:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 22:59:26,112][71000] Updated weights for policy 0, policy_version 100734 (0.0034) [2024-06-12 22:59:29,937][71000] Updated weights for policy 0, policy_version 100744 (0.0033) [2024-06-12 22:59:30,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1650622464. Throughput: 0: 49745.9. Samples: 1179522900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 22:59:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:59:32,860][71000] Updated weights for policy 0, policy_version 100754 (0.0026) [2024-06-12 22:59:35,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1650900992. Throughput: 0: 49514.2. Samples: 1179659060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 22:59:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 22:59:36,367][71000] Updated weights for policy 0, policy_version 100764 (0.0029) [2024-06-12 22:59:39,362][71000] Updated weights for policy 0, policy_version 100774 (0.0025) [2024-06-12 22:59:40,940][70768] Fps is (10 sec: 54065.8, 60 sec: 49698.0, 300 sec: 49540.8). Total num frames: 1651163136. Throughput: 0: 49706.5. Samples: 1179966320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 22:59:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 22:59:40,954][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000100779_1651163136.pth... 
[2024-06-12 22:59:41,006][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000100052_1639251968.pth [2024-06-12 22:59:43,280][71000] Updated weights for policy 0, policy_version 100784 (0.0031) [2024-06-12 22:59:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49540.7). Total num frames: 1651392512. Throughput: 0: 49461.4. Samples: 1180254560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 22:59:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 22:59:46,134][71000] Updated weights for policy 0, policy_version 100794 (0.0032) [2024-06-12 22:59:49,889][71000] Updated weights for policy 0, policy_version 100804 (0.0030) [2024-06-12 22:59:50,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 1651621888. Throughput: 0: 49062.5. Samples: 1180391560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 22:59:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 22:59:52,667][71000] Updated weights for policy 0, policy_version 100814 (0.0025) [2024-06-12 22:59:55,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1651867648. Throughput: 0: 49137.2. Samples: 1180689760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 22:59:55,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 22:59:56,363][71000] Updated weights for policy 0, policy_version 100824 (0.0026) [2024-06-12 22:59:59,247][71000] Updated weights for policy 0, policy_version 100834 (0.0024) [2024-06-12 23:00:00,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1652146176. Throughput: 0: 49446.3. Samples: 1181004280. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-12 23:00:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 23:00:02,745][71000] Updated weights for policy 0, policy_version 100844 (0.0023) [2024-06-12 23:00:05,653][71000] Updated weights for policy 0, policy_version 100854 (0.0024) [2024-06-12 23:00:05,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49698.2, 300 sec: 49651.8). Total num frames: 1652391936. Throughput: 0: 49588.5. Samples: 1181159080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 23:00:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:00:09,593][71000] Updated weights for policy 0, policy_version 100864 (0.0031) [2024-06-12 23:00:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49971.0, 300 sec: 49429.7). Total num frames: 1652621312. Throughput: 0: 49523.6. Samples: 1181456240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 23:00:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:00:12,395][71000] Updated weights for policy 0, policy_version 100874 (0.0032) [2024-06-12 23:00:15,802][70980] Signal inference workers to stop experience collection... (17400 times) [2024-06-12 23:00:15,805][70980] Signal inference workers to resume experience collection... (17400 times) [2024-06-12 23:00:15,816][71000] InferenceWorker_p0-w0: stopping experience collection (17400 times) [2024-06-12 23:00:15,816][71000] InferenceWorker_p0-w0: resuming experience collection (17400 times) [2024-06-12 23:00:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1652883456. Throughput: 0: 49431.4. Samples: 1181747320. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 23:00:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:00:15,943][71000] Updated weights for policy 0, policy_version 100884 (0.0033) [2024-06-12 23:00:19,221][71000] Updated weights for policy 0, policy_version 100894 (0.0024) [2024-06-12 23:00:20,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1653145600. Throughput: 0: 49890.6. Samples: 1181904140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 23:00:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:00:22,531][71000] Updated weights for policy 0, policy_version 100904 (0.0030) [2024-06-12 23:00:25,763][71000] Updated weights for policy 0, policy_version 100914 (0.0030) [2024-06-12 23:00:25,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.3, 300 sec: 49596.3). Total num frames: 1653374976. Throughput: 0: 49813.6. Samples: 1182207920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 23:00:25,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:00:28,828][71000] Updated weights for policy 0, policy_version 100924 (0.0028) [2024-06-12 23:00:30,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49971.1, 300 sec: 49540.8). Total num frames: 1653620736. Throughput: 0: 50005.0. Samples: 1182504780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 23:00:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:00:32,473][71000] Updated weights for policy 0, policy_version 100934 (0.0020) [2024-06-12 23:00:35,810][71000] Updated weights for policy 0, policy_version 100944 (0.0029) [2024-06-12 23:00:35,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1653866496. Throughput: 0: 50097.9. Samples: 1182645960. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 23:00:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:00:39,056][71000] Updated weights for policy 0, policy_version 100954 (0.0032) [2024-06-12 23:00:40,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1654128640. Throughput: 0: 50044.0. Samples: 1182941740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 23:00:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:00:42,345][71000] Updated weights for policy 0, policy_version 100964 (0.0034) [2024-06-12 23:00:45,687][71000] Updated weights for policy 0, policy_version 100974 (0.0025) [2024-06-12 23:00:45,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1654374400. Throughput: 0: 49689.7. Samples: 1183240320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 23:00:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:00:49,064][71000] Updated weights for policy 0, policy_version 100984 (0.0030) [2024-06-12 23:00:50,940][70768] Fps is (10 sec: 47510.2, 60 sec: 49697.7, 300 sec: 49485.1). Total num frames: 1654603776. Throughput: 0: 49543.2. Samples: 1183388560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 23:00:50,941][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:00:52,415][71000] Updated weights for policy 0, policy_version 100994 (0.0028) [2024-06-12 23:00:55,817][71000] Updated weights for policy 0, policy_version 101004 (0.0030) [2024-06-12 23:00:55,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 1654849536. Throughput: 0: 49273.9. Samples: 1183673560. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 23:00:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:00:58,850][71000] Updated weights for policy 0, policy_version 101014 (0.0029) [2024-06-12 23:01:00,940][70768] Fps is (10 sec: 50793.9, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1655111680. Throughput: 0: 49404.0. Samples: 1183970500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-12 23:01:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:01:02,381][71000] Updated weights for policy 0, policy_version 101024 (0.0027) [2024-06-12 23:01:05,605][71000] Updated weights for policy 0, policy_version 101034 (0.0028) [2024-06-12 23:01:05,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1655357440. Throughput: 0: 49228.4. Samples: 1184119420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 23:01:05,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 23:01:09,014][71000] Updated weights for policy 0, policy_version 101044 (0.0020) [2024-06-12 23:01:10,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1655586816. Throughput: 0: 49092.0. Samples: 1184417060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 23:01:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:01:12,013][71000] Updated weights for policy 0, policy_version 101054 (0.0031) [2024-06-12 23:01:15,498][71000] Updated weights for policy 0, policy_version 101064 (0.0027) [2024-06-12 23:01:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49424.9, 300 sec: 49596.3). Total num frames: 1655848960. Throughput: 0: 49212.3. Samples: 1184719340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 23:01:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 23:01:18,793][71000] Updated weights for policy 0, policy_version 101074 (0.0030) [2024-06-12 23:01:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 1656094720. Throughput: 0: 49529.3. Samples: 1184874780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 23:01:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:01:21,931][71000] Updated weights for policy 0, policy_version 101084 (0.0029) [2024-06-12 23:01:22,262][70980] Signal inference workers to stop experience collection... (17450 times) [2024-06-12 23:01:22,262][70980] Signal inference workers to resume experience collection... (17450 times) [2024-06-12 23:01:22,276][71000] InferenceWorker_p0-w0: stopping experience collection (17450 times) [2024-06-12 23:01:22,276][71000] InferenceWorker_p0-w0: resuming experience collection (17450 times) [2024-06-12 23:01:25,173][71000] Updated weights for policy 0, policy_version 101094 (0.0026) [2024-06-12 23:01:25,940][70768] Fps is (10 sec: 49153.1, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1656340480. Throughput: 0: 49586.7. Samples: 1185173140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 23:01:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 23:01:28,646][71000] Updated weights for policy 0, policy_version 101104 (0.0034) [2024-06-12 23:01:30,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1656586240. Throughput: 0: 49492.1. Samples: 1185467460. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 23:01:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:01:31,782][71000] Updated weights for policy 0, policy_version 101114 (0.0026) [2024-06-12 23:01:35,243][71000] Updated weights for policy 0, policy_version 101124 (0.0030) [2024-06-12 23:01:35,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1656848384. Throughput: 0: 49491.1. Samples: 1185615620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 23:01:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:01:38,523][71000] Updated weights for policy 0, policy_version 101134 (0.0021) [2024-06-12 23:01:40,940][70768] Fps is (10 sec: 49150.6, 60 sec: 49151.8, 300 sec: 49540.7). Total num frames: 1657077760. Throughput: 0: 49718.3. Samples: 1185910900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 23:01:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:01:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000101140_1657077760.pth... [2024-06-12 23:01:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000100416_1645215744.pth [2024-06-12 23:01:41,932][71000] Updated weights for policy 0, policy_version 101144 (0.0036) [2024-06-12 23:01:45,070][71000] Updated weights for policy 0, policy_version 101154 (0.0029) [2024-06-12 23:01:45,939][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1657339904. Throughput: 0: 49851.6. Samples: 1186213820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 23:01:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:01:48,195][71000] Updated weights for policy 0, policy_version 101164 (0.0031) [2024-06-12 23:01:50,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49698.6, 300 sec: 49707.4). Total num frames: 1657585664. Throughput: 0: 49737.3. Samples: 1186357600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 23:01:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:01:51,505][71000] Updated weights for policy 0, policy_version 101174 (0.0022) [2024-06-12 23:01:55,082][71000] Updated weights for policy 0, policy_version 101184 (0.0029) [2024-06-12 23:01:55,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1657831424. Throughput: 0: 49761.8. Samples: 1186656340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 23:01:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:01:58,159][71000] Updated weights for policy 0, policy_version 101194 (0.0026) [2024-06-12 23:02:00,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.0, 300 sec: 49596.3). Total num frames: 1658093568. Throughput: 0: 49708.9. Samples: 1186956240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-12 23:02:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:02:01,693][71000] Updated weights for policy 0, policy_version 101204 (0.0024) [2024-06-12 23:02:04,978][71000] Updated weights for policy 0, policy_version 101214 (0.0027) [2024-06-12 23:02:05,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1658322944. Throughput: 0: 49593.9. Samples: 1187106500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 23:02:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:02:08,277][71000] Updated weights for policy 0, policy_version 101224 (0.0040) [2024-06-12 23:02:10,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49971.2, 300 sec: 49707.4). Total num frames: 1658585088. Throughput: 0: 49514.2. Samples: 1187401280. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 23:02:10,943][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:02:11,326][71000] Updated weights for policy 0, policy_version 101234 (0.0030) [2024-06-12 23:02:14,935][71000] Updated weights for policy 0, policy_version 101244 (0.0029) [2024-06-12 23:02:15,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49698.2, 300 sec: 49540.7). Total num frames: 1658830848. Throughput: 0: 49475.4. Samples: 1187693860. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 23:02:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:02:17,943][71000] Updated weights for policy 0, policy_version 101254 (0.0027) [2024-06-12 23:02:20,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1659076608. Throughput: 0: 49466.0. Samples: 1187841600. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 23:02:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:02:21,485][71000] Updated weights for policy 0, policy_version 101264 (0.0026) [2024-06-12 23:02:24,478][71000] Updated weights for policy 0, policy_version 101274 (0.0021) [2024-06-12 23:02:25,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1659322368. Throughput: 0: 49658.2. Samples: 1188145500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 23:02:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:02:28,051][71000] Updated weights for policy 0, policy_version 101284 (0.0026) [2024-06-12 23:02:30,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1659568128. Throughput: 0: 49526.7. Samples: 1188442520. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 23:02:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:02:31,247][71000] Updated weights for policy 0, policy_version 101294 (0.0030) [2024-06-12 23:02:34,523][71000] Updated weights for policy 0, policy_version 101304 (0.0024) [2024-06-12 23:02:35,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1659797504. Throughput: 0: 49741.1. Samples: 1188595940. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 23:02:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:02:37,621][71000] Updated weights for policy 0, policy_version 101314 (0.0026) [2024-06-12 23:02:40,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49971.4, 300 sec: 49596.3). Total num frames: 1660076032. Throughput: 0: 49653.7. Samples: 1188890760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 23:02:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:02:41,207][71000] Updated weights for policy 0, policy_version 101324 (0.0031) [2024-06-12 23:02:42,832][70980] Signal inference workers to stop experience collection... (17500 times) [2024-06-12 23:02:42,832][70980] Signal inference workers to resume experience collection... (17500 times) [2024-06-12 23:02:42,869][71000] InferenceWorker_p0-w0: stopping experience collection (17500 times) [2024-06-12 23:02:42,870][71000] InferenceWorker_p0-w0: resuming experience collection (17500 times) [2024-06-12 23:02:44,409][71000] Updated weights for policy 0, policy_version 101334 (0.0034) [2024-06-12 23:02:45,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49424.9, 300 sec: 49540.8). Total num frames: 1660305408. Throughput: 0: 49563.1. Samples: 1189186580. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 23:02:45,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:02:47,838][71000] Updated weights for policy 0, policy_version 101344 (0.0038) [2024-06-12 23:02:50,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1660567552. Throughput: 0: 49526.5. Samples: 1189335200. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 23:02:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:02:51,026][71000] Updated weights for policy 0, policy_version 101354 (0.0031) [2024-06-12 23:02:54,434][71000] Updated weights for policy 0, policy_version 101364 (0.0026) [2024-06-12 23:02:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1660796928. Throughput: 0: 49507.4. Samples: 1189629120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 23:02:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:02:57,547][71000] Updated weights for policy 0, policy_version 101374 (0.0024) [2024-06-12 23:03:00,876][71000] Updated weights for policy 0, policy_version 101384 (0.0033) [2024-06-12 23:03:00,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1661075456. Throughput: 0: 49771.2. Samples: 1189933560. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-12 23:03:00,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 23:03:04,271][71000] Updated weights for policy 0, policy_version 101394 (0.0029) [2024-06-12 23:03:05,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1661304832. Throughput: 0: 49768.2. Samples: 1190081160. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:03:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:03:07,456][71000] Updated weights for policy 0, policy_version 101404 (0.0026) [2024-06-12 23:03:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1661550592. Throughput: 0: 49520.7. Samples: 1190373940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:03:10,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 23:03:10,953][71000] Updated weights for policy 0, policy_version 101414 (0.0031) [2024-06-12 23:03:14,279][71000] Updated weights for policy 0, policy_version 101424 (0.0030) [2024-06-12 23:03:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1661796352. Throughput: 0: 49662.6. Samples: 1190677340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:03:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:03:17,753][71000] Updated weights for policy 0, policy_version 101434 (0.0022) [2024-06-12 23:03:20,660][71000] Updated weights for policy 0, policy_version 101444 (0.0026) [2024-06-12 23:03:20,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1662058496. Throughput: 0: 49598.0. Samples: 1190827860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:03:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:03:24,054][71000] Updated weights for policy 0, policy_version 101454 (0.0023) [2024-06-12 23:03:25,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.1, 300 sec: 49651.9). Total num frames: 1662304256. Throughput: 0: 49542.3. Samples: 1191120160. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:03:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:03:27,402][71000] Updated weights for policy 0, policy_version 101464 (0.0034) [2024-06-12 23:03:30,924][71000] Updated weights for policy 0, policy_version 101474 (0.0029) [2024-06-12 23:03:30,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1662550016. Throughput: 0: 49601.4. Samples: 1191418640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:03:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:03:34,190][71000] Updated weights for policy 0, policy_version 101484 (0.0030) [2024-06-12 23:03:35,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49971.0, 300 sec: 49540.7). Total num frames: 1662795776. Throughput: 0: 49640.0. Samples: 1191569000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:03:35,944][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:03:37,313][71000] Updated weights for policy 0, policy_version 101494 (0.0033) [2024-06-12 23:03:40,725][71000] Updated weights for policy 0, policy_version 101504 (0.0028) [2024-06-12 23:03:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1663041536. Throughput: 0: 49606.7. Samples: 1191861420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:03:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:03:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000101504_1663041536.pth... [2024-06-12 23:03:40,993][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000100779_1651163136.pth [2024-06-12 23:03:44,298][71000] Updated weights for policy 0, policy_version 101514 (0.0027) [2024-06-12 23:03:45,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1663270912. Throughput: 0: 49334.2. Samples: 1192153600. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:03:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 23:03:47,599][71000] Updated weights for policy 0, policy_version 101524 (0.0024) [2024-06-12 23:03:50,791][71000] Updated weights for policy 0, policy_version 101534 (0.0024) [2024-06-12 23:03:50,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1663533056. Throughput: 0: 49196.5. Samples: 1192295000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:03:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:03:54,219][71000] Updated weights for policy 0, policy_version 101544 (0.0022) [2024-06-12 23:03:55,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1663778816. Throughput: 0: 49435.6. Samples: 1192598540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:03:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:03:57,546][71000] Updated weights for policy 0, policy_version 101554 (0.0022) [2024-06-12 23:04:00,423][71000] Updated weights for policy 0, policy_version 101564 (0.0035) [2024-06-12 23:04:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1664040960. Throughput: 0: 49268.0. Samples: 1192894400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:04:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:04:04,089][71000] Updated weights for policy 0, policy_version 101574 (0.0033) [2024-06-12 23:04:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49651.8). Total num frames: 1664270336. Throughput: 0: 49128.1. Samples: 1193038620. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 23:04:05,942][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:04:07,030][71000] Updated weights for policy 0, policy_version 101584 (0.0028) [2024-06-12 23:04:07,991][70980] Signal inference workers to stop experience collection... (17550 times) [2024-06-12 23:04:07,994][70980] Signal inference workers to resume experience collection... (17550 times) [2024-06-12 23:04:08,005][71000] InferenceWorker_p0-w0: stopping experience collection (17550 times) [2024-06-12 23:04:08,005][71000] InferenceWorker_p0-w0: resuming experience collection (17550 times) [2024-06-12 23:04:10,605][71000] Updated weights for policy 0, policy_version 101594 (0.0021) [2024-06-12 23:04:10,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1664516096. Throughput: 0: 49353.7. Samples: 1193341080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 23:04:10,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 23:04:13,546][71000] Updated weights for policy 0, policy_version 101604 (0.0032) [2024-06-12 23:04:15,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1664778240. Throughput: 0: 49511.7. Samples: 1193646660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 23:04:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:04:17,578][71000] Updated weights for policy 0, policy_version 101614 (0.0024) [2024-06-12 23:04:19,840][71000] Updated weights for policy 0, policy_version 101624 (0.0021) [2024-06-12 23:04:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1665024000. Throughput: 0: 49821.5. Samples: 1193810960. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 23:04:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:04:23,994][71000] Updated weights for policy 0, policy_version 101634 (0.0032) [2024-06-12 23:04:25,940][70768] Fps is (10 sec: 49150.7, 60 sec: 49424.9, 300 sec: 49651.8). Total num frames: 1665269760. Throughput: 0: 49749.7. Samples: 1194100160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 23:04:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:04:26,827][71000] Updated weights for policy 0, policy_version 101644 (0.0026) [2024-06-12 23:04:30,750][71000] Updated weights for policy 0, policy_version 101654 (0.0026) [2024-06-12 23:04:30,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 1665499136. Throughput: 0: 49614.6. Samples: 1194386260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 23:04:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:04:33,242][71000] Updated weights for policy 0, policy_version 101664 (0.0032) [2024-06-12 23:04:35,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1665777664. Throughput: 0: 49617.2. Samples: 1194527780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 23:04:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:04:37,222][71000] Updated weights for policy 0, policy_version 101674 (0.0027) [2024-06-12 23:04:39,871][71000] Updated weights for policy 0, policy_version 101684 (0.0027) [2024-06-12 23:04:40,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1666023424. Throughput: 0: 49775.1. Samples: 1194838420. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 23:04:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:04:44,119][71000] Updated weights for policy 0, policy_version 101694 (0.0040) [2024-06-12 23:04:45,939][70768] Fps is (10 sec: 47514.6, 60 sec: 49698.3, 300 sec: 49596.4). Total num frames: 1666252800. Throughput: 0: 49933.9. Samples: 1195141420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 23:04:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:04:46,434][71000] Updated weights for policy 0, policy_version 101704 (0.0027) [2024-06-12 23:04:50,470][71000] Updated weights for policy 0, policy_version 101714 (0.0031) [2024-06-12 23:04:50,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1666482176. Throughput: 0: 49833.9. Samples: 1195281140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 23:04:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:04:53,412][71000] Updated weights for policy 0, policy_version 101724 (0.0024) [2024-06-12 23:04:55,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1666760704. Throughput: 0: 49601.4. Samples: 1195573140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 23:04:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:04:57,283][71000] Updated weights for policy 0, policy_version 101734 (0.0033) [2024-06-12 23:05:00,114][71000] Updated weights for policy 0, policy_version 101744 (0.0028) [2024-06-12 23:05:00,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49425.0, 300 sec: 49540.7). Total num frames: 1667006464. Throughput: 0: 49248.2. Samples: 1195862840. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-12 23:05:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:05:04,018][71000] Updated weights for policy 0, policy_version 101754 (0.0023) [2024-06-12 23:05:05,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1667235840. Throughput: 0: 48931.9. Samples: 1196012900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 23:05:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:05:06,700][71000] Updated weights for policy 0, policy_version 101764 (0.0031) [2024-06-12 23:05:10,667][71000] Updated weights for policy 0, policy_version 101774 (0.0026) [2024-06-12 23:05:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1667481600. Throughput: 0: 49004.0. Samples: 1196305340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 23:05:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:05:12,000][70980] Signal inference workers to stop experience collection... (17600 times) [2024-06-12 23:05:12,000][70980] Signal inference workers to resume experience collection... (17600 times) [2024-06-12 23:05:12,015][71000] InferenceWorker_p0-w0: stopping experience collection (17600 times) [2024-06-12 23:05:12,042][71000] InferenceWorker_p0-w0: resuming experience collection (17600 times) [2024-06-12 23:05:13,487][71000] Updated weights for policy 0, policy_version 101784 (0.0030) [2024-06-12 23:05:15,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1667743744. Throughput: 0: 49263.3. Samples: 1196603100. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 23:05:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:05:17,157][71000] Updated weights for policy 0, policy_version 101794 (0.0040) [2024-06-12 23:05:20,199][71000] Updated weights for policy 0, policy_version 101804 (0.0035) [2024-06-12 23:05:20,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1668005888. Throughput: 0: 49557.4. Samples: 1196757860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 23:05:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:05:23,607][71000] Updated weights for policy 0, policy_version 101814 (0.0022) [2024-06-12 23:05:25,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 1668218880. Throughput: 0: 49179.0. Samples: 1197051480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 23:05:25,944][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 23:05:26,833][71000] Updated weights for policy 0, policy_version 101824 (0.0034) [2024-06-12 23:05:30,234][71000] Updated weights for policy 0, policy_version 101834 (0.0031) [2024-06-12 23:05:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1668464640. Throughput: 0: 49019.4. Samples: 1197347300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 23:05:30,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 23:05:33,239][71000] Updated weights for policy 0, policy_version 101844 (0.0026) [2024-06-12 23:05:35,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1668743168. Throughput: 0: 49209.4. Samples: 1197495560. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 23:05:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:05:37,026][71000] Updated weights for policy 0, policy_version 101854 (0.0030) [2024-06-12 23:05:39,977][71000] Updated weights for policy 0, policy_version 101864 (0.0037) [2024-06-12 23:05:40,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1668988928. Throughput: 0: 49423.6. Samples: 1197797200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 23:05:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:05:41,088][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000101868_1669005312.pth... [2024-06-12 23:05:41,132][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000101140_1657077760.pth [2024-06-12 23:05:43,352][71000] Updated weights for policy 0, policy_version 101874 (0.0030) [2024-06-12 23:05:45,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49151.9, 300 sec: 49485.4). Total num frames: 1669201920. Throughput: 0: 49669.0. Samples: 1198097940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 23:05:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 23:05:46,612][71000] Updated weights for policy 0, policy_version 101884 (0.0033) [2024-06-12 23:05:49,887][71000] Updated weights for policy 0, policy_version 101894 (0.0021) [2024-06-12 23:05:50,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49698.0, 300 sec: 49540.7). Total num frames: 1669464064. Throughput: 0: 49495.1. Samples: 1198240180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 23:05:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:05:53,476][71000] Updated weights for policy 0, policy_version 101904 (0.0033) [2024-06-12 23:05:55,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1669726208. Throughput: 0: 49466.8. Samples: 1198531340. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 23:05:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:05:56,726][71000] Updated weights for policy 0, policy_version 101914 (0.0027) [2024-06-12 23:05:59,895][71000] Updated weights for policy 0, policy_version 101924 (0.0025) [2024-06-12 23:06:00,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1669971968. Throughput: 0: 49479.4. Samples: 1198829680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-12 23:06:00,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 23:06:03,142][71000] Updated weights for policy 0, policy_version 101934 (0.0023) [2024-06-12 23:06:05,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1670201344. Throughput: 0: 49316.2. Samples: 1198977080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:06:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:06:06,633][71000] Updated weights for policy 0, policy_version 101944 (0.0023) [2024-06-12 23:06:09,693][71000] Updated weights for policy 0, policy_version 101954 (0.0024) [2024-06-12 23:06:10,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49698.3, 300 sec: 49540.8). Total num frames: 1670463488. Throughput: 0: 49543.3. Samples: 1199280920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:06:10,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:06:13,138][71000] Updated weights for policy 0, policy_version 101964 (0.0030) [2024-06-12 23:06:15,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1670725632. Throughput: 0: 49636.0. Samples: 1199580920. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:06:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:06:16,220][71000] Updated weights for policy 0, policy_version 101974 (0.0029) [2024-06-12 23:06:19,553][71000] Updated weights for policy 0, policy_version 101984 (0.0026) [2024-06-12 23:06:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1670955008. Throughput: 0: 49648.4. Samples: 1199729740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:06:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:06:23,019][71000] Updated weights for policy 0, policy_version 101994 (0.0024) [2024-06-12 23:06:25,941][70768] Fps is (10 sec: 47505.6, 60 sec: 49696.8, 300 sec: 49540.5). Total num frames: 1671200768. Throughput: 0: 49449.6. Samples: 1200022520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:06:25,942][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:06:26,434][71000] Updated weights for policy 0, policy_version 102004 (0.0029) [2024-06-12 23:06:28,876][70980] Signal inference workers to stop experience collection... (17650 times) [2024-06-12 23:06:28,881][70980] Signal inference workers to resume experience collection... (17650 times) [2024-06-12 23:06:28,901][71000] InferenceWorker_p0-w0: stopping experience collection (17650 times) [2024-06-12 23:06:28,902][71000] InferenceWorker_p0-w0: resuming experience collection (17650 times) [2024-06-12 23:06:29,610][71000] Updated weights for policy 0, policy_version 102014 (0.0025) [2024-06-12 23:06:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1671446528. Throughput: 0: 49247.9. Samples: 1200314100. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:06:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:06:33,059][71000] Updated weights for policy 0, policy_version 102024 (0.0022) [2024-06-12 23:06:35,940][70768] Fps is (10 sec: 50799.1, 60 sec: 49425.1, 300 sec: 49596.4). Total num frames: 1671708672. Throughput: 0: 49593.5. Samples: 1200471880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:06:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 23:06:36,010][71000] Updated weights for policy 0, policy_version 102034 (0.0027) [2024-06-12 23:06:39,433][71000] Updated weights for policy 0, policy_version 102044 (0.0026) [2024-06-12 23:06:40,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1671938048. Throughput: 0: 49740.9. Samples: 1200769680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:06:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:06:42,793][71000] Updated weights for policy 0, policy_version 102054 (0.0035) [2024-06-12 23:06:45,894][71000] Updated weights for policy 0, policy_version 102064 (0.0030) [2024-06-12 23:06:45,940][70768] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 49596.3). Total num frames: 1672216576. Throughput: 0: 49813.3. Samples: 1201071280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:06:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 23:06:49,484][71000] Updated weights for policy 0, policy_version 102074 (0.0036) [2024-06-12 23:06:50,939][70768] Fps is (10 sec: 52428.8, 60 sec: 49971.4, 300 sec: 49596.3). Total num frames: 1672462336. Throughput: 0: 49884.8. Samples: 1201221900. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:06:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:06:52,727][71000] Updated weights for policy 0, policy_version 102084 (0.0029) [2024-06-12 23:06:55,940][70768] Fps is (10 sec: 47512.5, 60 sec: 49424.8, 300 sec: 49485.2). Total num frames: 1672691712. Throughput: 0: 49583.6. Samples: 1201512200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:06:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:06:56,095][71000] Updated weights for policy 0, policy_version 102094 (0.0027) [2024-06-12 23:06:59,494][71000] Updated weights for policy 0, policy_version 102104 (0.0022) [2024-06-12 23:07:00,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1672921088. Throughput: 0: 49547.6. Samples: 1201810560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:07:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 23:07:02,673][71000] Updated weights for policy 0, policy_version 102114 (0.0028) [2024-06-12 23:07:05,939][70768] Fps is (10 sec: 49153.8, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1673183232. Throughput: 0: 49362.7. Samples: 1201951060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:07:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:07:05,959][71000] Updated weights for policy 0, policy_version 102124 (0.0033) [2024-06-12 23:07:09,264][71000] Updated weights for policy 0, policy_version 102134 (0.0029) [2024-06-12 23:07:10,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1673428992. Throughput: 0: 49444.0. Samples: 1202247420. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-12 23:07:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:07:12,490][71000] Updated weights for policy 0, policy_version 102144 (0.0022) [2024-06-12 23:07:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 1673674752. Throughput: 0: 49870.4. Samples: 1202558260. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-12 23:07:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:07:15,952][71000] Updated weights for policy 0, policy_version 102154 (0.0025) [2024-06-12 23:07:19,177][71000] Updated weights for policy 0, policy_version 102164 (0.0025) [2024-06-12 23:07:20,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1673904128. Throughput: 0: 49477.3. Samples: 1202698360. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-12 23:07:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:07:22,375][71000] Updated weights for policy 0, policy_version 102174 (0.0028) [2024-06-12 23:07:25,871][71000] Updated weights for policy 0, policy_version 102184 (0.0037) [2024-06-12 23:07:25,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49699.6, 300 sec: 49540.8). Total num frames: 1674182656. Throughput: 0: 49552.4. Samples: 1202999540. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-12 23:07:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:07:28,955][71000] Updated weights for policy 0, policy_version 102194 (0.0031) [2024-06-12 23:07:30,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1674428416. Throughput: 0: 49553.4. Samples: 1203301180. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-12 23:07:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 23:07:32,339][71000] Updated weights for policy 0, policy_version 102204 (0.0035) [2024-06-12 23:07:35,861][71000] Updated weights for policy 0, policy_version 102214 (0.0025) [2024-06-12 23:07:35,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1674674176. Throughput: 0: 49442.7. Samples: 1203446820. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-12 23:07:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:07:38,750][71000] Updated weights for policy 0, policy_version 102224 (0.0030) [2024-06-12 23:07:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1674919936. Throughput: 0: 49619.0. Samples: 1203745040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-12 23:07:40,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 23:07:40,956][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000102229_1674919936.pth... [2024-06-12 23:07:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000101504_1663041536.pth [2024-06-12 23:07:42,282][71000] Updated weights for policy 0, policy_version 102234 (0.0033) [2024-06-12 23:07:45,352][71000] Updated weights for policy 0, policy_version 102244 (0.0030) [2024-06-12 23:07:45,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1675165696. Throughput: 0: 49519.5. Samples: 1204038940. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-12 23:07:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:07:48,908][71000] Updated weights for policy 0, policy_version 102254 (0.0033) [2024-06-12 23:07:50,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1675427840. Throughput: 0: 49850.2. Samples: 1204194320. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-12 23:07:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:07:52,210][71000] Updated weights for policy 0, policy_version 102264 (0.0025) [2024-06-12 23:07:55,680][71000] Updated weights for policy 0, policy_version 102274 (0.0034) [2024-06-12 23:07:55,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.4, 300 sec: 49485.2). Total num frames: 1675673600. Throughput: 0: 49930.3. Samples: 1204494280. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-12 23:07:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:07:56,467][70980] Signal inference workers to stop experience collection... (17700 times) [2024-06-12 23:07:56,500][71000] InferenceWorker_p0-w0: stopping experience collection (17700 times) [2024-06-12 23:07:56,516][70980] Signal inference workers to resume experience collection... (17700 times) [2024-06-12 23:07:56,517][71000] InferenceWorker_p0-w0: resuming experience collection (17700 times) [2024-06-12 23:07:58,708][71000] Updated weights for policy 0, policy_version 102284 (0.0030) [2024-06-12 23:08:00,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1675902976. Throughput: 0: 49609.8. Samples: 1204790700. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-12 23:08:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 23:08:02,203][71000] Updated weights for policy 0, policy_version 102294 (0.0028) [2024-06-12 23:08:05,110][71000] Updated weights for policy 0, policy_version 102304 (0.0019) [2024-06-12 23:08:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1676165120. Throughput: 0: 49730.2. Samples: 1204936220. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-12 23:08:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:08:08,876][71000] Updated weights for policy 0, policy_version 102314 (0.0033) [2024-06-12 23:08:10,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1676427264. Throughput: 0: 49724.8. Samples: 1205237160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 23:08:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:08:11,949][71000] Updated weights for policy 0, policy_version 102324 (0.0029) [2024-06-12 23:08:15,296][71000] Updated weights for policy 0, policy_version 102334 (0.0035) [2024-06-12 23:08:15,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1676640256. Throughput: 0: 49545.8. Samples: 1205530740. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 23:08:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:08:18,642][71000] Updated weights for policy 0, policy_version 102344 (0.0035) [2024-06-12 23:08:20,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1676886016. Throughput: 0: 49509.6. Samples: 1205674760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 23:08:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:08:22,619][71000] Updated weights for policy 0, policy_version 102354 (0.0029) [2024-06-12 23:08:25,219][71000] Updated weights for policy 0, policy_version 102364 (0.0028) [2024-06-12 23:08:25,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1677164544. Throughput: 0: 49264.5. Samples: 1205961940. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 23:08:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:08:28,883][71000] Updated weights for policy 0, policy_version 102374 (0.0027) [2024-06-12 23:08:30,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1677410304. Throughput: 0: 49510.3. Samples: 1206266900. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 23:08:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:08:31,839][71000] Updated weights for policy 0, policy_version 102384 (0.0025) [2024-06-12 23:08:35,611][71000] Updated weights for policy 0, policy_version 102394 (0.0030) [2024-06-12 23:08:35,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1677639680. Throughput: 0: 49422.1. Samples: 1206418320. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 23:08:35,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:08:38,582][71000] Updated weights for policy 0, policy_version 102404 (0.0029) [2024-06-12 23:08:40,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 1677869056. Throughput: 0: 49187.9. Samples: 1206707740. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 23:08:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:08:42,279][71000] Updated weights for policy 0, policy_version 102414 (0.0024) [2024-06-12 23:08:45,198][71000] Updated weights for policy 0, policy_version 102424 (0.0034) [2024-06-12 23:08:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1678131200. Throughput: 0: 48927.5. Samples: 1206992440. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 23:08:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:08:49,426][71000] Updated weights for policy 0, policy_version 102434 (0.0037) [2024-06-12 23:08:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 1678376960. Throughput: 0: 49108.3. Samples: 1207146100. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 23:08:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:08:51,824][71000] Updated weights for policy 0, policy_version 102444 (0.0033) [2024-06-12 23:08:55,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.8, 300 sec: 49318.6). Total num frames: 1678589952. Throughput: 0: 48969.7. Samples: 1207440800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 23:08:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:08:56,137][71000] Updated weights for policy 0, policy_version 102454 (0.0036) [2024-06-12 23:08:58,629][71000] Updated weights for policy 0, policy_version 102464 (0.0027) [2024-06-12 23:09:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49151.8, 300 sec: 49429.7). Total num frames: 1678852096. Throughput: 0: 48800.7. Samples: 1207726780. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 23:09:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:09:02,598][71000] Updated weights for policy 0, policy_version 102474 (0.0027) [2024-06-12 23:09:04,969][70980] Signal inference workers to stop experience collection... (17750 times) [2024-06-12 23:09:04,978][70980] Signal inference workers to resume experience collection... 
(17750 times) [2024-06-12 23:09:05,011][71000] InferenceWorker_p0-w0: stopping experience collection (17750 times) [2024-06-12 23:09:05,011][71000] InferenceWorker_p0-w0: resuming experience collection (17750 times) [2024-06-12 23:09:05,108][71000] Updated weights for policy 0, policy_version 102484 (0.0033) [2024-06-12 23:09:05,939][70768] Fps is (10 sec: 52429.6, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 1679114240. Throughput: 0: 49034.0. Samples: 1207881280. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-12 23:09:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 23:09:08,986][71000] Updated weights for policy 0, policy_version 102494 (0.0031) [2024-06-12 23:09:10,939][70768] Fps is (10 sec: 50791.3, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 1679360000. Throughput: 0: 49410.2. Samples: 1208185400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 23:09:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:09:11,858][71000] Updated weights for policy 0, policy_version 102504 (0.0034) [2024-06-12 23:09:15,901][71000] Updated weights for policy 0, policy_version 102514 (0.0028) [2024-06-12 23:09:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 1679589376. Throughput: 0: 49401.3. Samples: 1208489960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 23:09:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:09:18,473][71000] Updated weights for policy 0, policy_version 102524 (0.0038) [2024-06-12 23:09:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1679851520. Throughput: 0: 49105.3. Samples: 1208628060. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 23:09:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:09:22,531][71000] Updated weights for policy 0, policy_version 102534 (0.0037) [2024-06-12 23:09:25,031][71000] Updated weights for policy 0, policy_version 102544 (0.0032) [2024-06-12 23:09:25,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49151.9, 300 sec: 49540.8). Total num frames: 1680113664. Throughput: 0: 49257.0. Samples: 1208924300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 23:09:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:09:28,990][71000] Updated weights for policy 0, policy_version 102554 (0.0035) [2024-06-12 23:09:30,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1680359424. Throughput: 0: 49656.1. Samples: 1209226960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 23:09:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:09:31,550][71000] Updated weights for policy 0, policy_version 102564 (0.0024) [2024-06-12 23:09:35,339][71000] Updated weights for policy 0, policy_version 102574 (0.0030) [2024-06-12 23:09:35,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1680588800. Throughput: 0: 49470.8. Samples: 1209372280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 23:09:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:09:38,149][71000] Updated weights for policy 0, policy_version 102584 (0.0023) [2024-06-12 23:09:40,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1680834560. Throughput: 0: 49636.1. Samples: 1209674420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 23:09:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:09:40,946][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000102590_1680834560.pth... 
[2024-06-12 23:09:40,988][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000101868_1669005312.pth [2024-06-12 23:09:42,251][71000] Updated weights for policy 0, policy_version 102594 (0.0036) [2024-06-12 23:09:44,821][71000] Updated weights for policy 0, policy_version 102604 (0.0026) [2024-06-12 23:09:45,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1681113088. Throughput: 0: 49712.5. Samples: 1209963840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 23:09:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:09:48,860][71000] Updated weights for policy 0, policy_version 102614 (0.0038) [2024-06-12 23:09:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1681342464. Throughput: 0: 49710.6. Samples: 1210118260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 23:09:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:09:51,312][71000] Updated weights for policy 0, policy_version 102624 (0.0030) [2024-06-12 23:09:55,245][71000] Updated weights for policy 0, policy_version 102634 (0.0028) [2024-06-12 23:09:55,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 1681571840. Throughput: 0: 49564.9. Samples: 1210415820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 23:09:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:09:58,225][71000] Updated weights for policy 0, policy_version 102644 (0.0028) [2024-06-12 23:10:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1681833984. Throughput: 0: 49213.7. Samples: 1210704580. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 23:10:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:10:01,892][71000] Updated weights for policy 0, policy_version 102654 (0.0022) [2024-06-12 23:10:04,542][71000] Updated weights for policy 0, policy_version 102664 (0.0027) [2024-06-12 23:10:05,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49698.0, 300 sec: 49540.8). Total num frames: 1682096128. Throughput: 0: 49685.2. Samples: 1210863900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-12 23:10:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:10:08,330][71000] Updated weights for policy 0, policy_version 102674 (0.0038) [2024-06-12 23:10:10,939][70768] Fps is (10 sec: 52429.5, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1682358272. Throughput: 0: 49781.8. Samples: 1211164480. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-12 23:10:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:10:11,122][71000] Updated weights for policy 0, policy_version 102684 (0.0027) [2024-06-12 23:10:15,104][71000] Updated weights for policy 0, policy_version 102694 (0.0025) [2024-06-12 23:10:15,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1682554880. Throughput: 0: 49809.7. Samples: 1211468400. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-12 23:10:15,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 23:10:15,959][70980] Signal inference workers to stop experience collection... (17800 times) [2024-06-12 23:10:16,005][71000] InferenceWorker_p0-w0: stopping experience collection (17800 times) [2024-06-12 23:10:16,012][70980] Signal inference workers to resume experience collection... 
(17800 times) [2024-06-12 23:10:16,017][71000] InferenceWorker_p0-w0: resuming experience collection (17800 times) [2024-06-12 23:10:17,528][71000] Updated weights for policy 0, policy_version 102704 (0.0025) [2024-06-12 23:10:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1682833408. Throughput: 0: 49798.6. Samples: 1211613220. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-12 23:10:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:10:21,379][71000] Updated weights for policy 0, policy_version 102714 (0.0026) [2024-06-12 23:10:24,357][71000] Updated weights for policy 0, policy_version 102724 (0.0024) [2024-06-12 23:10:25,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1683095552. Throughput: 0: 49761.3. Samples: 1211913680. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-12 23:10:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:10:27,698][71000] Updated weights for policy 0, policy_version 102734 (0.0027) [2024-06-12 23:10:30,766][71000] Updated weights for policy 0, policy_version 102744 (0.0030) [2024-06-12 23:10:30,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49971.1, 300 sec: 49540.8). Total num frames: 1683357696. Throughput: 0: 50024.6. Samples: 1212214940. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-12 23:10:30,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 23:10:34,137][71000] Updated weights for policy 0, policy_version 102754 (0.0031) [2024-06-12 23:10:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1683587072. Throughput: 0: 50005.7. Samples: 1212368520. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-12 23:10:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:10:37,407][71000] Updated weights for policy 0, policy_version 102764 (0.0026) [2024-06-12 23:10:40,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1683832832. Throughput: 0: 49740.0. Samples: 1212654120. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-12 23:10:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:10:41,052][71000] Updated weights for policy 0, policy_version 102774 (0.0029) [2024-06-12 23:10:44,289][71000] Updated weights for policy 0, policy_version 102784 (0.0026) [2024-06-12 23:10:45,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1684094976. Throughput: 0: 49920.0. Samples: 1212950980. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-12 23:10:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:10:47,444][71000] Updated weights for policy 0, policy_version 102794 (0.0029) [2024-06-12 23:10:50,774][71000] Updated weights for policy 0, policy_version 102804 (0.0023) [2024-06-12 23:10:50,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49971.1, 300 sec: 49540.8). Total num frames: 1684340736. Throughput: 0: 49830.7. Samples: 1213106280. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-12 23:10:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:10:54,202][71000] Updated weights for policy 0, policy_version 102814 (0.0019) [2024-06-12 23:10:55,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 1684570112. Throughput: 0: 49553.1. Samples: 1213394380. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-12 23:10:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:10:57,521][71000] Updated weights for policy 0, policy_version 102824 (0.0032) [2024-06-12 23:11:00,768][71000] Updated weights for policy 0, policy_version 102834 (0.0024) [2024-06-12 23:11:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49971.1, 300 sec: 49596.3). Total num frames: 1684832256. Throughput: 0: 49481.1. Samples: 1213695060. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-12 23:11:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:11:04,249][71000] Updated weights for policy 0, policy_version 102844 (0.0035) [2024-06-12 23:11:05,944][70768] Fps is (10 sec: 50769.4, 60 sec: 49694.7, 300 sec: 49540.0). Total num frames: 1685078016. Throughput: 0: 49634.4. Samples: 1213846980. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-12 23:11:05,944][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:11:07,735][71000] Updated weights for policy 0, policy_version 102854 (0.0024) [2024-06-12 23:11:10,911][71000] Updated weights for policy 0, policy_version 102864 (0.0024) [2024-06-12 23:11:10,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1685323776. Throughput: 0: 49412.0. Samples: 1214137220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 23:11:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:11:14,299][71000] Updated weights for policy 0, policy_version 102874 (0.0033) [2024-06-12 23:11:15,940][70768] Fps is (10 sec: 47533.3, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 1685553152. Throughput: 0: 49251.0. Samples: 1214431240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 23:11:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:11:17,687][71000] Updated weights for policy 0, policy_version 102884 (0.0024) [2024-06-12 23:11:20,377][70980] Signal inference workers to stop experience collection... (17850 times) [2024-06-12 23:11:20,378][70980] Signal inference workers to resume experience collection... (17850 times) [2024-06-12 23:11:20,417][71000] InferenceWorker_p0-w0: stopping experience collection (17850 times) [2024-06-12 23:11:20,417][71000] InferenceWorker_p0-w0: resuming experience collection (17850 times) [2024-06-12 23:11:20,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.1, 300 sec: 49485.5). Total num frames: 1685798912. Throughput: 0: 49137.8. Samples: 1214579720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 23:11:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:11:21,027][71000] Updated weights for policy 0, policy_version 102894 (0.0031) [2024-06-12 23:11:24,213][71000] Updated weights for policy 0, policy_version 102904 (0.0031) [2024-06-12 23:11:25,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1686061056. Throughput: 0: 49374.7. Samples: 1214875980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 23:11:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:11:27,776][71000] Updated weights for policy 0, policy_version 102914 (0.0033) [2024-06-12 23:11:30,843][71000] Updated weights for policy 0, policy_version 102924 (0.0043) [2024-06-12 23:11:30,940][70768] Fps is (10 sec: 50788.9, 60 sec: 49151.8, 300 sec: 49485.2). Total num frames: 1686306816. Throughput: 0: 49354.5. Samples: 1215171940. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 23:11:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:11:34,631][71000] Updated weights for policy 0, policy_version 102934 (0.0026) [2024-06-12 23:11:35,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.0, 300 sec: 49540.7). Total num frames: 1686552576. Throughput: 0: 49145.3. Samples: 1215317820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 23:11:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:11:37,462][71000] Updated weights for policy 0, policy_version 102944 (0.0025) [2024-06-12 23:11:40,940][70768] Fps is (10 sec: 47514.6, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1686781952. Throughput: 0: 49314.3. Samples: 1215613520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 23:11:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:11:41,040][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000102954_1686798336.pth... [2024-06-12 23:11:41,051][71000] Updated weights for policy 0, policy_version 102954 (0.0031) [2024-06-12 23:11:41,091][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000102229_1674919936.pth [2024-06-12 23:11:44,303][71000] Updated weights for policy 0, policy_version 102964 (0.0029) [2024-06-12 23:11:45,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 49429.6). Total num frames: 1687044096. Throughput: 0: 49013.3. Samples: 1215900660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 23:11:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:11:47,785][71000] Updated weights for policy 0, policy_version 102974 (0.0033) [2024-06-12 23:11:50,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48605.8, 300 sec: 49374.2). Total num frames: 1687257088. Throughput: 0: 48976.0. Samples: 1216050700. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 23:11:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:11:51,260][71000] Updated weights for policy 0, policy_version 102984 (0.0033) [2024-06-12 23:11:54,577][71000] Updated weights for policy 0, policy_version 102994 (0.0038) [2024-06-12 23:11:55,940][70768] Fps is (10 sec: 45876.3, 60 sec: 48879.1, 300 sec: 49429.7). Total num frames: 1687502848. Throughput: 0: 48922.2. Samples: 1216338720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 23:11:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:11:57,887][71000] Updated weights for policy 0, policy_version 103004 (0.0034) [2024-06-12 23:12:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 49374.1). Total num frames: 1687748608. Throughput: 0: 48957.8. Samples: 1216634340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 23:12:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:12:01,103][71000] Updated weights for policy 0, policy_version 103014 (0.0031) [2024-06-12 23:12:04,671][71000] Updated weights for policy 0, policy_version 103024 (0.0044) [2024-06-12 23:12:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48882.4, 300 sec: 49429.7). Total num frames: 1688010752. Throughput: 0: 49043.9. Samples: 1216786700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-12 23:12:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:12:07,645][71000] Updated weights for policy 0, policy_version 103034 (0.0025) [2024-06-12 23:12:10,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48605.9, 300 sec: 49374.2). Total num frames: 1688240128. Throughput: 0: 48922.2. Samples: 1217077480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:12:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:12:11,221][71000] Updated weights for policy 0, policy_version 103044 (0.0023) [2024-06-12 23:12:14,102][71000] Updated weights for policy 0, policy_version 103054 (0.0025) [2024-06-12 23:12:15,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 1688485888. Throughput: 0: 48844.7. Samples: 1217369940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:12:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:12:17,857][71000] Updated weights for policy 0, policy_version 103064 (0.0036) [2024-06-12 23:12:20,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 1688731648. Throughput: 0: 48914.9. Samples: 1217518980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:12:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:12:21,289][71000] Updated weights for policy 0, policy_version 103074 (0.0030) [2024-06-12 23:12:24,503][71000] Updated weights for policy 0, policy_version 103084 (0.0032) [2024-06-12 23:12:25,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 49374.2). Total num frames: 1688993792. Throughput: 0: 48947.6. Samples: 1217816160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:12:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:12:27,676][71000] Updated weights for policy 0, policy_version 103094 (0.0024) [2024-06-12 23:12:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48606.0, 300 sec: 49318.6). Total num frames: 1689223168. Throughput: 0: 49093.1. Samples: 1218109840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:12:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:12:31,238][71000] Updated weights for policy 0, policy_version 103104 (0.0034) [2024-06-12 23:12:34,085][71000] Updated weights for policy 0, policy_version 103114 (0.0034) [2024-06-12 23:12:35,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.9, 300 sec: 49318.6). Total num frames: 1689468928. Throughput: 0: 48993.9. Samples: 1218255420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:12:35,948][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:12:37,733][71000] Updated weights for policy 0, policy_version 103124 (0.0025) [2024-06-12 23:12:38,393][70980] Signal inference workers to stop experience collection... (17900 times) [2024-06-12 23:12:38,428][71000] InferenceWorker_p0-w0: stopping experience collection (17900 times) [2024-06-12 23:12:38,451][70980] Signal inference workers to resume experience collection... (17900 times) [2024-06-12 23:12:38,455][71000] InferenceWorker_p0-w0: resuming experience collection (17900 times) [2024-06-12 23:12:40,829][71000] Updated weights for policy 0, policy_version 103134 (0.0028) [2024-06-12 23:12:40,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1689747456. Throughput: 0: 49187.9. Samples: 1218552180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:12:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:12:44,593][71000] Updated weights for policy 0, policy_version 103144 (0.0029) [2024-06-12 23:12:45,939][70768] Fps is (10 sec: 52429.5, 60 sec: 49152.2, 300 sec: 49374.2). Total num frames: 1689993216. Throughput: 0: 49306.0. Samples: 1218853100. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:12:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:12:47,417][71000] Updated weights for policy 0, policy_version 103154 (0.0027) [2024-06-12 23:12:50,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49152.2, 300 sec: 49263.1). Total num frames: 1690206208. Throughput: 0: 49197.8. Samples: 1219000600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:12:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:12:51,162][71000] Updated weights for policy 0, policy_version 103164 (0.0028) [2024-06-12 23:12:54,158][71000] Updated weights for policy 0, policy_version 103174 (0.0032) [2024-06-12 23:12:55,939][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1690451968. Throughput: 0: 49199.6. Samples: 1219291460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:12:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:12:57,945][71000] Updated weights for policy 0, policy_version 103184 (0.0022) [2024-06-12 23:13:00,856][71000] Updated weights for policy 0, policy_version 103194 (0.0025) [2024-06-12 23:13:00,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 1690730496. Throughput: 0: 49302.1. Samples: 1219588540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:13:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:13:04,610][71000] Updated weights for policy 0, policy_version 103204 (0.0030) [2024-06-12 23:13:05,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 1690992640. Throughput: 0: 49462.1. Samples: 1219744780. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:13:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:13:07,557][71000] Updated weights for policy 0, policy_version 103214 (0.0028) [2024-06-12 23:13:10,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 1691205632. Throughput: 0: 49464.9. Samples: 1220042080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 23:13:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:13:11,076][71000] Updated weights for policy 0, policy_version 103224 (0.0032) [2024-06-12 23:13:14,047][71000] Updated weights for policy 0, policy_version 103234 (0.0026) [2024-06-12 23:13:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 1691467776. Throughput: 0: 49683.5. Samples: 1220345600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 23:13:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 23:13:17,386][71000] Updated weights for policy 0, policy_version 103244 (0.0029) [2024-06-12 23:13:20,520][71000] Updated weights for policy 0, policy_version 103254 (0.0031) [2024-06-12 23:13:20,939][70768] Fps is (10 sec: 54067.6, 60 sec: 50244.3, 300 sec: 49429.7). Total num frames: 1691746304. Throughput: 0: 50021.5. Samples: 1220506380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 23:13:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:13:24,251][71000] Updated weights for policy 0, policy_version 103264 (0.0026) [2024-06-12 23:13:25,940][70768] Fps is (10 sec: 52429.7, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1691992064. Throughput: 0: 49882.4. Samples: 1220796880. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 23:13:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:13:27,096][71000] Updated weights for policy 0, policy_version 103274 (0.0024) [2024-06-12 23:13:30,923][71000] Updated weights for policy 0, policy_version 103284 (0.0030) [2024-06-12 23:13:30,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 1692205056. Throughput: 0: 49736.9. Samples: 1221091260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 23:13:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:13:33,762][71000] Updated weights for policy 0, policy_version 103294 (0.0030) [2024-06-12 23:13:35,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 1692467200. Throughput: 0: 49699.8. Samples: 1221237100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 23:13:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:13:37,084][70980] Signal inference workers to stop experience collection... (17950 times) [2024-06-12 23:13:37,144][71000] InferenceWorker_p0-w0: stopping experience collection (17950 times) [2024-06-12 23:13:37,193][70980] Signal inference workers to resume experience collection... (17950 times) [2024-06-12 23:13:37,193][71000] InferenceWorker_p0-w0: resuming experience collection (17950 times) [2024-06-12 23:13:37,325][71000] Updated weights for policy 0, policy_version 103304 (0.0024) [2024-06-12 23:13:40,133][71000] Updated weights for policy 0, policy_version 103314 (0.0029) [2024-06-12 23:13:40,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1692729344. Throughput: 0: 49757.2. Samples: 1221530540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 23:13:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:13:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000103316_1692729344.pth... 
[2024-06-12 23:13:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000102590_1680834560.pth [2024-06-12 23:13:43,739][71000] Updated weights for policy 0, policy_version 103324 (0.0031) [2024-06-12 23:13:45,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1692975104. Throughput: 0: 49793.4. Samples: 1221829240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 23:13:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:13:46,787][71000] Updated weights for policy 0, policy_version 103334 (0.0033) [2024-06-12 23:13:50,653][71000] Updated weights for policy 0, policy_version 103344 (0.0032) [2024-06-12 23:13:50,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49971.0, 300 sec: 49540.8). Total num frames: 1693204480. Throughput: 0: 49864.3. Samples: 1221988680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 23:13:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:13:53,489][71000] Updated weights for policy 0, policy_version 103354 (0.0042) [2024-06-12 23:13:55,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49971.2, 300 sec: 49485.3). Total num frames: 1693450240. Throughput: 0: 49757.9. Samples: 1222281180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 23:13:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:13:57,238][71000] Updated weights for policy 0, policy_version 103364 (0.0030) [2024-06-12 23:14:00,033][71000] Updated weights for policy 0, policy_version 103374 (0.0032) [2024-06-12 23:14:00,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1693696000. Throughput: 0: 49644.1. Samples: 1222579580. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 23:14:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:14:03,815][71000] Updated weights for policy 0, policy_version 103384 (0.0027) [2024-06-12 23:14:05,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1693974528. Throughput: 0: 49517.3. Samples: 1222734660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-12 23:14:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:14:06,507][71000] Updated weights for policy 0, policy_version 103394 (0.0032) [2024-06-12 23:14:10,472][71000] Updated weights for policy 0, policy_version 103404 (0.0029) [2024-06-12 23:14:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1694187520. Throughput: 0: 49430.2. Samples: 1223021240. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 23:14:10,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 23:14:13,282][71000] Updated weights for policy 0, policy_version 103414 (0.0031) [2024-06-12 23:14:15,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1694433280. Throughput: 0: 49334.2. Samples: 1223311300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 23:14:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:14:17,104][71000] Updated weights for policy 0, policy_version 103424 (0.0029) [2024-06-12 23:14:20,133][71000] Updated weights for policy 0, policy_version 103434 (0.0033) [2024-06-12 23:14:20,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.8, 300 sec: 49374.1). Total num frames: 1694679040. Throughput: 0: 49311.6. Samples: 1223456120. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 23:14:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:14:23,722][71000] Updated weights for policy 0, policy_version 103444 (0.0022) [2024-06-12 23:14:25,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1694957568. Throughput: 0: 49490.2. Samples: 1223757600. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 23:14:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:14:26,569][71000] Updated weights for policy 0, policy_version 103454 (0.0025) [2024-06-12 23:14:30,004][71000] Updated weights for policy 0, policy_version 103464 (0.0028) [2024-06-12 23:14:30,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49971.0, 300 sec: 49540.7). Total num frames: 1695203328. Throughput: 0: 49652.2. Samples: 1224063600. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 23:14:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:14:33,259][71000] Updated weights for policy 0, policy_version 103474 (0.0037) [2024-06-12 23:14:35,939][70768] Fps is (10 sec: 47514.7, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1695432704. Throughput: 0: 49290.0. Samples: 1224206720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 23:14:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:14:36,864][71000] Updated weights for policy 0, policy_version 103484 (0.0031) [2024-06-12 23:14:39,702][71000] Updated weights for policy 0, policy_version 103494 (0.0029) [2024-06-12 23:14:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.8, 300 sec: 49374.1). Total num frames: 1695678464. Throughput: 0: 49347.2. Samples: 1224501820. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 23:14:40,941][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:14:43,452][71000] Updated weights for policy 0, policy_version 103504 (0.0028) [2024-06-12 23:14:45,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1695956992. Throughput: 0: 49318.3. Samples: 1224798900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 23:14:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:14:46,441][71000] Updated weights for policy 0, policy_version 103514 (0.0029) [2024-06-12 23:14:49,855][71000] Updated weights for policy 0, policy_version 103524 (0.0037) [2024-06-12 23:14:50,940][70768] Fps is (10 sec: 49153.5, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1696169984. Throughput: 0: 49401.3. Samples: 1224957720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 23:14:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:14:53,153][71000] Updated weights for policy 0, policy_version 103534 (0.0031) [2024-06-12 23:14:55,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1696432128. Throughput: 0: 49683.1. Samples: 1225256980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 23:14:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:14:56,335][70980] Signal inference workers to stop experience collection... (18000 times) [2024-06-12 23:14:56,371][71000] InferenceWorker_p0-w0: stopping experience collection (18000 times) [2024-06-12 23:14:56,443][70980] Signal inference workers to resume experience collection... 
(18000 times) [2024-06-12 23:14:56,443][71000] InferenceWorker_p0-w0: resuming experience collection (18000 times) [2024-06-12 23:14:56,595][71000] Updated weights for policy 0, policy_version 103544 (0.0027) [2024-06-12 23:14:59,829][71000] Updated weights for policy 0, policy_version 103554 (0.0029) [2024-06-12 23:15:00,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1696645120. Throughput: 0: 49416.5. Samples: 1225535040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 23:15:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:15:03,310][71000] Updated weights for policy 0, policy_version 103564 (0.0030) [2024-06-12 23:15:05,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1696923648. Throughput: 0: 49412.2. Samples: 1225679660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-12 23:15:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:15:06,813][71000] Updated weights for policy 0, policy_version 103574 (0.0030) [2024-06-12 23:15:10,162][71000] Updated weights for policy 0, policy_version 103584 (0.0032) [2024-06-12 23:15:10,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1697153024. Throughput: 0: 49367.7. Samples: 1225979140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 23:15:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:15:13,214][71000] Updated weights for policy 0, policy_version 103594 (0.0026) [2024-06-12 23:15:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1697415168. Throughput: 0: 49225.2. Samples: 1226278720. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 23:15:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:15:16,506][71000] Updated weights for policy 0, policy_version 103604 (0.0022) [2024-06-12 23:15:19,788][71000] Updated weights for policy 0, policy_version 103614 (0.0031) [2024-06-12 23:15:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1697644544. Throughput: 0: 49239.0. Samples: 1226422480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 23:15:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:15:23,431][71000] Updated weights for policy 0, policy_version 103624 (0.0032) [2024-06-12 23:15:25,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1697906688. Throughput: 0: 49249.2. Samples: 1226718020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 23:15:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:15:26,541][71000] Updated weights for policy 0, policy_version 103634 (0.0037) [2024-06-12 23:15:29,941][71000] Updated weights for policy 0, policy_version 103644 (0.0027) [2024-06-12 23:15:30,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.2, 300 sec: 49374.2). Total num frames: 1698152448. Throughput: 0: 49276.5. Samples: 1227016340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 23:15:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:15:32,842][71000] Updated weights for policy 0, policy_version 103654 (0.0024) [2024-06-12 23:15:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 1698398208. Throughput: 0: 49208.5. Samples: 1227172100. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 23:15:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:15:36,332][71000] Updated weights for policy 0, policy_version 103664 (0.0032) [2024-06-12 23:15:39,475][71000] Updated weights for policy 0, policy_version 103674 (0.0028) [2024-06-12 23:15:40,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1698643968. Throughput: 0: 49202.2. Samples: 1227471080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 23:15:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:15:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000103677_1698643968.pth... [2024-06-12 23:15:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000102954_1686798336.pth [2024-06-12 23:15:42,996][71000] Updated weights for policy 0, policy_version 103684 (0.0036) [2024-06-12 23:15:45,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1698906112. Throughput: 0: 49602.3. Samples: 1227767140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 23:15:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:15:45,950][71000] Updated weights for policy 0, policy_version 103694 (0.0029) [2024-06-12 23:15:50,092][71000] Updated weights for policy 0, policy_version 103704 (0.0038) [2024-06-12 23:15:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49424.9, 300 sec: 49374.2). Total num frames: 1699135488. Throughput: 0: 49539.8. Samples: 1227908960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 23:15:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:15:52,505][71000] Updated weights for policy 0, policy_version 103714 (0.0035) [2024-06-12 23:15:55,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1699381248. Throughput: 0: 49532.0. Samples: 1228208080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 23:15:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:15:56,415][71000] Updated weights for policy 0, policy_version 103724 (0.0027) [2024-06-12 23:15:59,275][71000] Updated weights for policy 0, policy_version 103734 (0.0033) [2024-06-12 23:16:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.0, 300 sec: 49319.3). Total num frames: 1699627008. Throughput: 0: 49577.2. Samples: 1228509700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 23:16:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:16:02,768][71000] Updated weights for policy 0, policy_version 103744 (0.0034) [2024-06-12 23:16:05,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 1699889152. Throughput: 0: 49487.6. Samples: 1228649420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 23:16:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:16:05,999][71000] Updated weights for policy 0, policy_version 103754 (0.0022) [2024-06-12 23:16:09,548][71000] Updated weights for policy 0, policy_version 103764 (0.0023) [2024-06-12 23:16:10,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1700118528. Throughput: 0: 49594.3. Samples: 1228949760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-12 23:16:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:16:12,027][70980] Signal inference workers to stop experience collection... (18050 times) [2024-06-12 23:16:12,054][71000] InferenceWorker_p0-w0: stopping experience collection (18050 times) [2024-06-12 23:16:12,080][70980] Signal inference workers to resume experience collection... 
(18050 times) [2024-06-12 23:16:12,080][71000] InferenceWorker_p0-w0: resuming experience collection (18050 times) [2024-06-12 23:16:12,498][71000] Updated weights for policy 0, policy_version 103774 (0.0025) [2024-06-12 23:16:15,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 1700364288. Throughput: 0: 49508.9. Samples: 1229244240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 23:16:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:16:16,115][71000] Updated weights for policy 0, policy_version 103784 (0.0030) [2024-06-12 23:16:19,084][71000] Updated weights for policy 0, policy_version 103794 (0.0026) [2024-06-12 23:16:20,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1700610048. Throughput: 0: 49411.4. Samples: 1229395620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 23:16:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:16:22,567][71000] Updated weights for policy 0, policy_version 103804 (0.0028) [2024-06-12 23:16:25,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49318.7). Total num frames: 1700855808. Throughput: 0: 49428.6. Samples: 1229695360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 23:16:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:16:26,145][71000] Updated weights for policy 0, policy_version 103814 (0.0031) [2024-06-12 23:16:29,614][71000] Updated weights for policy 0, policy_version 103824 (0.0031) [2024-06-12 23:16:30,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1701101568. Throughput: 0: 49175.5. Samples: 1229980040. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 23:16:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:16:32,817][71000] Updated weights for policy 0, policy_version 103834 (0.0033) [2024-06-12 23:16:35,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1701363712. Throughput: 0: 49415.4. Samples: 1230132640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 23:16:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:16:35,968][71000] Updated weights for policy 0, policy_version 103844 (0.0028) [2024-06-12 23:16:39,143][71000] Updated weights for policy 0, policy_version 103854 (0.0021) [2024-06-12 23:16:40,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49318.7). Total num frames: 1701593088. Throughput: 0: 49177.0. Samples: 1230421040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 23:16:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:16:42,650][71000] Updated weights for policy 0, policy_version 103864 (0.0033) [2024-06-12 23:16:45,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.8, 300 sec: 49429.7). Total num frames: 1701838848. Throughput: 0: 49145.0. Samples: 1230721220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 23:16:45,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:16:46,163][71000] Updated weights for policy 0, policy_version 103874 (0.0038) [2024-06-12 23:16:49,191][71000] Updated weights for policy 0, policy_version 103884 (0.0019) [2024-06-12 23:16:50,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.2, 300 sec: 49429.7). Total num frames: 1702084608. Throughput: 0: 49331.6. Samples: 1230869340. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 23:16:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:16:53,131][71000] Updated weights for policy 0, policy_version 103894 (0.0028) [2024-06-12 23:16:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1702346752. Throughput: 0: 49231.5. Samples: 1231165180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 23:16:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:16:56,018][71000] Updated weights for policy 0, policy_version 103904 (0.0032) [2024-06-12 23:16:59,302][71000] Updated weights for policy 0, policy_version 103914 (0.0027) [2024-06-12 23:17:00,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1702608896. Throughput: 0: 49480.9. Samples: 1231470880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 23:17:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:17:02,691][71000] Updated weights for policy 0, policy_version 103924 (0.0026) [2024-06-12 23:17:05,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.8, 300 sec: 49485.2). Total num frames: 1702838272. Throughput: 0: 49269.3. Samples: 1231612740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 23:17:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:17:06,072][71000] Updated weights for policy 0, policy_version 103934 (0.0032) [2024-06-12 23:17:09,231][71000] Updated weights for policy 0, policy_version 103944 (0.0023) [2024-06-12 23:17:10,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1703067648. Throughput: 0: 49092.4. Samples: 1231904520. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-12 23:17:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:17:12,786][71000] Updated weights for policy 0, policy_version 103954 (0.0029) [2024-06-12 23:17:14,160][70980] Signal inference workers to stop experience collection... (18100 times) [2024-06-12 23:17:14,160][70980] Signal inference workers to resume experience collection... (18100 times) [2024-06-12 23:17:14,197][71000] InferenceWorker_p0-w0: stopping experience collection (18100 times) [2024-06-12 23:17:14,197][71000] InferenceWorker_p0-w0: resuming experience collection (18100 times) [2024-06-12 23:17:15,836][71000] Updated weights for policy 0, policy_version 103964 (0.0034) [2024-06-12 23:17:15,941][70768] Fps is (10 sec: 50785.0, 60 sec: 49697.1, 300 sec: 49540.6). Total num frames: 1703346176. Throughput: 0: 49306.6. Samples: 1232198900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 23:17:15,941][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:17:19,376][71000] Updated weights for policy 0, policy_version 103974 (0.0029) [2024-06-12 23:17:20,941][70768] Fps is (10 sec: 52419.8, 60 sec: 49696.8, 300 sec: 49484.9). Total num frames: 1703591936. Throughput: 0: 49243.0. Samples: 1232348660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 23:17:20,942][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:17:22,526][71000] Updated weights for policy 0, policy_version 103984 (0.0024) [2024-06-12 23:17:25,841][71000] Updated weights for policy 0, policy_version 103994 (0.0021) [2024-06-12 23:17:25,940][70768] Fps is (10 sec: 49158.0, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1703837696. Throughput: 0: 49694.1. Samples: 1232657280. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 23:17:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:17:28,898][71000] Updated weights for policy 0, policy_version 104004 (0.0028) [2024-06-12 23:17:30,940][70768] Fps is (10 sec: 45883.0, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1704050688. Throughput: 0: 49539.1. Samples: 1232950480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 23:17:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:17:32,442][71000] Updated weights for policy 0, policy_version 104014 (0.0023) [2024-06-12 23:17:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 1704312832. Throughput: 0: 49491.5. Samples: 1233096460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 23:17:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:17:36,063][71000] Updated weights for policy 0, policy_version 104024 (0.0026) [2024-06-12 23:17:39,328][71000] Updated weights for policy 0, policy_version 104034 (0.0024) [2024-06-12 23:17:40,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49971.0, 300 sec: 49485.2). Total num frames: 1704591360. Throughput: 0: 49511.8. Samples: 1233393220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 23:17:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:17:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000104040_1704591360.pth... [2024-06-12 23:17:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000103316_1692729344.pth [2024-06-12 23:17:42,411][71000] Updated weights for policy 0, policy_version 104044 (0.0031) [2024-06-12 23:17:45,788][71000] Updated weights for policy 0, policy_version 104054 (0.0026) [2024-06-12 23:17:45,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49971.1, 300 sec: 49596.3). Total num frames: 1704837120. Throughput: 0: 49502.1. Samples: 1233698480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 23:17:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:17:48,995][71000] Updated weights for policy 0, policy_version 104064 (0.0033) [2024-06-12 23:17:50,940][70768] Fps is (10 sec: 44237.6, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1705033728. Throughput: 0: 49638.9. Samples: 1233846480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 23:17:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:17:52,343][71000] Updated weights for policy 0, policy_version 104074 (0.0027) [2024-06-12 23:17:55,622][71000] Updated weights for policy 0, policy_version 104084 (0.0021) [2024-06-12 23:17:55,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1705312256. Throughput: 0: 49623.1. Samples: 1234137560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 23:17:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:17:59,048][71000] Updated weights for policy 0, policy_version 104094 (0.0029) [2024-06-12 23:18:00,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1705574400. Throughput: 0: 49424.4. Samples: 1234422940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 23:18:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:18:02,563][71000] Updated weights for policy 0, policy_version 104104 (0.0036) [2024-06-12 23:18:05,929][71000] Updated weights for policy 0, policy_version 104114 (0.0024) [2024-06-12 23:18:05,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.3, 300 sec: 49485.2). Total num frames: 1705803776. Throughput: 0: 49651.3. Samples: 1234582880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 23:18:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:18:09,000][71000] Updated weights for policy 0, policy_version 104124 (0.0035) [2024-06-12 23:18:10,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1706033152. Throughput: 0: 49311.6. Samples: 1234876300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-12 23:18:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:18:12,283][71000] Updated weights for policy 0, policy_version 104134 (0.0028) [2024-06-12 23:18:15,544][71000] Updated weights for policy 0, policy_version 104144 (0.0024) [2024-06-12 23:18:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49153.0, 300 sec: 49318.6). Total num frames: 1706295296. Throughput: 0: 49569.4. Samples: 1235181100. Policy #0 lag: (min: 2.0, avg: 11.4, max: 26.0) [2024-06-12 23:18:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-12 23:18:19,046][71000] Updated weights for policy 0, policy_version 104154 (0.0031) [2024-06-12 23:18:20,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49426.5, 300 sec: 49374.1). Total num frames: 1706557440. Throughput: 0: 49582.7. Samples: 1235327680. Policy #0 lag: (min: 2.0, avg: 11.4, max: 26.0) [2024-06-12 23:18:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:18:22,125][71000] Updated weights for policy 0, policy_version 104164 (0.0030) [2024-06-12 23:18:24,364][70980] Signal inference workers to stop experience collection... (18150 times) [2024-06-12 23:18:24,404][71000] InferenceWorker_p0-w0: stopping experience collection (18150 times) [2024-06-12 23:18:24,414][70980] Signal inference workers to resume experience collection... 
(18150 times) [2024-06-12 23:18:24,429][71000] InferenceWorker_p0-w0: resuming experience collection (18150 times) [2024-06-12 23:18:25,685][71000] Updated weights for policy 0, policy_version 104174 (0.0029) [2024-06-12 23:18:25,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1706803200. Throughput: 0: 49464.5. Samples: 1235619120. Policy #0 lag: (min: 2.0, avg: 11.4, max: 26.0) [2024-06-12 23:18:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:18:28,664][71000] Updated weights for policy 0, policy_version 104184 (0.0024) [2024-06-12 23:18:30,940][70768] Fps is (10 sec: 44236.8, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1706999808. Throughput: 0: 49170.3. Samples: 1235911140. Policy #0 lag: (min: 2.0, avg: 11.4, max: 26.0) [2024-06-12 23:18:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:18:32,474][71000] Updated weights for policy 0, policy_version 104194 (0.0025) [2024-06-12 23:18:35,109][71000] Updated weights for policy 0, policy_version 104204 (0.0026) [2024-06-12 23:18:35,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 1707278336. Throughput: 0: 49013.1. Samples: 1236052080. Policy #0 lag: (min: 2.0, avg: 11.4, max: 26.0) [2024-06-12 23:18:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:18:38,799][71000] Updated weights for policy 0, policy_version 104214 (0.0036) [2024-06-12 23:18:40,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 1707540480. Throughput: 0: 49263.8. Samples: 1236354440. 
Policy #0 lag: (min: 2.0, avg: 11.4, max: 26.0) [2024-06-12 23:18:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:18:42,061][71000] Updated weights for policy 0, policy_version 104224 (0.0033) [2024-06-12 23:18:45,672][71000] Updated weights for policy 0, policy_version 104234 (0.0024) [2024-06-12 23:18:45,940][70768] Fps is (10 sec: 49153.0, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 1707769856. Throughput: 0: 49365.9. Samples: 1236644400. Policy #0 lag: (min: 2.0, avg: 11.4, max: 26.0) [2024-06-12 23:18:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:18:49,232][71000] Updated weights for policy 0, policy_version 104244 (0.0031) [2024-06-12 23:18:50,939][70768] Fps is (10 sec: 44237.8, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1707982848. Throughput: 0: 48969.4. Samples: 1236786500. Policy #0 lag: (min: 2.0, avg: 11.4, max: 26.0) [2024-06-12 23:18:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:18:52,286][71000] Updated weights for policy 0, policy_version 104254 (0.0027) [2024-06-12 23:18:55,627][71000] Updated weights for policy 0, policy_version 104264 (0.0027) [2024-06-12 23:18:55,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 1708261376. Throughput: 0: 48921.2. Samples: 1237077760. Policy #0 lag: (min: 2.0, avg: 11.4, max: 26.0) [2024-06-12 23:18:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:18:58,817][71000] Updated weights for policy 0, policy_version 104274 (0.0029) [2024-06-12 23:19:00,939][70768] Fps is (10 sec: 57343.8, 60 sec: 49698.3, 300 sec: 49429.7). Total num frames: 1708556288. Throughput: 0: 48888.1. Samples: 1237381060. 
Policy #0 lag: (min: 2.0, avg: 11.4, max: 26.0) [2024-06-12 23:19:00,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 23:19:02,494][71000] Updated weights for policy 0, policy_version 104284 (0.0027) [2024-06-12 23:19:05,478][71000] Updated weights for policy 0, policy_version 104294 (0.0025) [2024-06-12 23:19:05,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1708769280. Throughput: 0: 49269.3. Samples: 1237544800. Policy #0 lag: (min: 2.0, avg: 11.4, max: 26.0) [2024-06-12 23:19:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 23:19:09,234][71000] Updated weights for policy 0, policy_version 104304 (0.0039) [2024-06-12 23:19:10,940][70768] Fps is (10 sec: 44235.8, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 1708998656. Throughput: 0: 49180.4. Samples: 1237832240. Policy #0 lag: (min: 2.0, avg: 11.4, max: 26.0) [2024-06-12 23:19:10,941][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:19:12,137][71000] Updated weights for policy 0, policy_version 104314 (0.0032) [2024-06-12 23:19:15,807][71000] Updated weights for policy 0, policy_version 104324 (0.0035) [2024-06-12 23:19:15,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1709244416. Throughput: 0: 49115.3. Samples: 1238121320. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-12 23:19:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:19:18,932][71000] Updated weights for policy 0, policy_version 104334 (0.0030) [2024-06-12 23:19:20,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1709522944. Throughput: 0: 49510.9. Samples: 1238280060. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-12 23:19:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:19:22,402][71000] Updated weights for policy 0, policy_version 104344 (0.0025) [2024-06-12 23:19:25,519][71000] Updated weights for policy 0, policy_version 104354 (0.0030) [2024-06-12 23:19:25,940][70768] Fps is (10 sec: 50789.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1709752320. Throughput: 0: 49536.5. Samples: 1238583580. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-12 23:19:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:19:29,017][71000] Updated weights for policy 0, policy_version 104364 (0.0025) [2024-06-12 23:19:30,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 1709981696. Throughput: 0: 49433.3. Samples: 1238868900. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-12 23:19:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:19:31,900][71000] Updated weights for policy 0, policy_version 104374 (0.0023) [2024-06-12 23:19:35,916][71000] Updated weights for policy 0, policy_version 104384 (0.0023) [2024-06-12 23:19:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.1, 300 sec: 49318.7). Total num frames: 1710227456. Throughput: 0: 49346.9. Samples: 1239007120. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-12 23:19:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:19:38,814][71000] Updated weights for policy 0, policy_version 104394 (0.0029) [2024-06-12 23:19:40,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1710505984. Throughput: 0: 49478.4. Samples: 1239304280. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-12 23:19:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 23:19:41,019][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000104402_1710522368.pth... 
[2024-06-12 23:19:41,059][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000103677_1698643968.pth [2024-06-12 23:19:42,544][71000] Updated weights for policy 0, policy_version 104404 (0.0023) [2024-06-12 23:19:43,588][70980] Signal inference workers to stop experience collection... (18200 times) [2024-06-12 23:19:43,629][71000] InferenceWorker_p0-w0: stopping experience collection (18200 times) [2024-06-12 23:19:43,637][70980] Signal inference workers to resume experience collection... (18200 times) [2024-06-12 23:19:43,644][71000] InferenceWorker_p0-w0: resuming experience collection (18200 times) [2024-06-12 23:19:45,109][71000] Updated weights for policy 0, policy_version 104414 (0.0035) [2024-06-12 23:19:45,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1710751744. Throughput: 0: 49457.8. Samples: 1239606660. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-12 23:19:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 23:19:48,835][71000] Updated weights for policy 0, policy_version 104424 (0.0031) [2024-06-12 23:19:50,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49971.1, 300 sec: 49318.6). Total num frames: 1710981120. Throughput: 0: 49206.6. Samples: 1239759100. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-12 23:19:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:19:51,875][71000] Updated weights for policy 0, policy_version 104434 (0.0042) [2024-06-12 23:19:55,851][71000] Updated weights for policy 0, policy_version 104444 (0.0024) [2024-06-12 23:19:55,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.1, 300 sec: 49374.1). Total num frames: 1711210496. Throughput: 0: 49140.2. Samples: 1240043540. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-12 23:19:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 23:19:58,613][71000] Updated weights for policy 0, policy_version 104454 (0.0030) [2024-06-12 23:20:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.7, 300 sec: 49318.6). Total num frames: 1711472640. Throughput: 0: 49033.9. Samples: 1240327860. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-12 23:20:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:20:02,455][71000] Updated weights for policy 0, policy_version 104464 (0.0023) [2024-06-12 23:20:05,341][71000] Updated weights for policy 0, policy_version 104474 (0.0038) [2024-06-12 23:20:05,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1711718400. Throughput: 0: 49158.6. Samples: 1240492200. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-12 23:20:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:20:09,144][71000] Updated weights for policy 0, policy_version 104484 (0.0022) [2024-06-12 23:20:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1711964160. Throughput: 0: 48936.4. Samples: 1240785720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-12 23:20:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:20:12,223][71000] Updated weights for policy 0, policy_version 104494 (0.0026) [2024-06-12 23:20:15,909][71000] Updated weights for policy 0, policy_version 104504 (0.0027) [2024-06-12 23:20:15,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1712193536. Throughput: 0: 49206.3. Samples: 1241083180. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 23:20:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:20:18,614][71000] Updated weights for policy 0, policy_version 104514 (0.0036) [2024-06-12 23:20:20,939][70768] Fps is (10 sec: 49152.9, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 1712455680. Throughput: 0: 49268.6. Samples: 1241224200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 23:20:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:20:22,608][71000] Updated weights for policy 0, policy_version 104524 (0.0038) [2024-06-12 23:20:25,134][71000] Updated weights for policy 0, policy_version 104534 (0.0025) [2024-06-12 23:20:25,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.2, 300 sec: 49374.1). Total num frames: 1712717824. Throughput: 0: 49336.8. Samples: 1241524440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 23:20:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:20:29,120][71000] Updated weights for policy 0, policy_version 104544 (0.0034) [2024-06-12 23:20:30,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1712947200. Throughput: 0: 49271.4. Samples: 1241823880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 23:20:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:20:31,847][71000] Updated weights for policy 0, policy_version 104554 (0.0027) [2024-06-12 23:20:35,654][71000] Updated weights for policy 0, policy_version 104564 (0.0030) [2024-06-12 23:20:35,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 1713176576. Throughput: 0: 49077.5. Samples: 1241967580. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 23:20:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:20:38,685][71000] Updated weights for policy 0, policy_version 104574 (0.0024) [2024-06-12 23:20:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.8, 300 sec: 49263.0). Total num frames: 1713438720. Throughput: 0: 49247.0. Samples: 1242259660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 23:20:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:20:42,085][71000] Updated weights for policy 0, policy_version 104584 (0.0026) [2024-06-12 23:20:45,242][71000] Updated weights for policy 0, policy_version 104594 (0.0022) [2024-06-12 23:20:45,943][70768] Fps is (10 sec: 52411.0, 60 sec: 49149.2, 300 sec: 49373.6). Total num frames: 1713700864. Throughput: 0: 49602.6. Samples: 1242560140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 23:20:45,943][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:20:48,687][71000] Updated weights for policy 0, policy_version 104604 (0.0029) [2024-06-12 23:20:50,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1713930240. Throughput: 0: 49346.7. Samples: 1242712800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 23:20:50,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 23:20:51,754][71000] Updated weights for policy 0, policy_version 104614 (0.0028) [2024-06-12 23:20:55,226][71000] Updated weights for policy 0, policy_version 104624 (0.0024) [2024-06-12 23:20:55,940][70768] Fps is (10 sec: 47529.1, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1714176000. Throughput: 0: 49224.0. Samples: 1243000800. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 23:20:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:20:58,505][71000] Updated weights for policy 0, policy_version 104634 (0.0036) [2024-06-12 23:21:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 1714421760. Throughput: 0: 49200.0. Samples: 1243297180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 23:21:00,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 23:21:01,809][71000] Updated weights for policy 0, policy_version 104644 (0.0032) [2024-06-12 23:21:05,154][71000] Updated weights for policy 0, policy_version 104654 (0.0028) [2024-06-12 23:21:05,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1714667520. Throughput: 0: 49387.9. Samples: 1243446660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 23:21:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:21:08,447][71000] Updated weights for policy 0, policy_version 104664 (0.0028) [2024-06-12 23:21:10,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1714913280. Throughput: 0: 49361.3. Samples: 1243745700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-12 23:21:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:21:11,952][71000] Updated weights for policy 0, policy_version 104674 (0.0023) [2024-06-12 23:21:13,242][70980] Signal inference workers to stop experience collection... (18250 times) [2024-06-12 23:21:13,273][71000] InferenceWorker_p0-w0: stopping experience collection (18250 times) [2024-06-12 23:21:13,300][70980] Signal inference workers to resume experience collection... 
(18250 times) [2024-06-12 23:21:13,301][71000] InferenceWorker_p0-w0: resuming experience collection (18250 times) [2024-06-12 23:21:14,859][71000] Updated weights for policy 0, policy_version 104684 (0.0026) [2024-06-12 23:21:15,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1715159040. Throughput: 0: 49309.9. Samples: 1244042820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 23:21:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:21:18,315][71000] Updated weights for policy 0, policy_version 104694 (0.0034) [2024-06-12 23:21:20,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1715404800. Throughput: 0: 49476.0. Samples: 1244194000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 23:21:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:21:21,826][71000] Updated weights for policy 0, policy_version 104704 (0.0034) [2024-06-12 23:21:24,797][71000] Updated weights for policy 0, policy_version 104714 (0.0029) [2024-06-12 23:21:25,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 1715666944. Throughput: 0: 49655.5. Samples: 1244494160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 23:21:25,940][70768] Avg episode reward: [(0, '0.261')] [2024-06-12 23:21:28,364][71000] Updated weights for policy 0, policy_version 104724 (0.0024) [2024-06-12 23:21:30,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1715912704. Throughput: 0: 49545.5. Samples: 1244789520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 23:21:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:21:31,635][71000] Updated weights for policy 0, policy_version 104734 (0.0036) [2024-06-12 23:21:34,932][71000] Updated weights for policy 0, policy_version 104744 (0.0025) [2024-06-12 23:21:35,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1716174848. Throughput: 0: 49388.5. Samples: 1244935280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 23:21:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:21:38,201][71000] Updated weights for policy 0, policy_version 104754 (0.0021) [2024-06-12 23:21:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1716420608. Throughput: 0: 49750.8. Samples: 1245239580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 23:21:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:21:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000104762_1716420608.pth... [2024-06-12 23:21:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000104040_1704591360.pth [2024-06-12 23:21:41,403][71000] Updated weights for policy 0, policy_version 104764 (0.0024) [2024-06-12 23:21:44,599][71000] Updated weights for policy 0, policy_version 104774 (0.0034) [2024-06-12 23:21:45,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49154.8, 300 sec: 49374.2). Total num frames: 1716649984. Throughput: 0: 49789.8. Samples: 1245537720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 23:21:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:21:48,143][71000] Updated weights for policy 0, policy_version 104784 (0.0030) [2024-06-12 23:21:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 1716912128. Throughput: 0: 49697.8. Samples: 1245683060. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 23:21:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:21:51,536][71000] Updated weights for policy 0, policy_version 104794 (0.0027) [2024-06-12 23:21:54,559][71000] Updated weights for policy 0, policy_version 104804 (0.0038) [2024-06-12 23:21:55,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49971.3, 300 sec: 49374.1). Total num frames: 1717174272. Throughput: 0: 49884.0. Samples: 1245990480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 23:21:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-12 23:21:57,964][71000] Updated weights for policy 0, policy_version 104814 (0.0028) [2024-06-12 23:22:00,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 1717420032. Throughput: 0: 49725.1. Samples: 1246280460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 23:22:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:22:01,001][71000] Updated weights for policy 0, policy_version 104824 (0.0030) [2024-06-12 23:22:04,471][71000] Updated weights for policy 0, policy_version 104834 (0.0037) [2024-06-12 23:22:05,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1717649408. Throughput: 0: 49675.5. Samples: 1246429400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 23:22:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:22:07,569][71000] Updated weights for policy 0, policy_version 104844 (0.0022) [2024-06-12 23:22:10,940][70768] Fps is (10 sec: 45876.0, 60 sec: 49425.2, 300 sec: 49263.3). Total num frames: 1717878784. Throughput: 0: 49547.8. Samples: 1246723800. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-12 23:22:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 23:22:11,503][71000] Updated weights for policy 0, policy_version 104854 (0.0031) [2024-06-12 23:22:14,661][71000] Updated weights for policy 0, policy_version 104864 (0.0023) [2024-06-12 23:22:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49971.1, 300 sec: 49374.4). Total num frames: 1718157312. Throughput: 0: 49440.8. Samples: 1247014360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:22:15,949][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:22:18,329][71000] Updated weights for policy 0, policy_version 104874 (0.0029) [2024-06-12 23:22:20,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 1718403072. Throughput: 0: 49620.5. Samples: 1247168200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:22:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:22:21,089][71000] Updated weights for policy 0, policy_version 104884 (0.0029) [2024-06-12 23:22:24,781][70980] Signal inference workers to stop experience collection... (18300 times) [2024-06-12 23:22:24,818][71000] InferenceWorker_p0-w0: stopping experience collection (18300 times) [2024-06-12 23:22:24,842][70980] Signal inference workers to resume experience collection... (18300 times) [2024-06-12 23:22:24,842][71000] InferenceWorker_p0-w0: resuming experience collection (18300 times) [2024-06-12 23:22:24,975][71000] Updated weights for policy 0, policy_version 104894 (0.0035) [2024-06-12 23:22:25,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1718632448. Throughput: 0: 49383.1. Samples: 1247461820. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:22:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:22:27,407][71000] Updated weights for policy 0, policy_version 104904 (0.0025) [2024-06-12 23:22:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1718878208. Throughput: 0: 49363.0. Samples: 1247759060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:22:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:22:31,672][71000] Updated weights for policy 0, policy_version 104914 (0.0025) [2024-06-12 23:22:34,414][71000] Updated weights for policy 0, policy_version 104924 (0.0033) [2024-06-12 23:22:35,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 1719156736. Throughput: 0: 49455.2. Samples: 1247908540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:22:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:22:38,161][71000] Updated weights for policy 0, policy_version 104934 (0.0031) [2024-06-12 23:22:40,876][71000] Updated weights for policy 0, policy_version 104944 (0.0033) [2024-06-12 23:22:40,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 1719402496. Throughput: 0: 49213.0. Samples: 1248205060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:22:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:22:44,615][71000] Updated weights for policy 0, policy_version 104954 (0.0024) [2024-06-12 23:22:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1719631872. Throughput: 0: 49352.2. Samples: 1248501300. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:22:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:22:47,429][71000] Updated weights for policy 0, policy_version 104964 (0.0028) [2024-06-12 23:22:50,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1719861248. Throughput: 0: 49513.4. Samples: 1248657500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:22:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:22:51,153][71000] Updated weights for policy 0, policy_version 104974 (0.0021) [2024-06-12 23:22:53,895][71000] Updated weights for policy 0, policy_version 104984 (0.0027) [2024-06-12 23:22:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1720139776. Throughput: 0: 49558.7. Samples: 1248953940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:22:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:22:57,778][71000] Updated weights for policy 0, policy_version 104994 (0.0032) [2024-06-12 23:23:00,543][71000] Updated weights for policy 0, policy_version 105004 (0.0037) [2024-06-12 23:23:00,940][70768] Fps is (10 sec: 52427.7, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1720385536. Throughput: 0: 49650.2. Samples: 1249248620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:23:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:23:04,523][71000] Updated weights for policy 0, policy_version 105014 (0.0033) [2024-06-12 23:23:05,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1720631296. Throughput: 0: 49583.8. Samples: 1249399480. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:23:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:23:07,254][71000] Updated weights for policy 0, policy_version 105024 (0.0028) [2024-06-12 23:23:10,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 1720860672. Throughput: 0: 49587.6. Samples: 1249693260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:23:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:23:11,047][71000] Updated weights for policy 0, policy_version 105034 (0.0032) [2024-06-12 23:23:13,818][71000] Updated weights for policy 0, policy_version 105044 (0.0022) [2024-06-12 23:23:15,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1721139200. Throughput: 0: 49644.9. Samples: 1249993080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:23:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:23:17,587][70980] Signal inference workers to stop experience collection... (18350 times) [2024-06-12 23:23:17,588][70980] Signal inference workers to resume experience collection... (18350 times) [2024-06-12 23:23:17,605][71000] InferenceWorker_p0-w0: stopping experience collection (18350 times) [2024-06-12 23:23:17,605][71000] InferenceWorker_p0-w0: resuming experience collection (18350 times) [2024-06-12 23:23:17,723][71000] Updated weights for policy 0, policy_version 105054 (0.0029) [2024-06-12 23:23:20,507][71000] Updated weights for policy 0, policy_version 105064 (0.0037) [2024-06-12 23:23:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 1721368576. Throughput: 0: 49639.9. Samples: 1250142340. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 23:23:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:23:24,156][71000] Updated weights for policy 0, policy_version 105074 (0.0031) [2024-06-12 23:23:25,940][70768] Fps is (10 sec: 47509.7, 60 sec: 49697.5, 300 sec: 49540.6). Total num frames: 1721614336. Throughput: 0: 49634.6. Samples: 1250438660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 23:23:25,941][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:23:27,373][71000] Updated weights for policy 0, policy_version 105084 (0.0026) [2024-06-12 23:23:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 49318.7). Total num frames: 1721827328. Throughput: 0: 49740.4. Samples: 1250739620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 23:23:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:23:31,210][71000] Updated weights for policy 0, policy_version 105094 (0.0025) [2024-06-12 23:23:33,939][71000] Updated weights for policy 0, policy_version 105104 (0.0026) [2024-06-12 23:23:35,940][70768] Fps is (10 sec: 50794.8, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1722122240. Throughput: 0: 49240.0. Samples: 1250873300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 23:23:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:23:37,622][71000] Updated weights for policy 0, policy_version 105114 (0.0023) [2024-06-12 23:23:40,407][71000] Updated weights for policy 0, policy_version 105124 (0.0028) [2024-06-12 23:23:40,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1722351616. Throughput: 0: 49483.9. Samples: 1251180720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 23:23:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:23:41,079][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000105126_1722384384.pth... 
[2024-06-12 23:23:41,130][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000104402_1710522368.pth [2024-06-12 23:23:44,396][71000] Updated weights for policy 0, policy_version 105134 (0.0027) [2024-06-12 23:23:45,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49698.0, 300 sec: 49596.3). Total num frames: 1722613760. Throughput: 0: 49588.0. Samples: 1251480080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 23:23:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:23:47,137][71000] Updated weights for policy 0, policy_version 105144 (0.0029) [2024-06-12 23:23:50,863][71000] Updated weights for policy 0, policy_version 105154 (0.0032) [2024-06-12 23:23:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 1722843136. Throughput: 0: 49337.8. Samples: 1251619680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 23:23:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:23:53,813][71000] Updated weights for policy 0, policy_version 105164 (0.0034) [2024-06-12 23:23:55,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1723105280. Throughput: 0: 49383.5. Samples: 1251915520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 23:23:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:23:57,644][71000] Updated weights for policy 0, policy_version 105174 (0.0026) [2024-06-12 23:24:00,300][71000] Updated weights for policy 0, policy_version 105184 (0.0024) [2024-06-12 23:24:00,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1723367424. Throughput: 0: 49416.3. Samples: 1252216820. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 23:24:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:24:04,070][71000] Updated weights for policy 0, policy_version 105194 (0.0033) [2024-06-12 23:24:05,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1723596800. Throughput: 0: 49534.6. Samples: 1252371400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 23:24:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:24:06,877][71000] Updated weights for policy 0, policy_version 105204 (0.0029) [2024-06-12 23:24:10,927][71000] Updated weights for policy 0, policy_version 105214 (0.0026) [2024-06-12 23:24:10,939][70768] Fps is (10 sec: 45876.4, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1723826176. Throughput: 0: 49386.8. Samples: 1252661020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 23:24:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:24:13,454][71000] Updated weights for policy 0, policy_version 105224 (0.0022) [2024-06-12 23:24:15,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1724088320. Throughput: 0: 49186.2. Samples: 1252953000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-12 23:24:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:24:17,617][71000] Updated weights for policy 0, policy_version 105234 (0.0036) [2024-06-12 23:24:20,519][71000] Updated weights for policy 0, policy_version 105244 (0.0028) [2024-06-12 23:24:20,939][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 1724350464. Throughput: 0: 49622.3. Samples: 1253106300. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 23:24:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:24:23,948][70980] Signal inference workers to stop experience collection... 
(18400 times) [2024-06-12 23:24:24,001][71000] InferenceWorker_p0-w0: stopping experience collection (18400 times) [2024-06-12 23:24:24,009][70980] Signal inference workers to resume experience collection... (18400 times) [2024-06-12 23:24:24,010][71000] InferenceWorker_p0-w0: resuming experience collection (18400 times) [2024-06-12 23:24:24,143][71000] Updated weights for policy 0, policy_version 105254 (0.0025) [2024-06-12 23:24:25,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.8, 300 sec: 49485.2). Total num frames: 1724579840. Throughput: 0: 49577.5. Samples: 1253411700. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 23:24:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:24:27,096][71000] Updated weights for policy 0, policy_version 105264 (0.0019) [2024-06-12 23:24:30,686][71000] Updated weights for policy 0, policy_version 105274 (0.0028) [2024-06-12 23:24:30,940][70768] Fps is (10 sec: 45874.2, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 1724809216. Throughput: 0: 49341.3. Samples: 1253700440. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 23:24:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:24:33,577][71000] Updated weights for policy 0, policy_version 105284 (0.0030) [2024-06-12 23:24:35,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 1725054976. Throughput: 0: 49496.5. Samples: 1253847020. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 23:24:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:24:37,341][71000] Updated weights for policy 0, policy_version 105294 (0.0033) [2024-06-12 23:24:40,214][71000] Updated weights for policy 0, policy_version 105304 (0.0025) [2024-06-12 23:24:40,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1725333504. Throughput: 0: 49570.5. Samples: 1254146200. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 23:24:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:24:44,069][71000] Updated weights for policy 0, policy_version 105314 (0.0033) [2024-06-12 23:24:45,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.2, 300 sec: 49429.7). Total num frames: 1725562880. Throughput: 0: 49290.1. Samples: 1254434860. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 23:24:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:24:46,950][71000] Updated weights for policy 0, policy_version 105324 (0.0031) [2024-06-12 23:24:50,476][71000] Updated weights for policy 0, policy_version 105334 (0.0033) [2024-06-12 23:24:50,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1725808640. Throughput: 0: 49224.0. Samples: 1254586480. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 23:24:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:24:53,479][71000] Updated weights for policy 0, policy_version 105344 (0.0029) [2024-06-12 23:24:55,940][70768] Fps is (10 sec: 47512.4, 60 sec: 48878.8, 300 sec: 49374.1). Total num frames: 1726038016. Throughput: 0: 49323.7. Samples: 1254880600. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 23:24:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:24:57,250][71000] Updated weights for policy 0, policy_version 105354 (0.0026) [2024-06-12 23:24:59,910][71000] Updated weights for policy 0, policy_version 105364 (0.0039) [2024-06-12 23:25:00,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1726332928. Throughput: 0: 49524.5. Samples: 1255181600. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 23:25:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:25:03,666][71000] Updated weights for policy 0, policy_version 105374 (0.0026) [2024-06-12 23:25:05,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1726562304. Throughput: 0: 49560.3. Samples: 1255336520. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 23:25:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:25:06,857][71000] Updated weights for policy 0, policy_version 105384 (0.0035) [2024-06-12 23:25:10,066][71000] Updated weights for policy 0, policy_version 105394 (0.0021) [2024-06-12 23:25:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49971.0, 300 sec: 49596.3). Total num frames: 1726824448. Throughput: 0: 49444.7. Samples: 1255636720. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 23:25:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:25:13,063][71000] Updated weights for policy 0, policy_version 105404 (0.0027) [2024-06-12 23:25:15,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1727053824. Throughput: 0: 49619.8. Samples: 1255933320. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-12 23:25:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:25:16,816][71000] Updated weights for policy 0, policy_version 105414 (0.0024) [2024-06-12 23:25:19,722][71000] Updated weights for policy 0, policy_version 105424 (0.0030) [2024-06-12 23:25:20,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1727315968. Throughput: 0: 49580.4. Samples: 1256078140. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 23:25:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:25:23,290][71000] Updated weights for policy 0, policy_version 105434 (0.0029) [2024-06-12 23:25:25,940][70768] Fps is (10 sec: 49150.7, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1727545344. Throughput: 0: 49565.3. Samples: 1256376640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 23:25:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:25:26,501][71000] Updated weights for policy 0, policy_version 105444 (0.0038) [2024-06-12 23:25:30,020][71000] Updated weights for policy 0, policy_version 105454 (0.0024) [2024-06-12 23:25:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1727807488. Throughput: 0: 49624.2. Samples: 1256667960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 23:25:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:25:33,180][71000] Updated weights for policy 0, policy_version 105464 (0.0029) [2024-06-12 23:25:35,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1728020480. Throughput: 0: 49379.1. Samples: 1256808540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 23:25:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:25:36,580][71000] Updated weights for policy 0, policy_version 105474 (0.0028) [2024-06-12 23:25:39,714][71000] Updated weights for policy 0, policy_version 105484 (0.0024) [2024-06-12 23:25:40,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.1, 300 sec: 49430.3). Total num frames: 1728282624. Throughput: 0: 49473.1. Samples: 1257106880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 23:25:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:25:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000105487_1728299008.pth... 
[2024-06-12 23:25:40,988][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000104762_1716420608.pth [2024-06-12 23:25:42,536][70980] Signal inference workers to stop experience collection... (18450 times) [2024-06-12 23:25:42,537][70980] Signal inference workers to resume experience collection... (18450 times) [2024-06-12 23:25:42,580][71000] InferenceWorker_p0-w0: stopping experience collection (18450 times) [2024-06-12 23:25:42,580][71000] InferenceWorker_p0-w0: resuming experience collection (18450 times) [2024-06-12 23:25:43,238][71000] Updated weights for policy 0, policy_version 105494 (0.0025) [2024-06-12 23:25:45,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49971.0, 300 sec: 49596.3). Total num frames: 1728561152. Throughput: 0: 49555.8. Samples: 1257411620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 23:25:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:25:46,252][71000] Updated weights for policy 0, policy_version 105504 (0.0024) [2024-06-12 23:25:49,621][71000] Updated weights for policy 0, policy_version 105514 (0.0026) [2024-06-12 23:25:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.2, 300 sec: 49485.3). Total num frames: 1728774144. Throughput: 0: 49534.3. Samples: 1257565560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 23:25:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:25:52,764][71000] Updated weights for policy 0, policy_version 105524 (0.0028) [2024-06-12 23:25:55,940][70768] Fps is (10 sec: 49152.7, 60 sec: 50244.4, 300 sec: 49596.3). Total num frames: 1729052672. Throughput: 0: 49386.8. Samples: 1257859120. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 23:25:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:25:56,023][71000] Updated weights for policy 0, policy_version 105534 (0.0032) [2024-06-12 23:25:59,485][71000] Updated weights for policy 0, policy_version 105544 (0.0027) [2024-06-12 23:26:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1729282048. Throughput: 0: 49537.3. Samples: 1258162500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 23:26:00,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 23:26:02,703][71000] Updated weights for policy 0, policy_version 105554 (0.0029) [2024-06-12 23:26:05,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49698.0, 300 sec: 49596.3). Total num frames: 1729544192. Throughput: 0: 49577.1. Samples: 1258309120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 23:26:05,944][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:26:06,115][71000] Updated weights for policy 0, policy_version 105564 (0.0031) [2024-06-12 23:26:09,429][71000] Updated weights for policy 0, policy_version 105574 (0.0029) [2024-06-12 23:26:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49152.0, 300 sec: 49540.7). Total num frames: 1729773568. Throughput: 0: 49544.5. Samples: 1258606140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 23:26:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 23:26:12,758][71000] Updated weights for policy 0, policy_version 105584 (0.0026) [2024-06-12 23:26:15,773][71000] Updated weights for policy 0, policy_version 105594 (0.0034) [2024-06-12 23:26:15,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49971.1, 300 sec: 49651.8). Total num frames: 1730052096. Throughput: 0: 49630.8. Samples: 1258901340. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-12 23:26:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:26:19,289][71000] Updated weights for policy 0, policy_version 105604 (0.0037) [2024-06-12 23:26:20,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 1730265088. Throughput: 0: 49765.0. Samples: 1259047960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-06-12 23:26:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:26:22,740][71000] Updated weights for policy 0, policy_version 105614 (0.0032) [2024-06-12 23:26:25,775][71000] Updated weights for policy 0, policy_version 105624 (0.0041) [2024-06-12 23:26:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49971.4, 300 sec: 49596.3). Total num frames: 1730543616. Throughput: 0: 49880.9. Samples: 1259351520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-06-12 23:26:25,947][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:26:29,618][71000] Updated weights for policy 0, policy_version 105634 (0.0023) [2024-06-12 23:26:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1730756608. Throughput: 0: 49537.0. Samples: 1259640780. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-06-12 23:26:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:26:32,314][71000] Updated weights for policy 0, policy_version 105644 (0.0025) [2024-06-12 23:26:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49971.3, 300 sec: 49485.2). Total num frames: 1731018752. Throughput: 0: 49398.2. Samples: 1259788480. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-06-12 23:26:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:26:36,065][71000] Updated weights for policy 0, policy_version 105654 (0.0028) [2024-06-12 23:26:39,336][71000] Updated weights for policy 0, policy_version 105664 (0.0031) [2024-06-12 23:26:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1731248128. Throughput: 0: 49335.1. Samples: 1260079200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-06-12 23:26:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:26:42,826][71000] Updated weights for policy 0, policy_version 105674 (0.0026) [2024-06-12 23:26:45,614][70980] Signal inference workers to stop experience collection... (18500 times) [2024-06-12 23:26:45,664][70980] Signal inference workers to resume experience collection... (18500 times) [2024-06-12 23:26:45,668][71000] InferenceWorker_p0-w0: stopping experience collection (18500 times) [2024-06-12 23:26:45,687][71000] InferenceWorker_p0-w0: resuming experience collection (18500 times) [2024-06-12 23:26:45,798][71000] Updated weights for policy 0, policy_version 105684 (0.0029) [2024-06-12 23:26:45,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1731526656. Throughput: 0: 49210.7. Samples: 1260376980. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-06-12 23:26:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:26:49,523][71000] Updated weights for policy 0, policy_version 105694 (0.0027) [2024-06-12 23:26:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 1731739648. Throughput: 0: 49282.4. Samples: 1260526820. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-06-12 23:26:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:26:52,503][71000] Updated weights for policy 0, policy_version 105704 (0.0027) [2024-06-12 23:26:55,902][71000] Updated weights for policy 0, policy_version 105714 (0.0023) [2024-06-12 23:26:55,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1732018176. Throughput: 0: 49347.5. Samples: 1260826780. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-06-12 23:26:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:26:59,199][71000] Updated weights for policy 0, policy_version 105724 (0.0036) [2024-06-12 23:27:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1732231168. Throughput: 0: 49403.6. Samples: 1261124500. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-06-12 23:27:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:27:02,265][71000] Updated weights for policy 0, policy_version 105734 (0.0024) [2024-06-12 23:27:05,852][71000] Updated weights for policy 0, policy_version 105744 (0.0031) [2024-06-12 23:27:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1732509696. Throughput: 0: 49483.4. Samples: 1261274720. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-06-12 23:27:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:27:09,737][71000] Updated weights for policy 0, policy_version 105754 (0.0033) [2024-06-12 23:27:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 1732722688. Throughput: 0: 49132.2. Samples: 1261562480. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-06-12 23:27:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:27:12,554][71000] Updated weights for policy 0, policy_version 105764 (0.0022) [2024-06-12 23:27:15,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48605.7, 300 sec: 49374.1). Total num frames: 1732968448. Throughput: 0: 49249.1. Samples: 1261857000. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-06-12 23:27:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:27:16,305][71000] Updated weights for policy 0, policy_version 105774 (0.0035) [2024-06-12 23:27:19,329][71000] Updated weights for policy 0, policy_version 105784 (0.0035) [2024-06-12 23:27:20,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1733230592. Throughput: 0: 49066.5. Samples: 1261996480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:27:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:27:22,723][71000] Updated weights for policy 0, policy_version 105794 (0.0032) [2024-06-12 23:27:25,876][71000] Updated weights for policy 0, policy_version 105804 (0.0035) [2024-06-12 23:27:25,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49151.9, 300 sec: 49540.8). Total num frames: 1733492736. Throughput: 0: 49395.0. Samples: 1262301980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:27:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:27:29,493][71000] Updated weights for policy 0, policy_version 105814 (0.0032) [2024-06-12 23:27:30,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1733705728. Throughput: 0: 49376.8. Samples: 1262598940. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:27:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:27:32,577][71000] Updated weights for policy 0, policy_version 105824 (0.0030) [2024-06-12 23:27:35,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 1733951488. Throughput: 0: 49073.0. Samples: 1262735100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:27:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:27:36,399][71000] Updated weights for policy 0, policy_version 105834 (0.0025) [2024-06-12 23:27:39,261][71000] Updated weights for policy 0, policy_version 105844 (0.0036) [2024-06-12 23:27:40,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1734213632. Throughput: 0: 49023.8. Samples: 1263032840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:27:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:27:41,009][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000105849_1734230016.pth... [2024-06-12 23:27:41,055][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000105126_1722384384.pth [2024-06-12 23:27:42,803][71000] Updated weights for policy 0, policy_version 105854 (0.0026) [2024-06-12 23:27:45,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 1734459392. Throughput: 0: 49017.4. Samples: 1263330280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:27:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:27:46,097][71000] Updated weights for policy 0, policy_version 105864 (0.0024) [2024-06-12 23:27:47,158][70980] Signal inference workers to stop experience collection... (18550 times) [2024-06-12 23:27:47,200][71000] InferenceWorker_p0-w0: stopping experience collection (18550 times) [2024-06-12 23:27:47,215][70980] Signal inference workers to resume experience collection... 
(18550 times) [2024-06-12 23:27:47,220][71000] InferenceWorker_p0-w0: resuming experience collection (18550 times) [2024-06-12 23:27:49,534][71000] Updated weights for policy 0, policy_version 105874 (0.0026) [2024-06-12 23:27:50,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1734688768. Throughput: 0: 48964.1. Samples: 1263478100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:27:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:27:52,655][71000] Updated weights for policy 0, policy_version 105884 (0.0033) [2024-06-12 23:27:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.1, 300 sec: 49374.2). Total num frames: 1734950912. Throughput: 0: 49054.9. Samples: 1263769940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:27:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:27:55,987][71000] Updated weights for policy 0, policy_version 105894 (0.0019) [2024-06-12 23:27:59,364][71000] Updated weights for policy 0, policy_version 105904 (0.0029) [2024-06-12 23:28:00,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1735196672. Throughput: 0: 49078.5. Samples: 1264065520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:28:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:28:02,657][71000] Updated weights for policy 0, policy_version 105914 (0.0034) [2024-06-12 23:28:05,779][71000] Updated weights for policy 0, policy_version 105924 (0.0029) [2024-06-12 23:28:05,944][70768] Fps is (10 sec: 50769.9, 60 sec: 49148.8, 300 sec: 49484.6). Total num frames: 1735458816. Throughput: 0: 49502.4. Samples: 1264224280. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:28:05,944][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:28:09,480][71000] Updated weights for policy 0, policy_version 105934 (0.0029) [2024-06-12 23:28:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 1735671808. Throughput: 0: 49238.8. Samples: 1264517720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:28:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:28:12,395][71000] Updated weights for policy 0, policy_version 105944 (0.0022) [2024-06-12 23:28:15,944][70768] Fps is (10 sec: 47512.9, 60 sec: 49421.8, 300 sec: 49373.5). Total num frames: 1735933952. Throughput: 0: 49125.2. Samples: 1264809780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-12 23:28:15,944][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 23:28:15,993][71000] Updated weights for policy 0, policy_version 105954 (0.0028) [2024-06-12 23:28:19,341][71000] Updated weights for policy 0, policy_version 105964 (0.0036) [2024-06-12 23:28:20,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.2, 300 sec: 49429.8). Total num frames: 1736196096. Throughput: 0: 49326.6. Samples: 1264954800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 23:28:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:28:22,791][71000] Updated weights for policy 0, policy_version 105974 (0.0035) [2024-06-12 23:28:25,940][70768] Fps is (10 sec: 49171.9, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 1736425472. Throughput: 0: 49299.8. Samples: 1265251340. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 23:28:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:28:25,967][71000] Updated weights for policy 0, policy_version 105984 (0.0027) [2024-06-12 23:28:29,577][71000] Updated weights for policy 0, policy_version 105994 (0.0030) [2024-06-12 23:28:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1736654848. Throughput: 0: 49387.6. Samples: 1265552720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 23:28:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:28:32,501][71000] Updated weights for policy 0, policy_version 106004 (0.0028) [2024-06-12 23:28:35,887][71000] Updated weights for policy 0, policy_version 106014 (0.0034) [2024-06-12 23:28:35,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1736933376. Throughput: 0: 49229.9. Samples: 1265693440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 23:28:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:28:38,963][71000] Updated weights for policy 0, policy_version 106024 (0.0032) [2024-06-12 23:28:40,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1737195520. Throughput: 0: 49234.7. Samples: 1265985500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 23:28:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:28:42,583][71000] Updated weights for policy 0, policy_version 106034 (0.0030) [2024-06-12 23:28:45,704][71000] Updated weights for policy 0, policy_version 106044 (0.0024) [2024-06-12 23:28:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1737424896. Throughput: 0: 49331.5. Samples: 1266285440. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 23:28:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 23:28:49,334][71000] Updated weights for policy 0, policy_version 106054 (0.0029) [2024-06-12 23:28:50,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1737654272. Throughput: 0: 49278.2. Samples: 1266441600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 23:28:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:28:52,382][71000] Updated weights for policy 0, policy_version 106064 (0.0025) [2024-06-12 23:28:55,881][71000] Updated weights for policy 0, policy_version 106074 (0.0021) [2024-06-12 23:28:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1737916416. Throughput: 0: 49225.3. Samples: 1266732860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 23:28:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:28:59,072][71000] Updated weights for policy 0, policy_version 106084 (0.0031) [2024-06-12 23:29:00,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49424.9, 300 sec: 49374.2). Total num frames: 1738162176. Throughput: 0: 49305.3. Samples: 1267028320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 23:29:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:29:02,515][71000] Updated weights for policy 0, policy_version 106094 (0.0023) [2024-06-12 23:29:05,416][71000] Updated weights for policy 0, policy_version 106104 (0.0024) [2024-06-12 23:29:05,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49428.3, 300 sec: 49485.2). Total num frames: 1738424320. Throughput: 0: 49637.7. Samples: 1267188500. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 23:29:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:29:09,141][71000] Updated weights for policy 0, policy_version 106114 (0.0041) [2024-06-12 23:29:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1738637312. Throughput: 0: 49348.0. Samples: 1267472000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 23:29:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:29:11,262][70980] Signal inference workers to stop experience collection... (18600 times) [2024-06-12 23:29:11,310][71000] InferenceWorker_p0-w0: stopping experience collection (18600 times) [2024-06-12 23:29:11,318][70980] Signal inference workers to resume experience collection... (18600 times) [2024-06-12 23:29:11,319][71000] InferenceWorker_p0-w0: resuming experience collection (18600 times) [2024-06-12 23:29:12,275][71000] Updated weights for policy 0, policy_version 106124 (0.0032) [2024-06-12 23:29:15,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49155.4, 300 sec: 49263.1). Total num frames: 1738883072. Throughput: 0: 49333.8. Samples: 1267772740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-12 23:29:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:29:16,028][71000] Updated weights for policy 0, policy_version 106134 (0.0033) [2024-06-12 23:29:18,841][71000] Updated weights for policy 0, policy_version 106144 (0.0036) [2024-06-12 23:29:20,939][70768] Fps is (10 sec: 52430.0, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1739161600. Throughput: 0: 49624.9. Samples: 1267926560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:29:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:29:22,382][71000] Updated weights for policy 0, policy_version 106154 (0.0024) [2024-06-12 23:29:25,409][71000] Updated weights for policy 0, policy_version 106164 (0.0026) [2024-06-12 23:29:25,942][70768] Fps is (10 sec: 52415.4, 60 sec: 49696.1, 300 sec: 49484.8). Total num frames: 1739407360. Throughput: 0: 49712.3. Samples: 1268222680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:29:25,943][70768] Avg episode reward: [(0, '0.263')] [2024-06-12 23:29:29,199][71000] Updated weights for policy 0, policy_version 106174 (0.0033) [2024-06-12 23:29:30,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1739636736. Throughput: 0: 49624.0. Samples: 1268518520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:29:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:29:31,999][71000] Updated weights for policy 0, policy_version 106184 (0.0033) [2024-06-12 23:29:35,538][71000] Updated weights for policy 0, policy_version 106194 (0.0028) [2024-06-12 23:29:35,940][70768] Fps is (10 sec: 47525.9, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1739882496. Throughput: 0: 49364.5. Samples: 1268663000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:29:35,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:29:38,646][71000] Updated weights for policy 0, policy_version 106204 (0.0030) [2024-06-12 23:29:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1740144640. Throughput: 0: 49698.2. Samples: 1268969280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:29:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:29:40,965][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000106211_1740161024.pth... 
[2024-06-12 23:29:41,005][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000105487_1728299008.pth [2024-06-12 23:29:41,859][71000] Updated weights for policy 0, policy_version 106214 (0.0027) [2024-06-12 23:29:45,159][71000] Updated weights for policy 0, policy_version 106224 (0.0021) [2024-06-12 23:29:45,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1740406784. Throughput: 0: 49828.4. Samples: 1269270600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:29:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:29:48,373][71000] Updated weights for policy 0, policy_version 106234 (0.0027) [2024-06-12 23:29:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1740652544. Throughput: 0: 49543.7. Samples: 1269417960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:29:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:29:51,891][71000] Updated weights for policy 0, policy_version 106244 (0.0025) [2024-06-12 23:29:55,634][71000] Updated weights for policy 0, policy_version 106254 (0.0029) [2024-06-12 23:29:55,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1740881920. Throughput: 0: 49775.2. Samples: 1269711880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:29:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:29:58,535][71000] Updated weights for policy 0, policy_version 106264 (0.0026) [2024-06-12 23:30:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.3, 300 sec: 49429.7). Total num frames: 1741144064. Throughput: 0: 49740.9. Samples: 1270011080. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:30:00,948][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:30:01,766][71000] Updated weights for policy 0, policy_version 106274 (0.0026) [2024-06-12 23:30:05,231][71000] Updated weights for policy 0, policy_version 106284 (0.0029) [2024-06-12 23:30:05,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1741389824. Throughput: 0: 49650.1. Samples: 1270160820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:30:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:30:08,282][71000] Updated weights for policy 0, policy_version 106294 (0.0026) [2024-06-12 23:30:10,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49971.3, 300 sec: 49429.7). Total num frames: 1741635584. Throughput: 0: 49741.5. Samples: 1270460920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:30:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:30:11,747][71000] Updated weights for policy 0, policy_version 106304 (0.0036) [2024-06-12 23:30:15,090][71000] Updated weights for policy 0, policy_version 106314 (0.0031) [2024-06-12 23:30:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 1741864960. Throughput: 0: 49489.7. Samples: 1270745560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:30:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:30:18,633][71000] Updated weights for policy 0, policy_version 106324 (0.0023) [2024-06-12 23:30:20,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1742127104. Throughput: 0: 49695.1. Samples: 1270899280. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:30:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:30:21,725][71000] Updated weights for policy 0, policy_version 106334 (0.0027) [2024-06-12 23:30:25,107][71000] Updated weights for policy 0, policy_version 106344 (0.0033) [2024-06-12 23:30:25,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49154.0, 300 sec: 49318.6). Total num frames: 1742356480. Throughput: 0: 49622.5. Samples: 1271202300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 23:30:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:30:28,138][71000] Updated weights for policy 0, policy_version 106354 (0.0029) [2024-06-12 23:30:30,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 1742618624. Throughput: 0: 49468.2. Samples: 1271496660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 23:30:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:30:31,515][71000] Updated weights for policy 0, policy_version 106364 (0.0028) [2024-06-12 23:30:32,490][70980] Signal inference workers to stop experience collection... (18650 times) [2024-06-12 23:30:32,530][71000] InferenceWorker_p0-w0: stopping experience collection (18650 times) [2024-06-12 23:30:32,595][70980] Signal inference workers to resume experience collection... (18650 times) [2024-06-12 23:30:32,595][71000] InferenceWorker_p0-w0: resuming experience collection (18650 times) [2024-06-12 23:30:34,525][71000] Updated weights for policy 0, policy_version 106374 (0.0027) [2024-06-12 23:30:35,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1742880768. Throughput: 0: 49718.2. Samples: 1271655280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 23:30:35,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 23:30:38,030][71000] Updated weights for policy 0, policy_version 106384 (0.0033) [2024-06-12 23:30:40,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 1743126528. Throughput: 0: 49820.4. Samples: 1271953800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 23:30:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:30:41,361][71000] Updated weights for policy 0, policy_version 106394 (0.0028) [2024-06-12 23:30:44,799][71000] Updated weights for policy 0, policy_version 106404 (0.0026) [2024-06-12 23:30:45,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1743372288. Throughput: 0: 49682.4. Samples: 1272246800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 23:30:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:30:48,065][71000] Updated weights for policy 0, policy_version 106414 (0.0024) [2024-06-12 23:30:50,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 1743618048. Throughput: 0: 49536.8. Samples: 1272389980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 23:30:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:30:51,512][71000] Updated weights for policy 0, policy_version 106424 (0.0030) [2024-06-12 23:30:54,593][71000] Updated weights for policy 0, policy_version 106434 (0.0031) [2024-06-12 23:30:55,939][70768] Fps is (10 sec: 50791.5, 60 sec: 49971.3, 300 sec: 49485.2). Total num frames: 1743880192. Throughput: 0: 49481.9. Samples: 1272687600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 23:30:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:30:58,219][71000] Updated weights for policy 0, policy_version 106444 (0.0028) [2024-06-12 23:31:00,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1744109568. Throughput: 0: 49692.1. Samples: 1272981700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 23:31:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:31:01,138][71000] Updated weights for policy 0, policy_version 106454 (0.0033) [2024-06-12 23:31:04,831][71000] Updated weights for policy 0, policy_version 106464 (0.0027) [2024-06-12 23:31:05,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 1744322560. Throughput: 0: 49520.4. Samples: 1273127700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 23:31:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:31:08,033][71000] Updated weights for policy 0, policy_version 106474 (0.0029) [2024-06-12 23:31:10,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1744601088. Throughput: 0: 49294.7. Samples: 1273420560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 23:31:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:31:11,380][71000] Updated weights for policy 0, policy_version 106484 (0.0021) [2024-06-12 23:31:14,496][71000] Updated weights for policy 0, policy_version 106494 (0.0022) [2024-06-12 23:31:15,940][70768] Fps is (10 sec: 55705.1, 60 sec: 50244.2, 300 sec: 49540.8). Total num frames: 1744879616. Throughput: 0: 49367.0. Samples: 1273718180. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 23:31:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:31:17,970][71000] Updated weights for policy 0, policy_version 106504 (0.0026) [2024-06-12 23:31:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1745092608. Throughput: 0: 49173.7. Samples: 1273868100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-12 23:31:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:31:21,225][71000] Updated weights for policy 0, policy_version 106514 (0.0023) [2024-06-12 23:31:24,541][71000] Updated weights for policy 0, policy_version 106524 (0.0025) [2024-06-12 23:31:25,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49698.3, 300 sec: 49429.7). Total num frames: 1745338368. Throughput: 0: 49172.5. Samples: 1274166560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 23:31:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:31:27,777][71000] Updated weights for policy 0, policy_version 106534 (0.0029) [2024-06-12 23:31:30,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1745584128. Throughput: 0: 49396.8. Samples: 1274469640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 23:31:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:31:31,173][71000] Updated weights for policy 0, policy_version 106544 (0.0028) [2024-06-12 23:31:34,325][71000] Updated weights for policy 0, policy_version 106554 (0.0028) [2024-06-12 23:31:35,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1745862656. Throughput: 0: 49500.9. Samples: 1274617520. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 23:31:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:31:37,931][71000] Updated weights for policy 0, policy_version 106564 (0.0025) [2024-06-12 23:31:40,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1746092032. Throughput: 0: 49455.1. Samples: 1274913080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 23:31:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:31:41,087][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000106574_1746108416.pth... [2024-06-12 23:31:41,095][71000] Updated weights for policy 0, policy_version 106574 (0.0033) [2024-06-12 23:31:41,127][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000105849_1734230016.pth [2024-06-12 23:31:44,391][71000] Updated weights for policy 0, policy_version 106584 (0.0036) [2024-06-12 23:31:45,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49152.2, 300 sec: 49429.7). Total num frames: 1746321408. Throughput: 0: 49513.3. Samples: 1275209800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 23:31:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:31:47,263][70980] Signal inference workers to stop experience collection... (18700 times) [2024-06-12 23:31:47,263][70980] Signal inference workers to resume experience collection... (18700 times) [2024-06-12 23:31:47,277][71000] InferenceWorker_p0-w0: stopping experience collection (18700 times) [2024-06-12 23:31:47,277][71000] InferenceWorker_p0-w0: resuming experience collection (18700 times) [2024-06-12 23:31:47,704][71000] Updated weights for policy 0, policy_version 106594 (0.0028) [2024-06-12 23:31:50,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1746583552. Throughput: 0: 49387.5. Samples: 1275350140. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 23:31:50,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:31:50,999][71000] Updated weights for policy 0, policy_version 106604 (0.0032) [2024-06-12 23:31:54,413][71000] Updated weights for policy 0, policy_version 106614 (0.0028) [2024-06-12 23:31:55,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1746829312. Throughput: 0: 49301.0. Samples: 1275639100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 23:31:55,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 23:31:57,894][71000] Updated weights for policy 0, policy_version 106624 (0.0026) [2024-06-12 23:32:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 1747075072. Throughput: 0: 49333.4. Samples: 1275938180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 23:32:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:32:00,969][71000] Updated weights for policy 0, policy_version 106634 (0.0028) [2024-06-12 23:32:04,724][71000] Updated weights for policy 0, policy_version 106644 (0.0034) [2024-06-12 23:32:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1747304448. Throughput: 0: 49326.2. Samples: 1276087780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 23:32:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:32:07,942][71000] Updated weights for policy 0, policy_version 106654 (0.0031) [2024-06-12 23:32:10,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1747550208. Throughput: 0: 49033.8. Samples: 1276373080. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 23:32:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 23:32:11,265][71000] Updated weights for policy 0, policy_version 106664 (0.0041) [2024-06-12 23:32:14,765][71000] Updated weights for policy 0, policy_version 106674 (0.0032) [2024-06-12 23:32:15,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 1747812352. Throughput: 0: 48648.6. Samples: 1276658840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 23:32:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-12 23:32:18,089][71000] Updated weights for policy 0, policy_version 106684 (0.0040) [2024-06-12 23:32:20,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 1748058112. Throughput: 0: 48840.7. Samples: 1276815360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-12 23:32:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:32:21,074][71000] Updated weights for policy 0, policy_version 106694 (0.0029) [2024-06-12 23:32:24,765][71000] Updated weights for policy 0, policy_version 106704 (0.0026) [2024-06-12 23:32:25,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1748303872. Throughput: 0: 49135.8. Samples: 1277124200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 23:32:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:32:27,718][71000] Updated weights for policy 0, policy_version 106714 (0.0028) [2024-06-12 23:32:30,941][70768] Fps is (10 sec: 49145.3, 60 sec: 49423.7, 300 sec: 49485.0). Total num frames: 1748549632. Throughput: 0: 49056.5. Samples: 1277417420. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 23:32:30,942][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:32:31,226][71000] Updated weights for policy 0, policy_version 106724 (0.0025) [2024-06-12 23:32:34,492][71000] Updated weights for policy 0, policy_version 106734 (0.0034) [2024-06-12 23:32:35,939][70768] Fps is (10 sec: 49152.9, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 1748795392. Throughput: 0: 49167.2. Samples: 1277562660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 23:32:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 23:32:37,741][71000] Updated weights for policy 0, policy_version 106744 (0.0027) [2024-06-12 23:32:40,939][70768] Fps is (10 sec: 49159.7, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1749041152. Throughput: 0: 49202.7. Samples: 1277853220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 23:32:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:32:41,105][71000] Updated weights for policy 0, policy_version 106754 (0.0026) [2024-06-12 23:32:44,544][71000] Updated weights for policy 0, policy_version 106764 (0.0029) [2024-06-12 23:32:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1749270528. Throughput: 0: 49246.7. Samples: 1278154280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 23:32:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 23:32:47,785][71000] Updated weights for policy 0, policy_version 106774 (0.0027) [2024-06-12 23:32:50,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1749532672. Throughput: 0: 49173.9. Samples: 1278300600. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 23:32:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:32:51,043][71000] Updated weights for policy 0, policy_version 106784 (0.0030) [2024-06-12 23:32:54,463][71000] Updated weights for policy 0, policy_version 106794 (0.0028) [2024-06-12 23:32:55,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1749778432. Throughput: 0: 49483.5. Samples: 1278599840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 23:32:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:32:57,509][71000] Updated weights for policy 0, policy_version 106804 (0.0026) [2024-06-12 23:33:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49374.8). Total num frames: 1750024192. Throughput: 0: 49804.2. Samples: 1278900020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 23:33:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:33:01,026][71000] Updated weights for policy 0, policy_version 106814 (0.0030) [2024-06-12 23:33:03,985][71000] Updated weights for policy 0, policy_version 106824 (0.0024) [2024-06-12 23:33:05,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49698.0, 300 sec: 49540.7). Total num frames: 1750286336. Throughput: 0: 49597.8. Samples: 1279047260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 23:33:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:33:06,794][70980] Signal inference workers to stop experience collection... (18750 times) [2024-06-12 23:33:06,795][70980] Signal inference workers to resume experience collection... 
(18750 times) [2024-06-12 23:33:06,840][71000] InferenceWorker_p0-w0: stopping experience collection (18750 times) [2024-06-12 23:33:06,840][71000] InferenceWorker_p0-w0: resuming experience collection (18750 times) [2024-06-12 23:33:07,403][71000] Updated weights for policy 0, policy_version 106834 (0.0024) [2024-06-12 23:33:10,742][71000] Updated weights for policy 0, policy_version 106844 (0.0037) [2024-06-12 23:33:10,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.1, 300 sec: 49485.9). Total num frames: 1750532096. Throughput: 0: 49459.7. Samples: 1279349880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 23:33:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:33:14,547][71000] Updated weights for policy 0, policy_version 106854 (0.0032) [2024-06-12 23:33:15,939][70768] Fps is (10 sec: 49153.3, 60 sec: 49425.3, 300 sec: 49429.7). Total num frames: 1750777856. Throughput: 0: 49500.5. Samples: 1279644860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 23:33:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:33:17,279][71000] Updated weights for policy 0, policy_version 106864 (0.0035) [2024-06-12 23:33:20,923][71000] Updated weights for policy 0, policy_version 106874 (0.0025) [2024-06-12 23:33:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.2, 300 sec: 49485.3). Total num frames: 1751023616. Throughput: 0: 49654.6. Samples: 1279797120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-12 23:33:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:33:23,867][71000] Updated weights for policy 0, policy_version 106884 (0.0027) [2024-06-12 23:33:25,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1751269376. Throughput: 0: 49678.6. Samples: 1280088760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:33:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:33:27,431][71000] Updated weights for policy 0, policy_version 106894 (0.0027) [2024-06-12 23:33:30,566][71000] Updated weights for policy 0, policy_version 106904 (0.0029) [2024-06-12 23:33:30,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49699.3, 300 sec: 49485.2). Total num frames: 1751531520. Throughput: 0: 49668.8. Samples: 1280389380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:33:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:33:33,996][71000] Updated weights for policy 0, policy_version 106914 (0.0030) [2024-06-12 23:33:35,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 1751777280. Throughput: 0: 49762.9. Samples: 1280539940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:33:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:33:37,144][71000] Updated weights for policy 0, policy_version 106924 (0.0030) [2024-06-12 23:33:40,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 1751990272. Throughput: 0: 49637.3. Samples: 1280833520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:33:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:33:40,954][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000106933_1751990272.pth... [2024-06-12 23:33:41,026][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000106211_1740161024.pth [2024-06-12 23:33:41,185][71000] Updated weights for policy 0, policy_version 106934 (0.0036) [2024-06-12 23:33:43,667][71000] Updated weights for policy 0, policy_version 106944 (0.0027) [2024-06-12 23:33:45,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1752268800. Throughput: 0: 49711.2. Samples: 1281137020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:33:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:33:47,544][71000] Updated weights for policy 0, policy_version 106954 (0.0032) [2024-06-12 23:33:50,417][71000] Updated weights for policy 0, policy_version 106964 (0.0026) [2024-06-12 23:33:50,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1752514560. Throughput: 0: 49762.7. Samples: 1281286580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:33:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:33:53,891][71000] Updated weights for policy 0, policy_version 106974 (0.0026) [2024-06-12 23:33:55,940][70768] Fps is (10 sec: 52428.6, 60 sec: 50244.3, 300 sec: 49596.3). Total num frames: 1752793088. Throughput: 0: 49766.3. Samples: 1281589360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:33:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:33:56,751][71000] Updated weights for policy 0, policy_version 106984 (0.0026) [2024-06-12 23:34:00,528][71000] Updated weights for policy 0, policy_version 106994 (0.0031) [2024-06-12 23:34:00,944][70768] Fps is (10 sec: 49131.5, 60 sec: 49694.6, 300 sec: 49429.0). Total num frames: 1753006080. Throughput: 0: 49862.7. Samples: 1281888900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:34:00,944][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:34:03,547][71000] Updated weights for policy 0, policy_version 107004 (0.0029) [2024-06-12 23:34:05,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1753268224. Throughput: 0: 49706.6. Samples: 1282033920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:34:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:34:07,270][71000] Updated weights for policy 0, policy_version 107014 (0.0039) [2024-06-12 23:34:10,143][71000] Updated weights for policy 0, policy_version 107024 (0.0028) [2024-06-12 23:34:10,940][70768] Fps is (10 sec: 52451.4, 60 sec: 49971.2, 300 sec: 49651.9). Total num frames: 1753530368. Throughput: 0: 49742.6. Samples: 1282327180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:34:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 23:34:13,874][71000] Updated weights for policy 0, policy_version 107034 (0.0023) [2024-06-12 23:34:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49697.9, 300 sec: 49485.2). Total num frames: 1753759744. Throughput: 0: 49692.8. Samples: 1282625560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:34:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:34:16,779][71000] Updated weights for policy 0, policy_version 107044 (0.0038) [2024-06-12 23:34:20,479][71000] Updated weights for policy 0, policy_version 107054 (0.0032) [2024-06-12 23:34:20,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49425.0, 300 sec: 49430.1). Total num frames: 1753989120. Throughput: 0: 49684.5. Samples: 1282775740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-12 23:34:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:34:23,470][71000] Updated weights for policy 0, policy_version 107064 (0.0027) [2024-06-12 23:34:25,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1754251264. Throughput: 0: 49604.5. Samples: 1283065720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 23:34:25,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 23:34:26,999][71000] Updated weights for policy 0, policy_version 107074 (0.0034) [2024-06-12 23:34:28,194][70980] Signal inference workers to stop experience collection... (18800 times) [2024-06-12 23:34:28,240][71000] InferenceWorker_p0-w0: stopping experience collection (18800 times) [2024-06-12 23:34:28,242][70980] Signal inference workers to resume experience collection... (18800 times) [2024-06-12 23:34:28,248][71000] InferenceWorker_p0-w0: resuming experience collection (18800 times) [2024-06-12 23:34:29,902][71000] Updated weights for policy 0, policy_version 107084 (0.0028) [2024-06-12 23:34:30,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1754513408. Throughput: 0: 49515.0. Samples: 1283365200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 23:34:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:34:33,729][71000] Updated weights for policy 0, policy_version 107094 (0.0031) [2024-06-12 23:34:35,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1754742784. Throughput: 0: 49619.7. Samples: 1283519460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 23:34:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:34:36,825][71000] Updated weights for policy 0, policy_version 107104 (0.0025) [2024-06-12 23:34:40,336][71000] Updated weights for policy 0, policy_version 107114 (0.0038) [2024-06-12 23:34:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 50244.2, 300 sec: 49485.2). Total num frames: 1755004928. Throughput: 0: 49423.0. Samples: 1283813400. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 23:34:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:34:43,220][71000] Updated weights for policy 0, policy_version 107124 (0.0026) [2024-06-12 23:34:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1755234304. Throughput: 0: 49293.1. Samples: 1284106880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 23:34:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:34:47,016][71000] Updated weights for policy 0, policy_version 107134 (0.0032) [2024-06-12 23:34:49,888][71000] Updated weights for policy 0, policy_version 107144 (0.0036) [2024-06-12 23:34:50,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49698.3, 300 sec: 49540.8). Total num frames: 1755496448. Throughput: 0: 49313.9. Samples: 1284253040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 23:34:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:34:53,868][71000] Updated weights for policy 0, policy_version 107154 (0.0027) [2024-06-12 23:34:55,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.9, 300 sec: 49374.2). Total num frames: 1755709440. Throughput: 0: 49413.4. Samples: 1284550780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 23:34:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 23:34:56,872][71000] Updated weights for policy 0, policy_version 107164 (0.0022) [2024-06-12 23:35:00,415][71000] Updated weights for policy 0, policy_version 107174 (0.0024) [2024-06-12 23:35:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49428.6, 300 sec: 49429.7). Total num frames: 1755971584. Throughput: 0: 49515.7. Samples: 1284853760. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 23:35:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:35:03,570][71000] Updated weights for policy 0, policy_version 107184 (0.0034) [2024-06-12 23:35:05,940][70768] Fps is (10 sec: 49150.9, 60 sec: 48878.8, 300 sec: 49374.1). Total num frames: 1756200960. Throughput: 0: 49262.5. Samples: 1284992560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 23:35:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 23:35:06,939][71000] Updated weights for policy 0, policy_version 107194 (0.0031) [2024-06-12 23:35:10,158][71000] Updated weights for policy 0, policy_version 107204 (0.0032) [2024-06-12 23:35:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1756479488. Throughput: 0: 49381.3. Samples: 1285287880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 23:35:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:35:13,364][71000] Updated weights for policy 0, policy_version 107214 (0.0029) [2024-06-12 23:35:15,940][70768] Fps is (10 sec: 49152.9, 60 sec: 48879.1, 300 sec: 49374.2). Total num frames: 1756692480. Throughput: 0: 49228.9. Samples: 1285580500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 23:35:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:35:16,801][71000] Updated weights for policy 0, policy_version 107224 (0.0028) [2024-06-12 23:35:20,095][71000] Updated weights for policy 0, policy_version 107234 (0.0026) [2024-06-12 23:35:20,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49698.0, 300 sec: 49540.8). Total num frames: 1756971008. Throughput: 0: 49050.9. Samples: 1285726760. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 23:35:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:35:23,102][71000] Updated weights for policy 0, policy_version 107244 (0.0032) [2024-06-12 23:35:25,940][70768] Fps is (10 sec: 52427.7, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1757216768. Throughput: 0: 49379.4. Samples: 1286035480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-12 23:35:25,944][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:35:26,327][71000] Updated weights for policy 0, policy_version 107254 (0.0034) [2024-06-12 23:35:29,690][71000] Updated weights for policy 0, policy_version 107264 (0.0036) [2024-06-12 23:35:30,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1757462528. Throughput: 0: 49472.0. Samples: 1286333120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:35:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:35:33,066][71000] Updated weights for policy 0, policy_version 107274 (0.0029) [2024-06-12 23:35:35,940][70768] Fps is (10 sec: 49153.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1757708288. Throughput: 0: 49516.4. Samples: 1286481280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:35:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:35:36,451][71000] Updated weights for policy 0, policy_version 107284 (0.0029) [2024-06-12 23:35:39,734][71000] Updated weights for policy 0, policy_version 107294 (0.0025) [2024-06-12 23:35:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1757954048. Throughput: 0: 49522.2. Samples: 1286779280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:35:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:35:40,945][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000107297_1757954048.pth... 
[2024-06-12 23:35:40,987][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000106574_1746108416.pth [2024-06-12 23:35:42,318][70980] Signal inference workers to stop experience collection... (18850 times) [2024-06-12 23:35:42,365][71000] InferenceWorker_p0-w0: stopping experience collection (18850 times) [2024-06-12 23:35:42,432][70980] Signal inference workers to resume experience collection... (18850 times) [2024-06-12 23:35:42,432][71000] InferenceWorker_p0-w0: resuming experience collection (18850 times) [2024-06-12 23:35:43,030][71000] Updated weights for policy 0, policy_version 107304 (0.0031) [2024-06-12 23:35:45,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1758216192. Throughput: 0: 49439.3. Samples: 1287078540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:35:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:35:46,108][71000] Updated weights for policy 0, policy_version 107314 (0.0028) [2024-06-12 23:35:49,700][71000] Updated weights for policy 0, policy_version 107324 (0.0033) [2024-06-12 23:35:50,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1758478336. Throughput: 0: 49696.3. Samples: 1287228880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:35:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:35:52,759][71000] Updated weights for policy 0, policy_version 107334 (0.0034) [2024-06-12 23:35:55,939][70768] Fps is (10 sec: 47514.8, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1758691328. Throughput: 0: 49664.1. Samples: 1287522760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:35:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:35:56,383][71000] Updated weights for policy 0, policy_version 107344 (0.0025) [2024-06-12 23:35:59,565][71000] Updated weights for policy 0, policy_version 107354 (0.0029) [2024-06-12 23:36:00,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1758953472. Throughput: 0: 49790.2. Samples: 1287821060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:36:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 23:36:03,091][71000] Updated weights for policy 0, policy_version 107364 (0.0021) [2024-06-12 23:36:05,939][70768] Fps is (10 sec: 50790.3, 60 sec: 49971.4, 300 sec: 49485.3). Total num frames: 1759199232. Throughput: 0: 49771.8. Samples: 1287966480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:36:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:36:06,018][71000] Updated weights for policy 0, policy_version 107374 (0.0031) [2024-06-12 23:36:09,491][71000] Updated weights for policy 0, policy_version 107384 (0.0021) [2024-06-12 23:36:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1759461376. Throughput: 0: 49719.3. Samples: 1288272840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:36:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-12 23:36:10,954][70980] Saving new best policy, reward=0.290! [2024-06-12 23:36:12,473][71000] Updated weights for policy 0, policy_version 107394 (0.0026) [2024-06-12 23:36:15,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1759674368. Throughput: 0: 49648.5. Samples: 1288567300. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:36:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:36:16,171][71000] Updated weights for policy 0, policy_version 107404 (0.0031) [2024-06-12 23:36:19,271][71000] Updated weights for policy 0, policy_version 107414 (0.0038) [2024-06-12 23:36:20,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49698.1, 300 sec: 49540.7). Total num frames: 1759952896. Throughput: 0: 49496.7. Samples: 1288708640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:36:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:36:23,063][71000] Updated weights for policy 0, policy_version 107424 (0.0031) [2024-06-12 23:36:25,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1760182272. Throughput: 0: 49282.1. Samples: 1288996980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:36:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:36:26,173][71000] Updated weights for policy 0, policy_version 107434 (0.0039) [2024-06-12 23:36:29,427][71000] Updated weights for policy 0, policy_version 107444 (0.0016) [2024-06-12 23:36:30,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1760444416. Throughput: 0: 49425.6. Samples: 1289302680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 23:36:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:36:32,737][71000] Updated weights for policy 0, policy_version 107454 (0.0029) [2024-06-12 23:36:35,861][71000] Updated weights for policy 0, policy_version 107464 (0.0031) [2024-06-12 23:36:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1760690176. Throughput: 0: 49535.5. Samples: 1289457980. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 23:36:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:36:39,159][71000] Updated weights for policy 0, policy_version 107474 (0.0034) [2024-06-12 23:36:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1760935936. Throughput: 0: 49726.1. Samples: 1289760440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 23:36:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:36:42,447][71000] Updated weights for policy 0, policy_version 107484 (0.0028) [2024-06-12 23:36:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.2, 300 sec: 49429.7). Total num frames: 1761165312. Throughput: 0: 49508.0. Samples: 1290048920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 23:36:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:36:45,949][71000] Updated weights for policy 0, policy_version 107494 (0.0037) [2024-06-12 23:36:48,963][71000] Updated weights for policy 0, policy_version 107504 (0.0029) [2024-06-12 23:36:49,525][70980] Signal inference workers to stop experience collection... (18900 times) [2024-06-12 23:36:49,525][70980] Signal inference workers to resume experience collection... (18900 times) [2024-06-12 23:36:49,540][71000] InferenceWorker_p0-w0: stopping experience collection (18900 times) [2024-06-12 23:36:49,541][71000] InferenceWorker_p0-w0: resuming experience collection (18900 times) [2024-06-12 23:36:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.8, 300 sec: 49485.2). Total num frames: 1761427456. Throughput: 0: 49712.3. Samples: 1290203540. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 23:36:50,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:36:52,683][71000] Updated weights for policy 0, policy_version 107514 (0.0025) [2024-06-12 23:36:55,612][71000] Updated weights for policy 0, policy_version 107524 (0.0019) [2024-06-12 23:36:55,942][70768] Fps is (10 sec: 50779.5, 60 sec: 49696.3, 300 sec: 49484.9). Total num frames: 1761673216. Throughput: 0: 49474.6. Samples: 1290499300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 23:36:55,942][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:36:59,406][71000] Updated weights for policy 0, policy_version 107534 (0.0031) [2024-06-12 23:37:00,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1761918976. Throughput: 0: 49403.1. Samples: 1290790440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 23:37:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:37:02,309][71000] Updated weights for policy 0, policy_version 107544 (0.0022) [2024-06-12 23:37:05,736][71000] Updated weights for policy 0, policy_version 107554 (0.0025) [2024-06-12 23:37:05,940][70768] Fps is (10 sec: 49162.8, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1762164736. Throughput: 0: 49635.8. Samples: 1290942240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 23:37:05,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 23:37:08,862][71000] Updated weights for policy 0, policy_version 107564 (0.0029) [2024-06-12 23:37:10,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1762426880. Throughput: 0: 49809.8. Samples: 1291238420. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 23:37:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:37:12,164][71000] Updated weights for policy 0, policy_version 107574 (0.0036) [2024-06-12 23:37:15,397][71000] Updated weights for policy 0, policy_version 107584 (0.0027) [2024-06-12 23:37:15,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 1762656256. Throughput: 0: 49470.2. Samples: 1291528840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 23:37:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:37:19,046][71000] Updated weights for policy 0, policy_version 107594 (0.0025) [2024-06-12 23:37:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 1762902016. Throughput: 0: 49502.1. Samples: 1291685580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 23:37:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:37:21,799][71000] Updated weights for policy 0, policy_version 107604 (0.0027) [2024-06-12 23:37:25,811][71000] Updated weights for policy 0, policy_version 107614 (0.0043) [2024-06-12 23:37:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.1, 300 sec: 49485.5). Total num frames: 1763147776. Throughput: 0: 49376.0. Samples: 1291982360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-12 23:37:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:37:28,627][71000] Updated weights for policy 0, policy_version 107624 (0.0036) [2024-06-12 23:37:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1763409920. Throughput: 0: 49542.2. Samples: 1292278320. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 23:37:30,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 23:37:32,134][71000] Updated weights for policy 0, policy_version 107634 (0.0026) [2024-06-12 23:37:35,296][71000] Updated weights for policy 0, policy_version 107644 (0.0027) [2024-06-12 23:37:35,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49425.0, 300 sec: 49540.7). Total num frames: 1763655680. Throughput: 0: 49423.1. Samples: 1292427580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 23:37:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:37:39,080][71000] Updated weights for policy 0, policy_version 107654 (0.0027) [2024-06-12 23:37:40,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 1763917824. Throughput: 0: 49581.5. Samples: 1292730360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 23:37:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:37:41,071][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000107662_1763934208.pth... [2024-06-12 23:37:41,111][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000106933_1751990272.pth [2024-06-12 23:37:41,733][71000] Updated weights for policy 0, policy_version 107664 (0.0030) [2024-06-12 23:37:45,400][71000] Updated weights for policy 0, policy_version 107674 (0.0027) [2024-06-12 23:37:45,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1764147200. Throughput: 0: 49760.0. Samples: 1293029640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 23:37:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:37:48,379][71000] Updated weights for policy 0, policy_version 107684 (0.0027) [2024-06-12 23:37:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1764392960. Throughput: 0: 49586.6. Samples: 1293173640. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 23:37:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-12 23:37:51,991][71000] Updated weights for policy 0, policy_version 107694 (0.0023) [2024-06-12 23:37:54,839][71000] Updated weights for policy 0, policy_version 107704 (0.0026) [2024-06-12 23:37:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49699.9, 300 sec: 49596.3). Total num frames: 1764655104. Throughput: 0: 49697.8. Samples: 1293474820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 23:37:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:37:58,495][71000] Updated weights for policy 0, policy_version 107714 (0.0022) [2024-06-12 23:38:00,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1764900864. Throughput: 0: 49795.6. Samples: 1293769640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 23:38:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:38:01,669][71000] Updated weights for policy 0, policy_version 107724 (0.0027) [2024-06-12 23:38:02,632][70980] Signal inference workers to stop experience collection... (18950 times) [2024-06-12 23:38:02,681][71000] InferenceWorker_p0-w0: stopping experience collection (18950 times) [2024-06-12 23:38:02,738][70980] Signal inference workers to resume experience collection... (18950 times) [2024-06-12 23:38:02,738][71000] InferenceWorker_p0-w0: resuming experience collection (18950 times) [2024-06-12 23:38:05,426][71000] Updated weights for policy 0, policy_version 107734 (0.0027) [2024-06-12 23:38:05,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1765113856. Throughput: 0: 49466.3. Samples: 1293911560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 23:38:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:38:08,412][71000] Updated weights for policy 0, policy_version 107744 (0.0033) [2024-06-12 23:38:10,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 1765376000. Throughput: 0: 49512.1. Samples: 1294210400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 23:38:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:38:12,151][71000] Updated weights for policy 0, policy_version 107754 (0.0026) [2024-06-12 23:38:15,013][71000] Updated weights for policy 0, policy_version 107764 (0.0030) [2024-06-12 23:38:15,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1765638144. Throughput: 0: 49388.0. Samples: 1294500780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 23:38:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:38:18,592][71000] Updated weights for policy 0, policy_version 107774 (0.0026) [2024-06-12 23:38:20,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1765900288. Throughput: 0: 49496.1. Samples: 1294654900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 23:38:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:38:21,371][71000] Updated weights for policy 0, policy_version 107784 (0.0029) [2024-06-12 23:38:25,225][71000] Updated weights for policy 0, policy_version 107794 (0.0020) [2024-06-12 23:38:25,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1766113280. Throughput: 0: 49494.3. Samples: 1294957600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-12 23:38:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:38:28,099][71000] Updated weights for policy 0, policy_version 107804 (0.0023) [2024-06-12 23:38:30,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1766375424. Throughput: 0: 49249.3. Samples: 1295245860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:38:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:38:31,925][71000] Updated weights for policy 0, policy_version 107814 (0.0032) [2024-06-12 23:38:34,587][71000] Updated weights for policy 0, policy_version 107824 (0.0025) [2024-06-12 23:38:35,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.2, 300 sec: 49651.9). Total num frames: 1766637568. Throughput: 0: 49402.3. Samples: 1295396740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:38:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:38:38,974][71000] Updated weights for policy 0, policy_version 107834 (0.0026) [2024-06-12 23:38:40,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1766899712. Throughput: 0: 49574.6. Samples: 1295705680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:38:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:38:41,031][71000] Updated weights for policy 0, policy_version 107844 (0.0032) [2024-06-12 23:38:45,265][71000] Updated weights for policy 0, policy_version 107854 (0.0027) [2024-06-12 23:38:45,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1767112704. Throughput: 0: 49740.3. Samples: 1296007960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:38:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:38:47,930][71000] Updated weights for policy 0, policy_version 107864 (0.0030) [2024-06-12 23:38:50,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1767374848. Throughput: 0: 49566.7. Samples: 1296142060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:38:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-12 23:38:51,797][71000] Updated weights for policy 0, policy_version 107874 (0.0028) [2024-06-12 23:38:54,333][71000] Updated weights for policy 0, policy_version 107884 (0.0033) [2024-06-12 23:38:55,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49971.0, 300 sec: 49652.5). Total num frames: 1767653376. Throughput: 0: 49573.5. Samples: 1296441220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:38:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:38:58,435][71000] Updated weights for policy 0, policy_version 107894 (0.0023) [2024-06-12 23:39:00,755][71000] Updated weights for policy 0, policy_version 107904 (0.0025) [2024-06-12 23:39:00,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1767899136. Throughput: 0: 49913.8. Samples: 1296746900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:39:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:39:04,804][71000] Updated weights for policy 0, policy_version 107914 (0.0033) [2024-06-12 23:39:05,939][70768] Fps is (10 sec: 44237.8, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 1768095744. Throughput: 0: 49786.7. Samples: 1296895300. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:39:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:39:07,472][71000] Updated weights for policy 0, policy_version 107924 (0.0026) [2024-06-12 23:39:10,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49698.1, 300 sec: 49485.3). Total num frames: 1768357888. Throughput: 0: 49601.3. Samples: 1297189660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:39:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:39:11,300][71000] Updated weights for policy 0, policy_version 107934 (0.0029) [2024-06-12 23:39:12,833][70980] Signal inference workers to stop experience collection... (19000 times) [2024-06-12 23:39:12,862][71000] InferenceWorker_p0-w0: stopping experience collection (19000 times) [2024-06-12 23:39:12,887][70980] Signal inference workers to resume experience collection... (19000 times) [2024-06-12 23:39:12,889][71000] InferenceWorker_p0-w0: resuming experience collection (19000 times) [2024-06-12 23:39:14,560][71000] Updated weights for policy 0, policy_version 107944 (0.0023) [2024-06-12 23:39:15,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1768620032. Throughput: 0: 49824.9. Samples: 1297487980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:39:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:39:17,965][71000] Updated weights for policy 0, policy_version 107954 (0.0031) [2024-06-12 23:39:20,921][71000] Updated weights for policy 0, policy_version 107964 (0.0022) [2024-06-12 23:39:20,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1768882176. Throughput: 0: 50063.1. Samples: 1297649580. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:39:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:39:24,186][71000] Updated weights for policy 0, policy_version 107974 (0.0028) [2024-06-12 23:39:25,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 1769095168. Throughput: 0: 49766.1. Samples: 1297945160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-12 23:39:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:39:27,241][71000] Updated weights for policy 0, policy_version 107984 (0.0036) [2024-06-12 23:39:30,930][71000] Updated weights for policy 0, policy_version 107994 (0.0027) [2024-06-12 23:39:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49971.1, 300 sec: 49596.3). Total num frames: 1769373696. Throughput: 0: 49512.9. Samples: 1298236040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 23:39:30,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-12 23:39:33,934][71000] Updated weights for policy 0, policy_version 108004 (0.0025) [2024-06-12 23:39:35,940][70768] Fps is (10 sec: 52429.7, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1769619456. Throughput: 0: 50044.1. Samples: 1298394040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 23:39:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:39:37,238][71000] Updated weights for policy 0, policy_version 108014 (0.0025) [2024-06-12 23:39:40,565][71000] Updated weights for policy 0, policy_version 108024 (0.0030) [2024-06-12 23:39:40,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.1, 300 sec: 49651.9). Total num frames: 1769881600. Throughput: 0: 50112.2. Samples: 1298696260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 23:39:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:39:41,042][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000108026_1769897984.pth... 
[2024-06-12 23:39:41,086][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000107297_1757954048.pth [2024-06-12 23:39:43,648][71000] Updated weights for policy 0, policy_version 108034 (0.0021) [2024-06-12 23:39:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49971.3, 300 sec: 49540.8). Total num frames: 1770110976. Throughput: 0: 50055.1. Samples: 1298999380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 23:39:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:39:46,997][71000] Updated weights for policy 0, policy_version 108044 (0.0034) [2024-06-12 23:39:50,271][71000] Updated weights for policy 0, policy_version 108054 (0.0036) [2024-06-12 23:39:50,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 1770356736. Throughput: 0: 49819.5. Samples: 1299137180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 23:39:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:39:53,821][71000] Updated weights for policy 0, policy_version 108064 (0.0022) [2024-06-12 23:39:55,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.2, 300 sec: 49651.9). Total num frames: 1770618880. Throughput: 0: 49956.0. Samples: 1299437680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 23:39:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:39:57,034][71000] Updated weights for policy 0, policy_version 108074 (0.0026) [2024-06-12 23:40:00,450][71000] Updated weights for policy 0, policy_version 108084 (0.0032) [2024-06-12 23:40:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.0, 300 sec: 49707.4). Total num frames: 1770864640. Throughput: 0: 50030.2. Samples: 1299739340. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 23:40:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:40:03,431][71000] Updated weights for policy 0, policy_version 108094 (0.0037) [2024-06-12 23:40:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1771094016. Throughput: 0: 49571.2. Samples: 1299880280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 23:40:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:40:07,140][71000] Updated weights for policy 0, policy_version 108104 (0.0027) [2024-06-12 23:40:10,260][71000] Updated weights for policy 0, policy_version 108114 (0.0032) [2024-06-12 23:40:10,944][70768] Fps is (10 sec: 49131.0, 60 sec: 49967.6, 300 sec: 49706.7). Total num frames: 1771356160. Throughput: 0: 49530.1. Samples: 1300174220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 23:40:10,944][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:40:13,756][71000] Updated weights for policy 0, policy_version 108124 (0.0020) [2024-06-12 23:40:14,955][70980] Signal inference workers to stop experience collection... (19050 times) [2024-06-12 23:40:15,000][71000] InferenceWorker_p0-w0: stopping experience collection (19050 times) [2024-06-12 23:40:15,065][70980] Signal inference workers to resume experience collection... (19050 times) [2024-06-12 23:40:15,065][71000] InferenceWorker_p0-w0: resuming experience collection (19050 times) [2024-06-12 23:40:15,940][70768] Fps is (10 sec: 52424.1, 60 sec: 49970.5, 300 sec: 49651.7). Total num frames: 1771618304. Throughput: 0: 49596.9. Samples: 1300467940. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 23:40:15,941][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:40:16,871][71000] Updated weights for policy 0, policy_version 108134 (0.0029) [2024-06-12 23:40:20,389][71000] Updated weights for policy 0, policy_version 108144 (0.0029) [2024-06-12 23:40:20,940][70768] Fps is (10 sec: 49173.1, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1771847680. Throughput: 0: 49576.0. Samples: 1300624960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 23:40:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-12 23:40:23,273][71000] Updated weights for policy 0, policy_version 108154 (0.0029) [2024-06-12 23:40:25,940][70768] Fps is (10 sec: 45878.7, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1772077056. Throughput: 0: 49535.5. Samples: 1300925360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-12 23:40:25,949][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:40:26,743][71000] Updated weights for policy 0, policy_version 108164 (0.0030) [2024-06-12 23:40:30,111][71000] Updated weights for policy 0, policy_version 108174 (0.0026) [2024-06-12 23:40:30,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49971.2, 300 sec: 49707.4). Total num frames: 1772371968. Throughput: 0: 49490.1. Samples: 1301226440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 23:40:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:40:33,459][71000] Updated weights for policy 0, policy_version 108184 (0.0023) [2024-06-12 23:40:35,942][70768] Fps is (10 sec: 54053.1, 60 sec: 49968.9, 300 sec: 49706.9). Total num frames: 1772617728. Throughput: 0: 49782.3. Samples: 1301377520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 23:40:35,943][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:40:36,581][71000] Updated weights for policy 0, policy_version 108194 (0.0031) [2024-06-12 23:40:40,227][71000] Updated weights for policy 0, policy_version 108204 (0.0027) [2024-06-12 23:40:40,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1772847104. Throughput: 0: 49753.7. Samples: 1301676600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 23:40:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:40:43,137][71000] Updated weights for policy 0, policy_version 108214 (0.0026) [2024-06-12 23:40:45,939][70768] Fps is (10 sec: 44249.3, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1773060096. Throughput: 0: 49653.0. Samples: 1301973720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 23:40:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:40:46,741][71000] Updated weights for policy 0, policy_version 108224 (0.0027) [2024-06-12 23:40:49,792][71000] Updated weights for policy 0, policy_version 108234 (0.0025) [2024-06-12 23:40:50,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 1773355008. Throughput: 0: 49595.8. Samples: 1302112100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 23:40:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:40:53,387][71000] Updated weights for policy 0, policy_version 108244 (0.0033) [2024-06-12 23:40:55,940][70768] Fps is (10 sec: 54066.2, 60 sec: 49698.0, 300 sec: 49651.8). Total num frames: 1773600768. Throughput: 0: 49792.2. Samples: 1302414660. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 23:40:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-12 23:40:56,444][71000] Updated weights for policy 0, policy_version 108254 (0.0035) [2024-06-12 23:41:00,095][71000] Updated weights for policy 0, policy_version 108264 (0.0031) [2024-06-12 23:41:00,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1773813760. Throughput: 0: 49552.5. Samples: 1302697760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 23:41:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:41:03,250][71000] Updated weights for policy 0, policy_version 108274 (0.0026) [2024-06-12 23:41:05,940][70768] Fps is (10 sec: 44237.3, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1774043136. Throughput: 0: 49300.9. Samples: 1302843500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 23:41:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:41:06,986][71000] Updated weights for policy 0, policy_version 108284 (0.0032) [2024-06-12 23:41:09,783][71000] Updated weights for policy 0, policy_version 108294 (0.0023) [2024-06-12 23:41:10,939][70768] Fps is (10 sec: 54067.6, 60 sec: 49974.8, 300 sec: 49762.9). Total num frames: 1774354432. Throughput: 0: 49402.0. Samples: 1303148440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 23:41:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:41:13,652][71000] Updated weights for policy 0, policy_version 108304 (0.0034) [2024-06-12 23:41:15,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49425.8, 300 sec: 49596.3). Total num frames: 1774583808. Throughput: 0: 49136.5. Samples: 1303437580. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 23:41:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:41:16,397][71000] Updated weights for policy 0, policy_version 108314 (0.0027) [2024-06-12 23:41:20,226][71000] Updated weights for policy 0, policy_version 108324 (0.0025) [2024-06-12 23:41:20,940][70768] Fps is (10 sec: 44236.1, 60 sec: 49151.9, 300 sec: 49540.8). Total num frames: 1774796800. Throughput: 0: 48988.2. Samples: 1303581860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 23:41:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:41:21,785][70980] Signal inference workers to stop experience collection... (19100 times) [2024-06-12 23:41:21,789][70980] Signal inference workers to resume experience collection... (19100 times) [2024-06-12 23:41:21,806][71000] InferenceWorker_p0-w0: stopping experience collection (19100 times) [2024-06-12 23:41:21,806][71000] InferenceWorker_p0-w0: resuming experience collection (19100 times) [2024-06-12 23:41:23,063][71000] Updated weights for policy 0, policy_version 108334 (0.0027) [2024-06-12 23:41:25,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1775042560. Throughput: 0: 48824.9. Samples: 1303873720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 23:41:25,949][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:41:27,014][71000] Updated weights for policy 0, policy_version 108344 (0.0027) [2024-06-12 23:41:29,815][71000] Updated weights for policy 0, policy_version 108354 (0.0024) [2024-06-12 23:41:30,940][70768] Fps is (10 sec: 50791.1, 60 sec: 48879.0, 300 sec: 49540.8). Total num frames: 1775304704. Throughput: 0: 48695.0. Samples: 1304165000. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-12 23:41:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:41:33,755][71000] Updated weights for policy 0, policy_version 108364 (0.0028) [2024-06-12 23:41:35,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49154.1, 300 sec: 49596.3). Total num frames: 1775566848. Throughput: 0: 49263.1. Samples: 1304328940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 23:41:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:41:36,295][71000] Updated weights for policy 0, policy_version 108374 (0.0029) [2024-06-12 23:41:40,539][71000] Updated weights for policy 0, policy_version 108384 (0.0026) [2024-06-12 23:41:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 49540.8). Total num frames: 1775779840. Throughput: 0: 48940.0. Samples: 1304616960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 23:41:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:41:40,963][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000108386_1775796224.pth... [2024-06-12 23:41:41,027][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000107662_1763934208.pth [2024-06-12 23:41:43,087][71000] Updated weights for policy 0, policy_version 108394 (0.0026) [2024-06-12 23:41:45,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49697.9, 300 sec: 49540.8). Total num frames: 1776041984. Throughput: 0: 49093.2. Samples: 1304906960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 23:41:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:41:47,377][71000] Updated weights for policy 0, policy_version 108404 (0.0028) [2024-06-12 23:41:49,724][71000] Updated weights for policy 0, policy_version 108414 (0.0028) [2024-06-12 23:41:50,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49152.0, 300 sec: 49596.6). Total num frames: 1776304128. Throughput: 0: 49221.6. Samples: 1305058480. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 23:41:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:41:53,679][71000] Updated weights for policy 0, policy_version 108424 (0.0025) [2024-06-12 23:41:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 1776549888. Throughput: 0: 49153.2. Samples: 1305360340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 23:41:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:41:56,728][71000] Updated weights for policy 0, policy_version 108434 (0.0026) [2024-06-12 23:42:00,370][71000] Updated weights for policy 0, policy_version 108444 (0.0032) [2024-06-12 23:42:00,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1776762880. Throughput: 0: 49252.0. Samples: 1305653920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 23:42:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:42:03,667][71000] Updated weights for policy 0, policy_version 108454 (0.0028) [2024-06-12 23:42:05,939][70768] Fps is (10 sec: 47514.4, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1777025024. Throughput: 0: 49216.2. Samples: 1305796580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 23:42:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:42:07,133][71000] Updated weights for policy 0, policy_version 108464 (0.0030) [2024-06-12 23:42:10,278][71000] Updated weights for policy 0, policy_version 108474 (0.0028) [2024-06-12 23:42:10,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48878.9, 300 sec: 49596.3). Total num frames: 1777287168. Throughput: 0: 49187.1. Samples: 1306087140. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 23:42:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:42:14,110][71000] Updated weights for policy 0, policy_version 108484 (0.0025) [2024-06-12 23:42:15,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.9, 300 sec: 49540.8). Total num frames: 1777516544. Throughput: 0: 49368.3. Samples: 1306386580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 23:42:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:42:16,605][71000] Updated weights for policy 0, policy_version 108494 (0.0025) [2024-06-12 23:42:20,536][71000] Updated weights for policy 0, policy_version 108504 (0.0025) [2024-06-12 23:42:20,940][70768] Fps is (10 sec: 44236.2, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 1777729536. Throughput: 0: 48891.6. Samples: 1306529060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 23:42:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:42:23,642][71000] Updated weights for policy 0, policy_version 108514 (0.0019) [2024-06-12 23:42:25,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1778008064. Throughput: 0: 49150.3. Samples: 1306828720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 23:42:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:42:27,242][71000] Updated weights for policy 0, policy_version 108524 (0.0029) [2024-06-12 23:42:30,141][71000] Updated weights for policy 0, policy_version 108534 (0.0025) [2024-06-12 23:42:30,940][70768] Fps is (10 sec: 54068.0, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1778270208. Throughput: 0: 49297.9. Samples: 1307125360. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-12 23:42:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:42:33,948][71000] Updated weights for policy 0, policy_version 108544 (0.0027) [2024-06-12 23:42:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 1778515968. Throughput: 0: 49409.1. Samples: 1307281880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:42:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:42:36,411][71000] Updated weights for policy 0, policy_version 108554 (0.0024) [2024-06-12 23:42:40,381][71000] Updated weights for policy 0, policy_version 108564 (0.0027) [2024-06-12 23:42:40,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1778728960. Throughput: 0: 49283.6. Samples: 1307578100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:42:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:42:40,999][70980] Signal inference workers to stop experience collection... (19150 times) [2024-06-12 23:42:41,041][71000] InferenceWorker_p0-w0: stopping experience collection (19150 times) [2024-06-12 23:42:41,111][70980] Signal inference workers to resume experience collection... (19150 times) [2024-06-12 23:42:41,112][71000] InferenceWorker_p0-w0: resuming experience collection (19150 times) [2024-06-12 23:42:43,306][71000] Updated weights for policy 0, policy_version 108574 (0.0023) [2024-06-12 23:42:45,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1779007488. Throughput: 0: 49139.2. Samples: 1307865180. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:42:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:42:47,060][71000] Updated weights for policy 0, policy_version 108584 (0.0040) [2024-06-12 23:42:49,930][71000] Updated weights for policy 0, policy_version 108594 (0.0024) [2024-06-12 23:42:50,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1779253248. Throughput: 0: 49292.6. Samples: 1308014760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:42:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:42:53,715][71000] Updated weights for policy 0, policy_version 108604 (0.0029) [2024-06-12 23:42:55,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1779499008. Throughput: 0: 49650.1. Samples: 1308321400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:42:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-12 23:42:56,380][71000] Updated weights for policy 0, policy_version 108614 (0.0029) [2024-06-12 23:43:00,369][71000] Updated weights for policy 0, policy_version 108624 (0.0034) [2024-06-12 23:43:00,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1779728384. Throughput: 0: 49713.8. Samples: 1308623700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:43:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:43:02,804][71000] Updated weights for policy 0, policy_version 108634 (0.0028) [2024-06-12 23:43:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49424.9, 300 sec: 49540.7). Total num frames: 1779990528. Throughput: 0: 49495.1. Samples: 1308756340. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:43:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:43:06,747][71000] Updated weights for policy 0, policy_version 108644 (0.0018) [2024-06-12 23:43:09,518][71000] Updated weights for policy 0, policy_version 108654 (0.0025) [2024-06-12 23:43:10,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1780252672. Throughput: 0: 49683.9. Samples: 1309064500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:43:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:43:13,285][71000] Updated weights for policy 0, policy_version 108664 (0.0038) [2024-06-12 23:43:15,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1780498432. Throughput: 0: 49651.8. Samples: 1309359700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:43:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:43:16,254][71000] Updated weights for policy 0, policy_version 108674 (0.0020) [2024-06-12 23:43:20,052][71000] Updated weights for policy 0, policy_version 108684 (0.0024) [2024-06-12 23:43:20,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49971.3, 300 sec: 49540.8). Total num frames: 1780727808. Throughput: 0: 49410.6. Samples: 1309505360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:43:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:43:22,599][71000] Updated weights for policy 0, policy_version 108694 (0.0027) [2024-06-12 23:43:25,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1780973568. Throughput: 0: 49388.4. Samples: 1309800580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:43:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:43:26,551][71000] Updated weights for policy 0, policy_version 108704 (0.0023) [2024-06-12 23:43:29,370][71000] Updated weights for policy 0, policy_version 108714 (0.0026) [2024-06-12 23:43:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1781219328. Throughput: 0: 49442.5. Samples: 1310090100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-12 23:43:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:43:33,130][71000] Updated weights for policy 0, policy_version 108724 (0.0031) [2024-06-12 23:43:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1781465088. Throughput: 0: 49442.4. Samples: 1310239660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 23:43:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:43:36,215][71000] Updated weights for policy 0, policy_version 108734 (0.0031) [2024-06-12 23:43:39,963][71000] Updated weights for policy 0, policy_version 108744 (0.0029) [2024-06-12 23:43:40,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 1781710848. Throughput: 0: 49382.0. Samples: 1310543580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 23:43:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:43:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000108747_1781710848.pth... [2024-06-12 23:43:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000108026_1769897984.pth [2024-06-12 23:43:42,652][71000] Updated weights for policy 0, policy_version 108754 (0.0024) [2024-06-12 23:43:45,942][70768] Fps is (10 sec: 49139.7, 60 sec: 49149.9, 300 sec: 49429.3). Total num frames: 1781956608. Throughput: 0: 49437.7. Samples: 1310848520. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 23:43:45,943][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:43:46,251][71000] Updated weights for policy 0, policy_version 108764 (0.0027) [2024-06-12 23:43:49,183][71000] Updated weights for policy 0, policy_version 108774 (0.0030) [2024-06-12 23:43:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1782202368. Throughput: 0: 49670.3. Samples: 1310991500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 23:43:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:43:52,677][71000] Updated weights for policy 0, policy_version 108784 (0.0036) [2024-06-12 23:43:55,093][70980] Signal inference workers to stop experience collection... (19200 times) [2024-06-12 23:43:55,134][71000] InferenceWorker_p0-w0: stopping experience collection (19200 times) [2024-06-12 23:43:55,150][70980] Signal inference workers to resume experience collection... (19200 times) [2024-06-12 23:43:55,151][71000] InferenceWorker_p0-w0: resuming experience collection (19200 times) [2024-06-12 23:43:55,715][71000] Updated weights for policy 0, policy_version 108794 (0.0029) [2024-06-12 23:43:55,940][70768] Fps is (10 sec: 54081.1, 60 sec: 49971.3, 300 sec: 49485.2). Total num frames: 1782497280. Throughput: 0: 49609.4. Samples: 1311296920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 23:43:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:43:59,354][71000] Updated weights for policy 0, policy_version 108804 (0.0029) [2024-06-12 23:44:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 49540.7). Total num frames: 1782710272. Throughput: 0: 49515.2. Samples: 1311587880. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 23:44:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:44:02,327][71000] Updated weights for policy 0, policy_version 108814 (0.0023) [2024-06-12 23:44:05,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1782956032. Throughput: 0: 49492.5. Samples: 1311732520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 23:44:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:44:06,152][71000] Updated weights for policy 0, policy_version 108824 (0.0039) [2024-06-12 23:44:08,831][71000] Updated weights for policy 0, policy_version 108834 (0.0033) [2024-06-12 23:44:10,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1783218176. Throughput: 0: 49468.3. Samples: 1312026660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 23:44:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:44:12,528][71000] Updated weights for policy 0, policy_version 108844 (0.0028) [2024-06-12 23:44:15,439][71000] Updated weights for policy 0, policy_version 108854 (0.0029) [2024-06-12 23:44:15,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49698.3, 300 sec: 49485.2). Total num frames: 1783480320. Throughput: 0: 49652.7. Samples: 1312324460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 23:44:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:44:19,480][71000] Updated weights for policy 0, policy_version 108864 (0.0030) [2024-06-12 23:44:20,939][70768] Fps is (10 sec: 45876.3, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1783676928. Throughput: 0: 49746.8. Samples: 1312478260. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 23:44:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:44:22,303][71000] Updated weights for policy 0, policy_version 108874 (0.0033) [2024-06-12 23:44:25,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1783939072. Throughput: 0: 49363.1. Samples: 1312764920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 23:44:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:44:26,311][71000] Updated weights for policy 0, policy_version 108884 (0.0024) [2024-06-12 23:44:28,938][71000] Updated weights for policy 0, policy_version 108894 (0.0028) [2024-06-12 23:44:30,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49698.3, 300 sec: 49429.7). Total num frames: 1784201216. Throughput: 0: 49064.1. Samples: 1313056280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-12 23:44:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:44:32,817][71000] Updated weights for policy 0, policy_version 108904 (0.0024) [2024-06-12 23:44:35,703][71000] Updated weights for policy 0, policy_version 108914 (0.0024) [2024-06-12 23:44:35,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1784463360. Throughput: 0: 49288.5. Samples: 1313209480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 23:44:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:44:39,785][71000] Updated weights for policy 0, policy_version 108924 (0.0034) [2024-06-12 23:44:40,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1784659968. Throughput: 0: 48999.1. Samples: 1313501880. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 23:44:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:44:42,392][71000] Updated weights for policy 0, policy_version 108934 (0.0024) [2024-06-12 23:44:45,940][70768] Fps is (10 sec: 44236.9, 60 sec: 49154.1, 300 sec: 49318.6). Total num frames: 1784905728. Throughput: 0: 49033.0. Samples: 1313794360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 23:44:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:44:46,486][71000] Updated weights for policy 0, policy_version 108944 (0.0023) [2024-06-12 23:44:48,899][71000] Updated weights for policy 0, policy_version 108954 (0.0024) [2024-06-12 23:44:50,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1785167872. Throughput: 0: 49053.2. Samples: 1313939920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 23:44:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:44:52,979][71000] Updated weights for policy 0, policy_version 108964 (0.0023) [2024-06-12 23:44:55,474][71000] Updated weights for policy 0, policy_version 108974 (0.0024) [2024-06-12 23:44:55,939][70768] Fps is (10 sec: 52429.1, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 1785430016. Throughput: 0: 49178.1. Samples: 1314239660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 23:44:55,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 23:44:59,005][70980] Signal inference workers to stop experience collection... (19250 times) [2024-06-12 23:44:59,047][71000] InferenceWorker_p0-w0: stopping experience collection (19250 times) [2024-06-12 23:44:59,056][70980] Signal inference workers to resume experience collection... 
(19250 times) [2024-06-12 23:44:59,062][71000] InferenceWorker_p0-w0: resuming experience collection (19250 times) [2024-06-12 23:44:59,358][71000] Updated weights for policy 0, policy_version 108984 (0.0022) [2024-06-12 23:45:00,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 49374.1). Total num frames: 1785659392. Throughput: 0: 49305.3. Samples: 1314543200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 23:45:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:45:02,308][71000] Updated weights for policy 0, policy_version 108994 (0.0026) [2024-06-12 23:45:05,940][70768] Fps is (10 sec: 45874.2, 60 sec: 48878.8, 300 sec: 49263.8). Total num frames: 1785888768. Throughput: 0: 48816.2. Samples: 1314675000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 23:45:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:45:06,270][71000] Updated weights for policy 0, policy_version 109004 (0.0034) [2024-06-12 23:45:08,961][71000] Updated weights for policy 0, policy_version 109014 (0.0024) [2024-06-12 23:45:10,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48878.9, 300 sec: 49263.2). Total num frames: 1786150912. Throughput: 0: 48767.8. Samples: 1314959480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 23:45:10,941][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:45:13,158][71000] Updated weights for policy 0, policy_version 109024 (0.0027) [2024-06-12 23:45:15,939][70768] Fps is (10 sec: 50791.5, 60 sec: 48605.9, 300 sec: 49318.6). Total num frames: 1786396672. Throughput: 0: 48866.3. Samples: 1315255260. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 23:45:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:45:15,946][71000] Updated weights for policy 0, policy_version 109034 (0.0030) [2024-06-12 23:45:19,805][71000] Updated weights for policy 0, policy_version 109044 (0.0024) [2024-06-12 23:45:20,939][70768] Fps is (10 sec: 49153.3, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1786642432. Throughput: 0: 48850.3. Samples: 1315407740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 23:45:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:45:22,611][71000] Updated weights for policy 0, policy_version 109054 (0.0030) [2024-06-12 23:45:25,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 1786871808. Throughput: 0: 48912.5. Samples: 1315702940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 23:45:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:45:26,215][71000] Updated weights for policy 0, policy_version 109064 (0.0034) [2024-06-12 23:45:29,365][71000] Updated weights for policy 0, policy_version 109074 (0.0039) [2024-06-12 23:45:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.9, 300 sec: 49152.5). Total num frames: 1787117568. Throughput: 0: 48721.8. Samples: 1315986840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-12 23:45:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:45:33,256][71000] Updated weights for policy 0, policy_version 109084 (0.0040) [2024-06-12 23:45:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48332.8, 300 sec: 49207.5). Total num frames: 1787363328. Throughput: 0: 48623.7. Samples: 1316127980. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-12 23:45:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:45:36,146][71000] Updated weights for policy 0, policy_version 109094 (0.0029) [2024-06-12 23:45:40,052][71000] Updated weights for policy 0, policy_version 109104 (0.0027) [2024-06-12 23:45:40,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1787609088. Throughput: 0: 48752.4. Samples: 1316433520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-12 23:45:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:45:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000109107_1787609088.pth... [2024-06-12 23:45:40,995][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000108386_1775796224.pth [2024-06-12 23:45:42,706][71000] Updated weights for policy 0, policy_version 109114 (0.0031) [2024-06-12 23:45:45,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 1787854848. Throughput: 0: 48777.3. Samples: 1316738180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-12 23:45:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:45:46,482][71000] Updated weights for policy 0, policy_version 109124 (0.0034) [2024-06-12 23:45:49,720][71000] Updated weights for policy 0, policy_version 109134 (0.0028) [2024-06-12 23:45:50,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 1788116992. Throughput: 0: 48953.8. Samples: 1316877920. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-12 23:45:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:45:53,057][71000] Updated weights for policy 0, policy_version 109144 (0.0037) [2024-06-12 23:45:55,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48332.8, 300 sec: 49207.5). Total num frames: 1788329984. Throughput: 0: 49007.3. Samples: 1317164800. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-12 23:45:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:45:56,407][71000] Updated weights for policy 0, policy_version 109154 (0.0022) [2024-06-12 23:46:00,114][71000] Updated weights for policy 0, policy_version 109164 (0.0034) [2024-06-12 23:46:00,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 1788592128. Throughput: 0: 49011.0. Samples: 1317460760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-12 23:46:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:46:03,057][71000] Updated weights for policy 0, policy_version 109174 (0.0029) [2024-06-12 23:46:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 1788821504. Throughput: 0: 48840.4. Samples: 1317605560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-12 23:46:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:46:06,475][71000] Updated weights for policy 0, policy_version 109184 (0.0024) [2024-06-12 23:46:09,579][71000] Updated weights for policy 0, policy_version 109194 (0.0029) [2024-06-12 23:46:10,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 1789083648. Throughput: 0: 49081.2. Samples: 1317911600. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-12 23:46:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:46:12,937][70980] Signal inference workers to stop experience collection... (19300 times) [2024-06-12 23:46:12,938][70980] Signal inference workers to resume experience collection... 
(19300 times) [2024-06-12 23:46:12,957][71000] InferenceWorker_p0-w0: stopping experience collection (19300 times) [2024-06-12 23:46:12,957][71000] InferenceWorker_p0-w0: resuming experience collection (19300 times) [2024-06-12 23:46:13,091][71000] Updated weights for policy 0, policy_version 109204 (0.0032) [2024-06-12 23:46:15,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.7, 300 sec: 49207.5). Total num frames: 1789313024. Throughput: 0: 49251.0. Samples: 1318203140. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-12 23:46:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:46:16,406][71000] Updated weights for policy 0, policy_version 109214 (0.0028) [2024-06-12 23:46:19,759][71000] Updated weights for policy 0, policy_version 109224 (0.0019) [2024-06-12 23:46:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.8, 300 sec: 49263.1). Total num frames: 1789575168. Throughput: 0: 49418.1. Samples: 1318351800. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-12 23:46:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:46:22,889][71000] Updated weights for policy 0, policy_version 109234 (0.0031) [2024-06-12 23:46:25,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 1789820928. Throughput: 0: 49261.9. Samples: 1318650300. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-12 23:46:25,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-12 23:46:26,253][71000] Updated weights for policy 0, policy_version 109244 (0.0025) [2024-06-12 23:46:29,499][71000] Updated weights for policy 0, policy_version 109254 (0.0028) [2024-06-12 23:46:30,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 1790099456. Throughput: 0: 49248.9. Samples: 1318954380. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-12 23:46:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:46:32,732][71000] Updated weights for policy 0, policy_version 109264 (0.0033) [2024-06-12 23:46:35,857][71000] Updated weights for policy 0, policy_version 109274 (0.0027) [2024-06-12 23:46:35,940][70768] Fps is (10 sec: 52427.5, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 1790345216. Throughput: 0: 49342.2. Samples: 1319098320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-12 23:46:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:46:39,326][71000] Updated weights for policy 0, policy_version 109284 (0.0030) [2024-06-12 23:46:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 1790574592. Throughput: 0: 49532.8. Samples: 1319393780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:46:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:46:42,787][71000] Updated weights for policy 0, policy_version 109294 (0.0031) [2024-06-12 23:46:45,774][71000] Updated weights for policy 0, policy_version 109304 (0.0023) [2024-06-12 23:46:45,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 1790836736. Throughput: 0: 49551.7. Samples: 1319690580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:46:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:46:49,568][71000] Updated weights for policy 0, policy_version 109314 (0.0031) [2024-06-12 23:46:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 1791066112. Throughput: 0: 49774.6. Samples: 1319845420. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:46:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 23:46:52,362][71000] Updated weights for policy 0, policy_version 109324 (0.0036) [2024-06-12 23:46:55,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 1791311872. Throughput: 0: 49503.7. Samples: 1320139260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:46:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:46:56,026][71000] Updated weights for policy 0, policy_version 109334 (0.0030) [2024-06-12 23:46:58,906][71000] Updated weights for policy 0, policy_version 109344 (0.0030) [2024-06-12 23:47:00,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 1791574016. Throughput: 0: 49605.9. Samples: 1320435400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:47:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:47:02,783][71000] Updated weights for policy 0, policy_version 109354 (0.0024) [2024-06-12 23:47:05,555][71000] Updated weights for policy 0, policy_version 109364 (0.0024) [2024-06-12 23:47:05,939][70768] Fps is (10 sec: 52428.8, 60 sec: 50244.3, 300 sec: 49318.6). Total num frames: 1791836160. Throughput: 0: 49707.3. Samples: 1320588620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:47:05,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-12 23:47:09,471][71000] Updated weights for policy 0, policy_version 109374 (0.0031) [2024-06-12 23:47:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 1792065536. Throughput: 0: 49839.0. Samples: 1320893060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:47:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:47:11,982][71000] Updated weights for policy 0, policy_version 109384 (0.0033) [2024-06-12 23:47:13,015][70980] Signal inference workers to stop experience collection... 
(19350 times) [2024-06-12 23:47:13,055][71000] InferenceWorker_p0-w0: stopping experience collection (19350 times) [2024-06-12 23:47:13,064][70980] Signal inference workers to resume experience collection... (19350 times) [2024-06-12 23:47:13,069][71000] InferenceWorker_p0-w0: resuming experience collection (19350 times) [2024-06-12 23:47:15,939][70768] Fps is (10 sec: 44236.9, 60 sec: 49425.2, 300 sec: 49318.7). Total num frames: 1792278528. Throughput: 0: 49614.8. Samples: 1321187040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:47:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:47:16,178][71000] Updated weights for policy 0, policy_version 109394 (0.0028) [2024-06-12 23:47:18,546][71000] Updated weights for policy 0, policy_version 109404 (0.0026) [2024-06-12 23:47:20,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 1792557056. Throughput: 0: 49488.9. Samples: 1321325320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:47:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:47:22,789][71000] Updated weights for policy 0, policy_version 109414 (0.0032) [2024-06-12 23:47:25,202][71000] Updated weights for policy 0, policy_version 109424 (0.0033) [2024-06-12 23:47:25,940][70768] Fps is (10 sec: 54066.0, 60 sec: 49971.0, 300 sec: 49318.6). Total num frames: 1792819200. Throughput: 0: 49675.9. Samples: 1321629200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:47:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:47:29,480][71000] Updated weights for policy 0, policy_version 109434 (0.0030) [2024-06-12 23:47:30,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1793048576. Throughput: 0: 49632.3. Samples: 1321924040. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:47:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:47:31,752][71000] Updated weights for policy 0, policy_version 109444 (0.0026) [2024-06-12 23:47:35,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48879.1, 300 sec: 49318.6). Total num frames: 1793277952. Throughput: 0: 49350.2. Samples: 1322066180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-12 23:47:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:47:35,985][71000] Updated weights for policy 0, policy_version 109454 (0.0023) [2024-06-12 23:47:38,850][71000] Updated weights for policy 0, policy_version 109464 (0.0040) [2024-06-12 23:47:40,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 1793540096. Throughput: 0: 49464.9. Samples: 1322365180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 23:47:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:47:41,034][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000109470_1793556480.pth... [2024-06-12 23:47:41,076][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000108747_1781710848.pth [2024-06-12 23:47:42,734][71000] Updated weights for policy 0, policy_version 109474 (0.0034) [2024-06-12 23:47:45,181][71000] Updated weights for policy 0, policy_version 109484 (0.0038) [2024-06-12 23:47:45,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 1793802240. Throughput: 0: 49460.2. Samples: 1322661120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 23:47:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:47:49,245][71000] Updated weights for policy 0, policy_version 109494 (0.0032) [2024-06-12 23:47:50,939][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 1794031616. Throughput: 0: 49542.2. Samples: 1322818020. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 23:47:50,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 23:47:51,877][71000] Updated weights for policy 0, policy_version 109504 (0.0024) [2024-06-12 23:47:55,940][70768] Fps is (10 sec: 45876.1, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1794260992. Throughput: 0: 49399.6. Samples: 1323116040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 23:47:55,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 23:47:55,966][71000] Updated weights for policy 0, policy_version 109514 (0.0034) [2024-06-12 23:47:58,631][71000] Updated weights for policy 0, policy_version 109524 (0.0034) [2024-06-12 23:48:00,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1794539520. Throughput: 0: 49473.7. Samples: 1323413360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 23:48:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:48:02,437][71000] Updated weights for policy 0, policy_version 109534 (0.0027) [2024-06-12 23:48:04,840][71000] Updated weights for policy 0, policy_version 109544 (0.0022) [2024-06-12 23:48:05,940][70768] Fps is (10 sec: 55705.4, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 1794818048. Throughput: 0: 49913.9. Samples: 1323571440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 23:48:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:48:08,786][71000] Updated weights for policy 0, policy_version 109554 (0.0022) [2024-06-12 23:48:09,728][70980] Signal inference workers to stop experience collection... (19400 times) [2024-06-12 23:48:09,729][70980] Signal inference workers to resume experience collection... 
(19400 times) [2024-06-12 23:48:09,744][71000] InferenceWorker_p0-w0: stopping experience collection (19400 times) [2024-06-12 23:48:09,757][71000] InferenceWorker_p0-w0: resuming experience collection (19400 times) [2024-06-12 23:48:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 1795047424. Throughput: 0: 49781.5. Samples: 1323869360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 23:48:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:48:11,569][71000] Updated weights for policy 0, policy_version 109564 (0.0034) [2024-06-12 23:48:15,838][71000] Updated weights for policy 0, policy_version 109574 (0.0044) [2024-06-12 23:48:15,940][70768] Fps is (10 sec: 44236.4, 60 sec: 49698.0, 300 sec: 49263.1). Total num frames: 1795260416. Throughput: 0: 49635.0. Samples: 1324157620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 23:48:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:48:18,406][71000] Updated weights for policy 0, policy_version 109584 (0.0030) [2024-06-12 23:48:20,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1795522560. Throughput: 0: 49528.7. Samples: 1324294980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 23:48:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 23:48:22,346][71000] Updated weights for policy 0, policy_version 109594 (0.0025) [2024-06-12 23:48:25,101][71000] Updated weights for policy 0, policy_version 109604 (0.0027) [2024-06-12 23:48:25,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1795784704. Throughput: 0: 49613.7. Samples: 1324597800. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 23:48:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:48:28,860][71000] Updated weights for policy 0, policy_version 109614 (0.0026) [2024-06-12 23:48:30,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1796046848. Throughput: 0: 49788.6. Samples: 1324901600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 23:48:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:48:31,230][71000] Updated weights for policy 0, policy_version 109624 (0.0033) [2024-06-12 23:48:35,311][71000] Updated weights for policy 0, policy_version 109634 (0.0028) [2024-06-12 23:48:35,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 1796259840. Throughput: 0: 49634.1. Samples: 1325051560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-12 23:48:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:48:37,857][71000] Updated weights for policy 0, policy_version 109644 (0.0032) [2024-06-12 23:48:40,940][70768] Fps is (10 sec: 45874.5, 60 sec: 49424.9, 300 sec: 49319.0). Total num frames: 1796505600. Throughput: 0: 49532.2. Samples: 1325345000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 23:48:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:48:42,102][71000] Updated weights for policy 0, policy_version 109654 (0.0026) [2024-06-12 23:48:44,776][71000] Updated weights for policy 0, policy_version 109664 (0.0028) [2024-06-12 23:48:45,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1796784128. Throughput: 0: 49596.8. Samples: 1325645220. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 23:48:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:48:48,517][71000] Updated weights for policy 0, policy_version 109674 (0.0036) [2024-06-12 23:48:50,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49971.0, 300 sec: 49263.0). Total num frames: 1797029888. Throughput: 0: 49564.7. Samples: 1325801860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 23:48:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:48:51,338][71000] Updated weights for policy 0, policy_version 109684 (0.0031) [2024-06-12 23:48:55,157][71000] Updated weights for policy 0, policy_version 109694 (0.0025) [2024-06-12 23:48:55,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49971.2, 300 sec: 49318.6). Total num frames: 1797259264. Throughput: 0: 49508.5. Samples: 1326097240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 23:48:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:48:57,837][71000] Updated weights for policy 0, policy_version 109704 (0.0026) [2024-06-12 23:49:00,940][70768] Fps is (10 sec: 45876.1, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1797488640. Throughput: 0: 49572.1. Samples: 1326388360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 23:49:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:49:01,868][71000] Updated weights for policy 0, policy_version 109714 (0.0031) [2024-06-12 23:49:04,369][71000] Updated weights for policy 0, policy_version 109724 (0.0037) [2024-06-12 23:49:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1797767168. Throughput: 0: 49615.7. Samples: 1326527680. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 23:49:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:49:08,342][71000] Updated weights for policy 0, policy_version 109734 (0.0027) [2024-06-12 23:49:10,939][70768] Fps is (10 sec: 54067.4, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 1798029312. Throughput: 0: 49692.0. Samples: 1326833940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 23:49:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:49:11,134][71000] Updated weights for policy 0, policy_version 109744 (0.0023) [2024-06-12 23:49:15,074][71000] Updated weights for policy 0, policy_version 109754 (0.0029) [2024-06-12 23:49:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49971.3, 300 sec: 49429.7). Total num frames: 1798258688. Throughput: 0: 49564.0. Samples: 1327131980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 23:49:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:49:16,179][70980] Signal inference workers to stop experience collection... (19450 times) [2024-06-12 23:49:16,180][70980] Signal inference workers to resume experience collection... (19450 times) [2024-06-12 23:49:16,219][71000] InferenceWorker_p0-w0: stopping experience collection (19450 times) [2024-06-12 23:49:16,219][71000] InferenceWorker_p0-w0: resuming experience collection (19450 times) [2024-06-12 23:49:17,914][71000] Updated weights for policy 0, policy_version 109764 (0.0031) [2024-06-12 23:49:20,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1798488064. Throughput: 0: 49367.6. Samples: 1327273100. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 23:49:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:49:21,574][71000] Updated weights for policy 0, policy_version 109774 (0.0022) [2024-06-12 23:49:24,237][71000] Updated weights for policy 0, policy_version 109784 (0.0028) [2024-06-12 23:49:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 1798750208. Throughput: 0: 49421.0. Samples: 1327568940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 23:49:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:49:28,123][71000] Updated weights for policy 0, policy_version 109794 (0.0029) [2024-06-12 23:49:30,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1799012352. Throughput: 0: 49232.0. Samples: 1327860660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 23:49:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:49:31,123][71000] Updated weights for policy 0, policy_version 109804 (0.0029) [2024-06-12 23:49:34,734][71000] Updated weights for policy 0, policy_version 109814 (0.0027) [2024-06-12 23:49:35,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1799258112. Throughput: 0: 49249.5. Samples: 1328018080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-12 23:49:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:49:37,662][71000] Updated weights for policy 0, policy_version 109824 (0.0026) [2024-06-12 23:49:40,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 1799471104. Throughput: 0: 49259.4. Samples: 1328313920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 23:49:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:49:41,032][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000109832_1799487488.pth... 
[2024-06-12 23:49:41,087][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000109107_1787609088.pth [2024-06-12 23:49:41,375][71000] Updated weights for policy 0, policy_version 109834 (0.0027) [2024-06-12 23:49:44,481][71000] Updated weights for policy 0, policy_version 109844 (0.0028) [2024-06-12 23:49:45,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1799733248. Throughput: 0: 49501.2. Samples: 1328615920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 23:49:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:49:47,892][71000] Updated weights for policy 0, policy_version 109854 (0.0025) [2024-06-12 23:49:50,791][71000] Updated weights for policy 0, policy_version 109864 (0.0024) [2024-06-12 23:49:50,940][70768] Fps is (10 sec: 54067.4, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1800011776. Throughput: 0: 49600.4. Samples: 1328759700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 23:49:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:49:54,470][71000] Updated weights for policy 0, policy_version 109874 (0.0036) [2024-06-12 23:49:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 1800241152. Throughput: 0: 49553.1. Samples: 1329063840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 23:49:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:49:57,548][71000] Updated weights for policy 0, policy_version 109884 (0.0035) [2024-06-12 23:50:00,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49971.2, 300 sec: 49485.3). Total num frames: 1800486912. Throughput: 0: 49564.9. Samples: 1329362400. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 23:50:00,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-12 23:50:01,074][71000] Updated weights for policy 0, policy_version 109894 (0.0020) [2024-06-12 23:50:03,994][71000] Updated weights for policy 0, policy_version 109904 (0.0025) [2024-06-12 23:50:05,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 1800716288. Throughput: 0: 49467.9. Samples: 1329499160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 23:50:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:50:07,761][71000] Updated weights for policy 0, policy_version 109914 (0.0020) [2024-06-12 23:50:10,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49151.8, 300 sec: 49429.7). Total num frames: 1800978432. Throughput: 0: 49444.4. Samples: 1329793940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 23:50:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:50:10,995][71000] Updated weights for policy 0, policy_version 109924 (0.0026) [2024-06-12 23:50:14,385][71000] Updated weights for policy 0, policy_version 109934 (0.0032) [2024-06-12 23:50:15,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1801224192. Throughput: 0: 49575.7. Samples: 1330091560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 23:50:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:50:17,640][71000] Updated weights for policy 0, policy_version 109944 (0.0030) [2024-06-12 23:50:19,482][70980] Signal inference workers to stop experience collection... (19500 times) [2024-06-12 23:50:19,540][71000] InferenceWorker_p0-w0: stopping experience collection (19500 times) [2024-06-12 23:50:19,591][70980] Signal inference workers to resume experience collection... 
(19500 times) [2024-06-12 23:50:19,591][71000] InferenceWorker_p0-w0: resuming experience collection (19500 times) [2024-06-12 23:50:20,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1801469952. Throughput: 0: 49252.4. Samples: 1330234440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 23:50:20,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 23:50:21,238][71000] Updated weights for policy 0, policy_version 109954 (0.0024) [2024-06-12 23:50:24,298][71000] Updated weights for policy 0, policy_version 109964 (0.0029) [2024-06-12 23:50:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1801715712. Throughput: 0: 49369.1. Samples: 1330535520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 23:50:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 23:50:27,777][71000] Updated weights for policy 0, policy_version 109974 (0.0030) [2024-06-12 23:50:30,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1801961472. Throughput: 0: 49133.0. Samples: 1330826900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 23:50:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:50:31,273][71000] Updated weights for policy 0, policy_version 109984 (0.0033) [2024-06-12 23:50:34,443][71000] Updated weights for policy 0, policy_version 109994 (0.0024) [2024-06-12 23:50:35,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1802223616. Throughput: 0: 49344.0. Samples: 1330980180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-12 23:50:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:50:37,567][71000] Updated weights for policy 0, policy_version 110004 (0.0029) [2024-06-12 23:50:40,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1802452992. Throughput: 0: 49329.3. Samples: 1331283660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:50:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:50:41,001][71000] Updated weights for policy 0, policy_version 110014 (0.0032) [2024-06-12 23:50:44,555][71000] Updated weights for policy 0, policy_version 110024 (0.0028) [2024-06-12 23:50:45,939][70768] Fps is (10 sec: 45875.9, 60 sec: 49152.2, 300 sec: 49374.2). Total num frames: 1802682368. Throughput: 0: 49019.6. Samples: 1331568280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:50:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:50:47,768][71000] Updated weights for policy 0, policy_version 110034 (0.0028) [2024-06-12 23:50:50,917][71000] Updated weights for policy 0, policy_version 110044 (0.0026) [2024-06-12 23:50:50,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 1802960896. Throughput: 0: 49158.7. Samples: 1331711300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:50:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:50:54,185][71000] Updated weights for policy 0, policy_version 110054 (0.0024) [2024-06-12 23:50:55,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1803206656. Throughput: 0: 49295.6. Samples: 1332012240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:50:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:50:57,771][71000] Updated weights for policy 0, policy_version 110064 (0.0026) [2024-06-12 23:51:00,790][71000] Updated weights for policy 0, policy_version 110074 (0.0024) [2024-06-12 23:51:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49424.9, 300 sec: 49596.3). Total num frames: 1803452416. Throughput: 0: 49362.9. Samples: 1332312900. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:51:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:51:04,039][71000] Updated weights for policy 0, policy_version 110084 (0.0033) [2024-06-12 23:51:05,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1803665408. Throughput: 0: 49663.2. Samples: 1332469280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:51:05,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 23:51:07,416][71000] Updated weights for policy 0, policy_version 110094 (0.0035) [2024-06-12 23:51:10,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.1, 300 sec: 49540.8). Total num frames: 1803927552. Throughput: 0: 49313.3. Samples: 1332754620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:51:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:51:11,250][71000] Updated weights for policy 0, policy_version 110104 (0.0030) [2024-06-12 23:51:14,097][71000] Updated weights for policy 0, policy_version 110114 (0.0026) [2024-06-12 23:51:15,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1804189696. Throughput: 0: 49371.1. Samples: 1333048600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:51:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 23:51:17,785][71000] Updated weights for policy 0, policy_version 110124 (0.0035) [2024-06-12 23:51:20,593][71000] Updated weights for policy 0, policy_version 110134 (0.0026) [2024-06-12 23:51:20,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1804451840. Throughput: 0: 49442.6. Samples: 1333205100. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:51:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:51:24,206][71000] Updated weights for policy 0, policy_version 110144 (0.0033) [2024-06-12 23:51:25,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 1804648448. Throughput: 0: 49152.3. Samples: 1333495500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:51:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:51:27,260][71000] Updated weights for policy 0, policy_version 110154 (0.0031) [2024-06-12 23:51:30,939][70768] Fps is (10 sec: 45876.1, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1804910592. Throughput: 0: 49416.9. Samples: 1333792040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:51:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:51:31,078][71000] Updated weights for policy 0, policy_version 110164 (0.0035) [2024-06-12 23:51:34,215][71000] Updated weights for policy 0, policy_version 110174 (0.0024) [2024-06-12 23:51:35,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1805172736. Throughput: 0: 49427.1. Samples: 1333935520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:51:35,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 23:51:37,751][71000] Updated weights for policy 0, policy_version 110184 (0.0034) [2024-06-12 23:51:40,310][70980] Signal inference workers to stop experience collection... (19550 times) [2024-06-12 23:51:40,311][70980] Signal inference workers to resume experience collection... 
(19550 times) [2024-06-12 23:51:40,347][71000] InferenceWorker_p0-w0: stopping experience collection (19550 times) [2024-06-12 23:51:40,348][71000] InferenceWorker_p0-w0: resuming experience collection (19550 times) [2024-06-12 23:51:40,446][71000] Updated weights for policy 0, policy_version 110194 (0.0031) [2024-06-12 23:51:40,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.3, 300 sec: 49485.2). Total num frames: 1805434880. Throughput: 0: 49545.0. Samples: 1334241760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:51:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:51:40,946][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000110195_1805434880.pth... [2024-06-12 23:51:40,990][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000109470_1793556480.pth [2024-06-12 23:51:44,262][71000] Updated weights for policy 0, policy_version 110204 (0.0018) [2024-06-12 23:51:45,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1805664256. Throughput: 0: 49357.1. Samples: 1334533960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:51:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:51:47,121][71000] Updated weights for policy 0, policy_version 110214 (0.0028) [2024-06-12 23:51:50,629][71000] Updated weights for policy 0, policy_version 110224 (0.0027) [2024-06-12 23:51:50,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1805910016. Throughput: 0: 49124.3. Samples: 1334679880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:51:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:51:53,650][71000] Updated weights for policy 0, policy_version 110234 (0.0027) [2024-06-12 23:51:55,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.1, 300 sec: 49374.2). Total num frames: 1806139392. Throughput: 0: 49418.3. Samples: 1334978440. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:51:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:51:57,473][71000] Updated weights for policy 0, policy_version 110244 (0.0025) [2024-06-12 23:52:00,363][71000] Updated weights for policy 0, policy_version 110254 (0.0028) [2024-06-12 23:52:00,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1806417920. Throughput: 0: 49516.0. Samples: 1335276820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:52:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:52:03,913][71000] Updated weights for policy 0, policy_version 110264 (0.0030) [2024-06-12 23:52:05,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1806663680. Throughput: 0: 49505.5. Samples: 1335432840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:52:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:52:07,159][71000] Updated weights for policy 0, policy_version 110274 (0.0033) [2024-06-12 23:52:10,391][71000] Updated weights for policy 0, policy_version 110284 (0.0026) [2024-06-12 23:52:10,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1806909440. Throughput: 0: 49693.8. Samples: 1335731720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:52:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:52:13,764][71000] Updated weights for policy 0, policy_version 110294 (0.0026) [2024-06-12 23:52:15,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1807155200. Throughput: 0: 49530.7. Samples: 1336020920. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:52:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:52:17,253][71000] Updated weights for policy 0, policy_version 110304 (0.0035) [2024-06-12 23:52:20,437][71000] Updated weights for policy 0, policy_version 110314 (0.0029) [2024-06-12 23:52:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.2, 300 sec: 49429.7). Total num frames: 1807400960. Throughput: 0: 49820.6. Samples: 1336177440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:52:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:52:23,803][71000] Updated weights for policy 0, policy_version 110324 (0.0035) [2024-06-12 23:52:25,940][70768] Fps is (10 sec: 50789.7, 60 sec: 50244.2, 300 sec: 49540.8). Total num frames: 1807663104. Throughput: 0: 49735.4. Samples: 1336479860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:52:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:52:26,797][71000] Updated weights for policy 0, policy_version 110334 (0.0032) [2024-06-12 23:52:30,314][71000] Updated weights for policy 0, policy_version 110344 (0.0027) [2024-06-12 23:52:30,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49698.0, 300 sec: 49540.8). Total num frames: 1807892480. Throughput: 0: 49726.5. Samples: 1336771660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:52:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:52:33,641][71000] Updated weights for policy 0, policy_version 110354 (0.0036) [2024-06-12 23:52:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49540.7). Total num frames: 1808154624. Throughput: 0: 49771.2. Samples: 1336919580. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:52:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:52:36,826][71000] Updated weights for policy 0, policy_version 110364 (0.0034) [2024-06-12 23:52:40,236][71000] Updated weights for policy 0, policy_version 110374 (0.0027) [2024-06-12 23:52:40,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1808384000. Throughput: 0: 49752.4. Samples: 1337217300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-12 23:52:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:52:43,680][71000] Updated weights for policy 0, policy_version 110384 (0.0026) [2024-06-12 23:52:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.0, 300 sec: 49540.7). Total num frames: 1808646144. Throughput: 0: 49567.0. Samples: 1337507340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 23:52:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:52:46,660][71000] Updated weights for policy 0, policy_version 110394 (0.0024) [2024-06-12 23:52:50,375][71000] Updated weights for policy 0, policy_version 110404 (0.0022) [2024-06-12 23:52:50,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1808875520. Throughput: 0: 49580.0. Samples: 1337663940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 23:52:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:52:53,299][71000] Updated weights for policy 0, policy_version 110414 (0.0027) [2024-06-12 23:52:55,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1809137664. Throughput: 0: 49605.3. Samples: 1337963960. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 23:52:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:52:56,985][71000] Updated weights for policy 0, policy_version 110424 (0.0037) [2024-06-12 23:52:58,196][70980] Signal inference workers to stop experience collection... (19600 times) [2024-06-12 23:52:58,199][70980] Signal inference workers to resume experience collection... (19600 times) [2024-06-12 23:52:58,238][71000] InferenceWorker_p0-w0: stopping experience collection (19600 times) [2024-06-12 23:52:58,238][71000] InferenceWorker_p0-w0: resuming experience collection (19600 times) [2024-06-12 23:53:00,286][71000] Updated weights for policy 0, policy_version 110434 (0.0038) [2024-06-12 23:53:00,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 1809383424. Throughput: 0: 49857.9. Samples: 1338264540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 23:53:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:53:03,624][71000] Updated weights for policy 0, policy_version 110444 (0.0026) [2024-06-12 23:53:05,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1809629184. Throughput: 0: 49638.7. Samples: 1338411180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 23:53:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:53:06,693][71000] Updated weights for policy 0, policy_version 110454 (0.0025) [2024-06-12 23:53:10,070][71000] Updated weights for policy 0, policy_version 110464 (0.0027) [2024-06-12 23:53:10,940][70768] Fps is (10 sec: 50791.6, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1809891328. Throughput: 0: 49452.5. Samples: 1338705220. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 23:53:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:53:12,971][71000] Updated weights for policy 0, policy_version 110474 (0.0038) [2024-06-12 23:53:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49485.3). Total num frames: 1810120704. Throughput: 0: 49716.6. Samples: 1339008900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 23:53:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:53:16,400][71000] Updated weights for policy 0, policy_version 110484 (0.0027) [2024-06-12 23:53:20,023][71000] Updated weights for policy 0, policy_version 110494 (0.0026) [2024-06-12 23:53:20,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1810366464. Throughput: 0: 49547.2. Samples: 1339149200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 23:53:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:53:23,126][71000] Updated weights for policy 0, policy_version 110504 (0.0024) [2024-06-12 23:53:25,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1810628608. Throughput: 0: 49648.4. Samples: 1339451480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 23:53:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:53:26,538][71000] Updated weights for policy 0, policy_version 110514 (0.0025) [2024-06-12 23:53:29,787][71000] Updated weights for policy 0, policy_version 110524 (0.0024) [2024-06-12 23:53:30,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1810890752. Throughput: 0: 49925.5. Samples: 1339753980. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 23:53:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:53:32,825][71000] Updated weights for policy 0, policy_version 110534 (0.0026) [2024-06-12 23:53:35,940][70768] Fps is (10 sec: 49150.6, 60 sec: 49424.9, 300 sec: 49540.8). Total num frames: 1811120128. Throughput: 0: 49799.3. Samples: 1339904920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 23:53:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:53:36,466][71000] Updated weights for policy 0, policy_version 110544 (0.0030) [2024-06-12 23:53:39,439][71000] Updated weights for policy 0, policy_version 110554 (0.0026) [2024-06-12 23:53:40,940][70768] Fps is (10 sec: 47512.5, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 1811365888. Throughput: 0: 49669.5. Samples: 1340199100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-12 23:53:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:53:41,064][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000110558_1811382272.pth... [2024-06-12 23:53:41,115][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000109832_1799487488.pth [2024-06-12 23:53:42,900][71000] Updated weights for policy 0, policy_version 110564 (0.0035) [2024-06-12 23:53:45,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1811611648. Throughput: 0: 49568.1. Samples: 1340495100. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-12 23:53:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:53:46,275][71000] Updated weights for policy 0, policy_version 110574 (0.0027) [2024-06-12 23:53:49,441][71000] Updated weights for policy 0, policy_version 110584 (0.0031) [2024-06-12 23:53:50,940][70768] Fps is (10 sec: 52429.8, 60 sec: 50244.3, 300 sec: 49596.3). Total num frames: 1811890176. Throughput: 0: 49696.0. Samples: 1340647500. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-12 23:53:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:53:52,667][71000] Updated weights for policy 0, policy_version 110594 (0.0024) [2024-06-12 23:53:55,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1812119552. Throughput: 0: 49738.3. Samples: 1340943440. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-12 23:53:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:53:56,074][71000] Updated weights for policy 0, policy_version 110604 (0.0027) [2024-06-12 23:53:59,513][71000] Updated weights for policy 0, policy_version 110614 (0.0033) [2024-06-12 23:54:00,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1812348928. Throughput: 0: 49630.1. Samples: 1341242260. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-12 23:54:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:54:01,044][70980] Signal inference workers to stop experience collection... (19650 times) [2024-06-12 23:54:01,079][71000] InferenceWorker_p0-w0: stopping experience collection (19650 times) [2024-06-12 23:54:01,090][70980] Signal inference workers to resume experience collection... (19650 times) [2024-06-12 23:54:01,099][71000] InferenceWorker_p0-w0: resuming experience collection (19650 times) [2024-06-12 23:54:02,815][71000] Updated weights for policy 0, policy_version 110624 (0.0024) [2024-06-12 23:54:05,922][71000] Updated weights for policy 0, policy_version 110634 (0.0030) [2024-06-12 23:54:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 1812627456. Throughput: 0: 49728.9. Samples: 1341387000. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-12 23:54:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:54:09,178][71000] Updated weights for policy 0, policy_version 110644 (0.0027) [2024-06-12 23:54:10,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49698.0, 300 sec: 49540.7). Total num frames: 1812873216. Throughput: 0: 49688.7. Samples: 1341687480. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-12 23:54:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:54:12,490][71000] Updated weights for policy 0, policy_version 110654 (0.0026) [2024-06-12 23:54:15,871][71000] Updated weights for policy 0, policy_version 110664 (0.0028) [2024-06-12 23:54:15,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1813118976. Throughput: 0: 49612.0. Samples: 1341986520. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-12 23:54:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:54:19,449][71000] Updated weights for policy 0, policy_version 110674 (0.0027) [2024-06-12 23:54:20,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1813331968. Throughput: 0: 49512.8. Samples: 1342132980. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-12 23:54:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:54:22,664][71000] Updated weights for policy 0, policy_version 110684 (0.0030) [2024-06-12 23:54:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1813594112. Throughput: 0: 49470.0. Samples: 1342425240. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-12 23:54:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:54:26,043][71000] Updated weights for policy 0, policy_version 110694 (0.0026) [2024-06-12 23:54:29,054][71000] Updated weights for policy 0, policy_version 110704 (0.0028) [2024-06-12 23:54:30,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1813856256. Throughput: 0: 49622.8. Samples: 1342728120. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-12 23:54:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:54:32,695][71000] Updated weights for policy 0, policy_version 110714 (0.0022) [2024-06-12 23:54:35,513][71000] Updated weights for policy 0, policy_version 110724 (0.0024) [2024-06-12 23:54:35,940][70768] Fps is (10 sec: 54066.4, 60 sec: 50244.4, 300 sec: 49707.4). Total num frames: 1814134784. Throughput: 0: 49666.5. Samples: 1342882500. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-12 23:54:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:54:39,135][71000] Updated weights for policy 0, policy_version 110734 (0.0031) [2024-06-12 23:54:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.2, 300 sec: 49485.3). Total num frames: 1814331392. Throughput: 0: 49619.1. Samples: 1343176300. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-12 23:54:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:54:42,336][71000] Updated weights for policy 0, policy_version 110744 (0.0023) [2024-06-12 23:54:45,939][70768] Fps is (10 sec: 44237.7, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1814577152. Throughput: 0: 49402.8. Samples: 1343465380. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 23:54:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:54:45,969][71000] Updated weights for policy 0, policy_version 110754 (0.0033) [2024-06-12 23:54:48,987][71000] Updated weights for policy 0, policy_version 110764 (0.0031) [2024-06-12 23:54:50,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1814839296. Throughput: 0: 49508.4. Samples: 1343614880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 23:54:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:54:52,961][71000] Updated weights for policy 0, policy_version 110774 (0.0033) [2024-06-12 23:54:55,649][71000] Updated weights for policy 0, policy_version 110784 (0.0028) [2024-06-12 23:54:55,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49698.0, 300 sec: 49540.7). Total num frames: 1815101440. Throughput: 0: 49364.9. Samples: 1343908900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 23:54:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:54:59,358][71000] Updated weights for policy 0, policy_version 110794 (0.0029) [2024-06-12 23:55:00,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1815314432. Throughput: 0: 49246.5. Samples: 1344202620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 23:55:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:55:01,673][70980] Signal inference workers to stop experience collection... (19700 times) [2024-06-12 23:55:01,717][71000] InferenceWorker_p0-w0: stopping experience collection (19700 times) [2024-06-12 23:55:01,725][70980] Signal inference workers to resume experience collection... 
(19700 times) [2024-06-12 23:55:01,729][71000] InferenceWorker_p0-w0: resuming experience collection (19700 times) [2024-06-12 23:55:02,197][71000] Updated weights for policy 0, policy_version 110804 (0.0025) [2024-06-12 23:55:05,830][71000] Updated weights for policy 0, policy_version 110814 (0.0033) [2024-06-12 23:55:05,944][70768] Fps is (10 sec: 47494.1, 60 sec: 49148.5, 300 sec: 49484.5). Total num frames: 1815576576. Throughput: 0: 49181.6. Samples: 1344346360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 23:55:05,944][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:55:08,908][71000] Updated weights for policy 0, policy_version 110824 (0.0035) [2024-06-12 23:55:10,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49152.2, 300 sec: 49485.2). Total num frames: 1815822336. Throughput: 0: 49112.9. Samples: 1344635320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 23:55:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:55:12,731][71000] Updated weights for policy 0, policy_version 110834 (0.0020) [2024-06-12 23:55:15,487][71000] Updated weights for policy 0, policy_version 110844 (0.0027) [2024-06-12 23:55:15,941][70768] Fps is (10 sec: 50804.8, 60 sec: 49423.9, 300 sec: 49540.5). Total num frames: 1816084480. Throughput: 0: 49065.5. Samples: 1344936140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 23:55:15,942][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:55:19,253][71000] Updated weights for policy 0, policy_version 110854 (0.0025) [2024-06-12 23:55:20,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1816313856. Throughput: 0: 48975.2. Samples: 1345086380. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 23:55:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:55:22,001][71000] Updated weights for policy 0, policy_version 110864 (0.0027) [2024-06-12 23:55:25,525][71000] Updated weights for policy 0, policy_version 110874 (0.0028) [2024-06-12 23:55:25,940][70768] Fps is (10 sec: 47520.4, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1816559616. Throughput: 0: 49196.4. Samples: 1345390140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 23:55:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:55:28,669][71000] Updated weights for policy 0, policy_version 110884 (0.0023) [2024-06-12 23:55:30,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1816821760. Throughput: 0: 49351.2. Samples: 1345686180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 23:55:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:55:32,343][71000] Updated weights for policy 0, policy_version 110894 (0.0027) [2024-06-12 23:55:35,335][71000] Updated weights for policy 0, policy_version 110904 (0.0026) [2024-06-12 23:55:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48879.0, 300 sec: 49540.8). Total num frames: 1817067520. Throughput: 0: 49291.6. Samples: 1345833000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 23:55:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:55:39,042][71000] Updated weights for policy 0, policy_version 110914 (0.0027) [2024-06-12 23:55:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1817313280. Throughput: 0: 49284.6. Samples: 1346126700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 23:55:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:55:41,051][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000110921_1817329664.pth... 
[2024-06-12 23:55:41,090][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000110195_1805434880.pth [2024-06-12 23:55:41,880][71000] Updated weights for policy 0, policy_version 110924 (0.0027) [2024-06-12 23:55:45,481][71000] Updated weights for policy 0, policy_version 110934 (0.0031) [2024-06-12 23:55:45,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1817542656. Throughput: 0: 49477.5. Samples: 1346429100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-12 23:55:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:55:48,602][71000] Updated weights for policy 0, policy_version 110944 (0.0038) [2024-06-12 23:55:50,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1817788416. Throughput: 0: 49411.4. Samples: 1346569660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 23:55:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:55:52,386][71000] Updated weights for policy 0, policy_version 110954 (0.0026) [2024-06-12 23:55:55,204][71000] Updated weights for policy 0, policy_version 110964 (0.0033) [2024-06-12 23:55:55,944][70768] Fps is (10 sec: 50769.5, 60 sec: 49148.8, 300 sec: 49484.6). Total num frames: 1818050560. Throughput: 0: 49651.9. Samples: 1346869860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 23:55:55,944][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:55:58,721][71000] Updated weights for policy 0, policy_version 110974 (0.0026) [2024-06-12 23:56:00,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.3, 300 sec: 49596.3). Total num frames: 1818296320. Throughput: 0: 49624.7. Samples: 1347169180. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 23:56:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:56:01,714][71000] Updated weights for policy 0, policy_version 110984 (0.0032) [2024-06-12 23:56:04,831][70980] Signal inference workers to stop experience collection... (19750 times) [2024-06-12 23:56:04,831][70980] Signal inference workers to resume experience collection... (19750 times) [2024-06-12 23:56:04,870][71000] InferenceWorker_p0-w0: stopping experience collection (19750 times) [2024-06-12 23:56:04,870][71000] InferenceWorker_p0-w0: resuming experience collection (19750 times) [2024-06-12 23:56:05,267][71000] Updated weights for policy 0, policy_version 110994 (0.0032) [2024-06-12 23:56:05,940][70768] Fps is (10 sec: 49171.4, 60 sec: 49428.5, 300 sec: 49540.8). Total num frames: 1818542080. Throughput: 0: 49659.1. Samples: 1347321040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 23:56:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:56:08,324][71000] Updated weights for policy 0, policy_version 111004 (0.0024) [2024-06-12 23:56:10,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1818787840. Throughput: 0: 49555.6. Samples: 1347620140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 23:56:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:56:11,904][71000] Updated weights for policy 0, policy_version 111014 (0.0030) [2024-06-12 23:56:14,919][71000] Updated weights for policy 0, policy_version 111024 (0.0033) [2024-06-12 23:56:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49426.1, 300 sec: 49485.2). Total num frames: 1819049984. Throughput: 0: 49374.0. Samples: 1347908020. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 23:56:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-12 23:56:18,373][71000] Updated weights for policy 0, policy_version 111034 (0.0035) [2024-06-12 23:56:20,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49698.0, 300 sec: 49651.8). Total num frames: 1819295744. Throughput: 0: 49567.8. Samples: 1348063560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 23:56:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:56:21,583][71000] Updated weights for policy 0, policy_version 111044 (0.0037) [2024-06-12 23:56:25,254][71000] Updated weights for policy 0, policy_version 111054 (0.0025) [2024-06-12 23:56:25,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1819541504. Throughput: 0: 49682.7. Samples: 1348362420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 23:56:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:56:28,013][71000] Updated weights for policy 0, policy_version 111064 (0.0023) [2024-06-12 23:56:30,940][70768] Fps is (10 sec: 47514.8, 60 sec: 49152.0, 300 sec: 49485.3). Total num frames: 1819770880. Throughput: 0: 49616.9. Samples: 1348661860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 23:56:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:56:31,994][71000] Updated weights for policy 0, policy_version 111074 (0.0027) [2024-06-12 23:56:34,806][71000] Updated weights for policy 0, policy_version 111084 (0.0030) [2024-06-12 23:56:35,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1820049408. Throughput: 0: 49755.6. Samples: 1348808660. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 23:56:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:56:38,470][71000] Updated weights for policy 0, policy_version 111094 (0.0029) [2024-06-12 23:56:40,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1820295168. Throughput: 0: 49516.0. Samples: 1349097880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 23:56:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:56:41,427][71000] Updated weights for policy 0, policy_version 111104 (0.0022) [2024-06-12 23:56:45,107][71000] Updated weights for policy 0, policy_version 111114 (0.0023) [2024-06-12 23:56:45,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49971.1, 300 sec: 49596.3). Total num frames: 1820540928. Throughput: 0: 49538.0. Samples: 1349398400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-12 23:56:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:56:48,051][71000] Updated weights for policy 0, policy_version 111124 (0.0024) [2024-06-12 23:56:50,939][70768] Fps is (10 sec: 45875.6, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1820753920. Throughput: 0: 49371.3. Samples: 1349542740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:56:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:56:51,633][71000] Updated weights for policy 0, policy_version 111134 (0.0032) [2024-06-12 23:56:54,521][71000] Updated weights for policy 0, policy_version 111144 (0.0030) [2024-06-12 23:56:55,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49701.4, 300 sec: 49540.8). Total num frames: 1821032448. Throughput: 0: 49338.1. Samples: 1349840360. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:56:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-12 23:56:58,492][71000] Updated weights for policy 0, policy_version 111154 (0.0030) [2024-06-12 23:57:00,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49698.0, 300 sec: 49540.7). Total num frames: 1821278208. Throughput: 0: 49525.3. Samples: 1350136660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:57:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:57:01,345][71000] Updated weights for policy 0, policy_version 111164 (0.0029) [2024-06-12 23:57:05,078][71000] Updated weights for policy 0, policy_version 111174 (0.0024) [2024-06-12 23:57:05,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1821523968. Throughput: 0: 49554.4. Samples: 1350293500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:57:05,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-12 23:57:08,099][71000] Updated weights for policy 0, policy_version 111184 (0.0026) [2024-06-12 23:57:10,939][70768] Fps is (10 sec: 47514.8, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1821753344. Throughput: 0: 49643.2. Samples: 1350596360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:57:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:57:11,537][71000] Updated weights for policy 0, policy_version 111194 (0.0032) [2024-06-12 23:57:14,794][71000] Updated weights for policy 0, policy_version 111204 (0.0033) [2024-06-12 23:57:15,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.1, 300 sec: 49540.7). Total num frames: 1822015488. Throughput: 0: 49252.3. Samples: 1350878220. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:57:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:57:18,372][71000] Updated weights for policy 0, policy_version 111214 (0.0033) [2024-06-12 23:57:18,553][70980] Signal inference workers to stop experience collection... (19800 times) [2024-06-12 23:57:18,553][70980] Signal inference workers to resume experience collection... (19800 times) [2024-06-12 23:57:18,586][71000] InferenceWorker_p0-w0: stopping experience collection (19800 times) [2024-06-12 23:57:18,586][71000] InferenceWorker_p0-w0: resuming experience collection (19800 times) [2024-06-12 23:57:20,939][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.3, 300 sec: 49485.3). Total num frames: 1822261248. Throughput: 0: 49439.5. Samples: 1351033440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:57:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-12 23:57:21,236][71000] Updated weights for policy 0, policy_version 111224 (0.0033) [2024-06-12 23:57:24,962][71000] Updated weights for policy 0, policy_version 111234 (0.0032) [2024-06-12 23:57:25,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1822523392. Throughput: 0: 49609.0. Samples: 1351330280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:57:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:57:28,392][71000] Updated weights for policy 0, policy_version 111244 (0.0030) [2024-06-12 23:57:30,940][70768] Fps is (10 sec: 45874.2, 60 sec: 49151.8, 300 sec: 49374.1). Total num frames: 1822720000. Throughput: 0: 49291.5. Samples: 1351616520. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:57:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:57:31,675][71000] Updated weights for policy 0, policy_version 111254 (0.0028) [2024-06-12 23:57:34,879][71000] Updated weights for policy 0, policy_version 111264 (0.0030) [2024-06-12 23:57:35,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 49540.8). Total num frames: 1822998528. Throughput: 0: 49231.1. Samples: 1351758140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:57:35,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-12 23:57:38,428][71000] Updated weights for policy 0, policy_version 111274 (0.0033) [2024-06-12 23:57:40,940][70768] Fps is (10 sec: 50791.5, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 1823227904. Throughput: 0: 49201.9. Samples: 1352054440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:57:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:57:41,045][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000111282_1823244288.pth... [2024-06-12 23:57:41,086][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000110558_1811382272.pth [2024-06-12 23:57:41,784][71000] Updated weights for policy 0, policy_version 111284 (0.0036) [2024-06-12 23:57:44,919][71000] Updated weights for policy 0, policy_version 111294 (0.0027) [2024-06-12 23:57:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49540.8). Total num frames: 1823490048. Throughput: 0: 49338.4. Samples: 1352356880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-12 23:57:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-12 23:57:48,303][71000] Updated weights for policy 0, policy_version 111304 (0.0025) [2024-06-12 23:57:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49424.9, 300 sec: 49429.7). Total num frames: 1823719424. Throughput: 0: 49119.0. Samples: 1352503860. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 23:57:50,940][70768] Avg episode reward: [(0, '0.260')] [2024-06-12 23:57:51,555][71000] Updated weights for policy 0, policy_version 111314 (0.0023) [2024-06-12 23:57:55,154][71000] Updated weights for policy 0, policy_version 111324 (0.0022) [2024-06-12 23:57:55,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.2, 300 sec: 49485.3). Total num frames: 1823981568. Throughput: 0: 48997.3. Samples: 1352801240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 23:57:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:57:58,041][71000] Updated weights for policy 0, policy_version 111334 (0.0029) [2024-06-12 23:58:00,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 1824227328. Throughput: 0: 49414.4. Samples: 1353101860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 23:58:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:58:01,477][71000] Updated weights for policy 0, policy_version 111344 (0.0029) [2024-06-12 23:58:04,752][71000] Updated weights for policy 0, policy_version 111354 (0.0033) [2024-06-12 23:58:05,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1824473088. Throughput: 0: 49073.4. Samples: 1353241740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 23:58:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:58:08,187][71000] Updated weights for policy 0, policy_version 111364 (0.0041) [2024-06-12 23:58:10,940][70768] Fps is (10 sec: 50788.8, 60 sec: 49697.8, 300 sec: 49540.7). Total num frames: 1824735232. Throughput: 0: 49197.4. Samples: 1353544180. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 23:58:10,941][70768] Avg episode reward: [(0, '0.279')] [2024-06-12 23:58:11,327][71000] Updated weights for policy 0, policy_version 111374 (0.0029) [2024-06-12 23:58:15,142][71000] Updated weights for policy 0, policy_version 111384 (0.0030) [2024-06-12 23:58:15,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 1824948224. Throughput: 0: 49376.2. Samples: 1353838440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 23:58:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-12 23:58:17,785][71000] Updated weights for policy 0, policy_version 111394 (0.0029) [2024-06-12 23:58:20,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1825226752. Throughput: 0: 49464.7. Samples: 1353984060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 23:58:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-12 23:58:21,609][71000] Updated weights for policy 0, policy_version 111404 (0.0029) [2024-06-12 23:58:24,124][71000] Updated weights for policy 0, policy_version 111414 (0.0032) [2024-06-12 23:58:25,939][70768] Fps is (10 sec: 52429.0, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1825472512. Throughput: 0: 49454.7. Samples: 1354279900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 23:58:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:58:27,194][70980] Signal inference workers to stop experience collection... (19850 times) [2024-06-12 23:58:27,194][70980] Signal inference workers to resume experience collection... 
(19850 times) [2024-06-12 23:58:27,211][71000] InferenceWorker_p0-w0: stopping experience collection (19850 times) [2024-06-12 23:58:27,211][71000] InferenceWorker_p0-w0: resuming experience collection (19850 times) [2024-06-12 23:58:27,838][71000] Updated weights for policy 0, policy_version 111424 (0.0031) [2024-06-12 23:58:30,758][71000] Updated weights for policy 0, policy_version 111434 (0.0024) [2024-06-12 23:58:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 49540.8). Total num frames: 1825734656. Throughput: 0: 49492.3. Samples: 1354584040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 23:58:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:58:34,624][71000] Updated weights for policy 0, policy_version 111444 (0.0034) [2024-06-12 23:58:35,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1825964032. Throughput: 0: 49694.2. Samples: 1354740100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 23:58:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-12 23:58:37,491][71000] Updated weights for policy 0, policy_version 111454 (0.0033) [2024-06-12 23:58:40,939][70768] Fps is (10 sec: 45876.1, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1826193408. Throughput: 0: 49460.9. Samples: 1355026980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 23:58:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-12 23:58:41,125][71000] Updated weights for policy 0, policy_version 111464 (0.0024) [2024-06-12 23:58:43,991][71000] Updated weights for policy 0, policy_version 111474 (0.0031) [2024-06-12 23:58:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 1826455552. Throughput: 0: 49241.6. Samples: 1355317740. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-12 23:58:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:58:47,667][71000] Updated weights for policy 0, policy_version 111484 (0.0020) [2024-06-12 23:58:50,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1826701312. Throughput: 0: 49608.7. Samples: 1355474140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 23:58:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:58:50,962][71000] Updated weights for policy 0, policy_version 111494 (0.0022) [2024-06-12 23:58:54,774][71000] Updated weights for policy 0, policy_version 111504 (0.0024) [2024-06-12 23:58:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1826947072. Throughput: 0: 49388.6. Samples: 1355766660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 23:58:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:58:57,801][71000] Updated weights for policy 0, policy_version 111514 (0.0032) [2024-06-12 23:59:00,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1827176448. Throughput: 0: 49338.1. Samples: 1356058660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 23:59:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-12 23:59:01,366][71000] Updated weights for policy 0, policy_version 111524 (0.0025) [2024-06-12 23:59:04,236][71000] Updated weights for policy 0, policy_version 111534 (0.0030) [2024-06-12 23:59:05,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49151.8, 300 sec: 49318.6). Total num frames: 1827422208. Throughput: 0: 49348.4. Samples: 1356204740. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 23:59:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:59:07,897][71000] Updated weights for policy 0, policy_version 111544 (0.0027) [2024-06-12 23:59:10,825][71000] Updated weights for policy 0, policy_version 111554 (0.0034) [2024-06-12 23:59:10,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1827700736. Throughput: 0: 49366.5. Samples: 1356501400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 23:59:10,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-12 23:59:14,732][71000] Updated weights for policy 0, policy_version 111564 (0.0024) [2024-06-12 23:59:15,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1827930112. Throughput: 0: 49328.6. Samples: 1356803820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 23:59:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:59:17,564][71000] Updated weights for policy 0, policy_version 111574 (0.0034) [2024-06-12 23:59:20,939][70768] Fps is (10 sec: 45876.0, 60 sec: 48879.1, 300 sec: 49374.2). Total num frames: 1828159488. Throughput: 0: 49009.0. Samples: 1356945500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 23:59:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:59:21,458][71000] Updated weights for policy 0, policy_version 111584 (0.0027) [2024-06-12 23:59:24,198][71000] Updated weights for policy 0, policy_version 111594 (0.0037) [2024-06-12 23:59:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 1828405248. Throughput: 0: 49321.3. Samples: 1357246440. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 23:59:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-12 23:59:27,971][71000] Updated weights for policy 0, policy_version 111604 (0.0030) [2024-06-12 23:59:30,711][71000] Updated weights for policy 0, policy_version 111614 (0.0026) [2024-06-12 23:59:30,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1828683776. Throughput: 0: 49646.8. Samples: 1357551840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 23:59:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-12 23:59:34,355][71000] Updated weights for policy 0, policy_version 111624 (0.0028) [2024-06-12 23:59:35,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1828929536. Throughput: 0: 49520.9. Samples: 1357702580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 23:59:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-12 23:59:37,194][71000] Updated weights for policy 0, policy_version 111634 (0.0018) [2024-06-12 23:59:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49424.9, 300 sec: 49429.7). Total num frames: 1829158912. Throughput: 0: 49602.3. Samples: 1357998760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 23:59:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-12 23:59:40,982][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000111644_1829175296.pth... [2024-06-12 23:59:40,987][71000] Updated weights for policy 0, policy_version 111644 (0.0024) [2024-06-12 23:59:41,026][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000110921_1817329664.pth [2024-06-12 23:59:44,163][71000] Updated weights for policy 0, policy_version 111654 (0.0027) [2024-06-12 23:59:45,659][70980] Signal inference workers to stop experience collection... (19900 times) [2024-06-12 23:59:45,660][70980] Signal inference workers to resume experience collection... 
(19900 times) [2024-06-12 23:59:45,700][71000] InferenceWorker_p0-w0: stopping experience collection (19900 times) [2024-06-12 23:59:45,701][71000] InferenceWorker_p0-w0: resuming experience collection (19900 times) [2024-06-12 23:59:45,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1829404672. Throughput: 0: 49584.1. Samples: 1358289940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-12 23:59:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-12 23:59:47,430][71000] Updated weights for policy 0, policy_version 111664 (0.0031) [2024-06-12 23:59:50,461][71000] Updated weights for policy 0, policy_version 111674 (0.0024) [2024-06-12 23:59:50,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1829683200. Throughput: 0: 49718.3. Samples: 1358442060. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-12 23:59:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-12 23:59:53,942][71000] Updated weights for policy 0, policy_version 111684 (0.0025) [2024-06-12 23:59:55,939][70768] Fps is (10 sec: 52429.4, 60 sec: 49698.4, 300 sec: 49540.8). Total num frames: 1829928960. Throughput: 0: 49730.0. Samples: 1358739240. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-12 23:59:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-12 23:59:57,042][71000] Updated weights for policy 0, policy_version 111694 (0.0026) [2024-06-13 00:00:00,602][71000] Updated weights for policy 0, policy_version 111704 (0.0029) [2024-06-13 00:00:00,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49698.2, 300 sec: 49430.4). Total num frames: 1830158336. Throughput: 0: 49659.6. Samples: 1359038500. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-13 00:00:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:00:03,845][71000] Updated weights for policy 0, policy_version 111714 (0.0030) [2024-06-13 00:00:05,940][70768] Fps is (10 sec: 44236.3, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1830371328. Throughput: 0: 49646.2. Samples: 1359179580. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-13 00:00:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:00:07,402][71000] Updated weights for policy 0, policy_version 111724 (0.0033) [2024-06-13 00:00:10,294][71000] Updated weights for policy 0, policy_version 111734 (0.0031) [2024-06-13 00:00:10,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49425.0, 300 sec: 49429.9). Total num frames: 1830666240. Throughput: 0: 49466.0. Samples: 1359472420. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-13 00:00:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:00:14,035][71000] Updated weights for policy 0, policy_version 111744 (0.0032) [2024-06-13 00:00:15,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1830895616. Throughput: 0: 49441.0. Samples: 1359776680. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-13 00:00:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:00:16,985][71000] Updated weights for policy 0, policy_version 111754 (0.0028) [2024-06-13 00:00:20,671][71000] Updated weights for policy 0, policy_version 111764 (0.0027) [2024-06-13 00:00:20,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1831141376. Throughput: 0: 49303.6. Samples: 1359921240. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-13 00:00:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:00:23,658][71000] Updated weights for policy 0, policy_version 111774 (0.0029) [2024-06-13 00:00:25,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 1831387136. Throughput: 0: 49202.4. Samples: 1360212860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-13 00:00:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:00:27,173][71000] Updated weights for policy 0, policy_version 111784 (0.0035) [2024-06-13 00:00:30,081][71000] Updated weights for policy 0, policy_version 111794 (0.0023) [2024-06-13 00:00:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1831649280. Throughput: 0: 49243.5. Samples: 1360505900. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-13 00:00:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:00:33,885][71000] Updated weights for policy 0, policy_version 111804 (0.0031) [2024-06-13 00:00:35,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 1831878656. Throughput: 0: 49335.5. Samples: 1360662160. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-13 00:00:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:00:36,760][71000] Updated weights for policy 0, policy_version 111814 (0.0027) [2024-06-13 00:00:40,534][71000] Updated weights for policy 0, policy_version 111824 (0.0026) [2024-06-13 00:00:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1832124416. Throughput: 0: 49257.6. Samples: 1360955840. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-13 00:00:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:00:43,685][71000] Updated weights for policy 0, policy_version 111834 (0.0024) [2024-06-13 00:00:45,939][70768] Fps is (10 sec: 47514.7, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1832353792. Throughput: 0: 49226.8. Samples: 1361253700. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-13 00:00:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:00:46,866][71000] Updated weights for policy 0, policy_version 111844 (0.0038) [2024-06-13 00:00:49,961][70980] Signal inference workers to stop experience collection... (19950 times) [2024-06-13 00:00:49,962][70980] Signal inference workers to resume experience collection... (19950 times) [2024-06-13 00:00:50,003][71000] InferenceWorker_p0-w0: stopping experience collection (19950 times) [2024-06-13 00:00:50,003][71000] InferenceWorker_p0-w0: resuming experience collection (19950 times) [2024-06-13 00:00:50,096][71000] Updated weights for policy 0, policy_version 111854 (0.0029) [2024-06-13 00:00:50,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49430.3). Total num frames: 1832632320. Throughput: 0: 49305.1. Samples: 1361398320. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-13 00:00:50,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 00:00:53,684][71000] Updated weights for policy 0, policy_version 111864 (0.0030) [2024-06-13 00:00:55,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1832878080. Throughput: 0: 49494.9. Samples: 1361699680. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:00:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:00:56,948][71000] Updated weights for policy 0, policy_version 111874 (0.0024) [2024-06-13 00:01:00,379][71000] Updated weights for policy 0, policy_version 111884 (0.0027) [2024-06-13 00:01:00,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1833123840. Throughput: 0: 49146.6. Samples: 1361988280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:01:00,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-13 00:01:03,728][71000] Updated weights for policy 0, policy_version 111894 (0.0022) [2024-06-13 00:01:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 1833353216. Throughput: 0: 49147.1. Samples: 1362132860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:01:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:01:06,898][71000] Updated weights for policy 0, policy_version 111904 (0.0027) [2024-06-13 00:01:10,534][71000] Updated weights for policy 0, policy_version 111914 (0.0034) [2024-06-13 00:01:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1833615360. Throughput: 0: 49321.3. Samples: 1362432320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:01:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:01:13,523][71000] Updated weights for policy 0, policy_version 111924 (0.0031) [2024-06-13 00:01:15,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49424.9, 300 sec: 49374.2). Total num frames: 1833861120. Throughput: 0: 49499.5. Samples: 1362733380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:01:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:01:17,006][71000] Updated weights for policy 0, policy_version 111934 (0.0024) [2024-06-13 00:01:20,214][71000] Updated weights for policy 0, policy_version 111944 (0.0026) [2024-06-13 00:01:20,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1834106880. Throughput: 0: 49319.3. Samples: 1362881520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:01:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:01:23,687][71000] Updated weights for policy 0, policy_version 111954 (0.0028) [2024-06-13 00:01:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 1834336256. Throughput: 0: 49321.8. Samples: 1363175320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:01:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:01:26,805][71000] Updated weights for policy 0, policy_version 111964 (0.0029) [2024-06-13 00:01:30,411][71000] Updated weights for policy 0, policy_version 111974 (0.0026) [2024-06-13 00:01:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1834598400. Throughput: 0: 49411.3. Samples: 1363477220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:01:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:01:33,297][71000] Updated weights for policy 0, policy_version 111984 (0.0034) [2024-06-13 00:01:35,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49698.2, 300 sec: 49374.1). Total num frames: 1834860544. Throughput: 0: 49384.5. Samples: 1363620620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:01:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:01:36,828][71000] Updated weights for policy 0, policy_version 111994 (0.0036) [2024-06-13 00:01:40,216][71000] Updated weights for policy 0, policy_version 112004 (0.0029) [2024-06-13 00:01:40,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.2, 300 sec: 49318.7). Total num frames: 1835089920. Throughput: 0: 49277.8. Samples: 1363917180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:01:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:01:40,999][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000112006_1835106304.pth... [2024-06-13 00:01:41,035][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000111282_1823244288.pth [2024-06-13 00:01:43,806][71000] Updated weights for policy 0, policy_version 112014 (0.0030) [2024-06-13 00:01:45,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 1835335680. Throughput: 0: 49560.9. Samples: 1364218520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:01:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:01:46,741][71000] Updated weights for policy 0, policy_version 112024 (0.0031) [2024-06-13 00:01:50,303][71000] Updated weights for policy 0, policy_version 112034 (0.0025) [2024-06-13 00:01:50,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1835581440. Throughput: 0: 49601.7. Samples: 1364364940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:01:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:01:53,203][71000] Updated weights for policy 0, policy_version 112044 (0.0026) [2024-06-13 00:01:55,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1835843584. Throughput: 0: 49507.2. Samples: 1364660140. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 00:01:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:01:57,256][71000] Updated weights for policy 0, policy_version 112054 (0.0021) [2024-06-13 00:01:59,863][70980] Signal inference workers to stop experience collection... (20000 times) [2024-06-13 00:01:59,891][71000] InferenceWorker_p0-w0: stopping experience collection (20000 times) [2024-06-13 00:01:59,975][70980] Signal inference workers to resume experience collection... (20000 times) [2024-06-13 00:01:59,976][71000] InferenceWorker_p0-w0: resuming experience collection (20000 times) [2024-06-13 00:02:00,105][71000] Updated weights for policy 0, policy_version 112064 (0.0026) [2024-06-13 00:02:00,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1836072960. Throughput: 0: 49159.3. Samples: 1364945540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 00:02:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:02:03,815][71000] Updated weights for policy 0, policy_version 112074 (0.0031) [2024-06-13 00:02:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 1836318720. Throughput: 0: 49304.0. Samples: 1365100200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 00:02:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:02:06,829][71000] Updated weights for policy 0, policy_version 112084 (0.0016) [2024-06-13 00:02:10,390][71000] Updated weights for policy 0, policy_version 112094 (0.0032) [2024-06-13 00:02:10,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 1836548096. Throughput: 0: 49169.8. Samples: 1365387960. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 00:02:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:02:13,526][71000] Updated weights for policy 0, policy_version 112104 (0.0024) [2024-06-13 00:02:15,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1836826624. Throughput: 0: 49040.1. Samples: 1365684020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 00:02:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:02:17,123][71000] Updated weights for policy 0, policy_version 112114 (0.0025) [2024-06-13 00:02:20,137][71000] Updated weights for policy 0, policy_version 112124 (0.0030) [2024-06-13 00:02:20,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1837072384. Throughput: 0: 49340.6. Samples: 1365840940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 00:02:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:02:23,535][71000] Updated weights for policy 0, policy_version 112134 (0.0030) [2024-06-13 00:02:25,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.3, 300 sec: 49485.3). Total num frames: 1837318144. Throughput: 0: 49391.1. Samples: 1366139780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 00:02:25,940][70768] Avg episode reward: [(0, '0.263')] [2024-06-13 00:02:26,639][71000] Updated weights for policy 0, policy_version 112144 (0.0036) [2024-06-13 00:02:30,094][71000] Updated weights for policy 0, policy_version 112154 (0.0029) [2024-06-13 00:02:30,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 1837563904. Throughput: 0: 49493.8. Samples: 1366445740. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 00:02:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:02:33,401][71000] Updated weights for policy 0, policy_version 112164 (0.0035) [2024-06-13 00:02:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1837809664. Throughput: 0: 49684.5. Samples: 1366600740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 00:02:35,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-13 00:02:36,543][71000] Updated weights for policy 0, policy_version 112174 (0.0031) [2024-06-13 00:02:39,835][71000] Updated weights for policy 0, policy_version 112184 (0.0035) [2024-06-13 00:02:40,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1838071808. Throughput: 0: 49664.9. Samples: 1366895060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 00:02:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 00:02:43,152][71000] Updated weights for policy 0, policy_version 112194 (0.0033) [2024-06-13 00:02:45,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1838317568. Throughput: 0: 49825.2. Samples: 1367187680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 00:02:45,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-13 00:02:46,598][71000] Updated weights for policy 0, policy_version 112204 (0.0040) [2024-06-13 00:02:49,742][71000] Updated weights for policy 0, policy_version 112214 (0.0032) [2024-06-13 00:02:50,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1838579712. Throughput: 0: 49662.5. Samples: 1367335020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 00:02:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:02:53,017][71000] Updated weights for policy 0, policy_version 112224 (0.0026) [2024-06-13 00:02:55,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1838825472. Throughput: 0: 49871.7. Samples: 1367632180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-13 00:02:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 00:02:56,245][71000] Updated weights for policy 0, policy_version 112234 (0.0040) [2024-06-13 00:02:59,835][71000] Updated weights for policy 0, policy_version 112244 (0.0029) [2024-06-13 00:03:00,940][70768] Fps is (10 sec: 50791.2, 60 sec: 50244.3, 300 sec: 49540.8). Total num frames: 1839087616. Throughput: 0: 50068.0. Samples: 1367937080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-13 00:03:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 00:03:02,847][71000] Updated weights for policy 0, policy_version 112254 (0.0025) [2024-06-13 00:03:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 1839300608. Throughput: 0: 49818.2. Samples: 1368082760. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-13 00:03:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 00:03:06,421][71000] Updated weights for policy 0, policy_version 112264 (0.0034) [2024-06-13 00:03:09,647][71000] Updated weights for policy 0, policy_version 112274 (0.0032) [2024-06-13 00:03:10,940][70768] Fps is (10 sec: 47512.6, 60 sec: 50244.2, 300 sec: 49540.7). Total num frames: 1839562752. Throughput: 0: 49723.7. Samples: 1368377360. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-13 00:03:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:03:13,126][71000] Updated weights for policy 0, policy_version 112284 (0.0031) [2024-06-13 00:03:15,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1839808512. Throughput: 0: 49708.9. Samples: 1368682640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-13 00:03:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:03:16,000][71000] Updated weights for policy 0, policy_version 112294 (0.0030) [2024-06-13 00:03:19,409][70980] Signal inference workers to stop experience collection... (20050 times) [2024-06-13 00:03:19,410][70980] Signal inference workers to resume experience collection... (20050 times) [2024-06-13 00:03:19,419][71000] InferenceWorker_p0-w0: stopping experience collection (20050 times) [2024-06-13 00:03:19,429][71000] InferenceWorker_p0-w0: resuming experience collection (20050 times) [2024-06-13 00:03:19,582][71000] Updated weights for policy 0, policy_version 112304 (0.0026) [2024-06-13 00:03:20,940][70768] Fps is (10 sec: 52429.8, 60 sec: 50244.3, 300 sec: 49540.8). Total num frames: 1840087040. Throughput: 0: 49677.8. Samples: 1368836240. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-13 00:03:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:03:22,514][71000] Updated weights for policy 0, policy_version 112314 (0.0023) [2024-06-13 00:03:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1840283648. Throughput: 0: 49588.8. Samples: 1369126560. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-13 00:03:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 00:03:26,094][71000] Updated weights for policy 0, policy_version 112324 (0.0026) [2024-06-13 00:03:29,278][71000] Updated weights for policy 0, policy_version 112334 (0.0030) [2024-06-13 00:03:30,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1840545792. Throughput: 0: 49561.3. Samples: 1369417940. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-13 00:03:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:03:33,007][71000] Updated weights for policy 0, policy_version 112344 (0.0027) [2024-06-13 00:03:35,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1840791552. Throughput: 0: 49510.4. Samples: 1369562980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-13 00:03:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:03:36,037][71000] Updated weights for policy 0, policy_version 112354 (0.0029) [2024-06-13 00:03:39,304][71000] Updated weights for policy 0, policy_version 112364 (0.0027) [2024-06-13 00:03:40,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.1, 300 sec: 49485.3). Total num frames: 1841053696. Throughput: 0: 49672.0. Samples: 1369867420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-13 00:03:40,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-13 00:03:40,954][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000112369_1841053696.pth... [2024-06-13 00:03:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000111644_1829175296.pth [2024-06-13 00:03:42,863][71000] Updated weights for policy 0, policy_version 112374 (0.0023) [2024-06-13 00:03:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1841266688. Throughput: 0: 49466.2. Samples: 1370163060. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-13 00:03:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:03:46,359][71000] Updated weights for policy 0, policy_version 112384 (0.0024) [2024-06-13 00:03:49,333][71000] Updated weights for policy 0, policy_version 112394 (0.0032) [2024-06-13 00:03:50,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1841528832. Throughput: 0: 49340.3. Samples: 1370303080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-13 00:03:50,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-13 00:03:52,848][71000] Updated weights for policy 0, policy_version 112404 (0.0030) [2024-06-13 00:03:55,831][71000] Updated weights for policy 0, policy_version 112414 (0.0031) [2024-06-13 00:03:55,940][70768] Fps is (10 sec: 52427.4, 60 sec: 49424.8, 300 sec: 49540.7). Total num frames: 1841790976. Throughput: 0: 49279.4. Samples: 1370594940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:03:55,941][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:03:59,297][71000] Updated weights for policy 0, policy_version 112424 (0.0029) [2024-06-13 00:04:00,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49151.9, 300 sec: 49540.8). Total num frames: 1842036736. Throughput: 0: 49184.8. Samples: 1370895960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:04:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:04:02,577][71000] Updated weights for policy 0, policy_version 112434 (0.0027) [2024-06-13 00:04:05,940][70768] Fps is (10 sec: 45876.0, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1842249728. Throughput: 0: 48996.8. Samples: 1371041100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:04:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:04:06,183][71000] Updated weights for policy 0, policy_version 112444 (0.0033) [2024-06-13 00:04:09,309][71000] Updated weights for policy 0, policy_version 112454 (0.0024) [2024-06-13 00:04:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1842511872. Throughput: 0: 49050.6. Samples: 1371333840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:04:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:04:12,912][71000] Updated weights for policy 0, policy_version 112464 (0.0025) [2024-06-13 00:04:15,807][71000] Updated weights for policy 0, policy_version 112474 (0.0033) [2024-06-13 00:04:15,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1842774016. Throughput: 0: 49019.2. Samples: 1371623800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:04:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:04:19,516][71000] Updated weights for policy 0, policy_version 112484 (0.0029) [2024-06-13 00:04:20,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.7, 300 sec: 49485.2). Total num frames: 1843003392. Throughput: 0: 49170.9. Samples: 1371775680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:04:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:04:22,689][71000] Updated weights for policy 0, policy_version 112494 (0.0026) [2024-06-13 00:04:25,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 1843249152. Throughput: 0: 48981.7. Samples: 1372071600. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:04:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:04:26,007][71000] Updated weights for policy 0, policy_version 112504 (0.0035) [2024-06-13 00:04:29,076][71000] Updated weights for policy 0, policy_version 112514 (0.0025) [2024-06-13 00:04:30,474][70980] Signal inference workers to stop experience collection... (20100 times) [2024-06-13 00:04:30,475][70980] Signal inference workers to resume experience collection... (20100 times) [2024-06-13 00:04:30,487][71000] InferenceWorker_p0-w0: stopping experience collection (20100 times) [2024-06-13 00:04:30,487][71000] InferenceWorker_p0-w0: resuming experience collection (20100 times) [2024-06-13 00:04:30,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1843511296. Throughput: 0: 49135.1. Samples: 1372374140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:04:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:04:32,676][71000] Updated weights for policy 0, policy_version 112524 (0.0025) [2024-06-13 00:04:35,669][71000] Updated weights for policy 0, policy_version 112534 (0.0030) [2024-06-13 00:04:35,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1843757056. Throughput: 0: 49373.5. Samples: 1372524880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:04:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:04:39,091][71000] Updated weights for policy 0, policy_version 112544 (0.0023) [2024-06-13 00:04:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 1843986432. Throughput: 0: 49435.4. Samples: 1372819520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:04:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:04:42,154][71000] Updated weights for policy 0, policy_version 112554 (0.0024) [2024-06-13 00:04:45,939][71000] Updated weights for policy 0, policy_version 112564 (0.0027) [2024-06-13 00:04:45,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 1844248576. Throughput: 0: 49272.4. Samples: 1373113220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:04:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:04:48,875][71000] Updated weights for policy 0, policy_version 112574 (0.0029) [2024-06-13 00:04:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.2, 300 sec: 49374.1). Total num frames: 1844494336. Throughput: 0: 49481.0. Samples: 1373267740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:04:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:04:52,384][71000] Updated weights for policy 0, policy_version 112584 (0.0033) [2024-06-13 00:04:55,540][71000] Updated weights for policy 0, policy_version 112594 (0.0028) [2024-06-13 00:04:55,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1844740096. Throughput: 0: 49547.1. Samples: 1373563460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:04:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:04:59,071][71000] Updated weights for policy 0, policy_version 112604 (0.0029) [2024-06-13 00:05:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1844985856. Throughput: 0: 49765.7. Samples: 1373863260. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:05:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:05:02,009][71000] Updated weights for policy 0, policy_version 112614 (0.0031) [2024-06-13 00:05:05,695][71000] Updated weights for policy 0, policy_version 112624 (0.0028) [2024-06-13 00:05:05,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49971.3, 300 sec: 49429.7). Total num frames: 1845248000. Throughput: 0: 49685.1. Samples: 1374011500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:05:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:05:08,751][71000] Updated weights for policy 0, policy_version 112634 (0.0032) [2024-06-13 00:05:10,939][70768] Fps is (10 sec: 50791.1, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1845493760. Throughput: 0: 49670.3. Samples: 1374306760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:05:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:05:12,210][71000] Updated weights for policy 0, policy_version 112644 (0.0024) [2024-06-13 00:05:15,787][71000] Updated weights for policy 0, policy_version 112654 (0.0029) [2024-06-13 00:05:15,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1845723136. Throughput: 0: 49523.5. Samples: 1374602700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:05:15,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-13 00:05:18,944][71000] Updated weights for policy 0, policy_version 112664 (0.0032) [2024-06-13 00:05:20,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1845985280. Throughput: 0: 49499.1. Samples: 1374752340. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:05:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:05:22,223][71000] Updated weights for policy 0, policy_version 112674 (0.0027) [2024-06-13 00:05:25,444][71000] Updated weights for policy 0, policy_version 112684 (0.0023) [2024-06-13 00:05:25,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 1846231040. Throughput: 0: 49558.9. Samples: 1375049680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:05:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:05:28,965][71000] Updated weights for policy 0, policy_version 112694 (0.0027) [2024-06-13 00:05:30,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1846460416. Throughput: 0: 49452.5. Samples: 1375338580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:05:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:05:31,978][71000] Updated weights for policy 0, policy_version 112704 (0.0030) [2024-06-13 00:05:35,577][71000] Updated weights for policy 0, policy_version 112714 (0.0032) [2024-06-13 00:05:35,728][70980] Signal inference workers to stop experience collection... (20150 times) [2024-06-13 00:05:35,729][70980] Signal inference workers to resume experience collection... (20150 times) [2024-06-13 00:05:35,779][71000] InferenceWorker_p0-w0: stopping experience collection (20150 times) [2024-06-13 00:05:35,779][71000] InferenceWorker_p0-w0: resuming experience collection (20150 times) [2024-06-13 00:05:35,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1846738944. Throughput: 0: 49673.8. Samples: 1375503060. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:05:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:05:38,654][71000] Updated weights for policy 0, policy_version 112724 (0.0034) [2024-06-13 00:05:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1846951936. Throughput: 0: 49620.0. Samples: 1375796360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:05:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:05:41,014][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000112730_1846968320.pth... [2024-06-13 00:05:41,058][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000112006_1835106304.pth [2024-06-13 00:05:41,919][71000] Updated weights for policy 0, policy_version 112734 (0.0033) [2024-06-13 00:05:45,115][71000] Updated weights for policy 0, policy_version 112744 (0.0027) [2024-06-13 00:05:45,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1847214080. Throughput: 0: 49397.9. Samples: 1376086160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:05:45,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:05:48,453][71000] Updated weights for policy 0, policy_version 112754 (0.0028) [2024-06-13 00:05:50,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1847459840. Throughput: 0: 49478.9. Samples: 1376238060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:05:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:05:51,819][71000] Updated weights for policy 0, policy_version 112764 (0.0028) [2024-06-13 00:05:55,133][71000] Updated weights for policy 0, policy_version 112774 (0.0034) [2024-06-13 00:05:55,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1847721984. Throughput: 0: 49563.0. Samples: 1376537100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:05:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:05:58,345][71000] Updated weights for policy 0, policy_version 112784 (0.0030) [2024-06-13 00:06:00,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1847967744. Throughput: 0: 49672.5. Samples: 1376837960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 00:06:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:06:01,483][71000] Updated weights for policy 0, policy_version 112794 (0.0026) [2024-06-13 00:06:05,025][71000] Updated weights for policy 0, policy_version 112804 (0.0038) [2024-06-13 00:06:05,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49151.8, 300 sec: 49429.7). Total num frames: 1848197120. Throughput: 0: 49755.0. Samples: 1376991320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 00:06:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:06:08,159][71000] Updated weights for policy 0, policy_version 112814 (0.0036) [2024-06-13 00:06:10,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49698.0, 300 sec: 49540.8). Total num frames: 1848475648. Throughput: 0: 49647.2. Samples: 1377283800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 00:06:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:06:11,807][71000] Updated weights for policy 0, policy_version 112824 (0.0032) [2024-06-13 00:06:14,747][71000] Updated weights for policy 0, policy_version 112834 (0.0032) [2024-06-13 00:06:15,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1848705024. Throughput: 0: 49797.7. Samples: 1377579480. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 00:06:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:06:18,483][71000] Updated weights for policy 0, policy_version 112844 (0.0033) [2024-06-13 00:06:20,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1848950784. Throughput: 0: 49421.8. Samples: 1377727040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 00:06:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:06:21,446][71000] Updated weights for policy 0, policy_version 112854 (0.0033) [2024-06-13 00:06:24,692][71000] Updated weights for policy 0, policy_version 112864 (0.0021) [2024-06-13 00:06:25,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.3, 300 sec: 49540.8). Total num frames: 1849212928. Throughput: 0: 49547.2. Samples: 1378025980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 00:06:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:06:28,095][71000] Updated weights for policy 0, policy_version 112874 (0.0025) [2024-06-13 00:06:30,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1849458688. Throughput: 0: 49867.9. Samples: 1378330220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 00:06:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:06:31,389][71000] Updated weights for policy 0, policy_version 112884 (0.0031) [2024-06-13 00:06:34,610][70980] Signal inference workers to stop experience collection... (20200 times) [2024-06-13 00:06:34,656][71000] InferenceWorker_p0-w0: stopping experience collection (20200 times) [2024-06-13 00:06:34,659][70980] Signal inference workers to resume experience collection... 
(20200 times) [2024-06-13 00:06:34,670][71000] InferenceWorker_p0-w0: resuming experience collection (20200 times) [2024-06-13 00:06:34,819][71000] Updated weights for policy 0, policy_version 112894 (0.0036) [2024-06-13 00:06:35,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49425.0, 300 sec: 49540.7). Total num frames: 1849704448. Throughput: 0: 49710.7. Samples: 1378475040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 00:06:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:06:38,438][71000] Updated weights for policy 0, policy_version 112904 (0.0034) [2024-06-13 00:06:40,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1849933824. Throughput: 0: 49429.8. Samples: 1378761440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 00:06:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 00:06:41,462][71000] Updated weights for policy 0, policy_version 112914 (0.0032) [2024-06-13 00:06:44,975][71000] Updated weights for policy 0, policy_version 112924 (0.0040) [2024-06-13 00:06:45,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1850195968. Throughput: 0: 49475.0. Samples: 1379064340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 00:06:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:06:48,022][71000] Updated weights for policy 0, policy_version 112934 (0.0021) [2024-06-13 00:06:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1850441728. Throughput: 0: 49298.8. Samples: 1379209760. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 00:06:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:06:51,382][71000] Updated weights for policy 0, policy_version 112944 (0.0036) [2024-06-13 00:06:54,789][71000] Updated weights for policy 0, policy_version 112954 (0.0029) [2024-06-13 00:06:55,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49540.7). Total num frames: 1850687488. Throughput: 0: 49608.8. Samples: 1379516200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 00:06:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-13 00:06:57,871][71000] Updated weights for policy 0, policy_version 112964 (0.0024) [2024-06-13 00:07:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 1850916864. Throughput: 0: 49420.0. Samples: 1379803380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:07:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:07:01,403][71000] Updated weights for policy 0, policy_version 112974 (0.0031) [2024-06-13 00:07:04,907][71000] Updated weights for policy 0, policy_version 112984 (0.0028) [2024-06-13 00:07:05,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49698.3, 300 sec: 49596.3). Total num frames: 1851179008. Throughput: 0: 49392.8. Samples: 1379949720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:07:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:07:07,987][71000] Updated weights for policy 0, policy_version 112994 (0.0035) [2024-06-13 00:07:10,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 1851424768. Throughput: 0: 49296.9. Samples: 1380244340. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:07:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:07:11,422][71000] Updated weights for policy 0, policy_version 113004 (0.0031) [2024-06-13 00:07:14,761][71000] Updated weights for policy 0, policy_version 113014 (0.0022) [2024-06-13 00:07:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1851670528. Throughput: 0: 49268.4. Samples: 1380547300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:07:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:07:17,798][71000] Updated weights for policy 0, policy_version 113024 (0.0024) [2024-06-13 00:07:20,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1851916288. Throughput: 0: 49296.0. Samples: 1380693360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:07:20,944][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:07:21,413][71000] Updated weights for policy 0, policy_version 113034 (0.0035) [2024-06-13 00:07:24,184][71000] Updated weights for policy 0, policy_version 113044 (0.0029) [2024-06-13 00:07:25,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1852178432. Throughput: 0: 49648.9. Samples: 1380995640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:07:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:07:27,830][71000] Updated weights for policy 0, policy_version 113054 (0.0026) [2024-06-13 00:07:30,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1852424192. Throughput: 0: 49596.0. Samples: 1381296160. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:07:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:07:31,008][71000] Updated weights for policy 0, policy_version 113064 (0.0034) [2024-06-13 00:07:34,245][71000] Updated weights for policy 0, policy_version 113074 (0.0022) [2024-06-13 00:07:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1852669952. Throughput: 0: 49687.7. Samples: 1381445700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:07:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:07:37,693][71000] Updated weights for policy 0, policy_version 113084 (0.0035) [2024-06-13 00:07:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1852899328. Throughput: 0: 49453.4. Samples: 1381741600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:07:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:07:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000113092_1852899328.pth... [2024-06-13 00:07:41,009][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000112369_1841053696.pth [2024-06-13 00:07:41,347][71000] Updated weights for policy 0, policy_version 113094 (0.0024) [2024-06-13 00:07:44,079][71000] Updated weights for policy 0, policy_version 113104 (0.0023) [2024-06-13 00:07:45,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1853177856. Throughput: 0: 49813.9. Samples: 1382045000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:07:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:07:47,676][71000] Updated weights for policy 0, policy_version 113114 (0.0031) [2024-06-13 00:07:50,720][71000] Updated weights for policy 0, policy_version 113124 (0.0025) [2024-06-13 00:07:50,908][70980] Signal inference workers to stop experience collection... 
(20250 times) [2024-06-13 00:07:50,939][70768] Fps is (10 sec: 52429.5, 60 sec: 49698.3, 300 sec: 49485.2). Total num frames: 1853423616. Throughput: 0: 49996.1. Samples: 1382199540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:07:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:07:50,954][71000] InferenceWorker_p0-w0: stopping experience collection (20250 times) [2024-06-13 00:07:50,956][70980] Signal inference workers to resume experience collection... (20250 times) [2024-06-13 00:07:50,966][71000] InferenceWorker_p0-w0: resuming experience collection (20250 times) [2024-06-13 00:07:54,274][71000] Updated weights for policy 0, policy_version 113134 (0.0030) [2024-06-13 00:07:55,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1853669376. Throughput: 0: 49933.8. Samples: 1382491360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:07:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:07:57,221][71000] Updated weights for policy 0, policy_version 113144 (0.0029) [2024-06-13 00:08:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1853898752. Throughput: 0: 49793.9. Samples: 1382788020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:08:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:08:01,016][71000] Updated weights for policy 0, policy_version 113154 (0.0025) [2024-06-13 00:08:03,988][71000] Updated weights for policy 0, policy_version 113164 (0.0033) [2024-06-13 00:08:05,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49971.3, 300 sec: 49540.8). Total num frames: 1854177280. Throughput: 0: 49717.5. Samples: 1382930640. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:08:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:08:07,771][71000] Updated weights for policy 0, policy_version 113174 (0.0030) [2024-06-13 00:08:10,488][71000] Updated weights for policy 0, policy_version 113184 (0.0024) [2024-06-13 00:08:10,940][70768] Fps is (10 sec: 54067.1, 60 sec: 50244.2, 300 sec: 49596.3). Total num frames: 1854439424. Throughput: 0: 49756.9. Samples: 1383234700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:08:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:08:14,362][71000] Updated weights for policy 0, policy_version 113194 (0.0021) [2024-06-13 00:08:15,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1854668800. Throughput: 0: 49744.3. Samples: 1383534660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:08:15,949][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:08:16,951][71000] Updated weights for policy 0, policy_version 113204 (0.0029) [2024-06-13 00:08:20,810][71000] Updated weights for policy 0, policy_version 113214 (0.0033) [2024-06-13 00:08:20,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1854898176. Throughput: 0: 49506.2. Samples: 1383673480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:08:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:08:23,485][71000] Updated weights for policy 0, policy_version 113224 (0.0038) [2024-06-13 00:08:25,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1855160320. Throughput: 0: 49679.7. Samples: 1383977180. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:08:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:08:27,542][71000] Updated weights for policy 0, policy_version 113234 (0.0028) [2024-06-13 00:08:29,852][71000] Updated weights for policy 0, policy_version 113244 (0.0028) [2024-06-13 00:08:30,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49971.1, 300 sec: 49596.3). Total num frames: 1855422464. Throughput: 0: 49522.1. Samples: 1384273500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:08:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:08:34,351][71000] Updated weights for policy 0, policy_version 113254 (0.0032) [2024-06-13 00:08:35,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1855651840. Throughput: 0: 49776.0. Samples: 1384439460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:08:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:08:36,462][71000] Updated weights for policy 0, policy_version 113264 (0.0022) [2024-06-13 00:08:40,720][71000] Updated weights for policy 0, policy_version 113274 (0.0026) [2024-06-13 00:08:40,939][70768] Fps is (10 sec: 45876.1, 60 sec: 49698.3, 300 sec: 49540.8). Total num frames: 1855881216. Throughput: 0: 49709.8. Samples: 1384728300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:08:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:08:43,178][71000] Updated weights for policy 0, policy_version 113284 (0.0029) [2024-06-13 00:08:45,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1856143360. Throughput: 0: 49526.6. Samples: 1385016720. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:08:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:08:47,354][71000] Updated weights for policy 0, policy_version 113294 (0.0022) [2024-06-13 00:08:50,055][71000] Updated weights for policy 0, policy_version 113304 (0.0033) [2024-06-13 00:08:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49485.3). Total num frames: 1856389120. Throughput: 0: 49790.6. Samples: 1385171220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:08:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:08:54,179][71000] Updated weights for policy 0, policy_version 113314 (0.0031) [2024-06-13 00:08:55,939][70768] Fps is (10 sec: 52429.4, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1856667648. Throughput: 0: 49653.5. Samples: 1385469100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:08:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:08:56,488][71000] Updated weights for policy 0, policy_version 113324 (0.0026) [2024-06-13 00:09:00,603][71000] Updated weights for policy 0, policy_version 113334 (0.0028) [2024-06-13 00:09:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1856880640. Throughput: 0: 49825.5. Samples: 1385776800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 00:09:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:09:03,200][71000] Updated weights for policy 0, policy_version 113344 (0.0033) [2024-06-13 00:09:05,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1857126400. Throughput: 0: 49664.0. Samples: 1385908360. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 00:09:05,948][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:09:07,099][71000] Updated weights for policy 0, policy_version 113354 (0.0031) [2024-06-13 00:09:09,604][71000] Updated weights for policy 0, policy_version 113364 (0.0032) [2024-06-13 00:09:10,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1857404928. Throughput: 0: 49677.8. Samples: 1386212680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 00:09:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:09:13,734][71000] Updated weights for policy 0, policy_version 113374 (0.0026) [2024-06-13 00:09:14,734][70980] Signal inference workers to stop experience collection... (20300 times) [2024-06-13 00:09:14,734][70980] Signal inference workers to resume experience collection... (20300 times) [2024-06-13 00:09:14,749][71000] InferenceWorker_p0-w0: stopping experience collection (20300 times) [2024-06-13 00:09:14,749][71000] InferenceWorker_p0-w0: resuming experience collection (20300 times) [2024-06-13 00:09:15,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.2, 300 sec: 49651.9). Total num frames: 1857650688. Throughput: 0: 49639.7. Samples: 1386507280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 00:09:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:09:16,078][71000] Updated weights for policy 0, policy_version 113384 (0.0025) [2024-06-13 00:09:20,388][71000] Updated weights for policy 0, policy_version 113394 (0.0028) [2024-06-13 00:09:20,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1857863680. Throughput: 0: 49170.1. Samples: 1386652120. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 00:09:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:09:22,984][71000] Updated weights for policy 0, policy_version 113404 (0.0030) [2024-06-13 00:09:25,942][70768] Fps is (10 sec: 47501.8, 60 sec: 49423.0, 300 sec: 49540.3). Total num frames: 1858125824. Throughput: 0: 49335.9. Samples: 1386948540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 00:09:25,943][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:09:26,906][71000] Updated weights for policy 0, policy_version 113414 (0.0036) [2024-06-13 00:09:29,656][71000] Updated weights for policy 0, policy_version 113424 (0.0031) [2024-06-13 00:09:30,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1858387968. Throughput: 0: 49580.4. Samples: 1387247840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 00:09:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:09:33,732][71000] Updated weights for policy 0, policy_version 113434 (0.0026) [2024-06-13 00:09:35,940][70768] Fps is (10 sec: 50802.2, 60 sec: 49697.9, 300 sec: 49651.8). Total num frames: 1858633728. Throughput: 0: 49595.8. Samples: 1387403040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 00:09:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:09:36,434][71000] Updated weights for policy 0, policy_version 113444 (0.0030) [2024-06-13 00:09:40,287][71000] Updated weights for policy 0, policy_version 113454 (0.0026) [2024-06-13 00:09:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49698.0, 300 sec: 49540.8). Total num frames: 1858863104. Throughput: 0: 49422.0. Samples: 1387693100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 00:09:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:09:40,993][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000113457_1858879488.pth... 
[2024-06-13 00:09:41,041][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000112730_1846968320.pth [2024-06-13 00:09:42,955][71000] Updated weights for policy 0, policy_version 113464 (0.0026) [2024-06-13 00:09:45,939][70768] Fps is (10 sec: 47514.7, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1859108864. Throughput: 0: 49159.2. Samples: 1387988960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 00:09:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:09:46,913][71000] Updated weights for policy 0, policy_version 113474 (0.0023) [2024-06-13 00:09:49,807][71000] Updated weights for policy 0, policy_version 113484 (0.0032) [2024-06-13 00:09:50,940][70768] Fps is (10 sec: 52429.7, 60 sec: 49971.2, 300 sec: 49651.9). Total num frames: 1859387392. Throughput: 0: 49625.4. Samples: 1388141500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 00:09:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:09:53,250][71000] Updated weights for policy 0, policy_version 113494 (0.0025) [2024-06-13 00:09:55,940][70768] Fps is (10 sec: 50789.1, 60 sec: 49151.8, 300 sec: 49596.3). Total num frames: 1859616768. Throughput: 0: 49598.9. Samples: 1388444640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 00:09:55,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-13 00:09:56,302][71000] Updated weights for policy 0, policy_version 113504 (0.0029) [2024-06-13 00:09:59,974][71000] Updated weights for policy 0, policy_version 113514 (0.0034) [2024-06-13 00:10:00,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49698.0, 300 sec: 49540.7). Total num frames: 1859862528. Throughput: 0: 49535.0. Samples: 1388736360. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 00:10:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:10:03,018][71000] Updated weights for policy 0, policy_version 113524 (0.0046) [2024-06-13 00:10:05,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1860091904. Throughput: 0: 49495.1. Samples: 1388879400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:10:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:10:06,612][71000] Updated weights for policy 0, policy_version 113534 (0.0035) [2024-06-13 00:10:09,549][71000] Updated weights for policy 0, policy_version 113544 (0.0024) [2024-06-13 00:10:10,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49698.1, 300 sec: 49707.4). Total num frames: 1860386816. Throughput: 0: 49631.2. Samples: 1389181820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:10:10,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-13 00:10:13,151][71000] Updated weights for policy 0, policy_version 113554 (0.0030) [2024-06-13 00:10:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49151.9, 300 sec: 49540.8). Total num frames: 1860599808. Throughput: 0: 49683.9. Samples: 1389483620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:10:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:10:16,404][71000] Updated weights for policy 0, policy_version 113564 (0.0026) [2024-06-13 00:10:19,740][71000] Updated weights for policy 0, policy_version 113574 (0.0031) [2024-06-13 00:10:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1860861952. Throughput: 0: 49511.3. Samples: 1389631040. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:10:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:10:23,010][71000] Updated weights for policy 0, policy_version 113584 (0.0033) [2024-06-13 00:10:25,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49427.1, 300 sec: 49596.3). Total num frames: 1861091328. Throughput: 0: 49491.2. Samples: 1389920200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:10:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:10:26,287][70980] Signal inference workers to stop experience collection... (20350 times) [2024-06-13 00:10:26,289][70980] Signal inference workers to resume experience collection... (20350 times) [2024-06-13 00:10:26,328][71000] InferenceWorker_p0-w0: stopping experience collection (20350 times) [2024-06-13 00:10:26,328][71000] InferenceWorker_p0-w0: resuming experience collection (20350 times) [2024-06-13 00:10:26,415][71000] Updated weights for policy 0, policy_version 113594 (0.0022) [2024-06-13 00:10:29,488][71000] Updated weights for policy 0, policy_version 113604 (0.0036) [2024-06-13 00:10:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1861369856. Throughput: 0: 49549.3. Samples: 1390218680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:10:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:10:32,791][71000] Updated weights for policy 0, policy_version 113614 (0.0022) [2024-06-13 00:10:35,745][71000] Updated weights for policy 0, policy_version 113624 (0.0028) [2024-06-13 00:10:35,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 1861615616. Throughput: 0: 49663.0. Samples: 1390376340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:10:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:10:39,311][71000] Updated weights for policy 0, policy_version 113634 (0.0027) [2024-06-13 00:10:40,940][70768] Fps is (10 sec: 50790.5, 60 sec: 50244.4, 300 sec: 49707.4). Total num frames: 1861877760. Throughput: 0: 49510.5. Samples: 1390672600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:10:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:10:42,686][71000] Updated weights for policy 0, policy_version 113644 (0.0035) [2024-06-13 00:10:45,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1862090752. Throughput: 0: 49693.6. Samples: 1390972560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:10:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:10:46,084][71000] Updated weights for policy 0, policy_version 113654 (0.0027) [2024-06-13 00:10:49,435][71000] Updated weights for policy 0, policy_version 113664 (0.0020) [2024-06-13 00:10:50,939][70768] Fps is (10 sec: 44237.0, 60 sec: 48879.0, 300 sec: 49485.3). Total num frames: 1862320128. Throughput: 0: 49733.0. Samples: 1391117380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:10:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:10:52,604][71000] Updated weights for policy 0, policy_version 113674 (0.0030) [2024-06-13 00:10:55,760][71000] Updated weights for policy 0, policy_version 113684 (0.0029) [2024-06-13 00:10:55,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1862598656. Throughput: 0: 49443.5. Samples: 1391406780. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:10:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:10:59,070][71000] Updated weights for policy 0, policy_version 113694 (0.0035) [2024-06-13 00:11:00,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49971.3, 300 sec: 49707.4). Total num frames: 1862860800. Throughput: 0: 49423.3. Samples: 1391707660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:11:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:11:02,454][71000] Updated weights for policy 0, policy_version 113704 (0.0035) [2024-06-13 00:11:05,714][71000] Updated weights for policy 0, policy_version 113714 (0.0029) [2024-06-13 00:11:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49971.3, 300 sec: 49540.8). Total num frames: 1863090176. Throughput: 0: 49836.0. Samples: 1391873660. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 00:11:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:11:08,921][71000] Updated weights for policy 0, policy_version 113724 (0.0036) [2024-06-13 00:11:10,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49151.9, 300 sec: 49596.3). Total num frames: 1863335936. Throughput: 0: 49993.2. Samples: 1392169900. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 00:11:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:11:12,116][71000] Updated weights for policy 0, policy_version 113734 (0.0031) [2024-06-13 00:11:15,327][71000] Updated weights for policy 0, policy_version 113744 (0.0028) [2024-06-13 00:11:15,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.3, 300 sec: 49596.3). Total num frames: 1863581696. Throughput: 0: 49935.6. Samples: 1392465780. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 00:11:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:11:18,761][71000] Updated weights for policy 0, policy_version 113754 (0.0030) [2024-06-13 00:11:20,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49971.1, 300 sec: 49651.8). Total num frames: 1863860224. Throughput: 0: 49756.4. Samples: 1392615380. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 00:11:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:11:21,673][71000] Updated weights for policy 0, policy_version 113764 (0.0031) [2024-06-13 00:11:25,402][71000] Updated weights for policy 0, policy_version 113774 (0.0024) [2024-06-13 00:11:25,780][70980] Signal inference workers to stop experience collection... (20400 times) [2024-06-13 00:11:25,825][71000] InferenceWorker_p0-w0: stopping experience collection (20400 times) [2024-06-13 00:11:25,829][70980] Signal inference workers to resume experience collection... (20400 times) [2024-06-13 00:11:25,844][71000] InferenceWorker_p0-w0: resuming experience collection (20400 times) [2024-06-13 00:11:25,940][70768] Fps is (10 sec: 52428.5, 60 sec: 50244.3, 300 sec: 49651.9). Total num frames: 1864105984. Throughput: 0: 50009.3. Samples: 1392923020. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 00:11:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 00:11:28,588][71000] Updated weights for policy 0, policy_version 113784 (0.0022) [2024-06-13 00:11:30,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1864318976. Throughput: 0: 49960.8. Samples: 1393220800. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 00:11:30,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-13 00:11:31,963][71000] Updated weights for policy 0, policy_version 113794 (0.0041) [2024-06-13 00:11:35,113][71000] Updated weights for policy 0, policy_version 113804 (0.0035) [2024-06-13 00:11:35,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 1864564736. Throughput: 0: 49627.0. Samples: 1393350600. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 00:11:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:11:38,374][71000] Updated weights for policy 0, policy_version 113814 (0.0028) [2024-06-13 00:11:40,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.0, 300 sec: 49651.9). Total num frames: 1864843264. Throughput: 0: 49962.3. Samples: 1393655080. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 00:11:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:11:40,999][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000113822_1864859648.pth... [2024-06-13 00:11:41,042][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000113092_1852899328.pth [2024-06-13 00:11:41,566][71000] Updated weights for policy 0, policy_version 113824 (0.0028) [2024-06-13 00:11:44,997][71000] Updated weights for policy 0, policy_version 113834 (0.0023) [2024-06-13 00:11:45,939][70768] Fps is (10 sec: 55706.0, 60 sec: 50517.3, 300 sec: 49763.0). Total num frames: 1865121792. Throughput: 0: 50104.0. Samples: 1393962340. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 00:11:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:11:48,010][71000] Updated weights for policy 0, policy_version 113844 (0.0037) [2024-06-13 00:11:50,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1865318400. Throughput: 0: 49656.0. Samples: 1394108180. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 00:11:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:11:51,513][71000] Updated weights for policy 0, policy_version 113854 (0.0031) [2024-06-13 00:11:54,923][71000] Updated weights for policy 0, policy_version 113864 (0.0030) [2024-06-13 00:11:55,939][70768] Fps is (10 sec: 44236.8, 60 sec: 49425.2, 300 sec: 49651.9). Total num frames: 1865564160. Throughput: 0: 49771.3. Samples: 1394409600. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 00:11:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:11:57,934][71000] Updated weights for policy 0, policy_version 113874 (0.0034) [2024-06-13 00:12:00,940][70768] Fps is (10 sec: 54066.2, 60 sec: 49971.0, 300 sec: 49762.9). Total num frames: 1865859072. Throughput: 0: 49873.5. Samples: 1394710100. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 00:12:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:12:01,246][71000] Updated weights for policy 0, policy_version 113884 (0.0027) [2024-06-13 00:12:04,218][71000] Updated weights for policy 0, policy_version 113894 (0.0024) [2024-06-13 00:12:05,939][70768] Fps is (10 sec: 54067.2, 60 sec: 50244.3, 300 sec: 49762.9). Total num frames: 1866104832. Throughput: 0: 50090.0. Samples: 1394869420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 00:12:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:12:07,959][71000] Updated weights for policy 0, policy_version 113904 (0.0024) [2024-06-13 00:12:10,940][70768] Fps is (10 sec: 49152.8, 60 sec: 50244.4, 300 sec: 49762.9). Total num frames: 1866350592. Throughput: 0: 50016.0. Samples: 1395173740. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 00:12:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:12:11,011][71000] Updated weights for policy 0, policy_version 113914 (0.0027) [2024-06-13 00:12:14,376][71000] Updated weights for policy 0, policy_version 113924 (0.0024) [2024-06-13 00:12:15,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 1866579968. Throughput: 0: 50059.0. Samples: 1395473460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 00:12:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:12:17,689][71000] Updated weights for policy 0, policy_version 113934 (0.0023) [2024-06-13 00:12:20,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.3, 300 sec: 49707.4). Total num frames: 1866842112. Throughput: 0: 50150.7. Samples: 1395607380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 00:12:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:12:21,293][71000] Updated weights for policy 0, policy_version 113944 (0.0032) [2024-06-13 00:12:24,258][71000] Updated weights for policy 0, policy_version 113954 (0.0024) [2024-06-13 00:12:25,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49971.1, 300 sec: 49762.9). Total num frames: 1867104256. Throughput: 0: 50021.7. Samples: 1395906060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 00:12:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:12:27,645][71000] Updated weights for policy 0, policy_version 113964 (0.0031) [2024-06-13 00:12:30,327][70980] Signal inference workers to stop experience collection... (20450 times) [2024-06-13 00:12:30,328][70980] Signal inference workers to resume experience collection... 
(20450 times) [2024-06-13 00:12:30,349][71000] InferenceWorker_p0-w0: stopping experience collection (20450 times) [2024-06-13 00:12:30,349][71000] InferenceWorker_p0-w0: resuming experience collection (20450 times) [2024-06-13 00:12:30,866][71000] Updated weights for policy 0, policy_version 113974 (0.0027) [2024-06-13 00:12:30,940][70768] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 49762.9). Total num frames: 1867350016. Throughput: 0: 49953.6. Samples: 1396210260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 00:12:30,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-13 00:12:34,673][71000] Updated weights for policy 0, policy_version 113984 (0.0032) [2024-06-13 00:12:35,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49971.2, 300 sec: 49707.4). Total num frames: 1867563008. Throughput: 0: 49869.3. Samples: 1396352300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 00:12:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:12:37,597][71000] Updated weights for policy 0, policy_version 113994 (0.0025) [2024-06-13 00:12:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 1867825152. Throughput: 0: 49798.1. Samples: 1396650520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 00:12:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:12:41,216][71000] Updated weights for policy 0, policy_version 114004 (0.0026) [2024-06-13 00:12:44,048][71000] Updated weights for policy 0, policy_version 114014 (0.0028) [2024-06-13 00:12:45,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.0, 300 sec: 49707.4). Total num frames: 1868087296. Throughput: 0: 49760.6. Samples: 1396949320. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 00:12:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-13 00:12:47,926][71000] Updated weights for policy 0, policy_version 114024 (0.0029) [2024-06-13 00:12:50,487][71000] Updated weights for policy 0, policy_version 114034 (0.0027) [2024-06-13 00:12:50,940][70768] Fps is (10 sec: 52428.4, 60 sec: 50517.2, 300 sec: 49762.9). Total num frames: 1868349440. Throughput: 0: 49773.6. Samples: 1397109240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 00:12:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:12:54,132][71000] Updated weights for policy 0, policy_version 114044 (0.0027) [2024-06-13 00:12:55,940][70768] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 49762.9). Total num frames: 1868578816. Throughput: 0: 49668.0. Samples: 1397408800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 00:12:55,944][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 00:12:57,069][71000] Updated weights for policy 0, policy_version 114054 (0.0023) [2024-06-13 00:13:00,786][71000] Updated weights for policy 0, policy_version 114064 (0.0033) [2024-06-13 00:13:00,939][70768] Fps is (10 sec: 47514.4, 60 sec: 49425.3, 300 sec: 49651.9). Total num frames: 1868824576. Throughput: 0: 49658.8. Samples: 1397708100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 00:13:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 00:13:03,644][71000] Updated weights for policy 0, policy_version 114074 (0.0030) [2024-06-13 00:13:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1869070336. Throughput: 0: 49817.3. Samples: 1397849160. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 00:13:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:13:07,182][71000] Updated weights for policy 0, policy_version 114084 (0.0031) [2024-06-13 00:13:10,322][71000] Updated weights for policy 0, policy_version 114094 (0.0020) [2024-06-13 00:13:10,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49698.0, 300 sec: 49707.4). Total num frames: 1869332480. Throughput: 0: 49972.9. Samples: 1398154840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 00:13:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:13:13,806][71000] Updated weights for policy 0, policy_version 114104 (0.0034) [2024-06-13 00:13:15,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49971.3, 300 sec: 49762.9). Total num frames: 1869578240. Throughput: 0: 50084.1. Samples: 1398464040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 00:13:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:13:16,852][71000] Updated weights for policy 0, policy_version 114114 (0.0025) [2024-06-13 00:13:20,098][71000] Updated weights for policy 0, policy_version 114124 (0.0031) [2024-06-13 00:13:20,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49698.1, 300 sec: 49707.4). Total num frames: 1869824000. Throughput: 0: 50132.5. Samples: 1398608260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 00:13:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:13:23,142][71000] Updated weights for policy 0, policy_version 114134 (0.0029) [2024-06-13 00:13:25,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.3, 300 sec: 49707.4). Total num frames: 1870086144. Throughput: 0: 49986.3. Samples: 1398899900. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 00:13:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:13:27,125][71000] Updated weights for policy 0, policy_version 114144 (0.0033) [2024-06-13 00:13:29,826][71000] Updated weights for policy 0, policy_version 114154 (0.0028) [2024-06-13 00:13:30,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.2, 300 sec: 49762.9). Total num frames: 1870331904. Throughput: 0: 50030.7. Samples: 1399200700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 00:13:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:13:33,490][71000] Updated weights for policy 0, policy_version 114164 (0.0031) [2024-06-13 00:13:35,939][70768] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 49874.0). Total num frames: 1870594048. Throughput: 0: 49984.2. Samples: 1399358520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 00:13:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:13:36,212][71000] Updated weights for policy 0, policy_version 114174 (0.0025) [2024-06-13 00:13:37,454][70980] Signal inference workers to stop experience collection... (20500 times) [2024-06-13 00:13:37,454][70980] Signal inference workers to resume experience collection... (20500 times) [2024-06-13 00:13:37,466][71000] InferenceWorker_p0-w0: stopping experience collection (20500 times) [2024-06-13 00:13:37,467][71000] InferenceWorker_p0-w0: resuming experience collection (20500 times) [2024-06-13 00:13:40,085][71000] Updated weights for policy 0, policy_version 114184 (0.0024) [2024-06-13 00:13:40,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49971.1, 300 sec: 49762.9). Total num frames: 1870823424. Throughput: 0: 49878.9. Samples: 1399653360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 00:13:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:13:41,062][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000114187_1870839808.pth... 
[2024-06-13 00:13:41,108][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000113457_1858879488.pth [2024-06-13 00:13:42,954][71000] Updated weights for policy 0, policy_version 114194 (0.0034) [2024-06-13 00:13:45,939][70768] Fps is (10 sec: 45875.2, 60 sec: 49425.1, 300 sec: 49707.4). Total num frames: 1871052800. Throughput: 0: 49813.8. Samples: 1399949720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 00:13:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:13:46,579][71000] Updated weights for policy 0, policy_version 114204 (0.0024) [2024-06-13 00:13:49,410][71000] Updated weights for policy 0, policy_version 114214 (0.0030) [2024-06-13 00:13:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.1, 300 sec: 49707.4). Total num frames: 1871331328. Throughput: 0: 50015.4. Samples: 1400099860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 00:13:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:13:53,093][71000] Updated weights for policy 0, policy_version 114224 (0.0033) [2024-06-13 00:13:55,849][71000] Updated weights for policy 0, policy_version 114234 (0.0036) [2024-06-13 00:13:55,940][70768] Fps is (10 sec: 55705.4, 60 sec: 50517.3, 300 sec: 49929.6). Total num frames: 1871609856. Throughput: 0: 49914.4. Samples: 1400400980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 00:13:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:13:59,736][71000] Updated weights for policy 0, policy_version 114244 (0.0024) [2024-06-13 00:14:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49971.0, 300 sec: 49818.4). Total num frames: 1871822848. Throughput: 0: 49922.9. Samples: 1400710580. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 00:14:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:14:02,458][71000] Updated weights for policy 0, policy_version 114254 (0.0025) [2024-06-13 00:14:05,939][70768] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 49762.9). Total num frames: 1872084992. Throughput: 0: 49923.6. Samples: 1400854820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 00:14:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:14:06,544][71000] Updated weights for policy 0, policy_version 114264 (0.0021) [2024-06-13 00:14:09,395][71000] Updated weights for policy 0, policy_version 114274 (0.0039) [2024-06-13 00:14:10,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49971.3, 300 sec: 49762.9). Total num frames: 1872330752. Throughput: 0: 49834.2. Samples: 1401142440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 00:14:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:14:13,167][71000] Updated weights for policy 0, policy_version 114284 (0.0036) [2024-06-13 00:14:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49971.1, 300 sec: 49874.0). Total num frames: 1872576512. Throughput: 0: 49599.1. Samples: 1401432660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 00:14:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:14:16,050][71000] Updated weights for policy 0, policy_version 114294 (0.0029) [2024-06-13 00:14:19,614][71000] Updated weights for policy 0, policy_version 114304 (0.0026) [2024-06-13 00:14:20,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49971.3, 300 sec: 49818.9). Total num frames: 1872822272. Throughput: 0: 49600.0. Samples: 1401590520. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 00:14:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:14:22,359][71000] Updated weights for policy 0, policy_version 114314 (0.0025) [2024-06-13 00:14:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.0, 300 sec: 49707.4). Total num frames: 1873051648. Throughput: 0: 49637.5. Samples: 1401887040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 00:14:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:14:26,350][71000] Updated weights for policy 0, policy_version 114324 (0.0025) [2024-06-13 00:14:29,317][71000] Updated weights for policy 0, policy_version 114334 (0.0034) [2024-06-13 00:14:30,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.2, 300 sec: 49763.0). Total num frames: 1873313792. Throughput: 0: 49636.0. Samples: 1402183340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 00:14:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:14:33,292][71000] Updated weights for policy 0, policy_version 114344 (0.0023) [2024-06-13 00:14:35,725][71000] Updated weights for policy 0, policy_version 114354 (0.0033) [2024-06-13 00:14:35,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49698.1, 300 sec: 49874.0). Total num frames: 1873575936. Throughput: 0: 49730.4. Samples: 1402337720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 00:14:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:14:39,718][71000] Updated weights for policy 0, policy_version 114364 (0.0026) [2024-06-13 00:14:40,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.3, 300 sec: 49818.5). Total num frames: 1873805312. Throughput: 0: 49468.5. Samples: 1402627060. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 00:14:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:14:42,519][71000] Updated weights for policy 0, policy_version 114374 (0.0030) [2024-06-13 00:14:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49971.2, 300 sec: 49707.4). Total num frames: 1874051072. Throughput: 0: 49164.6. Samples: 1402922980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 00:14:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:14:46,386][71000] Updated weights for policy 0, policy_version 114384 (0.0030) [2024-06-13 00:14:47,326][70980] Signal inference workers to stop experience collection... (20550 times) [2024-06-13 00:14:47,377][71000] InferenceWorker_p0-w0: stopping experience collection (20550 times) [2024-06-13 00:14:47,377][70980] Signal inference workers to resume experience collection... (20550 times) [2024-06-13 00:14:47,392][71000] InferenceWorker_p0-w0: resuming experience collection (20550 times) [2024-06-13 00:14:49,068][71000] Updated weights for policy 0, policy_version 114394 (0.0028) [2024-06-13 00:14:50,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.2, 300 sec: 49818.5). Total num frames: 1874313216. Throughput: 0: 49441.7. Samples: 1403079700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 00:14:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:14:53,011][71000] Updated weights for policy 0, policy_version 114404 (0.0028) [2024-06-13 00:14:55,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.8, 300 sec: 49762.9). Total num frames: 1874542592. Throughput: 0: 49383.5. Samples: 1403364700. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 00:14:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:14:56,004][71000] Updated weights for policy 0, policy_version 114414 (0.0039) [2024-06-13 00:14:59,579][71000] Updated weights for policy 0, policy_version 114424 (0.0031) [2024-06-13 00:15:00,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.2, 300 sec: 49818.5). Total num frames: 1874788352. Throughput: 0: 49362.3. Samples: 1403653960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 00:15:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:15:02,541][71000] Updated weights for policy 0, policy_version 114434 (0.0023) [2024-06-13 00:15:05,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.8, 300 sec: 49596.3). Total num frames: 1875017728. Throughput: 0: 49126.1. Samples: 1403801200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 00:15:05,942][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:15:06,381][71000] Updated weights for policy 0, policy_version 114444 (0.0030) [2024-06-13 00:15:09,481][71000] Updated weights for policy 0, policy_version 114454 (0.0022) [2024-06-13 00:15:10,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.9, 300 sec: 49707.4). Total num frames: 1875263488. Throughput: 0: 49273.7. Samples: 1404104360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-13 00:15:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:15:12,919][71000] Updated weights for policy 0, policy_version 114464 (0.0031) [2024-06-13 00:15:15,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48879.0, 300 sec: 49651.9). Total num frames: 1875509248. Throughput: 0: 49170.7. Samples: 1404396020. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-13 00:15:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:15:16,083][71000] Updated weights for policy 0, policy_version 114474 (0.0031) [2024-06-13 00:15:19,650][71000] Updated weights for policy 0, policy_version 114484 (0.0045) [2024-06-13 00:15:20,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.0, 300 sec: 49762.9). Total num frames: 1875771392. Throughput: 0: 48906.7. Samples: 1404538520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-13 00:15:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:15:22,890][71000] Updated weights for policy 0, policy_version 114494 (0.0031) [2024-06-13 00:15:25,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 49651.8). Total num frames: 1876017152. Throughput: 0: 49180.3. Samples: 1404840180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-13 00:15:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:15:26,007][71000] Updated weights for policy 0, policy_version 114504 (0.0028) [2024-06-13 00:15:29,403][71000] Updated weights for policy 0, policy_version 114514 (0.0022) [2024-06-13 00:15:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49707.4). Total num frames: 1876279296. Throughput: 0: 49284.4. Samples: 1405140780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-13 00:15:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:15:32,729][71000] Updated weights for policy 0, policy_version 114524 (0.0027) [2024-06-13 00:15:35,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48878.9, 300 sec: 49596.3). Total num frames: 1876508672. Throughput: 0: 48983.2. Samples: 1405283940. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-13 00:15:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:15:36,226][71000] Updated weights for policy 0, policy_version 114534 (0.0027) [2024-06-13 00:15:39,526][71000] Updated weights for policy 0, policy_version 114544 (0.0034) [2024-06-13 00:15:40,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49762.9). Total num frames: 1876770816. Throughput: 0: 49177.9. Samples: 1405577700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-13 00:15:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:15:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000114549_1876770816.pth... [2024-06-13 00:15:41,007][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000113822_1864859648.pth [2024-06-13 00:15:42,889][71000] Updated weights for policy 0, policy_version 114554 (0.0025) [2024-06-13 00:15:45,883][71000] Updated weights for policy 0, policy_version 114564 (0.0033) [2024-06-13 00:15:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49818.5). Total num frames: 1877016576. Throughput: 0: 49360.9. Samples: 1405875200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-13 00:15:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:15:49,445][71000] Updated weights for policy 0, policy_version 114574 (0.0022) [2024-06-13 00:15:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49707.4). Total num frames: 1877262336. Throughput: 0: 49475.2. Samples: 1406027580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-13 00:15:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:15:52,402][71000] Updated weights for policy 0, policy_version 114584 (0.0027) [2024-06-13 00:15:55,942][70768] Fps is (10 sec: 45865.2, 60 sec: 48877.2, 300 sec: 49540.4). Total num frames: 1877475328. Throughput: 0: 49419.0. Samples: 1406328320. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-13 00:15:55,942][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:15:56,179][71000] Updated weights for policy 0, policy_version 114594 (0.0020) [2024-06-13 00:15:57,191][70980] Signal inference workers to stop experience collection... (20600 times) [2024-06-13 00:15:57,193][70980] Signal inference workers to resume experience collection... (20600 times) [2024-06-13 00:15:57,215][71000] InferenceWorker_p0-w0: stopping experience collection (20600 times) [2024-06-13 00:15:57,215][71000] InferenceWorker_p0-w0: resuming experience collection (20600 times) [2024-06-13 00:15:59,022][71000] Updated weights for policy 0, policy_version 114604 (0.0036) [2024-06-13 00:16:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49707.4). Total num frames: 1877753856. Throughput: 0: 49287.5. Samples: 1406613960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-13 00:16:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:16:02,639][71000] Updated weights for policy 0, policy_version 114614 (0.0026) [2024-06-13 00:16:05,681][71000] Updated weights for policy 0, policy_version 114624 (0.0033) [2024-06-13 00:16:05,940][70768] Fps is (10 sec: 52440.3, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 1877999616. Throughput: 0: 49640.4. Samples: 1406772340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-13 00:16:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:16:09,517][71000] Updated weights for policy 0, policy_version 114634 (0.0033) [2024-06-13 00:16:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49651.8). Total num frames: 1878228992. Throughput: 0: 49506.7. Samples: 1407067980. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 00:16:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:16:12,338][71000] Updated weights for policy 0, policy_version 114644 (0.0025) [2024-06-13 00:16:15,940][70768] Fps is (10 sec: 45874.1, 60 sec: 49151.8, 300 sec: 49485.2). Total num frames: 1878458368. Throughput: 0: 49314.9. Samples: 1407359960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 00:16:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:16:16,193][71000] Updated weights for policy 0, policy_version 114654 (0.0033) [2024-06-13 00:16:19,052][71000] Updated weights for policy 0, policy_version 114664 (0.0028) [2024-06-13 00:16:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1878736896. Throughput: 0: 49336.4. Samples: 1407504080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 00:16:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:16:23,179][71000] Updated weights for policy 0, policy_version 114674 (0.0024) [2024-06-13 00:16:25,639][71000] Updated weights for policy 0, policy_version 114684 (0.0037) [2024-06-13 00:16:25,940][70768] Fps is (10 sec: 54068.0, 60 sec: 49698.1, 300 sec: 49762.9). Total num frames: 1878999040. Throughput: 0: 49286.2. Samples: 1407795580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 00:16:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:16:29,768][71000] Updated weights for policy 0, policy_version 114694 (0.0030) [2024-06-13 00:16:30,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 49651.8). Total num frames: 1879212032. Throughput: 0: 49281.8. Samples: 1408092880. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 00:16:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:16:32,284][71000] Updated weights for policy 0, policy_version 114704 (0.0028) [2024-06-13 00:16:35,940][70768] Fps is (10 sec: 44237.2, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 1879441408. Throughput: 0: 49017.8. Samples: 1408233380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 00:16:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:16:36,362][71000] Updated weights for policy 0, policy_version 114714 (0.0026) [2024-06-13 00:16:38,854][71000] Updated weights for policy 0, policy_version 114724 (0.0034) [2024-06-13 00:16:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1879719936. Throughput: 0: 48957.9. Samples: 1408531320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 00:16:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:16:43,060][71000] Updated weights for policy 0, policy_version 114734 (0.0036) [2024-06-13 00:16:45,335][71000] Updated weights for policy 0, policy_version 114744 (0.0027) [2024-06-13 00:16:45,939][70768] Fps is (10 sec: 55706.1, 60 sec: 49698.2, 300 sec: 49762.9). Total num frames: 1879998464. Throughput: 0: 49221.0. Samples: 1408828900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 00:16:45,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 00:16:49,681][71000] Updated weights for policy 0, policy_version 114754 (0.0031) [2024-06-13 00:16:50,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49651.8). Total num frames: 1880211456. Throughput: 0: 49194.2. Samples: 1408986080. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 00:16:50,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 00:16:51,962][71000] Updated weights for policy 0, policy_version 114764 (0.0026) [2024-06-13 00:16:55,927][71000] Updated weights for policy 0, policy_version 114774 (0.0031) [2024-06-13 00:16:55,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49700.0, 300 sec: 49485.3). Total num frames: 1880457216. Throughput: 0: 49209.4. Samples: 1409282400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 00:16:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:16:58,525][71000] Updated weights for policy 0, policy_version 114784 (0.0028) [2024-06-13 00:17:00,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1880719360. Throughput: 0: 49301.2. Samples: 1409578500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 00:17:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:17:02,778][71000] Updated weights for policy 0, policy_version 114794 (0.0034) [2024-06-13 00:17:04,609][70980] Signal inference workers to stop experience collection... (20650 times) [2024-06-13 00:17:04,659][71000] InferenceWorker_p0-w0: stopping experience collection (20650 times) [2024-06-13 00:17:04,666][70980] Signal inference workers to resume experience collection... (20650 times) [2024-06-13 00:17:04,676][71000] InferenceWorker_p0-w0: resuming experience collection (20650 times) [2024-06-13 00:17:05,539][71000] Updated weights for policy 0, policy_version 114804 (0.0023) [2024-06-13 00:17:05,940][70768] Fps is (10 sec: 54063.6, 60 sec: 49970.7, 300 sec: 49651.8). Total num frames: 1880997888. Throughput: 0: 49639.8. Samples: 1409737900. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 00:17:05,941][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 00:17:09,516][71000] Updated weights for policy 0, policy_version 114814 (0.0024) [2024-06-13 00:17:10,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49698.0, 300 sec: 49596.3). Total num frames: 1881210880. Throughput: 0: 49694.6. Samples: 1410031840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 00:17:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:17:11,959][71000] Updated weights for policy 0, policy_version 114824 (0.0032) [2024-06-13 00:17:15,940][70768] Fps is (10 sec: 42601.1, 60 sec: 49425.3, 300 sec: 49429.7). Total num frames: 1881423872. Throughput: 0: 49583.2. Samples: 1410324120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:17:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:17:15,972][71000] Updated weights for policy 0, policy_version 114834 (0.0024) [2024-06-13 00:17:18,696][71000] Updated weights for policy 0, policy_version 114844 (0.0026) [2024-06-13 00:17:20,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1881702400. Throughput: 0: 49757.6. Samples: 1410472480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:17:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:17:22,587][71000] Updated weights for policy 0, policy_version 114854 (0.0024) [2024-06-13 00:17:25,087][71000] Updated weights for policy 0, policy_version 114864 (0.0025) [2024-06-13 00:17:25,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1881964544. Throughput: 0: 49878.2. Samples: 1410775840. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:17:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:17:29,421][71000] Updated weights for policy 0, policy_version 114874 (0.0024) [2024-06-13 00:17:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49697.9, 300 sec: 49596.3). Total num frames: 1882193920. Throughput: 0: 49692.1. Samples: 1411065060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:17:30,941][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:17:31,976][71000] Updated weights for policy 0, policy_version 114884 (0.0030) [2024-06-13 00:17:35,940][70768] Fps is (10 sec: 44236.7, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1882406912. Throughput: 0: 49219.1. Samples: 1411200940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:17:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:17:35,986][71000] Updated weights for policy 0, policy_version 114894 (0.0034) [2024-06-13 00:17:38,671][71000] Updated weights for policy 0, policy_version 114904 (0.0028) [2024-06-13 00:17:40,940][70768] Fps is (10 sec: 47514.6, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1882669056. Throughput: 0: 49187.5. Samples: 1411495840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:17:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:17:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000114909_1882669056.pth... [2024-06-13 00:17:41,005][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000114187_1870839808.pth [2024-06-13 00:17:42,628][71000] Updated weights for policy 0, policy_version 114914 (0.0027) [2024-06-13 00:17:45,451][71000] Updated weights for policy 0, policy_version 114924 (0.0031) [2024-06-13 00:17:45,939][70768] Fps is (10 sec: 54067.7, 60 sec: 49152.0, 300 sec: 49485.3). Total num frames: 1882947584. Throughput: 0: 49398.2. Samples: 1411801420. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:17:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:17:49,110][71000] Updated weights for policy 0, policy_version 114934 (0.0033) [2024-06-13 00:17:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1883160576. Throughput: 0: 49007.8. Samples: 1411943220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:17:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:17:51,986][71000] Updated weights for policy 0, policy_version 114944 (0.0027) [2024-06-13 00:17:55,858][71000] Updated weights for policy 0, policy_version 114954 (0.0033) [2024-06-13 00:17:55,940][70768] Fps is (10 sec: 45874.4, 60 sec: 49151.8, 300 sec: 49429.7). Total num frames: 1883406336. Throughput: 0: 48973.4. Samples: 1412235640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:17:55,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 00:17:58,499][71000] Updated weights for policy 0, policy_version 114964 (0.0028) [2024-06-13 00:18:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 1883652096. Throughput: 0: 49140.0. Samples: 1412535420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:18:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:18:02,327][71000] Updated weights for policy 0, policy_version 114974 (0.0027) [2024-06-13 00:18:04,399][70980] Signal inference workers to stop experience collection... (20700 times) [2024-06-13 00:18:04,446][71000] InferenceWorker_p0-w0: stopping experience collection (20700 times) [2024-06-13 00:18:04,507][70980] Signal inference workers to resume experience collection... 
(20700 times) [2024-06-13 00:18:04,507][71000] InferenceWorker_p0-w0: resuming experience collection (20700 times) [2024-06-13 00:18:05,238][71000] Updated weights for policy 0, policy_version 114984 (0.0026) [2024-06-13 00:18:05,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48606.3, 300 sec: 49429.7). Total num frames: 1883914240. Throughput: 0: 49179.1. Samples: 1412685540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:18:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:18:08,784][71000] Updated weights for policy 0, policy_version 114994 (0.0026) [2024-06-13 00:18:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.1, 300 sec: 49374.1). Total num frames: 1884143616. Throughput: 0: 48995.1. Samples: 1412980620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:18:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:18:11,969][71000] Updated weights for policy 0, policy_version 115004 (0.0029) [2024-06-13 00:18:15,299][71000] Updated weights for policy 0, policy_version 115014 (0.0027) [2024-06-13 00:18:15,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1884422144. Throughput: 0: 49304.8. Samples: 1413283760. Policy #0 lag: (min: 2.0, avg: 12.5, max: 22.0) [2024-06-13 00:18:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:18:18,623][71000] Updated weights for policy 0, policy_version 115024 (0.0025) [2024-06-13 00:18:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 1884651520. Throughput: 0: 49546.6. Samples: 1413430540. 
Policy #0 lag: (min: 2.0, avg: 12.5, max: 22.0) [2024-06-13 00:18:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:18:22,002][71000] Updated weights for policy 0, policy_version 115034 (0.0032) [2024-06-13 00:18:25,158][71000] Updated weights for policy 0, policy_version 115044 (0.0026) [2024-06-13 00:18:25,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1884913664. Throughput: 0: 49788.5. Samples: 1413736320. Policy #0 lag: (min: 2.0, avg: 12.5, max: 22.0) [2024-06-13 00:18:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:18:28,314][71000] Updated weights for policy 0, policy_version 115054 (0.0020) [2024-06-13 00:18:30,939][70768] Fps is (10 sec: 47514.3, 60 sec: 48879.2, 300 sec: 49263.1). Total num frames: 1885126656. Throughput: 0: 49646.7. Samples: 1414035520. Policy #0 lag: (min: 2.0, avg: 12.5, max: 22.0) [2024-06-13 00:18:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:18:31,896][71000] Updated weights for policy 0, policy_version 115064 (0.0033) [2024-06-13 00:18:34,981][71000] Updated weights for policy 0, policy_version 115074 (0.0026) [2024-06-13 00:18:35,940][70768] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 49485.3). Total num frames: 1885421568. Throughput: 0: 49482.7. Samples: 1414169940. Policy #0 lag: (min: 2.0, avg: 12.5, max: 22.0) [2024-06-13 00:18:35,948][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:18:38,647][71000] Updated weights for policy 0, policy_version 115084 (0.0028) [2024-06-13 00:18:40,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1885634560. Throughput: 0: 49519.3. Samples: 1414464000. 
Policy #0 lag: (min: 2.0, avg: 12.5, max: 22.0) [2024-06-13 00:18:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:18:41,809][71000] Updated weights for policy 0, policy_version 115094 (0.0032) [2024-06-13 00:18:45,274][71000] Updated weights for policy 0, policy_version 115104 (0.0030) [2024-06-13 00:18:45,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 1885896704. Throughput: 0: 49431.5. Samples: 1414759840. Policy #0 lag: (min: 2.0, avg: 12.5, max: 22.0) [2024-06-13 00:18:45,949][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:18:48,434][71000] Updated weights for policy 0, policy_version 115114 (0.0027) [2024-06-13 00:18:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 1886126080. Throughput: 0: 49404.0. Samples: 1414908720. Policy #0 lag: (min: 2.0, avg: 12.5, max: 22.0) [2024-06-13 00:18:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:18:51,966][71000] Updated weights for policy 0, policy_version 115124 (0.0034) [2024-06-13 00:18:54,753][71000] Updated weights for policy 0, policy_version 115134 (0.0033) [2024-06-13 00:18:55,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49971.3, 300 sec: 49429.7). Total num frames: 1886404608. Throughput: 0: 49586.6. Samples: 1415212020. Policy #0 lag: (min: 2.0, avg: 12.5, max: 22.0) [2024-06-13 00:18:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:18:58,794][71000] Updated weights for policy 0, policy_version 115144 (0.0027) [2024-06-13 00:19:00,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 1886633984. Throughput: 0: 49387.5. Samples: 1415506200. 
Policy #0 lag: (min: 2.0, avg: 12.5, max: 22.0) [2024-06-13 00:19:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:19:01,349][71000] Updated weights for policy 0, policy_version 115154 (0.0023) [2024-06-13 00:19:05,267][71000] Updated weights for policy 0, policy_version 115164 (0.0031) [2024-06-13 00:19:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 1886896128. Throughput: 0: 49490.3. Samples: 1415657600. Policy #0 lag: (min: 2.0, avg: 12.5, max: 22.0) [2024-06-13 00:19:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:19:07,986][71000] Updated weights for policy 0, policy_version 115174 (0.0026) [2024-06-13 00:19:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 1887109120. Throughput: 0: 49333.8. Samples: 1415956340. Policy #0 lag: (min: 2.0, avg: 12.5, max: 22.0) [2024-06-13 00:19:10,943][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:19:11,962][71000] Updated weights for policy 0, policy_version 115184 (0.0027) [2024-06-13 00:19:14,769][71000] Updated weights for policy 0, policy_version 115194 (0.0032) [2024-06-13 00:19:14,770][70980] Signal inference workers to stop experience collection... (20750 times) [2024-06-13 00:19:14,770][70980] Signal inference workers to resume experience collection... (20750 times) [2024-06-13 00:19:14,815][71000] InferenceWorker_p0-w0: stopping experience collection (20750 times) [2024-06-13 00:19:14,815][71000] InferenceWorker_p0-w0: resuming experience collection (20750 times) [2024-06-13 00:19:15,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1887387648. Throughput: 0: 48854.2. Samples: 1416233960. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 00:19:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 00:19:18,661][71000] Updated weights for policy 0, policy_version 115204 (0.0031) [2024-06-13 00:19:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1887617024. Throughput: 0: 49574.6. Samples: 1416400800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 00:19:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:19:21,280][71000] Updated weights for policy 0, policy_version 115214 (0.0026) [2024-06-13 00:19:25,548][71000] Updated weights for policy 0, policy_version 115224 (0.0034) [2024-06-13 00:19:25,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48605.9, 300 sec: 49207.5). Total num frames: 1887830016. Throughput: 0: 49280.0. Samples: 1416681600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 00:19:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:19:28,079][71000] Updated weights for policy 0, policy_version 115234 (0.0034) [2024-06-13 00:19:30,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 1888108544. Throughput: 0: 49329.4. Samples: 1416979660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 00:19:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:19:32,175][71000] Updated weights for policy 0, policy_version 115244 (0.0031) [2024-06-13 00:19:34,581][71000] Updated weights for policy 0, policy_version 115254 (0.0023) [2024-06-13 00:19:35,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1888370688. Throughput: 0: 49477.5. Samples: 1417135200. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 00:19:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:19:38,713][71000] Updated weights for policy 0, policy_version 115264 (0.0029) [2024-06-13 00:19:40,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1888632832. Throughput: 0: 49474.3. Samples: 1417438360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 00:19:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:19:41,047][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000115274_1888649216.pth... [2024-06-13 00:19:41,053][71000] Updated weights for policy 0, policy_version 115274 (0.0029) [2024-06-13 00:19:41,097][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000114549_1876770816.pth [2024-06-13 00:19:45,520][71000] Updated weights for policy 0, policy_version 115284 (0.0024) [2024-06-13 00:19:45,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 1888845824. Throughput: 0: 49598.3. Samples: 1417738120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 00:19:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:19:47,771][71000] Updated weights for policy 0, policy_version 115294 (0.0030) [2024-06-13 00:19:50,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1889091584. Throughput: 0: 48998.2. Samples: 1417862520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 00:19:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:19:52,068][71000] Updated weights for policy 0, policy_version 115304 (0.0025) [2024-06-13 00:19:54,291][71000] Updated weights for policy 0, policy_version 115314 (0.0032) [2024-06-13 00:19:55,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1889370112. Throughput: 0: 49060.0. Samples: 1418164040. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 00:19:55,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:19:58,485][71000] Updated weights for policy 0, policy_version 115324 (0.0029) [2024-06-13 00:20:00,939][70768] Fps is (10 sec: 52429.4, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 1889615872. Throughput: 0: 49754.3. Samples: 1418472900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 00:20:00,940][70768] Avg episode reward: [(0, '0.259')] [2024-06-13 00:20:00,981][71000] Updated weights for policy 0, policy_version 115334 (0.0034) [2024-06-13 00:20:05,416][71000] Updated weights for policy 0, policy_version 115344 (0.0028) [2024-06-13 00:20:05,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 1889828864. Throughput: 0: 49357.4. Samples: 1418621880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 00:20:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:20:07,434][71000] Updated weights for policy 0, policy_version 115354 (0.0032) [2024-06-13 00:20:10,940][70768] Fps is (10 sec: 45870.6, 60 sec: 49424.3, 300 sec: 49374.0). Total num frames: 1890074624. Throughput: 0: 49595.8. Samples: 1418913460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 00:20:10,941][70768] Avg episode reward: [(0, '0.259')] [2024-06-13 00:20:11,851][71000] Updated weights for policy 0, policy_version 115364 (0.0033) [2024-06-13 00:20:14,167][71000] Updated weights for policy 0, policy_version 115374 (0.0031) [2024-06-13 00:20:15,940][70768] Fps is (10 sec: 54066.3, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1890369536. Throughput: 0: 49649.7. Samples: 1419213900. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 00:20:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:20:18,537][71000] Updated weights for policy 0, policy_version 115384 (0.0028) [2024-06-13 00:20:20,255][70980] Signal inference workers to stop experience collection... (20800 times) [2024-06-13 00:20:20,255][70980] Signal inference workers to resume experience collection... (20800 times) [2024-06-13 00:20:20,279][71000] InferenceWorker_p0-w0: stopping experience collection (20800 times) [2024-06-13 00:20:20,279][71000] InferenceWorker_p0-w0: resuming experience collection (20800 times) [2024-06-13 00:20:20,597][71000] Updated weights for policy 0, policy_version 115394 (0.0026) [2024-06-13 00:20:20,940][70768] Fps is (10 sec: 54072.0, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1890615296. Throughput: 0: 49845.3. Samples: 1419378240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 00:20:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:20:25,298][71000] Updated weights for policy 0, policy_version 115404 (0.0035) [2024-06-13 00:20:25,940][70768] Fps is (10 sec: 47513.4, 60 sec: 50244.1, 300 sec: 49374.1). Total num frames: 1890844672. Throughput: 0: 49608.3. Samples: 1419670740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 00:20:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:20:27,332][71000] Updated weights for policy 0, policy_version 115414 (0.0023) [2024-06-13 00:20:30,939][70768] Fps is (10 sec: 45875.9, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1891074048. Throughput: 0: 49512.5. Samples: 1419966180. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 00:20:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:20:31,665][71000] Updated weights for policy 0, policy_version 115424 (0.0033) [2024-06-13 00:20:33,810][71000] Updated weights for policy 0, policy_version 115434 (0.0025) [2024-06-13 00:20:35,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1891352576. Throughput: 0: 50025.3. Samples: 1420113660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 00:20:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-13 00:20:38,022][71000] Updated weights for policy 0, policy_version 115444 (0.0019) [2024-06-13 00:20:40,368][71000] Updated weights for policy 0, policy_version 115454 (0.0028) [2024-06-13 00:20:40,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1891614720. Throughput: 0: 50194.7. Samples: 1420422800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 00:20:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:20:44,367][71000] Updated weights for policy 0, policy_version 115464 (0.0032) [2024-06-13 00:20:45,940][70768] Fps is (10 sec: 50790.7, 60 sec: 50244.2, 300 sec: 49485.2). Total num frames: 1891860480. Throughput: 0: 49992.4. Samples: 1420722560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 00:20:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-13 00:20:47,030][71000] Updated weights for policy 0, policy_version 115474 (0.0027) [2024-06-13 00:20:50,940][70768] Fps is (10 sec: 44236.9, 60 sec: 49425.1, 300 sec: 49430.1). Total num frames: 1892057088. Throughput: 0: 49720.8. Samples: 1420859320. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 00:20:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:20:51,095][71000] Updated weights for policy 0, policy_version 115484 (0.0030) [2024-06-13 00:20:53,706][71000] Updated weights for policy 0, policy_version 115494 (0.0023) [2024-06-13 00:20:55,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1892335616. Throughput: 0: 49753.8. Samples: 1421152340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 00:20:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 00:20:57,642][71000] Updated weights for policy 0, policy_version 115504 (0.0037) [2024-06-13 00:21:00,418][71000] Updated weights for policy 0, policy_version 115514 (0.0026) [2024-06-13 00:21:00,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1892597760. Throughput: 0: 49721.8. Samples: 1421451380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 00:21:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:21:04,332][71000] Updated weights for policy 0, policy_version 115524 (0.0030) [2024-06-13 00:21:05,940][70768] Fps is (10 sec: 50790.4, 60 sec: 50244.1, 300 sec: 49540.8). Total num frames: 1892843520. Throughput: 0: 49646.2. Samples: 1421612320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 00:21:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:21:06,755][71000] Updated weights for policy 0, policy_version 115534 (0.0027) [2024-06-13 00:21:10,755][71000] Updated weights for policy 0, policy_version 115544 (0.0025) [2024-06-13 00:21:10,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49972.0, 300 sec: 49540.8). Total num frames: 1893072896. Throughput: 0: 49789.5. Samples: 1421911260. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 00:21:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:21:11,940][70980] Signal inference workers to stop experience collection... (20850 times) [2024-06-13 00:21:11,942][70980] Signal inference workers to resume experience collection... (20850 times) [2024-06-13 00:21:11,980][71000] InferenceWorker_p0-w0: stopping experience collection (20850 times) [2024-06-13 00:21:11,980][71000] InferenceWorker_p0-w0: resuming experience collection (20850 times) [2024-06-13 00:21:13,598][71000] Updated weights for policy 0, policy_version 115554 (0.0022) [2024-06-13 00:21:15,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1893318656. Throughput: 0: 49645.6. Samples: 1422200240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 00:21:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 00:21:17,243][71000] Updated weights for policy 0, policy_version 115564 (0.0027) [2024-06-13 00:21:20,149][71000] Updated weights for policy 0, policy_version 115574 (0.0032) [2024-06-13 00:21:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1893580800. Throughput: 0: 49837.8. Samples: 1422356360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 00:21:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:21:23,884][71000] Updated weights for policy 0, policy_version 115584 (0.0024) [2024-06-13 00:21:25,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.3, 300 sec: 49540.8). Total num frames: 1893826560. Throughput: 0: 49527.2. Samples: 1422651520. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 00:21:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:21:26,851][71000] Updated weights for policy 0, policy_version 115594 (0.0025) [2024-06-13 00:21:30,362][71000] Updated weights for policy 0, policy_version 115604 (0.0029) [2024-06-13 00:21:30,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49970.9, 300 sec: 49596.3). Total num frames: 1894072320. Throughput: 0: 49686.4. Samples: 1422958460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 00:21:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:21:33,363][71000] Updated weights for policy 0, policy_version 115614 (0.0027) [2024-06-13 00:21:35,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1894318080. Throughput: 0: 49828.4. Samples: 1423101600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 00:21:35,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-13 00:21:37,131][71000] Updated weights for policy 0, policy_version 115624 (0.0027) [2024-06-13 00:21:40,137][71000] Updated weights for policy 0, policy_version 115634 (0.0037) [2024-06-13 00:21:40,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49424.9, 300 sec: 49429.7). Total num frames: 1894580224. Throughput: 0: 49753.3. Samples: 1423391240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 00:21:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:21:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000115636_1894580224.pth... [2024-06-13 00:21:41,022][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000114909_1882669056.pth [2024-06-13 00:21:44,132][71000] Updated weights for policy 0, policy_version 115644 (0.0034) [2024-06-13 00:21:45,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 1894793216. Throughput: 0: 49486.3. Samples: 1423678260. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 00:21:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:21:47,129][71000] Updated weights for policy 0, policy_version 115654 (0.0022) [2024-06-13 00:21:50,454][71000] Updated weights for policy 0, policy_version 115664 (0.0023) [2024-06-13 00:21:50,939][70768] Fps is (10 sec: 47514.8, 60 sec: 49971.3, 300 sec: 49485.2). Total num frames: 1895055360. Throughput: 0: 49374.4. Samples: 1423834160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 00:21:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:21:53,348][71000] Updated weights for policy 0, policy_version 115674 (0.0030) [2024-06-13 00:21:55,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49698.3, 300 sec: 49485.2). Total num frames: 1895317504. Throughput: 0: 49424.9. Samples: 1424135380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 00:21:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:21:57,327][71000] Updated weights for policy 0, policy_version 115684 (0.0031) [2024-06-13 00:22:00,147][71000] Updated weights for policy 0, policy_version 115694 (0.0029) [2024-06-13 00:22:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.2, 300 sec: 49374.3). Total num frames: 1895563264. Throughput: 0: 49523.1. Samples: 1424428780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 00:22:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:22:03,953][71000] Updated weights for policy 0, policy_version 115704 (0.0031) [2024-06-13 00:22:05,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 1895776256. Throughput: 0: 49435.1. Samples: 1424580940. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 00:22:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:22:07,015][71000] Updated weights for policy 0, policy_version 115714 (0.0026) [2024-06-13 00:22:10,594][71000] Updated weights for policy 0, policy_version 115724 (0.0022) [2024-06-13 00:22:10,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1896022016. Throughput: 0: 49207.1. Samples: 1424865840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 00:22:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:22:11,091][70980] Signal inference workers to stop experience collection... (20900 times) [2024-06-13 00:22:11,119][71000] InferenceWorker_p0-w0: stopping experience collection (20900 times) [2024-06-13 00:22:11,141][70980] Signal inference workers to resume experience collection... (20900 times) [2024-06-13 00:22:11,142][71000] InferenceWorker_p0-w0: resuming experience collection (20900 times) [2024-06-13 00:22:13,619][71000] Updated weights for policy 0, policy_version 115734 (0.0031) [2024-06-13 00:22:15,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1896284160. Throughput: 0: 48790.9. Samples: 1425154040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 00:22:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:22:17,428][71000] Updated weights for policy 0, policy_version 115744 (0.0037) [2024-06-13 00:22:20,276][71000] Updated weights for policy 0, policy_version 115754 (0.0031) [2024-06-13 00:22:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1896529920. Throughput: 0: 48926.6. Samples: 1425303300. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:22:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:22:23,943][71000] Updated weights for policy 0, policy_version 115764 (0.0026) [2024-06-13 00:22:25,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.8, 300 sec: 49374.2). Total num frames: 1896759296. Throughput: 0: 49169.0. Samples: 1425603840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:22:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:22:27,002][71000] Updated weights for policy 0, policy_version 115774 (0.0027) [2024-06-13 00:22:30,534][71000] Updated weights for policy 0, policy_version 115784 (0.0034) [2024-06-13 00:22:30,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.2, 300 sec: 49540.8). Total num frames: 1897021440. Throughput: 0: 49332.5. Samples: 1425898220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:22:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:22:33,653][71000] Updated weights for policy 0, policy_version 115794 (0.0027) [2024-06-13 00:22:35,939][70768] Fps is (10 sec: 52429.6, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1897283584. Throughput: 0: 49150.6. Samples: 1426045940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:22:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:22:37,082][71000] Updated weights for policy 0, policy_version 115804 (0.0029) [2024-06-13 00:22:40,174][71000] Updated weights for policy 0, policy_version 115814 (0.0028) [2024-06-13 00:22:40,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48879.0, 300 sec: 49374.1). Total num frames: 1897512960. Throughput: 0: 49066.9. Samples: 1426343400. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:22:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:22:43,458][71000] Updated weights for policy 0, policy_version 115824 (0.0018) [2024-06-13 00:22:45,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1897758720. Throughput: 0: 49128.9. Samples: 1426639580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:22:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:22:46,853][71000] Updated weights for policy 0, policy_version 115834 (0.0029) [2024-06-13 00:22:50,379][71000] Updated weights for policy 0, policy_version 115844 (0.0030) [2024-06-13 00:22:50,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1898020864. Throughput: 0: 48920.0. Samples: 1426782340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:22:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:22:53,590][71000] Updated weights for policy 0, policy_version 115854 (0.0024) [2024-06-13 00:22:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 1898250240. Throughput: 0: 48991.5. Samples: 1427070460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:22:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:22:57,177][71000] Updated weights for policy 0, policy_version 115864 (0.0036) [2024-06-13 00:23:00,722][71000] Updated weights for policy 0, policy_version 115874 (0.0024) [2024-06-13 00:23:00,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48605.9, 300 sec: 49374.2). Total num frames: 1898479616. Throughput: 0: 49184.0. Samples: 1427367320. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:23:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:23:03,568][71000] Updated weights for policy 0, policy_version 115884 (0.0022) [2024-06-13 00:23:05,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1898725376. Throughput: 0: 49068.8. Samples: 1427511400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:23:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:23:07,274][71000] Updated weights for policy 0, policy_version 115894 (0.0020) [2024-06-13 00:23:10,313][71000] Updated weights for policy 0, policy_version 115904 (0.0030) [2024-06-13 00:23:10,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 1898987520. Throughput: 0: 48837.8. Samples: 1427801540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:23:10,949][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:23:13,868][71000] Updated weights for policy 0, policy_version 115914 (0.0022) [2024-06-13 00:23:14,407][70980] Signal inference workers to stop experience collection... (20950 times) [2024-06-13 00:23:14,440][71000] InferenceWorker_p0-w0: stopping experience collection (20950 times) [2024-06-13 00:23:14,466][70980] Signal inference workers to resume experience collection... (20950 times) [2024-06-13 00:23:14,466][71000] InferenceWorker_p0-w0: resuming experience collection (20950 times) [2024-06-13 00:23:15,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1899233280. Throughput: 0: 48879.1. Samples: 1428097780. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 00:23:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:23:17,149][71000] Updated weights for policy 0, policy_version 115924 (0.0027) [2024-06-13 00:23:20,526][71000] Updated weights for policy 0, policy_version 115934 (0.0022) [2024-06-13 00:23:20,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 1899462656. Throughput: 0: 48855.5. Samples: 1428244440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:23:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:23:23,383][71000] Updated weights for policy 0, policy_version 115944 (0.0022) [2024-06-13 00:23:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1899724800. Throughput: 0: 49022.9. Samples: 1428549420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:23:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:23:27,357][71000] Updated weights for policy 0, policy_version 115954 (0.0036) [2024-06-13 00:23:30,073][71000] Updated weights for policy 0, policy_version 115964 (0.0030) [2024-06-13 00:23:30,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1899970560. Throughput: 0: 48875.8. Samples: 1428839000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:23:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:23:33,910][71000] Updated weights for policy 0, policy_version 115974 (0.0031) [2024-06-13 00:23:35,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 1900216320. Throughput: 0: 49049.8. Samples: 1428989580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:23:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:23:36,735][71000] Updated weights for policy 0, policy_version 115984 (0.0022) [2024-06-13 00:23:40,884][71000] Updated weights for policy 0, policy_version 115994 (0.0030) [2024-06-13 00:23:40,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 1900445696. Throughput: 0: 49190.6. Samples: 1429284040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:23:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 00:23:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000115994_1900445696.pth... [2024-06-13 00:23:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000115274_1888649216.pth [2024-06-13 00:23:43,686][71000] Updated weights for policy 0, policy_version 116004 (0.0035) [2024-06-13 00:23:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1900707840. Throughput: 0: 49101.3. Samples: 1429576880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:23:45,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:23:47,498][71000] Updated weights for policy 0, policy_version 116014 (0.0035) [2024-06-13 00:23:50,106][71000] Updated weights for policy 0, policy_version 116024 (0.0024) [2024-06-13 00:23:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 1900953600. Throughput: 0: 49071.2. Samples: 1429719600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:23:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:23:54,042][71000] Updated weights for policy 0, policy_version 116034 (0.0028) [2024-06-13 00:23:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1901199360. Throughput: 0: 49296.4. Samples: 1430019880. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:23:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:23:56,777][71000] Updated weights for policy 0, policy_version 116044 (0.0026) [2024-06-13 00:24:00,682][71000] Updated weights for policy 0, policy_version 116054 (0.0023) [2024-06-13 00:24:00,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49151.8, 300 sec: 49263.1). Total num frames: 1901428736. Throughput: 0: 49285.2. Samples: 1430315620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:24:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:24:03,511][71000] Updated weights for policy 0, policy_version 116064 (0.0029) [2024-06-13 00:24:05,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1901674496. Throughput: 0: 49132.5. Samples: 1430455400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:24:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:24:07,208][71000] Updated weights for policy 0, policy_version 116074 (0.0030) [2024-06-13 00:24:09,899][70980] Signal inference workers to stop experience collection... (21000 times) [2024-06-13 00:24:09,941][71000] InferenceWorker_p0-w0: stopping experience collection (21000 times) [2024-06-13 00:24:09,952][70980] Signal inference workers to resume experience collection... (21000 times) [2024-06-13 00:24:09,953][71000] InferenceWorker_p0-w0: resuming experience collection (21000 times) [2024-06-13 00:24:10,095][71000] Updated weights for policy 0, policy_version 116084 (0.0025) [2024-06-13 00:24:10,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1901936640. Throughput: 0: 48955.8. Samples: 1430752440. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:24:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:24:13,964][71000] Updated weights for policy 0, policy_version 116094 (0.0026) [2024-06-13 00:24:15,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1902198784. Throughput: 0: 49189.9. Samples: 1431052540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 00:24:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:24:16,563][71000] Updated weights for policy 0, policy_version 116104 (0.0036) [2024-06-13 00:24:20,579][71000] Updated weights for policy 0, policy_version 116114 (0.0038) [2024-06-13 00:24:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1902428160. Throughput: 0: 49257.6. Samples: 1431206180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 00:24:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:24:23,020][71000] Updated weights for policy 0, policy_version 116124 (0.0029) [2024-06-13 00:24:25,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1902673920. Throughput: 0: 49281.5. Samples: 1431501700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 00:24:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:24:27,376][71000] Updated weights for policy 0, policy_version 116134 (0.0029) [2024-06-13 00:24:29,866][71000] Updated weights for policy 0, policy_version 116144 (0.0031) [2024-06-13 00:24:30,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1902919680. Throughput: 0: 48998.5. Samples: 1431781820. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 00:24:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:24:33,930][71000] Updated weights for policy 0, policy_version 116154 (0.0033) [2024-06-13 00:24:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 1903181824. Throughput: 0: 49533.4. Samples: 1431948600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 00:24:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:24:36,400][71000] Updated weights for policy 0, policy_version 116164 (0.0028) [2024-06-13 00:24:40,575][71000] Updated weights for policy 0, policy_version 116174 (0.0033) [2024-06-13 00:24:40,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1903411200. Throughput: 0: 49247.6. Samples: 1432236020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 00:24:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:24:42,950][71000] Updated weights for policy 0, policy_version 116184 (0.0021) [2024-06-13 00:24:45,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1903656960. Throughput: 0: 49318.4. Samples: 1432534940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 00:24:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:24:47,351][71000] Updated weights for policy 0, policy_version 116194 (0.0035) [2024-06-13 00:24:49,874][71000] Updated weights for policy 0, policy_version 116204 (0.0032) [2024-06-13 00:24:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 1903902720. Throughput: 0: 49367.0. Samples: 1432676920. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 00:24:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 00:24:53,900][71000] Updated weights for policy 0, policy_version 116214 (0.0026) [2024-06-13 00:24:55,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1904164864. Throughput: 0: 49517.7. Samples: 1432980740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 00:24:55,949][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:24:56,311][71000] Updated weights for policy 0, policy_version 116224 (0.0024) [2024-06-13 00:25:00,340][71000] Updated weights for policy 0, policy_version 116234 (0.0024) [2024-06-13 00:25:00,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 1904394240. Throughput: 0: 49458.5. Samples: 1433278180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 00:25:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:25:03,019][71000] Updated weights for policy 0, policy_version 116244 (0.0028) [2024-06-13 00:25:05,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49425.0, 300 sec: 49374.3). Total num frames: 1904640000. Throughput: 0: 49241.5. Samples: 1433422040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 00:25:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:25:06,815][71000] Updated weights for policy 0, policy_version 116254 (0.0022) [2024-06-13 00:25:09,291][70980] Signal inference workers to stop experience collection... (21050 times) [2024-06-13 00:25:09,291][70980] Signal inference workers to resume experience collection... 
(21050 times) [2024-06-13 00:25:09,325][71000] InferenceWorker_p0-w0: stopping experience collection (21050 times) [2024-06-13 00:25:09,325][71000] InferenceWorker_p0-w0: resuming experience collection (21050 times) [2024-06-13 00:25:09,599][71000] Updated weights for policy 0, policy_version 116264 (0.0022) [2024-06-13 00:25:10,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 1904902144. Throughput: 0: 49382.1. Samples: 1433723900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 00:25:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:25:13,699][71000] Updated weights for policy 0, policy_version 116274 (0.0030) [2024-06-13 00:25:15,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 1905147904. Throughput: 0: 49760.0. Samples: 1434021020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 00:25:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:25:16,226][71000] Updated weights for policy 0, policy_version 116284 (0.0022) [2024-06-13 00:25:20,129][71000] Updated weights for policy 0, policy_version 116294 (0.0028) [2024-06-13 00:25:20,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 1905377280. Throughput: 0: 49256.9. Samples: 1434165160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 00:25:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 00:25:23,251][71000] Updated weights for policy 0, policy_version 116304 (0.0034) [2024-06-13 00:25:25,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 1905639424. Throughput: 0: 49564.4. Samples: 1434466420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 00:25:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:25:26,687][71000] Updated weights for policy 0, policy_version 116314 (0.0031) [2024-06-13 00:25:29,799][71000] Updated weights for policy 0, policy_version 116324 (0.0037) [2024-06-13 00:25:30,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.2, 300 sec: 49207.6). Total num frames: 1905868800. Throughput: 0: 49380.1. Samples: 1434757040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 00:25:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:25:34,146][71000] Updated weights for policy 0, policy_version 116334 (0.0031) [2024-06-13 00:25:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 1906147328. Throughput: 0: 49433.4. Samples: 1434901420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 00:25:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 00:25:36,357][71000] Updated weights for policy 0, policy_version 116344 (0.0022) [2024-06-13 00:25:40,505][71000] Updated weights for policy 0, policy_version 116354 (0.0029) [2024-06-13 00:25:40,939][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 1906360320. Throughput: 0: 49399.3. Samples: 1435203700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 00:25:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:25:41,048][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000116356_1906376704.pth... [2024-06-13 00:25:41,089][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000115636_1894580224.pth [2024-06-13 00:25:43,335][71000] Updated weights for policy 0, policy_version 116364 (0.0026) [2024-06-13 00:25:45,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 1906622464. Throughput: 0: 49271.6. Samples: 1435495400. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 00:25:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 00:25:46,865][71000] Updated weights for policy 0, policy_version 116374 (0.0033) [2024-06-13 00:25:50,019][71000] Updated weights for policy 0, policy_version 116384 (0.0029) [2024-06-13 00:25:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 1906868224. Throughput: 0: 49437.3. Samples: 1435646720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 00:25:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:25:53,629][71000] Updated weights for policy 0, policy_version 116394 (0.0024) [2024-06-13 00:25:55,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 1907113984. Throughput: 0: 49393.4. Samples: 1435946600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 00:25:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:25:56,782][71000] Updated weights for policy 0, policy_version 116404 (0.0026) [2024-06-13 00:26:00,495][71000] Updated weights for policy 0, policy_version 116414 (0.0033) [2024-06-13 00:26:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 1907343360. Throughput: 0: 49377.0. Samples: 1436242980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 00:26:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:26:03,319][71000] Updated weights for policy 0, policy_version 116424 (0.0031) [2024-06-13 00:26:05,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49971.1, 300 sec: 49374.1). Total num frames: 1907638272. Throughput: 0: 49411.0. Samples: 1436388660. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 00:26:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:26:06,918][71000] Updated weights for policy 0, policy_version 116434 (0.0030) [2024-06-13 00:26:08,038][70980] Signal inference workers to stop experience collection... (21100 times) [2024-06-13 00:26:08,039][70980] Signal inference workers to resume experience collection... (21100 times) [2024-06-13 00:26:08,069][71000] InferenceWorker_p0-w0: stopping experience collection (21100 times) [2024-06-13 00:26:08,069][71000] InferenceWorker_p0-w0: resuming experience collection (21100 times) [2024-06-13 00:26:09,683][71000] Updated weights for policy 0, policy_version 116444 (0.0031) [2024-06-13 00:26:10,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1907867648. Throughput: 0: 49273.2. Samples: 1436683720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 00:26:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:26:13,201][71000] Updated weights for policy 0, policy_version 116454 (0.0023) [2024-06-13 00:26:15,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 1908097024. Throughput: 0: 49547.0. Samples: 1436986660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 00:26:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:26:16,465][71000] Updated weights for policy 0, policy_version 116464 (0.0033) [2024-06-13 00:26:19,961][71000] Updated weights for policy 0, policy_version 116474 (0.0035) [2024-06-13 00:26:20,941][70768] Fps is (10 sec: 45869.0, 60 sec: 49150.7, 300 sec: 49151.7). Total num frames: 1908326400. Throughput: 0: 49420.1. Samples: 1437125400. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 00:26:20,942][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:26:23,609][71000] Updated weights for policy 0, policy_version 116484 (0.0032) [2024-06-13 00:26:25,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49698.2, 300 sec: 49318.7). Total num frames: 1908621312. Throughput: 0: 49227.6. Samples: 1437418940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:26:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:26:26,475][71000] Updated weights for policy 0, policy_version 116494 (0.0028) [2024-06-13 00:26:30,163][71000] Updated weights for policy 0, policy_version 116504 (0.0034) [2024-06-13 00:26:30,941][70768] Fps is (10 sec: 52431.2, 60 sec: 49697.2, 300 sec: 49262.9). Total num frames: 1908850688. Throughput: 0: 49400.7. Samples: 1437718480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:26:30,941][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:26:33,131][71000] Updated weights for policy 0, policy_version 116514 (0.0022) [2024-06-13 00:26:35,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 1909080064. Throughput: 0: 49229.8. Samples: 1437862060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:26:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:26:36,630][71000] Updated weights for policy 0, policy_version 116524 (0.0030) [2024-06-13 00:26:39,735][71000] Updated weights for policy 0, policy_version 116534 (0.0021) [2024-06-13 00:26:40,940][70768] Fps is (10 sec: 45879.5, 60 sec: 49151.8, 300 sec: 49207.5). Total num frames: 1909309440. Throughput: 0: 49110.0. Samples: 1438156560. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:26:40,949][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:26:43,330][71000] Updated weights for policy 0, policy_version 116544 (0.0035) [2024-06-13 00:26:45,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 1909604352. Throughput: 0: 48948.4. Samples: 1438445660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:26:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:26:46,155][71000] Updated weights for policy 0, policy_version 116554 (0.0025) [2024-06-13 00:26:50,217][71000] Updated weights for policy 0, policy_version 116564 (0.0034) [2024-06-13 00:26:50,940][70768] Fps is (10 sec: 54068.1, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 1909850112. Throughput: 0: 49247.2. Samples: 1438604780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:26:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:26:52,631][71000] Updated weights for policy 0, policy_version 116574 (0.0025) [2024-06-13 00:26:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 1910079488. Throughput: 0: 49460.1. Samples: 1438909420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:26:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:26:56,517][71000] Updated weights for policy 0, policy_version 116584 (0.0024) [2024-06-13 00:26:59,777][71000] Updated weights for policy 0, policy_version 116594 (0.0038) [2024-06-13 00:27:00,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 1910308864. Throughput: 0: 49216.9. Samples: 1439201420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:27:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:27:02,161][70980] Signal inference workers to stop experience collection... 
(21150 times) [2024-06-13 00:27:02,162][70980] Signal inference workers to resume experience collection... (21150 times) [2024-06-13 00:27:02,177][71000] InferenceWorker_p0-w0: stopping experience collection (21150 times) [2024-06-13 00:27:02,178][71000] InferenceWorker_p0-w0: resuming experience collection (21150 times) [2024-06-13 00:27:03,034][71000] Updated weights for policy 0, policy_version 116604 (0.0024) [2024-06-13 00:27:05,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 1910571008. Throughput: 0: 49335.8. Samples: 1439345440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:27:05,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-13 00:27:06,311][71000] Updated weights for policy 0, policy_version 116614 (0.0036) [2024-06-13 00:27:09,884][71000] Updated weights for policy 0, policy_version 116624 (0.0026) [2024-06-13 00:27:10,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1910833152. Throughput: 0: 49383.5. Samples: 1439641200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:27:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:27:13,053][71000] Updated weights for policy 0, policy_version 116634 (0.0030) [2024-06-13 00:27:15,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 1911046144. Throughput: 0: 49328.8. Samples: 1439938220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:27:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:27:16,665][71000] Updated weights for policy 0, policy_version 116644 (0.0037) [2024-06-13 00:27:19,586][71000] Updated weights for policy 0, policy_version 116654 (0.0029) [2024-06-13 00:27:20,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49699.3, 300 sec: 49318.6). Total num frames: 1911308288. Throughput: 0: 49158.6. Samples: 1440074200. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:27:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:27:23,195][71000] Updated weights for policy 0, policy_version 116664 (0.0028) [2024-06-13 00:27:25,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 1911570432. Throughput: 0: 49192.6. Samples: 1440370220. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 00:27:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:27:26,318][71000] Updated weights for policy 0, policy_version 116674 (0.0029) [2024-06-13 00:27:29,693][71000] Updated weights for policy 0, policy_version 116684 (0.0026) [2024-06-13 00:27:30,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49699.1, 300 sec: 49318.6). Total num frames: 1911832576. Throughput: 0: 49505.0. Samples: 1440673380. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 00:27:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:27:32,769][71000] Updated weights for policy 0, policy_version 116694 (0.0029) [2024-06-13 00:27:35,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 1912045568. Throughput: 0: 49286.3. Samples: 1440822660. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 00:27:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:27:36,273][71000] Updated weights for policy 0, policy_version 116704 (0.0027) [2024-06-13 00:27:39,698][71000] Updated weights for policy 0, policy_version 116714 (0.0026) [2024-06-13 00:27:40,940][70768] Fps is (10 sec: 45874.3, 60 sec: 49698.1, 300 sec: 49263.0). Total num frames: 1912291328. Throughput: 0: 49047.4. Samples: 1441116560. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 00:27:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:27:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000116717_1912291328.pth... 
[2024-06-13 00:27:40,993][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000115994_1900445696.pth [2024-06-13 00:27:43,110][71000] Updated weights for policy 0, policy_version 116724 (0.0028) [2024-06-13 00:27:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 1912537088. Throughput: 0: 49002.3. Samples: 1441406520. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 00:27:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:27:46,174][71000] Updated weights for policy 0, policy_version 116734 (0.0032) [2024-06-13 00:27:49,897][71000] Updated weights for policy 0, policy_version 116744 (0.0026) [2024-06-13 00:27:50,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1912799232. Throughput: 0: 49342.7. Samples: 1441565860. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 00:27:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:27:52,791][71000] Updated weights for policy 0, policy_version 116754 (0.0021) [2024-06-13 00:27:55,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48878.9, 300 sec: 49263.0). Total num frames: 1913012224. Throughput: 0: 49167.9. Samples: 1441853760. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 00:27:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:27:56,347][71000] Updated weights for policy 0, policy_version 116764 (0.0030) [2024-06-13 00:27:59,415][71000] Updated weights for policy 0, policy_version 116774 (0.0028) [2024-06-13 00:28:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1913274368. Throughput: 0: 49304.8. Samples: 1442156940. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 00:28:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:28:02,755][71000] Updated weights for policy 0, policy_version 116784 (0.0025) [2024-06-13 00:28:05,940][70768] Fps is (10 sec: 52429.8, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1913536512. Throughput: 0: 49489.1. Samples: 1442301200. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 00:28:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 00:28:05,977][71000] Updated weights for policy 0, policy_version 116794 (0.0032) [2024-06-13 00:28:09,420][71000] Updated weights for policy 0, policy_version 116804 (0.0037) [2024-06-13 00:28:10,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 1913798656. Throughput: 0: 49740.4. Samples: 1442608540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 00:28:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:28:12,460][71000] Updated weights for policy 0, policy_version 116814 (0.0028) [2024-06-13 00:28:14,811][70980] Signal inference workers to stop experience collection... (21200 times) [2024-06-13 00:28:14,811][70980] Signal inference workers to resume experience collection... (21200 times) [2024-06-13 00:28:14,833][71000] InferenceWorker_p0-w0: stopping experience collection (21200 times) [2024-06-13 00:28:14,833][71000] InferenceWorker_p0-w0: resuming experience collection (21200 times) [2024-06-13 00:28:15,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 1914028032. Throughput: 0: 49735.0. Samples: 1442911460. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 00:28:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:28:16,081][71000] Updated weights for policy 0, policy_version 116824 (0.0026) [2024-06-13 00:28:18,983][71000] Updated weights for policy 0, policy_version 116834 (0.0033) [2024-06-13 00:28:20,940][70768] Fps is (10 sec: 45874.4, 60 sec: 49151.9, 300 sec: 49263.0). Total num frames: 1914257408. Throughput: 0: 49523.2. Samples: 1443051220. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 00:28:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:28:22,651][71000] Updated weights for policy 0, policy_version 116844 (0.0033) [2024-06-13 00:28:25,619][71000] Updated weights for policy 0, policy_version 116854 (0.0024) [2024-06-13 00:28:25,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1914535936. Throughput: 0: 49561.5. Samples: 1443346820. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 00:28:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:28:29,234][71000] Updated weights for policy 0, policy_version 116864 (0.0030) [2024-06-13 00:28:30,939][70768] Fps is (10 sec: 52430.6, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1914781696. Throughput: 0: 49783.6. Samples: 1443646780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 00:28:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:28:32,118][71000] Updated weights for policy 0, policy_version 116874 (0.0035) [2024-06-13 00:28:35,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 1915011072. Throughput: 0: 49568.3. Samples: 1443796440. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 00:28:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:28:36,073][71000] Updated weights for policy 0, policy_version 116884 (0.0024) [2024-06-13 00:28:38,798][71000] Updated weights for policy 0, policy_version 116894 (0.0023) [2024-06-13 00:28:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 1915256832. Throughput: 0: 49619.7. Samples: 1444086640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 00:28:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:28:42,567][71000] Updated weights for policy 0, policy_version 116904 (0.0030) [2024-06-13 00:28:45,697][71000] Updated weights for policy 0, policy_version 116914 (0.0021) [2024-06-13 00:28:45,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 1915518976. Throughput: 0: 49374.6. Samples: 1444378800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 00:28:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:28:49,305][71000] Updated weights for policy 0, policy_version 116924 (0.0030) [2024-06-13 00:28:50,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 1915764736. Throughput: 0: 49717.5. Samples: 1444538500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 00:28:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:28:52,248][71000] Updated weights for policy 0, policy_version 116934 (0.0024) [2024-06-13 00:28:55,889][71000] Updated weights for policy 0, policy_version 116944 (0.0027) [2024-06-13 00:28:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1916010496. Throughput: 0: 49259.5. Samples: 1444825220. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 00:28:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:28:58,805][71000] Updated weights for policy 0, policy_version 116954 (0.0027) [2024-06-13 00:29:00,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 1916239872. Throughput: 0: 49333.9. Samples: 1445131480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 00:29:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:29:02,331][71000] Updated weights for policy 0, policy_version 116964 (0.0027) [2024-06-13 00:29:05,647][71000] Updated weights for policy 0, policy_version 116974 (0.0031) [2024-06-13 00:29:05,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1916518400. Throughput: 0: 49608.3. Samples: 1445283580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 00:29:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:29:09,204][71000] Updated weights for policy 0, policy_version 116984 (0.0031) [2024-06-13 00:29:10,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1916764160. Throughput: 0: 49260.4. Samples: 1445563540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 00:29:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:29:12,296][71000] Updated weights for policy 0, policy_version 116994 (0.0030) [2024-06-13 00:29:15,690][71000] Updated weights for policy 0, policy_version 117004 (0.0025) [2024-06-13 00:29:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1916993536. Throughput: 0: 49196.2. Samples: 1445860620. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 00:29:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:29:18,970][71000] Updated weights for policy 0, policy_version 117014 (0.0030) [2024-06-13 00:29:20,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49698.4, 300 sec: 49374.2). Total num frames: 1917239296. Throughput: 0: 49246.4. Samples: 1446012520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 00:29:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:29:22,547][71000] Updated weights for policy 0, policy_version 117024 (0.0023) [2024-06-13 00:29:25,465][71000] Updated weights for policy 0, policy_version 117034 (0.0028) [2024-06-13 00:29:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1917485056. Throughput: 0: 49484.4. Samples: 1446313440. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 00:29:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:29:29,145][71000] Updated weights for policy 0, policy_version 117044 (0.0035) [2024-06-13 00:29:30,940][70768] Fps is (10 sec: 50789.2, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 1917747200. Throughput: 0: 49597.7. Samples: 1446610700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:29:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:29:32,008][70980] Signal inference workers to stop experience collection... (21250 times) [2024-06-13 00:29:32,052][71000] InferenceWorker_p0-w0: stopping experience collection (21250 times) [2024-06-13 00:29:32,056][70980] Signal inference workers to resume experience collection... 
(21250 times) [2024-06-13 00:29:32,063][71000] InferenceWorker_p0-w0: resuming experience collection (21250 times) [2024-06-13 00:29:32,202][71000] Updated weights for policy 0, policy_version 117054 (0.0027) [2024-06-13 00:29:35,765][71000] Updated weights for policy 0, policy_version 117064 (0.0033) [2024-06-13 00:29:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 1917976576. Throughput: 0: 49492.2. Samples: 1446765640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:29:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:29:39,032][71000] Updated weights for policy 0, policy_version 117074 (0.0034) [2024-06-13 00:29:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 1918238720. Throughput: 0: 49480.9. Samples: 1447051860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:29:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:29:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000117080_1918238720.pth... [2024-06-13 00:29:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000116356_1906376704.pth [2024-06-13 00:29:42,505][71000] Updated weights for policy 0, policy_version 117084 (0.0024) [2024-06-13 00:29:45,661][71000] Updated weights for policy 0, policy_version 117094 (0.0027) [2024-06-13 00:29:45,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1918468096. Throughput: 0: 49355.6. Samples: 1447352480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:29:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:29:49,175][71000] Updated weights for policy 0, policy_version 117104 (0.0026) [2024-06-13 00:29:50,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1918713856. Throughput: 0: 49049.3. Samples: 1447490800. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:29:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:29:52,359][71000] Updated weights for policy 0, policy_version 117114 (0.0029) [2024-06-13 00:29:55,648][71000] Updated weights for policy 0, policy_version 117124 (0.0034) [2024-06-13 00:29:55,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1918976000. Throughput: 0: 49527.5. Samples: 1447792280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:29:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:29:59,008][71000] Updated weights for policy 0, policy_version 117134 (0.0026) [2024-06-13 00:30:00,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 1919238144. Throughput: 0: 49491.9. Samples: 1448087760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:30:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:30:01,939][71000] Updated weights for policy 0, policy_version 117144 (0.0025) [2024-06-13 00:30:05,501][71000] Updated weights for policy 0, policy_version 117154 (0.0027) [2024-06-13 00:30:05,944][70768] Fps is (10 sec: 49131.8, 60 sec: 49148.6, 300 sec: 49373.5). Total num frames: 1919467520. Throughput: 0: 49640.2. Samples: 1448246540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:30:05,944][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:30:08,650][71000] Updated weights for policy 0, policy_version 117164 (0.0033) [2024-06-13 00:30:10,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1919713280. Throughput: 0: 49475.6. Samples: 1448539840. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:30:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:30:12,380][71000] Updated weights for policy 0, policy_version 117174 (0.0032) [2024-06-13 00:30:15,439][71000] Updated weights for policy 0, policy_version 117184 (0.0033) [2024-06-13 00:30:15,940][70768] Fps is (10 sec: 49172.5, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1919959040. Throughput: 0: 49095.7. Samples: 1448820000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:30:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:30:19,299][71000] Updated weights for policy 0, policy_version 117194 (0.0035) [2024-06-13 00:30:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1920221184. Throughput: 0: 48960.0. Samples: 1448968840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:30:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:30:22,031][71000] Updated weights for policy 0, policy_version 117204 (0.0029) [2024-06-13 00:30:25,658][71000] Updated weights for policy 0, policy_version 117214 (0.0028) [2024-06-13 00:30:25,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1920450560. Throughput: 0: 49198.7. Samples: 1449265800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 00:30:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:30:28,637][71000] Updated weights for policy 0, policy_version 117224 (0.0025) [2024-06-13 00:30:30,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 1920679936. Throughput: 0: 49166.1. Samples: 1449564960. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:30:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:30:32,223][71000] Updated weights for policy 0, policy_version 117234 (0.0038) [2024-06-13 00:30:35,374][71000] Updated weights for policy 0, policy_version 117244 (0.0028) [2024-06-13 00:30:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1920942080. Throughput: 0: 49415.9. Samples: 1449714520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:30:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:30:38,938][71000] Updated weights for policy 0, policy_version 117254 (0.0034) [2024-06-13 00:30:39,112][70980] Signal inference workers to stop experience collection... (21300 times) [2024-06-13 00:30:39,164][71000] InferenceWorker_p0-w0: stopping experience collection (21300 times) [2024-06-13 00:30:39,221][70980] Signal inference workers to resume experience collection... (21300 times) [2024-06-13 00:30:39,222][71000] InferenceWorker_p0-w0: resuming experience collection (21300 times) [2024-06-13 00:30:40,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1921204224. Throughput: 0: 49201.0. Samples: 1450006320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:30:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 00:30:41,830][71000] Updated weights for policy 0, policy_version 117264 (0.0024) [2024-06-13 00:30:45,692][71000] Updated weights for policy 0, policy_version 117274 (0.0022) [2024-06-13 00:30:45,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1921433600. Throughput: 0: 49335.4. Samples: 1450307840. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:30:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:30:48,650][71000] Updated weights for policy 0, policy_version 117284 (0.0039) [2024-06-13 00:30:50,939][70768] Fps is (10 sec: 45875.5, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 1921662976. Throughput: 0: 48889.5. Samples: 1450446360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:30:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:30:52,059][71000] Updated weights for policy 0, policy_version 117294 (0.0030) [2024-06-13 00:30:55,471][71000] Updated weights for policy 0, policy_version 117304 (0.0028) [2024-06-13 00:30:55,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1921925120. Throughput: 0: 49183.9. Samples: 1450753120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:30:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:30:58,814][71000] Updated weights for policy 0, policy_version 117314 (0.0024) [2024-06-13 00:31:00,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 1922170880. Throughput: 0: 49512.4. Samples: 1451048060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:31:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:31:01,740][71000] Updated weights for policy 0, policy_version 117324 (0.0027) [2024-06-13 00:31:05,368][71000] Updated weights for policy 0, policy_version 117334 (0.0035) [2024-06-13 00:31:05,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49428.4, 300 sec: 49374.2). Total num frames: 1922433024. Throughput: 0: 49490.6. Samples: 1451195920. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:31:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:31:08,734][71000] Updated weights for policy 0, policy_version 117344 (0.0035) [2024-06-13 00:31:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 1922646016. Throughput: 0: 49468.9. Samples: 1451491900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:31:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:31:11,781][71000] Updated weights for policy 0, policy_version 117354 (0.0026) [2024-06-13 00:31:14,915][71000] Updated weights for policy 0, policy_version 117364 (0.0026) [2024-06-13 00:31:15,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.2, 300 sec: 49485.5). Total num frames: 1922924544. Throughput: 0: 49394.0. Samples: 1451787680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:31:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:31:18,472][71000] Updated weights for policy 0, policy_version 117374 (0.0022) [2024-06-13 00:31:20,939][70768] Fps is (10 sec: 52429.3, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1923170304. Throughput: 0: 49529.5. Samples: 1451943340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:31:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:31:21,775][71000] Updated weights for policy 0, policy_version 117384 (0.0033) [2024-06-13 00:31:25,019][71000] Updated weights for policy 0, policy_version 117394 (0.0031) [2024-06-13 00:31:25,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.1, 300 sec: 49374.3). Total num frames: 1923416064. Throughput: 0: 49711.1. Samples: 1452243320. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:31:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:31:28,271][71000] Updated weights for policy 0, policy_version 117404 (0.0026) [2024-06-13 00:31:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1923645440. Throughput: 0: 49631.9. Samples: 1452541280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 00:31:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:31:31,732][71000] Updated weights for policy 0, policy_version 117414 (0.0024) [2024-06-13 00:31:34,729][71000] Updated weights for policy 0, policy_version 117424 (0.0023) [2024-06-13 00:31:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 1923907584. Throughput: 0: 49590.2. Samples: 1452677920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 00:31:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:31:38,195][71000] Updated weights for policy 0, policy_version 117434 (0.0028) [2024-06-13 00:31:40,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 1924169728. Throughput: 0: 49742.6. Samples: 1452991540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 00:31:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:31:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000117442_1924169728.pth... [2024-06-13 00:31:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000116717_1912291328.pth [2024-06-13 00:31:41,527][71000] Updated weights for policy 0, policy_version 117444 (0.0029) [2024-06-13 00:31:44,918][71000] Updated weights for policy 0, policy_version 117454 (0.0027) [2024-06-13 00:31:45,666][70980] Signal inference workers to stop experience collection... 
(21350 times) [2024-06-13 00:31:45,683][71000] InferenceWorker_p0-w0: stopping experience collection (21350 times) [2024-06-13 00:31:45,731][70980] Signal inference workers to resume experience collection... (21350 times) [2024-06-13 00:31:45,731][71000] InferenceWorker_p0-w0: resuming experience collection (21350 times) [2024-06-13 00:31:45,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 1924415488. Throughput: 0: 49497.4. Samples: 1453275440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 00:31:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:31:48,253][71000] Updated weights for policy 0, policy_version 117464 (0.0023) [2024-06-13 00:31:50,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 1924661248. Throughput: 0: 49621.3. Samples: 1453428880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 00:31:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 00:31:51,240][71000] Updated weights for policy 0, policy_version 117474 (0.0018) [2024-06-13 00:31:54,675][71000] Updated weights for policy 0, policy_version 117484 (0.0024) [2024-06-13 00:31:55,939][70768] Fps is (10 sec: 45875.5, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1924874240. Throughput: 0: 49639.2. Samples: 1453725660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 00:31:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 00:31:57,933][71000] Updated weights for policy 0, policy_version 117494 (0.0027) [2024-06-13 00:32:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1925152768. Throughput: 0: 49790.8. Samples: 1454028280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 00:32:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:32:01,405][71000] Updated weights for policy 0, policy_version 117504 (0.0023) [2024-06-13 00:32:04,560][71000] Updated weights for policy 0, policy_version 117514 (0.0025) [2024-06-13 00:32:05,940][70768] Fps is (10 sec: 54066.8, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1925414912. Throughput: 0: 49619.1. Samples: 1454176200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 00:32:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 00:32:08,168][71000] Updated weights for policy 0, policy_version 117524 (0.0023) [2024-06-13 00:32:10,940][70768] Fps is (10 sec: 50790.7, 60 sec: 50244.2, 300 sec: 49540.7). Total num frames: 1925660672. Throughput: 0: 49607.4. Samples: 1454475660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 00:32:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:32:11,118][71000] Updated weights for policy 0, policy_version 117534 (0.0030) [2024-06-13 00:32:14,659][71000] Updated weights for policy 0, policy_version 117544 (0.0029) [2024-06-13 00:32:15,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 1925873664. Throughput: 0: 49498.7. Samples: 1454768720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 00:32:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:32:18,014][71000] Updated weights for policy 0, policy_version 117554 (0.0027) [2024-06-13 00:32:20,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1926152192. Throughput: 0: 49715.2. Samples: 1454915100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 00:32:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:32:21,046][71000] Updated weights for policy 0, policy_version 117564 (0.0031) [2024-06-13 00:32:24,648][71000] Updated weights for policy 0, policy_version 117574 (0.0036) [2024-06-13 00:32:25,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1926414336. Throughput: 0: 49348.2. Samples: 1455212200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 00:32:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:32:27,685][71000] Updated weights for policy 0, policy_version 117584 (0.0027) [2024-06-13 00:32:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1926627328. Throughput: 0: 49703.5. Samples: 1455512100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 00:32:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 00:32:31,379][71000] Updated weights for policy 0, policy_version 117594 (0.0031) [2024-06-13 00:32:35,114][71000] Updated weights for policy 0, policy_version 117604 (0.0030) [2024-06-13 00:32:35,940][70768] Fps is (10 sec: 45874.5, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1926873088. Throughput: 0: 49333.7. Samples: 1455648900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-13 00:32:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:32:37,846][71000] Updated weights for policy 0, policy_version 117614 (0.0027) [2024-06-13 00:32:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 1927118848. Throughput: 0: 49276.8. Samples: 1455943120. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-13 00:32:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:32:41,433][71000] Updated weights for policy 0, policy_version 117624 (0.0023) [2024-06-13 00:32:44,588][71000] Updated weights for policy 0, policy_version 117634 (0.0031) [2024-06-13 00:32:45,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1927397376. Throughput: 0: 49265.9. Samples: 1456245240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-13 00:32:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:32:47,815][71000] Updated weights for policy 0, policy_version 117644 (0.0035) [2024-06-13 00:32:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1927626752. Throughput: 0: 49494.6. Samples: 1456403460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-13 00:32:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:32:51,056][71000] Updated weights for policy 0, policy_version 117654 (0.0030) [2024-06-13 00:32:54,547][71000] Updated weights for policy 0, policy_version 117664 (0.0024) [2024-06-13 00:32:55,939][70768] Fps is (10 sec: 45875.8, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1927856128. Throughput: 0: 49423.7. Samples: 1456699720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-13 00:32:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:32:56,273][70980] Signal inference workers to stop experience collection... (21400 times) [2024-06-13 00:32:56,275][70980] Signal inference workers to resume experience collection... 
(21400 times) [2024-06-13 00:32:56,286][71000] InferenceWorker_p0-w0: stopping experience collection (21400 times) [2024-06-13 00:32:56,310][71000] InferenceWorker_p0-w0: resuming experience collection (21400 times) [2024-06-13 00:32:57,882][71000] Updated weights for policy 0, policy_version 117674 (0.0036) [2024-06-13 00:33:00,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 1928101888. Throughput: 0: 49297.6. Samples: 1456987120. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-13 00:33:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 00:33:01,318][71000] Updated weights for policy 0, policy_version 117684 (0.0037) [2024-06-13 00:33:04,392][71000] Updated weights for policy 0, policy_version 117694 (0.0029) [2024-06-13 00:33:05,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49698.1, 300 sec: 49485.3). Total num frames: 1928396800. Throughput: 0: 49461.3. Samples: 1457140860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-13 00:33:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:33:07,770][71000] Updated weights for policy 0, policy_version 117704 (0.0040) [2024-06-13 00:33:10,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1928609792. Throughput: 0: 49563.0. Samples: 1457442540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-13 00:33:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:33:11,086][71000] Updated weights for policy 0, policy_version 117714 (0.0031) [2024-06-13 00:33:14,550][71000] Updated weights for policy 0, policy_version 117724 (0.0025) [2024-06-13 00:33:15,939][70768] Fps is (10 sec: 42598.7, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1928822784. Throughput: 0: 49364.1. Samples: 1457733480. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-13 00:33:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:33:17,573][71000] Updated weights for policy 0, policy_version 117734 (0.0037) [2024-06-13 00:33:20,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 1929101312. Throughput: 0: 49410.4. Samples: 1457872360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-13 00:33:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:33:21,387][71000] Updated weights for policy 0, policy_version 117744 (0.0027) [2024-06-13 00:33:24,251][71000] Updated weights for policy 0, policy_version 117754 (0.0036) [2024-06-13 00:33:25,942][70768] Fps is (10 sec: 55692.5, 60 sec: 49423.2, 300 sec: 49484.8). Total num frames: 1929379840. Throughput: 0: 49526.8. Samples: 1458171940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-13 00:33:25,942][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:33:27,699][71000] Updated weights for policy 0, policy_version 117764 (0.0028) [2024-06-13 00:33:30,750][71000] Updated weights for policy 0, policy_version 117774 (0.0024) [2024-06-13 00:33:30,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 1929609216. Throughput: 0: 49701.0. Samples: 1458481780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-13 00:33:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:33:34,368][71000] Updated weights for policy 0, policy_version 117784 (0.0028) [2024-06-13 00:33:35,939][70768] Fps is (10 sec: 44247.0, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1929822208. Throughput: 0: 49252.5. Samples: 1458619820. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 00:33:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:33:37,366][71000] Updated weights for policy 0, policy_version 117794 (0.0027) [2024-06-13 00:33:40,940][70768] Fps is (10 sec: 47512.3, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 1930084352. Throughput: 0: 49390.4. Samples: 1458922300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 00:33:40,949][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:33:41,077][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000117804_1930100736.pth... [2024-06-13 00:33:41,085][71000] Updated weights for policy 0, policy_version 117804 (0.0023) [2024-06-13 00:33:41,114][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000117080_1918238720.pth [2024-06-13 00:33:44,174][71000] Updated weights for policy 0, policy_version 117814 (0.0023) [2024-06-13 00:33:45,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49425.2, 300 sec: 49485.3). Total num frames: 1930362880. Throughput: 0: 49383.8. Samples: 1459209380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 00:33:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:33:47,571][71000] Updated weights for policy 0, policy_version 117824 (0.0025) [2024-06-13 00:33:50,855][71000] Updated weights for policy 0, policy_version 117834 (0.0027) [2024-06-13 00:33:50,939][70768] Fps is (10 sec: 50791.8, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1930592256. Throughput: 0: 49496.0. Samples: 1459368180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 00:33:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:33:54,139][71000] Updated weights for policy 0, policy_version 117844 (0.0028) [2024-06-13 00:33:55,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1930821632. Throughput: 0: 49295.2. Samples: 1459660820. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 00:33:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:33:57,329][71000] Updated weights for policy 0, policy_version 117854 (0.0030) [2024-06-13 00:34:00,442][71000] Updated weights for policy 0, policy_version 117864 (0.0022) [2024-06-13 00:34:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.2, 300 sec: 49374.1). Total num frames: 1931083776. Throughput: 0: 49419.9. Samples: 1459957380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 00:34:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:34:03,980][71000] Updated weights for policy 0, policy_version 117874 (0.0033) [2024-06-13 00:34:05,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1931345920. Throughput: 0: 49789.7. Samples: 1460112900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 00:34:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:34:07,415][71000] Updated weights for policy 0, policy_version 117884 (0.0030) [2024-06-13 00:34:09,036][70980] Signal inference workers to stop experience collection... (21450 times) [2024-06-13 00:34:09,085][71000] InferenceWorker_p0-w0: stopping experience collection (21450 times) [2024-06-13 00:34:09,092][70980] Signal inference workers to resume experience collection... (21450 times) [2024-06-13 00:34:09,097][71000] InferenceWorker_p0-w0: resuming experience collection (21450 times) [2024-06-13 00:34:10,481][71000] Updated weights for policy 0, policy_version 117894 (0.0022) [2024-06-13 00:34:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1931591680. Throughput: 0: 49582.4. Samples: 1460403040. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 00:34:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:34:13,785][71000] Updated weights for policy 0, policy_version 117904 (0.0025) [2024-06-13 00:34:15,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1931821056. Throughput: 0: 49343.1. Samples: 1460702220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 00:34:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:34:17,360][71000] Updated weights for policy 0, policy_version 117914 (0.0027) [2024-06-13 00:34:20,332][71000] Updated weights for policy 0, policy_version 117924 (0.0030) [2024-06-13 00:34:20,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1932066816. Throughput: 0: 49337.7. Samples: 1460840020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 00:34:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:34:24,041][71000] Updated weights for policy 0, policy_version 117934 (0.0036) [2024-06-13 00:34:25,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49153.8, 300 sec: 49429.7). Total num frames: 1932328960. Throughput: 0: 49291.7. Samples: 1461140420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 00:34:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:34:26,979][71000] Updated weights for policy 0, policy_version 117944 (0.0027) [2024-06-13 00:34:30,566][71000] Updated weights for policy 0, policy_version 117954 (0.0032) [2024-06-13 00:34:30,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1932558336. Throughput: 0: 49649.4. Samples: 1461443600. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 00:34:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:34:33,726][71000] Updated weights for policy 0, policy_version 117964 (0.0032) [2024-06-13 00:34:35,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 1932804096. Throughput: 0: 49245.8. Samples: 1461584240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 00:34:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:34:37,280][71000] Updated weights for policy 0, policy_version 117974 (0.0031) [2024-06-13 00:34:40,378][71000] Updated weights for policy 0, policy_version 117984 (0.0030) [2024-06-13 00:34:40,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1933066240. Throughput: 0: 49336.4. Samples: 1461880960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 00:34:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:34:44,200][71000] Updated weights for policy 0, policy_version 117994 (0.0028) [2024-06-13 00:34:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 1933295616. Throughput: 0: 49014.7. Samples: 1462163040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 00:34:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:34:47,349][71000] Updated weights for policy 0, policy_version 118004 (0.0033) [2024-06-13 00:34:50,572][71000] Updated weights for policy 0, policy_version 118014 (0.0026) [2024-06-13 00:34:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1933557760. Throughput: 0: 49067.2. Samples: 1462320920. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 00:34:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:34:54,087][71000] Updated weights for policy 0, policy_version 118024 (0.0031) [2024-06-13 00:34:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 1933787136. Throughput: 0: 49120.0. Samples: 1462613440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 00:34:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:34:57,238][71000] Updated weights for policy 0, policy_version 118034 (0.0018) [2024-06-13 00:35:00,483][71000] Updated weights for policy 0, policy_version 118044 (0.0024) [2024-06-13 00:35:00,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.2, 300 sec: 49430.4). Total num frames: 1934049280. Throughput: 0: 49308.9. Samples: 1462921120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 00:35:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:35:03,589][71000] Updated weights for policy 0, policy_version 118054 (0.0022) [2024-06-13 00:35:05,939][70768] Fps is (10 sec: 49152.6, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 1934278656. Throughput: 0: 49513.9. Samples: 1463068140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 00:35:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:35:07,065][71000] Updated weights for policy 0, policy_version 118064 (0.0027) [2024-06-13 00:35:10,442][71000] Updated weights for policy 0, policy_version 118074 (0.0028) [2024-06-13 00:35:10,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1934540800. Throughput: 0: 49319.5. Samples: 1463359800. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 00:35:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:35:13,828][71000] Updated weights for policy 0, policy_version 118084 (0.0027) [2024-06-13 00:35:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1934770176. Throughput: 0: 49121.3. Samples: 1463654060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 00:35:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:35:17,021][71000] Updated weights for policy 0, policy_version 118094 (0.0028) [2024-06-13 00:35:20,208][70980] Signal inference workers to stop experience collection... (21500 times) [2024-06-13 00:35:20,208][70980] Signal inference workers to resume experience collection... (21500 times) [2024-06-13 00:35:20,220][71000] InferenceWorker_p0-w0: stopping experience collection (21500 times) [2024-06-13 00:35:20,220][71000] InferenceWorker_p0-w0: resuming experience collection (21500 times) [2024-06-13 00:35:20,345][71000] Updated weights for policy 0, policy_version 118104 (0.0028) [2024-06-13 00:35:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 1935015936. Throughput: 0: 49175.8. Samples: 1463797160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 00:35:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:35:23,730][71000] Updated weights for policy 0, policy_version 118114 (0.0032) [2024-06-13 00:35:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 1935261696. Throughput: 0: 49046.8. Samples: 1464088060. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 00:35:25,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:35:27,067][71000] Updated weights for policy 0, policy_version 118124 (0.0030) [2024-06-13 00:35:30,611][71000] Updated weights for policy 0, policy_version 118134 (0.0024) [2024-06-13 00:35:30,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 1935507456. Throughput: 0: 49348.4. Samples: 1464383720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 00:35:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:35:33,942][71000] Updated weights for policy 0, policy_version 118144 (0.0025) [2024-06-13 00:35:35,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 1935753216. Throughput: 0: 49228.1. Samples: 1464536180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 00:35:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:35:36,880][71000] Updated weights for policy 0, policy_version 118154 (0.0025) [2024-06-13 00:35:40,310][71000] Updated weights for policy 0, policy_version 118164 (0.0024) [2024-06-13 00:35:40,944][70768] Fps is (10 sec: 52407.3, 60 sec: 49421.7, 300 sec: 49484.5). Total num frames: 1936031744. Throughput: 0: 49596.8. Samples: 1464845500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 00:35:40,944][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:35:40,959][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000118166_1936031744.pth... [2024-06-13 00:35:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000117442_1924169728.pth [2024-06-13 00:35:43,608][71000] Updated weights for policy 0, policy_version 118174 (0.0028) [2024-06-13 00:35:45,940][70768] Fps is (10 sec: 50789.1, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1936261120. Throughput: 0: 49215.2. Samples: 1465135820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 00:35:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:35:47,016][71000] Updated weights for policy 0, policy_version 118184 (0.0025) [2024-06-13 00:35:50,179][71000] Updated weights for policy 0, policy_version 118194 (0.0033) [2024-06-13 00:35:50,940][70768] Fps is (10 sec: 47533.6, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1936506880. Throughput: 0: 49248.0. Samples: 1465284300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 00:35:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:35:53,685][71000] Updated weights for policy 0, policy_version 118204 (0.0024) [2024-06-13 00:35:55,939][70768] Fps is (10 sec: 49153.5, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1936752640. Throughput: 0: 49175.4. Samples: 1465572680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 00:35:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:35:56,926][71000] Updated weights for policy 0, policy_version 118214 (0.0026) [2024-06-13 00:36:00,028][71000] Updated weights for policy 0, policy_version 118224 (0.0021) [2024-06-13 00:36:00,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 1937031168. Throughput: 0: 49438.2. Samples: 1465878780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 00:36:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:36:03,318][71000] Updated weights for policy 0, policy_version 118234 (0.0036) [2024-06-13 00:36:05,939][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1937260544. Throughput: 0: 49782.0. Samples: 1466037340. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 00:36:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:36:06,777][71000] Updated weights for policy 0, policy_version 118244 (0.0037) [2024-06-13 00:36:09,711][71000] Updated weights for policy 0, policy_version 118254 (0.0032) [2024-06-13 00:36:10,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.1, 300 sec: 49374.1). Total num frames: 1937489920. Throughput: 0: 49829.3. Samples: 1466330380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 00:36:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:36:13,392][71000] Updated weights for policy 0, policy_version 118264 (0.0031) [2024-06-13 00:36:15,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1937735680. Throughput: 0: 49596.2. Samples: 1466615540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 00:36:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:36:16,696][71000] Updated weights for policy 0, policy_version 118274 (0.0035) [2024-06-13 00:36:20,584][71000] Updated weights for policy 0, policy_version 118284 (0.0029) [2024-06-13 00:36:20,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 1937981440. Throughput: 0: 49576.5. Samples: 1466767120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 00:36:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:36:23,483][71000] Updated weights for policy 0, policy_version 118294 (0.0025) [2024-06-13 00:36:25,940][70768] Fps is (10 sec: 52427.6, 60 sec: 49971.1, 300 sec: 49540.8). Total num frames: 1938259968. Throughput: 0: 49370.2. Samples: 1467066960. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 00:36:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:36:26,989][71000] Updated weights for policy 0, policy_version 118304 (0.0029) [2024-06-13 00:36:29,914][71000] Updated weights for policy 0, policy_version 118314 (0.0022) [2024-06-13 00:36:30,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1938489344. Throughput: 0: 49547.3. Samples: 1467365440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 00:36:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:36:33,462][71000] Updated weights for policy 0, policy_version 118324 (0.0028) [2024-06-13 00:36:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 1938751488. Throughput: 0: 49585.7. Samples: 1467515660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 00:36:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:36:36,770][71000] Updated weights for policy 0, policy_version 118334 (0.0032) [2024-06-13 00:36:40,221][71000] Updated weights for policy 0, policy_version 118344 (0.0027) [2024-06-13 00:36:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48882.3, 300 sec: 49318.6). Total num frames: 1938964480. Throughput: 0: 49748.3. Samples: 1467811360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 00:36:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:36:40,950][70980] Signal inference workers to stop experience collection... (21550 times) [2024-06-13 00:36:40,985][71000] InferenceWorker_p0-w0: stopping experience collection (21550 times) [2024-06-13 00:36:41,005][70980] Signal inference workers to resume experience collection... 
(21550 times) [2024-06-13 00:36:41,012][71000] InferenceWorker_p0-w0: resuming experience collection (21550 times) [2024-06-13 00:36:43,032][71000] Updated weights for policy 0, policy_version 118354 (0.0039) [2024-06-13 00:36:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1939243008. Throughput: 0: 49461.2. Samples: 1468104540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 00:36:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:36:46,746][71000] Updated weights for policy 0, policy_version 118364 (0.0026) [2024-06-13 00:36:49,381][71000] Updated weights for policy 0, policy_version 118374 (0.0029) [2024-06-13 00:36:50,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49698.0, 300 sec: 49540.7). Total num frames: 1939488768. Throughput: 0: 49441.1. Samples: 1468262200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 00:36:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:36:53,296][71000] Updated weights for policy 0, policy_version 118384 (0.0028) [2024-06-13 00:36:55,897][71000] Updated weights for policy 0, policy_version 118394 (0.0029) [2024-06-13 00:36:55,940][70768] Fps is (10 sec: 52429.4, 60 sec: 50244.2, 300 sec: 49540.8). Total num frames: 1939767296. Throughput: 0: 49953.4. Samples: 1468578280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 00:36:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:36:59,838][71000] Updated weights for policy 0, policy_version 118404 (0.0030) [2024-06-13 00:37:00,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 1939980288. Throughput: 0: 50094.4. Samples: 1468869800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 00:37:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:37:02,717][71000] Updated weights for policy 0, policy_version 118414 (0.0027) [2024-06-13 00:37:05,939][70768] Fps is (10 sec: 45875.8, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 1940226048. Throughput: 0: 49818.3. Samples: 1469008940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 00:37:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:37:06,651][71000] Updated weights for policy 0, policy_version 118424 (0.0034) [2024-06-13 00:37:09,277][71000] Updated weights for policy 0, policy_version 118434 (0.0033) [2024-06-13 00:37:10,940][70768] Fps is (10 sec: 52429.3, 60 sec: 50244.3, 300 sec: 49596.3). Total num frames: 1940504576. Throughput: 0: 49772.1. Samples: 1469306700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 00:37:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:37:13,379][71000] Updated weights for policy 0, policy_version 118444 (0.0026) [2024-06-13 00:37:15,866][71000] Updated weights for policy 0, policy_version 118454 (0.0023) [2024-06-13 00:37:15,940][70768] Fps is (10 sec: 52427.9, 60 sec: 50244.2, 300 sec: 49485.2). Total num frames: 1940750336. Throughput: 0: 49756.0. Samples: 1469604460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 00:37:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:37:19,739][71000] Updated weights for policy 0, policy_version 118464 (0.0033) [2024-06-13 00:37:20,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 1940963328. Throughput: 0: 49788.5. Samples: 1469756140. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 00:37:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:37:22,592][71000] Updated weights for policy 0, policy_version 118474 (0.0027) [2024-06-13 00:37:25,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1941209088. Throughput: 0: 49556.8. Samples: 1470041420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 00:37:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:37:26,229][71000] Updated weights for policy 0, policy_version 118484 (0.0025) [2024-06-13 00:37:29,283][71000] Updated weights for policy 0, policy_version 118494 (0.0028) [2024-06-13 00:37:30,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 1941471232. Throughput: 0: 49584.2. Samples: 1470335820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 00:37:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:37:33,104][71000] Updated weights for policy 0, policy_version 118504 (0.0027) [2024-06-13 00:37:35,719][71000] Updated weights for policy 0, policy_version 118514 (0.0026) [2024-06-13 00:37:35,940][70768] Fps is (10 sec: 54067.9, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1941749760. Throughput: 0: 49754.0. Samples: 1470501120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 00:37:35,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:37:39,380][71000] Updated weights for policy 0, policy_version 118524 (0.0028) [2024-06-13 00:37:40,940][70768] Fps is (10 sec: 50789.4, 60 sec: 50244.2, 300 sec: 49429.7). Total num frames: 1941979136. Throughput: 0: 49367.4. Samples: 1470799820. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-13 00:37:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:37:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000118529_1941979136.pth... 
[2024-06-13 00:37:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000117804_1930100736.pth [2024-06-13 00:37:42,007][71000] Updated weights for policy 0, policy_version 118534 (0.0034) [2024-06-13 00:37:43,185][70980] Signal inference workers to stop experience collection... (21600 times) [2024-06-13 00:37:43,185][70980] Signal inference workers to resume experience collection... (21600 times) [2024-06-13 00:37:43,221][71000] InferenceWorker_p0-w0: stopping experience collection (21600 times) [2024-06-13 00:37:43,221][71000] InferenceWorker_p0-w0: resuming experience collection (21600 times) [2024-06-13 00:37:45,939][70768] Fps is (10 sec: 45875.5, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1942208512. Throughput: 0: 49326.5. Samples: 1471089480. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-13 00:37:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:37:45,966][71000] Updated weights for policy 0, policy_version 118544 (0.0036) [2024-06-13 00:37:49,183][71000] Updated weights for policy 0, policy_version 118554 (0.0029) [2024-06-13 00:37:50,939][70768] Fps is (10 sec: 47514.6, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1942454272. Throughput: 0: 49500.8. Samples: 1471236480. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-13 00:37:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:37:52,539][71000] Updated weights for policy 0, policy_version 118564 (0.0025) [2024-06-13 00:37:55,807][71000] Updated weights for policy 0, policy_version 118574 (0.0027) [2024-06-13 00:37:55,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1942716416. Throughput: 0: 49397.8. Samples: 1471529600. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-13 00:37:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:37:59,459][71000] Updated weights for policy 0, policy_version 118584 (0.0031) [2024-06-13 00:38:00,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 1942929408. Throughput: 0: 49312.0. Samples: 1471823500. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-13 00:38:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:38:02,289][71000] Updated weights for policy 0, policy_version 118594 (0.0026) [2024-06-13 00:38:05,829][71000] Updated weights for policy 0, policy_version 118604 (0.0024) [2024-06-13 00:38:05,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49697.9, 300 sec: 49485.2). Total num frames: 1943207936. Throughput: 0: 49176.8. Samples: 1471969100. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-13 00:38:05,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-13 00:38:09,115][71000] Updated weights for policy 0, policy_version 118614 (0.0017) [2024-06-13 00:38:10,940][70768] Fps is (10 sec: 52426.9, 60 sec: 49151.7, 300 sec: 49596.2). Total num frames: 1943453696. Throughput: 0: 49536.1. Samples: 1472270560. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-13 00:38:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:38:12,267][71000] Updated weights for policy 0, policy_version 118624 (0.0029) [2024-06-13 00:38:15,807][71000] Updated weights for policy 0, policy_version 118634 (0.0023) [2024-06-13 00:38:15,939][70768] Fps is (10 sec: 50791.6, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1943715840. Throughput: 0: 49866.3. Samples: 1472579800. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-13 00:38:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:38:18,748][71000] Updated weights for policy 0, policy_version 118644 (0.0026) [2024-06-13 00:38:20,939][70768] Fps is (10 sec: 49154.1, 60 sec: 49698.2, 300 sec: 49374.5). Total num frames: 1943945216. Throughput: 0: 49448.5. Samples: 1472726300. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-13 00:38:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:38:22,185][71000] Updated weights for policy 0, policy_version 118654 (0.0032) [2024-06-13 00:38:25,469][71000] Updated weights for policy 0, policy_version 118664 (0.0035) [2024-06-13 00:38:25,939][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.3, 300 sec: 49429.7). Total num frames: 1944190976. Throughput: 0: 49412.2. Samples: 1473023360. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-13 00:38:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:38:28,573][71000] Updated weights for policy 0, policy_version 118674 (0.0027) [2024-06-13 00:38:30,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49424.9, 300 sec: 49540.7). Total num frames: 1944436736. Throughput: 0: 49591.7. Samples: 1473321120. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-13 00:38:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:38:32,096][71000] Updated weights for policy 0, policy_version 118684 (0.0027) [2024-06-13 00:38:35,380][71000] Updated weights for policy 0, policy_version 118694 (0.0027) [2024-06-13 00:38:35,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.9, 300 sec: 49540.8). Total num frames: 1944698880. Throughput: 0: 49482.4. Samples: 1473463200. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-13 00:38:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:38:38,692][71000] Updated weights for policy 0, policy_version 118704 (0.0033) [2024-06-13 00:38:40,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 1944928256. Throughput: 0: 49542.7. Samples: 1473759020. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-13 00:38:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:38:42,197][71000] Updated weights for policy 0, policy_version 118714 (0.0032) [2024-06-13 00:38:45,716][71000] Updated weights for policy 0, policy_version 118724 (0.0034) [2024-06-13 00:38:45,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1945174016. Throughput: 0: 49449.8. Samples: 1474048740. Policy #0 lag: (min: 1.0, avg: 9.0, max: 19.0) [2024-06-13 00:38:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:38:48,803][71000] Updated weights for policy 0, policy_version 118734 (0.0022) [2024-06-13 00:38:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1945419776. Throughput: 0: 49661.4. Samples: 1474203860. Policy #0 lag: (min: 1.0, avg: 9.0, max: 19.0) [2024-06-13 00:38:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:38:52,202][71000] Updated weights for policy 0, policy_version 118744 (0.0024) [2024-06-13 00:38:54,655][70980] Signal inference workers to stop experience collection... (21650 times) [2024-06-13 00:38:54,700][71000] InferenceWorker_p0-w0: stopping experience collection (21650 times) [2024-06-13 00:38:54,766][70980] Signal inference workers to resume experience collection... 
(21650 times) [2024-06-13 00:38:54,766][71000] InferenceWorker_p0-w0: resuming experience collection (21650 times) [2024-06-13 00:38:55,480][71000] Updated weights for policy 0, policy_version 118754 (0.0025) [2024-06-13 00:38:55,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1945681920. Throughput: 0: 49503.9. Samples: 1474498220. Policy #0 lag: (min: 1.0, avg: 9.0, max: 19.0) [2024-06-13 00:38:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:38:58,939][71000] Updated weights for policy 0, policy_version 118764 (0.0023) [2024-06-13 00:39:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 1945911296. Throughput: 0: 49134.9. Samples: 1474790880. Policy #0 lag: (min: 1.0, avg: 9.0, max: 19.0) [2024-06-13 00:39:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:39:02,750][71000] Updated weights for policy 0, policy_version 118774 (0.0034) [2024-06-13 00:39:05,676][71000] Updated weights for policy 0, policy_version 118784 (0.0027) [2024-06-13 00:39:05,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1946173440. Throughput: 0: 49082.2. Samples: 1474935000. Policy #0 lag: (min: 1.0, avg: 9.0, max: 19.0) [2024-06-13 00:39:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:39:09,110][71000] Updated weights for policy 0, policy_version 118794 (0.0025) [2024-06-13 00:39:10,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.3, 300 sec: 49429.7). Total num frames: 1946402816. Throughput: 0: 49212.9. Samples: 1475237940. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 19.0) [2024-06-13 00:39:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:39:12,004][71000] Updated weights for policy 0, policy_version 118804 (0.0040) [2024-06-13 00:39:15,478][71000] Updated weights for policy 0, policy_version 118814 (0.0029) [2024-06-13 00:39:15,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49151.8, 300 sec: 49485.2). Total num frames: 1946664960. Throughput: 0: 49211.1. Samples: 1475535620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 19.0) [2024-06-13 00:39:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:39:18,792][71000] Updated weights for policy 0, policy_version 118824 (0.0032) [2024-06-13 00:39:20,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.1, 300 sec: 49485.3). Total num frames: 1946927104. Throughput: 0: 49375.3. Samples: 1475685080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 19.0) [2024-06-13 00:39:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:39:22,110][71000] Updated weights for policy 0, policy_version 118834 (0.0026) [2024-06-13 00:39:25,354][71000] Updated weights for policy 0, policy_version 118844 (0.0032) [2024-06-13 00:39:25,939][70768] Fps is (10 sec: 50791.6, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1947172864. Throughput: 0: 49333.4. Samples: 1475979020. Policy #0 lag: (min: 1.0, avg: 9.0, max: 19.0) [2024-06-13 00:39:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:39:28,808][71000] Updated weights for policy 0, policy_version 118854 (0.0030) [2024-06-13 00:39:30,940][70768] Fps is (10 sec: 45874.3, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1947385856. Throughput: 0: 49690.1. Samples: 1476284800. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 19.0) [2024-06-13 00:39:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:39:31,750][71000] Updated weights for policy 0, policy_version 118864 (0.0028) [2024-06-13 00:39:35,278][71000] Updated weights for policy 0, policy_version 118874 (0.0027) [2024-06-13 00:39:35,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.2, 300 sec: 49485.3). Total num frames: 1947664384. Throughput: 0: 49373.8. Samples: 1476425680. Policy #0 lag: (min: 1.0, avg: 9.0, max: 19.0) [2024-06-13 00:39:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:39:38,139][71000] Updated weights for policy 0, policy_version 118884 (0.0024) [2024-06-13 00:39:40,940][70768] Fps is (10 sec: 54067.4, 60 sec: 49971.1, 300 sec: 49596.3). Total num frames: 1947926528. Throughput: 0: 49474.2. Samples: 1476724560. Policy #0 lag: (min: 1.0, avg: 9.0, max: 19.0) [2024-06-13 00:39:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:39:40,963][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000118892_1947926528.pth... [2024-06-13 00:39:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000118166_1936031744.pth [2024-06-13 00:39:41,813][71000] Updated weights for policy 0, policy_version 118894 (0.0026) [2024-06-13 00:39:44,939][71000] Updated weights for policy 0, policy_version 118904 (0.0029) [2024-06-13 00:39:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1948155904. Throughput: 0: 49744.6. Samples: 1477029380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:39:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:39:48,554][71000] Updated weights for policy 0, policy_version 118914 (0.0029) [2024-06-13 00:39:50,939][70768] Fps is (10 sec: 47514.6, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1948401664. Throughput: 0: 49722.3. Samples: 1477172500. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:39:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:39:51,407][71000] Updated weights for policy 0, policy_version 118924 (0.0038) [2024-06-13 00:39:55,182][71000] Updated weights for policy 0, policy_version 118934 (0.0032) [2024-06-13 00:39:55,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1948647424. Throughput: 0: 49715.4. Samples: 1477475140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:39:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:39:58,174][71000] Updated weights for policy 0, policy_version 118944 (0.0030) [2024-06-13 00:40:00,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1948909568. Throughput: 0: 49651.7. Samples: 1477769940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:40:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:40:01,551][71000] Updated weights for policy 0, policy_version 118954 (0.0035) [2024-06-13 00:40:04,920][71000] Updated weights for policy 0, policy_version 118964 (0.0037) [2024-06-13 00:40:05,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1949155328. Throughput: 0: 49763.5. Samples: 1477924440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:40:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:40:08,400][71000] Updated weights for policy 0, policy_version 118974 (0.0025) [2024-06-13 00:40:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1949384704. Throughput: 0: 49727.0. Samples: 1478216740. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:40:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:40:11,611][71000] Updated weights for policy 0, policy_version 118984 (0.0027) [2024-06-13 00:40:14,908][71000] Updated weights for policy 0, policy_version 118994 (0.0026) [2024-06-13 00:40:15,883][70980] Signal inference workers to stop experience collection... (21700 times) [2024-06-13 00:40:15,888][70980] Signal inference workers to resume experience collection... (21700 times) [2024-06-13 00:40:15,915][71000] InferenceWorker_p0-w0: stopping experience collection (21700 times) [2024-06-13 00:40:15,915][71000] InferenceWorker_p0-w0: resuming experience collection (21700 times) [2024-06-13 00:40:15,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1949630464. Throughput: 0: 49598.0. Samples: 1478516700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:40:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:40:18,142][71000] Updated weights for policy 0, policy_version 119004 (0.0034) [2024-06-13 00:40:20,942][70768] Fps is (10 sec: 50779.8, 60 sec: 49423.3, 300 sec: 49596.0). Total num frames: 1949892608. Throughput: 0: 49715.9. Samples: 1478663000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:40:20,942][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 00:40:21,448][71000] Updated weights for policy 0, policy_version 119014 (0.0033) [2024-06-13 00:40:24,624][71000] Updated weights for policy 0, policy_version 119024 (0.0022) [2024-06-13 00:40:25,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49424.9, 300 sec: 49596.3). Total num frames: 1950138368. Throughput: 0: 49774.7. Samples: 1478964420. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:40:25,949][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:40:28,014][71000] Updated weights for policy 0, policy_version 119034 (0.0024) [2024-06-13 00:40:30,940][70768] Fps is (10 sec: 49162.2, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1950384128. Throughput: 0: 49657.3. Samples: 1479263960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:40:30,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-13 00:40:31,155][71000] Updated weights for policy 0, policy_version 119044 (0.0026) [2024-06-13 00:40:34,366][71000] Updated weights for policy 0, policy_version 119054 (0.0032) [2024-06-13 00:40:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.1, 300 sec: 49541.5). Total num frames: 1950646272. Throughput: 0: 49924.3. Samples: 1479419100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:40:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:40:37,549][71000] Updated weights for policy 0, policy_version 119064 (0.0028) [2024-06-13 00:40:40,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.2, 300 sec: 49596.4). Total num frames: 1950892032. Throughput: 0: 49885.5. Samples: 1479719980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-13 00:40:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:40:41,018][71000] Updated weights for policy 0, policy_version 119074 (0.0029) [2024-06-13 00:40:44,127][71000] Updated weights for policy 0, policy_version 119084 (0.0024) [2024-06-13 00:40:45,940][70768] Fps is (10 sec: 52428.9, 60 sec: 50244.2, 300 sec: 49707.4). Total num frames: 1951170560. Throughput: 0: 49960.4. Samples: 1480018160. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-13 00:40:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:40:47,856][71000] Updated weights for policy 0, policy_version 119094 (0.0030) [2024-06-13 00:40:50,807][71000] Updated weights for policy 0, policy_version 119104 (0.0028) [2024-06-13 00:40:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49971.1, 300 sec: 49651.8). Total num frames: 1951399936. Throughput: 0: 50080.5. Samples: 1480178060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-13 00:40:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:40:54,445][71000] Updated weights for policy 0, policy_version 119114 (0.0026) [2024-06-13 00:40:55,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1951645696. Throughput: 0: 50259.5. Samples: 1480478420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-13 00:40:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:40:57,288][71000] Updated weights for policy 0, policy_version 119124 (0.0024) [2024-06-13 00:41:00,912][71000] Updated weights for policy 0, policy_version 119134 (0.0032) [2024-06-13 00:41:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1951891456. Throughput: 0: 50076.5. Samples: 1480770140. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-13 00:41:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:41:03,674][71000] Updated weights for policy 0, policy_version 119144 (0.0034) [2024-06-13 00:41:05,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49971.2, 300 sec: 49707.4). Total num frames: 1952153600. Throughput: 0: 50109.5. Samples: 1480917820. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-13 00:41:05,940][70768] Avg episode reward: [(0, '0.264')] [2024-06-13 00:41:07,246][71000] Updated weights for policy 0, policy_version 119154 (0.0024) [2024-06-13 00:41:10,177][71000] Updated weights for policy 0, policy_version 119164 (0.0044) [2024-06-13 00:41:10,940][70768] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 49707.4). Total num frames: 1952399360. Throughput: 0: 50090.3. Samples: 1481218480. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-13 00:41:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:41:14,127][71000] Updated weights for policy 0, policy_version 119174 (0.0030) [2024-06-13 00:41:15,944][70768] Fps is (10 sec: 47493.2, 60 sec: 49967.6, 300 sec: 49651.1). Total num frames: 1952628736. Throughput: 0: 50033.1. Samples: 1481515660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-13 00:41:15,944][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:41:17,036][71000] Updated weights for policy 0, policy_version 119184 (0.0029) [2024-06-13 00:41:20,899][71000] Updated weights for policy 0, policy_version 119194 (0.0029) [2024-06-13 00:41:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49699.8, 300 sec: 49540.8). Total num frames: 1952874496. Throughput: 0: 49695.1. Samples: 1481655380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-13 00:41:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:41:23,091][70980] Signal inference workers to stop experience collection... (21750 times) [2024-06-13 00:41:23,138][71000] InferenceWorker_p0-w0: stopping experience collection (21750 times) [2024-06-13 00:41:23,144][70980] Signal inference workers to resume experience collection... 
(21750 times) [2024-06-13 00:41:23,156][71000] InferenceWorker_p0-w0: resuming experience collection (21750 times) [2024-06-13 00:41:23,435][71000] Updated weights for policy 0, policy_version 119204 (0.0021) [2024-06-13 00:41:25,940][70768] Fps is (10 sec: 49173.1, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1953120256. Throughput: 0: 49572.4. Samples: 1481950740. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-13 00:41:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 00:41:27,343][71000] Updated weights for policy 0, policy_version 119214 (0.0029) [2024-06-13 00:41:30,126][71000] Updated weights for policy 0, policy_version 119224 (0.0023) [2024-06-13 00:41:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1953382400. Throughput: 0: 49359.6. Samples: 1482239340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-13 00:41:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:41:33,844][71000] Updated weights for policy 0, policy_version 119234 (0.0025) [2024-06-13 00:41:35,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 1953628160. Throughput: 0: 49306.7. Samples: 1482396860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-13 00:41:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:41:36,801][71000] Updated weights for policy 0, policy_version 119244 (0.0020) [2024-06-13 00:41:40,382][71000] Updated weights for policy 0, policy_version 119254 (0.0033) [2024-06-13 00:41:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1953873920. Throughput: 0: 49385.9. Samples: 1482700780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-13 00:41:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:41:41,025][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000119256_1953890304.pth... 
[2024-06-13 00:41:41,074][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000118529_1941979136.pth [2024-06-13 00:41:43,460][71000] Updated weights for policy 0, policy_version 119264 (0.0027) [2024-06-13 00:41:45,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49596.3). Total num frames: 1954119680. Throughput: 0: 49521.8. Samples: 1482998620. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-13 00:41:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:41:47,054][71000] Updated weights for policy 0, policy_version 119274 (0.0027) [2024-06-13 00:41:49,963][71000] Updated weights for policy 0, policy_version 119284 (0.0029) [2024-06-13 00:41:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1954381824. Throughput: 0: 49621.3. Samples: 1483150780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:41:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:41:53,824][71000] Updated weights for policy 0, policy_version 119294 (0.0028) [2024-06-13 00:41:55,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.3, 300 sec: 49651.9). Total num frames: 1954627584. Throughput: 0: 49543.2. Samples: 1483447920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:41:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:41:56,439][71000] Updated weights for policy 0, policy_version 119304 (0.0030) [2024-06-13 00:42:00,013][71000] Updated weights for policy 0, policy_version 119314 (0.0029) [2024-06-13 00:42:00,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49698.0, 300 sec: 49651.8). Total num frames: 1954873344. Throughput: 0: 49754.0. Samples: 1483754380. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:42:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:42:02,915][71000] Updated weights for policy 0, policy_version 119324 (0.0027) [2024-06-13 00:42:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1955119104. Throughput: 0: 49887.6. Samples: 1483900320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:42:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:42:06,730][71000] Updated weights for policy 0, policy_version 119334 (0.0028) [2024-06-13 00:42:09,430][71000] Updated weights for policy 0, policy_version 119344 (0.0032) [2024-06-13 00:42:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1955381248. Throughput: 0: 49804.3. Samples: 1484191940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:42:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:42:13,461][71000] Updated weights for policy 0, policy_version 119354 (0.0036) [2024-06-13 00:42:15,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49701.7, 300 sec: 49651.9). Total num frames: 1955610624. Throughput: 0: 49995.7. Samples: 1484489140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:42:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:42:16,312][71000] Updated weights for policy 0, policy_version 119364 (0.0023) [2024-06-13 00:42:19,930][71000] Updated weights for policy 0, policy_version 119374 (0.0027) [2024-06-13 00:42:20,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49698.2, 300 sec: 49651.9). Total num frames: 1955856384. Throughput: 0: 49720.0. Samples: 1484634260. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:42:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:42:22,933][71000] Updated weights for policy 0, policy_version 119384 (0.0027) [2024-06-13 00:42:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1956102144. Throughput: 0: 49532.0. Samples: 1484929720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:42:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:42:26,646][71000] Updated weights for policy 0, policy_version 119394 (0.0025) [2024-06-13 00:42:29,744][71000] Updated weights for policy 0, policy_version 119404 (0.0024) [2024-06-13 00:42:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1956364288. Throughput: 0: 49534.6. Samples: 1485227680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:42:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:42:33,361][71000] Updated weights for policy 0, policy_version 119414 (0.0024) [2024-06-13 00:42:35,579][70980] Signal inference workers to stop experience collection... (21800 times) [2024-06-13 00:42:35,630][71000] InferenceWorker_p0-w0: stopping experience collection (21800 times) [2024-06-13 00:42:35,631][70980] Signal inference workers to resume experience collection... (21800 times) [2024-06-13 00:42:35,642][71000] InferenceWorker_p0-w0: resuming experience collection (21800 times) [2024-06-13 00:42:35,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1956610048. Throughput: 0: 49611.2. Samples: 1485383280. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:42:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:42:36,411][71000] Updated weights for policy 0, policy_version 119424 (0.0035) [2024-06-13 00:42:39,833][71000] Updated weights for policy 0, policy_version 119434 (0.0023) [2024-06-13 00:42:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 1956855808. Throughput: 0: 49568.3. Samples: 1485678500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:42:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:42:42,727][71000] Updated weights for policy 0, policy_version 119444 (0.0027) [2024-06-13 00:42:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 1957101568. Throughput: 0: 49421.9. Samples: 1485978360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:42:45,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:42:46,454][71000] Updated weights for policy 0, policy_version 119454 (0.0027) [2024-06-13 00:42:49,516][71000] Updated weights for policy 0, policy_version 119464 (0.0033) [2024-06-13 00:42:50,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 1957363712. Throughput: 0: 49544.0. Samples: 1486129800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-13 00:42:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:42:53,058][71000] Updated weights for policy 0, policy_version 119474 (0.0027) [2024-06-13 00:42:55,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.1, 300 sec: 49762.9). Total num frames: 1957609472. Throughput: 0: 49734.4. Samples: 1486429980. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-13 00:42:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:42:55,954][71000] Updated weights for policy 0, policy_version 119484 (0.0032) [2024-06-13 00:42:59,795][71000] Updated weights for policy 0, policy_version 119494 (0.0027) [2024-06-13 00:43:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1957838848. Throughput: 0: 49567.9. Samples: 1486719700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-13 00:43:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:43:02,626][71000] Updated weights for policy 0, policy_version 119504 (0.0035) [2024-06-13 00:43:05,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49425.0, 300 sec: 49596.4). Total num frames: 1958084608. Throughput: 0: 49667.8. Samples: 1486869320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-13 00:43:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:43:06,406][71000] Updated weights for policy 0, policy_version 119514 (0.0021) [2024-06-13 00:43:09,181][71000] Updated weights for policy 0, policy_version 119524 (0.0030) [2024-06-13 00:43:10,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1958346752. Throughput: 0: 49549.8. Samples: 1487159460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-13 00:43:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:43:12,827][71000] Updated weights for policy 0, policy_version 119534 (0.0040) [2024-06-13 00:43:15,939][70768] Fps is (10 sec: 49153.2, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1958576128. Throughput: 0: 49635.2. Samples: 1487461260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-13 00:43:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:43:16,097][71000] Updated weights for policy 0, policy_version 119544 (0.0029) [2024-06-13 00:43:19,364][71000] Updated weights for policy 0, policy_version 119554 (0.0030) [2024-06-13 00:43:20,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1958821888. Throughput: 0: 49347.0. Samples: 1487603900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-13 00:43:20,941][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:43:22,622][71000] Updated weights for policy 0, policy_version 119564 (0.0029) [2024-06-13 00:43:25,851][71000] Updated weights for policy 0, policy_version 119574 (0.0032) [2024-06-13 00:43:25,940][70768] Fps is (10 sec: 52427.5, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 1959100416. Throughput: 0: 49465.7. Samples: 1487904460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-13 00:43:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:43:29,411][71000] Updated weights for policy 0, policy_version 119584 (0.0025) [2024-06-13 00:43:30,091][70980] Signal inference workers to stop experience collection... (21850 times) [2024-06-13 00:43:30,142][71000] InferenceWorker_p0-w0: stopping experience collection (21850 times) [2024-06-13 00:43:30,147][70980] Signal inference workers to resume experience collection... (21850 times) [2024-06-13 00:43:30,153][71000] InferenceWorker_p0-w0: resuming experience collection (21850 times) [2024-06-13 00:43:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1959329792. Throughput: 0: 49334.2. Samples: 1488198400. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-13 00:43:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:43:32,606][71000] Updated weights for policy 0, policy_version 119594 (0.0034) [2024-06-13 00:43:35,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49425.1, 300 sec: 49651.9). Total num frames: 1959575552. Throughput: 0: 49220.5. Samples: 1488344720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-13 00:43:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:43:36,143][71000] Updated weights for policy 0, policy_version 119604 (0.0032) [2024-06-13 00:43:39,219][71000] Updated weights for policy 0, policy_version 119614 (0.0029) [2024-06-13 00:43:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.1, 300 sec: 49651.8). Total num frames: 1959821312. Throughput: 0: 48971.0. Samples: 1488633680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-13 00:43:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:43:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000119618_1959821312.pth... [2024-06-13 00:43:41,000][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000118892_1947926528.pth [2024-06-13 00:43:42,786][71000] Updated weights for policy 0, policy_version 119624 (0.0024) [2024-06-13 00:43:45,770][71000] Updated weights for policy 0, policy_version 119634 (0.0036) [2024-06-13 00:43:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 49707.4). Total num frames: 1960083456. Throughput: 0: 49484.0. Samples: 1488946480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-13 00:43:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:43:49,481][71000] Updated weights for policy 0, policy_version 119644 (0.0031) [2024-06-13 00:43:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 49596.3). Total num frames: 1960312832. Throughput: 0: 49354.7. Samples: 1489090280. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 00:43:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:43:52,611][71000] Updated weights for policy 0, policy_version 119654 (0.0031) [2024-06-13 00:43:55,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48878.9, 300 sec: 49596.3). Total num frames: 1960542208. Throughput: 0: 49413.4. Samples: 1489383060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 00:43:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:43:56,142][71000] Updated weights for policy 0, policy_version 119664 (0.0027) [2024-06-13 00:43:59,453][71000] Updated weights for policy 0, policy_version 119674 (0.0028) [2024-06-13 00:44:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 49540.7). Total num frames: 1960787968. Throughput: 0: 49201.5. Samples: 1489675340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 00:44:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:44:02,720][71000] Updated weights for policy 0, policy_version 119684 (0.0026) [2024-06-13 00:44:05,826][71000] Updated weights for policy 0, policy_version 119694 (0.0031) [2024-06-13 00:44:05,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 1961066496. Throughput: 0: 49371.1. Samples: 1489825600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 00:44:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:44:09,411][71000] Updated weights for policy 0, policy_version 119704 (0.0033) [2024-06-13 00:44:10,940][70768] Fps is (10 sec: 54067.7, 60 sec: 49698.1, 300 sec: 49707.4). Total num frames: 1961328640. Throughput: 0: 49567.6. Samples: 1490135000. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 00:44:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:44:12,269][71000] Updated weights for policy 0, policy_version 119714 (0.0024) [2024-06-13 00:44:15,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1961541632. Throughput: 0: 49524.5. Samples: 1490427000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 00:44:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:44:16,061][71000] Updated weights for policy 0, policy_version 119724 (0.0033) [2024-06-13 00:44:18,830][71000] Updated weights for policy 0, policy_version 119734 (0.0024) [2024-06-13 00:44:20,939][70768] Fps is (10 sec: 45876.1, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1961787392. Throughput: 0: 49562.8. Samples: 1490575040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 00:44:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 00:44:22,285][71000] Updated weights for policy 0, policy_version 119744 (0.0021) [2024-06-13 00:44:25,773][71000] Updated weights for policy 0, policy_version 119754 (0.0029) [2024-06-13 00:44:25,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.2, 300 sec: 49707.4). Total num frames: 1962049536. Throughput: 0: 49679.8. Samples: 1490869260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 00:44:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 00:44:28,986][71000] Updated weights for policy 0, policy_version 119764 (0.0031) [2024-06-13 00:44:30,940][70768] Fps is (10 sec: 52427.4, 60 sec: 49698.0, 300 sec: 49651.8). Total num frames: 1962311680. Throughput: 0: 49396.3. Samples: 1491169320. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 00:44:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:44:32,135][71000] Updated weights for policy 0, policy_version 119774 (0.0027) [2024-06-13 00:44:35,627][71000] Updated weights for policy 0, policy_version 119784 (0.0031) [2024-06-13 00:44:35,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1962557440. Throughput: 0: 49526.3. Samples: 1491318960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 00:44:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:44:36,297][70980] Signal inference workers to stop experience collection... (21900 times) [2024-06-13 00:44:36,341][71000] InferenceWorker_p0-w0: stopping experience collection (21900 times) [2024-06-13 00:44:36,352][70980] Signal inference workers to resume experience collection... (21900 times) [2024-06-13 00:44:36,353][71000] InferenceWorker_p0-w0: resuming experience collection (21900 times) [2024-06-13 00:44:38,556][71000] Updated weights for policy 0, policy_version 119794 (0.0021) [2024-06-13 00:44:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1962786816. Throughput: 0: 49658.5. Samples: 1491617700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 00:44:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:44:42,135][71000] Updated weights for policy 0, policy_version 119804 (0.0037) [2024-06-13 00:44:45,376][71000] Updated weights for policy 0, policy_version 119814 (0.0025) [2024-06-13 00:44:45,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49651.8). Total num frames: 1963048960. Throughput: 0: 49672.6. Samples: 1491910600. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 00:44:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:44:48,990][71000] Updated weights for policy 0, policy_version 119824 (0.0035) [2024-06-13 00:44:50,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 1963294720. Throughput: 0: 49674.2. Samples: 1492060940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 00:44:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:44:52,179][71000] Updated weights for policy 0, policy_version 119834 (0.0031) [2024-06-13 00:44:55,723][71000] Updated weights for policy 0, policy_version 119844 (0.0030) [2024-06-13 00:44:55,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1963524096. Throughput: 0: 49514.3. Samples: 1492363140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:44:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:44:58,447][71000] Updated weights for policy 0, policy_version 119854 (0.0029) [2024-06-13 00:45:00,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1963786240. Throughput: 0: 49901.3. Samples: 1492672560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:45:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:45:01,921][71000] Updated weights for policy 0, policy_version 119864 (0.0031) [2024-06-13 00:45:04,873][71000] Updated weights for policy 0, policy_version 119874 (0.0026) [2024-06-13 00:45:05,940][70768] Fps is (10 sec: 54066.2, 60 sec: 49971.2, 300 sec: 49762.9). Total num frames: 1964064768. Throughput: 0: 49954.8. Samples: 1492823020. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:45:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:45:08,476][71000] Updated weights for policy 0, policy_version 119884 (0.0031) [2024-06-13 00:45:10,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49425.0, 300 sec: 49707.4). Total num frames: 1964294144. Throughput: 0: 50073.9. Samples: 1493122600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:45:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:45:11,728][71000] Updated weights for policy 0, policy_version 119894 (0.0033) [2024-06-13 00:45:15,390][71000] Updated weights for policy 0, policy_version 119904 (0.0033) [2024-06-13 00:45:15,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49698.1, 300 sec: 49596.7). Total num frames: 1964523520. Throughput: 0: 50002.8. Samples: 1493419440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:45:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:45:18,298][71000] Updated weights for policy 0, policy_version 119914 (0.0037) [2024-06-13 00:45:20,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49971.0, 300 sec: 49651.9). Total num frames: 1964785664. Throughput: 0: 49816.8. Samples: 1493560720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:45:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:45:21,877][71000] Updated weights for policy 0, policy_version 119924 (0.0033) [2024-06-13 00:45:24,898][71000] Updated weights for policy 0, policy_version 119934 (0.0030) [2024-06-13 00:45:25,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 1965047808. Throughput: 0: 49791.7. Samples: 1493858320. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:45:25,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 00:45:28,277][71000] Updated weights for policy 0, policy_version 119944 (0.0021) [2024-06-13 00:45:30,939][70768] Fps is (10 sec: 50791.5, 60 sec: 49698.4, 300 sec: 49651.9). Total num frames: 1965293568. Throughput: 0: 49810.8. Samples: 1494152080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:45:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:45:31,547][71000] Updated weights for policy 0, policy_version 119954 (0.0030) [2024-06-13 00:45:35,175][71000] Updated weights for policy 0, policy_version 119964 (0.0039) [2024-06-13 00:45:35,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1965522944. Throughput: 0: 49865.6. Samples: 1494304880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:45:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:45:37,986][70980] Signal inference workers to stop experience collection... (21950 times) [2024-06-13 00:45:38,031][71000] InferenceWorker_p0-w0: stopping experience collection (21950 times) [2024-06-13 00:45:38,095][70980] Signal inference workers to resume experience collection... (21950 times) [2024-06-13 00:45:38,095][71000] InferenceWorker_p0-w0: resuming experience collection (21950 times) [2024-06-13 00:45:38,233][71000] Updated weights for policy 0, policy_version 119974 (0.0032) [2024-06-13 00:45:40,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 1965768704. Throughput: 0: 49503.1. Samples: 1494590780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:45:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:45:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000119982_1965785088.pth... 
[2024-06-13 00:45:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000119256_1953890304.pth [2024-06-13 00:45:41,859][71000] Updated weights for policy 0, policy_version 119984 (0.0030) [2024-06-13 00:45:44,824][71000] Updated weights for policy 0, policy_version 119994 (0.0027) [2024-06-13 00:45:45,942][70768] Fps is (10 sec: 50779.5, 60 sec: 49696.4, 300 sec: 49596.0). Total num frames: 1966030848. Throughput: 0: 49266.6. Samples: 1494889660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:45:45,942][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:45:48,086][71000] Updated weights for policy 0, policy_version 120004 (0.0025) [2024-06-13 00:45:50,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1966276608. Throughput: 0: 49383.2. Samples: 1495045260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 00:45:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:45:51,607][71000] Updated weights for policy 0, policy_version 120014 (0.0025) [2024-06-13 00:45:54,969][71000] Updated weights for policy 0, policy_version 120024 (0.0034) [2024-06-13 00:45:55,939][70768] Fps is (10 sec: 49162.5, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1966522368. Throughput: 0: 49512.7. Samples: 1495350660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 00:45:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:45:58,042][71000] Updated weights for policy 0, policy_version 120034 (0.0022) [2024-06-13 00:46:00,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1966768128. Throughput: 0: 49276.9. Samples: 1495636900. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 00:46:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:46:01,454][71000] Updated weights for policy 0, policy_version 120044 (0.0038) [2024-06-13 00:46:04,787][71000] Updated weights for policy 0, policy_version 120054 (0.0028) [2024-06-13 00:46:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.1, 300 sec: 49540.8). Total num frames: 1967013888. Throughput: 0: 49364.5. Samples: 1495782120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 00:46:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:46:08,137][71000] Updated weights for policy 0, policy_version 120064 (0.0033) [2024-06-13 00:46:10,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.3, 300 sec: 49597.0). Total num frames: 1967259648. Throughput: 0: 49408.1. Samples: 1496081680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 00:46:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:46:11,210][71000] Updated weights for policy 0, policy_version 120074 (0.0027) [2024-06-13 00:46:14,975][71000] Updated weights for policy 0, policy_version 120084 (0.0025) [2024-06-13 00:46:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1967505408. Throughput: 0: 49638.1. Samples: 1496385800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 00:46:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:46:17,869][71000] Updated weights for policy 0, policy_version 120094 (0.0028) [2024-06-13 00:46:20,940][70768] Fps is (10 sec: 49150.6, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1967751168. Throughput: 0: 49351.3. Samples: 1496525700. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 00:46:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:46:21,756][71000] Updated weights for policy 0, policy_version 120104 (0.0025) [2024-06-13 00:46:24,559][71000] Updated weights for policy 0, policy_version 120114 (0.0026) [2024-06-13 00:46:25,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1967996928. Throughput: 0: 49571.6. Samples: 1496821500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 00:46:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:46:28,268][71000] Updated weights for policy 0, policy_version 120124 (0.0038) [2024-06-13 00:46:30,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49424.9, 300 sec: 49596.3). Total num frames: 1968259072. Throughput: 0: 49625.4. Samples: 1497122700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 00:46:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:46:31,129][71000] Updated weights for policy 0, policy_version 120134 (0.0031) [2024-06-13 00:46:34,604][71000] Updated weights for policy 0, policy_version 120144 (0.0037) [2024-06-13 00:46:35,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1968504832. Throughput: 0: 49439.7. Samples: 1497270040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 00:46:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:46:37,463][71000] Updated weights for policy 0, policy_version 120154 (0.0033) [2024-06-13 00:46:40,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.0, 300 sec: 49540.7). Total num frames: 1968734208. Throughput: 0: 49270.5. Samples: 1497567840. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 00:46:40,949][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:46:41,402][71000] Updated weights for policy 0, policy_version 120164 (0.0029) [2024-06-13 00:46:44,264][71000] Updated weights for policy 0, policy_version 120174 (0.0032) [2024-06-13 00:46:45,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49153.8, 300 sec: 49485.2). Total num frames: 1968979968. Throughput: 0: 49497.8. Samples: 1497864300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 00:46:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:46:46,066][70980] Signal inference workers to stop experience collection... (22000 times) [2024-06-13 00:46:46,066][70980] Signal inference workers to resume experience collection... (22000 times) [2024-06-13 00:46:46,084][71000] InferenceWorker_p0-w0: stopping experience collection (22000 times) [2024-06-13 00:46:46,084][71000] InferenceWorker_p0-w0: resuming experience collection (22000 times) [2024-06-13 00:46:48,073][71000] Updated weights for policy 0, policy_version 120184 (0.0033) [2024-06-13 00:46:50,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1969242112. Throughput: 0: 49493.3. Samples: 1498009320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 00:46:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:46:51,137][71000] Updated weights for policy 0, policy_version 120194 (0.0030) [2024-06-13 00:46:55,070][71000] Updated weights for policy 0, policy_version 120204 (0.0031) [2024-06-13 00:46:55,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 1969471488. Throughput: 0: 49282.1. Samples: 1498299380. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:46:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:46:57,574][71000] Updated weights for policy 0, policy_version 120214 (0.0028) [2024-06-13 00:47:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1969733632. Throughput: 0: 49185.3. Samples: 1498599140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:47:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:47:01,519][71000] Updated weights for policy 0, policy_version 120224 (0.0031) [2024-06-13 00:47:04,004][71000] Updated weights for policy 0, policy_version 120234 (0.0026) [2024-06-13 00:47:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1969963008. Throughput: 0: 49365.6. Samples: 1498747140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:47:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:47:07,767][71000] Updated weights for policy 0, policy_version 120244 (0.0027) [2024-06-13 00:47:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 1970208768. Throughput: 0: 49505.3. Samples: 1499049240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:47:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:47:11,155][71000] Updated weights for policy 0, policy_version 120254 (0.0031) [2024-06-13 00:47:14,834][71000] Updated weights for policy 0, policy_version 120264 (0.0036) [2024-06-13 00:47:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1970454528. Throughput: 0: 49266.7. Samples: 1499339700. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:47:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:47:17,480][71000] Updated weights for policy 0, policy_version 120274 (0.0027) [2024-06-13 00:47:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.2, 300 sec: 49485.2). Total num frames: 1970700288. Throughput: 0: 49411.1. Samples: 1499493540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:47:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:47:21,372][71000] Updated weights for policy 0, policy_version 120284 (0.0026) [2024-06-13 00:47:23,938][71000] Updated weights for policy 0, policy_version 120294 (0.0027) [2024-06-13 00:47:25,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1970962432. Throughput: 0: 49222.8. Samples: 1499782860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:47:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:47:27,748][71000] Updated weights for policy 0, policy_version 120304 (0.0028) [2024-06-13 00:47:30,871][71000] Updated weights for policy 0, policy_version 120314 (0.0028) [2024-06-13 00:47:30,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1971224576. Throughput: 0: 49220.8. Samples: 1500079240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:47:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:47:34,234][71000] Updated weights for policy 0, policy_version 120324 (0.0036) [2024-06-13 00:47:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 1971453952. Throughput: 0: 49321.3. Samples: 1500228780. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:47:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:47:37,586][71000] Updated weights for policy 0, policy_version 120334 (0.0027) [2024-06-13 00:47:40,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1971699712. Throughput: 0: 49423.0. Samples: 1500523420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:47:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:47:41,005][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000120344_1971716096.pth... [2024-06-13 00:47:41,007][71000] Updated weights for policy 0, policy_version 120344 (0.0026) [2024-06-13 00:47:41,051][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000119618_1959821312.pth [2024-06-13 00:47:41,752][70980] Signal inference workers to stop experience collection... (22050 times) [2024-06-13 00:47:41,752][70980] Signal inference workers to resume experience collection... (22050 times) [2024-06-13 00:47:41,791][71000] InferenceWorker_p0-w0: stopping experience collection (22050 times) [2024-06-13 00:47:41,791][71000] InferenceWorker_p0-w0: resuming experience collection (22050 times) [2024-06-13 00:47:43,853][71000] Updated weights for policy 0, policy_version 120354 (0.0033) [2024-06-13 00:47:45,939][70768] Fps is (10 sec: 52429.5, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 1971978240. Throughput: 0: 49560.1. Samples: 1500829340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:47:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:47:47,689][71000] Updated weights for policy 0, policy_version 120364 (0.0026) [2024-06-13 00:47:50,442][71000] Updated weights for policy 0, policy_version 120374 (0.0031) [2024-06-13 00:47:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1972207616. Throughput: 0: 49786.1. Samples: 1500987520. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:47:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:47:54,114][71000] Updated weights for policy 0, policy_version 120384 (0.0028) [2024-06-13 00:47:55,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1972453376. Throughput: 0: 49535.1. Samples: 1501278320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 00:47:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:47:57,161][71000] Updated weights for policy 0, policy_version 120394 (0.0024) [2024-06-13 00:48:00,790][71000] Updated weights for policy 0, policy_version 120404 (0.0023) [2024-06-13 00:48:00,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1972699136. Throughput: 0: 49914.7. Samples: 1501585860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 00:48:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:48:03,616][71000] Updated weights for policy 0, policy_version 120414 (0.0022) [2024-06-13 00:48:05,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49971.3, 300 sec: 49540.8). Total num frames: 1972961280. Throughput: 0: 49761.8. Samples: 1501732820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 00:48:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:48:07,114][71000] Updated weights for policy 0, policy_version 120424 (0.0025) [2024-06-13 00:48:10,070][71000] Updated weights for policy 0, policy_version 120434 (0.0026) [2024-06-13 00:48:10,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1973207040. Throughput: 0: 50080.5. Samples: 1502036480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 00:48:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:48:14,060][71000] Updated weights for policy 0, policy_version 120444 (0.0025) [2024-06-13 00:48:15,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1973452800. Throughput: 0: 50072.1. Samples: 1502332480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 00:48:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:48:16,782][71000] Updated weights for policy 0, policy_version 120454 (0.0033) [2024-06-13 00:48:20,572][71000] Updated weights for policy 0, policy_version 120464 (0.0032) [2024-06-13 00:48:20,940][70768] Fps is (10 sec: 47512.1, 60 sec: 49697.9, 300 sec: 49429.7). Total num frames: 1973682176. Throughput: 0: 49844.2. Samples: 1502471780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 00:48:20,941][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 00:48:23,766][71000] Updated weights for policy 0, policy_version 120474 (0.0030) [2024-06-13 00:48:25,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49971.1, 300 sec: 49596.3). Total num frames: 1973960704. Throughput: 0: 49947.6. Samples: 1502771060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 00:48:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:48:27,190][71000] Updated weights for policy 0, policy_version 120484 (0.0031) [2024-06-13 00:48:30,023][71000] Updated weights for policy 0, policy_version 120494 (0.0027) [2024-06-13 00:48:30,940][70768] Fps is (10 sec: 54068.3, 60 sec: 49971.2, 300 sec: 49651.8). Total num frames: 1974222848. Throughput: 0: 49803.9. Samples: 1503070520. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 00:48:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:48:33,719][71000] Updated weights for policy 0, policy_version 120504 (0.0034) [2024-06-13 00:48:35,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1974435840. Throughput: 0: 49724.1. Samples: 1503225100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 00:48:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:48:36,698][71000] Updated weights for policy 0, policy_version 120514 (0.0029) [2024-06-13 00:48:40,180][71000] Updated weights for policy 0, policy_version 120524 (0.0027) [2024-06-13 00:48:40,939][70768] Fps is (10 sec: 44237.3, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 1974665216. Throughput: 0: 49697.4. Samples: 1503514700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 00:48:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:48:43,310][71000] Updated weights for policy 0, policy_version 120534 (0.0028) [2024-06-13 00:48:45,940][70768] Fps is (10 sec: 50789.0, 60 sec: 49424.8, 300 sec: 49596.3). Total num frames: 1974943744. Throughput: 0: 49464.2. Samples: 1503811760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 00:48:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:48:47,095][71000] Updated weights for policy 0, policy_version 120544 (0.0028) [2024-06-13 00:48:49,975][71000] Updated weights for policy 0, policy_version 120554 (0.0028) [2024-06-13 00:48:50,940][70768] Fps is (10 sec: 55705.1, 60 sec: 50244.3, 300 sec: 49762.9). Total num frames: 1975222272. Throughput: 0: 49744.4. Samples: 1503971320. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 00:48:50,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 00:48:53,570][71000] Updated weights for policy 0, policy_version 120564 (0.0027) [2024-06-13 00:48:54,693][70980] Signal inference workers to stop experience collection... (22100 times) [2024-06-13 00:48:54,745][70980] Signal inference workers to resume experience collection... (22100 times) [2024-06-13 00:48:54,746][71000] InferenceWorker_p0-w0: stopping experience collection (22100 times) [2024-06-13 00:48:54,755][71000] InferenceWorker_p0-w0: resuming experience collection (22100 times) [2024-06-13 00:48:55,940][70768] Fps is (10 sec: 47514.8, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1975418880. Throughput: 0: 49634.2. Samples: 1504270020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 00:48:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:48:56,607][71000] Updated weights for policy 0, policy_version 120574 (0.0029) [2024-06-13 00:49:00,185][71000] Updated weights for policy 0, policy_version 120584 (0.0040) [2024-06-13 00:49:00,940][70768] Fps is (10 sec: 42597.9, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1975648256. Throughput: 0: 49318.4. Samples: 1504551820. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-13 00:49:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:49:03,461][71000] Updated weights for policy 0, policy_version 120594 (0.0025) [2024-06-13 00:49:05,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1975926784. Throughput: 0: 49333.1. Samples: 1504691760. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-13 00:49:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:49:06,891][71000] Updated weights for policy 0, policy_version 120604 (0.0023) [2024-06-13 00:49:10,078][71000] Updated weights for policy 0, policy_version 120614 (0.0023) [2024-06-13 00:49:10,939][70768] Fps is (10 sec: 55706.8, 60 sec: 49971.2, 300 sec: 49707.4). Total num frames: 1976205312. Throughput: 0: 49548.2. Samples: 1505000720. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-13 00:49:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:49:13,417][71000] Updated weights for policy 0, policy_version 120624 (0.0027) [2024-06-13 00:49:15,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1976418304. Throughput: 0: 49599.2. Samples: 1505302480. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-13 00:49:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:49:16,590][71000] Updated weights for policy 0, policy_version 120634 (0.0026) [2024-06-13 00:49:19,881][71000] Updated weights for policy 0, policy_version 120644 (0.0023) [2024-06-13 00:49:20,939][70768] Fps is (10 sec: 44236.7, 60 sec: 49425.3, 300 sec: 49485.2). Total num frames: 1976647680. Throughput: 0: 49292.4. Samples: 1505443260. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-13 00:49:20,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-13 00:49:23,330][71000] Updated weights for policy 0, policy_version 120654 (0.0034) [2024-06-13 00:49:25,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1976926208. Throughput: 0: 49243.5. Samples: 1505730660. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-13 00:49:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 00:49:26,687][71000] Updated weights for policy 0, policy_version 120664 (0.0022) [2024-06-13 00:49:29,787][71000] Updated weights for policy 0, policy_version 120674 (0.0029) [2024-06-13 00:49:30,940][70768] Fps is (10 sec: 54066.3, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1977188352. Throughput: 0: 49353.0. Samples: 1506032640. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-13 00:49:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:49:33,272][71000] Updated weights for policy 0, policy_version 120684 (0.0026) [2024-06-13 00:49:35,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1977401344. Throughput: 0: 49148.6. Samples: 1506183000. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-13 00:49:35,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 00:49:36,443][71000] Updated weights for policy 0, policy_version 120694 (0.0024) [2024-06-13 00:49:39,945][71000] Updated weights for policy 0, policy_version 120704 (0.0027) [2024-06-13 00:49:40,940][70768] Fps is (10 sec: 44237.1, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 1977630720. Throughput: 0: 49150.1. Samples: 1506481780. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-13 00:49:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:49:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000120705_1977630720.pth... [2024-06-13 00:49:41,000][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000119982_1965785088.pth [2024-06-13 00:49:43,089][71000] Updated weights for policy 0, policy_version 120714 (0.0025) [2024-06-13 00:49:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.3, 300 sec: 49540.8). Total num frames: 1977909248. Throughput: 0: 49349.5. Samples: 1506772540. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-13 00:49:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:49:46,856][71000] Updated weights for policy 0, policy_version 120724 (0.0033) [2024-06-13 00:49:49,846][71000] Updated weights for policy 0, policy_version 120734 (0.0028) [2024-06-13 00:49:50,940][70768] Fps is (10 sec: 55705.8, 60 sec: 49425.1, 300 sec: 49707.4). Total num frames: 1978187776. Throughput: 0: 49767.2. Samples: 1506931280. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-13 00:49:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-13 00:49:53,129][71000] Updated weights for policy 0, policy_version 120744 (0.0033) [2024-06-13 00:49:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1978400768. Throughput: 0: 49598.5. Samples: 1507232660. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-13 00:49:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:49:56,224][70980] Signal inference workers to stop experience collection... (22150 times) [2024-06-13 00:49:56,227][70980] Signal inference workers to resume experience collection... (22150 times) [2024-06-13 00:49:56,231][71000] Updated weights for policy 0, policy_version 120754 (0.0032) [2024-06-13 00:49:56,243][71000] InferenceWorker_p0-w0: stopping experience collection (22150 times) [2024-06-13 00:49:56,243][71000] InferenceWorker_p0-w0: resuming experience collection (22150 times) [2024-06-13 00:49:59,239][71000] Updated weights for policy 0, policy_version 120764 (0.0031) [2024-06-13 00:50:00,940][70768] Fps is (10 sec: 45874.4, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 1978646528. Throughput: 0: 49727.8. Samples: 1507540240. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:50:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:50:02,722][71000] Updated weights for policy 0, policy_version 120774 (0.0024) [2024-06-13 00:50:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1978908672. Throughput: 0: 49659.8. Samples: 1507677960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:50:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:50:06,536][71000] Updated weights for policy 0, policy_version 120784 (0.0029) [2024-06-13 00:50:09,414][71000] Updated weights for policy 0, policy_version 120794 (0.0026) [2024-06-13 00:50:10,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49424.9, 300 sec: 49651.8). Total num frames: 1979170816. Throughput: 0: 49756.7. Samples: 1507969720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:50:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 00:50:12,929][71000] Updated weights for policy 0, policy_version 120804 (0.0023) [2024-06-13 00:50:15,919][71000] Updated weights for policy 0, policy_version 120814 (0.0027) [2024-06-13 00:50:15,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1979416576. Throughput: 0: 49946.9. Samples: 1508280240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:50:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:50:19,138][71000] Updated weights for policy 0, policy_version 120824 (0.0030) [2024-06-13 00:50:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 50244.1, 300 sec: 49540.7). Total num frames: 1979662336. Throughput: 0: 49778.4. Samples: 1508423040. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:50:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:50:22,414][71000] Updated weights for policy 0, policy_version 120834 (0.0031) [2024-06-13 00:50:25,940][70768] Fps is (10 sec: 47512.5, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 1979891712. Throughput: 0: 49797.2. Samples: 1508722660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:50:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:50:26,110][71000] Updated weights for policy 0, policy_version 120844 (0.0027) [2024-06-13 00:50:29,047][71000] Updated weights for policy 0, policy_version 120854 (0.0017) [2024-06-13 00:50:30,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1980153856. Throughput: 0: 49700.5. Samples: 1509009060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:50:30,940][70768] Avg episode reward: [(0, '0.262')] [2024-06-13 00:50:32,690][71000] Updated weights for policy 0, policy_version 120864 (0.0026) [2024-06-13 00:50:35,732][71000] Updated weights for policy 0, policy_version 120874 (0.0028) [2024-06-13 00:50:35,940][70768] Fps is (10 sec: 52429.7, 60 sec: 50244.2, 300 sec: 49651.9). Total num frames: 1980416000. Throughput: 0: 49919.1. Samples: 1509177640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:50:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:50:39,134][71000] Updated weights for policy 0, policy_version 120884 (0.0021) [2024-06-13 00:50:40,940][70768] Fps is (10 sec: 49149.1, 60 sec: 50243.9, 300 sec: 49541.0). Total num frames: 1980645376. Throughput: 0: 49727.4. Samples: 1509470420. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:50:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:50:42,327][71000] Updated weights for policy 0, policy_version 120894 (0.0026) [2024-06-13 00:50:45,486][71000] Updated weights for policy 0, policy_version 120904 (0.0025) [2024-06-13 00:50:45,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1980891136. Throughput: 0: 49444.6. Samples: 1509765240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:50:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:50:49,100][71000] Updated weights for policy 0, policy_version 120914 (0.0029) [2024-06-13 00:50:50,940][70768] Fps is (10 sec: 50792.7, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1981153280. Throughput: 0: 49734.7. Samples: 1509916020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:50:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:50:52,550][71000] Updated weights for policy 0, policy_version 120924 (0.0026) [2024-06-13 00:50:53,568][70980] Signal inference workers to stop experience collection... (22200 times) [2024-06-13 00:50:53,569][70980] Signal inference workers to resume experience collection... (22200 times) [2024-06-13 00:50:53,583][71000] InferenceWorker_p0-w0: stopping experience collection (22200 times) [2024-06-13 00:50:53,583][71000] InferenceWorker_p0-w0: resuming experience collection (22200 times) [2024-06-13 00:50:55,782][71000] Updated weights for policy 0, policy_version 120934 (0.0027) [2024-06-13 00:50:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 1981399040. Throughput: 0: 49669.5. Samples: 1510204840. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:50:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:50:58,995][71000] Updated weights for policy 0, policy_version 120944 (0.0026) [2024-06-13 00:51:00,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1981628416. Throughput: 0: 49385.2. Samples: 1510502580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:51:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:51:02,347][71000] Updated weights for policy 0, policy_version 120954 (0.0033) [2024-06-13 00:51:05,553][71000] Updated weights for policy 0, policy_version 120964 (0.0021) [2024-06-13 00:51:05,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1981874176. Throughput: 0: 49523.3. Samples: 1510651580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:51:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:51:08,983][71000] Updated weights for policy 0, policy_version 120974 (0.0039) [2024-06-13 00:51:10,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1982136320. Throughput: 0: 49492.4. Samples: 1510949820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:51:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:51:12,521][71000] Updated weights for policy 0, policy_version 120984 (0.0026) [2024-06-13 00:51:15,689][71000] Updated weights for policy 0, policy_version 120994 (0.0041) [2024-06-13 00:51:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49596.4). Total num frames: 1982382080. Throughput: 0: 49606.2. Samples: 1511241340. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:51:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:51:19,074][71000] Updated weights for policy 0, policy_version 121004 (0.0028) [2024-06-13 00:51:20,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1982627840. Throughput: 0: 49351.1. Samples: 1511398440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:51:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:51:22,106][71000] Updated weights for policy 0, policy_version 121014 (0.0028) [2024-06-13 00:51:25,470][71000] Updated weights for policy 0, policy_version 121024 (0.0027) [2024-06-13 00:51:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 1982857216. Throughput: 0: 49411.7. Samples: 1511693920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:51:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 00:51:28,941][71000] Updated weights for policy 0, policy_version 121034 (0.0027) [2024-06-13 00:51:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1983119360. Throughput: 0: 49523.0. Samples: 1511993780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:51:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:51:32,003][71000] Updated weights for policy 0, policy_version 121044 (0.0023) [2024-06-13 00:51:35,750][71000] Updated weights for policy 0, policy_version 121054 (0.0029) [2024-06-13 00:51:35,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.8, 300 sec: 49540.8). Total num frames: 1983348736. Throughput: 0: 49191.1. Samples: 1512129620. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:51:35,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:51:38,908][71000] Updated weights for policy 0, policy_version 121064 (0.0039) [2024-06-13 00:51:40,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.4, 300 sec: 49540.8). Total num frames: 1983594496. Throughput: 0: 49336.0. Samples: 1512424960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:51:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:51:41,056][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000121070_1983610880.pth... [2024-06-13 00:51:41,103][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000120344_1971716096.pth [2024-06-13 00:51:42,532][71000] Updated weights for policy 0, policy_version 121074 (0.0031) [2024-06-13 00:51:45,642][71000] Updated weights for policy 0, policy_version 121084 (0.0027) [2024-06-13 00:51:45,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1983856640. Throughput: 0: 49226.3. Samples: 1512717760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:51:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:51:49,161][71000] Updated weights for policy 0, policy_version 121094 (0.0031) [2024-06-13 00:51:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.1, 300 sec: 49596.3). Total num frames: 1984102400. Throughput: 0: 49418.2. Samples: 1512875400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:51:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:51:52,182][71000] Updated weights for policy 0, policy_version 121104 (0.0030) [2024-06-13 00:51:55,597][71000] Updated weights for policy 0, policy_version 121114 (0.0028) [2024-06-13 00:51:55,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 49485.2). Total num frames: 1984331776. Throughput: 0: 49332.9. Samples: 1513169800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:51:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:51:58,915][71000] Updated weights for policy 0, policy_version 121124 (0.0026) [2024-06-13 00:52:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1984593920. Throughput: 0: 49306.2. Samples: 1513460120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 00:52:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:52:02,284][71000] Updated weights for policy 0, policy_version 121134 (0.0026) [2024-06-13 00:52:05,672][71000] Updated weights for policy 0, policy_version 121144 (0.0032) [2024-06-13 00:52:05,809][70980] Signal inference workers to stop experience collection... (22250 times) [2024-06-13 00:52:05,812][70980] Signal inference workers to resume experience collection... (22250 times) [2024-06-13 00:52:05,835][71000] InferenceWorker_p0-w0: stopping experience collection (22250 times) [2024-06-13 00:52:05,835][71000] InferenceWorker_p0-w0: resuming experience collection (22250 times) [2024-06-13 00:52:05,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 1984839680. Throughput: 0: 49161.4. Samples: 1513610700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 00:52:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:52:08,562][71000] Updated weights for policy 0, policy_version 121154 (0.0028) [2024-06-13 00:52:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49596.3). Total num frames: 1985085440. Throughput: 0: 49377.3. Samples: 1513915900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 00:52:10,943][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:52:12,000][71000] Updated weights for policy 0, policy_version 121164 (0.0025) [2024-06-13 00:52:15,623][71000] Updated weights for policy 0, policy_version 121174 (0.0033) [2024-06-13 00:52:15,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49151.9, 300 sec: 49596.3). Total num frames: 1985331200. Throughput: 0: 49224.4. Samples: 1514208880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 00:52:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:52:19,003][71000] Updated weights for policy 0, policy_version 121184 (0.0033) [2024-06-13 00:52:20,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49151.8, 300 sec: 49540.7). Total num frames: 1985576960. Throughput: 0: 49481.7. Samples: 1514356300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 00:52:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:52:22,011][71000] Updated weights for policy 0, policy_version 121194 (0.0031) [2024-06-13 00:52:25,559][71000] Updated weights for policy 0, policy_version 121204 (0.0034) [2024-06-13 00:52:25,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1985822720. Throughput: 0: 49446.7. Samples: 1514650060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 00:52:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:52:28,625][71000] Updated weights for policy 0, policy_version 121214 (0.0025) [2024-06-13 00:52:30,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.1, 300 sec: 49540.8). Total num frames: 1986068480. Throughput: 0: 49513.4. Samples: 1514945860. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 00:52:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:52:32,017][71000] Updated weights for policy 0, policy_version 121224 (0.0025) [2024-06-13 00:52:35,249][71000] Updated weights for policy 0, policy_version 121234 (0.0028) [2024-06-13 00:52:35,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 1986330624. Throughput: 0: 49236.8. Samples: 1515091060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 00:52:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:52:38,469][71000] Updated weights for policy 0, policy_version 121244 (0.0031) [2024-06-13 00:52:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 1986560000. Throughput: 0: 49324.6. Samples: 1515389400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 00:52:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:52:41,707][71000] Updated weights for policy 0, policy_version 121254 (0.0035) [2024-06-13 00:52:45,194][71000] Updated weights for policy 0, policy_version 121264 (0.0034) [2024-06-13 00:52:45,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1986822144. Throughput: 0: 49464.5. Samples: 1515686020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 00:52:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 00:52:48,508][71000] Updated weights for policy 0, policy_version 121274 (0.0028) [2024-06-13 00:52:50,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1987067904. Throughput: 0: 49590.2. Samples: 1515842260. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 00:52:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:52:51,665][71000] Updated weights for policy 0, policy_version 121284 (0.0038) [2024-06-13 00:52:54,778][71000] Updated weights for policy 0, policy_version 121294 (0.0031) [2024-06-13 00:52:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1987313664. Throughput: 0: 49334.2. Samples: 1516135940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 00:52:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:52:58,114][71000] Updated weights for policy 0, policy_version 121304 (0.0021) [2024-06-13 00:53:00,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1987575808. Throughput: 0: 49499.2. Samples: 1516436340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 00:53:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:53:01,491][71000] Updated weights for policy 0, policy_version 121314 (0.0033) [2024-06-13 00:53:04,741][71000] Updated weights for policy 0, policy_version 121324 (0.0032) [2024-06-13 00:53:05,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1987805184. Throughput: 0: 49741.1. Samples: 1516594640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 00:53:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:53:07,867][71000] Updated weights for policy 0, policy_version 121334 (0.0031) [2024-06-13 00:53:10,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1988050944. Throughput: 0: 49590.2. Samples: 1516881620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 00:53:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:53:10,966][70980] Signal inference workers to stop experience collection... 
(22300 times) [2024-06-13 00:53:10,997][71000] InferenceWorker_p0-w0: stopping experience collection (22300 times) [2024-06-13 00:53:11,016][70980] Signal inference workers to resume experience collection... (22300 times) [2024-06-13 00:53:11,018][71000] InferenceWorker_p0-w0: resuming experience collection (22300 times) [2024-06-13 00:53:11,481][71000] Updated weights for policy 0, policy_version 121344 (0.0024) [2024-06-13 00:53:14,983][71000] Updated weights for policy 0, policy_version 121354 (0.0029) [2024-06-13 00:53:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 1988296704. Throughput: 0: 49662.3. Samples: 1517180660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 00:53:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:53:18,117][71000] Updated weights for policy 0, policy_version 121364 (0.0028) [2024-06-13 00:53:20,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49971.3, 300 sec: 49540.8). Total num frames: 1988575232. Throughput: 0: 49636.9. Samples: 1517324720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 00:53:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 00:53:21,276][71000] Updated weights for policy 0, policy_version 121374 (0.0026) [2024-06-13 00:53:24,761][71000] Updated weights for policy 0, policy_version 121384 (0.0025) [2024-06-13 00:53:25,943][70768] Fps is (10 sec: 50770.8, 60 sec: 49695.0, 300 sec: 49429.1). Total num frames: 1988804608. Throughput: 0: 49791.7. Samples: 1517630220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 00:53:25,944][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:53:27,671][71000] Updated weights for policy 0, policy_version 121394 (0.0035) [2024-06-13 00:53:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.0, 300 sec: 49540.7). Total num frames: 1989050368. Throughput: 0: 49750.9. Samples: 1517924820. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 00:53:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 00:53:31,296][71000] Updated weights for policy 0, policy_version 121404 (0.0027) [2024-06-13 00:53:34,350][71000] Updated weights for policy 0, policy_version 121414 (0.0034) [2024-06-13 00:53:35,940][70768] Fps is (10 sec: 47531.8, 60 sec: 49152.1, 300 sec: 49540.8). Total num frames: 1989279744. Throughput: 0: 49373.3. Samples: 1518064060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 00:53:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:53:37,817][71000] Updated weights for policy 0, policy_version 121424 (0.0035) [2024-06-13 00:53:40,782][71000] Updated weights for policy 0, policy_version 121434 (0.0024) [2024-06-13 00:53:40,940][70768] Fps is (10 sec: 52429.2, 60 sec: 50244.2, 300 sec: 49596.3). Total num frames: 1989574656. Throughput: 0: 49710.2. Samples: 1518372900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 00:53:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:53:40,954][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000121434_1989574656.pth... [2024-06-13 00:53:41,010][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000120705_1977630720.pth [2024-06-13 00:53:44,410][71000] Updated weights for policy 0, policy_version 121444 (0.0022) [2024-06-13 00:53:45,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 1989804032. Throughput: 0: 49737.0. Samples: 1518674500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 00:53:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:53:47,454][71000] Updated weights for policy 0, policy_version 121454 (0.0024) [2024-06-13 00:53:50,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1990033408. Throughput: 0: 49517.2. Samples: 1518822920. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 00:53:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:53:51,194][71000] Updated weights for policy 0, policy_version 121464 (0.0023) [2024-06-13 00:53:54,229][71000] Updated weights for policy 0, policy_version 121474 (0.0042) [2024-06-13 00:53:55,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 1990279168. Throughput: 0: 49523.9. Samples: 1519110200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 00:53:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:53:57,706][71000] Updated weights for policy 0, policy_version 121484 (0.0032) [2024-06-13 00:54:00,803][71000] Updated weights for policy 0, policy_version 121494 (0.0023) [2024-06-13 00:54:00,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 1990557696. Throughput: 0: 49564.7. Samples: 1519411080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 00:54:00,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 00:54:04,143][71000] Updated weights for policy 0, policy_version 121504 (0.0021) [2024-06-13 00:54:05,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 1990803456. Throughput: 0: 49951.3. Samples: 1519572520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 00:54:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:54:06,147][70980] Signal inference workers to stop experience collection... (22350 times) [2024-06-13 00:54:06,185][71000] InferenceWorker_p0-w0: stopping experience collection (22350 times) [2024-06-13 00:54:06,195][70980] Signal inference workers to resume experience collection... 
(22350 times) [2024-06-13 00:54:06,205][71000] InferenceWorker_p0-w0: resuming experience collection (22350 times) [2024-06-13 00:54:07,197][71000] Updated weights for policy 0, policy_version 121514 (0.0026) [2024-06-13 00:54:10,650][71000] Updated weights for policy 0, policy_version 121524 (0.0029) [2024-06-13 00:54:10,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 1991049216. Throughput: 0: 49835.4. Samples: 1519872620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 00:54:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:54:14,136][71000] Updated weights for policy 0, policy_version 121534 (0.0029) [2024-06-13 00:54:15,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1991262208. Throughput: 0: 49719.7. Samples: 1520162200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 00:54:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:54:17,401][71000] Updated weights for policy 0, policy_version 121544 (0.0031) [2024-06-13 00:54:20,655][71000] Updated weights for policy 0, policy_version 121554 (0.0033) [2024-06-13 00:54:20,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1991540736. Throughput: 0: 49704.4. Samples: 1520300760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 00:54:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:54:24,015][71000] Updated weights for policy 0, policy_version 121564 (0.0028) [2024-06-13 00:54:25,940][70768] Fps is (10 sec: 54067.3, 60 sec: 49974.4, 300 sec: 49540.8). Total num frames: 1991802880. Throughput: 0: 49580.5. Samples: 1520604020. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 00:54:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:54:27,199][71000] Updated weights for policy 0, policy_version 121574 (0.0038) [2024-06-13 00:54:30,787][71000] Updated weights for policy 0, policy_version 121584 (0.0025) [2024-06-13 00:54:30,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.3, 300 sec: 49596.3). Total num frames: 1992032256. Throughput: 0: 49584.4. Samples: 1520905800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 00:54:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:54:34,005][71000] Updated weights for policy 0, policy_version 121594 (0.0035) [2024-06-13 00:54:35,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49971.1, 300 sec: 49651.8). Total num frames: 1992278016. Throughput: 0: 49535.5. Samples: 1521052020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 00:54:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:54:37,361][71000] Updated weights for policy 0, policy_version 121604 (0.0023) [2024-06-13 00:54:40,597][71000] Updated weights for policy 0, policy_version 121614 (0.0030) [2024-06-13 00:54:40,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49151.9, 300 sec: 49540.7). Total num frames: 1992523776. Throughput: 0: 49443.9. Samples: 1521335180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 00:54:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:54:44,098][71000] Updated weights for policy 0, policy_version 121624 (0.0031) [2024-06-13 00:54:45,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1992785920. Throughput: 0: 49549.9. Samples: 1521640820. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 00:54:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:54:47,148][71000] Updated weights for policy 0, policy_version 121634 (0.0027) [2024-06-13 00:54:50,757][71000] Updated weights for policy 0, policy_version 121644 (0.0037) [2024-06-13 00:54:50,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49971.1, 300 sec: 49596.3). Total num frames: 1993031680. Throughput: 0: 49434.5. Samples: 1521797080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 00:54:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:54:54,072][71000] Updated weights for policy 0, policy_version 121654 (0.0023) [2024-06-13 00:54:55,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1993261056. Throughput: 0: 49271.0. Samples: 1522089820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 00:54:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:54:57,215][71000] Updated weights for policy 0, policy_version 121664 (0.0027) [2024-06-13 00:55:00,602][71000] Updated weights for policy 0, policy_version 121674 (0.0027) [2024-06-13 00:55:00,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 1993506816. Throughput: 0: 49280.4. Samples: 1522379820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 00:55:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:55:03,935][71000] Updated weights for policy 0, policy_version 121684 (0.0033) [2024-06-13 00:55:05,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 1993752576. Throughput: 0: 49578.7. Samples: 1522531800. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 00:55:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:55:07,212][71000] Updated weights for policy 0, policy_version 121694 (0.0019) [2024-06-13 00:55:08,818][70980] Signal inference workers to stop experience collection... (22400 times) [2024-06-13 00:55:08,821][70980] Signal inference workers to resume experience collection... (22400 times) [2024-06-13 00:55:08,832][71000] InferenceWorker_p0-w0: stopping experience collection (22400 times) [2024-06-13 00:55:08,842][71000] InferenceWorker_p0-w0: resuming experience collection (22400 times) [2024-06-13 00:55:10,486][71000] Updated weights for policy 0, policy_version 121704 (0.0034) [2024-06-13 00:55:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 1994014720. Throughput: 0: 49467.0. Samples: 1522830040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:55:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:55:13,999][71000] Updated weights for policy 0, policy_version 121714 (0.0029) [2024-06-13 00:55:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49971.2, 300 sec: 49485.3). Total num frames: 1994260480. Throughput: 0: 49487.1. Samples: 1523132720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:55:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:55:16,806][71000] Updated weights for policy 0, policy_version 121724 (0.0031) [2024-06-13 00:55:20,536][71000] Updated weights for policy 0, policy_version 121734 (0.0025) [2024-06-13 00:55:20,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 1994506240. Throughput: 0: 49494.8. Samples: 1523279280. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:55:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:55:23,704][71000] Updated weights for policy 0, policy_version 121744 (0.0028) [2024-06-13 00:55:25,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 1994752000. Throughput: 0: 49688.2. Samples: 1523571140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:55:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:55:27,086][71000] Updated weights for policy 0, policy_version 121754 (0.0028) [2024-06-13 00:55:30,131][71000] Updated weights for policy 0, policy_version 121764 (0.0030) [2024-06-13 00:55:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 1995014144. Throughput: 0: 49698.7. Samples: 1523877260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:55:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:55:33,689][71000] Updated weights for policy 0, policy_version 121774 (0.0026) [2024-06-13 00:55:35,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 1995259904. Throughput: 0: 49607.1. Samples: 1524029400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:55:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:55:36,559][71000] Updated weights for policy 0, policy_version 121784 (0.0028) [2024-06-13 00:55:40,092][71000] Updated weights for policy 0, policy_version 121794 (0.0024) [2024-06-13 00:55:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1995505664. Throughput: 0: 49787.9. Samples: 1524330280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:55:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:55:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000121796_1995505664.pth... 
[2024-06-13 00:55:41,015][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000121070_1983610880.pth [2024-06-13 00:55:43,469][71000] Updated weights for policy 0, policy_version 121804 (0.0033) [2024-06-13 00:55:45,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1995735040. Throughput: 0: 49503.9. Samples: 1524607500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:55:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:55:47,073][71000] Updated weights for policy 0, policy_version 121814 (0.0032) [2024-06-13 00:55:50,295][71000] Updated weights for policy 0, policy_version 121824 (0.0029) [2024-06-13 00:55:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1995997184. Throughput: 0: 49399.9. Samples: 1524754800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:55:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:55:53,512][71000] Updated weights for policy 0, policy_version 121834 (0.0027) [2024-06-13 00:55:55,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1996242944. Throughput: 0: 49582.8. Samples: 1525061260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:55:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:55:56,651][71000] Updated weights for policy 0, policy_version 121844 (0.0020) [2024-06-13 00:56:00,050][71000] Updated weights for policy 0, policy_version 121854 (0.0036) [2024-06-13 00:56:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1996488704. Throughput: 0: 49451.1. Samples: 1525358020. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:56:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:56:03,480][71000] Updated weights for policy 0, policy_version 121864 (0.0028) [2024-06-13 00:56:05,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49424.9, 300 sec: 49429.7). Total num frames: 1996718080. Throughput: 0: 49488.3. Samples: 1525506260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 00:56:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:56:06,647][71000] Updated weights for policy 0, policy_version 121874 (0.0024) [2024-06-13 00:56:10,207][71000] Updated weights for policy 0, policy_version 121884 (0.0027) [2024-06-13 00:56:10,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1996980224. Throughput: 0: 49509.3. Samples: 1525799060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 00:56:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:56:13,278][71000] Updated weights for policy 0, policy_version 121894 (0.0025) [2024-06-13 00:56:15,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1997225984. Throughput: 0: 49213.7. Samples: 1526091880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 00:56:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:56:16,939][71000] Updated weights for policy 0, policy_version 121904 (0.0024) [2024-06-13 00:56:20,236][71000] Updated weights for policy 0, policy_version 121914 (0.0025) [2024-06-13 00:56:20,940][70768] Fps is (10 sec: 47512.4, 60 sec: 49151.8, 300 sec: 49485.2). Total num frames: 1997455360. Throughput: 0: 49162.6. Samples: 1526241720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 00:56:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:56:21,680][70980] Signal inference workers to stop experience collection... 
(22450 times) [2024-06-13 00:56:21,680][70980] Signal inference workers to resume experience collection... (22450 times) [2024-06-13 00:56:21,698][71000] InferenceWorker_p0-w0: stopping experience collection (22450 times) [2024-06-13 00:56:21,728][71000] InferenceWorker_p0-w0: resuming experience collection (22450 times) [2024-06-13 00:56:23,350][71000] Updated weights for policy 0, policy_version 121924 (0.0031) [2024-06-13 00:56:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 1997701120. Throughput: 0: 48775.2. Samples: 1526525160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 00:56:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:56:26,875][71000] Updated weights for policy 0, policy_version 121934 (0.0033) [2024-06-13 00:56:29,957][71000] Updated weights for policy 0, policy_version 121944 (0.0026) [2024-06-13 00:56:30,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1997963264. Throughput: 0: 49169.0. Samples: 1526820100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 00:56:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:56:33,549][71000] Updated weights for policy 0, policy_version 121954 (0.0026) [2024-06-13 00:56:35,939][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 1998225408. Throughput: 0: 49269.0. Samples: 1526971900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 00:56:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 00:56:36,614][71000] Updated weights for policy 0, policy_version 121964 (0.0025) [2024-06-13 00:56:39,930][71000] Updated weights for policy 0, policy_version 121974 (0.0027) [2024-06-13 00:56:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 1998438400. Throughput: 0: 49200.3. Samples: 1527275280. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 00:56:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:56:43,179][71000] Updated weights for policy 0, policy_version 121984 (0.0031) [2024-06-13 00:56:45,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 1998700544. Throughput: 0: 49310.6. Samples: 1527577000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 00:56:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:56:46,807][71000] Updated weights for policy 0, policy_version 121994 (0.0028) [2024-06-13 00:56:49,616][71000] Updated weights for policy 0, policy_version 122004 (0.0023) [2024-06-13 00:56:50,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49540.8). Total num frames: 1998946304. Throughput: 0: 49278.2. Samples: 1527723780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 00:56:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:56:53,120][71000] Updated weights for policy 0, policy_version 122014 (0.0030) [2024-06-13 00:56:55,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1999208448. Throughput: 0: 49500.8. Samples: 1528026600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 00:56:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:56:56,195][71000] Updated weights for policy 0, policy_version 122024 (0.0036) [2024-06-13 00:56:59,982][71000] Updated weights for policy 0, policy_version 122034 (0.0033) [2024-06-13 00:57:00,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 1999454208. Throughput: 0: 49560.9. Samples: 1528322120. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 00:57:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:57:02,879][71000] Updated weights for policy 0, policy_version 122044 (0.0029) [2024-06-13 00:57:05,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 1999699968. Throughput: 0: 49433.6. Samples: 1528466220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 00:57:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:57:06,915][71000] Updated weights for policy 0, policy_version 122054 (0.0035) [2024-06-13 00:57:09,364][71000] Updated weights for policy 0, policy_version 122064 (0.0030) [2024-06-13 00:57:10,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 49485.3). Total num frames: 1999929344. Throughput: 0: 49481.4. Samples: 1528751820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 00:57:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:57:13,654][71000] Updated weights for policy 0, policy_version 122074 (0.0031) [2024-06-13 00:57:15,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2000191488. Throughput: 0: 49669.4. Samples: 1529055220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 00:57:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:57:16,339][71000] Updated weights for policy 0, policy_version 122084 (0.0036) [2024-06-13 00:57:17,862][70980] Signal inference workers to stop experience collection... (22500 times) [2024-06-13 00:57:17,912][71000] InferenceWorker_p0-w0: stopping experience collection (22500 times) [2024-06-13 00:57:17,918][70980] Signal inference workers to resume experience collection... 
(22500 times) [2024-06-13 00:57:17,931][71000] InferenceWorker_p0-w0: resuming experience collection (22500 times) [2024-06-13 00:57:20,108][71000] Updated weights for policy 0, policy_version 122094 (0.0034) [2024-06-13 00:57:20,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.4, 300 sec: 49540.8). Total num frames: 2000437248. Throughput: 0: 49448.9. Samples: 1529197100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 00:57:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:57:22,853][71000] Updated weights for policy 0, policy_version 122104 (0.0025) [2024-06-13 00:57:25,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49698.0, 300 sec: 49540.7). Total num frames: 2000683008. Throughput: 0: 49443.4. Samples: 1529500240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 00:57:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:57:26,674][71000] Updated weights for policy 0, policy_version 122114 (0.0018) [2024-06-13 00:57:29,490][71000] Updated weights for policy 0, policy_version 122124 (0.0031) [2024-06-13 00:57:30,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 2000945152. Throughput: 0: 49619.7. Samples: 1529809880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 00:57:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:57:33,393][71000] Updated weights for policy 0, policy_version 122134 (0.0024) [2024-06-13 00:57:35,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 2001190912. Throughput: 0: 49797.0. Samples: 1529964640. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 00:57:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:57:36,074][71000] Updated weights for policy 0, policy_version 122144 (0.0030) [2024-06-13 00:57:39,750][71000] Updated weights for policy 0, policy_version 122154 (0.0025) [2024-06-13 00:57:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 2001436672. Throughput: 0: 49547.6. Samples: 1530256240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 00:57:40,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 00:57:40,979][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000122159_2001453056.pth... [2024-06-13 00:57:41,040][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000121434_1989574656.pth [2024-06-13 00:57:42,701][71000] Updated weights for policy 0, policy_version 122164 (0.0030) [2024-06-13 00:57:45,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2001666048. Throughput: 0: 49742.8. Samples: 1530560540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 00:57:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 00:57:46,362][71000] Updated weights for policy 0, policy_version 122174 (0.0026) [2024-06-13 00:57:49,358][71000] Updated weights for policy 0, policy_version 122184 (0.0027) [2024-06-13 00:57:50,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 2001944576. Throughput: 0: 49663.6. Samples: 1530701080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 00:57:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:57:53,034][71000] Updated weights for policy 0, policy_version 122194 (0.0026) [2024-06-13 00:57:55,850][71000] Updated weights for policy 0, policy_version 122204 (0.0026) [2024-06-13 00:57:55,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.2, 300 sec: 49540.8). 
Total num frames: 2002190336. Throughput: 0: 49878.6. Samples: 1530996360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 00:57:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:57:59,578][71000] Updated weights for policy 0, policy_version 122214 (0.0025) [2024-06-13 00:58:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 2002436096. Throughput: 0: 49747.9. Samples: 1531293880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 00:58:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:58:02,858][71000] Updated weights for policy 0, policy_version 122224 (0.0023) [2024-06-13 00:58:05,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2002649088. Throughput: 0: 49778.6. Samples: 1531437140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 00:58:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:58:06,479][71000] Updated weights for policy 0, policy_version 122234 (0.0026) [2024-06-13 00:58:09,489][71000] Updated weights for policy 0, policy_version 122244 (0.0027) [2024-06-13 00:58:10,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 2002911232. Throughput: 0: 49551.8. Samples: 1531730060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 00:58:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 00:58:12,849][71000] Updated weights for policy 0, policy_version 122254 (0.0024) [2024-06-13 00:58:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2003156992. Throughput: 0: 49276.3. Samples: 1532027320. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 00:58:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:58:16,106][71000] Updated weights for policy 0, policy_version 122264 (0.0031) [2024-06-13 00:58:19,493][71000] Updated weights for policy 0, policy_version 122274 (0.0031) [2024-06-13 00:58:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49485.9). Total num frames: 2003402752. Throughput: 0: 49184.1. Samples: 1532177920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 00:58:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:58:22,829][71000] Updated weights for policy 0, policy_version 122284 (0.0030) [2024-06-13 00:58:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.2, 300 sec: 49429.7). Total num frames: 2003632128. Throughput: 0: 49306.7. Samples: 1532475040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 00:58:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 00:58:26,323][71000] Updated weights for policy 0, policy_version 122294 (0.0028) [2024-06-13 00:58:29,296][71000] Updated weights for policy 0, policy_version 122304 (0.0029) [2024-06-13 00:58:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49540.8). Total num frames: 2003894272. Throughput: 0: 48942.5. Samples: 1532762960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 00:58:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:58:33,055][71000] Updated weights for policy 0, policy_version 122314 (0.0036) [2024-06-13 00:58:35,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 2004140032. Throughput: 0: 49154.0. Samples: 1532913020. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 00:58:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:58:36,065][71000] Updated weights for policy 0, policy_version 122324 (0.0032) [2024-06-13 00:58:38,747][70980] Signal inference workers to stop experience collection... (22550 times) [2024-06-13 00:58:38,748][70980] Signal inference workers to resume experience collection... (22550 times) [2024-06-13 00:58:38,765][71000] InferenceWorker_p0-w0: stopping experience collection (22550 times) [2024-06-13 00:58:38,765][71000] InferenceWorker_p0-w0: resuming experience collection (22550 times) [2024-06-13 00:58:39,458][71000] Updated weights for policy 0, policy_version 122334 (0.0023) [2024-06-13 00:58:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2004385792. Throughput: 0: 49153.6. Samples: 1533208280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 00:58:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 00:58:42,834][71000] Updated weights for policy 0, policy_version 122344 (0.0026) [2024-06-13 00:58:45,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2004615168. Throughput: 0: 49104.8. Samples: 1533503600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 00:58:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:58:46,323][71000] Updated weights for policy 0, policy_version 122354 (0.0029) [2024-06-13 00:58:49,598][71000] Updated weights for policy 0, policy_version 122364 (0.0036) [2024-06-13 00:58:50,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 2004877312. Throughput: 0: 49059.5. Samples: 1533644820. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 00:58:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:58:52,997][71000] Updated weights for policy 0, policy_version 122374 (0.0026) [2024-06-13 00:58:55,939][70768] Fps is (10 sec: 50791.3, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 2005123072. Throughput: 0: 49045.0. Samples: 1533937080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 00:58:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:58:56,081][71000] Updated weights for policy 0, policy_version 122384 (0.0029) [2024-06-13 00:58:59,507][71000] Updated weights for policy 0, policy_version 122394 (0.0023) [2024-06-13 00:59:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.8, 300 sec: 49318.6). Total num frames: 2005352448. Throughput: 0: 49043.9. Samples: 1534234300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 00:59:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:59:03,179][71000] Updated weights for policy 0, policy_version 122404 (0.0019) [2024-06-13 00:59:05,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 2005614592. Throughput: 0: 49098.9. Samples: 1534387380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 00:59:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:59:06,102][71000] Updated weights for policy 0, policy_version 122414 (0.0025) [2024-06-13 00:59:09,457][71000] Updated weights for policy 0, policy_version 122424 (0.0025) [2024-06-13 00:59:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 2005843968. Throughput: 0: 49091.0. Samples: 1534684140. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 00:59:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 00:59:12,658][71000] Updated weights for policy 0, policy_version 122434 (0.0022) [2024-06-13 00:59:15,940][70768] Fps is (10 sec: 47514.5, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2006089728. Throughput: 0: 49316.1. Samples: 1534982180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 00:59:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 00:59:16,252][71000] Updated weights for policy 0, policy_version 122444 (0.0034) [2024-06-13 00:59:19,454][71000] Updated weights for policy 0, policy_version 122454 (0.0029) [2024-06-13 00:59:20,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2006351872. Throughput: 0: 49234.9. Samples: 1535128580. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-13 00:59:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:59:22,690][71000] Updated weights for policy 0, policy_version 122464 (0.0027) [2024-06-13 00:59:25,717][71000] Updated weights for policy 0, policy_version 122474 (0.0025) [2024-06-13 00:59:25,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2006614016. Throughput: 0: 49306.7. Samples: 1535427080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-13 00:59:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:59:29,236][71000] Updated weights for policy 0, policy_version 122484 (0.0031) [2024-06-13 00:59:30,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2006859776. Throughput: 0: 49451.7. Samples: 1535728920. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-13 00:59:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 00:59:32,382][71000] Updated weights for policy 0, policy_version 122494 (0.0027) [2024-06-13 00:59:35,867][71000] Updated weights for policy 0, policy_version 122504 (0.0032) [2024-06-13 00:59:35,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2007105536. Throughput: 0: 49599.1. Samples: 1535876780. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-13 00:59:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:59:39,067][71000] Updated weights for policy 0, policy_version 122514 (0.0037) [2024-06-13 00:59:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2007334912. Throughput: 0: 49539.0. Samples: 1536166340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-13 00:59:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 00:59:41,066][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000122519_2007351296.pth... [2024-06-13 00:59:41,114][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000121796_1995505664.pth [2024-06-13 00:59:42,846][71000] Updated weights for policy 0, policy_version 122524 (0.0029) [2024-06-13 00:59:44,735][70980] Signal inference workers to stop experience collection... (22600 times) [2024-06-13 00:59:44,789][71000] InferenceWorker_p0-w0: stopping experience collection (22600 times) [2024-06-13 00:59:44,796][70980] Signal inference workers to resume experience collection... (22600 times) [2024-06-13 00:59:44,797][71000] InferenceWorker_p0-w0: resuming experience collection (22600 times) [2024-06-13 00:59:45,755][71000] Updated weights for policy 0, policy_version 122534 (0.0023) [2024-06-13 00:59:45,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.3, 300 sec: 49374.2). Total num frames: 2007597056. Throughput: 0: 49400.6. Samples: 1536457320. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-13 00:59:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 00:59:49,222][71000] Updated weights for policy 0, policy_version 122544 (0.0026) [2024-06-13 00:59:50,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2007842816. Throughput: 0: 49454.3. Samples: 1536612820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-13 00:59:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 00:59:52,311][71000] Updated weights for policy 0, policy_version 122554 (0.0029) [2024-06-13 00:59:55,407][71000] Updated weights for policy 0, policy_version 122564 (0.0034) [2024-06-13 00:59:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2008088576. Throughput: 0: 49617.5. Samples: 1536916920. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-13 00:59:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 00:59:59,080][71000] Updated weights for policy 0, policy_version 122574 (0.0029) [2024-06-13 01:00:00,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 2008317952. Throughput: 0: 49242.2. Samples: 1537198080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-13 01:00:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:00:02,577][71000] Updated weights for policy 0, policy_version 122584 (0.0029) [2024-06-13 01:00:05,702][71000] Updated weights for policy 0, policy_version 122594 (0.0029) [2024-06-13 01:00:05,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.3, 300 sec: 49374.2). Total num frames: 2008580096. Throughput: 0: 49348.9. Samples: 1537349280. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-13 01:00:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:00:09,308][71000] Updated weights for policy 0, policy_version 122604 (0.0032) [2024-06-13 01:00:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2008809472. Throughput: 0: 49254.8. Samples: 1537643540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-13 01:00:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:00:12,421][71000] Updated weights for policy 0, policy_version 122614 (0.0036) [2024-06-13 01:00:15,643][71000] Updated weights for policy 0, policy_version 122624 (0.0029) [2024-06-13 01:00:15,940][70768] Fps is (10 sec: 49150.7, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 2009071616. Throughput: 0: 49233.1. Samples: 1537944420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-13 01:00:15,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 01:00:18,836][71000] Updated weights for policy 0, policy_version 122634 (0.0029) [2024-06-13 01:00:20,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2009284608. Throughput: 0: 49153.3. Samples: 1538088680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 01:00:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:00:22,144][71000] Updated weights for policy 0, policy_version 122644 (0.0022) [2024-06-13 01:00:25,483][71000] Updated weights for policy 0, policy_version 122654 (0.0035) [2024-06-13 01:00:25,939][70768] Fps is (10 sec: 49153.2, 60 sec: 49152.2, 300 sec: 49318.6). Total num frames: 2009563136. Throughput: 0: 49247.2. Samples: 1538382460. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 01:00:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:00:28,977][71000] Updated weights for policy 0, policy_version 122664 (0.0026) [2024-06-13 01:00:30,940][70768] Fps is (10 sec: 54065.9, 60 sec: 49424.8, 300 sec: 49374.1). Total num frames: 2009825280. Throughput: 0: 49488.6. Samples: 1538684320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 01:00:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:00:32,137][71000] Updated weights for policy 0, policy_version 122674 (0.0029) [2024-06-13 01:00:35,762][71000] Updated weights for policy 0, policy_version 122684 (0.0025) [2024-06-13 01:00:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2010054656. Throughput: 0: 49188.5. Samples: 1538826300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 01:00:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:00:38,789][71000] Updated weights for policy 0, policy_version 122694 (0.0031) [2024-06-13 01:00:40,939][70768] Fps is (10 sec: 44238.1, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2010267648. Throughput: 0: 48882.3. Samples: 1539116620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 01:00:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:00:42,629][71000] Updated weights for policy 0, policy_version 122704 (0.0028) [2024-06-13 01:00:45,365][71000] Updated weights for policy 0, policy_version 122714 (0.0025) [2024-06-13 01:00:45,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2010562560. Throughput: 0: 49262.2. Samples: 1539414880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 01:00:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:00:49,079][71000] Updated weights for policy 0, policy_version 122724 (0.0029) [2024-06-13 01:00:50,940][70768] Fps is (10 sec: 54066.8, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 2010808320. Throughput: 0: 49307.5. Samples: 1539568120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 01:00:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:00:52,043][71000] Updated weights for policy 0, policy_version 122734 (0.0032) [2024-06-13 01:00:55,906][71000] Updated weights for policy 0, policy_version 122744 (0.0026) [2024-06-13 01:00:55,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2011037696. Throughput: 0: 49421.4. Samples: 1539867500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 01:00:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:00:58,159][70980] Signal inference workers to stop experience collection... (22650 times) [2024-06-13 01:00:58,159][70980] Signal inference workers to resume experience collection... (22650 times) [2024-06-13 01:00:58,207][71000] InferenceWorker_p0-w0: stopping experience collection (22650 times) [2024-06-13 01:00:58,207][71000] InferenceWorker_p0-w0: resuming experience collection (22650 times) [2024-06-13 01:00:58,997][71000] Updated weights for policy 0, policy_version 122754 (0.0026) [2024-06-13 01:01:00,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2011267072. Throughput: 0: 49021.8. Samples: 1540150400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 01:01:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:01:02,479][71000] Updated weights for policy 0, policy_version 122764 (0.0023) [2024-06-13 01:01:05,474][71000] Updated weights for policy 0, policy_version 122774 (0.0022) [2024-06-13 01:01:05,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2011545600. Throughput: 0: 49139.6. Samples: 1540299960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 01:01:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:01:09,002][71000] Updated weights for policy 0, policy_version 122784 (0.0028) [2024-06-13 01:01:10,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2011774976. Throughput: 0: 49234.5. Samples: 1540598020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 01:01:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:01:12,098][71000] Updated weights for policy 0, policy_version 122794 (0.0019) [2024-06-13 01:01:15,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48606.0, 300 sec: 49263.1). Total num frames: 2011987968. Throughput: 0: 49145.6. Samples: 1540895860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 01:01:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:01:16,091][71000] Updated weights for policy 0, policy_version 122804 (0.0035) [2024-06-13 01:01:18,655][71000] Updated weights for policy 0, policy_version 122814 (0.0029) [2024-06-13 01:01:20,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2012250112. Throughput: 0: 48984.5. Samples: 1541030600. 
Policy #0 lag: (min: 0.0, avg: 13.7, max: 25.0) [2024-06-13 01:01:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:01:22,659][71000] Updated weights for policy 0, policy_version 122824 (0.0031) [2024-06-13 01:01:25,511][71000] Updated weights for policy 0, policy_version 122834 (0.0032) [2024-06-13 01:01:25,940][70768] Fps is (10 sec: 54065.7, 60 sec: 49424.8, 300 sec: 49374.1). Total num frames: 2012528640. Throughput: 0: 49146.7. Samples: 1541328240. Policy #0 lag: (min: 0.0, avg: 13.7, max: 25.0) [2024-06-13 01:01:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:01:29,210][71000] Updated weights for policy 0, policy_version 122844 (0.0029) [2024-06-13 01:01:30,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48606.1, 300 sec: 49207.5). Total num frames: 2012741632. Throughput: 0: 49147.6. Samples: 1541626520. Policy #0 lag: (min: 0.0, avg: 13.7, max: 25.0) [2024-06-13 01:01:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:01:31,989][71000] Updated weights for policy 0, policy_version 122854 (0.0024) [2024-06-13 01:01:35,940][70768] Fps is (10 sec: 45876.5, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2012987392. Throughput: 0: 49101.8. Samples: 1541777700. Policy #0 lag: (min: 0.0, avg: 13.7, max: 25.0) [2024-06-13 01:01:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:01:35,957][71000] Updated weights for policy 0, policy_version 122864 (0.0029) [2024-06-13 01:01:38,496][71000] Updated weights for policy 0, policy_version 122874 (0.0025) [2024-06-13 01:01:40,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 2013265920. Throughput: 0: 48951.5. Samples: 1542070320. Policy #0 lag: (min: 0.0, avg: 13.7, max: 25.0) [2024-06-13 01:01:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:01:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000122880_2013265920.pth... 
[2024-06-13 01:01:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000122159_2001453056.pth [2024-06-13 01:01:42,585][71000] Updated weights for policy 0, policy_version 122884 (0.0027) [2024-06-13 01:01:45,394][71000] Updated weights for policy 0, policy_version 122894 (0.0025) [2024-06-13 01:01:45,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2013511680. Throughput: 0: 49074.3. Samples: 1542358740. Policy #0 lag: (min: 0.0, avg: 13.7, max: 25.0) [2024-06-13 01:01:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:01:49,484][71000] Updated weights for policy 0, policy_version 122904 (0.0029) [2024-06-13 01:01:50,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48605.9, 300 sec: 49207.6). Total num frames: 2013724672. Throughput: 0: 49293.8. Samples: 1542518180. Policy #0 lag: (min: 0.0, avg: 13.7, max: 25.0) [2024-06-13 01:01:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:01:52,366][71000] Updated weights for policy 0, policy_version 122914 (0.0033) [2024-06-13 01:01:55,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2013970432. Throughput: 0: 48935.5. Samples: 1542800120. Policy #0 lag: (min: 0.0, avg: 13.7, max: 25.0) [2024-06-13 01:01:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:01:56,109][71000] Updated weights for policy 0, policy_version 122924 (0.0034) [2024-06-13 01:01:58,819][71000] Updated weights for policy 0, policy_version 122934 (0.0042) [2024-06-13 01:02:00,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2014232576. Throughput: 0: 48942.1. Samples: 1543098260. 
Policy #0 lag: (min: 0.0, avg: 13.7, max: 25.0) [2024-06-13 01:02:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:02:02,783][71000] Updated weights for policy 0, policy_version 122944 (0.0026) [2024-06-13 01:02:05,388][71000] Updated weights for policy 0, policy_version 122954 (0.0037) [2024-06-13 01:02:05,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2014494720. Throughput: 0: 49407.2. Samples: 1543253920. Policy #0 lag: (min: 0.0, avg: 13.7, max: 25.0) [2024-06-13 01:02:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:02:09,631][71000] Updated weights for policy 0, policy_version 122964 (0.0026) [2024-06-13 01:02:10,939][70768] Fps is (10 sec: 47514.4, 60 sec: 48879.1, 300 sec: 49207.5). Total num frames: 2014707712. Throughput: 0: 49271.9. Samples: 1543545460. Policy #0 lag: (min: 0.0, avg: 13.7, max: 25.0) [2024-06-13 01:02:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:02:12,257][71000] Updated weights for policy 0, policy_version 122974 (0.0036) [2024-06-13 01:02:15,938][71000] Updated weights for policy 0, policy_version 122984 (0.0029) [2024-06-13 01:02:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2014969856. Throughput: 0: 49263.9. Samples: 1543843400. Policy #0 lag: (min: 0.0, avg: 13.7, max: 25.0) [2024-06-13 01:02:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:02:17,464][70980] Signal inference workers to stop experience collection... (22700 times) [2024-06-13 01:02:17,464][70980] Signal inference workers to resume experience collection... 
(22700 times) [2024-06-13 01:02:17,480][71000] InferenceWorker_p0-w0: stopping experience collection (22700 times) [2024-06-13 01:02:17,481][71000] InferenceWorker_p0-w0: resuming experience collection (22700 times) [2024-06-13 01:02:18,586][71000] Updated weights for policy 0, policy_version 122994 (0.0029) [2024-06-13 01:02:20,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 2015232000. Throughput: 0: 49086.1. Samples: 1543986580. Policy #0 lag: (min: 0.0, avg: 13.7, max: 25.0) [2024-06-13 01:02:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:02:22,612][71000] Updated weights for policy 0, policy_version 123004 (0.0033) [2024-06-13 01:02:25,463][71000] Updated weights for policy 0, policy_version 123014 (0.0026) [2024-06-13 01:02:25,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48879.2, 300 sec: 49207.5). Total num frames: 2015461376. Throughput: 0: 49265.8. Samples: 1544287280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 01:02:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:02:29,398][71000] Updated weights for policy 0, policy_version 123024 (0.0024) [2024-06-13 01:02:30,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2015707136. Throughput: 0: 49572.9. Samples: 1544589520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 01:02:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:02:32,269][71000] Updated weights for policy 0, policy_version 123034 (0.0028) [2024-06-13 01:02:35,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2015936512. Throughput: 0: 49130.6. Samples: 1544729060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 01:02:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:02:36,123][71000] Updated weights for policy 0, policy_version 123044 (0.0034) [2024-06-13 01:02:38,742][71000] Updated weights for policy 0, policy_version 123054 (0.0035) [2024-06-13 01:02:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2016215040. Throughput: 0: 49429.4. Samples: 1545024440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 01:02:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:02:42,645][71000] Updated weights for policy 0, policy_version 123064 (0.0036) [2024-06-13 01:02:45,520][71000] Updated weights for policy 0, policy_version 123074 (0.0032) [2024-06-13 01:02:45,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2016460800. Throughput: 0: 49315.3. Samples: 1545317440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 01:02:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:02:49,323][71000] Updated weights for policy 0, policy_version 123084 (0.0025) [2024-06-13 01:02:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2016706560. Throughput: 0: 49333.3. Samples: 1545473920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 01:02:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:02:52,427][71000] Updated weights for policy 0, policy_version 123094 (0.0026) [2024-06-13 01:02:55,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2016919552. Throughput: 0: 49066.6. Samples: 1545753460. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 01:02:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:02:56,210][71000] Updated weights for policy 0, policy_version 123104 (0.0023) [2024-06-13 01:02:59,211][71000] Updated weights for policy 0, policy_version 123114 (0.0032) [2024-06-13 01:03:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2017198080. Throughput: 0: 48960.4. Samples: 1546046620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 01:03:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:03:02,739][71000] Updated weights for policy 0, policy_version 123124 (0.0034) [2024-06-13 01:03:05,937][71000] Updated weights for policy 0, policy_version 123134 (0.0031) [2024-06-13 01:03:05,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2017427456. Throughput: 0: 49174.8. Samples: 1546199440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 01:03:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:03:09,568][71000] Updated weights for policy 0, policy_version 123144 (0.0028) [2024-06-13 01:03:10,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2017656832. Throughput: 0: 49099.1. Samples: 1546496740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 01:03:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:03:12,221][71000] Updated weights for policy 0, policy_version 123154 (0.0029) [2024-06-13 01:03:15,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2017902592. Throughput: 0: 48736.8. Samples: 1546782680. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 01:03:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:03:16,344][71000] Updated weights for policy 0, policy_version 123164 (0.0026) [2024-06-13 01:03:19,140][71000] Updated weights for policy 0, policy_version 123174 (0.0035) [2024-06-13 01:03:19,542][70980] Signal inference workers to stop experience collection... (22750 times) [2024-06-13 01:03:19,543][70980] Signal inference workers to resume experience collection... (22750 times) [2024-06-13 01:03:19,560][71000] InferenceWorker_p0-w0: stopping experience collection (22750 times) [2024-06-13 01:03:19,560][71000] InferenceWorker_p0-w0: resuming experience collection (22750 times) [2024-06-13 01:03:20,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2018181120. Throughput: 0: 49067.4. Samples: 1546937100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 01:03:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:03:22,834][71000] Updated weights for policy 0, policy_version 123184 (0.0032) [2024-06-13 01:03:25,790][71000] Updated weights for policy 0, policy_version 123194 (0.0029) [2024-06-13 01:03:25,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2018410496. Throughput: 0: 48924.5. Samples: 1547226040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:03:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:03:29,453][71000] Updated weights for policy 0, policy_version 123204 (0.0023) [2024-06-13 01:03:30,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2018656256. Throughput: 0: 48996.9. Samples: 1547522300. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:03:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:03:32,624][71000] Updated weights for policy 0, policy_version 123214 (0.0030) [2024-06-13 01:03:35,939][70768] Fps is (10 sec: 45875.8, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2018869248. Throughput: 0: 48694.3. Samples: 1547665160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:03:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:03:36,269][71000] Updated weights for policy 0, policy_version 123224 (0.0020) [2024-06-13 01:03:39,379][71000] Updated weights for policy 0, policy_version 123234 (0.0027) [2024-06-13 01:03:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2019147776. Throughput: 0: 49075.5. Samples: 1547961860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:03:40,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 01:03:40,971][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000123240_2019164160.pth... [2024-06-13 01:03:41,016][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000122519_2007351296.pth [2024-06-13 01:03:42,975][71000] Updated weights for policy 0, policy_version 123244 (0.0028) [2024-06-13 01:03:45,940][70768] Fps is (10 sec: 49150.7, 60 sec: 48332.6, 300 sec: 49096.4). Total num frames: 2019360768. Throughput: 0: 48962.1. Samples: 1548249920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:03:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:03:46,163][71000] Updated weights for policy 0, policy_version 123254 (0.0024) [2024-06-13 01:03:49,483][71000] Updated weights for policy 0, policy_version 123264 (0.0025) [2024-06-13 01:03:50,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48605.7, 300 sec: 49152.0). Total num frames: 2019622912. Throughput: 0: 48748.7. Samples: 1548393140. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:03:50,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 01:03:52,758][71000] Updated weights for policy 0, policy_version 123274 (0.0025) [2024-06-13 01:03:55,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2019868672. Throughput: 0: 48606.2. Samples: 1548684020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:03:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:03:56,347][71000] Updated weights for policy 0, policy_version 123284 (0.0031) [2024-06-13 01:03:59,310][71000] Updated weights for policy 0, policy_version 123294 (0.0029) [2024-06-13 01:04:00,939][70768] Fps is (10 sec: 52429.8, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2020147200. Throughput: 0: 48961.9. Samples: 1548985960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:04:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:04:02,695][71000] Updated weights for policy 0, policy_version 123304 (0.0034) [2024-06-13 01:04:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2020360192. Throughput: 0: 48928.5. Samples: 1549138880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:04:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:04:05,991][71000] Updated weights for policy 0, policy_version 123314 (0.0028) [2024-06-13 01:04:09,455][71000] Updated weights for policy 0, policy_version 123324 (0.0018) [2024-06-13 01:04:10,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2020605952. Throughput: 0: 49040.0. Samples: 1549432840. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:04:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:04:12,609][71000] Updated weights for policy 0, policy_version 123334 (0.0026) [2024-06-13 01:04:15,833][71000] Updated weights for policy 0, policy_version 123344 (0.0028) [2024-06-13 01:04:15,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.2, 300 sec: 49207.5). Total num frames: 2020868096. Throughput: 0: 49065.8. Samples: 1549730260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:04:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:04:19,294][71000] Updated weights for policy 0, policy_version 123354 (0.0029) [2024-06-13 01:04:20,940][70768] Fps is (10 sec: 50789.4, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2021113856. Throughput: 0: 49125.5. Samples: 1549875820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:04:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:04:22,816][71000] Updated weights for policy 0, policy_version 123364 (0.0028) [2024-06-13 01:04:24,710][70980] Signal inference workers to stop experience collection... (22800 times) [2024-06-13 01:04:24,710][70980] Signal inference workers to resume experience collection... (22800 times) [2024-06-13 01:04:24,731][71000] InferenceWorker_p0-w0: stopping experience collection (22800 times) [2024-06-13 01:04:24,731][71000] InferenceWorker_p0-w0: resuming experience collection (22800 times) [2024-06-13 01:04:25,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 2021326848. Throughput: 0: 48927.5. Samples: 1550163600. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 01:04:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 01:04:26,219][71000] Updated weights for policy 0, policy_version 123374 (0.0034) [2024-06-13 01:04:29,472][71000] Updated weights for policy 0, policy_version 123384 (0.0028) [2024-06-13 01:04:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.7, 300 sec: 49040.9). Total num frames: 2021572608. Throughput: 0: 48890.2. Samples: 1550449980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 01:04:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:04:33,054][71000] Updated weights for policy 0, policy_version 123394 (0.0033) [2024-06-13 01:04:35,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2021818368. Throughput: 0: 48933.1. Samples: 1550595120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 01:04:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:04:36,114][71000] Updated weights for policy 0, policy_version 123404 (0.0031) [2024-06-13 01:04:39,820][71000] Updated weights for policy 0, policy_version 123414 (0.0032) [2024-06-13 01:04:40,940][70768] Fps is (10 sec: 49153.0, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2022064128. Throughput: 0: 49126.7. Samples: 1550894720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 01:04:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:04:42,953][71000] Updated weights for policy 0, policy_version 123424 (0.0032) [2024-06-13 01:04:45,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2022309888. Throughput: 0: 48910.1. Samples: 1551186920. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 01:04:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:04:46,439][71000] Updated weights for policy 0, policy_version 123434 (0.0035) [2024-06-13 01:04:49,556][71000] Updated weights for policy 0, policy_version 123444 (0.0034) [2024-06-13 01:04:50,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 2022539264. Throughput: 0: 48701.8. Samples: 1551330460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 01:04:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-13 01:04:52,904][71000] Updated weights for policy 0, policy_version 123454 (0.0031) [2024-06-13 01:04:55,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2022817792. Throughput: 0: 48859.6. Samples: 1551631520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 01:04:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:04:55,958][71000] Updated weights for policy 0, policy_version 123464 (0.0024) [2024-06-13 01:04:59,732][71000] Updated weights for policy 0, policy_version 123474 (0.0026) [2024-06-13 01:05:00,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48332.8, 300 sec: 49040.9). Total num frames: 2023047168. Throughput: 0: 48686.7. Samples: 1551921160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 01:05:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:05:02,865][71000] Updated weights for policy 0, policy_version 123484 (0.0027) [2024-06-13 01:05:05,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2023276544. Throughput: 0: 48562.9. Samples: 1552061140. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 01:05:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:05:06,248][71000] Updated weights for policy 0, policy_version 123494 (0.0034) [2024-06-13 01:05:09,421][71000] Updated weights for policy 0, policy_version 123504 (0.0025) [2024-06-13 01:05:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2023522304. Throughput: 0: 48798.3. Samples: 1552359520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 01:05:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:05:13,032][71000] Updated weights for policy 0, policy_version 123514 (0.0023) [2024-06-13 01:05:15,775][71000] Updated weights for policy 0, policy_version 123524 (0.0023) [2024-06-13 01:05:15,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2023817216. Throughput: 0: 49006.8. Samples: 1552655280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 01:05:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:05:17,507][70980] Signal inference workers to stop experience collection... (22850 times) [2024-06-13 01:05:17,508][70980] Signal inference workers to resume experience collection... (22850 times) [2024-06-13 01:05:17,522][71000] InferenceWorker_p0-w0: stopping experience collection (22850 times) [2024-06-13 01:05:17,522][71000] InferenceWorker_p0-w0: resuming experience collection (22850 times) [2024-06-13 01:05:19,835][71000] Updated weights for policy 0, policy_version 123534 (0.0026) [2024-06-13 01:05:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2024030208. Throughput: 0: 49091.8. Samples: 1552804260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 01:05:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:05:22,511][71000] Updated weights for policy 0, policy_version 123544 (0.0024) [2024-06-13 01:05:25,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2024275968. Throughput: 0: 49190.6. Samples: 1553108300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 01:05:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:05:26,498][71000] Updated weights for policy 0, policy_version 123554 (0.0027) [2024-06-13 01:05:29,422][71000] Updated weights for policy 0, policy_version 123564 (0.0024) [2024-06-13 01:05:30,939][70768] Fps is (10 sec: 47514.3, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 2024505344. Throughput: 0: 48965.4. Samples: 1553390360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:05:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:05:32,888][71000] Updated weights for policy 0, policy_version 123574 (0.0035) [2024-06-13 01:05:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2024783872. Throughput: 0: 49103.1. Samples: 1553540100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:05:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:05:35,957][71000] Updated weights for policy 0, policy_version 123584 (0.0033) [2024-06-13 01:05:39,975][71000] Updated weights for policy 0, policy_version 123594 (0.0033) [2024-06-13 01:05:40,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 2025013248. Throughput: 0: 49038.5. Samples: 1553838260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:05:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:05:40,958][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000123597_2025013248.pth... 
[2024-06-13 01:05:41,011][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000122880_2013265920.pth [2024-06-13 01:05:42,707][71000] Updated weights for policy 0, policy_version 123604 (0.0022) [2024-06-13 01:05:45,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2025259008. Throughput: 0: 49014.6. Samples: 1554126820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:05:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:05:46,663][71000] Updated weights for policy 0, policy_version 123614 (0.0022) [2024-06-13 01:05:49,458][71000] Updated weights for policy 0, policy_version 123624 (0.0024) [2024-06-13 01:05:50,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49424.8, 300 sec: 49040.9). Total num frames: 2025504768. Throughput: 0: 49194.8. Samples: 1554274920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:05:50,944][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:05:53,367][71000] Updated weights for policy 0, policy_version 123634 (0.0028) [2024-06-13 01:05:55,789][71000] Updated weights for policy 0, policy_version 123644 (0.0024) [2024-06-13 01:05:55,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.0, 300 sec: 49207.6). Total num frames: 2025783296. Throughput: 0: 49419.9. Samples: 1554583420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:05:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:06:00,003][71000] Updated weights for policy 0, policy_version 123654 (0.0027) [2024-06-13 01:06:00,940][70768] Fps is (10 sec: 50792.1, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 2026012672. Throughput: 0: 49635.6. Samples: 1554888880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:06:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:06:02,372][71000] Updated weights for policy 0, policy_version 123664 (0.0028) [2024-06-13 01:06:05,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49698.0, 300 sec: 49096.4). Total num frames: 2026258432. Throughput: 0: 49477.2. Samples: 1555030740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:06:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:06:06,236][71000] Updated weights for policy 0, policy_version 123674 (0.0028) [2024-06-13 01:06:08,950][71000] Updated weights for policy 0, policy_version 123684 (0.0027) [2024-06-13 01:06:10,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 2026520576. Throughput: 0: 49263.6. Samples: 1555325160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:06:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 01:06:12,881][71000] Updated weights for policy 0, policy_version 123694 (0.0027) [2024-06-13 01:06:15,509][71000] Updated weights for policy 0, policy_version 123704 (0.0028) [2024-06-13 01:06:15,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2026766336. Throughput: 0: 49621.6. Samples: 1555623340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:06:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:06:19,770][71000] Updated weights for policy 0, policy_version 123714 (0.0025) [2024-06-13 01:06:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.2, 300 sec: 49096.5). Total num frames: 2027012096. Throughput: 0: 49649.3. Samples: 1555774320. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:06:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:06:22,515][71000] Updated weights for policy 0, policy_version 123724 (0.0022) [2024-06-13 01:06:25,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2027225088. Throughput: 0: 49685.3. Samples: 1556074100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:06:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:06:26,293][71000] Updated weights for policy 0, policy_version 123734 (0.0036) [2024-06-13 01:06:27,037][70980] Signal inference workers to stop experience collection... (22900 times) [2024-06-13 01:06:27,060][71000] InferenceWorker_p0-w0: stopping experience collection (22900 times) [2024-06-13 01:06:27,142][70980] Signal inference workers to resume experience collection... (22900 times) [2024-06-13 01:06:27,143][71000] InferenceWorker_p0-w0: resuming experience collection (22900 times) [2024-06-13 01:06:28,977][71000] Updated weights for policy 0, policy_version 123744 (0.0029) [2024-06-13 01:06:30,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 49207.5). Total num frames: 2027503616. Throughput: 0: 49689.4. Samples: 1556362840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:06:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 01:06:32,718][71000] Updated weights for policy 0, policy_version 123754 (0.0028) [2024-06-13 01:06:35,806][71000] Updated weights for policy 0, policy_version 123764 (0.0029) [2024-06-13 01:06:35,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2027749376. Throughput: 0: 50003.9. Samples: 1556525080. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:06:35,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-13 01:06:39,824][71000] Updated weights for policy 0, policy_version 123774 (0.0038) [2024-06-13 01:06:40,940][70768] Fps is (10 sec: 49150.7, 60 sec: 49698.0, 300 sec: 49096.4). Total num frames: 2027995136. Throughput: 0: 49808.3. Samples: 1556824800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:06:40,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 01:06:42,152][71000] Updated weights for policy 0, policy_version 123784 (0.0030) [2024-06-13 01:06:45,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2028224512. Throughput: 0: 49444.0. Samples: 1557113860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:06:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:06:46,458][71000] Updated weights for policy 0, policy_version 123794 (0.0024) [2024-06-13 01:06:49,403][71000] Updated weights for policy 0, policy_version 123804 (0.0021) [2024-06-13 01:06:50,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49971.3, 300 sec: 49263.1). Total num frames: 2028503040. Throughput: 0: 49381.8. Samples: 1557252920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:06:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:06:53,142][71000] Updated weights for policy 0, policy_version 123814 (0.0030) [2024-06-13 01:06:55,890][71000] Updated weights for policy 0, policy_version 123824 (0.0032) [2024-06-13 01:06:55,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2028732416. Throughput: 0: 49532.3. Samples: 1557554120. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:06:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:06:59,569][71000] Updated weights for policy 0, policy_version 123834 (0.0029) [2024-06-13 01:07:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.0, 300 sec: 49152.0). Total num frames: 2028994560. Throughput: 0: 49656.9. Samples: 1557857900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:07:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:07:02,180][71000] Updated weights for policy 0, policy_version 123844 (0.0032) [2024-06-13 01:07:05,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48879.0, 300 sec: 49096.4). Total num frames: 2029191168. Throughput: 0: 49556.4. Samples: 1558004360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:07:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:07:06,283][71000] Updated weights for policy 0, policy_version 123854 (0.0032) [2024-06-13 01:07:08,917][71000] Updated weights for policy 0, policy_version 123864 (0.0041) [2024-06-13 01:07:10,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49424.8, 300 sec: 49207.5). Total num frames: 2029486080. Throughput: 0: 49375.8. Samples: 1558296020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:07:10,941][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:07:12,769][71000] Updated weights for policy 0, policy_version 123874 (0.0027) [2024-06-13 01:07:15,628][71000] Updated weights for policy 0, policy_version 123884 (0.0033) [2024-06-13 01:07:15,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 2029715456. Throughput: 0: 49433.1. Samples: 1558587340. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:07:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:07:19,476][71000] Updated weights for policy 0, policy_version 123894 (0.0030) [2024-06-13 01:07:20,576][70980] Signal inference workers to stop experience collection... (22950 times) [2024-06-13 01:07:20,577][70980] Signal inference workers to resume experience collection... (22950 times) [2024-06-13 01:07:20,628][71000] InferenceWorker_p0-w0: stopping experience collection (22950 times) [2024-06-13 01:07:20,628][71000] InferenceWorker_p0-w0: resuming experience collection (22950 times) [2024-06-13 01:07:20,940][70768] Fps is (10 sec: 49153.2, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2029977600. Throughput: 0: 49183.5. Samples: 1558738340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:07:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:07:22,243][71000] Updated weights for policy 0, policy_version 123904 (0.0030) [2024-06-13 01:07:25,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 2030190592. Throughput: 0: 49285.6. Samples: 1559042640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:07:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:07:25,952][71000] Updated weights for policy 0, policy_version 123914 (0.0031) [2024-06-13 01:07:28,551][71000] Updated weights for policy 0, policy_version 123924 (0.0027) [2024-06-13 01:07:30,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2030469120. Throughput: 0: 49339.1. Samples: 1559334120. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:07:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:07:32,570][71000] Updated weights for policy 0, policy_version 123934 (0.0025) [2024-06-13 01:07:35,357][71000] Updated weights for policy 0, policy_version 123944 (0.0024) [2024-06-13 01:07:35,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2030714880. Throughput: 0: 49582.3. Samples: 1559484120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 01:07:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:07:39,279][71000] Updated weights for policy 0, policy_version 123954 (0.0029) [2024-06-13 01:07:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.3, 300 sec: 49152.0). Total num frames: 2030960640. Throughput: 0: 49670.8. Samples: 1559789300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 01:07:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:07:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000123960_2030960640.pth... [2024-06-13 01:07:41,010][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000123240_2019164160.pth [2024-06-13 01:07:41,815][71000] Updated weights for policy 0, policy_version 123964 (0.0032) [2024-06-13 01:07:45,762][71000] Updated weights for policy 0, policy_version 123974 (0.0022) [2024-06-13 01:07:45,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 2031206400. Throughput: 0: 49520.2. Samples: 1560086300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 01:07:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:07:48,527][71000] Updated weights for policy 0, policy_version 123984 (0.0033) [2024-06-13 01:07:50,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.2, 300 sec: 49263.1). Total num frames: 2031452160. Throughput: 0: 49279.2. Samples: 1560221920. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 01:07:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:07:52,140][71000] Updated weights for policy 0, policy_version 123994 (0.0027) [2024-06-13 01:07:55,015][71000] Updated weights for policy 0, policy_version 124004 (0.0024) [2024-06-13 01:07:55,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2031697920. Throughput: 0: 49595.1. Samples: 1560527780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 01:07:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:07:58,855][71000] Updated weights for policy 0, policy_version 124014 (0.0029) [2024-06-13 01:08:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2031927296. Throughput: 0: 49612.2. Samples: 1560819880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 01:08:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:08:02,115][71000] Updated weights for policy 0, policy_version 124024 (0.0038) [2024-06-13 01:08:05,409][71000] Updated weights for policy 0, policy_version 124034 (0.0024) [2024-06-13 01:08:05,940][70768] Fps is (10 sec: 50789.9, 60 sec: 50244.3, 300 sec: 49318.6). Total num frames: 2032205824. Throughput: 0: 49447.1. Samples: 1560963460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 01:08:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:08:08,461][71000] Updated weights for policy 0, policy_version 124044 (0.0036) [2024-06-13 01:08:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.2, 300 sec: 49263.1). Total num frames: 2032435200. Throughput: 0: 49342.6. Samples: 1561263060. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 01:08:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:08:11,912][71000] Updated weights for policy 0, policy_version 124054 (0.0024) [2024-06-13 01:08:15,172][71000] Updated weights for policy 0, policy_version 124064 (0.0032) [2024-06-13 01:08:15,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49971.3, 300 sec: 49263.1). Total num frames: 2032713728. Throughput: 0: 49623.4. Samples: 1561567180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 01:08:15,944][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:08:18,352][71000] Updated weights for policy 0, policy_version 124074 (0.0031) [2024-06-13 01:08:20,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2032926720. Throughput: 0: 49580.2. Samples: 1561715220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 01:08:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:08:21,651][71000] Updated weights for policy 0, policy_version 124084 (0.0031) [2024-06-13 01:08:25,141][71000] Updated weights for policy 0, policy_version 124094 (0.0032) [2024-06-13 01:08:25,560][70980] Signal inference workers to stop experience collection... (23000 times) [2024-06-13 01:08:25,607][71000] InferenceWorker_p0-w0: stopping experience collection (23000 times) [2024-06-13 01:08:25,613][70980] Signal inference workers to resume experience collection... (23000 times) [2024-06-13 01:08:25,626][71000] InferenceWorker_p0-w0: resuming experience collection (23000 times) [2024-06-13 01:08:25,939][70768] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 49318.6). Total num frames: 2033205248. Throughput: 0: 49404.0. Samples: 1562012480. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 01:08:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:08:28,110][71000] Updated weights for policy 0, policy_version 124104 (0.0034) [2024-06-13 01:08:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2033418240. Throughput: 0: 49387.5. Samples: 1562308740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 01:08:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:08:31,605][71000] Updated weights for policy 0, policy_version 124114 (0.0032) [2024-06-13 01:08:34,815][71000] Updated weights for policy 0, policy_version 124124 (0.0024) [2024-06-13 01:08:35,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 2033696768. Throughput: 0: 49732.7. Samples: 1562459900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 01:08:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:08:38,171][71000] Updated weights for policy 0, policy_version 124134 (0.0031) [2024-06-13 01:08:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2033926144. Throughput: 0: 49626.5. Samples: 1562760980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 01:08:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:08:41,353][71000] Updated weights for policy 0, policy_version 124144 (0.0034) [2024-06-13 01:08:45,066][71000] Updated weights for policy 0, policy_version 124154 (0.0021) [2024-06-13 01:08:45,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 2034188288. Throughput: 0: 49674.7. Samples: 1563055240. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 01:08:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:08:48,034][71000] Updated weights for policy 0, policy_version 124164 (0.0041) [2024-06-13 01:08:50,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 2034434048. Throughput: 0: 49738.3. Samples: 1563201680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 01:08:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:08:51,442][71000] Updated weights for policy 0, policy_version 124174 (0.0028) [2024-06-13 01:08:54,538][71000] Updated weights for policy 0, policy_version 124184 (0.0030) [2024-06-13 01:08:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2034679808. Throughput: 0: 49789.4. Samples: 1563503580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 01:08:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:08:57,954][71000] Updated weights for policy 0, policy_version 124194 (0.0030) [2024-06-13 01:09:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 49429.7). Total num frames: 2034941952. Throughput: 0: 49856.5. Samples: 1563810720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 01:09:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:09:01,147][71000] Updated weights for policy 0, policy_version 124204 (0.0041) [2024-06-13 01:09:04,486][71000] Updated weights for policy 0, policy_version 124214 (0.0028) [2024-06-13 01:09:05,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2035171328. Throughput: 0: 49766.5. Samples: 1563954720. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 01:09:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 01:09:07,717][71000] Updated weights for policy 0, policy_version 124224 (0.0027) [2024-06-13 01:09:10,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49971.3, 300 sec: 49374.2). Total num frames: 2035433472. Throughput: 0: 49825.4. Samples: 1564254620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 01:09:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:09:11,050][71000] Updated weights for policy 0, policy_version 124234 (0.0033) [2024-06-13 01:09:14,414][71000] Updated weights for policy 0, policy_version 124244 (0.0024) [2024-06-13 01:09:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2035662848. Throughput: 0: 49577.3. Samples: 1564539720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 01:09:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:09:17,557][70980] Signal inference workers to stop experience collection... (23050 times) [2024-06-13 01:09:17,557][70980] Signal inference workers to resume experience collection... (23050 times) [2024-06-13 01:09:17,588][71000] InferenceWorker_p0-w0: stopping experience collection (23050 times) [2024-06-13 01:09:17,588][71000] InferenceWorker_p0-w0: resuming experience collection (23050 times) [2024-06-13 01:09:17,703][71000] Updated weights for policy 0, policy_version 124254 (0.0025) [2024-06-13 01:09:20,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 2035924992. Throughput: 0: 49567.2. Samples: 1564690420. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 01:09:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:09:21,051][71000] Updated weights for policy 0, policy_version 124264 (0.0028) [2024-06-13 01:09:24,343][71000] Updated weights for policy 0, policy_version 124274 (0.0026) [2024-06-13 01:09:25,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2036154368. Throughput: 0: 49468.1. Samples: 1564987040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 01:09:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:09:27,709][71000] Updated weights for policy 0, policy_version 124284 (0.0026) [2024-06-13 01:09:30,879][71000] Updated weights for policy 0, policy_version 124294 (0.0024) [2024-06-13 01:09:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 49540.7). Total num frames: 2036432896. Throughput: 0: 49789.7. Samples: 1565295780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 01:09:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:09:34,445][71000] Updated weights for policy 0, policy_version 124304 (0.0028) [2024-06-13 01:09:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2036645888. Throughput: 0: 49765.4. Samples: 1565441120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 01:09:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:09:37,769][71000] Updated weights for policy 0, policy_version 124314 (0.0026) [2024-06-13 01:09:40,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2036908032. Throughput: 0: 49465.2. Samples: 1565729520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 01:09:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:09:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000124323_2036908032.pth... 
[2024-06-13 01:09:41,006][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000123597_2025013248.pth [2024-06-13 01:09:41,163][71000] Updated weights for policy 0, policy_version 124324 (0.0025) [2024-06-13 01:09:44,447][71000] Updated weights for policy 0, policy_version 124334 (0.0032) [2024-06-13 01:09:45,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49424.9, 300 sec: 49540.8). Total num frames: 2037153792. Throughput: 0: 49245.7. Samples: 1566026780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 01:09:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:09:47,641][71000] Updated weights for policy 0, policy_version 124344 (0.0042) [2024-06-13 01:09:50,837][71000] Updated weights for policy 0, policy_version 124354 (0.0036) [2024-06-13 01:09:50,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2037415936. Throughput: 0: 49296.5. Samples: 1566173060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 01:09:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:09:54,529][71000] Updated weights for policy 0, policy_version 124364 (0.0033) [2024-06-13 01:09:55,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2037645312. Throughput: 0: 49109.7. Samples: 1566464560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 01:09:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:09:57,598][71000] Updated weights for policy 0, policy_version 124374 (0.0019) [2024-06-13 01:10:00,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48879.0, 300 sec: 49485.2). Total num frames: 2037874688. Throughput: 0: 49424.9. Samples: 1566763840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 01:10:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:10:01,117][71000] Updated weights for policy 0, policy_version 124384 (0.0035) [2024-06-13 01:10:04,426][71000] Updated weights for policy 0, policy_version 124394 (0.0029) [2024-06-13 01:10:05,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2038136832. Throughput: 0: 49344.3. Samples: 1566910920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 01:10:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:10:07,749][71000] Updated weights for policy 0, policy_version 124404 (0.0025) [2024-06-13 01:10:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 2038382592. Throughput: 0: 49290.6. Samples: 1567205120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 01:10:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:10:10,989][71000] Updated weights for policy 0, policy_version 124414 (0.0031) [2024-06-13 01:10:14,261][71000] Updated weights for policy 0, policy_version 124424 (0.0031) [2024-06-13 01:10:15,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2038611968. Throughput: 0: 49122.3. Samples: 1567506280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 01:10:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:10:17,898][71000] Updated weights for policy 0, policy_version 124434 (0.0029) [2024-06-13 01:10:20,805][71000] Updated weights for policy 0, policy_version 124444 (0.0032) [2024-06-13 01:10:20,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2038890496. Throughput: 0: 48948.4. Samples: 1567643800. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 01:10:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:10:22,329][70980] Signal inference workers to stop experience collection... (23100 times) [2024-06-13 01:10:22,378][71000] InferenceWorker_p0-w0: stopping experience collection (23100 times) [2024-06-13 01:10:22,380][70980] Signal inference workers to resume experience collection... (23100 times) [2024-06-13 01:10:22,390][71000] InferenceWorker_p0-w0: resuming experience collection (23100 times) [2024-06-13 01:10:24,431][71000] Updated weights for policy 0, policy_version 124454 (0.0035) [2024-06-13 01:10:25,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2039119872. Throughput: 0: 49079.6. Samples: 1567938100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 01:10:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:10:27,854][71000] Updated weights for policy 0, policy_version 124464 (0.0035) [2024-06-13 01:10:30,925][71000] Updated weights for policy 0, policy_version 124474 (0.0019) [2024-06-13 01:10:30,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.2, 300 sec: 49485.3). Total num frames: 2039382016. Throughput: 0: 48973.2. Samples: 1568230560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 01:10:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:10:34,325][71000] Updated weights for policy 0, policy_version 124484 (0.0026) [2024-06-13 01:10:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2039611392. Throughput: 0: 49095.9. Samples: 1568382380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 01:10:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:10:37,581][71000] Updated weights for policy 0, policy_version 124494 (0.0037) [2024-06-13 01:10:40,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 2039857152. 
Throughput: 0: 49249.3. Samples: 1568680780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 01:10:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:10:41,038][71000] Updated weights for policy 0, policy_version 124504 (0.0034) [2024-06-13 01:10:44,306][71000] Updated weights for policy 0, policy_version 124514 (0.0030) [2024-06-13 01:10:45,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 2040102912. Throughput: 0: 49190.7. Samples: 1568977420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 01:10:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:10:47,555][71000] Updated weights for policy 0, policy_version 124524 (0.0021) [2024-06-13 01:10:50,542][71000] Updated weights for policy 0, policy_version 124534 (0.0028) [2024-06-13 01:10:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2040365056. Throughput: 0: 49452.9. Samples: 1569136300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 01:10:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:10:54,120][71000] Updated weights for policy 0, policy_version 124544 (0.0031) [2024-06-13 01:10:55,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48878.9, 300 sec: 49374.2). Total num frames: 2040578048. Throughput: 0: 49431.6. Samples: 1569429540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 01:10:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:10:57,590][71000] Updated weights for policy 0, policy_version 124554 (0.0026) [2024-06-13 01:11:00,694][71000] Updated weights for policy 0, policy_version 124564 (0.0030) [2024-06-13 01:11:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 2040856576. Throughput: 0: 49220.7. Samples: 1569721220. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 01:11:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:11:04,101][71000] Updated weights for policy 0, policy_version 124574 (0.0028) [2024-06-13 01:11:05,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2041102336. Throughput: 0: 49428.0. Samples: 1569868060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 01:11:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:11:07,232][71000] Updated weights for policy 0, policy_version 124584 (0.0029) [2024-06-13 01:11:10,467][71000] Updated weights for policy 0, policy_version 124594 (0.0027) [2024-06-13 01:11:10,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2041348096. Throughput: 0: 49617.0. Samples: 1570170860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 01:11:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:11:14,166][71000] Updated weights for policy 0, policy_version 124604 (0.0038) [2024-06-13 01:11:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2041593856. Throughput: 0: 49773.2. Samples: 1570470360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 01:11:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:11:17,149][71000] Updated weights for policy 0, policy_version 124614 (0.0031) [2024-06-13 01:11:20,846][71000] Updated weights for policy 0, policy_version 124624 (0.0027) [2024-06-13 01:11:20,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49540.8). Total num frames: 2041839616. Throughput: 0: 49531.9. Samples: 1570611320. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 01:11:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:11:23,813][71000] Updated weights for policy 0, policy_version 124634 (0.0023) [2024-06-13 01:11:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2042085376. Throughput: 0: 49444.9. Samples: 1570905800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 01:11:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:11:27,602][71000] Updated weights for policy 0, policy_version 124644 (0.0026) [2024-06-13 01:11:30,600][71000] Updated weights for policy 0, policy_version 124654 (0.0032) [2024-06-13 01:11:30,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2042331136. Throughput: 0: 49168.4. Samples: 1571190000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 01:11:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:11:34,362][71000] Updated weights for policy 0, policy_version 124664 (0.0029) [2024-06-13 01:11:35,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2042560512. Throughput: 0: 48977.5. Samples: 1571340280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 01:11:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:11:37,097][70980] Signal inference workers to stop experience collection... (23150 times) [2024-06-13 01:11:37,102][70980] Signal inference workers to resume experience collection... (23150 times) [2024-06-13 01:11:37,130][71000] InferenceWorker_p0-w0: stopping experience collection (23150 times) [2024-06-13 01:11:37,130][71000] InferenceWorker_p0-w0: resuming experience collection (23150 times) [2024-06-13 01:11:37,237][71000] Updated weights for policy 0, policy_version 124674 (0.0031) [2024-06-13 01:11:40,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2042806272. 
Throughput: 0: 49140.0. Samples: 1571640840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 01:11:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:11:41,037][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000124684_2042822656.pth... [2024-06-13 01:11:41,047][71000] Updated weights for policy 0, policy_version 124684 (0.0032) [2024-06-13 01:11:41,084][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000123960_2030960640.pth [2024-06-13 01:11:43,755][71000] Updated weights for policy 0, policy_version 124694 (0.0028) [2024-06-13 01:11:45,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2043068416. Throughput: 0: 49221.5. Samples: 1571936180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 01:11:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:11:47,605][71000] Updated weights for policy 0, policy_version 124704 (0.0024) [2024-06-13 01:11:50,712][71000] Updated weights for policy 0, policy_version 124714 (0.0033) [2024-06-13 01:11:50,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2043314176. Throughput: 0: 49284.9. Samples: 1572085880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 01:11:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:11:54,253][71000] Updated weights for policy 0, policy_version 124724 (0.0031) [2024-06-13 01:11:55,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2043543552. Throughput: 0: 48979.1. Samples: 1572374920. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 01:11:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:11:57,291][71000] Updated weights for policy 0, policy_version 124734 (0.0026) [2024-06-13 01:12:00,818][71000] Updated weights for policy 0, policy_version 124744 (0.0027) [2024-06-13 01:12:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49540.8). Total num frames: 2043805696. Throughput: 0: 49073.4. Samples: 1572678660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 01:12:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:12:03,663][71000] Updated weights for policy 0, policy_version 124754 (0.0030) [2024-06-13 01:12:05,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2044051456. Throughput: 0: 49264.1. Samples: 1572828200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 01:12:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:12:07,611][71000] Updated weights for policy 0, policy_version 124764 (0.0031) [2024-06-13 01:12:10,796][71000] Updated weights for policy 0, policy_version 124774 (0.0033) [2024-06-13 01:12:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2044297216. Throughput: 0: 49152.3. Samples: 1573117660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 01:12:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:12:14,260][71000] Updated weights for policy 0, policy_version 124784 (0.0024) [2024-06-13 01:12:15,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 2044542976. Throughput: 0: 49351.9. Samples: 1573410840. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 01:12:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:12:17,391][71000] Updated weights for policy 0, policy_version 124794 (0.0034) [2024-06-13 01:12:20,766][71000] Updated weights for policy 0, policy_version 124804 (0.0024) [2024-06-13 01:12:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49540.7). Total num frames: 2044805120. Throughput: 0: 49216.2. Samples: 1573555020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 01:12:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:12:23,950][71000] Updated weights for policy 0, policy_version 124814 (0.0022) [2024-06-13 01:12:25,939][70768] Fps is (10 sec: 47514.6, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2045018112. Throughput: 0: 49118.6. Samples: 1573851180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 01:12:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:12:27,491][71000] Updated weights for policy 0, policy_version 124824 (0.0032) [2024-06-13 01:12:30,750][71000] Updated weights for policy 0, policy_version 124834 (0.0030) [2024-06-13 01:12:30,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2045280256. Throughput: 0: 49062.2. Samples: 1574143980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 01:12:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:12:33,986][71000] Updated weights for policy 0, policy_version 124844 (0.0025) [2024-06-13 01:12:35,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2045509632. Throughput: 0: 49167.7. Samples: 1574298420. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 01:12:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:12:37,444][71000] Updated weights for policy 0, policy_version 124854 (0.0031) [2024-06-13 01:12:40,720][71000] Updated weights for policy 0, policy_version 124864 (0.0027) [2024-06-13 01:12:40,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2045771776. Throughput: 0: 49304.1. Samples: 1574593600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 01:12:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:12:44,031][71000] Updated weights for policy 0, policy_version 124874 (0.0029) [2024-06-13 01:12:45,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2046017536. Throughput: 0: 49114.8. Samples: 1574888820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:12:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:12:47,143][71000] Updated weights for policy 0, policy_version 124884 (0.0028) [2024-06-13 01:12:50,742][71000] Updated weights for policy 0, policy_version 124894 (0.0021) [2024-06-13 01:12:50,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2046263296. Throughput: 0: 49231.6. Samples: 1575043620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:12:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:12:53,149][70980] Signal inference workers to stop experience collection... (23200 times) [2024-06-13 01:12:53,150][70980] Signal inference workers to resume experience collection... 
(23200 times) [2024-06-13 01:12:53,168][71000] InferenceWorker_p0-w0: stopping experience collection (23200 times) [2024-06-13 01:12:53,168][71000] InferenceWorker_p0-w0: resuming experience collection (23200 times) [2024-06-13 01:12:53,441][71000] Updated weights for policy 0, policy_version 124904 (0.0023) [2024-06-13 01:12:55,941][70768] Fps is (10 sec: 49142.8, 60 sec: 49423.7, 300 sec: 49429.4). Total num frames: 2046509056. Throughput: 0: 49485.7. Samples: 1575344600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:12:55,942][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:12:57,489][71000] Updated weights for policy 0, policy_version 124914 (0.0024) [2024-06-13 01:13:00,410][71000] Updated weights for policy 0, policy_version 124924 (0.0021) [2024-06-13 01:13:00,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2046754816. Throughput: 0: 49397.5. Samples: 1575633720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:13:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:13:03,863][71000] Updated weights for policy 0, policy_version 124934 (0.0034) [2024-06-13 01:13:05,940][70768] Fps is (10 sec: 50799.3, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2047016960. Throughput: 0: 49699.3. Samples: 1575791480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:13:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 01:13:06,846][71000] Updated weights for policy 0, policy_version 124944 (0.0028) [2024-06-13 01:13:10,738][71000] Updated weights for policy 0, policy_version 124954 (0.0025) [2024-06-13 01:13:10,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2047262720. Throughput: 0: 49702.1. Samples: 1576087780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:13:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:13:13,521][71000] Updated weights for policy 0, policy_version 124964 (0.0031) [2024-06-13 01:13:15,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.1, 300 sec: 49374.1). Total num frames: 2047492096. Throughput: 0: 49653.3. Samples: 1576378380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:13:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:13:17,345][71000] Updated weights for policy 0, policy_version 124974 (0.0028) [2024-06-13 01:13:20,401][71000] Updated weights for policy 0, policy_version 124984 (0.0029) [2024-06-13 01:13:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2047754240. Throughput: 0: 49458.5. Samples: 1576524060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:13:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:13:23,876][71000] Updated weights for policy 0, policy_version 124994 (0.0033) [2024-06-13 01:13:25,944][70768] Fps is (10 sec: 52406.5, 60 sec: 49967.6, 300 sec: 49484.5). Total num frames: 2048016384. Throughput: 0: 49431.7. Samples: 1576818240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:13:25,944][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:13:26,980][71000] Updated weights for policy 0, policy_version 125004 (0.0032) [2024-06-13 01:13:30,224][71000] Updated weights for policy 0, policy_version 125014 (0.0024) [2024-06-13 01:13:30,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2048245760. Throughput: 0: 49544.2. Samples: 1577118320. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:13:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:13:33,368][71000] Updated weights for policy 0, policy_version 125024 (0.0032) [2024-06-13 01:13:35,940][70768] Fps is (10 sec: 47533.9, 60 sec: 49698.0, 300 sec: 49374.2). Total num frames: 2048491520. Throughput: 0: 49404.8. Samples: 1577266840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:13:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:13:36,990][71000] Updated weights for policy 0, policy_version 125034 (0.0030) [2024-06-13 01:13:40,357][71000] Updated weights for policy 0, policy_version 125044 (0.0032) [2024-06-13 01:13:40,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2048737280. Throughput: 0: 49206.0. Samples: 1577558780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 01:13:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:13:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000125045_2048737280.pth... [2024-06-13 01:13:40,993][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000124323_2036908032.pth [2024-06-13 01:13:43,719][71000] Updated weights for policy 0, policy_version 125054 (0.0024) [2024-06-13 01:13:45,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2048983040. Throughput: 0: 49242.7. Samples: 1577849640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-13 01:13:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 01:13:46,908][71000] Updated weights for policy 0, policy_version 125064 (0.0033) [2024-06-13 01:13:50,121][71000] Updated weights for policy 0, policy_version 125074 (0.0020) [2024-06-13 01:13:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 2049245184. Throughput: 0: 49220.4. Samples: 1578006400. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-13 01:13:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:13:53,655][71000] Updated weights for policy 0, policy_version 125084 (0.0023) [2024-06-13 01:13:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49426.5, 300 sec: 49263.1). Total num frames: 2049474560. Throughput: 0: 49296.0. Samples: 1578306100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-13 01:13:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:13:56,737][71000] Updated weights for policy 0, policy_version 125094 (0.0032) [2024-06-13 01:13:57,804][70980] Signal inference workers to stop experience collection... (23250 times) [2024-06-13 01:13:57,805][70980] Signal inference workers to resume experience collection... (23250 times) [2024-06-13 01:13:57,823][71000] InferenceWorker_p0-w0: stopping experience collection (23250 times) [2024-06-13 01:13:57,851][71000] InferenceWorker_p0-w0: resuming experience collection (23250 times) [2024-06-13 01:14:00,012][71000] Updated weights for policy 0, policy_version 125104 (0.0033) [2024-06-13 01:14:00,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2049720320. Throughput: 0: 49453.4. Samples: 1578603780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-13 01:14:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:14:03,209][71000] Updated weights for policy 0, policy_version 125114 (0.0021) [2024-06-13 01:14:05,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2049982464. Throughput: 0: 49422.2. Samples: 1578748060. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-13 01:14:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:14:06,802][71000] Updated weights for policy 0, policy_version 125124 (0.0023) [2024-06-13 01:14:09,771][71000] Updated weights for policy 0, policy_version 125134 (0.0033) [2024-06-13 01:14:10,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2050228224. Throughput: 0: 49815.4. Samples: 1579059720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-13 01:14:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:14:13,338][71000] Updated weights for policy 0, policy_version 125144 (0.0027) [2024-06-13 01:14:15,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49971.3, 300 sec: 49374.2). Total num frames: 2050490368. Throughput: 0: 49909.1. Samples: 1579364220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-13 01:14:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:14:16,184][71000] Updated weights for policy 0, policy_version 125154 (0.0026) [2024-06-13 01:14:19,783][71000] Updated weights for policy 0, policy_version 125164 (0.0025) [2024-06-13 01:14:20,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2050719744. Throughput: 0: 49596.5. Samples: 1579498680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-13 01:14:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:14:23,038][71000] Updated weights for policy 0, policy_version 125174 (0.0035) [2024-06-13 01:14:25,939][70768] Fps is (10 sec: 45875.3, 60 sec: 48882.5, 300 sec: 49207.6). Total num frames: 2050949120. Throughput: 0: 49773.8. Samples: 1579798600. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-13 01:14:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:14:26,408][71000] Updated weights for policy 0, policy_version 125184 (0.0029) [2024-06-13 01:14:29,835][71000] Updated weights for policy 0, policy_version 125194 (0.0029) [2024-06-13 01:14:30,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2051227648. Throughput: 0: 49899.5. Samples: 1580095120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-13 01:14:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:14:33,303][71000] Updated weights for policy 0, policy_version 125204 (0.0031) [2024-06-13 01:14:35,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 2051473408. Throughput: 0: 49856.8. Samples: 1580249960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-13 01:14:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:14:36,421][71000] Updated weights for policy 0, policy_version 125214 (0.0029) [2024-06-13 01:14:39,627][71000] Updated weights for policy 0, policy_version 125224 (0.0026) [2024-06-13 01:14:40,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.0, 300 sec: 49374.2). Total num frames: 2051719168. Throughput: 0: 49757.2. Samples: 1580545180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-13 01:14:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:14:43,159][71000] Updated weights for policy 0, policy_version 125234 (0.0032) [2024-06-13 01:14:45,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2051948544. Throughput: 0: 49646.7. Samples: 1580837880. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-13 01:14:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:14:46,398][71000] Updated weights for policy 0, policy_version 125244 (0.0021) [2024-06-13 01:14:49,950][71000] Updated weights for policy 0, policy_version 125254 (0.0036) [2024-06-13 01:14:50,939][70768] Fps is (10 sec: 50791.1, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2052227072. Throughput: 0: 49831.6. Samples: 1580990480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:14:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:14:53,180][71000] Updated weights for policy 0, policy_version 125264 (0.0028) [2024-06-13 01:14:55,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2052456448. Throughput: 0: 49519.2. Samples: 1581288080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:14:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:14:56,204][71000] Updated weights for policy 0, policy_version 125274 (0.0026) [2024-06-13 01:14:56,662][70980] Signal inference workers to stop experience collection... (23300 times) [2024-06-13 01:14:56,663][70980] Signal inference workers to resume experience collection... (23300 times) [2024-06-13 01:14:56,702][71000] InferenceWorker_p0-w0: stopping experience collection (23300 times) [2024-06-13 01:14:56,702][71000] InferenceWorker_p0-w0: resuming experience collection (23300 times) [2024-06-13 01:14:59,590][71000] Updated weights for policy 0, policy_version 125284 (0.0027) [2024-06-13 01:15:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49971.3, 300 sec: 49429.7). Total num frames: 2052718592. Throughput: 0: 49464.9. Samples: 1581590140. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:15:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:15:02,878][71000] Updated weights for policy 0, policy_version 125294 (0.0022) [2024-06-13 01:15:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2052947968. Throughput: 0: 49739.5. Samples: 1581736960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:15:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 01:15:06,470][71000] Updated weights for policy 0, policy_version 125304 (0.0031) [2024-06-13 01:15:09,690][71000] Updated weights for policy 0, policy_version 125314 (0.0035) [2024-06-13 01:15:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 2053226496. Throughput: 0: 49587.1. Samples: 1582030020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:15:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:15:13,215][71000] Updated weights for policy 0, policy_version 125324 (0.0027) [2024-06-13 01:15:15,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2053439488. Throughput: 0: 49357.6. Samples: 1582316200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:15:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:15:16,214][71000] Updated weights for policy 0, policy_version 125334 (0.0032) [2024-06-13 01:15:19,914][71000] Updated weights for policy 0, policy_version 125344 (0.0025) [2024-06-13 01:15:20,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2053685248. Throughput: 0: 49161.4. Samples: 1582462220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:15:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:15:22,562][71000] Updated weights for policy 0, policy_version 125354 (0.0027) [2024-06-13 01:15:25,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 2053931008. Throughput: 0: 49416.9. Samples: 1582768940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:15:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:15:26,313][71000] Updated weights for policy 0, policy_version 125364 (0.0026) [2024-06-13 01:15:29,243][71000] Updated weights for policy 0, policy_version 125374 (0.0039) [2024-06-13 01:15:30,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 2054209536. Throughput: 0: 49441.8. Samples: 1583062760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:15:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:15:33,035][71000] Updated weights for policy 0, policy_version 125384 (0.0024) [2024-06-13 01:15:35,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2054438912. Throughput: 0: 49500.4. Samples: 1583218000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:15:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:15:36,112][71000] Updated weights for policy 0, policy_version 125394 (0.0025) [2024-06-13 01:15:39,450][71000] Updated weights for policy 0, policy_version 125404 (0.0025) [2024-06-13 01:15:40,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 2054717440. Throughput: 0: 49596.8. Samples: 1583519940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:15:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:15:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000125410_2054717440.pth... 
[2024-06-13 01:15:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000124684_2042822656.pth [2024-06-13 01:15:42,400][71000] Updated weights for policy 0, policy_version 125414 (0.0023) [2024-06-13 01:15:45,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 2054930432. Throughput: 0: 49430.6. Samples: 1583814520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:15:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:15:46,036][71000] Updated weights for policy 0, policy_version 125424 (0.0030) [2024-06-13 01:15:48,999][71000] Updated weights for policy 0, policy_version 125434 (0.0030) [2024-06-13 01:15:50,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 2055176192. Throughput: 0: 49319.0. Samples: 1583956320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:15:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:15:52,635][71000] Updated weights for policy 0, policy_version 125444 (0.0032) [2024-06-13 01:15:55,730][71000] Updated weights for policy 0, policy_version 125454 (0.0028) [2024-06-13 01:15:55,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2055438336. Throughput: 0: 49522.6. Samples: 1584258540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:15:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:15:59,236][71000] Updated weights for policy 0, policy_version 125464 (0.0025) [2024-06-13 01:16:00,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2055700480. Throughput: 0: 49639.0. Samples: 1584549960. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:16:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:16:02,281][71000] Updated weights for policy 0, policy_version 125474 (0.0034) [2024-06-13 01:16:05,704][71000] Updated weights for policy 0, policy_version 125484 (0.0034) [2024-06-13 01:16:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2055929856. Throughput: 0: 49863.0. Samples: 1584706060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:16:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:16:09,001][71000] Updated weights for policy 0, policy_version 125494 (0.0033) [2024-06-13 01:16:10,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 49374.2). Total num frames: 2056159232. Throughput: 0: 49528.1. Samples: 1584997700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:16:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:16:11,149][70980] Signal inference workers to stop experience collection... (23350 times) [2024-06-13 01:16:11,150][70980] Signal inference workers to resume experience collection... (23350 times) [2024-06-13 01:16:11,166][71000] InferenceWorker_p0-w0: stopping experience collection (23350 times) [2024-06-13 01:16:11,166][71000] InferenceWorker_p0-w0: resuming experience collection (23350 times) [2024-06-13 01:16:12,395][71000] Updated weights for policy 0, policy_version 125504 (0.0024) [2024-06-13 01:16:15,699][71000] Updated weights for policy 0, policy_version 125514 (0.0031) [2024-06-13 01:16:15,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2056421376. Throughput: 0: 49386.2. Samples: 1585285140. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:16:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:16:19,268][71000] Updated weights for policy 0, policy_version 125524 (0.0031) [2024-06-13 01:16:20,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 2056683520. Throughput: 0: 49501.2. Samples: 1585445560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:16:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:16:22,294][71000] Updated weights for policy 0, policy_version 125534 (0.0031) [2024-06-13 01:16:25,709][71000] Updated weights for policy 0, policy_version 125544 (0.0022) [2024-06-13 01:16:25,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 2056929280. Throughput: 0: 49235.9. Samples: 1585735560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:16:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 01:16:28,843][71000] Updated weights for policy 0, policy_version 125554 (0.0034) [2024-06-13 01:16:30,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 2057142272. Throughput: 0: 49346.2. Samples: 1586035100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:16:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:16:32,363][71000] Updated weights for policy 0, policy_version 125564 (0.0036) [2024-06-13 01:16:35,462][71000] Updated weights for policy 0, policy_version 125574 (0.0038) [2024-06-13 01:16:35,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2057404416. Throughput: 0: 49309.9. Samples: 1586175260. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:16:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:16:38,735][71000] Updated weights for policy 0, policy_version 125584 (0.0024) [2024-06-13 01:16:40,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 2057666560. Throughput: 0: 49272.6. Samples: 1586475800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:16:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:16:41,779][71000] Updated weights for policy 0, policy_version 125594 (0.0023) [2024-06-13 01:16:45,574][71000] Updated weights for policy 0, policy_version 125604 (0.0035) [2024-06-13 01:16:45,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2057895936. Throughput: 0: 49315.2. Samples: 1586769140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 01:16:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:16:48,574][71000] Updated weights for policy 0, policy_version 125614 (0.0029) [2024-06-13 01:16:50,940][70768] Fps is (10 sec: 45873.9, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2058125312. Throughput: 0: 48940.3. Samples: 1586908380. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:16:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:16:52,386][71000] Updated weights for policy 0, policy_version 125624 (0.0030) [2024-06-13 01:16:55,371][71000] Updated weights for policy 0, policy_version 125634 (0.0024) [2024-06-13 01:16:55,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2058387456. Throughput: 0: 48997.8. Samples: 1587202600. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:16:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:16:58,920][71000] Updated weights for policy 0, policy_version 125644 (0.0027) [2024-06-13 01:17:00,939][70768] Fps is (10 sec: 52430.3, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 2058649600. Throughput: 0: 49392.0. Samples: 1587507780. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:17:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:17:01,653][71000] Updated weights for policy 0, policy_version 125654 (0.0022) [2024-06-13 01:17:05,171][71000] Updated weights for policy 0, policy_version 125664 (0.0023) [2024-06-13 01:17:05,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.2, 300 sec: 49485.3). Total num frames: 2058895360. Throughput: 0: 49594.9. Samples: 1587677320. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:17:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:17:08,234][71000] Updated weights for policy 0, policy_version 125674 (0.0026) [2024-06-13 01:17:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2059124736. Throughput: 0: 49714.9. Samples: 1587972720. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:17:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:17:11,907][71000] Updated weights for policy 0, policy_version 125684 (0.0024) [2024-06-13 01:17:14,902][71000] Updated weights for policy 0, policy_version 125694 (0.0033) [2024-06-13 01:17:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2059370496. Throughput: 0: 49357.3. Samples: 1588256180. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:17:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:17:18,458][71000] Updated weights for policy 0, policy_version 125704 (0.0028) [2024-06-13 01:17:20,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49425.3, 300 sec: 49596.3). Total num frames: 2059649024. Throughput: 0: 49610.7. Samples: 1588407740. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:17:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:17:21,021][70980] Signal inference workers to stop experience collection... (23400 times) [2024-06-13 01:17:21,022][70980] Signal inference workers to resume experience collection... (23400 times) [2024-06-13 01:17:21,037][71000] InferenceWorker_p0-w0: stopping experience collection (23400 times) [2024-06-13 01:17:21,037][71000] InferenceWorker_p0-w0: resuming experience collection (23400 times) [2024-06-13 01:17:21,361][71000] Updated weights for policy 0, policy_version 125714 (0.0034) [2024-06-13 01:17:25,042][71000] Updated weights for policy 0, policy_version 125724 (0.0021) [2024-06-13 01:17:25,939][70768] Fps is (10 sec: 52429.5, 60 sec: 49425.3, 300 sec: 49540.8). Total num frames: 2059894784. Throughput: 0: 49650.3. Samples: 1588710060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:17:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:17:28,385][71000] Updated weights for policy 0, policy_version 125734 (0.0031) [2024-06-13 01:17:30,939][70768] Fps is (10 sec: 45875.1, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2060107776. Throughput: 0: 49666.2. Samples: 1589004120. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:17:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:17:32,006][71000] Updated weights for policy 0, policy_version 125744 (0.0036) [2024-06-13 01:17:35,183][71000] Updated weights for policy 0, policy_version 125754 (0.0043) [2024-06-13 01:17:35,940][70768] Fps is (10 sec: 45874.2, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2060353536. Throughput: 0: 49427.3. Samples: 1589132600. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:17:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:17:38,518][71000] Updated weights for policy 0, policy_version 125764 (0.0027) [2024-06-13 01:17:40,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 2060615680. Throughput: 0: 49631.6. Samples: 1589436020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:17:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:17:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000125770_2060615680.pth... [2024-06-13 01:17:41,012][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000125045_2048737280.pth [2024-06-13 01:17:41,736][71000] Updated weights for policy 0, policy_version 125774 (0.0027) [2024-06-13 01:17:45,297][71000] Updated weights for policy 0, policy_version 125784 (0.0026) [2024-06-13 01:17:45,939][70768] Fps is (10 sec: 54068.2, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 2060894208. Throughput: 0: 49603.1. Samples: 1589739920. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:17:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:17:48,603][71000] Updated weights for policy 0, policy_version 125794 (0.0027) [2024-06-13 01:17:50,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49425.1, 300 sec: 49430.0). Total num frames: 2061090816. Throughput: 0: 48990.9. Samples: 1589881920. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:17:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:17:51,882][71000] Updated weights for policy 0, policy_version 125804 (0.0031) [2024-06-13 01:17:55,399][71000] Updated weights for policy 0, policy_version 125814 (0.0036) [2024-06-13 01:17:55,940][70768] Fps is (10 sec: 44236.6, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2061336576. Throughput: 0: 48778.7. Samples: 1590167760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 01:17:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:17:58,718][71000] Updated weights for policy 0, policy_version 125824 (0.0036) [2024-06-13 01:18:00,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49151.8, 300 sec: 49429.7). Total num frames: 2061598720. Throughput: 0: 48845.6. Samples: 1590454240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 01:18:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:18:02,415][71000] Updated weights for policy 0, policy_version 125834 (0.0033) [2024-06-13 01:18:05,133][71000] Updated weights for policy 0, policy_version 125844 (0.0029) [2024-06-13 01:18:05,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2061860864. Throughput: 0: 49286.6. Samples: 1590625640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 01:18:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:18:09,142][71000] Updated weights for policy 0, policy_version 125854 (0.0030) [2024-06-13 01:18:10,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2062090240. Throughput: 0: 49138.6. Samples: 1590921300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 01:18:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:18:11,917][70980] Signal inference workers to stop experience collection... 
(23450 times) [2024-06-13 01:18:11,918][70980] Signal inference workers to resume experience collection... (23450 times) [2024-06-13 01:18:11,958][71000] InferenceWorker_p0-w0: stopping experience collection (23450 times) [2024-06-13 01:18:11,959][71000] InferenceWorker_p0-w0: resuming experience collection (23450 times) [2024-06-13 01:18:12,087][71000] Updated weights for policy 0, policy_version 125864 (0.0033) [2024-06-13 01:18:15,680][71000] Updated weights for policy 0, policy_version 125874 (0.0024) [2024-06-13 01:18:15,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2062319616. Throughput: 0: 48917.2. Samples: 1591205400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 01:18:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:18:18,602][71000] Updated weights for policy 0, policy_version 125884 (0.0032) [2024-06-13 01:18:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 49374.9). Total num frames: 2062581760. Throughput: 0: 49194.8. Samples: 1591346360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 01:18:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:18:22,629][71000] Updated weights for policy 0, policy_version 125894 (0.0032) [2024-06-13 01:18:25,536][71000] Updated weights for policy 0, policy_version 125904 (0.0033) [2024-06-13 01:18:25,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49151.8, 300 sec: 49485.2). Total num frames: 2062843904. Throughput: 0: 49100.8. Samples: 1591645560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 01:18:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:18:29,360][71000] Updated weights for policy 0, policy_version 125914 (0.0026) [2024-06-13 01:18:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2063056896. Throughput: 0: 48999.1. Samples: 1591944880. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 01:18:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:18:31,919][71000] Updated weights for policy 0, policy_version 125924 (0.0029) [2024-06-13 01:18:35,765][71000] Updated weights for policy 0, policy_version 125934 (0.0027) [2024-06-13 01:18:35,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2063319040. Throughput: 0: 49013.9. Samples: 1592087540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 01:18:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:18:38,406][71000] Updated weights for policy 0, policy_version 125944 (0.0033) [2024-06-13 01:18:40,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2063564800. Throughput: 0: 49324.4. Samples: 1592387360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 01:18:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:18:42,106][71000] Updated weights for policy 0, policy_version 125954 (0.0027) [2024-06-13 01:18:44,945][71000] Updated weights for policy 0, policy_version 125964 (0.0033) [2024-06-13 01:18:45,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2063843328. Throughput: 0: 49688.2. Samples: 1592690200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 01:18:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:18:48,865][71000] Updated weights for policy 0, policy_version 125974 (0.0030) [2024-06-13 01:18:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.3, 300 sec: 49485.2). Total num frames: 2064072704. Throughput: 0: 49544.9. Samples: 1592855160. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 01:18:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:18:51,315][71000] Updated weights for policy 0, policy_version 125984 (0.0029) [2024-06-13 01:18:55,407][71000] Updated weights for policy 0, policy_version 125994 (0.0023) [2024-06-13 01:18:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2064318464. Throughput: 0: 49717.3. Samples: 1593158580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 01:18:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:18:58,065][71000] Updated weights for policy 0, policy_version 126004 (0.0025) [2024-06-13 01:19:00,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2064547840. Throughput: 0: 49643.6. Samples: 1593439360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 01:19:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:19:02,284][71000] Updated weights for policy 0, policy_version 126014 (0.0028) [2024-06-13 01:19:04,740][71000] Updated weights for policy 0, policy_version 126024 (0.0027) [2024-06-13 01:19:05,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2064826368. Throughput: 0: 49823.2. Samples: 1593588400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 01:19:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:19:08,808][71000] Updated weights for policy 0, policy_version 126034 (0.0033) [2024-06-13 01:19:10,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2065055744. Throughput: 0: 49739.1. Samples: 1593883820. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 01:19:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:19:11,653][71000] Updated weights for policy 0, policy_version 126044 (0.0024) [2024-06-13 01:19:15,273][71000] Updated weights for policy 0, policy_version 126054 (0.0029) [2024-06-13 01:19:15,940][70768] Fps is (10 sec: 45874.5, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2065285120. Throughput: 0: 49763.4. Samples: 1594184240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 01:19:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:19:18,109][71000] Updated weights for policy 0, policy_version 126064 (0.0020) [2024-06-13 01:19:18,432][70980] Signal inference workers to stop experience collection... (23500 times) [2024-06-13 01:19:18,432][70980] Signal inference workers to resume experience collection... (23500 times) [2024-06-13 01:19:18,449][71000] InferenceWorker_p0-w0: stopping experience collection (23500 times) [2024-06-13 01:19:18,450][71000] InferenceWorker_p0-w0: resuming experience collection (23500 times) [2024-06-13 01:19:20,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2065530880. Throughput: 0: 49659.7. Samples: 1594322220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 01:19:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:19:21,984][71000] Updated weights for policy 0, policy_version 126074 (0.0037) [2024-06-13 01:19:24,819][71000] Updated weights for policy 0, policy_version 126084 (0.0023) [2024-06-13 01:19:25,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2065809408. Throughput: 0: 49555.2. Samples: 1594617340. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 01:19:25,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 01:19:28,671][71000] Updated weights for policy 0, policy_version 126094 (0.0034) [2024-06-13 01:19:30,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 2066038784. Throughput: 0: 49158.9. Samples: 1594902360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 01:19:30,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 01:19:31,941][71000] Updated weights for policy 0, policy_version 126104 (0.0033) [2024-06-13 01:19:35,568][71000] Updated weights for policy 0, policy_version 126114 (0.0034) [2024-06-13 01:19:35,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2066268160. Throughput: 0: 48721.3. Samples: 1595047620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 01:19:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:19:38,495][71000] Updated weights for policy 0, policy_version 126124 (0.0024) [2024-06-13 01:19:40,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2066513920. Throughput: 0: 48439.5. Samples: 1595338360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 01:19:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:19:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000126130_2066513920.pth... [2024-06-13 01:19:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000125410_2054717440.pth [2024-06-13 01:19:42,226][71000] Updated weights for policy 0, policy_version 126134 (0.0023) [2024-06-13 01:19:45,046][71000] Updated weights for policy 0, policy_version 126144 (0.0028) [2024-06-13 01:19:45,942][70768] Fps is (10 sec: 52417.7, 60 sec: 49150.2, 300 sec: 49373.8). Total num frames: 2066792448. Throughput: 0: 48824.7. Samples: 1595636580. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 01:19:45,942][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:19:48,729][71000] Updated weights for policy 0, policy_version 126154 (0.0023) [2024-06-13 01:19:50,939][70768] Fps is (10 sec: 52429.4, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2067038208. Throughput: 0: 49159.6. Samples: 1595800580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 01:19:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:19:51,546][71000] Updated weights for policy 0, policy_version 126164 (0.0023) [2024-06-13 01:19:55,384][71000] Updated weights for policy 0, policy_version 126174 (0.0022) [2024-06-13 01:19:55,940][70768] Fps is (10 sec: 49162.4, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2067283968. Throughput: 0: 49238.2. Samples: 1596099540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 01:19:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:19:58,179][71000] Updated weights for policy 0, policy_version 126184 (0.0026) [2024-06-13 01:20:00,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2067496960. Throughput: 0: 49180.5. Samples: 1596397360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 01:20:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:20:01,723][71000] Updated weights for policy 0, policy_version 126194 (0.0021) [2024-06-13 01:20:04,775][71000] Updated weights for policy 0, policy_version 126204 (0.0021) [2024-06-13 01:20:05,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2067775488. Throughput: 0: 49259.1. Samples: 1596538880. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 01:20:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:20:08,257][71000] Updated weights for policy 0, policy_version 126214 (0.0025) [2024-06-13 01:20:10,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 2068037632. Throughput: 0: 49435.5. Samples: 1596841940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 01:20:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:20:11,063][71000] Updated weights for policy 0, policy_version 126224 (0.0031) [2024-06-13 01:20:14,693][71000] Updated weights for policy 0, policy_version 126234 (0.0032) [2024-06-13 01:20:15,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49971.3, 300 sec: 49485.2). Total num frames: 2068283392. Throughput: 0: 49813.9. Samples: 1597143980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 01:20:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:20:17,877][71000] Updated weights for policy 0, policy_version 126244 (0.0024) [2024-06-13 01:20:20,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 49485.3). Total num frames: 2068529152. Throughput: 0: 49983.2. Samples: 1597296860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 01:20:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:20:21,265][71000] Updated weights for policy 0, policy_version 126254 (0.0032) [2024-06-13 01:20:23,138][70980] Signal inference workers to stop experience collection... (23550 times) [2024-06-13 01:20:23,139][70980] Signal inference workers to resume experience collection... 
(23550 times) [2024-06-13 01:20:23,173][71000] InferenceWorker_p0-w0: stopping experience collection (23550 times) [2024-06-13 01:20:23,173][71000] InferenceWorker_p0-w0: resuming experience collection (23550 times) [2024-06-13 01:20:24,743][71000] Updated weights for policy 0, policy_version 126264 (0.0037) [2024-06-13 01:20:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2068758528. Throughput: 0: 49982.6. Samples: 1597587580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 01:20:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:20:27,852][71000] Updated weights for policy 0, policy_version 126274 (0.0031) [2024-06-13 01:20:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.3, 300 sec: 49429.7). Total num frames: 2069020672. Throughput: 0: 50003.3. Samples: 1597886620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 01:20:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:20:31,088][71000] Updated weights for policy 0, policy_version 126284 (0.0035) [2024-06-13 01:20:34,361][71000] Updated weights for policy 0, policy_version 126294 (0.0033) [2024-06-13 01:20:35,939][70768] Fps is (10 sec: 52429.4, 60 sec: 50244.4, 300 sec: 49374.2). Total num frames: 2069282816. Throughput: 0: 49830.2. Samples: 1598042940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 01:20:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:20:37,541][71000] Updated weights for policy 0, policy_version 126304 (0.0025) [2024-06-13 01:20:40,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49971.3, 300 sec: 49429.7). Total num frames: 2069512192. Throughput: 0: 49761.1. Samples: 1598338780. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 01:20:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:20:40,954][71000] Updated weights for policy 0, policy_version 126314 (0.0028) [2024-06-13 01:20:44,874][71000] Updated weights for policy 0, policy_version 126324 (0.0030) [2024-06-13 01:20:45,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49153.8, 300 sec: 49374.2). Total num frames: 2069741568. Throughput: 0: 49695.5. Samples: 1598633660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 01:20:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:20:47,910][71000] Updated weights for policy 0, policy_version 126334 (0.0030) [2024-06-13 01:20:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2070003712. Throughput: 0: 49487.5. Samples: 1598765820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 01:20:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:20:51,356][71000] Updated weights for policy 0, policy_version 126344 (0.0032) [2024-06-13 01:20:54,337][71000] Updated weights for policy 0, policy_version 126354 (0.0023) [2024-06-13 01:20:55,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 2070265856. Throughput: 0: 49477.4. Samples: 1599068420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 01:20:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:20:57,628][71000] Updated weights for policy 0, policy_version 126364 (0.0026) [2024-06-13 01:21:00,920][71000] Updated weights for policy 0, policy_version 126374 (0.0034) [2024-06-13 01:21:00,940][70768] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 49429.7). Total num frames: 2070511616. Throughput: 0: 49513.4. Samples: 1599372080. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:21:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:21:04,447][71000] Updated weights for policy 0, policy_version 126384 (0.0025) [2024-06-13 01:21:05,939][70768] Fps is (10 sec: 44237.0, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 2070708224. Throughput: 0: 49350.7. Samples: 1599517640. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:21:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:21:07,323][71000] Updated weights for policy 0, policy_version 126394 (0.0029) [2024-06-13 01:21:10,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2070986752. Throughput: 0: 49364.9. Samples: 1599809000. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:21:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:21:11,443][71000] Updated weights for policy 0, policy_version 126404 (0.0027) [2024-06-13 01:21:14,125][71000] Updated weights for policy 0, policy_version 126414 (0.0026) [2024-06-13 01:21:15,940][70768] Fps is (10 sec: 55704.7, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2071265280. Throughput: 0: 49450.1. Samples: 1600111880. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:21:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:21:17,854][71000] Updated weights for policy 0, policy_version 126424 (0.0022) [2024-06-13 01:21:20,693][71000] Updated weights for policy 0, policy_version 126434 (0.0032) [2024-06-13 01:21:20,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2071494656. Throughput: 0: 49511.5. Samples: 1600270960. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:21:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:21:21,239][70980] Signal inference workers to stop experience collection... 
(23600 times) [2024-06-13 01:21:21,284][71000] InferenceWorker_p0-w0: stopping experience collection (23600 times) [2024-06-13 01:21:21,290][70980] Signal inference workers to resume experience collection... (23600 times) [2024-06-13 01:21:21,296][71000] InferenceWorker_p0-w0: resuming experience collection (23600 times) [2024-06-13 01:21:24,371][71000] Updated weights for policy 0, policy_version 126444 (0.0027) [2024-06-13 01:21:25,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2071724032. Throughput: 0: 49619.9. Samples: 1600571680. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:21:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:21:27,230][71000] Updated weights for policy 0, policy_version 126454 (0.0030) [2024-06-13 01:21:30,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 2071969792. Throughput: 0: 49701.2. Samples: 1600870220. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:21:30,949][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:21:31,338][71000] Updated weights for policy 0, policy_version 126464 (0.0031) [2024-06-13 01:21:33,699][71000] Updated weights for policy 0, policy_version 126474 (0.0023) [2024-06-13 01:21:35,940][70768] Fps is (10 sec: 55706.0, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 2072281088. Throughput: 0: 50029.8. Samples: 1601017160. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:21:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:21:37,705][71000] Updated weights for policy 0, policy_version 126484 (0.0035) [2024-06-13 01:21:40,375][71000] Updated weights for policy 0, policy_version 126494 (0.0033) [2024-06-13 01:21:40,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 2072494080. Throughput: 0: 49979.1. Samples: 1601317480. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:21:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:21:41,023][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000126496_2072510464.pth... [2024-06-13 01:21:41,079][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000125770_2060615680.pth [2024-06-13 01:21:44,153][71000] Updated weights for policy 0, policy_version 126504 (0.0024) [2024-06-13 01:21:45,939][70768] Fps is (10 sec: 44237.0, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 2072723456. Throughput: 0: 49831.6. Samples: 1601614500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:21:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:21:47,047][71000] Updated weights for policy 0, policy_version 126514 (0.0032) [2024-06-13 01:21:50,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2072952832. Throughput: 0: 49563.9. Samples: 1601748020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:21:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:21:51,170][71000] Updated weights for policy 0, policy_version 126524 (0.0034) [2024-06-13 01:21:53,766][71000] Updated weights for policy 0, policy_version 126534 (0.0027) [2024-06-13 01:21:55,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 2073264128. Throughput: 0: 49626.7. Samples: 1602042200. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-13 01:21:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 01:21:57,961][71000] Updated weights for policy 0, policy_version 126544 (0.0026) [2024-06-13 01:22:00,541][71000] Updated weights for policy 0, policy_version 126554 (0.0029) [2024-06-13 01:22:00,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2073477120. Throughput: 0: 49559.7. Samples: 1602342060. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 01:22:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:22:04,265][71000] Updated weights for policy 0, policy_version 126564 (0.0026) [2024-06-13 01:22:05,940][70768] Fps is (10 sec: 44235.9, 60 sec: 49971.0, 300 sec: 49429.7). Total num frames: 2073706496. Throughput: 0: 49263.3. Samples: 1602487820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 01:22:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:22:07,254][71000] Updated weights for policy 0, policy_version 126574 (0.0031) [2024-06-13 01:22:10,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2073935872. Throughput: 0: 49009.0. Samples: 1602777080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 01:22:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:22:10,977][71000] Updated weights for policy 0, policy_version 126584 (0.0029) [2024-06-13 01:22:13,638][71000] Updated weights for policy 0, policy_version 126594 (0.0026) [2024-06-13 01:22:15,940][70768] Fps is (10 sec: 52430.0, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2074230784. Throughput: 0: 49065.9. Samples: 1603078180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 01:22:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 01:22:17,519][71000] Updated weights for policy 0, policy_version 126604 (0.0034) [2024-06-13 01:22:20,114][71000] Updated weights for policy 0, policy_version 126614 (0.0024) [2024-06-13 01:22:20,940][70768] Fps is (10 sec: 55704.9, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 2074492928. Throughput: 0: 49559.9. Samples: 1603247360. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 01:22:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:22:24,011][71000] Updated weights for policy 0, policy_version 126624 (0.0026) [2024-06-13 01:22:25,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2074705920. Throughput: 0: 49576.8. Samples: 1603548440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 01:22:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:22:26,398][70980] Signal inference workers to stop experience collection... (23650 times) [2024-06-13 01:22:26,445][71000] InferenceWorker_p0-w0: stopping experience collection (23650 times) [2024-06-13 01:22:26,453][70980] Signal inference workers to resume experience collection... (23650 times) [2024-06-13 01:22:26,457][71000] InferenceWorker_p0-w0: resuming experience collection (23650 times) [2024-06-13 01:22:26,590][71000] Updated weights for policy 0, policy_version 126634 (0.0028) [2024-06-13 01:22:30,550][71000] Updated weights for policy 0, policy_version 126644 (0.0035) [2024-06-13 01:22:30,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2074951680. Throughput: 0: 49578.9. Samples: 1603845560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 01:22:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:22:33,645][71000] Updated weights for policy 0, policy_version 126654 (0.0027) [2024-06-13 01:22:35,939][70768] Fps is (10 sec: 50791.3, 60 sec: 48879.0, 300 sec: 49485.3). Total num frames: 2075213824. Throughput: 0: 49680.1. Samples: 1603983620. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 01:22:35,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-13 01:22:37,209][71000] Updated weights for policy 0, policy_version 126664 (0.0033) [2024-06-13 01:22:40,013][71000] Updated weights for policy 0, policy_version 126674 (0.0026) [2024-06-13 01:22:40,939][70768] Fps is (10 sec: 52429.7, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2075475968. Throughput: 0: 49749.8. Samples: 1604280940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 01:22:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:22:43,791][71000] Updated weights for policy 0, policy_version 126684 (0.0031) [2024-06-13 01:22:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 2075705344. Throughput: 0: 49727.6. Samples: 1604579800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 01:22:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:22:46,747][71000] Updated weights for policy 0, policy_version 126694 (0.0026) [2024-06-13 01:22:50,229][71000] Updated weights for policy 0, policy_version 126704 (0.0023) [2024-06-13 01:22:50,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2075934720. Throughput: 0: 49545.9. Samples: 1604717380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 01:22:50,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 01:22:53,328][71000] Updated weights for policy 0, policy_version 126714 (0.0035) [2024-06-13 01:22:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 49485.3). Total num frames: 2076196864. Throughput: 0: 49871.6. Samples: 1605021300. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 01:22:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:22:56,968][71000] Updated weights for policy 0, policy_version 126724 (0.0023) [2024-06-13 01:22:59,786][71000] Updated weights for policy 0, policy_version 126734 (0.0025) [2024-06-13 01:23:00,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2076459008. Throughput: 0: 49811.1. Samples: 1605319680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 01:23:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:23:03,281][71000] Updated weights for policy 0, policy_version 126744 (0.0030) [2024-06-13 01:23:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49971.4, 300 sec: 49540.8). Total num frames: 2076704768. Throughput: 0: 49439.7. Samples: 1605472140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 01:23:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:23:06,452][71000] Updated weights for policy 0, policy_version 126754 (0.0023) [2024-06-13 01:23:09,674][71000] Updated weights for policy 0, policy_version 126764 (0.0025) [2024-06-13 01:23:10,940][70768] Fps is (10 sec: 49151.5, 60 sec: 50244.2, 300 sec: 49596.3). Total num frames: 2076950528. Throughput: 0: 49412.0. Samples: 1605771980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 01:23:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 01:23:13,051][71000] Updated weights for policy 0, policy_version 126774 (0.0038) [2024-06-13 01:23:15,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 2077179904. Throughput: 0: 49386.0. Samples: 1606067920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 01:23:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:23:16,564][71000] Updated weights for policy 0, policy_version 126784 (0.0035) [2024-06-13 01:23:19,761][71000] Updated weights for policy 0, policy_version 126794 (0.0030) [2024-06-13 01:23:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2077442048. Throughput: 0: 49550.4. Samples: 1606213400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 01:23:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:23:23,183][71000] Updated weights for policy 0, policy_version 126804 (0.0026) [2024-06-13 01:23:25,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 2077687808. Throughput: 0: 49683.1. Samples: 1606516680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 01:23:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:23:26,230][71000] Updated weights for policy 0, policy_version 126814 (0.0027) [2024-06-13 01:23:29,722][71000] Updated weights for policy 0, policy_version 126824 (0.0029) [2024-06-13 01:23:30,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 2077933568. Throughput: 0: 49555.9. Samples: 1606809820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 01:23:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:23:32,884][71000] Updated weights for policy 0, policy_version 126834 (0.0025) [2024-06-13 01:23:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2078179328. Throughput: 0: 49785.9. Samples: 1606957740. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 01:23:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:23:36,244][71000] Updated weights for policy 0, policy_version 126844 (0.0022) [2024-06-13 01:23:39,608][71000] Updated weights for policy 0, policy_version 126854 (0.0033) [2024-06-13 01:23:40,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2078425088. Throughput: 0: 49724.4. Samples: 1607258900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 01:23:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:23:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000126857_2078425088.pth... [2024-06-13 01:23:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000126130_2066513920.pth [2024-06-13 01:23:42,058][70980] Signal inference workers to stop experience collection... (23700 times) [2024-06-13 01:23:42,089][71000] InferenceWorker_p0-w0: stopping experience collection (23700 times) [2024-06-13 01:23:42,114][70980] Signal inference workers to resume experience collection... (23700 times) [2024-06-13 01:23:42,114][71000] InferenceWorker_p0-w0: resuming experience collection (23700 times) [2024-06-13 01:23:42,759][71000] Updated weights for policy 0, policy_version 126864 (0.0031) [2024-06-13 01:23:45,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 2078687232. Throughput: 0: 49864.5. Samples: 1607563580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 01:23:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:23:46,026][71000] Updated weights for policy 0, policy_version 126874 (0.0027) [2024-06-13 01:23:49,239][71000] Updated weights for policy 0, policy_version 126884 (0.0027) [2024-06-13 01:23:50,940][70768] Fps is (10 sec: 52428.2, 60 sec: 50244.2, 300 sec: 49596.3). Total num frames: 2078949376. Throughput: 0: 49784.8. Samples: 1607712460. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 01:23:50,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-13 01:23:52,721][71000] Updated weights for policy 0, policy_version 126894 (0.0023) [2024-06-13 01:23:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 2079178752. Throughput: 0: 49809.9. Samples: 1608013420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 01:23:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 01:23:55,963][71000] Updated weights for policy 0, policy_version 126904 (0.0032) [2024-06-13 01:23:59,229][71000] Updated weights for policy 0, policy_version 126914 (0.0026) [2024-06-13 01:24:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2079424512. Throughput: 0: 49683.8. Samples: 1608303700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 01:24:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:24:02,554][71000] Updated weights for policy 0, policy_version 126924 (0.0032) [2024-06-13 01:24:05,843][71000] Updated weights for policy 0, policy_version 126934 (0.0031) [2024-06-13 01:24:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 2079686656. Throughput: 0: 49634.3. Samples: 1608446940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-13 01:24:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:24:09,214][71000] Updated weights for policy 0, policy_version 126944 (0.0025) [2024-06-13 01:24:10,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49698.3, 300 sec: 49651.9). Total num frames: 2079932416. Throughput: 0: 49593.4. Samples: 1608748380. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-13 01:24:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:24:12,428][71000] Updated weights for policy 0, policy_version 126954 (0.0034) [2024-06-13 01:24:15,613][71000] Updated weights for policy 0, policy_version 126964 (0.0027) [2024-06-13 01:24:15,940][70768] Fps is (10 sec: 50791.0, 60 sec: 50244.2, 300 sec: 49707.4). Total num frames: 2080194560. Throughput: 0: 49821.4. Samples: 1609051780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-13 01:24:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:24:18,801][71000] Updated weights for policy 0, policy_version 126974 (0.0040) [2024-06-13 01:24:20,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2080407552. Throughput: 0: 49932.3. Samples: 1609204700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-13 01:24:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:24:22,397][71000] Updated weights for policy 0, policy_version 126984 (0.0022) [2024-06-13 01:24:25,381][71000] Updated weights for policy 0, policy_version 126994 (0.0042) [2024-06-13 01:24:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 49651.9). Total num frames: 2080686080. Throughput: 0: 49862.7. Samples: 1609502720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-13 01:24:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:24:28,901][71000] Updated weights for policy 0, policy_version 127004 (0.0024) [2024-06-13 01:24:30,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49971.3, 300 sec: 49707.4). Total num frames: 2080931840. Throughput: 0: 49430.2. Samples: 1609787940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-13 01:24:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:24:31,656][70980] Signal inference workers to stop experience collection... 
(23750 times) [2024-06-13 01:24:31,656][70980] Signal inference workers to resume experience collection... (23750 times) [2024-06-13 01:24:31,689][71000] InferenceWorker_p0-w0: stopping experience collection (23750 times) [2024-06-13 01:24:31,689][71000] InferenceWorker_p0-w0: resuming experience collection (23750 times) [2024-06-13 01:24:32,183][71000] Updated weights for policy 0, policy_version 127014 (0.0031) [2024-06-13 01:24:35,362][71000] Updated weights for policy 0, policy_version 127024 (0.0020) [2024-06-13 01:24:35,940][70768] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 49762.9). Total num frames: 2081193984. Throughput: 0: 49594.4. Samples: 1609944200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-13 01:24:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:24:38,747][71000] Updated weights for policy 0, policy_version 127034 (0.0031) [2024-06-13 01:24:40,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49698.0, 300 sec: 49541.1). Total num frames: 2081406976. Throughput: 0: 49468.7. Samples: 1610239520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-13 01:24:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:24:42,037][71000] Updated weights for policy 0, policy_version 127044 (0.0023) [2024-06-13 01:24:45,736][71000] Updated weights for policy 0, policy_version 127054 (0.0034) [2024-06-13 01:24:45,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2081652736. Throughput: 0: 49500.1. Samples: 1610531200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-13 01:24:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:24:48,828][71000] Updated weights for policy 0, policy_version 127064 (0.0028) [2024-06-13 01:24:50,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 2081898496. Throughput: 0: 49603.1. Samples: 1610679080. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-13 01:24:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:24:52,558][71000] Updated weights for policy 0, policy_version 127074 (0.0026) [2024-06-13 01:24:55,566][71000] Updated weights for policy 0, policy_version 127084 (0.0031) [2024-06-13 01:24:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 2082160640. Throughput: 0: 49415.5. Samples: 1610972080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-13 01:24:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:24:59,104][71000] Updated weights for policy 0, policy_version 127094 (0.0027) [2024-06-13 01:25:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2082390016. Throughput: 0: 49449.3. Samples: 1611277000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-13 01:25:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:25:02,050][71000] Updated weights for policy 0, policy_version 127104 (0.0033) [2024-06-13 01:25:05,600][71000] Updated weights for policy 0, policy_version 127114 (0.0038) [2024-06-13 01:25:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 2082635776. Throughput: 0: 49047.2. Samples: 1611411820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-13 01:25:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:25:08,794][71000] Updated weights for policy 0, policy_version 127124 (0.0033) [2024-06-13 01:25:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 2082881536. Throughput: 0: 48882.9. Samples: 1611702460. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:25:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:25:12,301][71000] Updated weights for policy 0, policy_version 127134 (0.0021) [2024-06-13 01:25:15,447][71000] Updated weights for policy 0, policy_version 127144 (0.0032) [2024-06-13 01:25:15,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 2083143680. Throughput: 0: 49120.9. Samples: 1611998380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:25:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:25:19,151][71000] Updated weights for policy 0, policy_version 127154 (0.0030) [2024-06-13 01:25:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2083373056. Throughput: 0: 49068.2. Samples: 1612152280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:25:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:25:21,895][71000] Updated weights for policy 0, policy_version 127164 (0.0025) [2024-06-13 01:25:25,679][71000] Updated weights for policy 0, policy_version 127174 (0.0032) [2024-06-13 01:25:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 2083618816. Throughput: 0: 49208.2. Samples: 1612453880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:25:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:25:28,699][71000] Updated weights for policy 0, policy_version 127184 (0.0026) [2024-06-13 01:25:29,746][70980] Signal inference workers to stop experience collection... (23800 times) [2024-06-13 01:25:29,773][71000] InferenceWorker_p0-w0: stopping experience collection (23800 times) [2024-06-13 01:25:29,855][70980] Signal inference workers to resume experience collection... 
(23800 times) [2024-06-13 01:25:29,855][71000] InferenceWorker_p0-w0: resuming experience collection (23800 times) [2024-06-13 01:25:30,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49424.9, 300 sec: 49540.7). Total num frames: 2083897344. Throughput: 0: 49137.6. Samples: 1612742400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:25:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:25:32,310][71000] Updated weights for policy 0, policy_version 127194 (0.0024) [2024-06-13 01:25:35,219][71000] Updated weights for policy 0, policy_version 127204 (0.0043) [2024-06-13 01:25:35,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48878.8, 300 sec: 49540.7). Total num frames: 2084126720. Throughput: 0: 49487.5. Samples: 1612906020. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:25:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:25:39,014][71000] Updated weights for policy 0, policy_version 127214 (0.0031) [2024-06-13 01:25:40,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.1, 300 sec: 49540.8). Total num frames: 2084356096. Throughput: 0: 49389.7. Samples: 1613194620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:25:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:25:41,085][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000127220_2084372480.pth... [2024-06-13 01:25:41,123][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000126496_2072510464.pth [2024-06-13 01:25:41,987][71000] Updated weights for policy 0, policy_version 127224 (0.0029) [2024-06-13 01:25:45,662][71000] Updated weights for policy 0, policy_version 127234 (0.0029) [2024-06-13 01:25:45,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2084601856. Throughput: 0: 49150.3. Samples: 1613488760. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:25:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:25:48,816][71000] Updated weights for policy 0, policy_version 127244 (0.0035) [2024-06-13 01:25:50,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49698.1, 300 sec: 49540.7). Total num frames: 2084880384. Throughput: 0: 49533.6. Samples: 1613640840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:25:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:25:52,306][71000] Updated weights for policy 0, policy_version 127254 (0.0023) [2024-06-13 01:25:55,620][71000] Updated weights for policy 0, policy_version 127264 (0.0031) [2024-06-13 01:25:55,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2085109760. Throughput: 0: 49465.5. Samples: 1613928400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:25:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:25:58,844][71000] Updated weights for policy 0, policy_version 127274 (0.0023) [2024-06-13 01:26:00,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.1, 300 sec: 49651.8). Total num frames: 2085355520. Throughput: 0: 49596.8. Samples: 1614230240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:26:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:26:02,133][71000] Updated weights for policy 0, policy_version 127284 (0.0028) [2024-06-13 01:26:05,178][71000] Updated weights for policy 0, policy_version 127294 (0.0026) [2024-06-13 01:26:05,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49485.3). Total num frames: 2085584896. Throughput: 0: 49171.4. Samples: 1614364980. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:26:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:26:08,817][71000] Updated weights for policy 0, policy_version 127304 (0.0020) [2024-06-13 01:26:10,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.3, 300 sec: 49485.3). Total num frames: 2085863424. Throughput: 0: 49215.6. Samples: 1614668580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 01:26:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:26:12,007][71000] Updated weights for policy 0, policy_version 127314 (0.0024) [2024-06-13 01:26:15,338][71000] Updated weights for policy 0, policy_version 127324 (0.0029) [2024-06-13 01:26:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2086092800. Throughput: 0: 49451.8. Samples: 1614967720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 01:26:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:26:18,494][71000] Updated weights for policy 0, policy_version 127334 (0.0022) [2024-06-13 01:26:20,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 2086338560. Throughput: 0: 49034.4. Samples: 1615112560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 01:26:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:26:21,950][71000] Updated weights for policy 0, policy_version 127344 (0.0023) [2024-06-13 01:26:23,087][70980] Signal inference workers to stop experience collection... (23850 times) [2024-06-13 01:26:23,122][71000] InferenceWorker_p0-w0: stopping experience collection (23850 times) [2024-06-13 01:26:23,196][70980] Signal inference workers to resume experience collection... 
(23850 times) [2024-06-13 01:26:23,197][71000] InferenceWorker_p0-w0: resuming experience collection (23850 times) [2024-06-13 01:26:25,124][71000] Updated weights for policy 0, policy_version 127354 (0.0032) [2024-06-13 01:26:25,940][70768] Fps is (10 sec: 47512.6, 60 sec: 49151.8, 300 sec: 49485.2). Total num frames: 2086567936. Throughput: 0: 49041.2. Samples: 1615401480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 01:26:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:26:28,833][71000] Updated weights for policy 0, policy_version 127364 (0.0030) [2024-06-13 01:26:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.1, 300 sec: 49374.1). Total num frames: 2086846464. Throughput: 0: 49260.0. Samples: 1615705460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 01:26:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:26:31,594][71000] Updated weights for policy 0, policy_version 127374 (0.0027) [2024-06-13 01:26:35,379][71000] Updated weights for policy 0, policy_version 127384 (0.0026) [2024-06-13 01:26:35,939][70768] Fps is (10 sec: 52430.0, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 2087092224. Throughput: 0: 49379.3. Samples: 1615862900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 01:26:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:26:38,434][71000] Updated weights for policy 0, policy_version 127394 (0.0035) [2024-06-13 01:26:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2087321600. Throughput: 0: 49565.7. Samples: 1616158860. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 01:26:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:26:41,627][71000] Updated weights for policy 0, policy_version 127404 (0.0021) [2024-06-13 01:26:44,755][71000] Updated weights for policy 0, policy_version 127414 (0.0024) [2024-06-13 01:26:45,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2087567360. Throughput: 0: 49398.7. Samples: 1616453180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 01:26:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:26:48,578][71000] Updated weights for policy 0, policy_version 127424 (0.0029) [2024-06-13 01:26:50,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2087845888. Throughput: 0: 49826.0. Samples: 1616607160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 01:26:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:26:51,622][71000] Updated weights for policy 0, policy_version 127434 (0.0026) [2024-06-13 01:26:54,951][71000] Updated weights for policy 0, policy_version 127444 (0.0027) [2024-06-13 01:26:55,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 2088091648. Throughput: 0: 49586.6. Samples: 1616899980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 01:26:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:26:58,175][71000] Updated weights for policy 0, policy_version 127454 (0.0032) [2024-06-13 01:27:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 2088337408. Throughput: 0: 49745.2. Samples: 1617206260. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 01:27:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 01:27:01,714][71000] Updated weights for policy 0, policy_version 127464 (0.0024) [2024-06-13 01:27:04,872][71000] Updated weights for policy 0, policy_version 127474 (0.0037) [2024-06-13 01:27:05,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 2088566784. Throughput: 0: 49696.1. Samples: 1617348880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 01:27:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:27:08,346][71000] Updated weights for policy 0, policy_version 127484 (0.0035) [2024-06-13 01:27:10,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2088812544. Throughput: 0: 49965.1. Samples: 1617649900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 01:27:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:27:11,454][71000] Updated weights for policy 0, policy_version 127494 (0.0030) [2024-06-13 01:27:14,824][71000] Updated weights for policy 0, policy_version 127504 (0.0034) [2024-06-13 01:27:15,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2089074688. Throughput: 0: 49797.3. Samples: 1617946340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:27:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:27:17,983][71000] Updated weights for policy 0, policy_version 127514 (0.0029) [2024-06-13 01:27:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 2089320448. Throughput: 0: 49622.5. Samples: 1618095920. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:27:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:27:21,492][71000] Updated weights for policy 0, policy_version 127524 (0.0021) [2024-06-13 01:27:24,518][71000] Updated weights for policy 0, policy_version 127534 (0.0030) [2024-06-13 01:27:25,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49971.4, 300 sec: 49540.8). Total num frames: 2089566208. Throughput: 0: 49536.5. Samples: 1618388000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:27:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:27:28,234][71000] Updated weights for policy 0, policy_version 127544 (0.0024) [2024-06-13 01:27:30,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2089795584. Throughput: 0: 49380.5. Samples: 1618675300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:27:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:27:31,524][71000] Updated weights for policy 0, policy_version 127554 (0.0035) [2024-06-13 01:27:35,007][71000] Updated weights for policy 0, policy_version 127564 (0.0022) [2024-06-13 01:27:35,942][70768] Fps is (10 sec: 45864.3, 60 sec: 48877.0, 300 sec: 49318.2). Total num frames: 2090024960. Throughput: 0: 49115.4. Samples: 1618817460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:27:35,942][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:27:35,943][70980] Signal inference workers to stop experience collection... (23900 times) [2024-06-13 01:27:35,993][71000] InferenceWorker_p0-w0: stopping experience collection (23900 times) [2024-06-13 01:27:35,998][70980] Signal inference workers to resume experience collection... 
(23900 times) [2024-06-13 01:27:36,006][71000] InferenceWorker_p0-w0: resuming experience collection (23900 times) [2024-06-13 01:27:38,231][71000] Updated weights for policy 0, policy_version 127574 (0.0027) [2024-06-13 01:27:40,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 2090303488. Throughput: 0: 49240.3. Samples: 1619115800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:27:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:27:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000127582_2090303488.pth... [2024-06-13 01:27:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000126857_2078425088.pth [2024-06-13 01:27:41,670][71000] Updated weights for policy 0, policy_version 127584 (0.0031) [2024-06-13 01:27:44,484][71000] Updated weights for policy 0, policy_version 127594 (0.0040) [2024-06-13 01:27:45,939][70768] Fps is (10 sec: 52441.4, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 2090549248. Throughput: 0: 49133.5. Samples: 1619417260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:27:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:27:48,177][71000] Updated weights for policy 0, policy_version 127604 (0.0027) [2024-06-13 01:27:50,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 2090795008. Throughput: 0: 49255.5. Samples: 1619565380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:27:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:27:51,111][71000] Updated weights for policy 0, policy_version 127614 (0.0035) [2024-06-13 01:27:54,874][71000] Updated weights for policy 0, policy_version 127624 (0.0029) [2024-06-13 01:27:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2091040768. Throughput: 0: 49206.7. Samples: 1619864200. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:27:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:27:58,125][71000] Updated weights for policy 0, policy_version 127634 (0.0026) [2024-06-13 01:28:00,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2091286528. Throughput: 0: 49306.6. Samples: 1620165140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:28:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:28:01,509][71000] Updated weights for policy 0, policy_version 127644 (0.0025) [2024-06-13 01:28:04,501][71000] Updated weights for policy 0, policy_version 127654 (0.0034) [2024-06-13 01:28:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2091532288. Throughput: 0: 49044.1. Samples: 1620302900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:28:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:28:08,019][71000] Updated weights for policy 0, policy_version 127664 (0.0029) [2024-06-13 01:28:10,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 49540.7). Total num frames: 2091794432. Throughput: 0: 49286.6. Samples: 1620605900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:28:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:28:11,012][71000] Updated weights for policy 0, policy_version 127674 (0.0036) [2024-06-13 01:28:14,898][71000] Updated weights for policy 0, policy_version 127684 (0.0039) [2024-06-13 01:28:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2092023808. Throughput: 0: 49353.6. Samples: 1620896220. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 01:28:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:28:17,890][71000] Updated weights for policy 0, policy_version 127694 (0.0031) [2024-06-13 01:28:20,939][70768] Fps is (10 sec: 45875.8, 60 sec: 48879.1, 300 sec: 49374.2). Total num frames: 2092253184. Throughput: 0: 49471.1. Samples: 1621043540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 01:28:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:28:21,462][71000] Updated weights for policy 0, policy_version 127704 (0.0026) [2024-06-13 01:28:24,469][71000] Updated weights for policy 0, policy_version 127714 (0.0028) [2024-06-13 01:28:25,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2092515328. Throughput: 0: 49317.5. Samples: 1621335080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 01:28:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:28:28,148][71000] Updated weights for policy 0, policy_version 127724 (0.0028) [2024-06-13 01:28:30,940][70768] Fps is (10 sec: 52427.2, 60 sec: 49697.9, 300 sec: 49485.2). Total num frames: 2092777472. Throughput: 0: 49289.5. Samples: 1621635300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 01:28:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:28:31,333][71000] Updated weights for policy 0, policy_version 127734 (0.0022) [2024-06-13 01:28:34,826][71000] Updated weights for policy 0, policy_version 127744 (0.0023) [2024-06-13 01:28:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49700.1, 300 sec: 49429.7). Total num frames: 2093006848. Throughput: 0: 49320.9. Samples: 1621784820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 01:28:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:28:37,554][70980] Signal inference workers to stop experience collection... 
(23950 times) [2024-06-13 01:28:37,592][71000] InferenceWorker_p0-w0: stopping experience collection (23950 times) [2024-06-13 01:28:37,601][70980] Signal inference workers to resume experience collection... (23950 times) [2024-06-13 01:28:37,611][71000] InferenceWorker_p0-w0: resuming experience collection (23950 times) [2024-06-13 01:28:37,749][71000] Updated weights for policy 0, policy_version 127754 (0.0028) [2024-06-13 01:28:40,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.1, 300 sec: 49374.1). Total num frames: 2093252608. Throughput: 0: 49369.2. Samples: 1622085820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 01:28:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 01:28:41,308][71000] Updated weights for policy 0, policy_version 127764 (0.0030) [2024-06-13 01:28:44,377][71000] Updated weights for policy 0, policy_version 127774 (0.0024) [2024-06-13 01:28:45,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2093514752. Throughput: 0: 49217.9. Samples: 1622379940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 01:28:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:28:48,030][71000] Updated weights for policy 0, policy_version 127784 (0.0033) [2024-06-13 01:28:50,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2093760512. Throughput: 0: 49447.0. Samples: 1622528020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 01:28:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:28:51,178][71000] Updated weights for policy 0, policy_version 127794 (0.0034) [2024-06-13 01:28:54,778][71000] Updated weights for policy 0, policy_version 127804 (0.0026) [2024-06-13 01:28:55,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 2093973504. Throughput: 0: 49318.7. Samples: 1622825240. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 01:28:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:28:57,750][71000] Updated weights for policy 0, policy_version 127814 (0.0035) [2024-06-13 01:29:00,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2094235648. Throughput: 0: 49270.3. Samples: 1623113380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 01:29:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:29:01,292][71000] Updated weights for policy 0, policy_version 127824 (0.0032) [2024-06-13 01:29:04,255][71000] Updated weights for policy 0, policy_version 127834 (0.0031) [2024-06-13 01:29:05,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2094481408. Throughput: 0: 49440.0. Samples: 1623268340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 01:29:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:29:07,809][71000] Updated weights for policy 0, policy_version 127844 (0.0028) [2024-06-13 01:29:10,789][71000] Updated weights for policy 0, policy_version 127854 (0.0027) [2024-06-13 01:29:10,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 2094759936. Throughput: 0: 49466.6. Samples: 1623561080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 01:29:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:29:14,591][71000] Updated weights for policy 0, policy_version 127864 (0.0030) [2024-06-13 01:29:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2094972928. Throughput: 0: 49229.5. Samples: 1623850620. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 01:29:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:29:17,921][71000] Updated weights for policy 0, policy_version 127874 (0.0031) [2024-06-13 01:29:20,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 2095235072. Throughput: 0: 48982.2. Samples: 1623989020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 01:29:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:29:21,190][71000] Updated weights for policy 0, policy_version 127884 (0.0025) [2024-06-13 01:29:24,440][71000] Updated weights for policy 0, policy_version 127894 (0.0034) [2024-06-13 01:29:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2095464448. Throughput: 0: 48904.1. Samples: 1624286500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 01:29:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:29:27,847][71000] Updated weights for policy 0, policy_version 127904 (0.0029) [2024-06-13 01:29:30,749][71000] Updated weights for policy 0, policy_version 127914 (0.0026) [2024-06-13 01:29:30,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2095742976. Throughput: 0: 49203.9. Samples: 1624594120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 01:29:30,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 01:29:34,224][71000] Updated weights for policy 0, policy_version 127924 (0.0034) [2024-06-13 01:29:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2095955968. Throughput: 0: 49454.4. Samples: 1624753460. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 01:29:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:29:37,449][71000] Updated weights for policy 0, policy_version 127934 (0.0029) [2024-06-13 01:29:40,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 2096218112. Throughput: 0: 49043.9. Samples: 1625032220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 01:29:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:29:40,991][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000127944_2096234496.pth... [2024-06-13 01:29:40,993][71000] Updated weights for policy 0, policy_version 127944 (0.0035) [2024-06-13 01:29:41,036][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000127220_2084372480.pth [2024-06-13 01:29:44,339][71000] Updated weights for policy 0, policy_version 127954 (0.0031) [2024-06-13 01:29:45,944][70768] Fps is (10 sec: 50768.5, 60 sec: 49148.5, 300 sec: 49373.4). Total num frames: 2096463872. Throughput: 0: 49240.7. Samples: 1625329420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 01:29:45,944][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:29:47,804][71000] Updated weights for policy 0, policy_version 127964 (0.0030) [2024-06-13 01:29:49,370][70980] Signal inference workers to stop experience collection... (24000 times) [2024-06-13 01:29:49,370][70980] Signal inference workers to resume experience collection... (24000 times) [2024-06-13 01:29:49,411][71000] InferenceWorker_p0-w0: stopping experience collection (24000 times) [2024-06-13 01:29:49,411][71000] InferenceWorker_p0-w0: resuming experience collection (24000 times) [2024-06-13 01:29:50,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2096709632. Throughput: 0: 49214.2. Samples: 1625482980. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 01:29:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:29:51,005][71000] Updated weights for policy 0, policy_version 127974 (0.0029) [2024-06-13 01:29:54,316][71000] Updated weights for policy 0, policy_version 127984 (0.0032) [2024-06-13 01:29:55,940][70768] Fps is (10 sec: 47533.5, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2096939008. Throughput: 0: 49164.4. Samples: 1625773480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 01:29:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:29:57,786][71000] Updated weights for policy 0, policy_version 127994 (0.0037) [2024-06-13 01:30:00,942][70768] Fps is (10 sec: 47502.3, 60 sec: 49150.1, 300 sec: 49318.2). Total num frames: 2097184768. Throughput: 0: 49254.8. Samples: 1626067200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 01:30:00,943][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:30:01,253][71000] Updated weights for policy 0, policy_version 128004 (0.0032) [2024-06-13 01:30:04,335][71000] Updated weights for policy 0, policy_version 128014 (0.0030) [2024-06-13 01:30:05,939][70768] Fps is (10 sec: 52429.7, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2097463296. Throughput: 0: 49456.1. Samples: 1626214540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 01:30:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:30:07,867][71000] Updated weights for policy 0, policy_version 128024 (0.0030) [2024-06-13 01:30:10,940][70768] Fps is (10 sec: 49163.5, 60 sec: 48605.9, 300 sec: 49263.1). Total num frames: 2097676288. Throughput: 0: 49441.4. Samples: 1626511360. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 01:30:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:30:11,159][71000] Updated weights for policy 0, policy_version 128034 (0.0028) [2024-06-13 01:30:14,525][71000] Updated weights for policy 0, policy_version 128044 (0.0026) [2024-06-13 01:30:15,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2097922048. Throughput: 0: 49134.8. Samples: 1626805180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 01:30:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:30:17,637][71000] Updated weights for policy 0, policy_version 128054 (0.0028) [2024-06-13 01:30:20,940][70768] Fps is (10 sec: 50788.8, 60 sec: 49151.8, 300 sec: 49374.1). Total num frames: 2098184192. Throughput: 0: 48947.2. Samples: 1626956100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 01:30:20,941][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:30:21,089][71000] Updated weights for policy 0, policy_version 128064 (0.0040) [2024-06-13 01:30:24,471][71000] Updated weights for policy 0, policy_version 128074 (0.0030) [2024-06-13 01:30:25,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 2098446336. Throughput: 0: 49367.6. Samples: 1627253760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 01:30:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:30:27,906][71000] Updated weights for policy 0, policy_version 128084 (0.0025) [2024-06-13 01:30:30,940][70768] Fps is (10 sec: 49153.4, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2098675712. Throughput: 0: 49342.0. Samples: 1627549600. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 01:30:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:30:31,249][71000] Updated weights for policy 0, policy_version 128094 (0.0031) [2024-06-13 01:30:34,447][71000] Updated weights for policy 0, policy_version 128104 (0.0027) [2024-06-13 01:30:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2098921472. Throughput: 0: 49111.5. Samples: 1627693000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 01:30:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:30:37,913][71000] Updated weights for policy 0, policy_version 128114 (0.0036) [2024-06-13 01:30:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2099167232. Throughput: 0: 49271.2. Samples: 1627990680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 01:30:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:30:41,335][71000] Updated weights for policy 0, policy_version 128124 (0.0029) [2024-06-13 01:30:44,651][71000] Updated weights for policy 0, policy_version 128134 (0.0037) [2024-06-13 01:30:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49155.5, 300 sec: 49263.1). Total num frames: 2099412992. Throughput: 0: 49268.8. Samples: 1628284180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 01:30:45,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:30:47,752][71000] Updated weights for policy 0, policy_version 128144 (0.0027) [2024-06-13 01:30:50,922][71000] Updated weights for policy 0, policy_version 128154 (0.0031) [2024-06-13 01:30:50,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2099675136. Throughput: 0: 49416.8. Samples: 1628438300. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 01:30:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:30:54,410][71000] Updated weights for policy 0, policy_version 128164 (0.0028) [2024-06-13 01:30:55,942][70768] Fps is (10 sec: 50777.8, 60 sec: 49696.2, 300 sec: 49373.7). Total num frames: 2099920896. Throughput: 0: 49515.9. Samples: 1628739700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 01:30:55,943][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:30:58,069][71000] Updated weights for policy 0, policy_version 128174 (0.0025) [2024-06-13 01:31:00,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49427.0, 300 sec: 49374.2). Total num frames: 2100150272. Throughput: 0: 49377.4. Samples: 1629027160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 01:31:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:31:01,049][71000] Updated weights for policy 0, policy_version 128184 (0.0027) [2024-06-13 01:31:04,566][71000] Updated weights for policy 0, policy_version 128194 (0.0029) [2024-06-13 01:31:05,939][70768] Fps is (10 sec: 49164.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2100412416. Throughput: 0: 49397.7. Samples: 1629178980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 01:31:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:31:07,541][71000] Updated weights for policy 0, policy_version 128204 (0.0030) [2024-06-13 01:31:10,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2100625408. Throughput: 0: 49301.4. Samples: 1629472320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 01:31:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:31:11,105][70980] Signal inference workers to stop experience collection... (24050 times) [2024-06-13 01:31:11,105][70980] Signal inference workers to resume experience collection... 
(24050 times) [2024-06-13 01:31:11,118][71000] InferenceWorker_p0-w0: stopping experience collection (24050 times) [2024-06-13 01:31:11,118][71000] InferenceWorker_p0-w0: resuming experience collection (24050 times) [2024-06-13 01:31:11,253][71000] Updated weights for policy 0, policy_version 128214 (0.0026) [2024-06-13 01:31:14,235][71000] Updated weights for policy 0, policy_version 128224 (0.0031) [2024-06-13 01:31:15,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 2100903936. Throughput: 0: 49155.5. Samples: 1629761600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 01:31:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:31:18,207][71000] Updated weights for policy 0, policy_version 128234 (0.0033) [2024-06-13 01:31:20,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49152.2, 300 sec: 49374.2). Total num frames: 2101133312. Throughput: 0: 49343.5. Samples: 1629913460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 01:31:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:31:20,998][71000] Updated weights for policy 0, policy_version 128244 (0.0028) [2024-06-13 01:31:24,675][71000] Updated weights for policy 0, policy_version 128254 (0.0025) [2024-06-13 01:31:25,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2101395456. Throughput: 0: 49328.5. Samples: 1630210460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 01:31:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:31:27,539][71000] Updated weights for policy 0, policy_version 128264 (0.0026) [2024-06-13 01:31:30,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2101608448. Throughput: 0: 49247.5. Samples: 1630500320. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 01:31:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:31:31,438][71000] Updated weights for policy 0, policy_version 128274 (0.0035) [2024-06-13 01:31:34,317][71000] Updated weights for policy 0, policy_version 128284 (0.0031) [2024-06-13 01:31:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2101870592. Throughput: 0: 49012.9. Samples: 1630643880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 01:31:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:31:38,306][71000] Updated weights for policy 0, policy_version 128294 (0.0022) [2024-06-13 01:31:40,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2102116352. Throughput: 0: 48754.3. Samples: 1630933520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 01:31:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:31:40,961][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000128304_2102132736.pth... [2024-06-13 01:31:40,965][71000] Updated weights for policy 0, policy_version 128304 (0.0032) [2024-06-13 01:31:41,009][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000127582_2090303488.pth [2024-06-13 01:31:44,737][71000] Updated weights for policy 0, policy_version 128314 (0.0028) [2024-06-13 01:31:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2102362112. Throughput: 0: 49083.9. Samples: 1631235940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 01:31:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:31:47,654][71000] Updated weights for policy 0, policy_version 128324 (0.0025) [2024-06-13 01:31:50,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48606.0, 300 sec: 49152.0). Total num frames: 2102591488. Throughput: 0: 48807.1. Samples: 1631375300. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 01:31:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:31:51,379][71000] Updated weights for policy 0, policy_version 128334 (0.0028) [2024-06-13 01:31:54,374][71000] Updated weights for policy 0, policy_version 128344 (0.0028) [2024-06-13 01:31:55,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49153.9, 300 sec: 49263.1). Total num frames: 2102870016. Throughput: 0: 48866.8. Samples: 1631671340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 01:31:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:31:58,201][71000] Updated weights for policy 0, policy_version 128354 (0.0034) [2024-06-13 01:32:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2103083008. Throughput: 0: 48981.9. Samples: 1631965780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 01:32:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:32:01,185][71000] Updated weights for policy 0, policy_version 128364 (0.0028) [2024-06-13 01:32:04,631][71000] Updated weights for policy 0, policy_version 128374 (0.0030) [2024-06-13 01:32:05,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48605.8, 300 sec: 49207.5). Total num frames: 2103328768. Throughput: 0: 48894.7. Samples: 1632113720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 01:32:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:32:07,778][71000] Updated weights for policy 0, policy_version 128384 (0.0034) [2024-06-13 01:32:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2103574528. Throughput: 0: 48823.6. Samples: 1632407520. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 01:32:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:32:11,549][71000] Updated weights for policy 0, policy_version 128394 (0.0025) [2024-06-13 01:32:14,295][71000] Updated weights for policy 0, policy_version 128404 (0.0029) [2024-06-13 01:32:15,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2103853056. Throughput: 0: 48889.9. Samples: 1632700360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 01:32:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:32:18,212][71000] Updated weights for policy 0, policy_version 128414 (0.0039) [2024-06-13 01:32:19,325][70980] Signal inference workers to stop experience collection... (24100 times) [2024-06-13 01:32:19,327][70980] Signal inference workers to resume experience collection... (24100 times) [2024-06-13 01:32:19,361][71000] InferenceWorker_p0-w0: stopping experience collection (24100 times) [2024-06-13 01:32:19,361][71000] InferenceWorker_p0-w0: resuming experience collection (24100 times) [2024-06-13 01:32:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2104082432. Throughput: 0: 49225.7. Samples: 1632859040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 01:32:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:32:21,089][71000] Updated weights for policy 0, policy_version 128424 (0.0029) [2024-06-13 01:32:24,846][71000] Updated weights for policy 0, policy_version 128434 (0.0034) [2024-06-13 01:32:25,940][70768] Fps is (10 sec: 45874.2, 60 sec: 48605.7, 300 sec: 49207.5). Total num frames: 2104311808. Throughput: 0: 49362.9. Samples: 1633154860. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 01:32:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:32:27,646][71000] Updated weights for policy 0, policy_version 128444 (0.0029) [2024-06-13 01:32:30,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.1, 300 sec: 49263.5). Total num frames: 2104557568. Throughput: 0: 49205.8. Samples: 1633450200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 01:32:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:32:31,416][71000] Updated weights for policy 0, policy_version 128454 (0.0033) [2024-06-13 01:32:34,039][71000] Updated weights for policy 0, policy_version 128464 (0.0036) [2024-06-13 01:32:35,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49424.9, 300 sec: 49263.1). Total num frames: 2104836096. Throughput: 0: 49290.4. Samples: 1633593380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 01:32:35,943][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:32:38,403][71000] Updated weights for policy 0, policy_version 128474 (0.0028) [2024-06-13 01:32:40,891][71000] Updated weights for policy 0, policy_version 128484 (0.0025) [2024-06-13 01:32:40,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2105081856. Throughput: 0: 49069.0. Samples: 1633879440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 01:32:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:32:45,129][71000] Updated weights for policy 0, policy_version 128494 (0.0022) [2024-06-13 01:32:45,944][70768] Fps is (10 sec: 45856.4, 60 sec: 48875.5, 300 sec: 49151.3). Total num frames: 2105294848. Throughput: 0: 49162.9. Samples: 1634178320. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 01:32:45,944][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:32:47,853][71000] Updated weights for policy 0, policy_version 128504 (0.0029) [2024-06-13 01:32:50,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 2105524224. Throughput: 0: 48883.0. Samples: 1634313460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 01:32:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:32:51,595][71000] Updated weights for policy 0, policy_version 128514 (0.0031) [2024-06-13 01:32:54,146][71000] Updated weights for policy 0, policy_version 128524 (0.0028) [2024-06-13 01:32:55,940][70768] Fps is (10 sec: 50811.9, 60 sec: 48879.1, 300 sec: 49207.6). Total num frames: 2105802752. Throughput: 0: 48944.4. Samples: 1634610020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 01:32:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:32:58,375][71000] Updated weights for policy 0, policy_version 128534 (0.0026) [2024-06-13 01:33:00,732][71000] Updated weights for policy 0, policy_version 128544 (0.0035) [2024-06-13 01:33:00,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49698.0, 300 sec: 49263.1). Total num frames: 2106064896. Throughput: 0: 49001.1. Samples: 1634905420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 01:33:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:33:04,951][71000] Updated weights for policy 0, policy_version 128554 (0.0038) [2024-06-13 01:33:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2106277888. Throughput: 0: 48881.4. Samples: 1635058700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 01:33:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:33:07,420][71000] Updated weights for policy 0, policy_version 128564 (0.0029) [2024-06-13 01:33:10,939][70768] Fps is (10 sec: 45876.2, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2106523648. Throughput: 0: 49038.0. Samples: 1635361560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 01:33:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:33:11,283][71000] Updated weights for policy 0, policy_version 128574 (0.0026) [2024-06-13 01:33:13,892][71000] Updated weights for policy 0, policy_version 128584 (0.0028) [2024-06-13 01:33:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.8, 300 sec: 49263.1). Total num frames: 2106785792. Throughput: 0: 49099.9. Samples: 1635659700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 01:33:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:33:17,777][71000] Updated weights for policy 0, policy_version 128594 (0.0022) [2024-06-13 01:33:20,438][71000] Updated weights for policy 0, policy_version 128604 (0.0025) [2024-06-13 01:33:20,939][70768] Fps is (10 sec: 54067.2, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 2107064320. Throughput: 0: 49264.6. Samples: 1635810280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 01:33:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:33:24,737][71000] Updated weights for policy 0, policy_version 128614 (0.0028) [2024-06-13 01:33:25,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2107277312. Throughput: 0: 49418.7. Samples: 1636103280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 01:33:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:33:27,069][70980] Signal inference workers to stop experience collection... 
(24150 times) [2024-06-13 01:33:27,070][70980] Signal inference workers to resume experience collection... (24150 times) [2024-06-13 01:33:27,093][71000] InferenceWorker_p0-w0: stopping experience collection (24150 times) [2024-06-13 01:33:27,093][71000] InferenceWorker_p0-w0: resuming experience collection (24150 times) [2024-06-13 01:33:27,213][71000] Updated weights for policy 0, policy_version 128624 (0.0028) [2024-06-13 01:33:30,940][70768] Fps is (10 sec: 42598.3, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2107490304. Throughput: 0: 49293.1. Samples: 1636396300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 01:33:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:33:31,272][71000] Updated weights for policy 0, policy_version 128634 (0.0033) [2024-06-13 01:33:33,998][71000] Updated weights for policy 0, policy_version 128644 (0.0024) [2024-06-13 01:33:35,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.1, 300 sec: 49207.6). Total num frames: 2107768832. Throughput: 0: 49486.9. Samples: 1636540360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 01:33:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:33:37,909][71000] Updated weights for policy 0, policy_version 128654 (0.0027) [2024-06-13 01:33:40,526][71000] Updated weights for policy 0, policy_version 128664 (0.0029) [2024-06-13 01:33:40,940][70768] Fps is (10 sec: 55705.6, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2108047360. Throughput: 0: 49478.3. Samples: 1636836540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 01:33:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:33:40,959][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000128665_2108047360.pth... 
[2024-06-13 01:33:41,000][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000127944_2096234496.pth [2024-06-13 01:33:44,372][71000] Updated weights for policy 0, policy_version 128674 (0.0034) [2024-06-13 01:33:45,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49701.6, 300 sec: 49207.5). Total num frames: 2108276736. Throughput: 0: 49756.5. Samples: 1637144460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 01:33:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-13 01:33:47,235][71000] Updated weights for policy 0, policy_version 128684 (0.0031) [2024-06-13 01:33:50,940][70768] Fps is (10 sec: 44236.5, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2108489728. Throughput: 0: 49409.7. Samples: 1637282140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 01:33:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:33:51,407][71000] Updated weights for policy 0, policy_version 128694 (0.0022) [2024-06-13 01:33:53,866][71000] Updated weights for policy 0, policy_version 128704 (0.0027) [2024-06-13 01:33:55,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2108751872. Throughput: 0: 49008.0. Samples: 1637566920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 01:33:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:33:57,951][71000] Updated weights for policy 0, policy_version 128714 (0.0026) [2024-06-13 01:34:00,602][71000] Updated weights for policy 0, policy_version 128724 (0.0030) [2024-06-13 01:34:00,940][70768] Fps is (10 sec: 54067.5, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2109030400. Throughput: 0: 49161.0. Samples: 1637871940. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 01:34:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:34:04,375][71000] Updated weights for policy 0, policy_version 128734 (0.0024) [2024-06-13 01:34:05,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 2109259776. Throughput: 0: 49410.2. Samples: 1638033740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 01:34:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 01:34:07,207][71000] Updated weights for policy 0, policy_version 128744 (0.0021) [2024-06-13 01:34:10,940][70768] Fps is (10 sec: 45874.4, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 2109489152. Throughput: 0: 49301.6. Samples: 1638321860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 01:34:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:34:11,171][71000] Updated weights for policy 0, policy_version 128754 (0.0033) [2024-06-13 01:34:13,774][71000] Updated weights for policy 0, policy_version 128764 (0.0028) [2024-06-13 01:34:15,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2109734912. Throughput: 0: 49309.4. Samples: 1638615220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 01:34:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:34:17,587][71000] Updated weights for policy 0, policy_version 128774 (0.0033) [2024-06-13 01:34:20,244][71000] Updated weights for policy 0, policy_version 128784 (0.0031) [2024-06-13 01:34:20,939][70768] Fps is (10 sec: 52429.9, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2110013440. Throughput: 0: 49554.7. Samples: 1638770320. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 01:34:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 01:34:24,269][71000] Updated weights for policy 0, policy_version 128794 (0.0028) [2024-06-13 01:34:25,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49698.1, 300 sec: 49207.6). Total num frames: 2110259200. Throughput: 0: 49607.5. Samples: 1639068880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 01:34:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:34:27,094][71000] Updated weights for policy 0, policy_version 128804 (0.0027) [2024-06-13 01:34:30,790][71000] Updated weights for policy 0, policy_version 128814 (0.0030) [2024-06-13 01:34:30,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 2110488576. Throughput: 0: 49321.0. Samples: 1639363900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 01:34:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:34:33,617][71000] Updated weights for policy 0, policy_version 128824 (0.0032) [2024-06-13 01:34:35,939][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2110717952. Throughput: 0: 49450.4. Samples: 1639507400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 01:34:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:34:36,202][70980] Signal inference workers to stop experience collection... (24200 times) [2024-06-13 01:34:36,250][71000] InferenceWorker_p0-w0: stopping experience collection (24200 times) [2024-06-13 01:34:36,252][70980] Signal inference workers to resume experience collection... 
(24200 times) [2024-06-13 01:34:36,263][71000] InferenceWorker_p0-w0: resuming experience collection (24200 times) [2024-06-13 01:34:37,483][71000] Updated weights for policy 0, policy_version 128834 (0.0029) [2024-06-13 01:34:40,346][71000] Updated weights for policy 0, policy_version 128844 (0.0027) [2024-06-13 01:34:40,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.0, 300 sec: 49263.8). Total num frames: 2110996480. Throughput: 0: 49831.5. Samples: 1639809340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 01:34:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:34:43,943][71000] Updated weights for policy 0, policy_version 128854 (0.0038) [2024-06-13 01:34:45,940][70768] Fps is (10 sec: 50788.9, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2111225856. Throughput: 0: 49447.8. Samples: 1640097100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 01:34:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:34:46,851][71000] Updated weights for policy 0, policy_version 128864 (0.0028) [2024-06-13 01:34:50,570][71000] Updated weights for policy 0, policy_version 128874 (0.0024) [2024-06-13 01:34:50,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2111471616. Throughput: 0: 49285.8. Samples: 1640251600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 01:34:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:34:54,006][71000] Updated weights for policy 0, policy_version 128884 (0.0033) [2024-06-13 01:34:55,939][70768] Fps is (10 sec: 45876.6, 60 sec: 48879.0, 300 sec: 49152.4). Total num frames: 2111684608. Throughput: 0: 49139.0. Samples: 1640533100. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 01:34:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:34:57,391][71000] Updated weights for policy 0, policy_version 128894 (0.0023) [2024-06-13 01:35:00,649][71000] Updated weights for policy 0, policy_version 128904 (0.0023) [2024-06-13 01:35:00,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2111979520. Throughput: 0: 49128.7. Samples: 1640826020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 01:35:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:35:03,994][71000] Updated weights for policy 0, policy_version 128914 (0.0030) [2024-06-13 01:35:05,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2112208896. Throughput: 0: 49120.9. Samples: 1640980760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 01:35:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:35:07,270][71000] Updated weights for policy 0, policy_version 128924 (0.0030) [2024-06-13 01:35:10,704][71000] Updated weights for policy 0, policy_version 128934 (0.0030) [2024-06-13 01:35:10,939][70768] Fps is (10 sec: 47514.4, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 2112454656. Throughput: 0: 49066.7. Samples: 1641276880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 01:35:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:35:14,173][71000] Updated weights for policy 0, policy_version 128944 (0.0022) [2024-06-13 01:35:15,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49152.1). Total num frames: 2112684032. Throughput: 0: 49094.8. Samples: 1641573160. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 01:35:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:35:17,401][71000] Updated weights for policy 0, policy_version 128954 (0.0033) [2024-06-13 01:35:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.8, 300 sec: 49096.5). Total num frames: 2112929792. Throughput: 0: 49038.6. Samples: 1641714140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 01:35:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:35:21,065][71000] Updated weights for policy 0, policy_version 128964 (0.0024) [2024-06-13 01:35:23,849][71000] Updated weights for policy 0, policy_version 128974 (0.0026) [2024-06-13 01:35:25,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2113191936. Throughput: 0: 48826.7. Samples: 1642006540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 01:35:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:35:27,618][71000] Updated weights for policy 0, policy_version 128984 (0.0028) [2024-06-13 01:35:30,700][71000] Updated weights for policy 0, policy_version 128994 (0.0027) [2024-06-13 01:35:30,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2113437696. Throughput: 0: 49027.8. Samples: 1642303340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:35:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:35:34,668][71000] Updated weights for policy 0, policy_version 129004 (0.0031) [2024-06-13 01:35:35,940][70768] Fps is (10 sec: 45874.5, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 2113650688. Throughput: 0: 48606.0. Samples: 1642438880. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:35:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 01:35:37,599][71000] Updated weights for policy 0, policy_version 129014 (0.0026) [2024-06-13 01:35:40,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48332.7, 300 sec: 49096.5). Total num frames: 2113896448. Throughput: 0: 49001.6. Samples: 1642738180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:35:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:35:41,039][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000129023_2113912832.pth... [2024-06-13 01:35:41,102][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000128304_2102132736.pth [2024-06-13 01:35:41,249][71000] Updated weights for policy 0, policy_version 129024 (0.0032) [2024-06-13 01:35:44,242][71000] Updated weights for policy 0, policy_version 129034 (0.0036) [2024-06-13 01:35:45,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 2114158592. Throughput: 0: 48861.9. Samples: 1643024800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:35:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:35:47,942][71000] Updated weights for policy 0, policy_version 129044 (0.0033) [2024-06-13 01:35:50,705][71000] Updated weights for policy 0, policy_version 129054 (0.0026) [2024-06-13 01:35:50,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49151.8, 300 sec: 49152.4). Total num frames: 2114420736. Throughput: 0: 48978.4. Samples: 1643184800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:35:50,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 01:35:54,418][71000] Updated weights for policy 0, policy_version 129064 (0.0023) [2024-06-13 01:35:55,145][70980] Signal inference workers to stop experience collection... 
(24250 times) [2024-06-13 01:35:55,196][71000] InferenceWorker_p0-w0: stopping experience collection (24250 times) [2024-06-13 01:35:55,199][70980] Signal inference workers to resume experience collection... (24250 times) [2024-06-13 01:35:55,212][71000] InferenceWorker_p0-w0: resuming experience collection (24250 times) [2024-06-13 01:35:55,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 2114650112. Throughput: 0: 48970.9. Samples: 1643480580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:35:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:35:57,140][71000] Updated weights for policy 0, policy_version 129074 (0.0036) [2024-06-13 01:36:00,940][70768] Fps is (10 sec: 47514.5, 60 sec: 48606.0, 300 sec: 49096.5). Total num frames: 2114895872. Throughput: 0: 49038.2. Samples: 1643779880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:36:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:36:01,254][71000] Updated weights for policy 0, policy_version 129084 (0.0031) [2024-06-13 01:36:03,925][71000] Updated weights for policy 0, policy_version 129094 (0.0026) [2024-06-13 01:36:05,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2115158016. Throughput: 0: 49095.1. Samples: 1643923420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:36:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:36:07,716][71000] Updated weights for policy 0, policy_version 129104 (0.0027) [2024-06-13 01:36:10,491][71000] Updated weights for policy 0, policy_version 129114 (0.0027) [2024-06-13 01:36:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2115403776. Throughput: 0: 49176.8. Samples: 1644219500. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:36:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:36:14,215][71000] Updated weights for policy 0, policy_version 129124 (0.0027) [2024-06-13 01:36:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49207.6). Total num frames: 2115649536. Throughput: 0: 49435.6. Samples: 1644527940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:36:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:36:17,147][71000] Updated weights for policy 0, policy_version 129134 (0.0028) [2024-06-13 01:36:20,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2115878912. Throughput: 0: 49426.4. Samples: 1644663060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:36:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:36:21,615][71000] Updated weights for policy 0, policy_version 129144 (0.0031) [2024-06-13 01:36:23,874][71000] Updated weights for policy 0, policy_version 129154 (0.0025) [2024-06-13 01:36:25,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2116141056. Throughput: 0: 49121.2. Samples: 1644948640. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:36:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:36:28,063][71000] Updated weights for policy 0, policy_version 129164 (0.0030) [2024-06-13 01:36:30,654][71000] Updated weights for policy 0, policy_version 129174 (0.0028) [2024-06-13 01:36:30,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2116403200. Throughput: 0: 49467.4. Samples: 1645250840. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 01:36:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:36:34,320][71000] Updated weights for policy 0, policy_version 129184 (0.0038) [2024-06-13 01:36:35,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49698.3, 300 sec: 49207.5). Total num frames: 2116632576. Throughput: 0: 49515.3. Samples: 1645412980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 01:36:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:36:37,222][71000] Updated weights for policy 0, policy_version 129194 (0.0030) [2024-06-13 01:36:40,942][70768] Fps is (10 sec: 45864.6, 60 sec: 49423.1, 300 sec: 49151.6). Total num frames: 2116861952. Throughput: 0: 49310.8. Samples: 1645699680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 01:36:40,942][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:36:41,151][71000] Updated weights for policy 0, policy_version 129204 (0.0027) [2024-06-13 01:36:44,135][71000] Updated weights for policy 0, policy_version 129214 (0.0031) [2024-06-13 01:36:45,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49424.9, 300 sec: 49263.0). Total num frames: 2117124096. Throughput: 0: 49004.7. Samples: 1645985100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 01:36:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:36:48,058][71000] Updated weights for policy 0, policy_version 129224 (0.0028) [2024-06-13 01:36:50,760][71000] Updated weights for policy 0, policy_version 129234 (0.0022) [2024-06-13 01:36:50,940][70768] Fps is (10 sec: 50802.7, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2117369856. Throughput: 0: 49308.0. Samples: 1646142280. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 01:36:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:36:54,346][71000] Updated weights for policy 0, policy_version 129244 (0.0027) [2024-06-13 01:36:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 2117632000. Throughput: 0: 49505.2. Samples: 1646447240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 01:36:55,949][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:36:56,983][71000] Updated weights for policy 0, policy_version 129254 (0.0026) [2024-06-13 01:37:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2117844992. Throughput: 0: 49367.0. Samples: 1646749460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 01:37:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:37:01,014][71000] Updated weights for policy 0, policy_version 129264 (0.0026) [2024-06-13 01:37:03,538][71000] Updated weights for policy 0, policy_version 129274 (0.0031) [2024-06-13 01:37:05,939][70768] Fps is (10 sec: 47514.4, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2118107136. Throughput: 0: 49487.1. Samples: 1646889980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 01:37:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:37:07,696][71000] Updated weights for policy 0, policy_version 129284 (0.0030) [2024-06-13 01:37:07,859][70980] Signal inference workers to stop experience collection... (24300 times) [2024-06-13 01:37:07,869][71000] InferenceWorker_p0-w0: stopping experience collection (24300 times) [2024-06-13 01:37:07,964][70980] Signal inference workers to resume experience collection... 
(24300 times) [2024-06-13 01:37:07,964][71000] InferenceWorker_p0-w0: resuming experience collection (24300 times) [2024-06-13 01:37:10,385][71000] Updated weights for policy 0, policy_version 129294 (0.0033) [2024-06-13 01:37:10,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2118385664. Throughput: 0: 49516.9. Samples: 1647176900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 01:37:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:37:14,400][71000] Updated weights for policy 0, policy_version 129304 (0.0023) [2024-06-13 01:37:15,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 2118631424. Throughput: 0: 49461.4. Samples: 1647476600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 01:37:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:37:16,896][71000] Updated weights for policy 0, policy_version 129314 (0.0024) [2024-06-13 01:37:20,612][71000] Updated weights for policy 0, policy_version 129324 (0.0021) [2024-06-13 01:37:20,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 2118860800. Throughput: 0: 49257.7. Samples: 1647629580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 01:37:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:37:23,331][71000] Updated weights for policy 0, policy_version 129334 (0.0028) [2024-06-13 01:37:25,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2119090176. Throughput: 0: 49720.4. Samples: 1647936980. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 01:37:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:37:27,308][71000] Updated weights for policy 0, policy_version 129344 (0.0034) [2024-06-13 01:37:29,779][71000] Updated weights for policy 0, policy_version 129354 (0.0031) [2024-06-13 01:37:30,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2119352320. Throughput: 0: 49657.9. Samples: 1648219700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 01:37:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:37:33,953][71000] Updated weights for policy 0, policy_version 129364 (0.0032) [2024-06-13 01:37:35,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49698.0, 300 sec: 49263.1). Total num frames: 2119614464. Throughput: 0: 49808.3. Samples: 1648383660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 01:37:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:37:36,609][71000] Updated weights for policy 0, policy_version 129374 (0.0028) [2024-06-13 01:37:40,579][71000] Updated weights for policy 0, policy_version 129384 (0.0036) [2024-06-13 01:37:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49700.1, 300 sec: 49319.3). Total num frames: 2119843840. Throughput: 0: 49523.6. Samples: 1648675800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 01:37:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:37:41,009][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000129386_2119860224.pth... [2024-06-13 01:37:41,057][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000128665_2108047360.pth [2024-06-13 01:37:43,057][71000] Updated weights for policy 0, policy_version 129394 (0.0021) [2024-06-13 01:37:45,944][70768] Fps is (10 sec: 45857.4, 60 sec: 49148.8, 300 sec: 49318.0). Total num frames: 2120073216. Throughput: 0: 49439.2. Samples: 1648974420. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 01:37:45,944][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:37:47,106][71000] Updated weights for policy 0, policy_version 129404 (0.0024) [2024-06-13 01:37:49,528][71000] Updated weights for policy 0, policy_version 129414 (0.0028) [2024-06-13 01:37:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 2120351744. Throughput: 0: 49577.2. Samples: 1649120960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 01:37:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:37:53,859][71000] Updated weights for policy 0, policy_version 129424 (0.0032) [2024-06-13 01:37:55,939][70768] Fps is (10 sec: 54089.1, 60 sec: 49698.3, 300 sec: 49318.7). Total num frames: 2120613888. Throughput: 0: 49799.7. Samples: 1649417880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 01:37:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:37:56,144][71000] Updated weights for policy 0, policy_version 129434 (0.0033) [2024-06-13 01:38:00,430][71000] Updated weights for policy 0, policy_version 129444 (0.0030) [2024-06-13 01:38:00,939][70768] Fps is (10 sec: 45875.9, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2120810496. Throughput: 0: 49794.3. Samples: 1649717340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 01:38:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:38:03,104][71000] Updated weights for policy 0, policy_version 129454 (0.0020) [2024-06-13 01:38:05,940][70768] Fps is (10 sec: 44236.5, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2121056256. Throughput: 0: 49368.9. Samples: 1649851180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 01:38:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:38:06,771][70980] Signal inference workers to stop experience collection... 
(24350 times) [2024-06-13 01:38:06,806][71000] InferenceWorker_p0-w0: stopping experience collection (24350 times) [2024-06-13 01:38:06,827][70980] Signal inference workers to resume experience collection... (24350 times) [2024-06-13 01:38:06,828][71000] InferenceWorker_p0-w0: resuming experience collection (24350 times) [2024-06-13 01:38:06,964][71000] Updated weights for policy 0, policy_version 129464 (0.0028) [2024-06-13 01:38:09,607][71000] Updated weights for policy 0, policy_version 129474 (0.0024) [2024-06-13 01:38:10,940][70768] Fps is (10 sec: 54066.4, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2121351168. Throughput: 0: 49038.1. Samples: 1650143700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 01:38:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:38:13,995][71000] Updated weights for policy 0, policy_version 129484 (0.0027) [2024-06-13 01:38:15,940][70768] Fps is (10 sec: 55705.6, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 2121613312. Throughput: 0: 49356.4. Samples: 1650440740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 01:38:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:38:16,069][71000] Updated weights for policy 0, policy_version 129494 (0.0029) [2024-06-13 01:38:20,543][71000] Updated weights for policy 0, policy_version 129504 (0.0033) [2024-06-13 01:38:20,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2121809920. Throughput: 0: 49203.7. Samples: 1650597820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 01:38:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:38:22,980][71000] Updated weights for policy 0, policy_version 129514 (0.0027) [2024-06-13 01:38:25,940][70768] Fps is (10 sec: 42598.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2122039296. Throughput: 0: 49190.3. Samples: 1650889360. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 01:38:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:38:27,147][71000] Updated weights for policy 0, policy_version 129524 (0.0027) [2024-06-13 01:38:29,849][71000] Updated weights for policy 0, policy_version 129534 (0.0034) [2024-06-13 01:38:30,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2122317824. Throughput: 0: 48850.2. Samples: 1651172480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 01:38:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:38:34,023][71000] Updated weights for policy 0, policy_version 129544 (0.0035) [2024-06-13 01:38:35,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2122579968. Throughput: 0: 49222.2. Samples: 1651335960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 01:38:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:38:36,439][71000] Updated weights for policy 0, policy_version 129554 (0.0033) [2024-06-13 01:38:40,647][71000] Updated weights for policy 0, policy_version 129564 (0.0026) [2024-06-13 01:38:40,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2122792960. Throughput: 0: 49153.6. Samples: 1651629800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-13 01:38:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:38:42,937][71000] Updated weights for policy 0, policy_version 129574 (0.0036) [2024-06-13 01:38:45,939][70768] Fps is (10 sec: 44237.6, 60 sec: 49155.3, 300 sec: 49263.1). Total num frames: 2123022336. Throughput: 0: 48972.0. Samples: 1651921080. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-13 01:38:45,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 01:38:47,358][71000] Updated weights for policy 0, policy_version 129584 (0.0037) [2024-06-13 01:38:49,955][71000] Updated weights for policy 0, policy_version 129594 (0.0033) [2024-06-13 01:38:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2123284480. Throughput: 0: 48919.5. Samples: 1652052560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-13 01:38:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:38:54,021][71000] Updated weights for policy 0, policy_version 129604 (0.0032) [2024-06-13 01:38:55,940][70768] Fps is (10 sec: 52428.0, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2123546624. Throughput: 0: 49172.4. Samples: 1652356460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-13 01:38:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:38:56,527][71000] Updated weights for policy 0, policy_version 129614 (0.0031) [2024-06-13 01:39:00,734][71000] Updated weights for policy 0, policy_version 129624 (0.0028) [2024-06-13 01:39:00,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49151.8, 300 sec: 49152.0). Total num frames: 2123759616. Throughput: 0: 48903.0. Samples: 1652641380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-13 01:39:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:39:03,201][71000] Updated weights for policy 0, policy_version 129634 (0.0027) [2024-06-13 01:39:05,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2124005376. Throughput: 0: 48458.7. Samples: 1652778460. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-13 01:39:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 01:39:07,533][71000] Updated weights for policy 0, policy_version 129644 (0.0031) [2024-06-13 01:39:10,054][71000] Updated weights for policy 0, policy_version 129654 (0.0027) [2024-06-13 01:39:10,940][70768] Fps is (10 sec: 50791.1, 60 sec: 48605.9, 300 sec: 49263.1). Total num frames: 2124267520. Throughput: 0: 48696.9. Samples: 1653080720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-13 01:39:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:39:12,286][70980] Signal inference workers to stop experience collection... (24400 times) [2024-06-13 01:39:12,288][70980] Signal inference workers to resume experience collection... (24400 times) [2024-06-13 01:39:12,294][71000] InferenceWorker_p0-w0: stopping experience collection (24400 times) [2024-06-13 01:39:12,316][71000] InferenceWorker_p0-w0: resuming experience collection (24400 times) [2024-06-13 01:39:13,981][71000] Updated weights for policy 0, policy_version 129664 (0.0034) [2024-06-13 01:39:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48332.8, 300 sec: 49152.0). Total num frames: 2124513280. Throughput: 0: 48978.1. Samples: 1653376500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-13 01:39:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 01:39:16,781][71000] Updated weights for policy 0, policy_version 129674 (0.0025) [2024-06-13 01:39:20,705][71000] Updated weights for policy 0, policy_version 129684 (0.0022) [2024-06-13 01:39:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2124742656. Throughput: 0: 48709.4. Samples: 1653527880. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-13 01:39:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:39:23,524][71000] Updated weights for policy 0, policy_version 129694 (0.0025) [2024-06-13 01:39:25,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2124988416. Throughput: 0: 48492.9. Samples: 1653811980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-13 01:39:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:39:27,675][71000] Updated weights for policy 0, policy_version 129704 (0.0027) [2024-06-13 01:39:30,000][71000] Updated weights for policy 0, policy_version 129714 (0.0030) [2024-06-13 01:39:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.7, 300 sec: 49207.5). Total num frames: 2125234176. Throughput: 0: 48328.7. Samples: 1654095880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-13 01:39:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 01:39:34,113][71000] Updated weights for policy 0, policy_version 129724 (0.0029) [2024-06-13 01:39:35,939][70768] Fps is (10 sec: 50791.3, 60 sec: 48606.0, 300 sec: 49152.0). Total num frames: 2125496320. Throughput: 0: 49019.2. Samples: 1654258420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-13 01:39:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:39:37,390][71000] Updated weights for policy 0, policy_version 129734 (0.0029) [2024-06-13 01:39:40,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 2125709312. Throughput: 0: 48764.0. Samples: 1654550840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:39:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:39:41,014][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000129744_2125725696.pth... 
[2024-06-13 01:39:41,021][71000] Updated weights for policy 0, policy_version 129744 (0.0048) [2024-06-13 01:39:41,068][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000129023_2113912832.pth [2024-06-13 01:39:44,182][71000] Updated weights for policy 0, policy_version 129754 (0.0038) [2024-06-13 01:39:45,939][70768] Fps is (10 sec: 45875.1, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2125955072. Throughput: 0: 48906.9. Samples: 1654842180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:39:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:39:47,569][71000] Updated weights for policy 0, policy_version 129764 (0.0032) [2024-06-13 01:39:50,599][71000] Updated weights for policy 0, policy_version 129774 (0.0021) [2024-06-13 01:39:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.8, 300 sec: 49263.0). Total num frames: 2126217216. Throughput: 0: 49034.1. Samples: 1654985000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:39:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 01:39:54,409][71000] Updated weights for policy 0, policy_version 129784 (0.0031) [2024-06-13 01:39:55,939][70768] Fps is (10 sec: 50790.5, 60 sec: 48606.0, 300 sec: 49096.5). Total num frames: 2126462976. Throughput: 0: 48860.5. Samples: 1655279440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:39:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:39:57,685][71000] Updated weights for policy 0, policy_version 129794 (0.0034) [2024-06-13 01:40:00,763][71000] Updated weights for policy 0, policy_version 129804 (0.0028) [2024-06-13 01:40:00,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2126725120. Throughput: 0: 48753.7. Samples: 1655570420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:40:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 01:40:00,948][70980] Saving new best policy, reward=0.291! 
[2024-06-13 01:40:04,323][71000] Updated weights for policy 0, policy_version 129814 (0.0027) [2024-06-13 01:40:05,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2126954496. Throughput: 0: 48787.9. Samples: 1655723340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:40:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:40:07,389][71000] Updated weights for policy 0, policy_version 129824 (0.0038) [2024-06-13 01:40:10,744][71000] Updated weights for policy 0, policy_version 129834 (0.0027) [2024-06-13 01:40:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2127200256. Throughput: 0: 48872.0. Samples: 1656011220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:40:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:40:14,144][71000] Updated weights for policy 0, policy_version 129844 (0.0041) [2024-06-13 01:40:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2127446016. Throughput: 0: 49245.8. Samples: 1656311940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:40:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:40:17,573][71000] Updated weights for policy 0, policy_version 129854 (0.0029) [2024-06-13 01:40:19,170][70980] Signal inference workers to stop experience collection... (24450 times) [2024-06-13 01:40:19,206][71000] InferenceWorker_p0-w0: stopping experience collection (24450 times) [2024-06-13 01:40:19,227][70980] Signal inference workers to resume experience collection... (24450 times) [2024-06-13 01:40:19,230][71000] InferenceWorker_p0-w0: resuming experience collection (24450 times) [2024-06-13 01:40:20,501][71000] Updated weights for policy 0, policy_version 129864 (0.0028) [2024-06-13 01:40:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2127691776. Throughput: 0: 48905.6. Samples: 1656459180. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:40:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:40:23,994][71000] Updated weights for policy 0, policy_version 129874 (0.0024) [2024-06-13 01:40:25,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2127937536. Throughput: 0: 49076.4. Samples: 1656759280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:40:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:40:27,011][71000] Updated weights for policy 0, policy_version 129884 (0.0031) [2024-06-13 01:40:30,723][71000] Updated weights for policy 0, policy_version 129894 (0.0027) [2024-06-13 01:40:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2128199680. Throughput: 0: 49164.3. Samples: 1657054580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:40:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:40:33,922][71000] Updated weights for policy 0, policy_version 129904 (0.0044) [2024-06-13 01:40:35,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48605.8, 300 sec: 49207.6). Total num frames: 2128412672. Throughput: 0: 49353.5. Samples: 1657205900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:40:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:40:37,479][71000] Updated weights for policy 0, policy_version 129914 (0.0025) [2024-06-13 01:40:40,699][71000] Updated weights for policy 0, policy_version 129924 (0.0029) [2024-06-13 01:40:40,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2128674816. Throughput: 0: 49106.2. Samples: 1657489220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 01:40:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:40:44,376][71000] Updated weights for policy 0, policy_version 129934 (0.0033) [2024-06-13 01:40:45,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2128904192. Throughput: 0: 49130.8. Samples: 1657781300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:40:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:40:47,274][71000] Updated weights for policy 0, policy_version 129944 (0.0033) [2024-06-13 01:40:50,942][70768] Fps is (10 sec: 47502.2, 60 sec: 48877.1, 300 sec: 49151.6). Total num frames: 2129149952. Throughput: 0: 49092.6. Samples: 1657932620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:40:50,942][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:40:51,111][71000] Updated weights for policy 0, policy_version 129954 (0.0023) [2024-06-13 01:40:53,950][71000] Updated weights for policy 0, policy_version 129964 (0.0020) [2024-06-13 01:40:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.8, 300 sec: 49096.5). Total num frames: 2129379328. Throughput: 0: 49170.8. Samples: 1658223900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:40:55,943][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:40:57,560][71000] Updated weights for policy 0, policy_version 129974 (0.0029) [2024-06-13 01:41:00,591][71000] Updated weights for policy 0, policy_version 129984 (0.0032) [2024-06-13 01:41:00,940][70768] Fps is (10 sec: 50802.9, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2129657856. Throughput: 0: 48976.2. Samples: 1658515860. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:41:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:41:04,174][71000] Updated weights for policy 0, policy_version 129994 (0.0033) [2024-06-13 01:41:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2129887232. Throughput: 0: 49200.1. Samples: 1658673180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:41:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:41:07,172][71000] Updated weights for policy 0, policy_version 130004 (0.0034) [2024-06-13 01:41:10,928][71000] Updated weights for policy 0, policy_version 130014 (0.0028) [2024-06-13 01:41:10,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2130149376. Throughput: 0: 48986.4. Samples: 1658963660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:41:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:41:14,126][71000] Updated weights for policy 0, policy_version 130024 (0.0025) [2024-06-13 01:41:15,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48606.0, 300 sec: 49096.5). Total num frames: 2130362368. Throughput: 0: 49028.1. Samples: 1659260840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:41:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:41:17,400][71000] Updated weights for policy 0, policy_version 130034 (0.0033) [2024-06-13 01:41:20,452][71000] Updated weights for policy 0, policy_version 130044 (0.0025) [2024-06-13 01:41:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.2, 300 sec: 49207.6). Total num frames: 2130657280. Throughput: 0: 48966.7. Samples: 1659409400. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:41:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:41:24,200][71000] Updated weights for policy 0, policy_version 130054 (0.0037) [2024-06-13 01:41:25,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2130886656. Throughput: 0: 49286.3. Samples: 1659707100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:41:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:41:26,839][71000] Updated weights for policy 0, policy_version 130064 (0.0018) [2024-06-13 01:41:30,794][71000] Updated weights for policy 0, policy_version 130074 (0.0026) [2024-06-13 01:41:30,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2131132416. Throughput: 0: 49583.0. Samples: 1660012540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:41:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:41:32,800][70980] Signal inference workers to stop experience collection... (24500 times) [2024-06-13 01:41:32,834][71000] InferenceWorker_p0-w0: stopping experience collection (24500 times) [2024-06-13 01:41:32,855][70980] Signal inference workers to resume experience collection... (24500 times) [2024-06-13 01:41:32,858][71000] InferenceWorker_p0-w0: resuming experience collection (24500 times) [2024-06-13 01:41:33,823][71000] Updated weights for policy 0, policy_version 130084 (0.0027) [2024-06-13 01:41:35,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49152.4). Total num frames: 2131361792. Throughput: 0: 49227.5. Samples: 1660147740. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:41:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:41:37,549][71000] Updated weights for policy 0, policy_version 130094 (0.0023) [2024-06-13 01:41:40,476][71000] Updated weights for policy 0, policy_version 130104 (0.0024) [2024-06-13 01:41:40,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.0, 300 sec: 49207.6). Total num frames: 2131640320. Throughput: 0: 49319.9. Samples: 1660443300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:41:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:41:41,054][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000130106_2131656704.pth... [2024-06-13 01:41:41,097][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000129386_2119860224.pth [2024-06-13 01:41:44,015][71000] Updated weights for policy 0, policy_version 130114 (0.0027) [2024-06-13 01:41:45,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49697.9, 300 sec: 49207.5). Total num frames: 2131886080. Throughput: 0: 49432.2. Samples: 1660740320. Policy #0 lag: (min: 3.0, avg: 10.1, max: 21.0) [2024-06-13 01:41:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:41:47,014][71000] Updated weights for policy 0, policy_version 130124 (0.0031) [2024-06-13 01:41:50,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49154.0, 300 sec: 49040.9). Total num frames: 2132099072. Throughput: 0: 49103.6. Samples: 1660882840. Policy #0 lag: (min: 3.0, avg: 10.1, max: 21.0) [2024-06-13 01:41:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:41:50,976][71000] Updated weights for policy 0, policy_version 130134 (0.0029) [2024-06-13 01:41:53,351][71000] Updated weights for policy 0, policy_version 130144 (0.0031) [2024-06-13 01:41:55,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 2132344832. Throughput: 0: 49317.6. Samples: 1661182960. 
Policy #0 lag: (min: 3.0, avg: 10.1, max: 21.0) [2024-06-13 01:41:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:41:57,634][71000] Updated weights for policy 0, policy_version 130154 (0.0035) [2024-06-13 01:42:00,482][71000] Updated weights for policy 0, policy_version 130164 (0.0022) [2024-06-13 01:42:00,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2132623360. Throughput: 0: 49148.4. Samples: 1661472520. Policy #0 lag: (min: 3.0, avg: 10.1, max: 21.0) [2024-06-13 01:42:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:42:04,235][71000] Updated weights for policy 0, policy_version 130174 (0.0032) [2024-06-13 01:42:05,939][70768] Fps is (10 sec: 52429.8, 60 sec: 49698.2, 300 sec: 49096.5). Total num frames: 2132869120. Throughput: 0: 49467.6. Samples: 1661635440. Policy #0 lag: (min: 3.0, avg: 10.1, max: 21.0) [2024-06-13 01:42:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:42:07,149][71000] Updated weights for policy 0, policy_version 130184 (0.0033) [2024-06-13 01:42:10,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2133082112. Throughput: 0: 49392.8. Samples: 1661929780. Policy #0 lag: (min: 3.0, avg: 10.1, max: 21.0) [2024-06-13 01:42:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:42:10,996][71000] Updated weights for policy 0, policy_version 130194 (0.0020) [2024-06-13 01:42:13,657][71000] Updated weights for policy 0, policy_version 130204 (0.0032) [2024-06-13 01:42:15,939][70768] Fps is (10 sec: 45875.2, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2133327872. Throughput: 0: 48761.9. Samples: 1662206820. 
Policy #0 lag: (min: 3.0, avg: 10.1, max: 21.0) [2024-06-13 01:42:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:42:17,853][71000] Updated weights for policy 0, policy_version 130214 (0.0024) [2024-06-13 01:42:20,542][71000] Updated weights for policy 0, policy_version 130224 (0.0036) [2024-06-13 01:42:20,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2133590016. Throughput: 0: 49118.2. Samples: 1662358060. Policy #0 lag: (min: 3.0, avg: 10.1, max: 21.0) [2024-06-13 01:42:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:42:24,570][71000] Updated weights for policy 0, policy_version 130234 (0.0030) [2024-06-13 01:42:25,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 2133852160. Throughput: 0: 49263.0. Samples: 1662660140. Policy #0 lag: (min: 3.0, avg: 10.1, max: 21.0) [2024-06-13 01:42:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:42:27,214][71000] Updated weights for policy 0, policy_version 130244 (0.0028) [2024-06-13 01:42:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 2134048768. Throughput: 0: 49173.5. Samples: 1662953120. Policy #0 lag: (min: 3.0, avg: 10.1, max: 21.0) [2024-06-13 01:42:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:42:31,212][71000] Updated weights for policy 0, policy_version 130254 (0.0030) [2024-06-13 01:42:31,585][70980] Signal inference workers to stop experience collection... (24550 times) [2024-06-13 01:42:31,585][70980] Signal inference workers to resume experience collection... 
(24550 times) [2024-06-13 01:42:31,621][71000] InferenceWorker_p0-w0: stopping experience collection (24550 times) [2024-06-13 01:42:31,622][71000] InferenceWorker_p0-w0: resuming experience collection (24550 times) [2024-06-13 01:42:33,859][71000] Updated weights for policy 0, policy_version 130264 (0.0037) [2024-06-13 01:42:35,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2134310912. Throughput: 0: 48987.9. Samples: 1663087300. Policy #0 lag: (min: 3.0, avg: 10.1, max: 21.0) [2024-06-13 01:42:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:42:37,972][71000] Updated weights for policy 0, policy_version 130274 (0.0033) [2024-06-13 01:42:40,598][71000] Updated weights for policy 0, policy_version 130284 (0.0024) [2024-06-13 01:42:40,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49152.0, 300 sec: 49208.2). Total num frames: 2134589440. Throughput: 0: 48847.7. Samples: 1663381100. Policy #0 lag: (min: 3.0, avg: 10.1, max: 21.0) [2024-06-13 01:42:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:42:44,421][71000] Updated weights for policy 0, policy_version 130294 (0.0030) [2024-06-13 01:42:45,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2134835200. Throughput: 0: 49301.6. Samples: 1663691100. Policy #0 lag: (min: 3.0, avg: 10.1, max: 21.0) [2024-06-13 01:42:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:42:46,899][71000] Updated weights for policy 0, policy_version 130304 (0.0026) [2024-06-13 01:42:50,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 2135048192. Throughput: 0: 48951.1. Samples: 1663838240. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-13 01:42:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:42:50,973][71000] Updated weights for policy 0, policy_version 130314 (0.0030) [2024-06-13 01:42:53,738][71000] Updated weights for policy 0, policy_version 130324 (0.0034) [2024-06-13 01:42:55,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2135310336. Throughput: 0: 48878.7. Samples: 1664129320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-13 01:42:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 01:42:57,677][71000] Updated weights for policy 0, policy_version 130334 (0.0025) [2024-06-13 01:43:00,363][71000] Updated weights for policy 0, policy_version 130344 (0.0028) [2024-06-13 01:43:00,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2135572480. Throughput: 0: 49307.5. Samples: 1664425660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-13 01:43:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:43:04,359][71000] Updated weights for policy 0, policy_version 130354 (0.0034) [2024-06-13 01:43:05,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2135834624. Throughput: 0: 49599.5. Samples: 1664590040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-13 01:43:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:43:06,647][71000] Updated weights for policy 0, policy_version 130364 (0.0028) [2024-06-13 01:43:10,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 2136031232. Throughput: 0: 49429.0. Samples: 1664884440. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-13 01:43:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:43:11,044][71000] Updated weights for policy 0, policy_version 130374 (0.0030) [2024-06-13 01:43:13,498][71000] Updated weights for policy 0, policy_version 130384 (0.0028) [2024-06-13 01:43:15,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2136293376. Throughput: 0: 49152.0. Samples: 1665164960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-13 01:43:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:43:17,589][71000] Updated weights for policy 0, policy_version 130394 (0.0033) [2024-06-13 01:43:20,251][71000] Updated weights for policy 0, policy_version 130404 (0.0036) [2024-06-13 01:43:20,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2136555520. Throughput: 0: 49637.7. Samples: 1665321000. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-13 01:43:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:43:24,546][71000] Updated weights for policy 0, policy_version 130414 (0.0026) [2024-06-13 01:43:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2136784896. Throughput: 0: 49628.4. Samples: 1665614380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-13 01:43:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:43:27,007][71000] Updated weights for policy 0, policy_version 130424 (0.0034) [2024-06-13 01:43:30,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49425.0, 300 sec: 48929.9). Total num frames: 2137014272. Throughput: 0: 49197.4. Samples: 1665904980. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-13 01:43:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:43:31,175][71000] Updated weights for policy 0, policy_version 130434 (0.0024) [2024-06-13 01:43:33,328][70980] Signal inference workers to stop experience collection... (24600 times) [2024-06-13 01:43:33,329][70980] Signal inference workers to resume experience collection... (24600 times) [2024-06-13 01:43:33,340][71000] InferenceWorker_p0-w0: stopping experience collection (24600 times) [2024-06-13 01:43:33,341][71000] InferenceWorker_p0-w0: resuming experience collection (24600 times) [2024-06-13 01:43:33,477][71000] Updated weights for policy 0, policy_version 130444 (0.0026) [2024-06-13 01:43:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 2137276416. Throughput: 0: 49003.5. Samples: 1666043400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-13 01:43:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:43:37,551][71000] Updated weights for policy 0, policy_version 130454 (0.0025) [2024-06-13 01:43:40,395][71000] Updated weights for policy 0, policy_version 130464 (0.0023) [2024-06-13 01:43:40,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2137538560. Throughput: 0: 49350.6. Samples: 1666350100. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-13 01:43:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:43:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000130465_2137538560.pth... [2024-06-13 01:43:40,991][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000129744_2125725696.pth [2024-06-13 01:43:44,146][71000] Updated weights for policy 0, policy_version 130474 (0.0036) [2024-06-13 01:43:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2137767936. Throughput: 0: 49352.4. Samples: 1666646520. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-13 01:43:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:43:46,826][71000] Updated weights for policy 0, policy_version 130484 (0.0026) [2024-06-13 01:43:50,895][71000] Updated weights for policy 0, policy_version 130494 (0.0024) [2024-06-13 01:43:50,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 2138013696. Throughput: 0: 48904.1. Samples: 1666790720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 01:43:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:43:53,553][71000] Updated weights for policy 0, policy_version 130504 (0.0027) [2024-06-13 01:43:55,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2138259456. Throughput: 0: 48993.8. Samples: 1667089160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 01:43:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:43:57,341][71000] Updated weights for policy 0, policy_version 130514 (0.0024) [2024-06-13 01:44:00,461][71000] Updated weights for policy 0, policy_version 130524 (0.0024) [2024-06-13 01:44:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2138521600. Throughput: 0: 49167.6. Samples: 1667377500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 01:44:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:44:04,014][71000] Updated weights for policy 0, policy_version 130534 (0.0020) [2024-06-13 01:44:05,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48606.0, 300 sec: 49096.5). Total num frames: 2138750976. Throughput: 0: 49045.2. Samples: 1667528020. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 01:44:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:44:06,955][71000] Updated weights for policy 0, policy_version 130544 (0.0026) [2024-06-13 01:44:10,798][71000] Updated weights for policy 0, policy_version 130554 (0.0031) [2024-06-13 01:44:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2138996736. Throughput: 0: 49105.3. Samples: 1667824120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 01:44:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:44:13,721][71000] Updated weights for policy 0, policy_version 130564 (0.0034) [2024-06-13 01:44:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2139242496. Throughput: 0: 49191.2. Samples: 1668118580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 01:44:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:44:17,162][71000] Updated weights for policy 0, policy_version 130574 (0.0038) [2024-06-13 01:44:20,323][71000] Updated weights for policy 0, policy_version 130584 (0.0029) [2024-06-13 01:44:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2139488256. Throughput: 0: 49562.2. Samples: 1668273700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 01:44:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:44:24,069][71000] Updated weights for policy 0, policy_version 130594 (0.0029) [2024-06-13 01:44:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2139734016. Throughput: 0: 49173.0. Samples: 1668562880. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 01:44:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:44:27,079][71000] Updated weights for policy 0, policy_version 130604 (0.0030) [2024-06-13 01:44:30,477][71000] Updated weights for policy 0, policy_version 130614 (0.0028) [2024-06-13 01:44:30,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 2139996160. Throughput: 0: 49433.3. Samples: 1668871020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 01:44:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:44:33,677][71000] Updated weights for policy 0, policy_version 130624 (0.0030) [2024-06-13 01:44:35,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2140241920. Throughput: 0: 49546.5. Samples: 1669020320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 01:44:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:44:36,873][71000] Updated weights for policy 0, policy_version 130634 (0.0035) [2024-06-13 01:44:40,346][71000] Updated weights for policy 0, policy_version 130644 (0.0031) [2024-06-13 01:44:40,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.2, 300 sec: 49263.1). Total num frames: 2140487680. Throughput: 0: 49367.6. Samples: 1669310700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 01:44:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:44:43,601][71000] Updated weights for policy 0, policy_version 130654 (0.0025) [2024-06-13 01:44:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2140717056. Throughput: 0: 49439.4. Samples: 1669602280. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 01:44:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 01:44:47,097][71000] Updated weights for policy 0, policy_version 130664 (0.0028) [2024-06-13 01:44:49,407][70980] Signal inference workers to stop experience collection... (24650 times) [2024-06-13 01:44:49,407][70980] Signal inference workers to resume experience collection... (24650 times) [2024-06-13 01:44:49,421][71000] InferenceWorker_p0-w0: stopping experience collection (24650 times) [2024-06-13 01:44:49,421][71000] InferenceWorker_p0-w0: resuming experience collection (24650 times) [2024-06-13 01:44:50,271][71000] Updated weights for policy 0, policy_version 130674 (0.0023) [2024-06-13 01:44:50,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2140979200. Throughput: 0: 49345.7. Samples: 1669748580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 01:44:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:44:53,516][71000] Updated weights for policy 0, policy_version 130684 (0.0033) [2024-06-13 01:44:55,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49698.1, 300 sec: 49207.6). Total num frames: 2141241344. Throughput: 0: 49572.0. Samples: 1670054860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:44:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:44:56,705][71000] Updated weights for policy 0, policy_version 130694 (0.0021) [2024-06-13 01:44:59,773][71000] Updated weights for policy 0, policy_version 130704 (0.0025) [2024-06-13 01:45:00,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 2141503488. Throughput: 0: 49869.3. Samples: 1670362700. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:45:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:45:02,856][71000] Updated weights for policy 0, policy_version 130714 (0.0028) [2024-06-13 01:45:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.0, 300 sec: 49263.1). Total num frames: 2141732864. Throughput: 0: 49786.2. Samples: 1670514080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:45:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:45:06,611][71000] Updated weights for policy 0, policy_version 130724 (0.0023) [2024-06-13 01:45:09,597][71000] Updated weights for policy 0, policy_version 130734 (0.0039) [2024-06-13 01:45:10,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2141978624. Throughput: 0: 50004.5. Samples: 1670813080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:45:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:45:13,078][71000] Updated weights for policy 0, policy_version 130744 (0.0020) [2024-06-13 01:45:15,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49971.2, 300 sec: 49318.6). Total num frames: 2142240768. Throughput: 0: 49506.3. Samples: 1671098800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:45:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:45:16,382][71000] Updated weights for policy 0, policy_version 130754 (0.0024) [2024-06-13 01:45:19,726][71000] Updated weights for policy 0, policy_version 130764 (0.0030) [2024-06-13 01:45:20,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49971.2, 300 sec: 49318.6). Total num frames: 2142486528. Throughput: 0: 49756.1. Samples: 1671259340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:45:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:45:22,721][71000] Updated weights for policy 0, policy_version 130774 (0.0031) [2024-06-13 01:45:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49698.2, 300 sec: 49207.6). Total num frames: 2142715904. Throughput: 0: 49962.2. Samples: 1671559000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:45:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 01:45:26,393][71000] Updated weights for policy 0, policy_version 130784 (0.0022) [2024-06-13 01:45:28,881][71000] Updated weights for policy 0, policy_version 130794 (0.0025) [2024-06-13 01:45:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49971.3, 300 sec: 49429.7). Total num frames: 2142994432. Throughput: 0: 50264.6. Samples: 1671864180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:45:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:45:33,006][71000] Updated weights for policy 0, policy_version 130804 (0.0039) [2024-06-13 01:45:35,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49971.3, 300 sec: 49374.2). Total num frames: 2143240192. Throughput: 0: 50283.6. Samples: 1672011340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:45:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:45:36,119][71000] Updated weights for policy 0, policy_version 130814 (0.0030) [2024-06-13 01:45:39,684][71000] Updated weights for policy 0, policy_version 130824 (0.0030) [2024-06-13 01:45:40,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49971.0, 300 sec: 49429.7). Total num frames: 2143485952. Throughput: 0: 50183.8. Samples: 1672313140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:45:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:45:40,963][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000130829_2143502336.pth... 
[2024-06-13 01:45:41,021][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000130106_2131656704.pth [2024-06-13 01:45:42,578][71000] Updated weights for policy 0, policy_version 130834 (0.0027) [2024-06-13 01:45:45,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49698.2, 300 sec: 49319.0). Total num frames: 2143698944. Throughput: 0: 49734.7. Samples: 1672600760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:45:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:45:46,450][71000] Updated weights for policy 0, policy_version 130844 (0.0025) [2024-06-13 01:45:49,316][71000] Updated weights for policy 0, policy_version 130854 (0.0024) [2024-06-13 01:45:50,940][70768] Fps is (10 sec: 49153.1, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 2143977472. Throughput: 0: 49474.7. Samples: 1672740440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 01:45:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:45:52,970][71000] Updated weights for policy 0, policy_version 130864 (0.0039) [2024-06-13 01:45:55,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 2144223232. Throughput: 0: 49477.7. Samples: 1673039580. Policy #0 lag: (min: 1.0, avg: 10.6, max: 26.0) [2024-06-13 01:45:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:45:55,998][71000] Updated weights for policy 0, policy_version 130874 (0.0032) [2024-06-13 01:45:59,297][70980] Signal inference workers to stop experience collection... (24700 times) [2024-06-13 01:45:59,324][71000] InferenceWorker_p0-w0: stopping experience collection (24700 times) [2024-06-13 01:45:59,356][70980] Signal inference workers to resume experience collection... 
(24700 times) [2024-06-13 01:45:59,356][71000] InferenceWorker_p0-w0: resuming experience collection (24700 times) [2024-06-13 01:45:59,645][71000] Updated weights for policy 0, policy_version 130884 (0.0031) [2024-06-13 01:46:00,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2144485376. Throughput: 0: 49824.8. Samples: 1673340920. Policy #0 lag: (min: 1.0, avg: 10.6, max: 26.0) [2024-06-13 01:46:00,942][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 01:46:02,519][71000] Updated weights for policy 0, policy_version 130894 (0.0032) [2024-06-13 01:46:05,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2144698368. Throughput: 0: 49453.8. Samples: 1673484760. Policy #0 lag: (min: 1.0, avg: 10.6, max: 26.0) [2024-06-13 01:46:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:46:06,569][71000] Updated weights for policy 0, policy_version 130904 (0.0020) [2024-06-13 01:46:08,807][71000] Updated weights for policy 0, policy_version 130914 (0.0031) [2024-06-13 01:46:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 2144960512. Throughput: 0: 49427.9. Samples: 1673783260. Policy #0 lag: (min: 1.0, avg: 10.6, max: 26.0) [2024-06-13 01:46:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:46:13,121][71000] Updated weights for policy 0, policy_version 130924 (0.0032) [2024-06-13 01:46:15,751][71000] Updated weights for policy 0, policy_version 130934 (0.0034) [2024-06-13 01:46:15,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 2145222656. Throughput: 0: 49160.5. Samples: 1674076400. Policy #0 lag: (min: 1.0, avg: 10.6, max: 26.0) [2024-06-13 01:46:15,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 01:46:16,050][70980] Saving new best policy, reward=0.292! 
[2024-06-13 01:46:19,617][71000] Updated weights for policy 0, policy_version 130944 (0.0020) [2024-06-13 01:46:20,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2145452032. Throughput: 0: 49408.8. Samples: 1674234740. Policy #0 lag: (min: 1.0, avg: 10.6, max: 26.0) [2024-06-13 01:46:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 01:46:22,349][71000] Updated weights for policy 0, policy_version 130954 (0.0039) [2024-06-13 01:46:25,940][70768] Fps is (10 sec: 44236.8, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2145665024. Throughput: 0: 49034.0. Samples: 1674519660. Policy #0 lag: (min: 1.0, avg: 10.6, max: 26.0) [2024-06-13 01:46:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:46:26,449][71000] Updated weights for policy 0, policy_version 130964 (0.0037) [2024-06-13 01:46:28,945][71000] Updated weights for policy 0, policy_version 130974 (0.0027) [2024-06-13 01:46:30,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2145959936. Throughput: 0: 49292.8. Samples: 1674818940. Policy #0 lag: (min: 1.0, avg: 10.6, max: 26.0) [2024-06-13 01:46:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:46:32,941][71000] Updated weights for policy 0, policy_version 130984 (0.0034) [2024-06-13 01:46:35,752][71000] Updated weights for policy 0, policy_version 130994 (0.0024) [2024-06-13 01:46:35,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2146205696. Throughput: 0: 49771.1. Samples: 1674980140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 26.0) [2024-06-13 01:46:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:46:39,611][71000] Updated weights for policy 0, policy_version 131004 (0.0021) [2024-06-13 01:46:40,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49698.3, 300 sec: 49429.7). Total num frames: 2146467840. Throughput: 0: 49902.3. Samples: 1675285180. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 26.0) [2024-06-13 01:46:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:46:42,281][71000] Updated weights for policy 0, policy_version 131014 (0.0025) [2024-06-13 01:46:45,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2146664448. Throughput: 0: 49644.1. Samples: 1675574900. Policy #0 lag: (min: 1.0, avg: 10.6, max: 26.0) [2024-06-13 01:46:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:46:46,140][71000] Updated weights for policy 0, policy_version 131024 (0.0023) [2024-06-13 01:46:48,806][71000] Updated weights for policy 0, policy_version 131034 (0.0026) [2024-06-13 01:46:50,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49698.0, 300 sec: 49540.8). Total num frames: 2146959360. Throughput: 0: 49475.8. Samples: 1675711180. Policy #0 lag: (min: 1.0, avg: 10.6, max: 26.0) [2024-06-13 01:46:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:46:52,985][71000] Updated weights for policy 0, policy_version 131044 (0.0029) [2024-06-13 01:46:55,623][71000] Updated weights for policy 0, policy_version 131054 (0.0029) [2024-06-13 01:46:55,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2147205120. Throughput: 0: 49443.7. Samples: 1676008220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 01:46:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:46:59,434][71000] Updated weights for policy 0, policy_version 131064 (0.0023) [2024-06-13 01:47:00,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2147450880. Throughput: 0: 49677.7. Samples: 1676311900. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 01:47:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:47:02,253][71000] Updated weights for policy 0, policy_version 131074 (0.0039) [2024-06-13 01:47:05,940][70768] Fps is (10 sec: 44236.4, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 2147647488. Throughput: 0: 49211.4. Samples: 1676449260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 01:47:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:47:06,210][71000] Updated weights for policy 0, policy_version 131084 (0.0030) [2024-06-13 01:47:07,076][70980] Signal inference workers to stop experience collection... (24750 times) [2024-06-13 01:47:07,110][71000] InferenceWorker_p0-w0: stopping experience collection (24750 times) [2024-06-13 01:47:07,184][70980] Signal inference workers to resume experience collection... (24750 times) [2024-06-13 01:47:07,185][71000] InferenceWorker_p0-w0: resuming experience collection (24750 times) [2024-06-13 01:47:08,838][71000] Updated weights for policy 0, policy_version 131094 (0.0025) [2024-06-13 01:47:10,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49698.3, 300 sec: 49540.8). Total num frames: 2147942400. Throughput: 0: 49465.8. Samples: 1676745620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 01:47:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:47:12,805][71000] Updated weights for policy 0, policy_version 131104 (0.0023) [2024-06-13 01:47:15,279][71000] Updated weights for policy 0, policy_version 131114 (0.0027) [2024-06-13 01:47:15,939][70768] Fps is (10 sec: 55706.6, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 2148204544. Throughput: 0: 49464.6. Samples: 1677044840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 01:47:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:47:19,181][71000] Updated weights for policy 0, policy_version 131124 (0.0038) [2024-06-13 01:47:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2148417536. Throughput: 0: 49492.4. Samples: 1677207300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 01:47:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:47:22,392][71000] Updated weights for policy 0, policy_version 131134 (0.0038) [2024-06-13 01:47:25,940][70768] Fps is (10 sec: 42598.0, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2148630528. Throughput: 0: 49005.7. Samples: 1677490440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 01:47:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:47:26,178][71000] Updated weights for policy 0, policy_version 131144 (0.0027) [2024-06-13 01:47:28,793][71000] Updated weights for policy 0, policy_version 131154 (0.0028) [2024-06-13 01:47:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2148925440. Throughput: 0: 49157.7. Samples: 1677787000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 01:47:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:47:32,568][71000] Updated weights for policy 0, policy_version 131164 (0.0025) [2024-06-13 01:47:35,093][71000] Updated weights for policy 0, policy_version 131174 (0.0028) [2024-06-13 01:47:35,940][70768] Fps is (10 sec: 55705.7, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2149187584. Throughput: 0: 49662.4. Samples: 1677945980. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 01:47:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:47:39,294][71000] Updated weights for policy 0, policy_version 131184 (0.0025) [2024-06-13 01:47:40,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49151.8, 300 sec: 49429.7). Total num frames: 2149416960. Throughput: 0: 49754.0. Samples: 1678247160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 01:47:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:47:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000131190_2149416960.pth... [2024-06-13 01:47:40,984][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000130465_2137538560.pth [2024-06-13 01:47:41,758][71000] Updated weights for policy 0, policy_version 131194 (0.0030) [2024-06-13 01:47:45,940][70768] Fps is (10 sec: 44236.9, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2149629952. Throughput: 0: 49615.7. Samples: 1678544600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 01:47:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:47:45,999][71000] Updated weights for policy 0, policy_version 131204 (0.0036) [2024-06-13 01:47:48,684][71000] Updated weights for policy 0, policy_version 131214 (0.0026) [2024-06-13 01:47:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2149908480. Throughput: 0: 49794.1. Samples: 1678690000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 01:47:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:47:52,282][71000] Updated weights for policy 0, policy_version 131224 (0.0038) [2024-06-13 01:47:55,019][71000] Updated weights for policy 0, policy_version 131234 (0.0023) [2024-06-13 01:47:55,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2150170624. Throughput: 0: 49663.5. Samples: 1678980480. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 01:47:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:47:58,913][71000] Updated weights for policy 0, policy_version 131244 (0.0038) [2024-06-13 01:48:00,940][70768] Fps is (10 sec: 50791.5, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2150416384. Throughput: 0: 49835.5. Samples: 1679287440. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 01:48:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:48:01,462][71000] Updated weights for policy 0, policy_version 131254 (0.0030) [2024-06-13 01:48:05,521][71000] Updated weights for policy 0, policy_version 131264 (0.0028) [2024-06-13 01:48:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 49540.8). Total num frames: 2150645760. Throughput: 0: 49376.9. Samples: 1679429260. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 01:48:05,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:48:08,262][71000] Updated weights for policy 0, policy_version 131274 (0.0028) [2024-06-13 01:48:10,941][70768] Fps is (10 sec: 47506.3, 60 sec: 49150.7, 300 sec: 49485.0). Total num frames: 2150891520. Throughput: 0: 49535.3. Samples: 1679719600. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 01:48:10,942][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:48:12,275][71000] Updated weights for policy 0, policy_version 131284 (0.0039) [2024-06-13 01:48:13,527][70980] Signal inference workers to stop experience collection... (24800 times) [2024-06-13 01:48:13,556][71000] InferenceWorker_p0-w0: stopping experience collection (24800 times) [2024-06-13 01:48:13,580][70980] Signal inference workers to resume experience collection... 
(24800 times) [2024-06-13 01:48:13,580][71000] InferenceWorker_p0-w0: resuming experience collection (24800 times) [2024-06-13 01:48:14,851][71000] Updated weights for policy 0, policy_version 131294 (0.0032) [2024-06-13 01:48:15,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49424.9, 300 sec: 49540.8). Total num frames: 2151170048. Throughput: 0: 49540.8. Samples: 1680016340. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 01:48:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:48:18,805][71000] Updated weights for policy 0, policy_version 131304 (0.0027) [2024-06-13 01:48:20,940][70768] Fps is (10 sec: 52435.9, 60 sec: 49971.1, 300 sec: 49596.3). Total num frames: 2151415808. Throughput: 0: 49579.9. Samples: 1680177080. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 01:48:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:48:21,522][71000] Updated weights for policy 0, policy_version 131314 (0.0026) [2024-06-13 01:48:25,374][71000] Updated weights for policy 0, policy_version 131324 (0.0038) [2024-06-13 01:48:25,939][70768] Fps is (10 sec: 45876.2, 60 sec: 49971.3, 300 sec: 49540.8). Total num frames: 2151628800. Throughput: 0: 49402.1. Samples: 1680470240. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 01:48:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:48:28,193][71000] Updated weights for policy 0, policy_version 131334 (0.0024) [2024-06-13 01:48:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 2151874560. Throughput: 0: 49258.5. Samples: 1680761240. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 01:48:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:48:31,718][71000] Updated weights for policy 0, policy_version 131344 (0.0022) [2024-06-13 01:48:34,743][71000] Updated weights for policy 0, policy_version 131354 (0.0041) [2024-06-13 01:48:35,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2152153088. Throughput: 0: 49511.7. Samples: 1680918020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 01:48:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:48:38,422][71000] Updated weights for policy 0, policy_version 131364 (0.0030) [2024-06-13 01:48:40,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.3, 300 sec: 49540.8). Total num frames: 2152382464. Throughput: 0: 49813.0. Samples: 1681222060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 01:48:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:48:41,529][71000] Updated weights for policy 0, policy_version 131374 (0.0029) [2024-06-13 01:48:45,300][71000] Updated weights for policy 0, policy_version 131384 (0.0026) [2024-06-13 01:48:45,940][70768] Fps is (10 sec: 44236.4, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2152595456. Throughput: 0: 49170.1. Samples: 1681500100. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 01:48:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:48:48,251][71000] Updated weights for policy 0, policy_version 131394 (0.0031) [2024-06-13 01:48:50,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.3, 300 sec: 49540.8). Total num frames: 2152873984. Throughput: 0: 49082.3. Samples: 1681637960. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 01:48:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:48:52,221][71000] Updated weights for policy 0, policy_version 131404 (0.0024) [2024-06-13 01:48:54,861][71000] Updated weights for policy 0, policy_version 131414 (0.0027) [2024-06-13 01:48:55,939][70768] Fps is (10 sec: 50791.3, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 2153103360. Throughput: 0: 49271.0. Samples: 1681936720. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 01:48:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:48:58,482][71000] Updated weights for policy 0, policy_version 131424 (0.0028) [2024-06-13 01:49:00,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49424.9, 300 sec: 49596.3). Total num frames: 2153381888. Throughput: 0: 49407.1. Samples: 1682239660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:49:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:49:01,605][71000] Updated weights for policy 0, policy_version 131434 (0.0025) [2024-06-13 01:49:04,873][71000] Updated weights for policy 0, policy_version 131444 (0.0038) [2024-06-13 01:49:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 2153578496. Throughput: 0: 49062.4. Samples: 1682384880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:49:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:49:08,110][71000] Updated weights for policy 0, policy_version 131454 (0.0042) [2024-06-13 01:49:10,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49426.3, 300 sec: 49540.8). Total num frames: 2153857024. Throughput: 0: 49040.8. Samples: 1682677080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:49:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:49:11,610][71000] Updated weights for policy 0, policy_version 131464 (0.0025) [2024-06-13 01:49:14,084][70980] Signal inference workers to stop experience collection... (24850 times) [2024-06-13 01:49:14,086][70980] Signal inference workers to resume experience collection... (24850 times) [2024-06-13 01:49:14,104][71000] InferenceWorker_p0-w0: stopping experience collection (24850 times) [2024-06-13 01:49:14,104][71000] InferenceWorker_p0-w0: resuming experience collection (24850 times) [2024-06-13 01:49:14,766][71000] Updated weights for policy 0, policy_version 131474 (0.0028) [2024-06-13 01:49:15,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48605.9, 300 sec: 49485.2). Total num frames: 2154086400. Throughput: 0: 49119.1. Samples: 1682971600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:49:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:49:17,951][71000] Updated weights for policy 0, policy_version 131484 (0.0024) [2024-06-13 01:49:20,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.1, 300 sec: 49596.3). Total num frames: 2154364928. Throughput: 0: 49221.0. Samples: 1683132960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:49:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:49:21,149][71000] Updated weights for policy 0, policy_version 131494 (0.0022) [2024-06-13 01:49:24,466][71000] Updated weights for policy 0, policy_version 131504 (0.0028) [2024-06-13 01:49:25,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2154577920. Throughput: 0: 49082.2. Samples: 1683430760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:49:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:49:27,897][71000] Updated weights for policy 0, policy_version 131514 (0.0030) [2024-06-13 01:49:30,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 2154856448. Throughput: 0: 49386.7. Samples: 1683722500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:49:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:49:31,317][71000] Updated weights for policy 0, policy_version 131524 (0.0037) [2024-06-13 01:49:34,647][71000] Updated weights for policy 0, policy_version 131534 (0.0034) [2024-06-13 01:49:35,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 2155102208. Throughput: 0: 49702.1. Samples: 1683874560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:49:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:49:37,886][71000] Updated weights for policy 0, policy_version 131544 (0.0027) [2024-06-13 01:49:40,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.1, 300 sec: 49651.9). Total num frames: 2155364352. Throughput: 0: 49761.7. Samples: 1684176000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:49:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:49:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000131553_2155364352.pth... [2024-06-13 01:49:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000130829_2143502336.pth [2024-06-13 01:49:41,166][71000] Updated weights for policy 0, policy_version 131554 (0.0027) [2024-06-13 01:49:44,363][71000] Updated weights for policy 0, policy_version 131564 (0.0035) [2024-06-13 01:49:45,939][70768] Fps is (10 sec: 45875.7, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2155560960. Throughput: 0: 49633.2. Samples: 1684473140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:49:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-13 01:49:47,889][71000] Updated weights for policy 0, policy_version 131574 (0.0026) [2024-06-13 01:49:50,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49424.8, 300 sec: 49485.2). Total num frames: 2155839488. Throughput: 0: 49407.7. Samples: 1684608240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:49:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:49:51,495][71000] Updated weights for policy 0, policy_version 131584 (0.0033) [2024-06-13 01:49:54,578][71000] Updated weights for policy 0, policy_version 131594 (0.0028) [2024-06-13 01:49:55,940][70768] Fps is (10 sec: 52427.6, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2156085248. Throughput: 0: 49291.4. Samples: 1684895200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:49:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:49:57,958][71000] Updated weights for policy 0, policy_version 131604 (0.0021) [2024-06-13 01:50:00,829][70980] Signal inference workers to stop experience collection... (24900 times) [2024-06-13 01:50:00,829][70980] Signal inference workers to resume experience collection... (24900 times) [2024-06-13 01:50:00,877][71000] InferenceWorker_p0-w0: stopping experience collection (24900 times) [2024-06-13 01:50:00,877][71000] InferenceWorker_p0-w0: resuming experience collection (24900 times) [2024-06-13 01:50:00,939][70768] Fps is (10 sec: 49153.3, 60 sec: 49152.2, 300 sec: 49485.2). Total num frames: 2156331008. Throughput: 0: 49633.0. Samples: 1685205080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:50:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:50:01,307][71000] Updated weights for policy 0, policy_version 131614 (0.0033) [2024-06-13 01:50:04,367][71000] Updated weights for policy 0, policy_version 131624 (0.0032) [2024-06-13 01:50:05,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 2156544000. Throughput: 0: 49064.3. Samples: 1685340860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 01:50:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:50:07,954][71000] Updated weights for policy 0, policy_version 131634 (0.0038) [2024-06-13 01:50:10,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2156822528. Throughput: 0: 49077.2. Samples: 1685639240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 01:50:10,949][70768] Avg episode reward: [(0, '0.267')] [2024-06-13 01:50:11,377][71000] Updated weights for policy 0, policy_version 131644 (0.0029) [2024-06-13 01:50:14,710][71000] Updated weights for policy 0, policy_version 131654 (0.0028) [2024-06-13 01:50:15,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2157068288. Throughput: 0: 49171.2. Samples: 1685935200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 01:50:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:50:17,914][71000] Updated weights for policy 0, policy_version 131664 (0.0035) [2024-06-13 01:50:20,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2157314048. Throughput: 0: 48999.2. Samples: 1686079520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 01:50:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:50:21,564][71000] Updated weights for policy 0, policy_version 131674 (0.0039) [2024-06-13 01:50:24,611][71000] Updated weights for policy 0, policy_version 131684 (0.0031) [2024-06-13 01:50:25,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2157527040. Throughput: 0: 48858.7. Samples: 1686374640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 01:50:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:50:28,178][71000] Updated weights for policy 0, policy_version 131694 (0.0028) [2024-06-13 01:50:30,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2157805568. Throughput: 0: 48797.5. Samples: 1686669040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 01:50:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:50:31,291][71000] Updated weights for policy 0, policy_version 131704 (0.0031) [2024-06-13 01:50:34,712][71000] Updated weights for policy 0, policy_version 131714 (0.0023) [2024-06-13 01:50:35,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2158051328. Throughput: 0: 49263.8. Samples: 1686825100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 01:50:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:50:37,963][71000] Updated weights for policy 0, policy_version 131724 (0.0030) [2024-06-13 01:50:40,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 2158297088. Throughput: 0: 49488.0. Samples: 1687122160. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 01:50:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:50:41,413][71000] Updated weights for policy 0, policy_version 131734 (0.0027) [2024-06-13 01:50:44,554][71000] Updated weights for policy 0, policy_version 131744 (0.0028) [2024-06-13 01:50:45,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2158526464. Throughput: 0: 49052.9. Samples: 1687412460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 01:50:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:50:47,966][71000] Updated weights for policy 0, policy_version 131754 (0.0021) [2024-06-13 01:50:50,927][71000] Updated weights for policy 0, policy_version 131764 (0.0025) [2024-06-13 01:50:50,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49698.3, 300 sec: 49485.2). Total num frames: 2158821376. Throughput: 0: 49421.9. Samples: 1687564840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 01:50:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:50:54,827][71000] Updated weights for policy 0, policy_version 131774 (0.0027) [2024-06-13 01:50:55,941][70768] Fps is (10 sec: 50780.8, 60 sec: 49150.6, 300 sec: 49318.3). Total num frames: 2159034368. Throughput: 0: 49322.1. Samples: 1687858820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 01:50:55,942][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:50:58,022][71000] Updated weights for policy 0, policy_version 131784 (0.0027) [2024-06-13 01:51:00,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2159280128. Throughput: 0: 49133.3. Samples: 1688146200. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 01:51:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:51:01,547][71000] Updated weights for policy 0, policy_version 131794 (0.0035) [2024-06-13 01:51:04,655][71000] Updated weights for policy 0, policy_version 131804 (0.0029) [2024-06-13 01:51:05,940][70768] Fps is (10 sec: 49161.2, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 2159525888. Throughput: 0: 49284.0. Samples: 1688297300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 01:51:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:51:08,126][71000] Updated weights for policy 0, policy_version 131814 (0.0020) [2024-06-13 01:51:10,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 2159788032. Throughput: 0: 49519.1. Samples: 1688603000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 01:51:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:51:10,982][71000] Updated weights for policy 0, policy_version 131824 (0.0025) [2024-06-13 01:51:14,806][71000] Updated weights for policy 0, policy_version 131834 (0.0034) [2024-06-13 01:51:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2160017408. Throughput: 0: 49469.9. Samples: 1688895180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 01:51:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:51:17,954][71000] Updated weights for policy 0, policy_version 131844 (0.0036) [2024-06-13 01:51:20,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48605.8, 300 sec: 49374.2). Total num frames: 2160230400. Throughput: 0: 48996.0. Samples: 1689029920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 01:51:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:51:21,643][71000] Updated weights for policy 0, policy_version 131854 (0.0035) [2024-06-13 01:51:23,046][70980] Signal inference workers to stop experience collection... (24950 times) [2024-06-13 01:51:23,076][71000] InferenceWorker_p0-w0: stopping experience collection (24950 times) [2024-06-13 01:51:23,098][70980] Signal inference workers to resume experience collection... (24950 times) [2024-06-13 01:51:23,099][71000] InferenceWorker_p0-w0: resuming experience collection (24950 times) [2024-06-13 01:51:24,486][71000] Updated weights for policy 0, policy_version 131864 (0.0038) [2024-06-13 01:51:25,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 2160508928. Throughput: 0: 48868.6. Samples: 1689321240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 01:51:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:51:28,230][71000] Updated weights for policy 0, policy_version 131874 (0.0026) [2024-06-13 01:51:30,939][70768] Fps is (10 sec: 54067.5, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 2160771072. Throughput: 0: 49205.3. Samples: 1689626700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 01:51:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:51:30,979][71000] Updated weights for policy 0, policy_version 131884 (0.0026) [2024-06-13 01:51:34,459][71000] Updated weights for policy 0, policy_version 131894 (0.0029) [2024-06-13 01:51:35,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2161000448. Throughput: 0: 49416.5. Samples: 1689788580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 01:51:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:51:37,760][71000] Updated weights for policy 0, policy_version 131904 (0.0045) [2024-06-13 01:51:40,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2161246208. Throughput: 0: 49177.1. Samples: 1690071700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 01:51:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:51:41,090][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000131913_2161262592.pth... [2024-06-13 01:51:41,137][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000131190_2149416960.pth [2024-06-13 01:51:41,683][71000] Updated weights for policy 0, policy_version 131914 (0.0029) [2024-06-13 01:51:44,609][71000] Updated weights for policy 0, policy_version 131924 (0.0023) [2024-06-13 01:51:45,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.2, 300 sec: 49318.7). Total num frames: 2161508352. Throughput: 0: 49329.4. Samples: 1690366020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 01:51:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:51:48,258][71000] Updated weights for policy 0, policy_version 131934 (0.0027) [2024-06-13 01:51:50,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2161754112. Throughput: 0: 49555.6. Samples: 1690527300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 01:51:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:51:51,063][71000] Updated weights for policy 0, policy_version 131944 (0.0029) [2024-06-13 01:51:54,763][71000] Updated weights for policy 0, policy_version 131954 (0.0029) [2024-06-13 01:51:55,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49699.7, 300 sec: 49374.2). Total num frames: 2162016256. Throughput: 0: 49514.7. Samples: 1690831160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 01:51:55,940][70768] Avg episode reward: [(0, '0.265')] [2024-06-13 01:51:57,678][71000] Updated weights for policy 0, policy_version 131964 (0.0027) [2024-06-13 01:52:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2162245632. Throughput: 0: 49428.3. Samples: 1691119460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 01:52:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:52:01,515][71000] Updated weights for policy 0, policy_version 131974 (0.0030) [2024-06-13 01:52:04,301][71000] Updated weights for policy 0, policy_version 131984 (0.0029) [2024-06-13 01:52:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 2162524160. Throughput: 0: 49615.0. Samples: 1691262600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 01:52:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:52:07,978][71000] Updated weights for policy 0, policy_version 131994 (0.0026) [2024-06-13 01:52:10,898][71000] Updated weights for policy 0, policy_version 132004 (0.0026) [2024-06-13 01:52:10,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2162753536. Throughput: 0: 49835.0. Samples: 1691563820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 01:52:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:52:14,484][71000] Updated weights for policy 0, policy_version 132014 (0.0025) [2024-06-13 01:52:15,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2162999296. Throughput: 0: 49830.5. Samples: 1691869080. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 01:52:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:52:17,643][71000] Updated weights for policy 0, policy_version 132024 (0.0025) [2024-06-13 01:52:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 2163228672. Throughput: 0: 49442.2. Samples: 1692013480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 01:52:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:52:21,039][71000] Updated weights for policy 0, policy_version 132034 (0.0027) [2024-06-13 01:52:24,367][71000] Updated weights for policy 0, policy_version 132044 (0.0022) [2024-06-13 01:52:25,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 2163507200. Throughput: 0: 49555.5. Samples: 1692301700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 01:52:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:52:27,911][71000] Updated weights for policy 0, policy_version 132054 (0.0030) [2024-06-13 01:52:30,442][70980] Signal inference workers to stop experience collection... (25000 times) [2024-06-13 01:52:30,443][70980] Signal inference workers to resume experience collection... (25000 times) [2024-06-13 01:52:30,487][71000] InferenceWorker_p0-w0: stopping experience collection (25000 times) [2024-06-13 01:52:30,487][71000] InferenceWorker_p0-w0: resuming experience collection (25000 times) [2024-06-13 01:52:30,821][71000] Updated weights for policy 0, policy_version 132064 (0.0031) [2024-06-13 01:52:30,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2163736576. Throughput: 0: 49630.5. Samples: 1692599400. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 01:52:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:52:34,283][71000] Updated weights for policy 0, policy_version 132074 (0.0030) [2024-06-13 01:52:35,944][70768] Fps is (10 sec: 47493.6, 60 sec: 49694.6, 300 sec: 49373.5). Total num frames: 2163982336. Throughput: 0: 49295.8. Samples: 1692745820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 01:52:35,944][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:52:37,424][71000] Updated weights for policy 0, policy_version 132084 (0.0037) [2024-06-13 01:52:40,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2164211712. Throughput: 0: 49084.5. Samples: 1693039960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 01:52:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:52:41,043][71000] Updated weights for policy 0, policy_version 132094 (0.0030) [2024-06-13 01:52:44,468][71000] Updated weights for policy 0, policy_version 132104 (0.0022) [2024-06-13 01:52:45,940][70768] Fps is (10 sec: 49173.1, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2164473856. Throughput: 0: 49265.9. Samples: 1693336420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 01:52:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:52:47,624][71000] Updated weights for policy 0, policy_version 132114 (0.0025) [2024-06-13 01:52:50,731][71000] Updated weights for policy 0, policy_version 132124 (0.0029) [2024-06-13 01:52:50,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 2164736000. Throughput: 0: 49621.9. Samples: 1693495580. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 01:52:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:52:53,999][71000] Updated weights for policy 0, policy_version 132134 (0.0020) [2024-06-13 01:52:55,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2164965376. Throughput: 0: 49610.8. Samples: 1693796300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 01:52:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:52:57,117][71000] Updated weights for policy 0, policy_version 132144 (0.0032) [2024-06-13 01:53:00,562][71000] Updated weights for policy 0, policy_version 132154 (0.0032) [2024-06-13 01:53:00,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2165211136. Throughput: 0: 49437.8. Samples: 1694093780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 01:53:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:53:04,327][71000] Updated weights for policy 0, policy_version 132164 (0.0027) [2024-06-13 01:53:05,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49152.0, 300 sec: 49429.9). Total num frames: 2165473280. Throughput: 0: 49384.4. Samples: 1694235780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 01:53:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:53:07,344][71000] Updated weights for policy 0, policy_version 132174 (0.0036) [2024-06-13 01:53:10,782][71000] Updated weights for policy 0, policy_version 132184 (0.0026) [2024-06-13 01:53:10,944][70768] Fps is (10 sec: 50769.3, 60 sec: 49421.6, 300 sec: 49317.9). Total num frames: 2165719040. Throughput: 0: 49687.8. Samples: 1694537860. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-13 01:53:10,944][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:53:13,895][71000] Updated weights for policy 0, policy_version 132194 (0.0026) [2024-06-13 01:53:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2165964800. Throughput: 0: 49893.6. Samples: 1694844620. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-13 01:53:15,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 01:53:17,001][71000] Updated weights for policy 0, policy_version 132204 (0.0028) [2024-06-13 01:53:20,279][71000] Updated weights for policy 0, policy_version 132214 (0.0020) [2024-06-13 01:53:20,940][70768] Fps is (10 sec: 49172.4, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2166210560. Throughput: 0: 49647.2. Samples: 1694979740. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-13 01:53:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:53:24,043][71000] Updated weights for policy 0, policy_version 132224 (0.0033) [2024-06-13 01:53:25,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2166456320. Throughput: 0: 49767.9. Samples: 1695279520. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-13 01:53:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:53:27,127][71000] Updated weights for policy 0, policy_version 132234 (0.0030) [2024-06-13 01:53:30,672][71000] Updated weights for policy 0, policy_version 132244 (0.0034) [2024-06-13 01:53:30,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2166702080. Throughput: 0: 49771.6. Samples: 1695576140. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-13 01:53:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:53:33,973][71000] Updated weights for policy 0, policy_version 132254 (0.0030) [2024-06-13 01:53:34,425][70980] Signal inference workers to stop experience collection... 
(25050 times) [2024-06-13 01:53:34,425][70980] Signal inference workers to resume experience collection... (25050 times) [2024-06-13 01:53:34,451][71000] InferenceWorker_p0-w0: stopping experience collection (25050 times) [2024-06-13 01:53:34,451][71000] InferenceWorker_p0-w0: resuming experience collection (25050 times) [2024-06-13 01:53:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49428.6, 300 sec: 49374.1). Total num frames: 2166947840. Throughput: 0: 49387.0. Samples: 1695718000. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-13 01:53:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:53:37,449][71000] Updated weights for policy 0, policy_version 132264 (0.0024) [2024-06-13 01:53:40,680][71000] Updated weights for policy 0, policy_version 132274 (0.0023) [2024-06-13 01:53:40,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49485.3). Total num frames: 2167193600. Throughput: 0: 49156.0. Samples: 1696008320. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-13 01:53:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:53:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000132275_2167193600.pth... [2024-06-13 01:53:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000131553_2155364352.pth [2024-06-13 01:53:44,220][71000] Updated weights for policy 0, policy_version 132284 (0.0025) [2024-06-13 01:53:45,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2167439360. Throughput: 0: 48963.8. Samples: 1696297140. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-13 01:53:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:53:47,245][71000] Updated weights for policy 0, policy_version 132294 (0.0029) [2024-06-13 01:53:50,770][71000] Updated weights for policy 0, policy_version 132304 (0.0033) [2024-06-13 01:53:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 49374.1). 
Total num frames: 2167668736. Throughput: 0: 49296.1. Samples: 1696454100. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-13 01:53:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:53:54,183][71000] Updated weights for policy 0, policy_version 132314 (0.0026) [2024-06-13 01:53:55,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 2167930880. Throughput: 0: 49150.3. Samples: 1696749420. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-13 01:53:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:53:57,513][71000] Updated weights for policy 0, policy_version 132324 (0.0031) [2024-06-13 01:54:00,646][71000] Updated weights for policy 0, policy_version 132334 (0.0025) [2024-06-13 01:54:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2168176640. Throughput: 0: 48805.4. Samples: 1697040860. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-13 01:54:00,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-13 01:54:04,104][71000] Updated weights for policy 0, policy_version 132344 (0.0029) [2024-06-13 01:54:05,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 2168422400. Throughput: 0: 49150.1. Samples: 1697191500. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-13 01:54:05,941][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:54:07,469][71000] Updated weights for policy 0, policy_version 132354 (0.0036) [2024-06-13 01:54:10,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48609.3, 300 sec: 49318.6). Total num frames: 2168635392. Throughput: 0: 48841.3. Samples: 1697477380. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-13 01:54:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:54:11,005][71000] Updated weights for policy 0, policy_version 132364 (0.0029) [2024-06-13 01:54:14,364][71000] Updated weights for policy 0, policy_version 132374 (0.0023) [2024-06-13 01:54:15,940][70768] Fps is (10 sec: 47514.9, 60 sec: 48879.1, 300 sec: 49263.1). Total num frames: 2168897536. Throughput: 0: 48829.3. Samples: 1697773460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:54:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 01:54:17,448][71000] Updated weights for policy 0, policy_version 132384 (0.0032) [2024-06-13 01:54:20,875][71000] Updated weights for policy 0, policy_version 132394 (0.0026) [2024-06-13 01:54:20,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.1, 300 sec: 49374.2). Total num frames: 2169143296. Throughput: 0: 48922.7. Samples: 1697919520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:54:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:54:24,003][71000] Updated weights for policy 0, policy_version 132404 (0.0031) [2024-06-13 01:54:25,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2169405440. Throughput: 0: 49104.9. Samples: 1698218040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:54:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 01:54:27,396][71000] Updated weights for policy 0, policy_version 132414 (0.0033) [2024-06-13 01:54:30,853][71000] Updated weights for policy 0, policy_version 132424 (0.0027) [2024-06-13 01:54:30,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.8, 300 sec: 49263.1). Total num frames: 2169634816. Throughput: 0: 49153.2. Samples: 1698509040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:54:30,946][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:54:34,183][71000] Updated weights for policy 0, policy_version 132434 (0.0036) [2024-06-13 01:54:35,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2169880576. Throughput: 0: 48925.3. Samples: 1698655740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:54:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 01:54:37,399][71000] Updated weights for policy 0, policy_version 132444 (0.0038) [2024-06-13 01:54:40,854][71000] Updated weights for policy 0, policy_version 132454 (0.0030) [2024-06-13 01:54:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 49374.1). Total num frames: 2170126336. Throughput: 0: 48832.2. Samples: 1698946860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:54:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:54:43,944][71000] Updated weights for policy 0, policy_version 132464 (0.0036) [2024-06-13 01:54:45,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2170388480. Throughput: 0: 48869.3. Samples: 1699239980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:54:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:54:47,372][71000] Updated weights for policy 0, policy_version 132474 (0.0030) [2024-06-13 01:54:50,737][71000] Updated weights for policy 0, policy_version 132484 (0.0031) [2024-06-13 01:54:50,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2170617856. Throughput: 0: 48941.9. Samples: 1699393880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:54:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:54:54,013][71000] Updated weights for policy 0, policy_version 132494 (0.0031) [2024-06-13 01:54:55,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48606.0, 300 sec: 49207.5). Total num frames: 2170847232. Throughput: 0: 49066.2. Samples: 1699685360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:54:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:54:56,990][70980] Signal inference workers to stop experience collection... (25100 times) [2024-06-13 01:54:56,990][70980] Signal inference workers to resume experience collection... (25100 times) [2024-06-13 01:54:57,000][71000] InferenceWorker_p0-w0: stopping experience collection (25100 times) [2024-06-13 01:54:57,011][71000] InferenceWorker_p0-w0: resuming experience collection (25100 times) [2024-06-13 01:54:57,262][71000] Updated weights for policy 0, policy_version 132504 (0.0036) [2024-06-13 01:55:00,570][71000] Updated weights for policy 0, policy_version 132514 (0.0032) [2024-06-13 01:55:00,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 2171109376. Throughput: 0: 49064.0. Samples: 1699981340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:55:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:55:04,099][71000] Updated weights for policy 0, policy_version 132524 (0.0032) [2024-06-13 01:55:05,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.2, 300 sec: 49263.1). Total num frames: 2171355136. Throughput: 0: 49276.9. Samples: 1700136980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:55:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:55:07,050][71000] Updated weights for policy 0, policy_version 132534 (0.0019) [2024-06-13 01:55:10,788][71000] Updated weights for policy 0, policy_version 132544 (0.0030) [2024-06-13 01:55:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2171600896. Throughput: 0: 49164.0. Samples: 1700430420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:55:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 01:55:13,802][71000] Updated weights for policy 0, policy_version 132554 (0.0032) [2024-06-13 01:55:15,944][70768] Fps is (10 sec: 47493.1, 60 sec: 48875.4, 300 sec: 49206.8). Total num frames: 2171830272. Throughput: 0: 49082.0. Samples: 1700717940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 01:55:15,944][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:55:17,359][71000] Updated weights for policy 0, policy_version 132564 (0.0031) [2024-06-13 01:55:20,291][71000] Updated weights for policy 0, policy_version 132574 (0.0027) [2024-06-13 01:55:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 2172092416. Throughput: 0: 49157.2. Samples: 1700867820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 01:55:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:55:24,005][71000] Updated weights for policy 0, policy_version 132584 (0.0025) [2024-06-13 01:55:25,940][70768] Fps is (10 sec: 50812.4, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2172338176. Throughput: 0: 49142.7. Samples: 1701158280. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 01:55:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:55:26,900][71000] Updated weights for policy 0, policy_version 132594 (0.0030) [2024-06-13 01:55:30,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2172567552. Throughput: 0: 49227.7. Samples: 1701455220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 01:55:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:55:31,034][71000] Updated weights for policy 0, policy_version 132604 (0.0026) [2024-06-13 01:55:33,817][71000] Updated weights for policy 0, policy_version 132614 (0.0025) [2024-06-13 01:55:35,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2172813312. Throughput: 0: 48832.0. Samples: 1701591320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 01:55:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:55:37,800][71000] Updated weights for policy 0, policy_version 132624 (0.0032) [2024-06-13 01:55:40,529][71000] Updated weights for policy 0, policy_version 132634 (0.0021) [2024-06-13 01:55:40,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2173091840. Throughput: 0: 48983.0. Samples: 1701889600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 01:55:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:55:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000132635_2173091840.pth... [2024-06-13 01:55:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000131913_2161262592.pth [2024-06-13 01:55:44,278][71000] Updated weights for policy 0, policy_version 132644 (0.0034) [2024-06-13 01:55:45,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2173321216. Throughput: 0: 49039.9. Samples: 1702188140. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 01:55:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:55:46,971][71000] Updated weights for policy 0, policy_version 132654 (0.0024) [2024-06-13 01:55:50,875][71000] Updated weights for policy 0, policy_version 132664 (0.0028) [2024-06-13 01:55:50,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.1, 300 sec: 49263.4). Total num frames: 2173566976. Throughput: 0: 49081.7. Samples: 1702345660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 01:55:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:55:53,471][71000] Updated weights for policy 0, policy_version 132674 (0.0027) [2024-06-13 01:55:55,939][70768] Fps is (10 sec: 45875.9, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2173779968. Throughput: 0: 48856.1. Samples: 1702628940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 01:55:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:55:57,507][71000] Updated weights for policy 0, policy_version 132684 (0.0025) [2024-06-13 01:56:00,279][71000] Updated weights for policy 0, policy_version 132694 (0.0032) [2024-06-13 01:56:00,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2174058496. Throughput: 0: 48900.3. Samples: 1702918240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 01:56:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:56:04,193][71000] Updated weights for policy 0, policy_version 132704 (0.0033) [2024-06-13 01:56:05,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2174304256. Throughput: 0: 49052.6. Samples: 1703075180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 01:56:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:56:07,418][71000] Updated weights for policy 0, policy_version 132714 (0.0030) [2024-06-13 01:56:10,675][71000] Updated weights for policy 0, policy_version 132724 (0.0031) [2024-06-13 01:56:10,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2174550016. Throughput: 0: 49266.7. Samples: 1703375280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 01:56:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 01:56:12,347][70980] Signal inference workers to stop experience collection... (25150 times) [2024-06-13 01:56:12,347][70980] Signal inference workers to resume experience collection... (25150 times) [2024-06-13 01:56:12,381][71000] InferenceWorker_p0-w0: stopping experience collection (25150 times) [2024-06-13 01:56:12,382][71000] InferenceWorker_p0-w0: resuming experience collection (25150 times) [2024-06-13 01:56:13,798][71000] Updated weights for policy 0, policy_version 132734 (0.0027) [2024-06-13 01:56:15,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48882.5, 300 sec: 49263.1). Total num frames: 2174763008. Throughput: 0: 49257.3. Samples: 1703671800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 01:56:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:56:17,465][71000] Updated weights for policy 0, policy_version 132744 (0.0020) [2024-06-13 01:56:20,561][71000] Updated weights for policy 0, policy_version 132754 (0.0027) [2024-06-13 01:56:20,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49152.0, 300 sec: 49263.0). Total num frames: 2175041536. Throughput: 0: 49281.8. Samples: 1703809000. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:56:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 01:56:24,149][71000] Updated weights for policy 0, policy_version 132764 (0.0026) [2024-06-13 01:56:25,939][70768] Fps is (10 sec: 54067.4, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2175303680. Throughput: 0: 49174.9. Samples: 1704102460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:56:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:56:27,345][71000] Updated weights for policy 0, policy_version 132774 (0.0026) [2024-06-13 01:56:30,687][71000] Updated weights for policy 0, policy_version 132784 (0.0029) [2024-06-13 01:56:30,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2175533056. Throughput: 0: 49229.0. Samples: 1704403440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:56:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:56:34,239][71000] Updated weights for policy 0, policy_version 132794 (0.0028) [2024-06-13 01:56:35,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.2, 300 sec: 49207.6). Total num frames: 2175762432. Throughput: 0: 48855.3. Samples: 1704544140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:56:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:56:37,376][71000] Updated weights for policy 0, policy_version 132804 (0.0025) [2024-06-13 01:56:40,658][71000] Updated weights for policy 0, policy_version 132814 (0.0029) [2024-06-13 01:56:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2176024576. Throughput: 0: 49263.4. Samples: 1704845800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:56:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 01:56:44,148][71000] Updated weights for policy 0, policy_version 132824 (0.0028) [2024-06-13 01:56:45,940][70768] Fps is (10 sec: 50789.1, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2176270336. Throughput: 0: 49006.9. Samples: 1705123560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:56:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:56:47,747][71000] Updated weights for policy 0, policy_version 132834 (0.0028) [2024-06-13 01:56:50,939][70768] Fps is (10 sec: 47514.3, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2176499712. Throughput: 0: 48906.7. Samples: 1705275980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:56:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:56:51,031][71000] Updated weights for policy 0, policy_version 132844 (0.0031) [2024-06-13 01:56:54,602][71000] Updated weights for policy 0, policy_version 132854 (0.0036) [2024-06-13 01:56:55,939][70768] Fps is (10 sec: 44237.8, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2176712704. Throughput: 0: 48632.4. Samples: 1705563740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:56:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:56:57,989][71000] Updated weights for policy 0, policy_version 132864 (0.0028) [2024-06-13 01:57:00,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 2176974848. Throughput: 0: 48451.5. Samples: 1705852120. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:57:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:57:01,256][71000] Updated weights for policy 0, policy_version 132874 (0.0036) [2024-06-13 01:57:04,441][71000] Updated weights for policy 0, policy_version 132884 (0.0034) [2024-06-13 01:57:05,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2177253376. Throughput: 0: 48798.4. Samples: 1706004920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:57:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:57:07,730][71000] Updated weights for policy 0, policy_version 132894 (0.0031) [2024-06-13 01:57:10,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2177482752. Throughput: 0: 48884.8. Samples: 1706302280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:57:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 01:57:11,093][71000] Updated weights for policy 0, policy_version 132904 (0.0021) [2024-06-13 01:57:14,751][71000] Updated weights for policy 0, policy_version 132914 (0.0035) [2024-06-13 01:57:15,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2177712128. Throughput: 0: 48813.3. Samples: 1706600040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:57:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:57:17,858][71000] Updated weights for policy 0, policy_version 132924 (0.0034) [2024-06-13 01:57:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 2177957888. Throughput: 0: 48759.4. Samples: 1706738320. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 01:57:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:57:21,320][71000] Updated weights for policy 0, policy_version 132934 (0.0032) [2024-06-13 01:57:23,185][70980] Signal inference workers to stop experience collection... (25200 times) [2024-06-13 01:57:23,185][70980] Signal inference workers to resume experience collection... (25200 times) [2024-06-13 01:57:23,201][71000] InferenceWorker_p0-w0: stopping experience collection (25200 times) [2024-06-13 01:57:23,202][71000] InferenceWorker_p0-w0: resuming experience collection (25200 times) [2024-06-13 01:57:24,283][71000] Updated weights for policy 0, policy_version 132944 (0.0033) [2024-06-13 01:57:25,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2178236416. Throughput: 0: 48585.4. Samples: 1707032140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 01:57:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 01:57:28,097][71000] Updated weights for policy 0, policy_version 132954 (0.0037) [2024-06-13 01:57:30,944][70768] Fps is (10 sec: 49131.5, 60 sec: 48602.5, 300 sec: 49040.9). Total num frames: 2178449408. Throughput: 0: 48946.3. Samples: 1707326340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 01:57:30,944][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 01:57:31,225][71000] Updated weights for policy 0, policy_version 132964 (0.0036) [2024-06-13 01:57:34,823][71000] Updated weights for policy 0, policy_version 132974 (0.0025) [2024-06-13 01:57:35,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 2178695168. Throughput: 0: 48660.8. Samples: 1707465720. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 01:57:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:57:37,805][71000] Updated weights for policy 0, policy_version 132984 (0.0029) [2024-06-13 01:57:40,940][70768] Fps is (10 sec: 47533.6, 60 sec: 48332.9, 300 sec: 48985.4). Total num frames: 2178924544. Throughput: 0: 48809.3. Samples: 1707760160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 01:57:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:57:41,020][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000132992_2178940928.pth... [2024-06-13 01:57:41,064][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000132275_2167193600.pth [2024-06-13 01:57:41,614][71000] Updated weights for policy 0, policy_version 132994 (0.0026) [2024-06-13 01:57:44,582][71000] Updated weights for policy 0, policy_version 133004 (0.0038) [2024-06-13 01:57:45,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.1, 300 sec: 49096.4). Total num frames: 2179219456. Throughput: 0: 48961.2. Samples: 1708055380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 01:57:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:57:48,253][71000] Updated weights for policy 0, policy_version 133014 (0.0028) [2024-06-13 01:57:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2179432448. Throughput: 0: 48965.3. Samples: 1708208360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 01:57:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:57:51,182][71000] Updated weights for policy 0, policy_version 133024 (0.0024) [2024-06-13 01:57:54,796][71000] Updated weights for policy 0, policy_version 133034 (0.0028) [2024-06-13 01:57:55,939][70768] Fps is (10 sec: 47514.6, 60 sec: 49698.2, 300 sec: 49096.5). Total num frames: 2179694592. Throughput: 0: 49176.6. Samples: 1708515220. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 01:57:55,940][70768] Avg episode reward: [(0, '0.251')] [2024-06-13 01:57:57,865][71000] Updated weights for policy 0, policy_version 133044 (0.0024) [2024-06-13 01:58:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 2179907584. Throughput: 0: 48997.3. Samples: 1708804920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 01:58:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:58:01,318][71000] Updated weights for policy 0, policy_version 133054 (0.0029) [2024-06-13 01:58:04,446][71000] Updated weights for policy 0, policy_version 133064 (0.0029) [2024-06-13 01:58:05,939][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 49041.7). Total num frames: 2180186112. Throughput: 0: 49313.5. Samples: 1708957420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 01:58:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:58:08,195][71000] Updated weights for policy 0, policy_version 133074 (0.0040) [2024-06-13 01:58:10,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.0, 300 sec: 49041.0). Total num frames: 2180431872. Throughput: 0: 49311.1. Samples: 1709251140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 01:58:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:58:11,054][71000] Updated weights for policy 0, policy_version 133084 (0.0027) [2024-06-13 01:58:14,650][71000] Updated weights for policy 0, policy_version 133094 (0.0023) [2024-06-13 01:58:15,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49698.0, 300 sec: 49096.5). Total num frames: 2180694016. Throughput: 0: 49413.3. Samples: 1709549740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 01:58:15,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 01:58:17,806][70980] Signal inference workers to stop experience collection... 
(25250 times) [2024-06-13 01:58:17,806][70980] Signal inference workers to resume experience collection... (25250 times) [2024-06-13 01:58:17,806][71000] Updated weights for policy 0, policy_version 133104 (0.0032) [2024-06-13 01:58:17,847][71000] InferenceWorker_p0-w0: stopping experience collection (25250 times) [2024-06-13 01:58:17,847][71000] InferenceWorker_p0-w0: resuming experience collection (25250 times) [2024-06-13 01:58:20,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 2180907008. Throughput: 0: 49471.9. Samples: 1709691960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 01:58:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 01:58:21,402][71000] Updated weights for policy 0, policy_version 133114 (0.0026) [2024-06-13 01:58:24,297][71000] Updated weights for policy 0, policy_version 133124 (0.0037) [2024-06-13 01:58:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 2181169152. Throughput: 0: 49539.4. Samples: 1709989440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 01:58:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:58:28,238][71000] Updated weights for policy 0, policy_version 133134 (0.0031) [2024-06-13 01:58:30,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49428.4, 300 sec: 49040.9). Total num frames: 2181414912. Throughput: 0: 49518.2. Samples: 1710283700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 01:58:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:58:31,078][71000] Updated weights for policy 0, policy_version 133144 (0.0030) [2024-06-13 01:58:34,690][71000] Updated weights for policy 0, policy_version 133154 (0.0030) [2024-06-13 01:58:35,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2181660672. Throughput: 0: 49412.1. Samples: 1710431900. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 01:58:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 01:58:37,658][71000] Updated weights for policy 0, policy_version 133164 (0.0024) [2024-06-13 01:58:40,939][70768] Fps is (10 sec: 47514.6, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 2181890048. Throughput: 0: 49077.8. Samples: 1710723720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 01:58:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 01:58:41,579][71000] Updated weights for policy 0, policy_version 133174 (0.0030) [2024-06-13 01:58:44,386][71000] Updated weights for policy 0, policy_version 133184 (0.0025) [2024-06-13 01:58:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2182152192. Throughput: 0: 49158.2. Samples: 1711017040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 01:58:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:58:48,162][71000] Updated weights for policy 0, policy_version 133194 (0.0033) [2024-06-13 01:58:50,938][71000] Updated weights for policy 0, policy_version 133204 (0.0024) [2024-06-13 01:58:50,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49698.2, 300 sec: 49096.5). Total num frames: 2182414336. Throughput: 0: 49169.3. Samples: 1711170040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 01:58:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 01:58:54,646][71000] Updated weights for policy 0, policy_version 133214 (0.0026) [2024-06-13 01:58:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2182660096. Throughput: 0: 49344.4. Samples: 1711471640. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 01:58:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:58:57,674][71000] Updated weights for policy 0, policy_version 133224 (0.0024) [2024-06-13 01:59:00,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 2182873088. Throughput: 0: 49125.0. Samples: 1711760360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 01:59:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:59:01,212][71000] Updated weights for policy 0, policy_version 133234 (0.0036) [2024-06-13 01:59:04,239][71000] Updated weights for policy 0, policy_version 133244 (0.0029) [2024-06-13 01:59:05,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 2183118848. Throughput: 0: 49111.1. Samples: 1711901960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 01:59:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:59:08,221][71000] Updated weights for policy 0, policy_version 133254 (0.0033) [2024-06-13 01:59:10,867][71000] Updated weights for policy 0, policy_version 133264 (0.0028) [2024-06-13 01:59:10,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 2183397376. Throughput: 0: 48893.8. Samples: 1712189660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 01:59:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:59:14,897][71000] Updated weights for policy 0, policy_version 133274 (0.0026) [2024-06-13 01:59:15,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2183610368. Throughput: 0: 49036.1. Samples: 1712490320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 01:59:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:59:17,490][71000] Updated weights for policy 0, policy_version 133284 (0.0027) [2024-06-13 01:59:20,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2183856128. Throughput: 0: 48979.9. Samples: 1712636000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 01:59:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:59:21,307][71000] Updated weights for policy 0, policy_version 133294 (0.0030) [2024-06-13 01:59:24,408][71000] Updated weights for policy 0, policy_version 133304 (0.0027) [2024-06-13 01:59:25,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2184101888. Throughput: 0: 48826.4. Samples: 1712920920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 01:59:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:59:28,381][71000] Updated weights for policy 0, policy_version 133314 (0.0034) [2024-06-13 01:59:30,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 2184364032. Throughput: 0: 48800.8. Samples: 1713213080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 01:59:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 01:59:31,119][71000] Updated weights for policy 0, policy_version 133324 (0.0032) [2024-06-13 01:59:34,766][71000] Updated weights for policy 0, policy_version 133334 (0.0027) [2024-06-13 01:59:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 2184593408. Throughput: 0: 48914.0. Samples: 1713371180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 01:59:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 01:59:37,451][71000] Updated weights for policy 0, policy_version 133344 (0.0026) [2024-06-13 01:59:40,471][70980] Signal inference workers to stop experience collection... 
(25300 times) [2024-06-13 01:59:40,473][70980] Signal inference workers to resume experience collection... (25300 times) [2024-06-13 01:59:40,482][71000] InferenceWorker_p0-w0: stopping experience collection (25300 times) [2024-06-13 01:59:40,520][71000] InferenceWorker_p0-w0: resuming experience collection (25300 times) [2024-06-13 01:59:40,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.1, 300 sec: 49041.0). Total num frames: 2184855552. Throughput: 0: 49019.2. Samples: 1713677500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 01:59:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:59:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000133353_2184855552.pth... [2024-06-13 01:59:40,995][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000132635_2173091840.pth [2024-06-13 01:59:41,241][71000] Updated weights for policy 0, policy_version 133354 (0.0028) [2024-06-13 01:59:44,452][71000] Updated weights for policy 0, policy_version 133364 (0.0025) [2024-06-13 01:59:45,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 2185068544. Throughput: 0: 48953.7. Samples: 1713963280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 01:59:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 01:59:47,830][71000] Updated weights for policy 0, policy_version 133374 (0.0029) [2024-06-13 01:59:50,934][71000] Updated weights for policy 0, policy_version 133384 (0.0027) [2024-06-13 01:59:50,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2185363456. Throughput: 0: 48978.8. Samples: 1714106000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 01:59:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 01:59:54,554][71000] Updated weights for policy 0, policy_version 133394 (0.0019) [2024-06-13 01:59:55,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48605.9, 300 sec: 49040.9). 
Total num frames: 2185576448. Throughput: 0: 49255.7. Samples: 1714406160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 01:59:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 01:59:57,567][71000] Updated weights for policy 0, policy_version 133404 (0.0026) [2024-06-13 02:00:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2185838592. Throughput: 0: 49247.6. Samples: 1714706460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 02:00:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:00:01,291][71000] Updated weights for policy 0, policy_version 133414 (0.0028) [2024-06-13 02:00:04,104][71000] Updated weights for policy 0, policy_version 133424 (0.0030) [2024-06-13 02:00:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2186051584. Throughput: 0: 49079.2. Samples: 1714844560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 02:00:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:00:07,743][71000] Updated weights for policy 0, policy_version 133434 (0.0029) [2024-06-13 02:00:10,734][71000] Updated weights for policy 0, policy_version 133444 (0.0028) [2024-06-13 02:00:10,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.0, 300 sec: 49208.2). Total num frames: 2186346496. Throughput: 0: 49504.0. Samples: 1715148600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 02:00:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:00:14,678][71000] Updated weights for policy 0, policy_version 133454 (0.0036) [2024-06-13 02:00:15,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2186543104. Throughput: 0: 49388.6. Samples: 1715435560. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 02:00:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:00:17,623][71000] Updated weights for policy 0, policy_version 133464 (0.0026) [2024-06-13 02:00:20,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2186805248. Throughput: 0: 48998.7. Samples: 1715576120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 02:00:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:00:21,315][71000] Updated weights for policy 0, policy_version 133474 (0.0030) [2024-06-13 02:00:24,211][71000] Updated weights for policy 0, policy_version 133484 (0.0024) [2024-06-13 02:00:25,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2187034624. Throughput: 0: 48692.2. Samples: 1715868660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 02:00:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 02:00:27,733][71000] Updated weights for policy 0, policy_version 133494 (0.0023) [2024-06-13 02:00:30,726][71000] Updated weights for policy 0, policy_version 133504 (0.0036) [2024-06-13 02:00:30,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2187329536. Throughput: 0: 49085.3. Samples: 1716172120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-13 02:00:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:00:34,467][71000] Updated weights for policy 0, policy_version 133514 (0.0025) [2024-06-13 02:00:35,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2187542528. Throughput: 0: 49265.3. Samples: 1716322940. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-13 02:00:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:00:37,601][71000] Updated weights for policy 0, policy_version 133524 (0.0036) [2024-06-13 02:00:40,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2187788288. Throughput: 0: 49176.9. Samples: 1716619120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-13 02:00:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:00:41,290][71000] Updated weights for policy 0, policy_version 133534 (0.0032) [2024-06-13 02:00:44,409][71000] Updated weights for policy 0, policy_version 133544 (0.0029) [2024-06-13 02:00:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2188034048. Throughput: 0: 48931.0. Samples: 1716908360. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-13 02:00:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:00:47,244][70980] Signal inference workers to stop experience collection... (25350 times) [2024-06-13 02:00:47,289][71000] InferenceWorker_p0-w0: stopping experience collection (25350 times) [2024-06-13 02:00:47,298][70980] Signal inference workers to resume experience collection... (25350 times) [2024-06-13 02:00:47,308][71000] InferenceWorker_p0-w0: resuming experience collection (25350 times) [2024-06-13 02:00:48,102][71000] Updated weights for policy 0, policy_version 133554 (0.0020) [2024-06-13 02:00:50,846][71000] Updated weights for policy 0, policy_version 133564 (0.0026) [2024-06-13 02:00:50,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2188312576. Throughput: 0: 49192.9. Samples: 1717058240. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-13 02:00:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:00:54,563][71000] Updated weights for policy 0, policy_version 133574 (0.0026) [2024-06-13 02:00:55,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2188525568. Throughput: 0: 48973.0. Samples: 1717352380. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-13 02:00:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:00:57,858][71000] Updated weights for policy 0, policy_version 133584 (0.0025) [2024-06-13 02:01:00,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2188771328. Throughput: 0: 49355.0. Samples: 1717656540. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-13 02:01:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:01:01,238][71000] Updated weights for policy 0, policy_version 133594 (0.0024) [2024-06-13 02:01:04,685][71000] Updated weights for policy 0, policy_version 133604 (0.0025) [2024-06-13 02:01:05,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49424.9, 300 sec: 49040.9). Total num frames: 2189017088. Throughput: 0: 49207.1. Samples: 1717790440. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-13 02:01:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:01:07,858][71000] Updated weights for policy 0, policy_version 133614 (0.0032) [2024-06-13 02:01:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2189279232. Throughput: 0: 49450.3. Samples: 1718093920. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-13 02:01:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:01:11,056][71000] Updated weights for policy 0, policy_version 133624 (0.0026) [2024-06-13 02:01:14,622][71000] Updated weights for policy 0, policy_version 133634 (0.0029) [2024-06-13 02:01:15,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 2189524992. Throughput: 0: 49255.6. Samples: 1718388620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-13 02:01:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:01:17,868][71000] Updated weights for policy 0, policy_version 133644 (0.0029) [2024-06-13 02:01:20,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2189754368. Throughput: 0: 49177.8. Samples: 1718535940. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-13 02:01:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:01:21,321][71000] Updated weights for policy 0, policy_version 133654 (0.0021) [2024-06-13 02:01:24,430][71000] Updated weights for policy 0, policy_version 133664 (0.0024) [2024-06-13 02:01:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.2, 300 sec: 49040.9). Total num frames: 2190000128. Throughput: 0: 49265.8. Samples: 1718836080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-13 02:01:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:01:27,709][71000] Updated weights for policy 0, policy_version 133674 (0.0030) [2024-06-13 02:01:30,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2190262272. Throughput: 0: 49398.6. Samples: 1719131300. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-13 02:01:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:01:31,067][71000] Updated weights for policy 0, policy_version 133684 (0.0032) [2024-06-13 02:01:34,524][71000] Updated weights for policy 0, policy_version 133694 (0.0025) [2024-06-13 02:01:35,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49698.0, 300 sec: 49152.0). Total num frames: 2190524416. Throughput: 0: 49433.6. Samples: 1719282760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 02:01:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:01:37,361][71000] Updated weights for policy 0, policy_version 133704 (0.0030) [2024-06-13 02:01:40,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2190753792. Throughput: 0: 49552.4. Samples: 1719582240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 02:01:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:01:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000133713_2190753792.pth... [2024-06-13 02:01:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000132992_2178940928.pth [2024-06-13 02:01:41,264][71000] Updated weights for policy 0, policy_version 133714 (0.0030) [2024-06-13 02:01:44,468][71000] Updated weights for policy 0, policy_version 133724 (0.0026) [2024-06-13 02:01:45,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2190999552. Throughput: 0: 49238.2. Samples: 1719872260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 02:01:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:01:47,848][71000] Updated weights for policy 0, policy_version 133734 (0.0041) [2024-06-13 02:01:50,850][71000] Updated weights for policy 0, policy_version 133744 (0.0031) [2024-06-13 02:01:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49318.6). 
Total num frames: 2191261696. Throughput: 0: 49567.7. Samples: 1720020980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 02:01:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:01:54,481][71000] Updated weights for policy 0, policy_version 133754 (0.0028) [2024-06-13 02:01:55,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2191507456. Throughput: 0: 49630.8. Samples: 1720327300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 02:01:55,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-13 02:01:57,635][71000] Updated weights for policy 0, policy_version 133764 (0.0030) [2024-06-13 02:02:00,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2191720448. Throughput: 0: 49330.7. Samples: 1720608500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 02:02:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:02:01,415][71000] Updated weights for policy 0, policy_version 133774 (0.0029) [2024-06-13 02:02:04,468][71000] Updated weights for policy 0, policy_version 133784 (0.0035) [2024-06-13 02:02:05,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2191982592. Throughput: 0: 49294.6. Samples: 1720754200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 02:02:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:02:07,769][71000] Updated weights for policy 0, policy_version 133794 (0.0031) [2024-06-13 02:02:10,862][71000] Updated weights for policy 0, policy_version 133804 (0.0025) [2024-06-13 02:02:10,888][70980] Signal inference workers to stop experience collection... (25400 times) [2024-06-13 02:02:10,888][70980] Signal inference workers to resume experience collection... 
(25400 times) [2024-06-13 02:02:10,935][71000] InferenceWorker_p0-w0: stopping experience collection (25400 times) [2024-06-13 02:02:10,935][71000] InferenceWorker_p0-w0: resuming experience collection (25400 times) [2024-06-13 02:02:10,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2192244736. Throughput: 0: 49246.6. Samples: 1721052180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 02:02:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:02:14,426][71000] Updated weights for policy 0, policy_version 133814 (0.0030) [2024-06-13 02:02:15,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2192490496. Throughput: 0: 49456.6. Samples: 1721356840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 02:02:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:02:17,743][71000] Updated weights for policy 0, policy_version 133824 (0.0031) [2024-06-13 02:02:20,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2192703488. Throughput: 0: 49171.7. Samples: 1721495480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 02:02:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:02:21,256][71000] Updated weights for policy 0, policy_version 133834 (0.0027) [2024-06-13 02:02:24,354][71000] Updated weights for policy 0, policy_version 133844 (0.0024) [2024-06-13 02:02:25,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.1, 300 sec: 49208.2). Total num frames: 2192965632. Throughput: 0: 48941.8. Samples: 1721784620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 02:02:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:02:27,862][71000] Updated weights for policy 0, policy_version 133854 (0.0022) [2024-06-13 02:02:30,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2193211392. Throughput: 0: 49085.8. Samples: 1722081120. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 02:02:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:02:31,131][71000] Updated weights for policy 0, policy_version 133864 (0.0026) [2024-06-13 02:02:34,282][71000] Updated weights for policy 0, policy_version 133874 (0.0027) [2024-06-13 02:02:35,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.1, 300 sec: 49263.1). Total num frames: 2193457152. Throughput: 0: 49206.3. Samples: 1722235260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 02:02:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:02:37,612][71000] Updated weights for policy 0, policy_version 133884 (0.0033) [2024-06-13 02:02:40,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2193702912. Throughput: 0: 48844.4. Samples: 1722525300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 02:02:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:02:41,071][71000] Updated weights for policy 0, policy_version 133894 (0.0031) [2024-06-13 02:02:44,457][71000] Updated weights for policy 0, policy_version 133904 (0.0025) [2024-06-13 02:02:45,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2193948672. Throughput: 0: 49061.3. Samples: 1722816260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 02:02:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:02:48,061][71000] Updated weights for policy 0, policy_version 133914 (0.0038) [2024-06-13 02:02:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2194194432. Throughput: 0: 49035.1. Samples: 1722960780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 02:02:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:02:51,162][71000] Updated weights for policy 0, policy_version 133924 (0.0036) [2024-06-13 02:02:54,514][71000] Updated weights for policy 0, policy_version 133934 (0.0027) [2024-06-13 02:02:55,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2194440192. Throughput: 0: 49157.0. Samples: 1723264240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 02:02:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:02:57,606][71000] Updated weights for policy 0, policy_version 133944 (0.0027) [2024-06-13 02:03:00,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2194685952. Throughput: 0: 48948.0. Samples: 1723559500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 02:03:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:03:01,169][71000] Updated weights for policy 0, policy_version 133954 (0.0032) [2024-06-13 02:03:04,223][71000] Updated weights for policy 0, policy_version 133964 (0.0033) [2024-06-13 02:03:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2194915328. Throughput: 0: 48983.1. Samples: 1723699720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 02:03:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 02:03:07,742][71000] Updated weights for policy 0, policy_version 133974 (0.0028) [2024-06-13 02:03:10,721][71000] Updated weights for policy 0, policy_version 133984 (0.0028) [2024-06-13 02:03:10,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2195193856. Throughput: 0: 49286.1. Samples: 1724002500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 02:03:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:03:14,510][71000] Updated weights for policy 0, policy_version 133994 (0.0025) [2024-06-13 02:03:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2195406848. Throughput: 0: 49137.3. Samples: 1724292300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 02:03:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:03:17,358][71000] Updated weights for policy 0, policy_version 134004 (0.0021) [2024-06-13 02:03:20,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2195652608. Throughput: 0: 49074.6. Samples: 1724443620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 02:03:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:03:21,137][71000] Updated weights for policy 0, policy_version 134014 (0.0022) [2024-06-13 02:03:24,252][71000] Updated weights for policy 0, policy_version 134024 (0.0027) [2024-06-13 02:03:25,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2195898368. Throughput: 0: 49011.6. Samples: 1724730820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 02:03:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:03:27,763][71000] Updated weights for policy 0, policy_version 134034 (0.0028) [2024-06-13 02:03:30,618][70980] Signal inference workers to stop experience collection... (25450 times) [2024-06-13 02:03:30,646][71000] InferenceWorker_p0-w0: stopping experience collection (25450 times) [2024-06-13 02:03:30,675][70980] Signal inference workers to resume experience collection... 
(25450 times) [2024-06-13 02:03:30,675][71000] InferenceWorker_p0-w0: resuming experience collection (25450 times) [2024-06-13 02:03:30,804][71000] Updated weights for policy 0, policy_version 134044 (0.0029) [2024-06-13 02:03:30,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2196176896. Throughput: 0: 49091.1. Samples: 1725025360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 02:03:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:03:34,314][71000] Updated weights for policy 0, policy_version 134054 (0.0021) [2024-06-13 02:03:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2196406272. Throughput: 0: 49296.5. Samples: 1725179120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 02:03:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:03:37,340][71000] Updated weights for policy 0, policy_version 134064 (0.0028) [2024-06-13 02:03:40,917][71000] Updated weights for policy 0, policy_version 134074 (0.0024) [2024-06-13 02:03:40,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2196668416. Throughput: 0: 49451.1. Samples: 1725489540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:03:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:03:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000134074_2196668416.pth... [2024-06-13 02:03:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000133353_2184855552.pth [2024-06-13 02:03:44,046][71000] Updated weights for policy 0, policy_version 134084 (0.0024) [2024-06-13 02:03:45,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2196897792. Throughput: 0: 49369.2. Samples: 1725781120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:03:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:03:47,810][71000] Updated weights for policy 0, policy_version 134094 (0.0028) [2024-06-13 02:03:50,775][71000] Updated weights for policy 0, policy_version 134104 (0.0031) [2024-06-13 02:03:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2197159936. Throughput: 0: 49268.0. Samples: 1725916780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:03:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:03:54,654][71000] Updated weights for policy 0, policy_version 134114 (0.0027) [2024-06-13 02:03:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2197389312. Throughput: 0: 49144.5. Samples: 1726214000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:03:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:03:57,484][71000] Updated weights for policy 0, policy_version 134124 (0.0030) [2024-06-13 02:04:00,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2197635072. Throughput: 0: 49500.8. Samples: 1726519840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:04:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:04:01,041][71000] Updated weights for policy 0, policy_version 134134 (0.0032) [2024-06-13 02:04:04,185][71000] Updated weights for policy 0, policy_version 134144 (0.0026) [2024-06-13 02:04:05,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2197880832. Throughput: 0: 49209.8. Samples: 1726658060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:04:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:04:07,803][71000] Updated weights for policy 0, policy_version 134154 (0.0026) [2024-06-13 02:04:10,656][71000] Updated weights for policy 0, policy_version 134164 (0.0028) [2024-06-13 02:04:10,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2198142976. Throughput: 0: 49432.4. Samples: 1726955280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:04:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:04:14,713][71000] Updated weights for policy 0, policy_version 134174 (0.0033) [2024-06-13 02:04:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2198372352. Throughput: 0: 49295.0. Samples: 1727243640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:04:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:04:17,606][71000] Updated weights for policy 0, policy_version 134184 (0.0029) [2024-06-13 02:04:20,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2198618112. Throughput: 0: 49112.3. Samples: 1727389180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:04:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:04:21,252][71000] Updated weights for policy 0, policy_version 134194 (0.0030) [2024-06-13 02:04:24,094][71000] Updated weights for policy 0, policy_version 134204 (0.0022) [2024-06-13 02:04:25,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.0, 300 sec: 49207.5). Total num frames: 2198880256. Throughput: 0: 48905.7. Samples: 1727690300. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:04:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:04:27,959][71000] Updated weights for policy 0, policy_version 134214 (0.0032) [2024-06-13 02:04:30,838][71000] Updated weights for policy 0, policy_version 134224 (0.0029) [2024-06-13 02:04:30,943][70768] Fps is (10 sec: 50775.4, 60 sec: 49149.5, 300 sec: 49262.6). Total num frames: 2199126016. Throughput: 0: 48738.6. Samples: 1727974500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:04:30,943][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:04:34,752][71000] Updated weights for policy 0, policy_version 134234 (0.0031) [2024-06-13 02:04:35,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49424.8, 300 sec: 49207.5). Total num frames: 2199371776. Throughput: 0: 49236.5. Samples: 1728132440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:04:35,941][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:04:37,342][71000] Updated weights for policy 0, policy_version 134244 (0.0028) [2024-06-13 02:04:40,940][70768] Fps is (10 sec: 45889.4, 60 sec: 48605.9, 300 sec: 49207.6). Total num frames: 2199584768. Throughput: 0: 49186.8. Samples: 1728427400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:04:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:04:41,078][71000] Updated weights for policy 0, policy_version 134254 (0.0024) [2024-06-13 02:04:44,372][71000] Updated weights for policy 0, policy_version 134264 (0.0026) [2024-06-13 02:04:45,940][70768] Fps is (10 sec: 47515.2, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2199846912. Throughput: 0: 49227.7. Samples: 1728735080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:04:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:04:47,579][71000] Updated weights for policy 0, policy_version 134274 (0.0030) [2024-06-13 02:04:50,721][71000] Updated weights for policy 0, policy_version 134284 (0.0031) [2024-06-13 02:04:50,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2200109056. Throughput: 0: 49417.6. Samples: 1728881860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:04:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:04:51,928][70980] Signal inference workers to stop experience collection... (25500 times) [2024-06-13 02:04:51,929][70980] Signal inference workers to resume experience collection... (25500 times) [2024-06-13 02:04:51,950][71000] InferenceWorker_p0-w0: stopping experience collection (25500 times) [2024-06-13 02:04:51,950][71000] InferenceWorker_p0-w0: resuming experience collection (25500 times) [2024-06-13 02:04:54,575][71000] Updated weights for policy 0, policy_version 134294 (0.0022) [2024-06-13 02:04:55,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2200354816. Throughput: 0: 49392.4. Samples: 1729177940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:04:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 02:04:57,405][71000] Updated weights for policy 0, policy_version 134304 (0.0027) [2024-06-13 02:05:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49151.9, 300 sec: 49263.0). Total num frames: 2200584192. Throughput: 0: 49289.2. Samples: 1729461660. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:05:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:05:01,161][71000] Updated weights for policy 0, policy_version 134314 (0.0029) [2024-06-13 02:05:04,533][71000] Updated weights for policy 0, policy_version 134324 (0.0029) [2024-06-13 02:05:05,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2200813568. Throughput: 0: 49146.3. Samples: 1729600760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:05:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:05:07,521][71000] Updated weights for policy 0, policy_version 134334 (0.0031) [2024-06-13 02:05:10,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2201075712. Throughput: 0: 49127.2. Samples: 1729901020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:05:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:05:11,062][71000] Updated weights for policy 0, policy_version 134344 (0.0027) [2024-06-13 02:05:14,240][71000] Updated weights for policy 0, policy_version 134354 (0.0033) [2024-06-13 02:05:15,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 2201337856. Throughput: 0: 49422.5. Samples: 1730198360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:05:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:05:18,058][71000] Updated weights for policy 0, policy_version 134364 (0.0041) [2024-06-13 02:05:20,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 2201550848. Throughput: 0: 49098.5. Samples: 1730341860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:05:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:05:21,130][71000] Updated weights for policy 0, policy_version 134374 (0.0025) [2024-06-13 02:05:24,642][71000] Updated weights for policy 0, policy_version 134384 (0.0024) [2024-06-13 02:05:25,940][70768] Fps is (10 sec: 45874.4, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 2201796608. Throughput: 0: 48982.5. Samples: 1730631620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:05:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:05:27,897][71000] Updated weights for policy 0, policy_version 134394 (0.0026) [2024-06-13 02:05:30,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48608.4, 300 sec: 49152.0). Total num frames: 2202042368. Throughput: 0: 48620.5. Samples: 1730923000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:05:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:05:31,310][71000] Updated weights for policy 0, policy_version 134404 (0.0030) [2024-06-13 02:05:34,349][71000] Updated weights for policy 0, policy_version 134414 (0.0033) [2024-06-13 02:05:35,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48606.1, 300 sec: 49152.0). Total num frames: 2202288128. Throughput: 0: 48885.1. Samples: 1731081680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:05:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:05:38,107][71000] Updated weights for policy 0, policy_version 134424 (0.0034) [2024-06-13 02:05:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2202533888. Throughput: 0: 48596.9. Samples: 1731364800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 02:05:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:05:40,987][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000134433_2202550272.pth... 
[2024-06-13 02:05:41,038][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000133713_2190753792.pth [2024-06-13 02:05:41,174][71000] Updated weights for policy 0, policy_version 134434 (0.0023) [2024-06-13 02:05:45,065][71000] Updated weights for policy 0, policy_version 134444 (0.0028) [2024-06-13 02:05:45,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2202779648. Throughput: 0: 48682.3. Samples: 1731652360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-13 02:05:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:05:47,211][70980] Signal inference workers to stop experience collection... (25550 times) [2024-06-13 02:05:47,215][70980] Signal inference workers to resume experience collection... (25550 times) [2024-06-13 02:05:47,239][71000] InferenceWorker_p0-w0: stopping experience collection (25550 times) [2024-06-13 02:05:47,239][71000] InferenceWorker_p0-w0: resuming experience collection (25550 times) [2024-06-13 02:05:48,262][71000] Updated weights for policy 0, policy_version 134454 (0.0023) [2024-06-13 02:05:50,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48879.1, 300 sec: 49207.5). Total num frames: 2203041792. Throughput: 0: 49115.2. Samples: 1731810940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-13 02:05:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:05:51,066][71000] Updated weights for policy 0, policy_version 134464 (0.0026) [2024-06-13 02:05:54,483][71000] Updated weights for policy 0, policy_version 134474 (0.0043) [2024-06-13 02:05:55,939][70768] Fps is (10 sec: 50791.1, 60 sec: 48879.1, 300 sec: 49207.6). Total num frames: 2203287552. Throughput: 0: 49218.8. Samples: 1732115860. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-13 02:05:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:05:57,718][71000] Updated weights for policy 0, policy_version 134484 (0.0031) [2024-06-13 02:06:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2203516928. Throughput: 0: 48930.2. Samples: 1732400220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-13 02:06:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:06:01,352][71000] Updated weights for policy 0, policy_version 134494 (0.0026) [2024-06-13 02:06:04,528][71000] Updated weights for policy 0, policy_version 134504 (0.0018) [2024-06-13 02:06:05,939][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2203762688. Throughput: 0: 48805.4. Samples: 1732538100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-13 02:06:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:06:08,237][71000] Updated weights for policy 0, policy_version 134514 (0.0026) [2024-06-13 02:06:10,805][71000] Updated weights for policy 0, policy_version 134524 (0.0020) [2024-06-13 02:06:10,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2204041216. Throughput: 0: 48927.7. Samples: 1732833360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-13 02:06:10,941][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:06:14,645][71000] Updated weights for policy 0, policy_version 134534 (0.0021) [2024-06-13 02:06:15,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49151.9, 300 sec: 49263.0). Total num frames: 2204286976. Throughput: 0: 49215.8. Samples: 1733137720. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-13 02:06:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:06:17,581][71000] Updated weights for policy 0, policy_version 134544 (0.0023) [2024-06-13 02:06:20,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2204499968. Throughput: 0: 48927.1. Samples: 1733283400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-13 02:06:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:06:21,351][71000] Updated weights for policy 0, policy_version 134554 (0.0026) [2024-06-13 02:06:24,154][71000] Updated weights for policy 0, policy_version 134564 (0.0033) [2024-06-13 02:06:25,940][70768] Fps is (10 sec: 44237.0, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2204729344. Throughput: 0: 49063.1. Samples: 1733572640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-13 02:06:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:06:27,937][71000] Updated weights for policy 0, policy_version 134574 (0.0035) [2024-06-13 02:06:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2205007872. Throughput: 0: 49193.8. Samples: 1733866080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-13 02:06:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:06:31,120][71000] Updated weights for policy 0, policy_version 134584 (0.0028) [2024-06-13 02:06:34,791][71000] Updated weights for policy 0, policy_version 134594 (0.0032) [2024-06-13 02:06:35,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2205253632. Throughput: 0: 49133.8. Samples: 1734021960. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-13 02:06:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:06:37,579][71000] Updated weights for policy 0, policy_version 134604 (0.0025) [2024-06-13 02:06:40,940][70768] Fps is (10 sec: 47512.4, 60 sec: 49151.8, 300 sec: 49096.4). Total num frames: 2205483008. Throughput: 0: 48866.7. Samples: 1734314880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-13 02:06:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:06:41,503][71000] Updated weights for policy 0, policy_version 134614 (0.0032) [2024-06-13 02:06:44,291][71000] Updated weights for policy 0, policy_version 134624 (0.0036) [2024-06-13 02:06:45,939][70768] Fps is (10 sec: 44237.0, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 2205696000. Throughput: 0: 49109.8. Samples: 1734610160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 02:06:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:06:48,086][71000] Updated weights for policy 0, policy_version 134634 (0.0034) [2024-06-13 02:06:50,940][70768] Fps is (10 sec: 50791.9, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 2205990912. Throughput: 0: 49253.7. Samples: 1734754520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 02:06:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:06:51,196][71000] Updated weights for policy 0, policy_version 134644 (0.0031) [2024-06-13 02:06:54,709][71000] Updated weights for policy 0, policy_version 134654 (0.0036) [2024-06-13 02:06:55,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2206236672. Throughput: 0: 49343.1. Samples: 1735053800. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 02:06:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:06:57,758][71000] Updated weights for policy 0, policy_version 134664 (0.0037) [2024-06-13 02:07:00,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2206466048. Throughput: 0: 48998.8. Samples: 1735342660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 02:07:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:07:01,517][71000] Updated weights for policy 0, policy_version 134674 (0.0030) [2024-06-13 02:07:04,413][71000] Updated weights for policy 0, policy_version 134684 (0.0026) [2024-06-13 02:07:05,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2206695424. Throughput: 0: 48985.8. Samples: 1735487760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 02:07:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 02:07:06,361][70980] Signal inference workers to stop experience collection... (25600 times) [2024-06-13 02:07:06,363][70980] Signal inference workers to resume experience collection... (25600 times) [2024-06-13 02:07:06,405][71000] InferenceWorker_p0-w0: stopping experience collection (25600 times) [2024-06-13 02:07:06,406][71000] InferenceWorker_p0-w0: resuming experience collection (25600 times) [2024-06-13 02:07:07,895][71000] Updated weights for policy 0, policy_version 134694 (0.0030) [2024-06-13 02:07:10,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 2206973952. Throughput: 0: 49095.1. Samples: 1735781920. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 02:07:10,949][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 02:07:11,292][71000] Updated weights for policy 0, policy_version 134704 (0.0025) [2024-06-13 02:07:14,318][71000] Updated weights for policy 0, policy_version 134714 (0.0027) [2024-06-13 02:07:15,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2207219712. Throughput: 0: 49199.6. Samples: 1736080060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 02:07:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 02:07:17,686][71000] Updated weights for policy 0, policy_version 134724 (0.0027) [2024-06-13 02:07:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2207465472. Throughput: 0: 49283.0. Samples: 1736239700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 02:07:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:07:21,195][71000] Updated weights for policy 0, policy_version 134734 (0.0023) [2024-06-13 02:07:24,441][71000] Updated weights for policy 0, policy_version 134744 (0.0030) [2024-06-13 02:07:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 2207711232. Throughput: 0: 49409.6. Samples: 1736538300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 02:07:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:07:27,550][71000] Updated weights for policy 0, policy_version 134754 (0.0024) [2024-06-13 02:07:30,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2207940608. Throughput: 0: 49365.7. Samples: 1736831620. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 02:07:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:07:31,318][71000] Updated weights for policy 0, policy_version 134764 (0.0026) [2024-06-13 02:07:34,151][71000] Updated weights for policy 0, policy_version 134774 (0.0035) [2024-06-13 02:07:35,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2208219136. Throughput: 0: 49401.4. Samples: 1736977580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 02:07:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:07:37,841][71000] Updated weights for policy 0, policy_version 134784 (0.0024) [2024-06-13 02:07:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.3, 300 sec: 49152.0). Total num frames: 2208448512. Throughput: 0: 49486.2. Samples: 1737280680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 02:07:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:07:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000134794_2208464896.pth... [2024-06-13 02:07:40,957][71000] Updated weights for policy 0, policy_version 134794 (0.0028) [2024-06-13 02:07:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000134074_2196668416.pth [2024-06-13 02:07:44,471][71000] Updated weights for policy 0, policy_version 134804 (0.0035) [2024-06-13 02:07:45,940][70768] Fps is (10 sec: 45874.3, 60 sec: 49698.0, 300 sec: 49096.4). Total num frames: 2208677888. Throughput: 0: 49543.8. Samples: 1737572140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 02:07:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:07:47,645][71000] Updated weights for policy 0, policy_version 134814 (0.0032) [2024-06-13 02:07:50,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2208923648. Throughput: 0: 49333.0. Samples: 1737707740. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:07:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:07:51,119][71000] Updated weights for policy 0, policy_version 134824 (0.0026) [2024-06-13 02:07:54,151][71000] Updated weights for policy 0, policy_version 134834 (0.0028) [2024-06-13 02:07:55,939][70768] Fps is (10 sec: 52429.9, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2209202176. Throughput: 0: 49526.4. Samples: 1738010600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:07:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:07:57,818][71000] Updated weights for policy 0, policy_version 134844 (0.0038) [2024-06-13 02:08:00,777][71000] Updated weights for policy 0, policy_version 134854 (0.0037) [2024-06-13 02:08:00,940][70768] Fps is (10 sec: 52427.1, 60 sec: 49697.9, 300 sec: 49263.0). Total num frames: 2209447936. Throughput: 0: 49622.4. Samples: 1738313080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:08:00,941][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:08:04,516][71000] Updated weights for policy 0, policy_version 134864 (0.0024) [2024-06-13 02:08:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49698.2, 300 sec: 49096.5). Total num frames: 2209677312. Throughput: 0: 49241.9. Samples: 1738455580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:08:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:08:07,391][71000] Updated weights for policy 0, policy_version 134874 (0.0040) [2024-06-13 02:08:10,939][70768] Fps is (10 sec: 44238.3, 60 sec: 48606.0, 300 sec: 49096.5). Total num frames: 2209890304. Throughput: 0: 49028.2. Samples: 1738744560. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:08:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:08:11,574][71000] Updated weights for policy 0, policy_version 134884 (0.0030) [2024-06-13 02:08:14,445][71000] Updated weights for policy 0, policy_version 134894 (0.0031) [2024-06-13 02:08:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2210168832. Throughput: 0: 48936.0. Samples: 1739033740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:08:15,949][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:08:18,255][70980] Signal inference workers to stop experience collection... (25650 times) [2024-06-13 02:08:18,255][70980] Signal inference workers to resume experience collection... (25650 times) [2024-06-13 02:08:18,293][71000] InferenceWorker_p0-w0: stopping experience collection (25650 times) [2024-06-13 02:08:18,293][71000] InferenceWorker_p0-w0: resuming experience collection (25650 times) [2024-06-13 02:08:18,394][71000] Updated weights for policy 0, policy_version 134904 (0.0024) [2024-06-13 02:08:20,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2210414592. Throughput: 0: 49262.6. Samples: 1739194400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:08:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:08:20,956][71000] Updated weights for policy 0, policy_version 134914 (0.0030) [2024-06-13 02:08:24,934][71000] Updated weights for policy 0, policy_version 134924 (0.0034) [2024-06-13 02:08:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2210643968. Throughput: 0: 48898.2. Samples: 1739481100. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:08:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:08:27,715][71000] Updated weights for policy 0, policy_version 134934 (0.0031) [2024-06-13 02:08:30,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2210873344. Throughput: 0: 48945.0. Samples: 1739774660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:08:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:08:31,863][71000] Updated weights for policy 0, policy_version 134944 (0.0023) [2024-06-13 02:08:34,455][71000] Updated weights for policy 0, policy_version 134954 (0.0027) [2024-06-13 02:08:35,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 2211151872. Throughput: 0: 49009.6. Samples: 1739913180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:08:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 02:08:38,322][71000] Updated weights for policy 0, policy_version 134964 (0.0022) [2024-06-13 02:08:40,900][71000] Updated weights for policy 0, policy_version 134974 (0.0022) [2024-06-13 02:08:40,940][70768] Fps is (10 sec: 54066.8, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 2211414016. Throughput: 0: 49022.4. Samples: 1740216620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:08:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:08:45,144][71000] Updated weights for policy 0, policy_version 134984 (0.0023) [2024-06-13 02:08:45,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2211627008. Throughput: 0: 48853.6. Samples: 1740511480. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:08:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:08:47,991][71000] Updated weights for policy 0, policy_version 134994 (0.0029) [2024-06-13 02:08:50,939][70768] Fps is (10 sec: 44237.7, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2211856384. Throughput: 0: 48788.5. Samples: 1740651060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:08:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:08:51,852][71000] Updated weights for policy 0, policy_version 135004 (0.0026) [2024-06-13 02:08:54,346][71000] Updated weights for policy 0, policy_version 135014 (0.0024) [2024-06-13 02:08:55,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2212134912. Throughput: 0: 49027.9. Samples: 1740950820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 02:08:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:08:58,403][71000] Updated weights for policy 0, policy_version 135024 (0.0034) [2024-06-13 02:09:00,915][71000] Updated weights for policy 0, policy_version 135034 (0.0029) [2024-06-13 02:09:00,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49152.2, 300 sec: 49207.5). Total num frames: 2212397056. Throughput: 0: 49152.8. Samples: 1741245620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 02:09:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:09:05,305][71000] Updated weights for policy 0, policy_version 135044 (0.0039) [2024-06-13 02:09:05,939][70768] Fps is (10 sec: 45876.0, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2212593664. Throughput: 0: 48879.3. Samples: 1741393960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 02:09:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:09:07,681][71000] Updated weights for policy 0, policy_version 135054 (0.0023) [2024-06-13 02:09:10,940][70768] Fps is (10 sec: 44237.1, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2212839424. Throughput: 0: 48691.2. Samples: 1741672200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 02:09:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:09:12,185][71000] Updated weights for policy 0, policy_version 135064 (0.0028) [2024-06-13 02:09:14,571][71000] Updated weights for policy 0, policy_version 135074 (0.0027) [2024-06-13 02:09:15,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2213101568. Throughput: 0: 48654.8. Samples: 1741964120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 02:09:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:09:18,799][71000] Updated weights for policy 0, policy_version 135084 (0.0034) [2024-06-13 02:09:20,433][70980] Signal inference workers to stop experience collection... (25700 times) [2024-06-13 02:09:20,435][70980] Signal inference workers to resume experience collection... (25700 times) [2024-06-13 02:09:20,476][71000] InferenceWorker_p0-w0: stopping experience collection (25700 times) [2024-06-13 02:09:20,476][71000] InferenceWorker_p0-w0: resuming experience collection (25700 times) [2024-06-13 02:09:20,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2213363712. Throughput: 0: 49060.2. Samples: 1742120880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 02:09:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:09:21,098][71000] Updated weights for policy 0, policy_version 135094 (0.0026) [2024-06-13 02:09:25,149][71000] Updated weights for policy 0, policy_version 135104 (0.0031) [2024-06-13 02:09:25,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.9, 300 sec: 48985.9). Total num frames: 2213576704. Throughput: 0: 48994.3. Samples: 1742421360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 02:09:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:09:27,619][71000] Updated weights for policy 0, policy_version 135114 (0.0030) [2024-06-13 02:09:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.1, 300 sec: 49041.0). Total num frames: 2213838848. Throughput: 0: 48973.3. Samples: 1742715280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 02:09:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:09:31,926][71000] Updated weights for policy 0, policy_version 135124 (0.0024) [2024-06-13 02:09:34,115][71000] Updated weights for policy 0, policy_version 135134 (0.0032) [2024-06-13 02:09:35,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2214100992. Throughput: 0: 49138.9. Samples: 1742862320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 02:09:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:09:38,389][71000] Updated weights for policy 0, policy_version 135144 (0.0032) [2024-06-13 02:09:40,657][71000] Updated weights for policy 0, policy_version 135154 (0.0031) [2024-06-13 02:09:40,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2214379520. Throughput: 0: 49134.2. Samples: 1743161860. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 02:09:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:09:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000135155_2214379520.pth... [2024-06-13 02:09:40,989][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000134433_2202550272.pth [2024-06-13 02:09:45,182][71000] Updated weights for policy 0, policy_version 135164 (0.0027) [2024-06-13 02:09:45,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2214592512. Throughput: 0: 49223.1. Samples: 1743460660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 02:09:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:09:47,518][71000] Updated weights for policy 0, policy_version 135174 (0.0026) [2024-06-13 02:09:50,940][70768] Fps is (10 sec: 42598.7, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 2214805504. Throughput: 0: 49012.3. Samples: 1743599520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 02:09:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:09:51,725][71000] Updated weights for policy 0, policy_version 135184 (0.0034) [2024-06-13 02:09:54,039][71000] Updated weights for policy 0, policy_version 135194 (0.0028) [2024-06-13 02:09:55,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2215067648. Throughput: 0: 49208.9. Samples: 1743886600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:09:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:09:58,331][71000] Updated weights for policy 0, policy_version 135204 (0.0030) [2024-06-13 02:10:00,727][71000] Updated weights for policy 0, policy_version 135214 (0.0031) [2024-06-13 02:10:00,940][70768] Fps is (10 sec: 54067.4, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2215346176. Throughput: 0: 49519.5. Samples: 1744192500. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:10:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:10:04,625][71000] Updated weights for policy 0, policy_version 135224 (0.0030) [2024-06-13 02:10:05,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49697.9, 300 sec: 49152.0). Total num frames: 2215575552. Throughput: 0: 49457.6. Samples: 1744346480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:10:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:10:07,334][71000] Updated weights for policy 0, policy_version 135234 (0.0032) [2024-06-13 02:10:10,940][70768] Fps is (10 sec: 44236.7, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2215788544. Throughput: 0: 49374.3. Samples: 1744643200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:10:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:10:11,342][71000] Updated weights for policy 0, policy_version 135244 (0.0030) [2024-06-13 02:10:14,122][71000] Updated weights for policy 0, policy_version 135254 (0.0029) [2024-06-13 02:10:15,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.0, 300 sec: 49263.1). Total num frames: 2216083456. Throughput: 0: 49427.5. Samples: 1744939520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:10:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:10:17,944][71000] Updated weights for policy 0, policy_version 135264 (0.0022) [2024-06-13 02:10:19,105][70980] Signal inference workers to stop experience collection... (25750 times) [2024-06-13 02:10:19,105][70980] Signal inference workers to resume experience collection... 
(25750 times) [2024-06-13 02:10:19,123][71000] InferenceWorker_p0-w0: stopping experience collection (25750 times) [2024-06-13 02:10:19,123][71000] InferenceWorker_p0-w0: resuming experience collection (25750 times) [2024-06-13 02:10:20,738][71000] Updated weights for policy 0, policy_version 135274 (0.0035) [2024-06-13 02:10:20,940][70768] Fps is (10 sec: 54066.8, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2216329216. Throughput: 0: 49422.3. Samples: 1745086320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:10:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:10:24,599][71000] Updated weights for policy 0, policy_version 135284 (0.0025) [2024-06-13 02:10:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 2216574976. Throughput: 0: 49426.7. Samples: 1745386060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:10:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:10:27,230][71000] Updated weights for policy 0, policy_version 135294 (0.0027) [2024-06-13 02:10:30,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2216804352. Throughput: 0: 49655.7. Samples: 1745695160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:10:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 02:10:30,984][71000] Updated weights for policy 0, policy_version 135304 (0.0036) [2024-06-13 02:10:33,909][71000] Updated weights for policy 0, policy_version 135314 (0.0021) [2024-06-13 02:10:35,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 2217066496. Throughput: 0: 49677.9. Samples: 1745835020. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:10:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 02:10:37,712][71000] Updated weights for policy 0, policy_version 135324 (0.0026) [2024-06-13 02:10:40,615][71000] Updated weights for policy 0, policy_version 135334 (0.0022) [2024-06-13 02:10:40,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2217328640. Throughput: 0: 49893.2. Samples: 1746131800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:10:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:10:44,364][71000] Updated weights for policy 0, policy_version 135344 (0.0027) [2024-06-13 02:10:45,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2217541632. Throughput: 0: 49210.1. Samples: 1746406960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:10:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:10:47,338][71000] Updated weights for policy 0, policy_version 135354 (0.0033) [2024-06-13 02:10:50,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 2217787392. Throughput: 0: 49189.5. Samples: 1746560000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:10:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 02:10:51,145][71000] Updated weights for policy 0, policy_version 135364 (0.0027) [2024-06-13 02:10:54,356][71000] Updated weights for policy 0, policy_version 135374 (0.0028) [2024-06-13 02:10:55,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2218016768. Throughput: 0: 49129.8. Samples: 1746854040. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:10:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:10:57,727][71000] Updated weights for policy 0, policy_version 135384 (0.0022) [2024-06-13 02:11:00,773][71000] Updated weights for policy 0, policy_version 135394 (0.0031) [2024-06-13 02:11:00,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2218295296. Throughput: 0: 49321.0. Samples: 1747158960. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-13 02:11:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:11:04,337][71000] Updated weights for policy 0, policy_version 135404 (0.0029) [2024-06-13 02:11:05,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2218541056. Throughput: 0: 49258.6. Samples: 1747302960. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-13 02:11:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:11:07,467][71000] Updated weights for policy 0, policy_version 135414 (0.0022) [2024-06-13 02:11:10,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49698.2, 300 sec: 49096.5). Total num frames: 2218770432. Throughput: 0: 49351.3. Samples: 1747606860. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-13 02:11:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 02:11:11,043][71000] Updated weights for policy 0, policy_version 135424 (0.0026) [2024-06-13 02:11:11,123][70980] Signal inference workers to stop experience collection... (25800 times) [2024-06-13 02:11:11,155][71000] InferenceWorker_p0-w0: stopping experience collection (25800 times) [2024-06-13 02:11:11,181][70980] Signal inference workers to resume experience collection... 
(25800 times) [2024-06-13 02:11:11,182][71000] InferenceWorker_p0-w0: resuming experience collection (25800 times) [2024-06-13 02:11:14,132][71000] Updated weights for policy 0, policy_version 135434 (0.0027) [2024-06-13 02:11:15,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2219016192. Throughput: 0: 49016.8. Samples: 1747900920. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-13 02:11:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:11:17,214][71000] Updated weights for policy 0, policy_version 135444 (0.0031) [2024-06-13 02:11:20,745][71000] Updated weights for policy 0, policy_version 135454 (0.0026) [2024-06-13 02:11:20,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2219278336. Throughput: 0: 49186.2. Samples: 1748048400. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-13 02:11:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 02:11:23,793][71000] Updated weights for policy 0, policy_version 135464 (0.0022) [2024-06-13 02:11:25,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2219524096. Throughput: 0: 49341.5. Samples: 1748352160. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-13 02:11:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:11:27,374][71000] Updated weights for policy 0, policy_version 135474 (0.0029) [2024-06-13 02:11:30,658][71000] Updated weights for policy 0, policy_version 135484 (0.0025) [2024-06-13 02:11:30,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2219786240. Throughput: 0: 49732.1. Samples: 1748644900. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-13 02:11:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 02:11:34,060][71000] Updated weights for policy 0, policy_version 135494 (0.0030) [2024-06-13 02:11:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2220015616. Throughput: 0: 49702.2. Samples: 1748796600. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-13 02:11:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:11:37,109][71000] Updated weights for policy 0, policy_version 135504 (0.0030) [2024-06-13 02:11:40,729][71000] Updated weights for policy 0, policy_version 135514 (0.0026) [2024-06-13 02:11:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48878.9, 300 sec: 49374.1). Total num frames: 2220261376. Throughput: 0: 49616.7. Samples: 1749086800. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-13 02:11:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:11:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000135514_2220261376.pth... [2024-06-13 02:11:41,000][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000134794_2208464896.pth [2024-06-13 02:11:43,841][71000] Updated weights for policy 0, policy_version 135524 (0.0022) [2024-06-13 02:11:45,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2220490752. Throughput: 0: 49345.8. Samples: 1749379520. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-13 02:11:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:11:47,565][71000] Updated weights for policy 0, policy_version 135534 (0.0031) [2024-06-13 02:11:50,733][71000] Updated weights for policy 0, policy_version 135544 (0.0029) [2024-06-13 02:11:50,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2220769280. Throughput: 0: 49302.9. Samples: 1749521580. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-13 02:11:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:11:54,012][71000] Updated weights for policy 0, policy_version 135554 (0.0035) [2024-06-13 02:11:55,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2220982272. Throughput: 0: 49206.7. Samples: 1749821160. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-13 02:11:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:11:57,225][71000] Updated weights for policy 0, policy_version 135564 (0.0026) [2024-06-13 02:12:00,838][71000] Updated weights for policy 0, policy_version 135574 (0.0025) [2024-06-13 02:12:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2221244416. Throughput: 0: 49405.4. Samples: 1750124160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-13 02:12:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:12:03,724][71000] Updated weights for policy 0, policy_version 135584 (0.0028) [2024-06-13 02:12:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2221490176. Throughput: 0: 49226.6. Samples: 1750263600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-13 02:12:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:12:07,685][71000] Updated weights for policy 0, policy_version 135594 (0.0028) [2024-06-13 02:12:10,332][71000] Updated weights for policy 0, policy_version 135604 (0.0034) [2024-06-13 02:12:10,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2221752320. Throughput: 0: 49043.6. Samples: 1750559120. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-13 02:12:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:12:14,221][71000] Updated weights for policy 0, policy_version 135614 (0.0022) [2024-06-13 02:12:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2221998080. Throughput: 0: 49359.1. Samples: 1750866060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-13 02:12:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:12:17,344][71000] Updated weights for policy 0, policy_version 135624 (0.0024) [2024-06-13 02:12:17,520][70980] Signal inference workers to stop experience collection... (25850 times) [2024-06-13 02:12:17,520][70980] Signal inference workers to resume experience collection... (25850 times) [2024-06-13 02:12:17,563][71000] InferenceWorker_p0-w0: stopping experience collection (25850 times) [2024-06-13 02:12:17,564][71000] InferenceWorker_p0-w0: resuming experience collection (25850 times) [2024-06-13 02:12:20,760][71000] Updated weights for policy 0, policy_version 135634 (0.0025) [2024-06-13 02:12:20,940][70768] Fps is (10 sec: 47512.5, 60 sec: 49151.8, 300 sec: 49207.5). Total num frames: 2222227456. Throughput: 0: 48876.3. Samples: 1750996040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-13 02:12:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:12:23,547][71000] Updated weights for policy 0, policy_version 135644 (0.0032) [2024-06-13 02:12:25,940][70768] Fps is (10 sec: 47510.4, 60 sec: 49151.4, 300 sec: 49263.0). Total num frames: 2222473216. Throughput: 0: 48949.2. Samples: 1751289540. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-13 02:12:25,941][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 02:12:27,479][71000] Updated weights for policy 0, policy_version 135654 (0.0028) [2024-06-13 02:12:29,878][71000] Updated weights for policy 0, policy_version 135664 (0.0022) [2024-06-13 02:12:30,939][70768] Fps is (10 sec: 54068.5, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 2222768128. Throughput: 0: 49380.5. Samples: 1751601640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-13 02:12:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:12:34,194][71000] Updated weights for policy 0, policy_version 135674 (0.0028) [2024-06-13 02:12:35,939][70768] Fps is (10 sec: 50794.2, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2222981120. Throughput: 0: 49758.7. Samples: 1751760720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-13 02:12:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:12:36,954][71000] Updated weights for policy 0, policy_version 135684 (0.0028) [2024-06-13 02:12:40,713][71000] Updated weights for policy 0, policy_version 135694 (0.0023) [2024-06-13 02:12:40,940][70768] Fps is (10 sec: 44236.5, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2223210496. Throughput: 0: 49455.5. Samples: 1752046660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-13 02:12:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:12:43,268][71000] Updated weights for policy 0, policy_version 135704 (0.0026) [2024-06-13 02:12:45,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2223456256. Throughput: 0: 49342.6. Samples: 1752344580. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-13 02:12:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:12:47,440][71000] Updated weights for policy 0, policy_version 135714 (0.0031) [2024-06-13 02:12:50,006][71000] Updated weights for policy 0, policy_version 135724 (0.0031) [2024-06-13 02:12:50,939][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2223734784. Throughput: 0: 49573.9. Samples: 1752494420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-13 02:12:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:12:54,360][71000] Updated weights for policy 0, policy_version 135734 (0.0030) [2024-06-13 02:12:55,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 49207.6). Total num frames: 2223964160. Throughput: 0: 49796.0. Samples: 1752799940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-13 02:12:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:12:56,589][71000] Updated weights for policy 0, policy_version 135744 (0.0026) [2024-06-13 02:13:00,814][71000] Updated weights for policy 0, policy_version 135754 (0.0040) [2024-06-13 02:13:00,940][70768] Fps is (10 sec: 45874.2, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2224193536. Throughput: 0: 49567.0. Samples: 1753096580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-13 02:13:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:13:03,145][71000] Updated weights for policy 0, policy_version 135764 (0.0026) [2024-06-13 02:13:05,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2224439296. Throughput: 0: 49599.3. Samples: 1753228000. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-13 02:13:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:13:07,350][71000] Updated weights for policy 0, policy_version 135774 (0.0034) [2024-06-13 02:13:09,601][71000] Updated weights for policy 0, policy_version 135784 (0.0027) [2024-06-13 02:13:10,939][70768] Fps is (10 sec: 52429.8, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2224717824. Throughput: 0: 49794.6. Samples: 1753530260. Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-13 02:13:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 02:13:13,728][71000] Updated weights for policy 0, policy_version 135794 (0.0033) [2024-06-13 02:13:15,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 2224979968. Throughput: 0: 49647.4. Samples: 1753835780. Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-13 02:13:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:13:16,507][71000] Updated weights for policy 0, policy_version 135804 (0.0034) [2024-06-13 02:13:20,266][71000] Updated weights for policy 0, policy_version 135814 (0.0029) [2024-06-13 02:13:20,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.3, 300 sec: 49318.6). Total num frames: 2225192960. Throughput: 0: 49537.8. Samples: 1753989920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-13 02:13:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:13:22,994][71000] Updated weights for policy 0, policy_version 135824 (0.0026) [2024-06-13 02:13:23,362][70980] Signal inference workers to stop experience collection... (25900 times) [2024-06-13 02:13:23,362][70980] Signal inference workers to resume experience collection... 
(25900 times) [2024-06-13 02:13:23,374][71000] InferenceWorker_p0-w0: stopping experience collection (25900 times) [2024-06-13 02:13:23,374][71000] InferenceWorker_p0-w0: resuming experience collection (25900 times) [2024-06-13 02:13:25,940][70768] Fps is (10 sec: 44237.2, 60 sec: 49152.5, 300 sec: 49318.6). Total num frames: 2225422336. Throughput: 0: 49470.6. Samples: 1754272840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-13 02:13:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 02:13:27,216][71000] Updated weights for policy 0, policy_version 135834 (0.0034) [2024-06-13 02:13:29,590][71000] Updated weights for policy 0, policy_version 135844 (0.0022) [2024-06-13 02:13:30,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2225717248. Throughput: 0: 49488.5. Samples: 1754571560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-13 02:13:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:13:33,674][71000] Updated weights for policy 0, policy_version 135854 (0.0027) [2024-06-13 02:13:35,940][70768] Fps is (10 sec: 54067.4, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 2225963008. Throughput: 0: 49776.4. Samples: 1754734360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-13 02:13:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:13:36,313][71000] Updated weights for policy 0, policy_version 135864 (0.0030) [2024-06-13 02:13:40,098][71000] Updated weights for policy 0, policy_version 135874 (0.0034) [2024-06-13 02:13:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 2226208768. Throughput: 0: 49574.9. Samples: 1755030820. Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-13 02:13:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:13:40,962][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000135877_2226208768.pth... 
[2024-06-13 02:13:41,014][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000135155_2214379520.pth [2024-06-13 02:13:42,654][71000] Updated weights for policy 0, policy_version 135884 (0.0025) [2024-06-13 02:13:45,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2226438144. Throughput: 0: 49488.3. Samples: 1755323540. Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-13 02:13:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:13:46,874][71000] Updated weights for policy 0, policy_version 135894 (0.0028) [2024-06-13 02:13:49,484][71000] Updated weights for policy 0, policy_version 135904 (0.0030) [2024-06-13 02:13:50,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2226700288. Throughput: 0: 49832.5. Samples: 1755470460. Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-13 02:13:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:13:53,307][71000] Updated weights for policy 0, policy_version 135914 (0.0032) [2024-06-13 02:13:55,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49971.1, 300 sec: 49374.2). Total num frames: 2226962432. Throughput: 0: 49732.4. Samples: 1755768220. Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-13 02:13:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:13:56,076][71000] Updated weights for policy 0, policy_version 135924 (0.0032) [2024-06-13 02:14:00,003][71000] Updated weights for policy 0, policy_version 135934 (0.0026) [2024-06-13 02:14:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 50244.4, 300 sec: 49540.8). Total num frames: 2227208192. Throughput: 0: 49737.5. Samples: 1756073960. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 24.0) [2024-06-13 02:14:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:14:02,747][71000] Updated weights for policy 0, policy_version 135944 (0.0033) [2024-06-13 02:14:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 2227437568. Throughput: 0: 49521.2. Samples: 1756218380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:14:05,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 02:14:06,390][71000] Updated weights for policy 0, policy_version 135954 (0.0023) [2024-06-13 02:14:09,651][71000] Updated weights for policy 0, policy_version 135964 (0.0032) [2024-06-13 02:14:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2227683328. Throughput: 0: 49618.3. Samples: 1756505660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:14:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:14:13,291][71000] Updated weights for policy 0, policy_version 135974 (0.0020) [2024-06-13 02:14:15,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2227945472. Throughput: 0: 49597.0. Samples: 1756803420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:14:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:14:16,041][71000] Updated weights for policy 0, policy_version 135984 (0.0025) [2024-06-13 02:14:19,734][71000] Updated weights for policy 0, policy_version 135994 (0.0037) [2024-06-13 02:14:20,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.1, 300 sec: 49485.3). Total num frames: 2228174848. Throughput: 0: 49301.8. Samples: 1756952940. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:14:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:14:22,922][71000] Updated weights for policy 0, policy_version 136004 (0.0031) [2024-06-13 02:14:25,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 2228420608. Throughput: 0: 49294.7. Samples: 1757249080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:14:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:14:26,325][71000] Updated weights for policy 0, policy_version 136014 (0.0025) [2024-06-13 02:14:29,503][71000] Updated weights for policy 0, policy_version 136024 (0.0027) [2024-06-13 02:14:30,939][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 2228699136. Throughput: 0: 49519.1. Samples: 1757551900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:14:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:14:32,954][71000] Updated weights for policy 0, policy_version 136034 (0.0030) [2024-06-13 02:14:35,939][70768] Fps is (10 sec: 52429.8, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 2228944896. Throughput: 0: 49514.3. Samples: 1757698600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:14:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:14:35,941][71000] Updated weights for policy 0, policy_version 136044 (0.0035) [2024-06-13 02:14:39,442][71000] Updated weights for policy 0, policy_version 136054 (0.0032) [2024-06-13 02:14:40,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2229174272. Throughput: 0: 49503.9. Samples: 1757995900. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:14:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:14:42,529][71000] Updated weights for policy 0, policy_version 136064 (0.0029) [2024-06-13 02:14:45,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2229403648. Throughput: 0: 49187.1. Samples: 1758287380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:14:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:14:45,957][70980] Signal inference workers to stop experience collection... (25950 times) [2024-06-13 02:14:46,011][71000] InferenceWorker_p0-w0: stopping experience collection (25950 times) [2024-06-13 02:14:46,064][70980] Signal inference workers to resume experience collection... (25950 times) [2024-06-13 02:14:46,064][71000] InferenceWorker_p0-w0: resuming experience collection (25950 times) [2024-06-13 02:14:46,197][71000] Updated weights for policy 0, policy_version 136074 (0.0035) [2024-06-13 02:14:49,277][71000] Updated weights for policy 0, policy_version 136084 (0.0039) [2024-06-13 02:14:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2229665792. Throughput: 0: 49329.8. Samples: 1758438220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:14:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:14:52,909][71000] Updated weights for policy 0, policy_version 136094 (0.0024) [2024-06-13 02:14:55,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.8, 300 sec: 49318.6). Total num frames: 2229895168. Throughput: 0: 49269.6. Samples: 1758722800. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:14:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:14:56,382][71000] Updated weights for policy 0, policy_version 136104 (0.0030) [2024-06-13 02:14:59,730][71000] Updated weights for policy 0, policy_version 136114 (0.0027) [2024-06-13 02:15:00,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 2230140928. Throughput: 0: 49130.7. Samples: 1759014300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:15:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:15:02,926][71000] Updated weights for policy 0, policy_version 136124 (0.0028) [2024-06-13 02:15:05,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2230386688. Throughput: 0: 49135.5. Samples: 1759164040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:15:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:15:06,263][71000] Updated weights for policy 0, policy_version 136134 (0.0027) [2024-06-13 02:15:09,418][71000] Updated weights for policy 0, policy_version 136144 (0.0027) [2024-06-13 02:15:10,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2230632448. Throughput: 0: 49220.9. Samples: 1759464020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 02:15:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 02:15:12,754][71000] Updated weights for policy 0, policy_version 136154 (0.0022) [2024-06-13 02:15:15,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 2230894592. Throughput: 0: 49086.1. Samples: 1759760780. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 02:15:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:15:16,502][71000] Updated weights for policy 0, policy_version 136164 (0.0038) [2024-06-13 02:15:19,438][71000] Updated weights for policy 0, policy_version 136174 (0.0032) [2024-06-13 02:15:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49424.9, 300 sec: 49374.2). Total num frames: 2231140352. Throughput: 0: 49198.4. Samples: 1759912540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 02:15:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:15:23,010][71000] Updated weights for policy 0, policy_version 136184 (0.0027) [2024-06-13 02:15:25,786][71000] Updated weights for policy 0, policy_version 136194 (0.0034) [2024-06-13 02:15:25,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 2231402496. Throughput: 0: 49034.8. Samples: 1760202460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 02:15:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:15:29,569][71000] Updated weights for policy 0, policy_version 136204 (0.0026) [2024-06-13 02:15:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.8, 300 sec: 49374.1). Total num frames: 2231631872. Throughput: 0: 49147.9. Samples: 1760499040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 02:15:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 02:15:32,742][71000] Updated weights for policy 0, policy_version 136214 (0.0032) [2024-06-13 02:15:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 2231877632. Throughput: 0: 49205.8. Samples: 1760652480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 02:15:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:15:36,199][71000] Updated weights for policy 0, policy_version 136224 (0.0020) [2024-06-13 02:15:39,258][71000] Updated weights for policy 0, policy_version 136234 (0.0025) [2024-06-13 02:15:40,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 2232139776. Throughput: 0: 49633.5. Samples: 1760956300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 02:15:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:15:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000136239_2232139776.pth... [2024-06-13 02:15:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000135514_2220261376.pth [2024-06-13 02:15:42,817][71000] Updated weights for policy 0, policy_version 136244 (0.0031) [2024-06-13 02:15:45,651][71000] Updated weights for policy 0, policy_version 136254 (0.0027) [2024-06-13 02:15:45,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 2232385536. Throughput: 0: 49699.1. Samples: 1761250760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 02:15:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:15:49,589][71000] Updated weights for policy 0, policy_version 136264 (0.0037) [2024-06-13 02:15:50,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2232614912. Throughput: 0: 49692.4. Samples: 1761400200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 02:15:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:15:52,378][71000] Updated weights for policy 0, policy_version 136274 (0.0033) [2024-06-13 02:15:55,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 2232860672. Throughput: 0: 49525.5. Samples: 1761692660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 02:15:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:15:56,040][71000] Updated weights for policy 0, policy_version 136284 (0.0033) [2024-06-13 02:15:58,742][71000] Updated weights for policy 0, policy_version 136294 (0.0027) [2024-06-13 02:16:00,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2233122816. Throughput: 0: 49653.0. Samples: 1761995160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 02:16:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:16:02,718][71000] Updated weights for policy 0, policy_version 136304 (0.0031) [2024-06-13 02:16:05,365][71000] Updated weights for policy 0, policy_version 136314 (0.0025) [2024-06-13 02:16:05,939][70768] Fps is (10 sec: 52428.7, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 2233384960. Throughput: 0: 49421.0. Samples: 1762136480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 02:16:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:16:09,330][71000] Updated weights for policy 0, policy_version 136324 (0.0027) [2024-06-13 02:16:09,484][70980] Signal inference workers to stop experience collection... (26000 times) [2024-06-13 02:16:09,484][70980] Signal inference workers to resume experience collection... (26000 times) [2024-06-13 02:16:09,503][71000] InferenceWorker_p0-w0: stopping experience collection (26000 times) [2024-06-13 02:16:09,503][71000] InferenceWorker_p0-w0: resuming experience collection (26000 times) [2024-06-13 02:16:10,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.3, 300 sec: 49485.2). Total num frames: 2233614336. Throughput: 0: 49793.8. Samples: 1762443180. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-13 02:16:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:16:11,896][71000] Updated weights for policy 0, policy_version 136334 (0.0024) [2024-06-13 02:16:15,818][71000] Updated weights for policy 0, policy_version 136344 (0.0020) [2024-06-13 02:16:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2233860096. Throughput: 0: 49843.7. Samples: 1762742000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-13 02:16:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:16:18,502][71000] Updated weights for policy 0, policy_version 136354 (0.0024) [2024-06-13 02:16:20,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2234122240. Throughput: 0: 49498.5. Samples: 1762879920. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-13 02:16:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:16:22,481][71000] Updated weights for policy 0, policy_version 136364 (0.0028) [2024-06-13 02:16:25,023][71000] Updated weights for policy 0, policy_version 136374 (0.0024) [2024-06-13 02:16:25,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2234368000. Throughput: 0: 49386.1. Samples: 1763178680. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-13 02:16:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:16:29,136][71000] Updated weights for policy 0, policy_version 136384 (0.0034) [2024-06-13 02:16:30,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2234597376. Throughput: 0: 49405.3. Samples: 1763474000. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-13 02:16:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:16:31,759][71000] Updated weights for policy 0, policy_version 136394 (0.0024) [2024-06-13 02:16:35,674][71000] Updated weights for policy 0, policy_version 136404 (0.0024) [2024-06-13 02:16:35,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2234843136. Throughput: 0: 49360.5. Samples: 1763621420. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-13 02:16:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:16:38,387][71000] Updated weights for policy 0, policy_version 136414 (0.0031) [2024-06-13 02:16:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2235088896. Throughput: 0: 49342.2. Samples: 1763913060. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-13 02:16:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 02:16:42,418][71000] Updated weights for policy 0, policy_version 136424 (0.0029) [2024-06-13 02:16:45,069][71000] Updated weights for policy 0, policy_version 136434 (0.0041) [2024-06-13 02:16:45,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2235351040. Throughput: 0: 48923.9. Samples: 1764196740. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-13 02:16:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:16:49,178][71000] Updated weights for policy 0, policy_version 136444 (0.0027) [2024-06-13 02:16:50,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2235580416. Throughput: 0: 49401.1. Samples: 1764359540. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-13 02:16:50,949][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:16:51,873][71000] Updated weights for policy 0, policy_version 136454 (0.0033) [2024-06-13 02:16:55,837][71000] Updated weights for policy 0, policy_version 136464 (0.0022) [2024-06-13 02:16:55,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2235826176. Throughput: 0: 49217.7. Samples: 1764657980. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-13 02:16:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:16:58,499][71000] Updated weights for policy 0, policy_version 136474 (0.0041) [2024-06-13 02:17:00,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2236071936. Throughput: 0: 49138.3. Samples: 1764953220. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-13 02:17:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:17:02,247][71000] Updated weights for policy 0, policy_version 136484 (0.0030) [2024-06-13 02:17:04,906][71000] Updated weights for policy 0, policy_version 136494 (0.0028) [2024-06-13 02:17:05,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 2236350464. Throughput: 0: 49433.8. Samples: 1765104440. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-13 02:17:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:17:09,187][71000] Updated weights for policy 0, policy_version 136504 (0.0027) [2024-06-13 02:17:10,003][70980] Signal inference workers to stop experience collection... (26050 times) [2024-06-13 02:17:10,056][71000] InferenceWorker_p0-w0: stopping experience collection (26050 times) [2024-06-13 02:17:10,056][70980] Signal inference workers to resume experience collection... 
(26050 times) [2024-06-13 02:17:10,067][71000] InferenceWorker_p0-w0: resuming experience collection (26050 times) [2024-06-13 02:17:10,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49424.9, 300 sec: 49429.7). Total num frames: 2236579840. Throughput: 0: 49186.6. Samples: 1765392080. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-13 02:17:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:17:11,864][71000] Updated weights for policy 0, policy_version 136514 (0.0027) [2024-06-13 02:17:15,806][71000] Updated weights for policy 0, policy_version 136524 (0.0035) [2024-06-13 02:17:15,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2236809216. Throughput: 0: 49299.4. Samples: 1765692480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:17:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:17:18,405][71000] Updated weights for policy 0, policy_version 136534 (0.0026) [2024-06-13 02:17:20,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48879.0, 300 sec: 49429.8). Total num frames: 2237054976. Throughput: 0: 49066.2. Samples: 1765829400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:17:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:17:22,406][71000] Updated weights for policy 0, policy_version 136544 (0.0027) [2024-06-13 02:17:25,087][71000] Updated weights for policy 0, policy_version 136554 (0.0021) [2024-06-13 02:17:25,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 2237333504. Throughput: 0: 49241.7. Samples: 1766128940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:17:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 02:17:28,992][71000] Updated weights for policy 0, policy_version 136564 (0.0023) [2024-06-13 02:17:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 2237546496. Throughput: 0: 49477.3. Samples: 1766423220. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:17:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:17:32,051][71000] Updated weights for policy 0, policy_version 136574 (0.0034) [2024-06-13 02:17:35,712][71000] Updated weights for policy 0, policy_version 136584 (0.0030) [2024-06-13 02:17:35,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2237792256. Throughput: 0: 49094.9. Samples: 1766568800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:17:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 02:17:38,466][71000] Updated weights for policy 0, policy_version 136594 (0.0025) [2024-06-13 02:17:40,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48878.9, 300 sec: 49374.2). Total num frames: 2238021632. Throughput: 0: 48919.1. Samples: 1766859340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:17:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:17:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000136599_2238038016.pth... [2024-06-13 02:17:41,005][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000135877_2226208768.pth [2024-06-13 02:17:42,281][71000] Updated weights for policy 0, policy_version 136604 (0.0031) [2024-06-13 02:17:45,122][71000] Updated weights for policy 0, policy_version 136614 (0.0026) [2024-06-13 02:17:45,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2238316544. Throughput: 0: 48888.3. Samples: 1767153200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:17:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:17:49,115][71000] Updated weights for policy 0, policy_version 136624 (0.0033) [2024-06-13 02:17:50,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.1, 300 sec: 49374.1). Total num frames: 2238529536. Throughput: 0: 49223.6. Samples: 1767319500. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:17:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:17:51,786][71000] Updated weights for policy 0, policy_version 136634 (0.0027) [2024-06-13 02:17:55,931][71000] Updated weights for policy 0, policy_version 136644 (0.0022) [2024-06-13 02:17:55,939][70768] Fps is (10 sec: 45876.2, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2238775296. Throughput: 0: 49354.9. Samples: 1767613040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:17:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:17:58,361][71000] Updated weights for policy 0, policy_version 136654 (0.0023) [2024-06-13 02:18:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2239021056. Throughput: 0: 49309.0. Samples: 1767911380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:18:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:18:02,212][71000] Updated weights for policy 0, policy_version 136664 (0.0029) [2024-06-13 02:18:05,175][71000] Updated weights for policy 0, policy_version 136674 (0.0031) [2024-06-13 02:18:05,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2239299584. Throughput: 0: 49578.1. Samples: 1768060420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:18:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:18:09,050][71000] Updated weights for policy 0, policy_version 136684 (0.0030) [2024-06-13 02:18:10,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2239528960. Throughput: 0: 49605.7. Samples: 1768361200. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 02:18:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 02:18:11,774][71000] Updated weights for policy 0, policy_version 136694 (0.0026) [2024-06-13 02:18:12,055][70980] Signal inference workers to stop experience collection... (26100 times) [2024-06-13 02:18:12,056][70980] Signal inference workers to resume experience collection... (26100 times) [2024-06-13 02:18:12,064][71000] InferenceWorker_p0-w0: stopping experience collection (26100 times) [2024-06-13 02:18:12,079][71000] InferenceWorker_p0-w0: resuming experience collection (26100 times) [2024-06-13 02:18:15,717][71000] Updated weights for policy 0, policy_version 136704 (0.0033) [2024-06-13 02:18:15,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2239758336. Throughput: 0: 49509.0. Samples: 1768651120. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 02:18:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:18:18,463][71000] Updated weights for policy 0, policy_version 136714 (0.0028) [2024-06-13 02:18:20,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2240020480. Throughput: 0: 49435.5. Samples: 1768793400. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 02:18:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 02:18:22,349][71000] Updated weights for policy 0, policy_version 136724 (0.0034) [2024-06-13 02:18:25,159][71000] Updated weights for policy 0, policy_version 136734 (0.0024) [2024-06-13 02:18:25,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 2240282624. Throughput: 0: 49680.4. Samples: 1769094960. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 02:18:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:18:28,808][71000] Updated weights for policy 0, policy_version 136744 (0.0031) [2024-06-13 02:18:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 2240528384. Throughput: 0: 49779.2. Samples: 1769393260. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 02:18:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:18:31,646][71000] Updated weights for policy 0, policy_version 136754 (0.0032) [2024-06-13 02:18:35,313][71000] Updated weights for policy 0, policy_version 136764 (0.0024) [2024-06-13 02:18:35,939][70768] Fps is (10 sec: 47514.4, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2240757760. Throughput: 0: 49561.0. Samples: 1769549740. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 02:18:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:18:38,304][71000] Updated weights for policy 0, policy_version 136774 (0.0029) [2024-06-13 02:18:40,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 2241003520. Throughput: 0: 49354.2. Samples: 1769833980. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 02:18:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:18:41,778][71000] Updated weights for policy 0, policy_version 136784 (0.0026) [2024-06-13 02:18:45,151][71000] Updated weights for policy 0, policy_version 136794 (0.0037) [2024-06-13 02:18:45,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.2, 300 sec: 49374.2). Total num frames: 2241265664. Throughput: 0: 49337.0. Samples: 1770131540. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 02:18:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:18:48,599][71000] Updated weights for policy 0, policy_version 136804 (0.0027) [2024-06-13 02:18:50,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2241495040. Throughput: 0: 49324.6. Samples: 1770280020. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 02:18:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:18:51,890][71000] Updated weights for policy 0, policy_version 136814 (0.0027) [2024-06-13 02:18:55,273][71000] Updated weights for policy 0, policy_version 136824 (0.0029) [2024-06-13 02:18:55,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2241740800. Throughput: 0: 49191.7. Samples: 1770574820. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 02:18:55,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-13 02:18:58,478][71000] Updated weights for policy 0, policy_version 136834 (0.0023) [2024-06-13 02:19:00,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 2242002944. Throughput: 0: 49294.9. Samples: 1770869400. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 02:19:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:19:01,700][71000] Updated weights for policy 0, policy_version 136844 (0.0028) [2024-06-13 02:19:04,963][71000] Updated weights for policy 0, policy_version 136854 (0.0027) [2024-06-13 02:19:05,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2242265088. Throughput: 0: 49677.3. Samples: 1771028880. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 02:19:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:19:08,022][71000] Updated weights for policy 0, policy_version 136864 (0.0031) [2024-06-13 02:19:10,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2242494464. Throughput: 0: 49585.5. Samples: 1771326300. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 02:19:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 02:19:11,668][71000] Updated weights for policy 0, policy_version 136874 (0.0033) [2024-06-13 02:19:15,087][71000] Updated weights for policy 0, policy_version 136884 (0.0026) [2024-06-13 02:19:15,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49971.0, 300 sec: 49429.7). Total num frames: 2242756608. Throughput: 0: 49495.4. Samples: 1771620560. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 02:19:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:19:18,094][71000] Updated weights for policy 0, policy_version 136894 (0.0029) [2024-06-13 02:19:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2243002368. Throughput: 0: 49298.6. Samples: 1771768180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 02:19:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 02:19:21,452][71000] Updated weights for policy 0, policy_version 136904 (0.0033) [2024-06-13 02:19:25,063][71000] Updated weights for policy 0, policy_version 136914 (0.0030) [2024-06-13 02:19:25,189][70980] Signal inference workers to stop experience collection... (26150 times) [2024-06-13 02:19:25,193][70980] Signal inference workers to resume experience collection... 
(26150 times) [2024-06-13 02:19:25,229][71000] InferenceWorker_p0-w0: stopping experience collection (26150 times) [2024-06-13 02:19:25,230][71000] InferenceWorker_p0-w0: resuming experience collection (26150 times) [2024-06-13 02:19:25,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2243248128. Throughput: 0: 49705.6. Samples: 1772070740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 02:19:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:19:28,195][71000] Updated weights for policy 0, policy_version 136924 (0.0027) [2024-06-13 02:19:30,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2243477504. Throughput: 0: 49789.7. Samples: 1772372080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 02:19:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:19:31,406][71000] Updated weights for policy 0, policy_version 136934 (0.0028) [2024-06-13 02:19:34,483][71000] Updated weights for policy 0, policy_version 136944 (0.0019) [2024-06-13 02:19:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 2243739648. Throughput: 0: 49605.2. Samples: 1772512260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 02:19:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:19:38,142][71000] Updated weights for policy 0, policy_version 136954 (0.0025) [2024-06-13 02:19:40,939][70768] Fps is (10 sec: 52429.0, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 2244001792. Throughput: 0: 49677.8. Samples: 1772810320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 02:19:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:19:40,956][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000136963_2244001792.pth... 
[2024-06-13 02:19:41,012][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000136239_2232139776.pth [2024-06-13 02:19:41,161][71000] Updated weights for policy 0, policy_version 136964 (0.0033) [2024-06-13 02:19:45,023][71000] Updated weights for policy 0, policy_version 136974 (0.0026) [2024-06-13 02:19:45,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49971.0, 300 sec: 49485.2). Total num frames: 2244263936. Throughput: 0: 49804.4. Samples: 1773110600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 02:19:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:19:47,896][71000] Updated weights for policy 0, policy_version 136984 (0.0025) [2024-06-13 02:19:50,940][70768] Fps is (10 sec: 45874.5, 60 sec: 49424.9, 300 sec: 49374.2). Total num frames: 2244460544. Throughput: 0: 49451.9. Samples: 1773254220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 02:19:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:19:51,517][71000] Updated weights for policy 0, policy_version 136994 (0.0038) [2024-06-13 02:19:54,554][71000] Updated weights for policy 0, policy_version 137004 (0.0043) [2024-06-13 02:19:55,939][70768] Fps is (10 sec: 45876.0, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2244722688. Throughput: 0: 49496.4. Samples: 1773553640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 02:19:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 02:19:58,002][71000] Updated weights for policy 0, policy_version 137014 (0.0021) [2024-06-13 02:20:00,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 2244984832. Throughput: 0: 49406.0. Samples: 1773843820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 02:20:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:20:00,973][71000] Updated weights for policy 0, policy_version 137024 (0.0022) [2024-06-13 02:20:04,817][71000] Updated weights for policy 0, policy_version 137034 (0.0021) [2024-06-13 02:20:05,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 2245246976. Throughput: 0: 49614.8. Samples: 1774000840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 02:20:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 02:20:07,351][71000] Updated weights for policy 0, policy_version 137044 (0.0037) [2024-06-13 02:20:10,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49424.9, 300 sec: 49374.2). Total num frames: 2245459968. Throughput: 0: 49711.6. Samples: 1774307760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 02:20:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:20:11,408][71000] Updated weights for policy 0, policy_version 137054 (0.0023) [2024-06-13 02:20:14,102][71000] Updated weights for policy 0, policy_version 137064 (0.0035) [2024-06-13 02:20:15,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2245722112. Throughput: 0: 49454.2. Samples: 1774597520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 02:20:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:20:17,893][71000] Updated weights for policy 0, policy_version 137074 (0.0028) [2024-06-13 02:20:20,491][71000] Updated weights for policy 0, policy_version 137084 (0.0032) [2024-06-13 02:20:20,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 2246000640. Throughput: 0: 49867.4. Samples: 1774756300. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 02:20:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 02:20:24,301][70980] Signal inference workers to stop experience collection... (26200 times) [2024-06-13 02:20:24,333][71000] InferenceWorker_p0-w0: stopping experience collection (26200 times) [2024-06-13 02:20:24,412][70980] Signal inference workers to resume experience collection... (26200 times) [2024-06-13 02:20:24,412][71000] InferenceWorker_p0-w0: resuming experience collection (26200 times) [2024-06-13 02:20:24,574][71000] Updated weights for policy 0, policy_version 137094 (0.0032) [2024-06-13 02:20:25,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 2246246400. Throughput: 0: 50139.5. Samples: 1775066600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 02:20:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:20:26,929][71000] Updated weights for policy 0, policy_version 137104 (0.0027) [2024-06-13 02:20:30,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2246459392. Throughput: 0: 49952.1. Samples: 1775358440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 02:20:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:20:31,296][71000] Updated weights for policy 0, policy_version 137114 (0.0026) [2024-06-13 02:20:33,754][71000] Updated weights for policy 0, policy_version 137124 (0.0033) [2024-06-13 02:20:35,939][70768] Fps is (10 sec: 45875.5, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2246705152. Throughput: 0: 49675.7. Samples: 1775489620. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 02:20:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:20:37,775][71000] Updated weights for policy 0, policy_version 137134 (0.0032) [2024-06-13 02:20:40,410][71000] Updated weights for policy 0, policy_version 137144 (0.0024) [2024-06-13 02:20:40,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49698.0, 300 sec: 49485.2). Total num frames: 2246983680. Throughput: 0: 49626.9. Samples: 1775786860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 02:20:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:20:44,327][71000] Updated weights for policy 0, policy_version 137154 (0.0028) [2024-06-13 02:20:45,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2247229440. Throughput: 0: 49914.6. Samples: 1776089980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 02:20:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:20:46,666][71000] Updated weights for policy 0, policy_version 137164 (0.0020) [2024-06-13 02:20:50,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2247442432. Throughput: 0: 49642.1. Samples: 1776234740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 02:20:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:20:51,097][71000] Updated weights for policy 0, policy_version 137174 (0.0033) [2024-06-13 02:20:53,290][71000] Updated weights for policy 0, policy_version 137184 (0.0029) [2024-06-13 02:20:55,939][70768] Fps is (10 sec: 45875.8, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2247688192. Throughput: 0: 49356.6. Samples: 1776528800. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 02:20:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:20:57,837][71000] Updated weights for policy 0, policy_version 137194 (0.0035) [2024-06-13 02:21:00,154][71000] Updated weights for policy 0, policy_version 137204 (0.0033) [2024-06-13 02:21:00,940][70768] Fps is (10 sec: 52425.1, 60 sec: 49697.5, 300 sec: 49429.6). Total num frames: 2247966720. Throughput: 0: 49380.1. Samples: 1776819660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 02:21:00,941][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:21:04,164][71000] Updated weights for policy 0, policy_version 137214 (0.0029) [2024-06-13 02:21:05,939][70768] Fps is (10 sec: 54067.5, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 2248228864. Throughput: 0: 49665.6. Samples: 1776991240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 02:21:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:21:06,482][71000] Updated weights for policy 0, policy_version 137224 (0.0038) [2024-06-13 02:21:10,723][71000] Updated weights for policy 0, policy_version 137234 (0.0027) [2024-06-13 02:21:10,939][70768] Fps is (10 sec: 47517.3, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2248441856. Throughput: 0: 49346.3. Samples: 1777287180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 02:21:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:21:12,854][71000] Updated weights for policy 0, policy_version 137244 (0.0028) [2024-06-13 02:21:15,940][70768] Fps is (10 sec: 44236.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2248671232. Throughput: 0: 49379.6. Samples: 1777580520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 02:21:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:21:16,533][70980] Signal inference workers to stop experience collection... 
(26250 times) [2024-06-13 02:21:16,577][71000] InferenceWorker_p0-w0: stopping experience collection (26250 times) [2024-06-13 02:21:16,582][70980] Signal inference workers to resume experience collection... (26250 times) [2024-06-13 02:21:16,598][71000] InferenceWorker_p0-w0: resuming experience collection (26250 times) [2024-06-13 02:21:17,474][71000] Updated weights for policy 0, policy_version 137254 (0.0031) [2024-06-13 02:21:19,971][71000] Updated weights for policy 0, policy_version 137264 (0.0032) [2024-06-13 02:21:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2248949760. Throughput: 0: 49638.6. Samples: 1777723360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 02:21:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:21:24,038][71000] Updated weights for policy 0, policy_version 137274 (0.0030) [2024-06-13 02:21:25,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2249211904. Throughput: 0: 49600.6. Samples: 1778018880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 02:21:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:21:26,742][71000] Updated weights for policy 0, policy_version 137284 (0.0030) [2024-06-13 02:21:30,606][71000] Updated weights for policy 0, policy_version 137294 (0.0021) [2024-06-13 02:21:30,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2249441280. Throughput: 0: 49611.6. Samples: 1778322500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 02:21:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:21:33,046][71000] Updated weights for policy 0, policy_version 137304 (0.0022) [2024-06-13 02:21:35,940][70768] Fps is (10 sec: 44236.6, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2249654272. Throughput: 0: 49283.1. Samples: 1778452480. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 02:21:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:21:37,179][71000] Updated weights for policy 0, policy_version 137314 (0.0036) [2024-06-13 02:21:39,488][71000] Updated weights for policy 0, policy_version 137324 (0.0039) [2024-06-13 02:21:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2249932800. Throughput: 0: 49418.6. Samples: 1778752640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 02:21:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:21:40,985][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000137326_2249949184.pth... [2024-06-13 02:21:41,036][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000136599_2238038016.pth [2024-06-13 02:21:44,073][71000] Updated weights for policy 0, policy_version 137334 (0.0033) [2024-06-13 02:21:45,940][70768] Fps is (10 sec: 55705.9, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 2250211328. Throughput: 0: 49673.3. Samples: 1779054920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 02:21:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:21:46,236][71000] Updated weights for policy 0, policy_version 137344 (0.0032) [2024-06-13 02:21:50,358][71000] Updated weights for policy 0, policy_version 137354 (0.0032) [2024-06-13 02:21:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 2250424320. Throughput: 0: 49275.0. Samples: 1779208620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 02:21:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:21:52,840][71000] Updated weights for policy 0, policy_version 137364 (0.0027) [2024-06-13 02:21:55,939][70768] Fps is (10 sec: 44236.9, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2250653696. Throughput: 0: 49381.8. Samples: 1779509360. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 02:21:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:21:57,094][71000] Updated weights for policy 0, policy_version 137374 (0.0033) [2024-06-13 02:21:59,779][71000] Updated weights for policy 0, policy_version 137384 (0.0034) [2024-06-13 02:22:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.6, 300 sec: 49374.2). Total num frames: 2250915840. Throughput: 0: 48992.9. Samples: 1779785200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 02:22:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:22:04,088][71000] Updated weights for policy 0, policy_version 137394 (0.0029) [2024-06-13 02:22:05,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49151.9, 300 sec: 49485.3). Total num frames: 2251177984. Throughput: 0: 49312.5. Samples: 1779942420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 02:22:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:22:06,376][71000] Updated weights for policy 0, policy_version 137404 (0.0034) [2024-06-13 02:22:10,379][71000] Updated weights for policy 0, policy_version 137414 (0.0030) [2024-06-13 02:22:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2251390976. Throughput: 0: 49486.2. Samples: 1780245760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 02:22:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:22:12,957][71000] Updated weights for policy 0, policy_version 137424 (0.0030) [2024-06-13 02:22:15,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2251636736. Throughput: 0: 49208.9. Samples: 1780536900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 02:22:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:22:17,106][71000] Updated weights for policy 0, policy_version 137434 (0.0039) [2024-06-13 02:22:19,346][70980] Signal inference workers to stop experience collection... 
(26300 times) [2024-06-13 02:22:19,347][70980] Signal inference workers to resume experience collection... (26300 times) [2024-06-13 02:22:19,396][71000] InferenceWorker_p0-w0: stopping experience collection (26300 times) [2024-06-13 02:22:19,396][71000] InferenceWorker_p0-w0: resuming experience collection (26300 times) [2024-06-13 02:22:19,714][71000] Updated weights for policy 0, policy_version 137444 (0.0035) [2024-06-13 02:22:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2251898880. Throughput: 0: 49409.8. Samples: 1780675920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 02:22:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:22:23,723][71000] Updated weights for policy 0, policy_version 137454 (0.0034) [2024-06-13 02:22:25,940][70768] Fps is (10 sec: 54065.8, 60 sec: 49424.8, 300 sec: 49596.3). Total num frames: 2252177408. Throughput: 0: 49241.0. Samples: 1780968500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 02:22:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:22:26,607][71000] Updated weights for policy 0, policy_version 137464 (0.0025) [2024-06-13 02:22:30,600][71000] Updated weights for policy 0, policy_version 137474 (0.0027) [2024-06-13 02:22:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 2252374016. Throughput: 0: 49312.9. Samples: 1781274000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-13 02:22:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:22:33,315][71000] Updated weights for policy 0, policy_version 137484 (0.0029) [2024-06-13 02:22:35,939][70768] Fps is (10 sec: 44238.2, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2252619776. Throughput: 0: 49059.6. Samples: 1781416300. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-13 02:22:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:22:37,378][71000] Updated weights for policy 0, policy_version 137494 (0.0025) [2024-06-13 02:22:40,102][71000] Updated weights for policy 0, policy_version 137504 (0.0033) [2024-06-13 02:22:40,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 2252881920. Throughput: 0: 48602.4. Samples: 1781696480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-13 02:22:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:22:44,109][71000] Updated weights for policy 0, policy_version 137514 (0.0028) [2024-06-13 02:22:45,939][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.9, 300 sec: 49485.3). Total num frames: 2253127680. Throughput: 0: 48994.7. Samples: 1781989960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-13 02:22:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:22:47,039][71000] Updated weights for policy 0, policy_version 137524 (0.0024) [2024-06-13 02:22:50,480][71000] Updated weights for policy 0, policy_version 137534 (0.0023) [2024-06-13 02:22:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49424.9, 300 sec: 49540.7). Total num frames: 2253389824. Throughput: 0: 49060.2. Samples: 1782150140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-13 02:22:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:22:53,286][71000] Updated weights for policy 0, policy_version 137544 (0.0027) [2024-06-13 02:22:55,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2253619200. Throughput: 0: 49075.1. Samples: 1782454140. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-13 02:22:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:22:57,093][71000] Updated weights for policy 0, policy_version 137554 (0.0026) [2024-06-13 02:22:59,959][71000] Updated weights for policy 0, policy_version 137564 (0.0022) [2024-06-13 02:23:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 2253864960. Throughput: 0: 49046.1. Samples: 1782743980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-13 02:23:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:23:03,591][71000] Updated weights for policy 0, policy_version 137574 (0.0024) [2024-06-13 02:23:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.8, 300 sec: 49429.7). Total num frames: 2254110720. Throughput: 0: 49198.6. Samples: 1782889860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-13 02:23:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:23:06,930][71000] Updated weights for policy 0, policy_version 137584 (0.0033) [2024-06-13 02:23:10,154][71000] Updated weights for policy 0, policy_version 137594 (0.0023) [2024-06-13 02:23:10,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 2254372864. Throughput: 0: 49353.2. Samples: 1783189380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-13 02:23:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:23:13,439][71000] Updated weights for policy 0, policy_version 137604 (0.0033) [2024-06-13 02:23:15,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2254602240. Throughput: 0: 49105.8. Samples: 1783483760. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-13 02:23:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:23:17,043][71000] Updated weights for policy 0, policy_version 137614 (0.0031) [2024-06-13 02:23:19,223][70980] Signal inference workers to stop experience collection... (26350 times) [2024-06-13 02:23:19,256][71000] InferenceWorker_p0-w0: stopping experience collection (26350 times) [2024-06-13 02:23:19,279][70980] Signal inference workers to resume experience collection... (26350 times) [2024-06-13 02:23:19,279][71000] InferenceWorker_p0-w0: resuming experience collection (26350 times) [2024-06-13 02:23:20,296][71000] Updated weights for policy 0, policy_version 137624 (0.0023) [2024-06-13 02:23:20,940][70768] Fps is (10 sec: 47512.4, 60 sec: 49151.8, 300 sec: 49374.1). Total num frames: 2254848000. Throughput: 0: 49010.4. Samples: 1783621780. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-13 02:23:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:23:23,597][71000] Updated weights for policy 0, policy_version 137634 (0.0023) [2024-06-13 02:23:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48606.1, 300 sec: 49374.2). Total num frames: 2255093760. Throughput: 0: 49383.7. Samples: 1783918740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-13 02:23:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:23:26,801][71000] Updated weights for policy 0, policy_version 137644 (0.0031) [2024-06-13 02:23:29,990][71000] Updated weights for policy 0, policy_version 137654 (0.0033) [2024-06-13 02:23:30,940][70768] Fps is (10 sec: 52430.0, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 2255372288. Throughput: 0: 49503.9. Samples: 1784217640. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 02:23:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:23:33,798][71000] Updated weights for policy 0, policy_version 137664 (0.0024) [2024-06-13 02:23:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2255585280. Throughput: 0: 49276.6. Samples: 1784367580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 02:23:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:23:36,793][71000] Updated weights for policy 0, policy_version 137674 (0.0027) [2024-06-13 02:23:40,639][71000] Updated weights for policy 0, policy_version 137684 (0.0023) [2024-06-13 02:23:40,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.1, 300 sec: 49374.1). Total num frames: 2255831040. Throughput: 0: 49032.0. Samples: 1784660580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 02:23:40,944][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:23:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000137685_2255831040.pth... [2024-06-13 02:23:40,989][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000136963_2244001792.pth [2024-06-13 02:23:43,704][71000] Updated weights for policy 0, policy_version 137694 (0.0029) [2024-06-13 02:23:45,940][70768] Fps is (10 sec: 47512.6, 60 sec: 48878.7, 300 sec: 49374.1). Total num frames: 2256060416. Throughput: 0: 48844.9. Samples: 1784942000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 02:23:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:23:47,113][71000] Updated weights for policy 0, policy_version 137704 (0.0031) [2024-06-13 02:23:50,415][71000] Updated weights for policy 0, policy_version 137714 (0.0027) [2024-06-13 02:23:50,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 2256322560. Throughput: 0: 49017.4. Samples: 1785095640. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 02:23:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:23:53,819][71000] Updated weights for policy 0, policy_version 137724 (0.0021) [2024-06-13 02:23:55,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2256568320. Throughput: 0: 48998.2. Samples: 1785394300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 02:23:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:23:56,885][71000] Updated weights for policy 0, policy_version 137734 (0.0025) [2024-06-13 02:24:00,559][71000] Updated weights for policy 0, policy_version 137744 (0.0024) [2024-06-13 02:24:00,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2256797696. Throughput: 0: 49110.6. Samples: 1785693740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 02:24:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:24:03,642][71000] Updated weights for policy 0, policy_version 137754 (0.0035) [2024-06-13 02:24:05,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48879.1, 300 sec: 49318.6). Total num frames: 2257043456. Throughput: 0: 49271.9. Samples: 1785839000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 02:24:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:24:07,154][71000] Updated weights for policy 0, policy_version 137764 (0.0031) [2024-06-13 02:24:10,389][71000] Updated weights for policy 0, policy_version 137774 (0.0033) [2024-06-13 02:24:10,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2257321984. Throughput: 0: 49098.3. Samples: 1786128160. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 02:24:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:24:13,811][71000] Updated weights for policy 0, policy_version 137784 (0.0028) [2024-06-13 02:24:15,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2257551360. Throughput: 0: 48923.1. Samples: 1786419180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 02:24:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:24:17,013][71000] Updated weights for policy 0, policy_version 137794 (0.0028) [2024-06-13 02:24:20,639][71000] Updated weights for policy 0, policy_version 137804 (0.0031) [2024-06-13 02:24:20,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48879.1, 300 sec: 49263.1). Total num frames: 2257780736. Throughput: 0: 49112.4. Samples: 1786577640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 02:24:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:24:22,742][70980] Signal inference workers to stop experience collection... (26400 times) [2024-06-13 02:24:22,787][71000] InferenceWorker_p0-w0: stopping experience collection (26400 times) [2024-06-13 02:24:22,793][70980] Signal inference workers to resume experience collection... (26400 times) [2024-06-13 02:24:22,808][71000] InferenceWorker_p0-w0: resuming experience collection (26400 times) [2024-06-13 02:24:23,716][71000] Updated weights for policy 0, policy_version 137814 (0.0025) [2024-06-13 02:24:25,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 2258042880. Throughput: 0: 48784.7. Samples: 1786855900. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 02:24:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:24:27,506][71000] Updated weights for policy 0, policy_version 137824 (0.0031) [2024-06-13 02:24:30,200][71000] Updated weights for policy 0, policy_version 137834 (0.0037) [2024-06-13 02:24:30,939][70768] Fps is (10 sec: 52429.2, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 2258305024. Throughput: 0: 49050.0. Samples: 1787149240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 02:24:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 02:24:34,042][71000] Updated weights for policy 0, policy_version 137844 (0.0027) [2024-06-13 02:24:35,939][70768] Fps is (10 sec: 49153.2, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2258534400. Throughput: 0: 49272.2. Samples: 1787312880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 02:24:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:24:36,866][71000] Updated weights for policy 0, policy_version 137854 (0.0025) [2024-06-13 02:24:40,893][71000] Updated weights for policy 0, policy_version 137864 (0.0028) [2024-06-13 02:24:40,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2258763776. Throughput: 0: 48932.9. Samples: 1787596280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 02:24:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:24:43,423][71000] Updated weights for policy 0, policy_version 137874 (0.0025) [2024-06-13 02:24:45,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.2, 300 sec: 49318.6). Total num frames: 2259009536. Throughput: 0: 49064.9. Samples: 1787901660. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 02:24:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:24:47,267][71000] Updated weights for policy 0, policy_version 137884 (0.0025) [2024-06-13 02:24:49,912][71000] Updated weights for policy 0, policy_version 137894 (0.0021) [2024-06-13 02:24:50,940][70768] Fps is (10 sec: 54067.3, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2259304448. Throughput: 0: 49059.4. Samples: 1788046680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 02:24:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:24:54,083][71000] Updated weights for policy 0, policy_version 137904 (0.0031) [2024-06-13 02:24:55,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2259517440. Throughput: 0: 49195.5. Samples: 1788341960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 02:24:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:24:56,782][71000] Updated weights for policy 0, policy_version 137914 (0.0029) [2024-06-13 02:25:00,638][71000] Updated weights for policy 0, policy_version 137924 (0.0027) [2024-06-13 02:25:00,939][70768] Fps is (10 sec: 45875.5, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2259763200. Throughput: 0: 49487.2. Samples: 1788646100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 02:25:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 02:25:03,139][71000] Updated weights for policy 0, policy_version 137934 (0.0030) [2024-06-13 02:25:05,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49151.8, 300 sec: 49263.1). Total num frames: 2259992576. Throughput: 0: 48919.5. Samples: 1788779020. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 02:25:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:25:07,130][71000] Updated weights for policy 0, policy_version 137944 (0.0029) [2024-06-13 02:25:09,771][71000] Updated weights for policy 0, policy_version 137954 (0.0028) [2024-06-13 02:25:10,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2260271104. Throughput: 0: 49488.0. Samples: 1789082860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 02:25:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:25:13,955][71000] Updated weights for policy 0, policy_version 137964 (0.0026) [2024-06-13 02:25:15,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 2260516864. Throughput: 0: 49470.9. Samples: 1789375440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 02:25:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:25:16,270][71000] Updated weights for policy 0, policy_version 137974 (0.0024) [2024-06-13 02:25:20,135][70980] Signal inference workers to stop experience collection... (26450 times) [2024-06-13 02:25:20,135][70980] Signal inference workers to resume experience collection... (26450 times) [2024-06-13 02:25:20,173][71000] InferenceWorker_p0-w0: stopping experience collection (26450 times) [2024-06-13 02:25:20,173][71000] InferenceWorker_p0-w0: resuming experience collection (26450 times) [2024-06-13 02:25:20,611][71000] Updated weights for policy 0, policy_version 137984 (0.0033) [2024-06-13 02:25:20,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2260746240. Throughput: 0: 49205.1. Samples: 1789527120. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 02:25:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:25:23,192][71000] Updated weights for policy 0, policy_version 137994 (0.0022) [2024-06-13 02:25:25,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2260975616. Throughput: 0: 49439.1. Samples: 1789821040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 02:25:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:25:27,206][71000] Updated weights for policy 0, policy_version 138004 (0.0029) [2024-06-13 02:25:29,792][71000] Updated weights for policy 0, policy_version 138014 (0.0033) [2024-06-13 02:25:30,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2261254144. Throughput: 0: 49037.6. Samples: 1790108360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 02:25:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:25:34,226][71000] Updated weights for policy 0, policy_version 138024 (0.0035) [2024-06-13 02:25:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49151.8, 300 sec: 49152.0). Total num frames: 2261483520. Throughput: 0: 49375.0. Samples: 1790268560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 02:25:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 02:25:36,583][71000] Updated weights for policy 0, policy_version 138034 (0.0029) [2024-06-13 02:25:40,595][71000] Updated weights for policy 0, policy_version 138044 (0.0026) [2024-06-13 02:25:40,939][70768] Fps is (10 sec: 45875.8, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2261712896. Throughput: 0: 49199.6. Samples: 1790555940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 02:25:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:25:41,059][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000138045_2261729280.pth... 
[2024-06-13 02:25:41,110][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000137326_2249949184.pth [2024-06-13 02:25:42,989][71000] Updated weights for policy 0, policy_version 138054 (0.0024) [2024-06-13 02:25:45,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2261958656. Throughput: 0: 49144.4. Samples: 1790857600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 02:25:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:25:47,149][71000] Updated weights for policy 0, policy_version 138064 (0.0037) [2024-06-13 02:25:49,855][71000] Updated weights for policy 0, policy_version 138074 (0.0030) [2024-06-13 02:25:50,940][70768] Fps is (10 sec: 52428.0, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 2262237184. Throughput: 0: 49292.4. Samples: 1790997180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 02:25:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:25:53,955][71000] Updated weights for policy 0, policy_version 138084 (0.0033) [2024-06-13 02:25:55,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.1, 300 sec: 49207.7). Total num frames: 2262482944. Throughput: 0: 49049.9. Samples: 1791290100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 02:25:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:25:56,615][71000] Updated weights for policy 0, policy_version 138094 (0.0024) [2024-06-13 02:26:00,689][71000] Updated weights for policy 0, policy_version 138104 (0.0028) [2024-06-13 02:26:00,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2262695936. Throughput: 0: 49120.6. Samples: 1791585860. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 02:26:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:26:02,910][71000] Updated weights for policy 0, policy_version 138114 (0.0029) [2024-06-13 02:26:05,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2262941696. Throughput: 0: 48734.7. Samples: 1791720180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 02:26:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:26:07,074][71000] Updated weights for policy 0, policy_version 138124 (0.0028) [2024-06-13 02:26:09,903][71000] Updated weights for policy 0, policy_version 138134 (0.0027) [2024-06-13 02:26:10,942][70768] Fps is (10 sec: 50775.8, 60 sec: 48876.7, 300 sec: 49262.6). Total num frames: 2263203840. Throughput: 0: 49068.0. Samples: 1792029240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 02:26:10,943][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:26:14,164][71000] Updated weights for policy 0, policy_version 138144 (0.0035) [2024-06-13 02:26:14,960][70980] Signal inference workers to stop experience collection... (26500 times) [2024-06-13 02:26:14,991][71000] InferenceWorker_p0-w0: stopping experience collection (26500 times) [2024-06-13 02:26:15,015][70980] Signal inference workers to resume experience collection... (26500 times) [2024-06-13 02:26:15,017][71000] InferenceWorker_p0-w0: resuming experience collection (26500 times) [2024-06-13 02:26:15,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2263482368. Throughput: 0: 49178.2. Samples: 1792321380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 02:26:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:26:16,692][71000] Updated weights for policy 0, policy_version 138154 (0.0039) [2024-06-13 02:26:20,757][71000] Updated weights for policy 0, policy_version 138164 (0.0027) [2024-06-13 02:26:20,940][70768] Fps is (10 sec: 49165.8, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 2263695360. Throughput: 0: 48879.6. Samples: 1792468140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 02:26:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:26:23,208][71000] Updated weights for policy 0, policy_version 138174 (0.0028) [2024-06-13 02:26:25,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2263941120. Throughput: 0: 49234.9. Samples: 1792771520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 02:26:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:26:27,002][71000] Updated weights for policy 0, policy_version 138184 (0.0024) [2024-06-13 02:26:29,825][71000] Updated weights for policy 0, policy_version 138194 (0.0023) [2024-06-13 02:26:30,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2264203264. Throughput: 0: 49056.7. Samples: 1793065160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 02:26:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:26:33,978][71000] Updated weights for policy 0, policy_version 138204 (0.0038) [2024-06-13 02:26:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2264449024. Throughput: 0: 49308.5. Samples: 1793216060. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 02:26:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:26:36,419][71000] Updated weights for policy 0, policy_version 138214 (0.0028) [2024-06-13 02:26:40,659][71000] Updated weights for policy 0, policy_version 138224 (0.0027) [2024-06-13 02:26:40,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49424.9, 300 sec: 49040.9). Total num frames: 2264678400. Throughput: 0: 49216.7. Samples: 1793504860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:26:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:26:43,596][71000] Updated weights for policy 0, policy_version 138234 (0.0037) [2024-06-13 02:26:45,940][70768] Fps is (10 sec: 44237.1, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2264891392. Throughput: 0: 49135.1. Samples: 1793796940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:26:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 02:26:47,292][71000] Updated weights for policy 0, policy_version 138244 (0.0024) [2024-06-13 02:26:49,949][71000] Updated weights for policy 0, policy_version 138254 (0.0027) [2024-06-13 02:26:50,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2265186304. Throughput: 0: 49408.5. Samples: 1793943560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:26:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:26:53,748][71000] Updated weights for policy 0, policy_version 138264 (0.0034) [2024-06-13 02:26:55,940][70768] Fps is (10 sec: 55705.8, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2265448448. Throughput: 0: 49300.0. Samples: 1794247600. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:26:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:26:56,328][71000] Updated weights for policy 0, policy_version 138274 (0.0033) [2024-06-13 02:27:00,402][71000] Updated weights for policy 0, policy_version 138284 (0.0030) [2024-06-13 02:27:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.0, 300 sec: 49152.0). Total num frames: 2265677824. Throughput: 0: 49313.8. Samples: 1794540500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:27:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:27:03,252][71000] Updated weights for policy 0, policy_version 138294 (0.0030) [2024-06-13 02:27:05,939][70768] Fps is (10 sec: 44236.9, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2265890816. Throughput: 0: 49156.5. Samples: 1794680180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:27:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:27:07,171][71000] Updated weights for policy 0, policy_version 138304 (0.0035) [2024-06-13 02:27:10,042][71000] Updated weights for policy 0, policy_version 138314 (0.0028) [2024-06-13 02:27:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49154.2, 300 sec: 49207.5). Total num frames: 2266152960. Throughput: 0: 48985.8. Samples: 1794975880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:27:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:27:12,695][70980] Signal inference workers to stop experience collection... (26550 times) [2024-06-13 02:27:12,696][70980] Signal inference workers to resume experience collection... 
(26550 times) [2024-06-13 02:27:12,707][71000] InferenceWorker_p0-w0: stopping experience collection (26550 times) [2024-06-13 02:27:12,717][71000] InferenceWorker_p0-w0: resuming experience collection (26550 times) [2024-06-13 02:27:13,801][71000] Updated weights for policy 0, policy_version 138324 (0.0033) [2024-06-13 02:27:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48606.0, 300 sec: 49152.0). Total num frames: 2266398720. Throughput: 0: 48944.2. Samples: 1795267640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:27:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:27:16,683][71000] Updated weights for policy 0, policy_version 138334 (0.0027) [2024-06-13 02:27:20,438][71000] Updated weights for policy 0, policy_version 138344 (0.0030) [2024-06-13 02:27:20,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.1, 300 sec: 49041.0). Total num frames: 2266644480. Throughput: 0: 49062.4. Samples: 1795423860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:27:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:27:23,462][71000] Updated weights for policy 0, policy_version 138354 (0.0035) [2024-06-13 02:27:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2266873856. Throughput: 0: 49132.2. Samples: 1795715800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:27:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:27:26,919][71000] Updated weights for policy 0, policy_version 138364 (0.0028) [2024-06-13 02:27:30,126][71000] Updated weights for policy 0, policy_version 138374 (0.0031) [2024-06-13 02:27:30,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2267136000. Throughput: 0: 49185.7. Samples: 1796010300. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:27:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:27:33,490][71000] Updated weights for policy 0, policy_version 138384 (0.0028) [2024-06-13 02:27:35,940][70768] Fps is (10 sec: 50789.3, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2267381760. Throughput: 0: 49355.4. Samples: 1796164560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:27:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:27:37,001][71000] Updated weights for policy 0, policy_version 138394 (0.0023) [2024-06-13 02:27:40,026][71000] Updated weights for policy 0, policy_version 138404 (0.0028) [2024-06-13 02:27:40,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.2, 300 sec: 49207.5). Total num frames: 2267643904. Throughput: 0: 49081.8. Samples: 1796456280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:27:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:27:41,048][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000138407_2267660288.pth... [2024-06-13 02:27:41,094][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000137685_2255831040.pth [2024-06-13 02:27:43,554][71000] Updated weights for policy 0, policy_version 138414 (0.0022) [2024-06-13 02:27:45,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 2267873280. Throughput: 0: 49207.7. Samples: 1796754840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 02:27:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:27:47,022][71000] Updated weights for policy 0, policy_version 138424 (0.0030) [2024-06-13 02:27:50,165][71000] Updated weights for policy 0, policy_version 138434 (0.0033) [2024-06-13 02:27:50,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2268119040. Throughput: 0: 49197.2. Samples: 1796894060. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 02:27:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:27:53,328][71000] Updated weights for policy 0, policy_version 138444 (0.0036) [2024-06-13 02:27:55,942][70768] Fps is (10 sec: 49142.3, 60 sec: 48604.2, 300 sec: 49151.7). Total num frames: 2268364800. Throughput: 0: 49348.2. Samples: 1797196640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 02:27:55,942][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:27:57,183][71000] Updated weights for policy 0, policy_version 138454 (0.0042) [2024-06-13 02:28:00,382][71000] Updated weights for policy 0, policy_version 138464 (0.0025) [2024-06-13 02:28:00,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2268643328. Throughput: 0: 49365.3. Samples: 1797489080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 02:28:00,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 02:28:03,707][71000] Updated weights for policy 0, policy_version 138474 (0.0030) [2024-06-13 02:28:05,940][70768] Fps is (10 sec: 47523.1, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2268839936. Throughput: 0: 49274.2. Samples: 1797641200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 02:28:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:28:07,066][71000] Updated weights for policy 0, policy_version 138484 (0.0030) [2024-06-13 02:28:09,959][70980] Signal inference workers to stop experience collection... (26600 times) [2024-06-13 02:28:09,960][70980] Signal inference workers to resume experience collection... 
(26600 times) [2024-06-13 02:28:09,996][71000] InferenceWorker_p0-w0: stopping experience collection (26600 times) [2024-06-13 02:28:09,996][71000] InferenceWorker_p0-w0: resuming experience collection (26600 times) [2024-06-13 02:28:10,641][71000] Updated weights for policy 0, policy_version 138494 (0.0028) [2024-06-13 02:28:10,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2269102080. Throughput: 0: 49151.8. Samples: 1797927640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 02:28:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 02:28:13,499][71000] Updated weights for policy 0, policy_version 138504 (0.0032) [2024-06-13 02:28:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2269347840. Throughput: 0: 49033.0. Samples: 1798216780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 02:28:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:28:17,506][71000] Updated weights for policy 0, policy_version 138514 (0.0029) [2024-06-13 02:28:20,289][71000] Updated weights for policy 0, policy_version 138524 (0.0032) [2024-06-13 02:28:20,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.0, 300 sec: 49263.1). Total num frames: 2269626368. Throughput: 0: 49043.6. Samples: 1798371520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 02:28:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:28:24,050][71000] Updated weights for policy 0, policy_version 138534 (0.0029) [2024-06-13 02:28:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 2269839360. Throughput: 0: 49059.9. Samples: 1798663980. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 02:28:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 02:28:27,073][71000] Updated weights for policy 0, policy_version 138544 (0.0028) [2024-06-13 02:28:30,569][71000] Updated weights for policy 0, policy_version 138554 (0.0031) [2024-06-13 02:28:30,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2270085120. Throughput: 0: 49057.7. Samples: 1798962440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 02:28:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:28:33,539][71000] Updated weights for policy 0, policy_version 138564 (0.0025) [2024-06-13 02:28:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2270330880. Throughput: 0: 49218.3. Samples: 1799108880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 02:28:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:28:37,231][71000] Updated weights for policy 0, policy_version 138574 (0.0032) [2024-06-13 02:28:40,230][71000] Updated weights for policy 0, policy_version 138584 (0.0028) [2024-06-13 02:28:40,939][70768] Fps is (10 sec: 52429.7, 60 sec: 49425.1, 300 sec: 49318.7). Total num frames: 2270609408. Throughput: 0: 49154.2. Samples: 1799408480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-13 02:28:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:28:43,703][71000] Updated weights for policy 0, policy_version 138594 (0.0041) [2024-06-13 02:28:45,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2270822400. Throughput: 0: 49304.3. Samples: 1799707780. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:28:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:28:46,649][71000] Updated weights for policy 0, policy_version 138604 (0.0037) [2024-06-13 02:28:50,679][71000] Updated weights for policy 0, policy_version 138614 (0.0028) [2024-06-13 02:28:50,940][70768] Fps is (10 sec: 45874.3, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2271068160. Throughput: 0: 48928.3. Samples: 1799842980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:28:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:28:53,301][71000] Updated weights for policy 0, policy_version 138624 (0.0028) [2024-06-13 02:28:55,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49153.6, 300 sec: 49207.5). Total num frames: 2271313920. Throughput: 0: 49112.1. Samples: 1800137680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:28:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:28:57,090][71000] Updated weights for policy 0, policy_version 138634 (0.0032) [2024-06-13 02:28:59,386][70980] Signal inference workers to stop experience collection... (26650 times) [2024-06-13 02:28:59,387][70980] Signal inference workers to resume experience collection... (26650 times) [2024-06-13 02:28:59,424][71000] InferenceWorker_p0-w0: stopping experience collection (26650 times) [2024-06-13 02:28:59,425][71000] InferenceWorker_p0-w0: resuming experience collection (26650 times) [2024-06-13 02:28:59,907][71000] Updated weights for policy 0, policy_version 138644 (0.0026) [2024-06-13 02:29:00,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2271592448. Throughput: 0: 49449.6. Samples: 1800442020. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:29:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:29:03,600][71000] Updated weights for policy 0, policy_version 138654 (0.0026) [2024-06-13 02:29:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49096.4). Total num frames: 2271805440. Throughput: 0: 49549.9. Samples: 1800601260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:29:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:29:06,512][71000] Updated weights for policy 0, policy_version 138664 (0.0032) [2024-06-13 02:29:10,285][71000] Updated weights for policy 0, policy_version 138674 (0.0030) [2024-06-13 02:29:10,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2272051200. Throughput: 0: 49585.7. Samples: 1800895340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:29:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:29:13,142][71000] Updated weights for policy 0, policy_version 138684 (0.0027) [2024-06-13 02:29:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2272296960. Throughput: 0: 49302.7. Samples: 1801181060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:29:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:29:17,064][71000] Updated weights for policy 0, policy_version 138694 (0.0036) [2024-06-13 02:29:19,748][71000] Updated weights for policy 0, policy_version 138704 (0.0031) [2024-06-13 02:29:20,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2272575488. Throughput: 0: 49402.7. Samples: 1801332000. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:29:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:29:23,627][71000] Updated weights for policy 0, policy_version 138714 (0.0024) [2024-06-13 02:29:25,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2272821248. Throughput: 0: 49353.3. Samples: 1801629380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:29:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:29:26,151][71000] Updated weights for policy 0, policy_version 138724 (0.0026) [2024-06-13 02:29:30,102][71000] Updated weights for policy 0, policy_version 138734 (0.0031) [2024-06-13 02:29:30,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2273034240. Throughput: 0: 49415.7. Samples: 1801931480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:29:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:29:32,682][71000] Updated weights for policy 0, policy_version 138744 (0.0033) [2024-06-13 02:29:35,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2273296384. Throughput: 0: 49457.0. Samples: 1802068540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:29:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:29:37,344][71000] Updated weights for policy 0, policy_version 138754 (0.0031) [2024-06-13 02:29:39,464][71000] Updated weights for policy 0, policy_version 138764 (0.0034) [2024-06-13 02:29:40,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49151.8, 300 sec: 49318.6). Total num frames: 2273558528. Throughput: 0: 49394.1. Samples: 1802360420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:29:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 02:29:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000138767_2273558528.pth... 
[2024-06-13 02:29:40,997][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000138045_2261729280.pth [2024-06-13 02:29:44,062][71000] Updated weights for policy 0, policy_version 138774 (0.0026) [2024-06-13 02:29:45,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.3, 300 sec: 49152.0). Total num frames: 2273804288. Throughput: 0: 49334.9. Samples: 1802662080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:29:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:29:46,260][71000] Updated weights for policy 0, policy_version 138784 (0.0028) [2024-06-13 02:29:50,477][71000] Updated weights for policy 0, policy_version 138794 (0.0026) [2024-06-13 02:29:50,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2274017280. Throughput: 0: 49072.0. Samples: 1802809500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 02:29:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:29:52,188][70980] Signal inference workers to stop experience collection... (26700 times) [2024-06-13 02:29:52,197][71000] InferenceWorker_p0-w0: stopping experience collection (26700 times) [2024-06-13 02:29:52,298][70980] Signal inference workers to resume experience collection... (26700 times) [2024-06-13 02:29:52,298][71000] InferenceWorker_p0-w0: resuming experience collection (26700 times) [2024-06-13 02:29:52,708][71000] Updated weights for policy 0, policy_version 138804 (0.0028) [2024-06-13 02:29:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2274279424. Throughput: 0: 49108.1. Samples: 1803105200. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 02:29:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:29:57,219][71000] Updated weights for policy 0, policy_version 138814 (0.0028) [2024-06-13 02:29:59,468][71000] Updated weights for policy 0, policy_version 138824 (0.0026) [2024-06-13 02:30:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48879.1, 300 sec: 49263.1). Total num frames: 2274525184. Throughput: 0: 49049.4. Samples: 1803388280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 02:30:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:30:03,980][71000] Updated weights for policy 0, policy_version 138834 (0.0026) [2024-06-13 02:30:05,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49971.3, 300 sec: 49263.1). Total num frames: 2274803712. Throughput: 0: 49283.6. Samples: 1803549760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 02:30:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:30:05,984][71000] Updated weights for policy 0, policy_version 138844 (0.0033) [2024-06-13 02:30:10,341][71000] Updated weights for policy 0, policy_version 138854 (0.0026) [2024-06-13 02:30:10,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2274983936. Throughput: 0: 49354.9. Samples: 1803850360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 02:30:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:30:13,019][71000] Updated weights for policy 0, policy_version 138864 (0.0033) [2024-06-13 02:30:15,940][70768] Fps is (10 sec: 45874.3, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2275262464. Throughput: 0: 48841.2. Samples: 1804129340. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 02:30:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:30:17,408][71000] Updated weights for policy 0, policy_version 138874 (0.0037) [2024-06-13 02:30:19,546][71000] Updated weights for policy 0, policy_version 138884 (0.0031) [2024-06-13 02:30:20,940][70768] Fps is (10 sec: 52429.4, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2275508224. Throughput: 0: 49327.6. Samples: 1804288280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 02:30:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:30:24,035][71000] Updated weights for policy 0, policy_version 138894 (0.0029) [2024-06-13 02:30:25,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2275786752. Throughput: 0: 49611.7. Samples: 1804592940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 02:30:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:30:26,123][71000] Updated weights for policy 0, policy_version 138904 (0.0032) [2024-06-13 02:30:30,406][71000] Updated weights for policy 0, policy_version 138914 (0.0027) [2024-06-13 02:30:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2275999744. Throughput: 0: 49612.7. Samples: 1804894660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 02:30:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 02:30:32,981][71000] Updated weights for policy 0, policy_version 138924 (0.0033) [2024-06-13 02:30:35,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2276245504. Throughput: 0: 49174.2. Samples: 1805022340. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 02:30:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:30:37,371][71000] Updated weights for policy 0, policy_version 138934 (0.0027) [2024-06-13 02:30:39,465][71000] Updated weights for policy 0, policy_version 138944 (0.0026) [2024-06-13 02:30:40,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48879.1, 300 sec: 49263.1). Total num frames: 2276491264. Throughput: 0: 48995.1. Samples: 1805309980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 02:30:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:30:44,042][71000] Updated weights for policy 0, policy_version 138954 (0.0023) [2024-06-13 02:30:44,770][70980] Signal inference workers to stop experience collection... (26750 times) [2024-06-13 02:30:44,819][71000] InferenceWorker_p0-w0: stopping experience collection (26750 times) [2024-06-13 02:30:44,825][70980] Signal inference workers to resume experience collection... (26750 times) [2024-06-13 02:30:44,835][71000] InferenceWorker_p0-w0: resuming experience collection (26750 times) [2024-06-13 02:30:45,939][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2276769792. Throughput: 0: 49249.4. Samples: 1805604500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 02:30:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:30:46,434][71000] Updated weights for policy 0, policy_version 138964 (0.0029) [2024-06-13 02:30:50,637][71000] Updated weights for policy 0, policy_version 138974 (0.0028) [2024-06-13 02:30:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2276966400. Throughput: 0: 49158.2. Samples: 1805761880. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 02:30:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:30:52,790][71000] Updated weights for policy 0, policy_version 138984 (0.0027) [2024-06-13 02:30:55,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2277228544. Throughput: 0: 49073.0. Samples: 1806058640. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-13 02:30:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:30:57,303][71000] Updated weights for policy 0, policy_version 138994 (0.0038) [2024-06-13 02:30:59,366][71000] Updated weights for policy 0, policy_version 139004 (0.0022) [2024-06-13 02:31:00,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2277474304. Throughput: 0: 49442.0. Samples: 1806354220. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-13 02:31:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:31:03,606][71000] Updated weights for policy 0, policy_version 139014 (0.0033) [2024-06-13 02:31:05,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.9, 300 sec: 49263.5). Total num frames: 2277736448. Throughput: 0: 49460.0. Samples: 1806513980. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-13 02:31:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:31:06,175][71000] Updated weights for policy 0, policy_version 139024 (0.0033) [2024-06-13 02:31:10,341][71000] Updated weights for policy 0, policy_version 139034 (0.0026) [2024-06-13 02:31:10,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49971.3, 300 sec: 49152.0). Total num frames: 2277982208. Throughput: 0: 49055.1. Samples: 1806800420. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-13 02:31:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:31:12,837][71000] Updated weights for policy 0, policy_version 139044 (0.0032) [2024-06-13 02:31:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2278227968. Throughput: 0: 49127.1. Samples: 1807105380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-13 02:31:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:31:16,842][71000] Updated weights for policy 0, policy_version 139054 (0.0021) [2024-06-13 02:31:19,208][71000] Updated weights for policy 0, policy_version 139064 (0.0031) [2024-06-13 02:31:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2278473728. Throughput: 0: 49474.7. Samples: 1807248700. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-13 02:31:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:31:23,356][71000] Updated weights for policy 0, policy_version 139074 (0.0024) [2024-06-13 02:31:25,854][71000] Updated weights for policy 0, policy_version 139084 (0.0026) [2024-06-13 02:31:25,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2278752256. Throughput: 0: 49733.6. Samples: 1807548000. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-13 02:31:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:31:29,977][71000] Updated weights for policy 0, policy_version 139094 (0.0027) [2024-06-13 02:31:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2278981632. Throughput: 0: 49790.1. Samples: 1807845060. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-13 02:31:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:31:32,595][71000] Updated weights for policy 0, policy_version 139104 (0.0028) [2024-06-13 02:31:35,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2279211008. Throughput: 0: 49425.3. Samples: 1807986020. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-13 02:31:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:31:36,491][71000] Updated weights for policy 0, policy_version 139114 (0.0024) [2024-06-13 02:31:37,056][70980] Signal inference workers to stop experience collection... (26800 times) [2024-06-13 02:31:37,057][70980] Signal inference workers to resume experience collection... (26800 times) [2024-06-13 02:31:37,073][71000] InferenceWorker_p0-w0: stopping experience collection (26800 times) [2024-06-13 02:31:37,100][71000] InferenceWorker_p0-w0: resuming experience collection (26800 times) [2024-06-13 02:31:39,281][71000] Updated weights for policy 0, policy_version 139124 (0.0027) [2024-06-13 02:31:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2279456768. Throughput: 0: 49581.3. Samples: 1808289800. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-13 02:31:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:31:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000139127_2279456768.pth... [2024-06-13 02:31:41,004][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000138407_2267660288.pth [2024-06-13 02:31:42,923][71000] Updated weights for policy 0, policy_version 139134 (0.0024) [2024-06-13 02:31:45,782][71000] Updated weights for policy 0, policy_version 139144 (0.0023) [2024-06-13 02:31:45,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2279735296. Throughput: 0: 49679.4. Samples: 1808589800. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-13 02:31:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:31:49,615][71000] Updated weights for policy 0, policy_version 139154 (0.0030) [2024-06-13 02:31:50,940][70768] Fps is (10 sec: 54066.8, 60 sec: 50517.2, 300 sec: 49318.6). Total num frames: 2279997440. Throughput: 0: 49476.4. Samples: 1808740420. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-13 02:31:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:31:52,164][71000] Updated weights for policy 0, policy_version 139164 (0.0034) [2024-06-13 02:31:55,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2280210432. Throughput: 0: 49741.3. Samples: 1809038780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:31:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 02:31:56,400][71000] Updated weights for policy 0, policy_version 139174 (0.0031) [2024-06-13 02:31:59,071][71000] Updated weights for policy 0, policy_version 139184 (0.0028) [2024-06-13 02:32:00,940][70768] Fps is (10 sec: 44236.7, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 2280439808. Throughput: 0: 49644.5. Samples: 1809339380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:32:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:32:02,750][71000] Updated weights for policy 0, policy_version 139194 (0.0022) [2024-06-13 02:32:05,861][71000] Updated weights for policy 0, policy_version 139204 (0.0030) [2024-06-13 02:32:05,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 2280718336. Throughput: 0: 49495.5. Samples: 1809476000. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:32:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:32:09,254][71000] Updated weights for policy 0, policy_version 139214 (0.0031) [2024-06-13 02:32:10,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 2280964096. Throughput: 0: 49560.4. Samples: 1809778220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:32:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:32:12,295][71000] Updated weights for policy 0, policy_version 139224 (0.0028) [2024-06-13 02:32:15,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.3, 300 sec: 49318.6). Total num frames: 2281193472. Throughput: 0: 49627.3. Samples: 1810078280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:32:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:32:15,984][71000] Updated weights for policy 0, policy_version 139234 (0.0027) [2024-06-13 02:32:19,297][71000] Updated weights for policy 0, policy_version 139244 (0.0036) [2024-06-13 02:32:20,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2281422848. Throughput: 0: 49621.7. Samples: 1810219000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:32:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 02:32:22,384][71000] Updated weights for policy 0, policy_version 139254 (0.0019) [2024-06-13 02:32:25,752][71000] Updated weights for policy 0, policy_version 139264 (0.0024) [2024-06-13 02:32:25,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2281701376. Throughput: 0: 49529.3. Samples: 1810518620. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:32:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:32:28,956][71000] Updated weights for policy 0, policy_version 139274 (0.0029) [2024-06-13 02:32:30,939][70768] Fps is (10 sec: 54067.6, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2281963520. Throughput: 0: 49392.1. Samples: 1810812440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:32:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:32:32,735][71000] Updated weights for policy 0, policy_version 139284 (0.0033) [2024-06-13 02:32:35,637][70980] Signal inference workers to stop experience collection... (26850 times) [2024-06-13 02:32:35,637][70980] Signal inference workers to resume experience collection... (26850 times) [2024-06-13 02:32:35,657][71000] InferenceWorker_p0-w0: stopping experience collection (26850 times) [2024-06-13 02:32:35,657][71000] InferenceWorker_p0-w0: resuming experience collection (26850 times) [2024-06-13 02:32:35,818][71000] Updated weights for policy 0, policy_version 139294 (0.0033) [2024-06-13 02:32:35,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 2282209280. Throughput: 0: 49664.6. Samples: 1810975320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:32:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:32:39,263][71000] Updated weights for policy 0, policy_version 139304 (0.0027) [2024-06-13 02:32:40,940][70768] Fps is (10 sec: 45874.3, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2282422272. Throughput: 0: 49533.3. Samples: 1811267780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:32:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:32:42,221][71000] Updated weights for policy 0, policy_version 139314 (0.0029) [2024-06-13 02:32:45,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2282668032. Throughput: 0: 49571.2. 
Samples: 1811570080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:32:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:32:46,047][71000] Updated weights for policy 0, policy_version 139324 (0.0020) [2024-06-13 02:32:48,679][71000] Updated weights for policy 0, policy_version 139334 (0.0028) [2024-06-13 02:32:50,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49152.1, 300 sec: 49430.0). Total num frames: 2282946560. Throughput: 0: 49985.8. Samples: 1811725360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:32:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:32:52,348][71000] Updated weights for policy 0, policy_version 139344 (0.0027) [2024-06-13 02:32:55,434][71000] Updated weights for policy 0, policy_version 139354 (0.0031) [2024-06-13 02:32:55,940][70768] Fps is (10 sec: 55705.5, 60 sec: 50244.3, 300 sec: 49429.7). Total num frames: 2283225088. Throughput: 0: 49830.0. Samples: 1812020560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 02:32:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:32:59,162][71000] Updated weights for policy 0, policy_version 139364 (0.0019) [2024-06-13 02:33:00,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49698.3, 300 sec: 49429.7). Total num frames: 2283421696. Throughput: 0: 49763.5. Samples: 1812317640. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-13 02:33:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:33:01,905][71000] Updated weights for policy 0, policy_version 139374 (0.0022) [2024-06-13 02:33:05,652][71000] Updated weights for policy 0, policy_version 139384 (0.0019) [2024-06-13 02:33:05,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2283683840. Throughput: 0: 49707.6. Samples: 1812455840. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-13 02:33:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:33:08,399][71000] Updated weights for policy 0, policy_version 139394 (0.0035) [2024-06-13 02:33:10,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2283929600. Throughput: 0: 49752.0. Samples: 1812757460. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-13 02:33:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:33:12,099][71000] Updated weights for policy 0, policy_version 139404 (0.0026) [2024-06-13 02:33:15,021][71000] Updated weights for policy 0, policy_version 139414 (0.0029) [2024-06-13 02:33:15,940][70768] Fps is (10 sec: 52429.0, 60 sec: 50244.2, 300 sec: 49429.7). Total num frames: 2284208128. Throughput: 0: 49845.3. Samples: 1813055480. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-13 02:33:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:33:18,744][71000] Updated weights for policy 0, policy_version 139424 (0.0032) [2024-06-13 02:33:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 2284421120. Throughput: 0: 49711.0. Samples: 1813212320. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-13 02:33:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:33:21,627][71000] Updated weights for policy 0, policy_version 139434 (0.0035) [2024-06-13 02:33:25,743][71000] Updated weights for policy 0, policy_version 139444 (0.0029) [2024-06-13 02:33:25,940][70768] Fps is (10 sec: 44236.6, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2284650496. Throughput: 0: 49631.7. Samples: 1813501200. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-13 02:33:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:33:28,369][71000] Updated weights for policy 0, policy_version 139454 (0.0036) [2024-06-13 02:33:30,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2284912640. Throughput: 0: 49403.5. Samples: 1813793240. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-13 02:33:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:33:30,963][70980] Signal inference workers to stop experience collection... (26900 times) [2024-06-13 02:33:31,010][71000] InferenceWorker_p0-w0: stopping experience collection (26900 times) [2024-06-13 02:33:31,071][70980] Signal inference workers to resume experience collection... (26900 times) [2024-06-13 02:33:31,071][71000] InferenceWorker_p0-w0: resuming experience collection (26900 times) [2024-06-13 02:33:32,330][71000] Updated weights for policy 0, policy_version 139464 (0.0021) [2024-06-13 02:33:34,978][71000] Updated weights for policy 0, policy_version 139474 (0.0027) [2024-06-13 02:33:35,940][70768] Fps is (10 sec: 54066.8, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2285191168. Throughput: 0: 49358.1. Samples: 1813946480. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-13 02:33:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:33:38,694][71000] Updated weights for policy 0, policy_version 139484 (0.0029) [2024-06-13 02:33:40,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2285404160. Throughput: 0: 49285.6. Samples: 1814238420. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-13 02:33:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:33:40,971][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000139491_2285420544.pth... 
[2024-06-13 02:33:41,030][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000138767_2273558528.pth [2024-06-13 02:33:41,878][71000] Updated weights for policy 0, policy_version 139494 (0.0023) [2024-06-13 02:33:45,939][70768] Fps is (10 sec: 42599.0, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2285617152. Throughput: 0: 49009.8. Samples: 1814523080. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-13 02:33:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:33:46,065][71000] Updated weights for policy 0, policy_version 139504 (0.0028) [2024-06-13 02:33:48,259][71000] Updated weights for policy 0, policy_version 139514 (0.0029) [2024-06-13 02:33:50,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2285912064. Throughput: 0: 49236.5. Samples: 1814671480. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-13 02:33:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:33:52,346][71000] Updated weights for policy 0, policy_version 139524 (0.0028) [2024-06-13 02:33:54,547][71000] Updated weights for policy 0, policy_version 139534 (0.0021) [2024-06-13 02:33:55,940][70768] Fps is (10 sec: 55705.5, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2286174208. Throughput: 0: 49260.1. Samples: 1814974160. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-13 02:33:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:33:58,899][71000] Updated weights for policy 0, policy_version 139544 (0.0033) [2024-06-13 02:34:00,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49971.1, 300 sec: 49540.8). Total num frames: 2286419968. Throughput: 0: 49355.0. Samples: 1815276460. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-13 02:34:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:34:01,540][71000] Updated weights for policy 0, policy_version 139554 (0.0024) [2024-06-13 02:34:05,743][71000] Updated weights for policy 0, policy_version 139564 (0.0032) [2024-06-13 02:34:05,940][70768] Fps is (10 sec: 45874.3, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2286632960. Throughput: 0: 49060.3. Samples: 1815420040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-13 02:34:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:34:08,091][71000] Updated weights for policy 0, policy_version 139574 (0.0030) [2024-06-13 02:34:10,939][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2286878720. Throughput: 0: 49081.0. Samples: 1815709840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-13 02:34:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:34:12,138][71000] Updated weights for policy 0, policy_version 139584 (0.0026) [2024-06-13 02:34:14,581][71000] Updated weights for policy 0, policy_version 139594 (0.0026) [2024-06-13 02:34:15,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2287157248. Throughput: 0: 49238.6. Samples: 1816008980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-13 02:34:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:34:18,900][71000] Updated weights for policy 0, policy_version 139604 (0.0032) [2024-06-13 02:34:20,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2287403008. Throughput: 0: 49308.5. Samples: 1816165360. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-13 02:34:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:34:21,392][71000] Updated weights for policy 0, policy_version 139614 (0.0030) [2024-06-13 02:34:25,298][71000] Updated weights for policy 0, policy_version 139624 (0.0032) [2024-06-13 02:34:25,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2287616000. Throughput: 0: 49321.8. Samples: 1816457900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-13 02:34:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:34:28,043][71000] Updated weights for policy 0, policy_version 139634 (0.0042) [2024-06-13 02:34:30,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2287878144. Throughput: 0: 49471.0. Samples: 1816749280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-13 02:34:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 02:34:32,261][71000] Updated weights for policy 0, policy_version 139644 (0.0040) [2024-06-13 02:34:34,855][71000] Updated weights for policy 0, policy_version 139654 (0.0025) [2024-06-13 02:34:35,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2288140288. Throughput: 0: 49468.9. Samples: 1816897580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-13 02:34:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 02:34:38,799][71000] Updated weights for policy 0, policy_version 139664 (0.0031) [2024-06-13 02:34:39,270][70980] Signal inference workers to stop experience collection... (26950 times) [2024-06-13 02:34:39,318][71000] InferenceWorker_p0-w0: stopping experience collection (26950 times) [2024-06-13 02:34:39,322][70980] Signal inference workers to resume experience collection... 
(26950 times) [2024-06-13 02:34:39,337][71000] InferenceWorker_p0-w0: resuming experience collection (26950 times) [2024-06-13 02:34:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2288386048. Throughput: 0: 49429.2. Samples: 1817198480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-13 02:34:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:34:41,321][71000] Updated weights for policy 0, policy_version 139674 (0.0030) [2024-06-13 02:34:45,462][71000] Updated weights for policy 0, policy_version 139684 (0.0026) [2024-06-13 02:34:45,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2288599040. Throughput: 0: 49369.4. Samples: 1817498080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-13 02:34:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 02:34:47,908][71000] Updated weights for policy 0, policy_version 139694 (0.0027) [2024-06-13 02:34:50,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48878.9, 300 sec: 49374.2). Total num frames: 2288844800. Throughput: 0: 49089.1. Samples: 1817629040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-13 02:34:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:34:52,081][71000] Updated weights for policy 0, policy_version 139704 (0.0034) [2024-06-13 02:34:54,631][71000] Updated weights for policy 0, policy_version 139714 (0.0029) [2024-06-13 02:34:55,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48878.8, 300 sec: 49429.7). Total num frames: 2289106944. Throughput: 0: 49222.9. Samples: 1817924880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-13 02:34:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:34:58,774][71000] Updated weights for policy 0, policy_version 139724 (0.0026) [2024-06-13 02:35:00,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2289369088. Throughput: 0: 49126.7. Samples: 1818219680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-13 02:35:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:35:01,415][71000] Updated weights for policy 0, policy_version 139734 (0.0031) [2024-06-13 02:35:05,297][71000] Updated weights for policy 0, policy_version 139744 (0.0027) [2024-06-13 02:35:05,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 2289582080. Throughput: 0: 49079.6. Samples: 1818373940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 02:35:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:35:07,884][71000] Updated weights for policy 0, policy_version 139754 (0.0025) [2024-06-13 02:35:10,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49151.8, 300 sec: 49374.2). Total num frames: 2289827840. Throughput: 0: 49129.3. Samples: 1818668720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 02:35:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:35:12,328][71000] Updated weights for policy 0, policy_version 139764 (0.0028) [2024-06-13 02:35:14,859][71000] Updated weights for policy 0, policy_version 139774 (0.0030) [2024-06-13 02:35:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 49374.2). Total num frames: 2290073600. Throughput: 0: 48721.4. Samples: 1818941740. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 02:35:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:35:18,970][71000] Updated weights for policy 0, policy_version 139784 (0.0029) [2024-06-13 02:35:20,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48878.8, 300 sec: 49318.6). Total num frames: 2290335744. Throughput: 0: 49069.1. Samples: 1819105700. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 02:35:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:35:21,418][71000] Updated weights for policy 0, policy_version 139794 (0.0023) [2024-06-13 02:35:25,515][71000] Updated weights for policy 0, policy_version 139804 (0.0033) [2024-06-13 02:35:25,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.3, 300 sec: 49429.7). Total num frames: 2290581504. Throughput: 0: 48863.8. Samples: 1819397340. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 02:35:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:35:27,942][71000] Updated weights for policy 0, policy_version 139814 (0.0032) [2024-06-13 02:35:30,939][70768] Fps is (10 sec: 45876.0, 60 sec: 48605.9, 300 sec: 49318.6). Total num frames: 2290794496. Throughput: 0: 48965.4. Samples: 1819701520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 02:35:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:35:32,309][71000] Updated weights for policy 0, policy_version 139824 (0.0021) [2024-06-13 02:35:34,506][71000] Updated weights for policy 0, policy_version 139834 (0.0025) [2024-06-13 02:35:35,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48605.8, 300 sec: 49374.2). Total num frames: 2291056640. Throughput: 0: 49029.3. Samples: 1819835360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 02:35:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:35:38,777][71000] Updated weights for policy 0, policy_version 139844 (0.0030) [2024-06-13 02:35:39,760][70980] Signal inference workers to stop experience collection... (27000 times) [2024-06-13 02:35:39,760][70980] Signal inference workers to resume experience collection... 
(27000 times) [2024-06-13 02:35:39,786][71000] InferenceWorker_p0-w0: stopping experience collection (27000 times) [2024-06-13 02:35:39,786][71000] InferenceWorker_p0-w0: resuming experience collection (27000 times) [2024-06-13 02:35:40,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2291335168. Throughput: 0: 49192.5. Samples: 1820138540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 02:35:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:35:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000139852_2291335168.pth... [2024-06-13 02:35:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000139127_2279456768.pth [2024-06-13 02:35:41,450][71000] Updated weights for policy 0, policy_version 139854 (0.0034) [2024-06-13 02:35:45,137][71000] Updated weights for policy 0, policy_version 139864 (0.0025) [2024-06-13 02:35:45,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 2291580928. Throughput: 0: 49280.0. Samples: 1820437280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 02:35:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:35:48,064][71000] Updated weights for policy 0, policy_version 139874 (0.0026) [2024-06-13 02:35:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2291810304. Throughput: 0: 49117.3. Samples: 1820584220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 02:35:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 02:35:51,666][71000] Updated weights for policy 0, policy_version 139884 (0.0024) [2024-06-13 02:35:54,711][71000] Updated weights for policy 0, policy_version 139894 (0.0033) [2024-06-13 02:35:55,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2292056064. Throughput: 0: 49152.2. Samples: 1820880560. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 02:35:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:35:58,622][71000] Updated weights for policy 0, policy_version 139904 (0.0032) [2024-06-13 02:36:00,939][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.2, 300 sec: 49485.3). Total num frames: 2292334592. Throughput: 0: 49434.3. Samples: 1821166280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 02:36:00,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 02:36:01,625][71000] Updated weights for policy 0, policy_version 139914 (0.0032) [2024-06-13 02:36:05,229][71000] Updated weights for policy 0, policy_version 139924 (0.0025) [2024-06-13 02:36:05,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 2292580352. Throughput: 0: 49444.5. Samples: 1821330700. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-13 02:36:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 02:36:08,126][71000] Updated weights for policy 0, policy_version 139934 (0.0033) [2024-06-13 02:36:10,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2292793344. Throughput: 0: 49432.7. Samples: 1821621820. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-13 02:36:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:36:11,484][71000] Updated weights for policy 0, policy_version 139944 (0.0027) [2024-06-13 02:36:14,861][71000] Updated weights for policy 0, policy_version 139954 (0.0022) [2024-06-13 02:36:15,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2293039104. Throughput: 0: 49414.5. Samples: 1821925180. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-13 02:36:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:36:18,110][71000] Updated weights for policy 0, policy_version 139964 (0.0034) [2024-06-13 02:36:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2293301248. Throughput: 0: 49487.0. Samples: 1822062280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-13 02:36:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:36:21,839][71000] Updated weights for policy 0, policy_version 139974 (0.0027) [2024-06-13 02:36:24,978][71000] Updated weights for policy 0, policy_version 139984 (0.0026) [2024-06-13 02:36:25,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2293563392. Throughput: 0: 49486.2. Samples: 1822365420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-13 02:36:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:36:28,230][71000] Updated weights for policy 0, policy_version 139994 (0.0022) [2024-06-13 02:36:30,939][70768] Fps is (10 sec: 47514.4, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 2293776384. Throughput: 0: 49612.9. Samples: 1822669860. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-13 02:36:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:36:31,390][71000] Updated weights for policy 0, policy_version 140004 (0.0021) [2024-06-13 02:36:34,513][71000] Updated weights for policy 0, policy_version 140014 (0.0023) [2024-06-13 02:36:35,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2294022144. Throughput: 0: 49270.6. Samples: 1822801400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-13 02:36:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:36:37,997][70980] Signal inference workers to stop experience collection... 
(27050 times) [2024-06-13 02:36:38,041][71000] InferenceWorker_p0-w0: stopping experience collection (27050 times) [2024-06-13 02:36:38,111][70980] Signal inference workers to resume experience collection... (27050 times) [2024-06-13 02:36:38,112][71000] InferenceWorker_p0-w0: resuming experience collection (27050 times) [2024-06-13 02:36:38,246][71000] Updated weights for policy 0, policy_version 140024 (0.0032) [2024-06-13 02:36:40,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2294284288. Throughput: 0: 49229.2. Samples: 1823095880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-13 02:36:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:36:41,741][71000] Updated weights for policy 0, policy_version 140034 (0.0025) [2024-06-13 02:36:45,072][71000] Updated weights for policy 0, policy_version 140044 (0.0033) [2024-06-13 02:36:45,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2294546432. Throughput: 0: 49510.6. Samples: 1823394260. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-13 02:36:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:36:48,327][71000] Updated weights for policy 0, policy_version 140054 (0.0031) [2024-06-13 02:36:50,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2294759424. Throughput: 0: 49133.9. Samples: 1823541720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-13 02:36:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:36:51,632][71000] Updated weights for policy 0, policy_version 140064 (0.0025) [2024-06-13 02:36:54,803][71000] Updated weights for policy 0, policy_version 140074 (0.0020) [2024-06-13 02:36:55,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 2294988800. Throughput: 0: 49200.0. Samples: 1823835820. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-13 02:36:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:36:58,239][71000] Updated weights for policy 0, policy_version 140084 (0.0024) [2024-06-13 02:37:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.8, 300 sec: 49318.6). Total num frames: 2295267328. Throughput: 0: 48856.5. Samples: 1824123720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-13 02:37:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:37:01,487][71000] Updated weights for policy 0, policy_version 140094 (0.0035) [2024-06-13 02:37:04,898][71000] Updated weights for policy 0, policy_version 140104 (0.0030) [2024-06-13 02:37:05,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2295529472. Throughput: 0: 49418.3. Samples: 1824286100. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-13 02:37:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:37:08,332][71000] Updated weights for policy 0, policy_version 140114 (0.0022) [2024-06-13 02:37:10,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2295742464. Throughput: 0: 49143.4. Samples: 1824576880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:37:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:37:11,663][71000] Updated weights for policy 0, policy_version 140124 (0.0031) [2024-06-13 02:37:14,903][71000] Updated weights for policy 0, policy_version 140134 (0.0029) [2024-06-13 02:37:15,940][70768] Fps is (10 sec: 44237.0, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2295971840. Throughput: 0: 48886.2. Samples: 1824869740. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:37:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 02:37:18,281][71000] Updated weights for policy 0, policy_version 140144 (0.0028) [2024-06-13 02:37:20,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2296250368. Throughput: 0: 49292.0. Samples: 1825019540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:37:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:37:21,373][71000] Updated weights for policy 0, policy_version 140154 (0.0030) [2024-06-13 02:37:24,982][71000] Updated weights for policy 0, policy_version 140164 (0.0027) [2024-06-13 02:37:25,940][70768] Fps is (10 sec: 54066.4, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2296512512. Throughput: 0: 49470.2. Samples: 1825322040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:37:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:37:28,226][71000] Updated weights for policy 0, policy_version 140174 (0.0029) [2024-06-13 02:37:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2296725504. Throughput: 0: 49323.1. Samples: 1825613800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:37:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:37:31,592][71000] Updated weights for policy 0, policy_version 140184 (0.0034) [2024-06-13 02:37:34,774][71000] Updated weights for policy 0, policy_version 140194 (0.0027) [2024-06-13 02:37:35,939][70768] Fps is (10 sec: 44237.8, 60 sec: 48879.1, 300 sec: 49263.1). Total num frames: 2296954880. Throughput: 0: 49117.8. Samples: 1825752020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:37:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:37:38,158][71000] Updated weights for policy 0, policy_version 140204 (0.0029) [2024-06-13 02:37:40,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2297233408. Throughput: 0: 49328.9. Samples: 1826055620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:37:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:37:40,959][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000140212_2297233408.pth... [2024-06-13 02:37:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000139491_2285420544.pth [2024-06-13 02:37:41,746][71000] Updated weights for policy 0, policy_version 140214 (0.0027) [2024-06-13 02:37:44,932][71000] Updated weights for policy 0, policy_version 140224 (0.0025) [2024-06-13 02:37:45,940][70768] Fps is (10 sec: 54065.9, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2297495552. Throughput: 0: 49419.9. Samples: 1826347620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:37:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:37:48,650][71000] Updated weights for policy 0, policy_version 140234 (0.0030) [2024-06-13 02:37:49,315][70980] Signal inference workers to stop experience collection... (27100 times) [2024-06-13 02:37:49,316][70980] Signal inference workers to resume experience collection... (27100 times) [2024-06-13 02:37:49,358][71000] InferenceWorker_p0-w0: stopping experience collection (27100 times) [2024-06-13 02:37:49,358][71000] InferenceWorker_p0-w0: resuming experience collection (27100 times) [2024-06-13 02:37:50,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2297708544. Throughput: 0: 49002.3. Samples: 1826491200. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:37:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:37:51,632][71000] Updated weights for policy 0, policy_version 140244 (0.0023) [2024-06-13 02:37:55,043][71000] Updated weights for policy 0, policy_version 140254 (0.0027) [2024-06-13 02:37:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 2297970688. Throughput: 0: 49143.9. Samples: 1826788360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:37:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:37:58,061][71000] Updated weights for policy 0, policy_version 140264 (0.0032) [2024-06-13 02:38:00,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2298216448. Throughput: 0: 49211.0. Samples: 1827084240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:38:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:38:01,504][71000] Updated weights for policy 0, policy_version 140274 (0.0023) [2024-06-13 02:38:04,771][71000] Updated weights for policy 0, policy_version 140284 (0.0042) [2024-06-13 02:38:05,940][70768] Fps is (10 sec: 47514.4, 60 sec: 48605.9, 300 sec: 49207.5). Total num frames: 2298445824. Throughput: 0: 49185.8. Samples: 1827232900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:38:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:38:08,378][71000] Updated weights for policy 0, policy_version 140294 (0.0026) [2024-06-13 02:38:10,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2298691584. Throughput: 0: 48931.7. Samples: 1827523960. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:38:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:38:11,566][71000] Updated weights for policy 0, policy_version 140304 (0.0025) [2024-06-13 02:38:15,445][71000] Updated weights for policy 0, policy_version 140314 (0.0036) [2024-06-13 02:38:15,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49207.6). Total num frames: 2298937344. Throughput: 0: 49055.2. Samples: 1827821280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 02:38:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:38:18,141][71000] Updated weights for policy 0, policy_version 140324 (0.0025) [2024-06-13 02:38:20,940][70768] Fps is (10 sec: 50789.0, 60 sec: 49151.8, 300 sec: 49318.6). Total num frames: 2299199488. Throughput: 0: 49117.4. Samples: 1827962320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 02:38:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:38:21,709][71000] Updated weights for policy 0, policy_version 140334 (0.0030) [2024-06-13 02:38:24,739][71000] Updated weights for policy 0, policy_version 140344 (0.0024) [2024-06-13 02:38:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48606.0, 300 sec: 49207.5). Total num frames: 2299428864. Throughput: 0: 49029.0. Samples: 1828261920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 02:38:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:38:28,741][71000] Updated weights for policy 0, policy_version 140354 (0.0026) [2024-06-13 02:38:30,940][70768] Fps is (10 sec: 49153.5, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2299691008. Throughput: 0: 48941.5. Samples: 1828549980. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 02:38:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:38:31,407][71000] Updated weights for policy 0, policy_version 140364 (0.0033) [2024-06-13 02:38:35,339][71000] Updated weights for policy 0, policy_version 140374 (0.0033) [2024-06-13 02:38:35,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2299904000. Throughput: 0: 48830.2. Samples: 1828688560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 02:38:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:38:38,203][71000] Updated weights for policy 0, policy_version 140384 (0.0022) [2024-06-13 02:38:40,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48878.8, 300 sec: 49318.6). Total num frames: 2300166144. Throughput: 0: 48948.4. Samples: 1828991040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 02:38:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:38:42,216][71000] Updated weights for policy 0, policy_version 140394 (0.0035) [2024-06-13 02:38:44,768][71000] Updated weights for policy 0, policy_version 140404 (0.0027) [2024-06-13 02:38:45,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2300411904. Throughput: 0: 48861.8. Samples: 1829283020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 02:38:45,949][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 02:38:48,758][71000] Updated weights for policy 0, policy_version 140414 (0.0034) [2024-06-13 02:38:50,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2300657664. Throughput: 0: 48899.6. Samples: 1829433380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 02:38:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:38:51,437][71000] Updated weights for policy 0, policy_version 140424 (0.0031) [2024-06-13 02:38:55,585][71000] Updated weights for policy 0, policy_version 140434 (0.0033) [2024-06-13 02:38:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2300887040. Throughput: 0: 48949.2. Samples: 1829726680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 02:38:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 02:38:58,334][71000] Updated weights for policy 0, policy_version 140444 (0.0036) [2024-06-13 02:39:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 2301149184. Throughput: 0: 48935.0. Samples: 1830023360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 02:39:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:39:01,771][70980] Signal inference workers to stop experience collection... (27150 times) [2024-06-13 02:39:01,772][70980] Signal inference workers to resume experience collection... (27150 times) [2024-06-13 02:39:01,784][71000] InferenceWorker_p0-w0: stopping experience collection (27150 times) [2024-06-13 02:39:01,791][71000] InferenceWorker_p0-w0: resuming experience collection (27150 times) [2024-06-13 02:39:01,913][71000] Updated weights for policy 0, policy_version 140454 (0.0032) [2024-06-13 02:39:04,800][71000] Updated weights for policy 0, policy_version 140464 (0.0033) [2024-06-13 02:39:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2301394944. Throughput: 0: 49043.7. Samples: 1830169280. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 02:39:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:39:08,594][71000] Updated weights for policy 0, policy_version 140474 (0.0027) [2024-06-13 02:39:10,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2301657088. Throughput: 0: 49072.3. Samples: 1830470180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 02:39:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:39:11,351][71000] Updated weights for policy 0, policy_version 140484 (0.0029) [2024-06-13 02:39:15,268][71000] Updated weights for policy 0, policy_version 140494 (0.0038) [2024-06-13 02:39:15,940][70768] Fps is (10 sec: 47514.5, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2301870080. Throughput: 0: 49188.9. Samples: 1830763480. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-13 02:39:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:39:18,227][71000] Updated weights for policy 0, policy_version 140504 (0.0025) [2024-06-13 02:39:20,940][70768] Fps is (10 sec: 47514.4, 60 sec: 48879.2, 300 sec: 49207.6). Total num frames: 2302132224. Throughput: 0: 49173.8. Samples: 1830901380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-13 02:39:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:39:21,659][71000] Updated weights for policy 0, policy_version 140514 (0.0022) [2024-06-13 02:39:24,513][71000] Updated weights for policy 0, policy_version 140524 (0.0027) [2024-06-13 02:39:25,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2302377984. Throughput: 0: 49058.3. Samples: 1831198660. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-13 02:39:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:39:28,607][71000] Updated weights for policy 0, policy_version 140534 (0.0031) [2024-06-13 02:39:30,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2302640128. Throughput: 0: 49090.2. Samples: 1831492080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-13 02:39:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:39:31,541][71000] Updated weights for policy 0, policy_version 140544 (0.0029) [2024-06-13 02:39:35,230][71000] Updated weights for policy 0, policy_version 140554 (0.0031) [2024-06-13 02:39:35,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49424.9, 300 sec: 49096.4). Total num frames: 2302869504. Throughput: 0: 49361.5. Samples: 1831654660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-13 02:39:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:39:38,046][71000] Updated weights for policy 0, policy_version 140564 (0.0031) [2024-06-13 02:39:40,939][70768] Fps is (10 sec: 45876.2, 60 sec: 48879.2, 300 sec: 49152.0). Total num frames: 2303098880. Throughput: 0: 49230.0. Samples: 1831942020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-13 02:39:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:39:40,981][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000140571_2303115264.pth... [2024-06-13 02:39:41,022][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000139852_2291335168.pth [2024-06-13 02:39:41,747][71000] Updated weights for policy 0, policy_version 140574 (0.0031) [2024-06-13 02:39:44,761][71000] Updated weights for policy 0, policy_version 140584 (0.0027) [2024-06-13 02:39:45,940][70768] Fps is (10 sec: 49153.0, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2303361024. Throughput: 0: 49291.6. Samples: 1832241480. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-13 02:39:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:39:48,157][71000] Updated weights for policy 0, policy_version 140594 (0.0020) [2024-06-13 02:39:50,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2303623168. Throughput: 0: 49450.8. Samples: 1832394560. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-13 02:39:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:39:51,183][71000] Updated weights for policy 0, policy_version 140604 (0.0022) [2024-06-13 02:39:54,790][71000] Updated weights for policy 0, policy_version 140614 (0.0035) [2024-06-13 02:39:55,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49971.2, 300 sec: 49207.5). Total num frames: 2303885312. Throughput: 0: 49333.0. Samples: 1832690160. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-13 02:39:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:39:57,638][71000] Updated weights for policy 0, policy_version 140624 (0.0025) [2024-06-13 02:40:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2304098304. Throughput: 0: 49400.0. Samples: 1832986480. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-13 02:40:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:40:01,356][71000] Updated weights for policy 0, policy_version 140634 (0.0029) [2024-06-13 02:40:04,403][71000] Updated weights for policy 0, policy_version 140644 (0.0023) [2024-06-13 02:40:05,939][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.2, 300 sec: 49207.6). Total num frames: 2304344064. Throughput: 0: 49418.2. Samples: 1833125200. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-13 02:40:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:40:08,161][71000] Updated weights for policy 0, policy_version 140654 (0.0034) [2024-06-13 02:40:10,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2304622592. Throughput: 0: 49460.6. Samples: 1833424380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-13 02:40:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:40:11,389][71000] Updated weights for policy 0, policy_version 140664 (0.0029) [2024-06-13 02:40:14,894][71000] Updated weights for policy 0, policy_version 140674 (0.0035) [2024-06-13 02:40:15,396][70980] Signal inference workers to stop experience collection... (27200 times) [2024-06-13 02:40:15,440][70980] Signal inference workers to resume experience collection... (27200 times) [2024-06-13 02:40:15,448][71000] InferenceWorker_p0-w0: stopping experience collection (27200 times) [2024-06-13 02:40:15,477][71000] InferenceWorker_p0-w0: resuming experience collection (27200 times) [2024-06-13 02:40:15,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 2304868352. Throughput: 0: 49577.9. Samples: 1833723080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-13 02:40:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:40:17,856][71000] Updated weights for policy 0, policy_version 140684 (0.0026) [2024-06-13 02:40:20,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2305081344. Throughput: 0: 49275.4. Samples: 1833872040. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 02:40:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:40:21,407][71000] Updated weights for policy 0, policy_version 140694 (0.0023) [2024-06-13 02:40:24,339][71000] Updated weights for policy 0, policy_version 140704 (0.0027) [2024-06-13 02:40:25,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49152.0, 300 sec: 49263.0). Total num frames: 2305327104. Throughput: 0: 49277.5. Samples: 1834159520. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 02:40:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:40:28,237][71000] Updated weights for policy 0, policy_version 140714 (0.0028) [2024-06-13 02:40:30,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2305605632. Throughput: 0: 49172.0. Samples: 1834454220. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 02:40:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:40:31,505][71000] Updated weights for policy 0, policy_version 140724 (0.0022) [2024-06-13 02:40:34,690][71000] Updated weights for policy 0, policy_version 140734 (0.0027) [2024-06-13 02:40:35,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49698.3, 300 sec: 49207.5). Total num frames: 2305851392. Throughput: 0: 49400.0. Samples: 1834617560. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 02:40:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:40:38,187][71000] Updated weights for policy 0, policy_version 140744 (0.0032) [2024-06-13 02:40:40,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2306064384. Throughput: 0: 49222.8. Samples: 1834905180. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 02:40:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 02:40:41,526][71000] Updated weights for policy 0, policy_version 140754 (0.0032) [2024-06-13 02:40:44,774][71000] Updated weights for policy 0, policy_version 140764 (0.0034) [2024-06-13 02:40:45,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2306310144. Throughput: 0: 49057.3. Samples: 1835194060. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 02:40:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:40:48,079][71000] Updated weights for policy 0, policy_version 140774 (0.0027) [2024-06-13 02:40:50,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2306572288. Throughput: 0: 49359.4. Samples: 1835346380. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 02:40:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:40:51,582][71000] Updated weights for policy 0, policy_version 140784 (0.0031) [2024-06-13 02:40:54,725][71000] Updated weights for policy 0, policy_version 140794 (0.0028) [2024-06-13 02:40:55,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2306834432. Throughput: 0: 49263.5. Samples: 1835641240. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 02:40:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:40:58,110][71000] Updated weights for policy 0, policy_version 140804 (0.0024) [2024-06-13 02:41:00,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2307063808. Throughput: 0: 49298.2. Samples: 1835941500. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 02:41:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:41:01,507][71000] Updated weights for policy 0, policy_version 140814 (0.0019) [2024-06-13 02:41:04,710][71000] Updated weights for policy 0, policy_version 140824 (0.0029) [2024-06-13 02:41:05,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2307293184. Throughput: 0: 49047.1. Samples: 1836079160. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 02:41:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:41:08,158][71000] Updated weights for policy 0, policy_version 140834 (0.0025) [2024-06-13 02:41:10,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 49207.6). Total num frames: 2307555328. Throughput: 0: 49209.5. Samples: 1836373940. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 02:41:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:41:11,496][71000] Updated weights for policy 0, policy_version 140844 (0.0027) [2024-06-13 02:41:14,971][71000] Updated weights for policy 0, policy_version 140854 (0.0027) [2024-06-13 02:41:15,939][70768] Fps is (10 sec: 54067.5, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 2307833856. Throughput: 0: 49316.1. Samples: 1836673440. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 02:41:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 02:41:17,927][71000] Updated weights for policy 0, policy_version 140864 (0.0023) [2024-06-13 02:41:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2308046848. Throughput: 0: 49013.4. Samples: 1836823160. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 02:41:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:41:21,400][71000] Updated weights for policy 0, policy_version 140874 (0.0022) [2024-06-13 02:41:24,336][71000] Updated weights for policy 0, policy_version 140884 (0.0034) [2024-06-13 02:41:25,464][70980] Signal inference workers to stop experience collection... (27250 times) [2024-06-13 02:41:25,464][70980] Signal inference workers to resume experience collection... (27250 times) [2024-06-13 02:41:25,475][71000] InferenceWorker_p0-w0: stopping experience collection (27250 times) [2024-06-13 02:41:25,486][71000] InferenceWorker_p0-w0: resuming experience collection (27250 times) [2024-06-13 02:41:25,939][70768] Fps is (10 sec: 45875.0, 60 sec: 49425.2, 300 sec: 49207.5). Total num frames: 2308292608. Throughput: 0: 49167.5. Samples: 1837117720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 02:41:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:41:28,003][71000] Updated weights for policy 0, policy_version 140894 (0.0028) [2024-06-13 02:41:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2308554752. Throughput: 0: 49529.4. Samples: 1837422880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 02:41:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:41:31,062][71000] Updated weights for policy 0, policy_version 140904 (0.0023) [2024-06-13 02:41:34,614][71000] Updated weights for policy 0, policy_version 140914 (0.0028) [2024-06-13 02:41:35,939][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2308816896. Throughput: 0: 49454.9. Samples: 1837571840. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 02:41:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:41:37,635][71000] Updated weights for policy 0, policy_version 140924 (0.0027) [2024-06-13 02:41:40,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2309029888. Throughput: 0: 49500.5. Samples: 1837868760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 02:41:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:41:40,982][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000140933_2309046272.pth... [2024-06-13 02:41:41,025][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000140212_2297233408.pth [2024-06-13 02:41:41,291][71000] Updated weights for policy 0, policy_version 140934 (0.0033) [2024-06-13 02:41:44,357][71000] Updated weights for policy 0, policy_version 140944 (0.0035) [2024-06-13 02:41:45,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2309275648. Throughput: 0: 49409.7. Samples: 1838164940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 02:41:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:41:47,711][71000] Updated weights for policy 0, policy_version 140954 (0.0023) [2024-06-13 02:41:50,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2309521408. Throughput: 0: 49473.1. Samples: 1838305460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 02:41:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:41:51,159][71000] Updated weights for policy 0, policy_version 140964 (0.0024) [2024-06-13 02:41:54,486][71000] Updated weights for policy 0, policy_version 140974 (0.0024) [2024-06-13 02:41:55,939][70768] Fps is (10 sec: 52429.4, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2309799936. Throughput: 0: 49627.6. Samples: 1838607180. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 02:41:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:41:57,626][71000] Updated weights for policy 0, policy_version 140984 (0.0033) [2024-06-13 02:42:00,940][70768] Fps is (10 sec: 52429.7, 60 sec: 49698.1, 300 sec: 49207.6). Total num frames: 2310045696. Throughput: 0: 49800.4. Samples: 1838914460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 02:42:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:42:00,947][71000] Updated weights for policy 0, policy_version 140994 (0.0024) [2024-06-13 02:42:04,229][71000] Updated weights for policy 0, policy_version 141004 (0.0030) [2024-06-13 02:42:05,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49971.0, 300 sec: 49318.6). Total num frames: 2310291456. Throughput: 0: 49811.8. Samples: 1839064700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 02:42:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:42:07,313][71000] Updated weights for policy 0, policy_version 141014 (0.0038) [2024-06-13 02:42:10,701][71000] Updated weights for policy 0, policy_version 141024 (0.0028) [2024-06-13 02:42:10,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 2310537216. Throughput: 0: 49774.6. Samples: 1839357580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 02:42:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:42:13,996][71000] Updated weights for policy 0, policy_version 141034 (0.0029) [2024-06-13 02:42:15,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2310799360. Throughput: 0: 49613.8. Samples: 1839655500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 02:42:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:42:17,298][70980] Signal inference workers to stop experience collection... 
(27300 times) [2024-06-13 02:42:17,299][70980] Signal inference workers to resume experience collection... (27300 times) [2024-06-13 02:42:17,311][71000] InferenceWorker_p0-w0: stopping experience collection (27300 times) [2024-06-13 02:42:17,312][71000] InferenceWorker_p0-w0: resuming experience collection (27300 times) [2024-06-13 02:42:17,438][71000] Updated weights for policy 0, policy_version 141044 (0.0027) [2024-06-13 02:42:20,433][71000] Updated weights for policy 0, policy_version 141054 (0.0033) [2024-06-13 02:42:20,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49971.0, 300 sec: 49263.1). Total num frames: 2311045120. Throughput: 0: 49917.9. Samples: 1839818160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 02:42:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:42:23,965][71000] Updated weights for policy 0, policy_version 141064 (0.0030) [2024-06-13 02:42:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 2311290880. Throughput: 0: 49726.6. Samples: 1840106460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 02:42:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:42:27,191][71000] Updated weights for policy 0, policy_version 141074 (0.0022) [2024-06-13 02:42:30,819][71000] Updated weights for policy 0, policy_version 141084 (0.0037) [2024-06-13 02:42:30,940][70768] Fps is (10 sec: 47514.6, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 2311520256. Throughput: 0: 49760.1. Samples: 1840404140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 02:42:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:42:33,577][71000] Updated weights for policy 0, policy_version 141094 (0.0034) [2024-06-13 02:42:35,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 2311798784. Throughput: 0: 49992.5. Samples: 1840555120. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 02:42:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:42:37,554][71000] Updated weights for policy 0, policy_version 141104 (0.0020) [2024-06-13 02:42:40,181][71000] Updated weights for policy 0, policy_version 141114 (0.0032) [2024-06-13 02:42:40,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49971.1, 300 sec: 49263.1). Total num frames: 2312028160. Throughput: 0: 49802.9. Samples: 1840848320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 02:42:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:42:44,306][71000] Updated weights for policy 0, policy_version 141124 (0.0037) [2024-06-13 02:42:45,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 2312257536. Throughput: 0: 49475.5. Samples: 1841140860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 02:42:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:42:46,874][71000] Updated weights for policy 0, policy_version 141134 (0.0024) [2024-06-13 02:42:50,939][70768] Fps is (10 sec: 45876.2, 60 sec: 49425.3, 300 sec: 49207.6). Total num frames: 2312486912. Throughput: 0: 49344.3. Samples: 1841285180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 02:42:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:42:51,126][71000] Updated weights for policy 0, policy_version 141144 (0.0026) [2024-06-13 02:42:53,766][71000] Updated weights for policy 0, policy_version 141154 (0.0044) [2024-06-13 02:42:55,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2312765440. Throughput: 0: 49271.5. Samples: 1841574800. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 02:42:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:42:57,698][71000] Updated weights for policy 0, policy_version 141164 (0.0033) [2024-06-13 02:43:00,226][71000] Updated weights for policy 0, policy_version 141174 (0.0028) [2024-06-13 02:43:00,939][70768] Fps is (10 sec: 52428.4, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2313011200. Throughput: 0: 49246.3. Samples: 1841871580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 02:43:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:43:04,239][71000] Updated weights for policy 0, policy_version 141184 (0.0026) [2024-06-13 02:43:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 2313256960. Throughput: 0: 49005.9. Samples: 1842023420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 02:43:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:43:06,648][71000] Updated weights for policy 0, policy_version 141194 (0.0020) [2024-06-13 02:43:10,870][71000] Updated weights for policy 0, policy_version 141204 (0.0026) [2024-06-13 02:43:10,940][70768] Fps is (10 sec: 47512.6, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2313486336. Throughput: 0: 49317.6. Samples: 1842325760. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 02:43:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:43:13,528][71000] Updated weights for policy 0, policy_version 141214 (0.0027) [2024-06-13 02:43:15,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.0, 300 sec: 49318.7). Total num frames: 2313748480. Throughput: 0: 49096.9. Samples: 1842613500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 02:43:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:43:16,392][70980] Signal inference workers to stop experience collection... 
(27350 times) [2024-06-13 02:43:16,392][70980] Signal inference workers to resume experience collection... (27350 times) [2024-06-13 02:43:16,402][71000] InferenceWorker_p0-w0: stopping experience collection (27350 times) [2024-06-13 02:43:16,403][71000] InferenceWorker_p0-w0: resuming experience collection (27350 times) [2024-06-13 02:43:17,473][71000] Updated weights for policy 0, policy_version 141224 (0.0028) [2024-06-13 02:43:20,084][71000] Updated weights for policy 0, policy_version 141234 (0.0032) [2024-06-13 02:43:20,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2314010624. Throughput: 0: 49192.1. Samples: 1842768760. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 02:43:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:43:24,104][71000] Updated weights for policy 0, policy_version 141244 (0.0034) [2024-06-13 02:43:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2314223616. Throughput: 0: 49260.1. Samples: 1843065020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 02:43:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:43:26,811][71000] Updated weights for policy 0, policy_version 141254 (0.0028) [2024-06-13 02:43:30,726][71000] Updated weights for policy 0, policy_version 141264 (0.0025) [2024-06-13 02:43:30,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2314485760. Throughput: 0: 49425.8. Samples: 1843365020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:43:30,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 02:43:33,453][71000] Updated weights for policy 0, policy_version 141274 (0.0025) [2024-06-13 02:43:35,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 2314731520. Throughput: 0: 49239.4. Samples: 1843500960. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:43:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:43:37,279][71000] Updated weights for policy 0, policy_version 141284 (0.0025) [2024-06-13 02:43:40,047][71000] Updated weights for policy 0, policy_version 141294 (0.0035) [2024-06-13 02:43:40,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2314993664. Throughput: 0: 49533.7. Samples: 1843803820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:43:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:43:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000141296_2314993664.pth... [2024-06-13 02:43:41,011][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000140571_2303115264.pth [2024-06-13 02:43:43,700][71000] Updated weights for policy 0, policy_version 141304 (0.0031) [2024-06-13 02:43:45,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2315206656. Throughput: 0: 49618.6. Samples: 1844104420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:43:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:43:46,754][71000] Updated weights for policy 0, policy_version 141314 (0.0021) [2024-06-13 02:43:50,491][71000] Updated weights for policy 0, policy_version 141324 (0.0034) [2024-06-13 02:43:50,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2315468800. Throughput: 0: 49353.8. Samples: 1844244340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:43:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:43:53,417][71000] Updated weights for policy 0, policy_version 141334 (0.0025) [2024-06-13 02:43:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2315714560. Throughput: 0: 49418.9. Samples: 1844549600. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:43:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:43:56,967][71000] Updated weights for policy 0, policy_version 141344 (0.0031) [2024-06-13 02:43:59,868][71000] Updated weights for policy 0, policy_version 141354 (0.0032) [2024-06-13 02:44:00,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2315976704. Throughput: 0: 49473.8. Samples: 1844839820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:44:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:44:03,576][71000] Updated weights for policy 0, policy_version 141364 (0.0027) [2024-06-13 02:44:05,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2316206080. Throughput: 0: 49562.5. Samples: 1844999080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:44:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:44:06,701][71000] Updated weights for policy 0, policy_version 141374 (0.0024) [2024-06-13 02:44:10,058][71000] Updated weights for policy 0, policy_version 141384 (0.0039) [2024-06-13 02:44:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2316451840. Throughput: 0: 49555.0. Samples: 1845295000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:44:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:44:13,327][71000] Updated weights for policy 0, policy_version 141394 (0.0030) [2024-06-13 02:44:15,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2316697600. Throughput: 0: 49222.6. Samples: 1845580040. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:44:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:44:16,624][71000] Updated weights for policy 0, policy_version 141404 (0.0025) [2024-06-13 02:44:19,870][71000] Updated weights for policy 0, policy_version 141414 (0.0041) [2024-06-13 02:44:20,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 2316976128. Throughput: 0: 49600.5. Samples: 1845732980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:44:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:44:23,627][71000] Updated weights for policy 0, policy_version 141424 (0.0029) [2024-06-13 02:44:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2317189120. Throughput: 0: 49397.9. Samples: 1846026720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:44:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:44:26,251][70980] Signal inference workers to stop experience collection... (27400 times) [2024-06-13 02:44:26,252][70980] Signal inference workers to resume experience collection... (27400 times) [2024-06-13 02:44:26,268][71000] InferenceWorker_p0-w0: stopping experience collection (27400 times) [2024-06-13 02:44:26,268][71000] InferenceWorker_p0-w0: resuming experience collection (27400 times) [2024-06-13 02:44:26,691][71000] Updated weights for policy 0, policy_version 141434 (0.0034) [2024-06-13 02:44:29,947][71000] Updated weights for policy 0, policy_version 141444 (0.0032) [2024-06-13 02:44:30,940][70768] Fps is (10 sec: 44236.3, 60 sec: 48878.8, 300 sec: 49318.6). Total num frames: 2317418496. Throughput: 0: 49113.6. Samples: 1846314540. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:44:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:44:33,436][71000] Updated weights for policy 0, policy_version 141454 (0.0029) [2024-06-13 02:44:35,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2317697024. Throughput: 0: 49442.4. Samples: 1846469240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 02:44:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:44:36,526][71000] Updated weights for policy 0, policy_version 141464 (0.0035) [2024-06-13 02:44:40,022][71000] Updated weights for policy 0, policy_version 141474 (0.0032) [2024-06-13 02:44:40,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2317959168. Throughput: 0: 49232.2. Samples: 1846765060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 02:44:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:44:43,511][71000] Updated weights for policy 0, policy_version 141484 (0.0022) [2024-06-13 02:44:45,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2318172160. Throughput: 0: 49474.3. Samples: 1847066160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 02:44:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:44:46,447][71000] Updated weights for policy 0, policy_version 141494 (0.0023) [2024-06-13 02:44:49,786][71000] Updated weights for policy 0, policy_version 141504 (0.0031) [2024-06-13 02:44:50,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2318417920. Throughput: 0: 49176.2. Samples: 1847212000. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 02:44:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:44:53,206][71000] Updated weights for policy 0, policy_version 141514 (0.0022) [2024-06-13 02:44:55,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2318680064. Throughput: 0: 49041.0. Samples: 1847501840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 02:44:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:44:56,571][71000] Updated weights for policy 0, policy_version 141524 (0.0037) [2024-06-13 02:44:59,752][71000] Updated weights for policy 0, policy_version 141534 (0.0024) [2024-06-13 02:45:00,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2318942208. Throughput: 0: 49333.3. Samples: 1847800040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 02:45:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:45:03,055][71000] Updated weights for policy 0, policy_version 141544 (0.0031) [2024-06-13 02:45:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2319155200. Throughput: 0: 49253.3. Samples: 1847949380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 02:45:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:45:06,477][71000] Updated weights for policy 0, policy_version 141554 (0.0026) [2024-06-13 02:45:09,523][71000] Updated weights for policy 0, policy_version 141564 (0.0021) [2024-06-13 02:45:10,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 2319433728. Throughput: 0: 49545.2. Samples: 1848256260. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 02:45:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 02:45:12,944][71000] Updated weights for policy 0, policy_version 141574 (0.0038) [2024-06-13 02:45:15,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 2319679488. Throughput: 0: 49689.1. Samples: 1848550540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 02:45:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 02:45:16,290][71000] Updated weights for policy 0, policy_version 141584 (0.0028) [2024-06-13 02:45:19,625][71000] Updated weights for policy 0, policy_version 141594 (0.0022) [2024-06-13 02:45:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 2319925248. Throughput: 0: 49668.7. Samples: 1848704340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 02:45:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:45:22,671][71000] Updated weights for policy 0, policy_version 141604 (0.0028) [2024-06-13 02:45:25,941][70768] Fps is (10 sec: 50784.6, 60 sec: 49970.3, 300 sec: 49429.5). Total num frames: 2320187392. Throughput: 0: 49630.1. Samples: 1848998460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 02:45:25,941][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 02:45:26,072][71000] Updated weights for policy 0, policy_version 141614 (0.0029) [2024-06-13 02:45:29,338][71000] Updated weights for policy 0, policy_version 141624 (0.0033) [2024-06-13 02:45:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 50244.3, 300 sec: 49429.7). Total num frames: 2320433152. Throughput: 0: 49675.8. Samples: 1849301580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 02:45:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:45:32,194][70980] Signal inference workers to stop experience collection... 
(27450 times) [2024-06-13 02:45:32,249][71000] InferenceWorker_p0-w0: stopping experience collection (27450 times) [2024-06-13 02:45:32,304][70980] Signal inference workers to resume experience collection... (27450 times) [2024-06-13 02:45:32,304][71000] InferenceWorker_p0-w0: resuming experience collection (27450 times) [2024-06-13 02:45:32,566][71000] Updated weights for policy 0, policy_version 141634 (0.0020) [2024-06-13 02:45:35,943][70768] Fps is (10 sec: 47500.7, 60 sec: 49421.9, 300 sec: 49484.6). Total num frames: 2320662528. Throughput: 0: 49714.4. Samples: 1849449340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 02:45:35,944][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:45:36,176][71000] Updated weights for policy 0, policy_version 141644 (0.0025) [2024-06-13 02:45:39,452][71000] Updated weights for policy 0, policy_version 141654 (0.0029) [2024-06-13 02:45:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2320924672. Throughput: 0: 49719.3. Samples: 1849739220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 02:45:40,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 02:45:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000141658_2320924672.pth... [2024-06-13 02:45:41,010][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000140933_2309046272.pth [2024-06-13 02:45:42,839][71000] Updated weights for policy 0, policy_version 141664 (0.0034) [2024-06-13 02:45:45,940][70768] Fps is (10 sec: 50809.7, 60 sec: 49971.1, 300 sec: 49485.3). Total num frames: 2321170432. Throughput: 0: 49764.4. Samples: 1850039440. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 02:45:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 02:45:45,985][71000] Updated weights for policy 0, policy_version 141674 (0.0024) [2024-06-13 02:45:49,382][71000] Updated weights for policy 0, policy_version 141684 (0.0021) [2024-06-13 02:45:50,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 2321416192. Throughput: 0: 49782.2. Samples: 1850189580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 02:45:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:45:52,811][71000] Updated weights for policy 0, policy_version 141694 (0.0027) [2024-06-13 02:45:55,864][71000] Updated weights for policy 0, policy_version 141704 (0.0033) [2024-06-13 02:45:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49971.1, 300 sec: 49540.8). Total num frames: 2321678336. Throughput: 0: 49671.6. Samples: 1850491480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 02:45:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:45:59,227][71000] Updated weights for policy 0, policy_version 141714 (0.0041) [2024-06-13 02:46:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 2321924096. Throughput: 0: 49661.7. Samples: 1850785320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 02:46:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:46:02,702][71000] Updated weights for policy 0, policy_version 141724 (0.0026) [2024-06-13 02:46:05,939][70768] Fps is (10 sec: 47514.5, 60 sec: 49971.3, 300 sec: 49485.3). Total num frames: 2322153472. Throughput: 0: 49557.2. Samples: 1850934400. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 02:46:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:46:06,052][71000] Updated weights for policy 0, policy_version 141734 (0.0029) [2024-06-13 02:46:09,135][71000] Updated weights for policy 0, policy_version 141744 (0.0030) [2024-06-13 02:46:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2322415616. Throughput: 0: 49502.1. Samples: 1851226000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 02:46:10,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 02:46:12,467][71000] Updated weights for policy 0, policy_version 141754 (0.0021) [2024-06-13 02:46:15,688][71000] Updated weights for policy 0, policy_version 141764 (0.0033) [2024-06-13 02:46:15,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 2322661376. Throughput: 0: 49286.4. Samples: 1851519460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 02:46:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:46:19,689][71000] Updated weights for policy 0, policy_version 141774 (0.0035) [2024-06-13 02:46:20,942][70768] Fps is (10 sec: 49140.7, 60 sec: 49696.3, 300 sec: 49540.4). Total num frames: 2322907136. Throughput: 0: 49358.1. Samples: 1851670380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 02:46:20,942][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:46:22,483][71000] Updated weights for policy 0, policy_version 141784 (0.0032) [2024-06-13 02:46:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.9, 300 sec: 49429.7). Total num frames: 2323136512. Throughput: 0: 49359.7. Samples: 1851960400. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 02:46:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:46:26,216][71000] Updated weights for policy 0, policy_version 141794 (0.0026) [2024-06-13 02:46:29,271][71000] Updated weights for policy 0, policy_version 141804 (0.0025) [2024-06-13 02:46:30,940][70768] Fps is (10 sec: 47524.7, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2323382272. Throughput: 0: 49358.7. Samples: 1852260580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 02:46:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:46:32,662][71000] Updated weights for policy 0, policy_version 141814 (0.0031) [2024-06-13 02:46:35,757][71000] Updated weights for policy 0, policy_version 141824 (0.0032) [2024-06-13 02:46:35,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49701.2, 300 sec: 49540.7). Total num frames: 2323644416. Throughput: 0: 49443.0. Samples: 1852414520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 02:46:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:46:39,485][71000] Updated weights for policy 0, policy_version 141834 (0.0029) [2024-06-13 02:46:40,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 2323906560. Throughput: 0: 49425.8. Samples: 1852715640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 02:46:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:46:42,233][71000] Updated weights for policy 0, policy_version 141844 (0.0024) [2024-06-13 02:46:45,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 49485.3). Total num frames: 2324119552. Throughput: 0: 49299.6. Samples: 1853003800. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 02:46:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:46:45,961][71000] Updated weights for policy 0, policy_version 141854 (0.0027) [2024-06-13 02:46:49,116][71000] Updated weights for policy 0, policy_version 141864 (0.0026) [2024-06-13 02:46:50,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2324381696. Throughput: 0: 49471.4. Samples: 1853160620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 02:46:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:46:52,306][71000] Updated weights for policy 0, policy_version 141874 (0.0025) [2024-06-13 02:46:55,856][71000] Updated weights for policy 0, policy_version 141884 (0.0025) [2024-06-13 02:46:55,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2324627456. Throughput: 0: 49507.5. Samples: 1853453840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 02:46:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:46:59,016][71000] Updated weights for policy 0, policy_version 141894 (0.0035) [2024-06-13 02:47:00,124][70980] Signal inference workers to stop experience collection... (27500 times) [2024-06-13 02:47:00,125][70980] Signal inference workers to resume experience collection... (27500 times) [2024-06-13 02:47:00,156][71000] InferenceWorker_p0-w0: stopping experience collection (27500 times) [2024-06-13 02:47:00,156][71000] InferenceWorker_p0-w0: resuming experience collection (27500 times) [2024-06-13 02:47:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49485.3). Total num frames: 2324889600. Throughput: 0: 49661.3. Samples: 1853754220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 02:47:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:47:02,211][71000] Updated weights for policy 0, policy_version 141904 (0.0032) [2024-06-13 02:47:05,798][71000] Updated weights for policy 0, policy_version 141914 (0.0026) [2024-06-13 02:47:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49424.9, 300 sec: 49429.7). Total num frames: 2325118976. Throughput: 0: 49610.9. Samples: 1853902760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 02:47:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:47:08,906][71000] Updated weights for policy 0, policy_version 141924 (0.0034) [2024-06-13 02:47:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2325364736. Throughput: 0: 49531.6. Samples: 1854189320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 02:47:10,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 02:47:12,703][71000] Updated weights for policy 0, policy_version 141934 (0.0027) [2024-06-13 02:47:15,814][71000] Updated weights for policy 0, policy_version 141944 (0.0036) [2024-06-13 02:47:15,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2325610496. Throughput: 0: 49440.9. Samples: 1854485420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 02:47:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:47:19,244][71000] Updated weights for policy 0, policy_version 141954 (0.0027) [2024-06-13 02:47:20,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49427.1, 300 sec: 49429.7). Total num frames: 2325872640. Throughput: 0: 49210.9. Samples: 1854629000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 02:47:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:47:22,498][71000] Updated weights for policy 0, policy_version 141964 (0.0031) [2024-06-13 02:47:25,798][71000] Updated weights for policy 0, policy_version 141974 (0.0027) [2024-06-13 02:47:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2326102016. Throughput: 0: 49119.5. Samples: 1854926020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 02:47:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:47:28,855][71000] Updated weights for policy 0, policy_version 141984 (0.0021) [2024-06-13 02:47:30,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2326347776. Throughput: 0: 49264.4. Samples: 1855220700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 02:47:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:47:32,356][71000] Updated weights for policy 0, policy_version 141994 (0.0045) [2024-06-13 02:47:35,491][71000] Updated weights for policy 0, policy_version 142004 (0.0030) [2024-06-13 02:47:35,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49152.2, 300 sec: 49374.2). Total num frames: 2326593536. Throughput: 0: 49288.1. Samples: 1855378580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 02:47:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:47:38,880][71000] Updated weights for policy 0, policy_version 142014 (0.0033) [2024-06-13 02:47:40,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2326855680. Throughput: 0: 49186.7. Samples: 1855667240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 02:47:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:47:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000142020_2326855680.pth... 
[2024-06-13 02:47:41,011][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000141296_2314993664.pth [2024-06-13 02:47:42,226][71000] Updated weights for policy 0, policy_version 142024 (0.0032) [2024-06-13 02:47:45,701][71000] Updated weights for policy 0, policy_version 142034 (0.0027) [2024-06-13 02:47:45,939][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2327085056. Throughput: 0: 49131.2. Samples: 1855965120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 02:47:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:47:48,795][71000] Updated weights for policy 0, policy_version 142044 (0.0031) [2024-06-13 02:47:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2327347200. Throughput: 0: 49236.5. Samples: 1856118400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 02:47:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:47:51,907][71000] Updated weights for policy 0, policy_version 142054 (0.0035) [2024-06-13 02:47:55,206][71000] Updated weights for policy 0, policy_version 142064 (0.0024) [2024-06-13 02:47:55,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2327592960. Throughput: 0: 49431.4. Samples: 1856413740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 02:47:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:47:58,526][71000] Updated weights for policy 0, policy_version 142074 (0.0032) [2024-06-13 02:48:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2327838720. Throughput: 0: 49400.9. Samples: 1856708460. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 02:48:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:48:02,407][71000] Updated weights for policy 0, policy_version 142084 (0.0027) [2024-06-13 02:48:05,643][71000] Updated weights for policy 0, policy_version 142094 (0.0033) [2024-06-13 02:48:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2328068096. Throughput: 0: 49318.0. Samples: 1856848320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 02:48:05,949][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:48:09,047][71000] Updated weights for policy 0, policy_version 142104 (0.0025) [2024-06-13 02:48:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2328330240. Throughput: 0: 49481.9. Samples: 1857152700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 02:48:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:48:12,054][71000] Updated weights for policy 0, policy_version 142114 (0.0029) [2024-06-13 02:48:14,412][70980] Signal inference workers to stop experience collection... (27550 times) [2024-06-13 02:48:14,444][71000] InferenceWorker_p0-w0: stopping experience collection (27550 times) [2024-06-13 02:48:14,458][70980] Signal inference workers to resume experience collection... (27550 times) [2024-06-13 02:48:14,460][71000] InferenceWorker_p0-w0: resuming experience collection (27550 times) [2024-06-13 02:48:15,391][71000] Updated weights for policy 0, policy_version 142124 (0.0018) [2024-06-13 02:48:15,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2328576000. Throughput: 0: 49455.1. Samples: 1857446180. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 02:48:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:48:18,421][71000] Updated weights for policy 0, policy_version 142134 (0.0026) [2024-06-13 02:48:20,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49151.8, 300 sec: 49485.2). Total num frames: 2328821760. Throughput: 0: 49095.2. Samples: 1857587880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 02:48:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:48:22,266][71000] Updated weights for policy 0, policy_version 142144 (0.0035) [2024-06-13 02:48:25,275][71000] Updated weights for policy 0, policy_version 142154 (0.0033) [2024-06-13 02:48:25,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2329051136. Throughput: 0: 49188.4. Samples: 1857880720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 02:48:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:48:29,043][71000] Updated weights for policy 0, policy_version 142164 (0.0030) [2024-06-13 02:48:30,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2329280512. Throughput: 0: 49039.9. Samples: 1858171920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 02:48:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:48:32,120][71000] Updated weights for policy 0, policy_version 142174 (0.0020) [2024-06-13 02:48:35,436][71000] Updated weights for policy 0, policy_version 142184 (0.0035) [2024-06-13 02:48:35,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2329542656. Throughput: 0: 49065.4. Samples: 1858326340. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 02:48:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:48:38,687][71000] Updated weights for policy 0, policy_version 142194 (0.0029) [2024-06-13 02:48:40,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2329804800. Throughput: 0: 48895.2. Samples: 1858614020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 02:48:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:48:42,218][71000] Updated weights for policy 0, policy_version 142204 (0.0022) [2024-06-13 02:48:45,239][71000] Updated weights for policy 0, policy_version 142214 (0.0030) [2024-06-13 02:48:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2330034176. Throughput: 0: 48808.0. Samples: 1858904820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:48:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:48:48,754][71000] Updated weights for policy 0, policy_version 142224 (0.0020) [2024-06-13 02:48:50,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.9, 300 sec: 49318.6). Total num frames: 2330263552. Throughput: 0: 49090.8. Samples: 1859057400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:48:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:48:51,991][71000] Updated weights for policy 0, policy_version 142234 (0.0035) [2024-06-13 02:48:55,617][71000] Updated weights for policy 0, policy_version 142244 (0.0025) [2024-06-13 02:48:55,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2330542080. Throughput: 0: 48918.5. Samples: 1859354040. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:48:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:48:58,791][71000] Updated weights for policy 0, policy_version 142254 (0.0029) [2024-06-13 02:49:00,940][70768] Fps is (10 sec: 50789.5, 60 sec: 48878.8, 300 sec: 49374.2). Total num frames: 2330771456. Throughput: 0: 48971.0. Samples: 1859649880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:49:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:49:02,419][71000] Updated weights for policy 0, policy_version 142264 (0.0033) [2024-06-13 02:49:05,443][71000] Updated weights for policy 0, policy_version 142274 (0.0022) [2024-06-13 02:49:05,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2331033600. Throughput: 0: 49116.7. Samples: 1859798120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:49:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:49:08,939][71000] Updated weights for policy 0, policy_version 142284 (0.0039) [2024-06-13 02:49:10,939][70768] Fps is (10 sec: 45876.3, 60 sec: 48332.8, 300 sec: 49263.1). Total num frames: 2331230208. Throughput: 0: 48981.9. Samples: 1860084900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:49:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:49:12,050][71000] Updated weights for policy 0, policy_version 142294 (0.0044) [2024-06-13 02:49:15,659][71000] Updated weights for policy 0, policy_version 142304 (0.0029) [2024-06-13 02:49:15,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2331508736. Throughput: 0: 48968.4. Samples: 1860375500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:49:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:49:18,794][71000] Updated weights for policy 0, policy_version 142314 (0.0023) [2024-06-13 02:49:20,372][70980] Signal inference workers to stop experience collection... 
(27600 times) [2024-06-13 02:49:20,372][70980] Signal inference workers to resume experience collection... (27600 times) [2024-06-13 02:49:20,405][71000] InferenceWorker_p0-w0: stopping experience collection (27600 times) [2024-06-13 02:49:20,405][71000] InferenceWorker_p0-w0: resuming experience collection (27600 times) [2024-06-13 02:49:20,940][70768] Fps is (10 sec: 52427.8, 60 sec: 48879.0, 300 sec: 49374.1). Total num frames: 2331754496. Throughput: 0: 48806.5. Samples: 1860522640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:49:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 02:49:22,173][71000] Updated weights for policy 0, policy_version 142324 (0.0021) [2024-06-13 02:49:25,338][71000] Updated weights for policy 0, policy_version 142334 (0.0027) [2024-06-13 02:49:25,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2332016640. Throughput: 0: 49259.4. Samples: 1860830700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:49:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:49:28,723][71000] Updated weights for policy 0, policy_version 142344 (0.0024) [2024-06-13 02:49:30,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2332246016. Throughput: 0: 49349.4. Samples: 1861125540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:49:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:49:32,016][71000] Updated weights for policy 0, policy_version 142354 (0.0032) [2024-06-13 02:49:35,939][70768] Fps is (10 sec: 45875.9, 60 sec: 48878.9, 300 sec: 49207.6). Total num frames: 2332475392. Throughput: 0: 49067.6. Samples: 1861265440. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:49:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:49:36,010][71000] Updated weights for policy 0, policy_version 142364 (0.0026) [2024-06-13 02:49:38,767][71000] Updated weights for policy 0, policy_version 142374 (0.0035) [2024-06-13 02:49:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 49374.1). Total num frames: 2332737536. Throughput: 0: 48930.8. Samples: 1861555920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:49:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:49:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000142379_2332737536.pth... [2024-06-13 02:49:41,010][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000141658_2320924672.pth [2024-06-13 02:49:42,253][71000] Updated weights for policy 0, policy_version 142384 (0.0026) [2024-06-13 02:49:45,629][71000] Updated weights for policy 0, policy_version 142394 (0.0035) [2024-06-13 02:49:45,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2332999680. Throughput: 0: 49145.0. Samples: 1861861400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 02:49:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:49:48,775][71000] Updated weights for policy 0, policy_version 142404 (0.0028) [2024-06-13 02:49:50,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2333212672. Throughput: 0: 49128.4. Samples: 1862008900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:49:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:49:52,046][71000] Updated weights for policy 0, policy_version 142414 (0.0030) [2024-06-13 02:49:55,474][71000] Updated weights for policy 0, policy_version 142424 (0.0028) [2024-06-13 02:49:55,939][70768] Fps is (10 sec: 47514.6, 60 sec: 48879.1, 300 sec: 49263.1). 
Total num frames: 2333474816. Throughput: 0: 49373.8. Samples: 1862306720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:49:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:49:58,653][71000] Updated weights for policy 0, policy_version 142434 (0.0035) [2024-06-13 02:50:00,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.3, 300 sec: 49429.7). Total num frames: 2333736960. Throughput: 0: 49473.5. Samples: 1862601800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:50:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:50:02,440][71000] Updated weights for policy 0, policy_version 142444 (0.0029) [2024-06-13 02:50:05,474][71000] Updated weights for policy 0, policy_version 142454 (0.0028) [2024-06-13 02:50:05,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2333982720. Throughput: 0: 49703.7. Samples: 1862759300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:50:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:50:08,927][71000] Updated weights for policy 0, policy_version 142464 (0.0029) [2024-06-13 02:50:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2334212096. Throughput: 0: 49105.9. Samples: 1863040460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:50:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:50:12,238][71000] Updated weights for policy 0, policy_version 142474 (0.0029) [2024-06-13 02:50:15,800][71000] Updated weights for policy 0, policy_version 142484 (0.0025) [2024-06-13 02:50:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2334457856. Throughput: 0: 48978.5. Samples: 1863329580. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:50:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:50:18,943][71000] Updated weights for policy 0, policy_version 142494 (0.0029) [2024-06-13 02:50:20,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49152.0, 300 sec: 49207.7). Total num frames: 2334703616. Throughput: 0: 49205.6. Samples: 1863479700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:50:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:50:22,482][71000] Updated weights for policy 0, policy_version 142504 (0.0033) [2024-06-13 02:50:25,567][71000] Updated weights for policy 0, policy_version 142514 (0.0024) [2024-06-13 02:50:25,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2334965760. Throughput: 0: 49376.9. Samples: 1863777880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:50:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:50:29,044][71000] Updated weights for policy 0, policy_version 142524 (0.0021) [2024-06-13 02:50:30,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49151.9, 300 sec: 49263.7). Total num frames: 2335195136. Throughput: 0: 49054.8. Samples: 1864068860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:50:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:50:32,355][71000] Updated weights for policy 0, policy_version 142534 (0.0022) [2024-06-13 02:50:35,786][71000] Updated weights for policy 0, policy_version 142544 (0.0035) [2024-06-13 02:50:35,939][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49207.6). Total num frames: 2335440896. Throughput: 0: 48973.3. Samples: 1864212700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:50:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:50:38,332][70980] Signal inference workers to stop experience collection... 
(27650 times) [2024-06-13 02:50:38,354][71000] InferenceWorker_p0-w0: stopping experience collection (27650 times) [2024-06-13 02:50:38,387][70980] Signal inference workers to resume experience collection... (27650 times) [2024-06-13 02:50:38,387][71000] InferenceWorker_p0-w0: resuming experience collection (27650 times) [2024-06-13 02:50:38,941][71000] Updated weights for policy 0, policy_version 142554 (0.0026) [2024-06-13 02:50:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2335686656. Throughput: 0: 48872.3. Samples: 1864505980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:50:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:50:42,684][71000] Updated weights for policy 0, policy_version 142564 (0.0036) [2024-06-13 02:50:45,585][71000] Updated weights for policy 0, policy_version 142574 (0.0027) [2024-06-13 02:50:45,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2335948800. Throughput: 0: 49032.3. Samples: 1864808260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 02:50:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:50:49,055][71000] Updated weights for policy 0, policy_version 142584 (0.0033) [2024-06-13 02:50:50,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2336178176. Throughput: 0: 48901.8. Samples: 1864959880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 02:50:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:50:52,443][71000] Updated weights for policy 0, policy_version 142594 (0.0022) [2024-06-13 02:50:55,939][70768] Fps is (10 sec: 45875.9, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2336407552. Throughput: 0: 49051.2. Samples: 1865247760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 02:50:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:50:56,023][71000] Updated weights for policy 0, policy_version 142604 (0.0025) [2024-06-13 02:50:59,118][71000] Updated weights for policy 0, policy_version 142614 (0.0026) [2024-06-13 02:51:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2336669696. Throughput: 0: 49013.0. Samples: 1865535160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 02:51:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:51:02,570][71000] Updated weights for policy 0, policy_version 142624 (0.0025) [2024-06-13 02:51:05,927][71000] Updated weights for policy 0, policy_version 142634 (0.0037) [2024-06-13 02:51:05,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2336915456. Throughput: 0: 49213.9. Samples: 1865694320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 02:51:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:51:08,891][71000] Updated weights for policy 0, policy_version 142644 (0.0033) [2024-06-13 02:51:10,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 2337177600. Throughput: 0: 49209.1. Samples: 1865992300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 02:51:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:51:12,369][71000] Updated weights for policy 0, policy_version 142654 (0.0026) [2024-06-13 02:51:15,802][71000] Updated weights for policy 0, policy_version 142664 (0.0022) [2024-06-13 02:51:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49152.4). Total num frames: 2337406976. Throughput: 0: 49209.3. Samples: 1866283280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 02:51:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:51:19,281][71000] Updated weights for policy 0, policy_version 142674 (0.0030) [2024-06-13 02:51:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2337652736. Throughput: 0: 49210.4. Samples: 1866427180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 02:51:20,944][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:51:22,499][71000] Updated weights for policy 0, policy_version 142684 (0.0025) [2024-06-13 02:51:25,907][71000] Updated weights for policy 0, policy_version 142694 (0.0033) [2024-06-13 02:51:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2337898496. Throughput: 0: 49110.5. Samples: 1866715960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 02:51:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:51:29,104][71000] Updated weights for policy 0, policy_version 142704 (0.0028) [2024-06-13 02:51:30,940][70768] Fps is (10 sec: 50791.6, 60 sec: 49425.1, 300 sec: 49207.6). Total num frames: 2338160640. Throughput: 0: 49138.7. Samples: 1867019500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 02:51:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:51:32,423][71000] Updated weights for policy 0, policy_version 142714 (0.0028) [2024-06-13 02:51:35,547][71000] Updated weights for policy 0, policy_version 142724 (0.0027) [2024-06-13 02:51:35,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2338406400. Throughput: 0: 49353.0. Samples: 1867180760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 02:51:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:51:38,844][71000] Updated weights for policy 0, policy_version 142734 (0.0032) [2024-06-13 02:51:40,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2338652160. Throughput: 0: 49514.6. Samples: 1867475920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 02:51:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:51:40,996][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000142741_2338668544.pth... [2024-06-13 02:51:41,051][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000142020_2326855680.pth [2024-06-13 02:51:42,138][71000] Updated weights for policy 0, policy_version 142744 (0.0026) [2024-06-13 02:51:45,594][71000] Updated weights for policy 0, policy_version 142754 (0.0030) [2024-06-13 02:51:45,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2338897920. Throughput: 0: 49739.9. Samples: 1867773460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 02:51:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:51:48,830][71000] Updated weights for policy 0, policy_version 142764 (0.0033) [2024-06-13 02:51:50,939][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2339160064. Throughput: 0: 49501.8. Samples: 1867921900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 02:51:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:51:52,008][70980] Signal inference workers to stop experience collection... (27700 times) [2024-06-13 02:51:52,009][70980] Signal inference workers to resume experience collection... 
(27700 times) [2024-06-13 02:51:52,054][71000] InferenceWorker_p0-w0: stopping experience collection (27700 times) [2024-06-13 02:51:52,054][71000] InferenceWorker_p0-w0: resuming experience collection (27700 times) [2024-06-13 02:51:52,139][71000] Updated weights for policy 0, policy_version 142774 (0.0029) [2024-06-13 02:51:55,010][71000] Updated weights for policy 0, policy_version 142784 (0.0030) [2024-06-13 02:51:55,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49971.1, 300 sec: 49207.6). Total num frames: 2339405824. Throughput: 0: 49622.5. Samples: 1868225300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 02:51:55,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 02:51:58,511][71000] Updated weights for policy 0, policy_version 142794 (0.0021) [2024-06-13 02:52:00,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49698.0, 300 sec: 49263.1). Total num frames: 2339651584. Throughput: 0: 49862.5. Samples: 1868527100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 02:52:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:52:01,647][71000] Updated weights for policy 0, policy_version 142804 (0.0026) [2024-06-13 02:52:05,210][71000] Updated weights for policy 0, policy_version 142814 (0.0021) [2024-06-13 02:52:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2339880960. Throughput: 0: 49842.1. Samples: 1868670060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 02:52:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:52:08,292][71000] Updated weights for policy 0, policy_version 142824 (0.0034) [2024-06-13 02:52:10,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 2340143104. Throughput: 0: 49943.2. Samples: 1868963400. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 02:52:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:52:11,798][71000] Updated weights for policy 0, policy_version 142834 (0.0035) [2024-06-13 02:52:14,846][71000] Updated weights for policy 0, policy_version 142844 (0.0019) [2024-06-13 02:52:15,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.2, 300 sec: 49207.5). Total num frames: 2340388864. Throughput: 0: 49789.4. Samples: 1869260020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 02:52:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 02:52:18,362][71000] Updated weights for policy 0, policy_version 142854 (0.0033) [2024-06-13 02:52:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49971.3, 300 sec: 49318.6). Total num frames: 2340651008. Throughput: 0: 49652.3. Samples: 1869415120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 02:52:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:52:21,269][71000] Updated weights for policy 0, policy_version 142864 (0.0026) [2024-06-13 02:52:25,011][71000] Updated weights for policy 0, policy_version 142874 (0.0025) [2024-06-13 02:52:25,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.2, 300 sec: 49207.5). Total num frames: 2340864000. Throughput: 0: 49715.5. Samples: 1869713120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 02:52:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:52:28,077][71000] Updated weights for policy 0, policy_version 142884 (0.0033) [2024-06-13 02:52:30,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2341109760. Throughput: 0: 49408.5. Samples: 1869996840. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 02:52:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:52:31,781][71000] Updated weights for policy 0, policy_version 142894 (0.0028) [2024-06-13 02:52:34,815][71000] Updated weights for policy 0, policy_version 142904 (0.0026) [2024-06-13 02:52:35,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2341388288. Throughput: 0: 49583.5. Samples: 1870153160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 02:52:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:52:38,366][71000] Updated weights for policy 0, policy_version 142914 (0.0031) [2024-06-13 02:52:40,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 2341634048. Throughput: 0: 49508.8. Samples: 1870453200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 02:52:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:52:41,341][71000] Updated weights for policy 0, policy_version 142924 (0.0024) [2024-06-13 02:52:45,080][71000] Updated weights for policy 0, policy_version 142934 (0.0031) [2024-06-13 02:52:45,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49698.3, 300 sec: 49263.1). Total num frames: 2341879808. Throughput: 0: 49269.1. Samples: 1870744200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 02:52:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:52:47,918][71000] Updated weights for policy 0, policy_version 142944 (0.0022) [2024-06-13 02:52:50,737][70980] Signal inference workers to stop experience collection... (27750 times) [2024-06-13 02:52:50,737][70980] Signal inference workers to resume experience collection... 
(27750 times) [2024-06-13 02:52:50,753][71000] InferenceWorker_p0-w0: stopping experience collection (27750 times) [2024-06-13 02:52:50,753][71000] InferenceWorker_p0-w0: resuming experience collection (27750 times) [2024-06-13 02:52:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49424.9, 300 sec: 49263.1). Total num frames: 2342125568. Throughput: 0: 49307.4. Samples: 1870888900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 02:52:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:52:51,662][71000] Updated weights for policy 0, policy_version 142954 (0.0030) [2024-06-13 02:52:54,586][71000] Updated weights for policy 0, policy_version 142964 (0.0031) [2024-06-13 02:52:55,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2342371328. Throughput: 0: 49324.4. Samples: 1871183000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 02:52:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:52:58,394][71000] Updated weights for policy 0, policy_version 142974 (0.0029) [2024-06-13 02:53:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2342617088. Throughput: 0: 49363.7. Samples: 1871481400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:53:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:53:01,507][71000] Updated weights for policy 0, policy_version 142984 (0.0017) [2024-06-13 02:53:05,339][71000] Updated weights for policy 0, policy_version 142994 (0.0027) [2024-06-13 02:53:05,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2342846464. Throughput: 0: 49052.6. Samples: 1871622480. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:53:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:53:08,077][71000] Updated weights for policy 0, policy_version 143004 (0.0031) [2024-06-13 02:53:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2343108608. Throughput: 0: 49094.5. Samples: 1871922380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:53:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:53:11,782][71000] Updated weights for policy 0, policy_version 143014 (0.0035) [2024-06-13 02:53:14,733][71000] Updated weights for policy 0, policy_version 143024 (0.0038) [2024-06-13 02:53:15,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49424.9, 300 sec: 49263.1). Total num frames: 2343354368. Throughput: 0: 49358.6. Samples: 1872217980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:53:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 02:53:18,655][71000] Updated weights for policy 0, policy_version 143034 (0.0025) [2024-06-13 02:53:20,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2343600128. Throughput: 0: 49320.4. Samples: 1872372580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:53:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:53:21,281][71000] Updated weights for policy 0, policy_version 143044 (0.0028) [2024-06-13 02:53:25,113][71000] Updated weights for policy 0, policy_version 143054 (0.0026) [2024-06-13 02:53:25,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 2343845888. Throughput: 0: 49196.4. Samples: 1872667040. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:53:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 02:53:27,854][71000] Updated weights for policy 0, policy_version 143064 (0.0027) [2024-06-13 02:53:30,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 2344091648. Throughput: 0: 49339.0. Samples: 1872964460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:53:30,948][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:53:31,487][71000] Updated weights for policy 0, policy_version 143074 (0.0026) [2024-06-13 02:53:34,563][71000] Updated weights for policy 0, policy_version 143084 (0.0029) [2024-06-13 02:53:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2344337408. Throughput: 0: 49442.8. Samples: 1873113820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:53:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:53:38,372][71000] Updated weights for policy 0, policy_version 143094 (0.0029) [2024-06-13 02:53:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2344583168. Throughput: 0: 49297.5. Samples: 1873401380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:53:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:53:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000143102_2344583168.pth... [2024-06-13 02:53:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000142379_2332737536.pth [2024-06-13 02:53:41,524][71000] Updated weights for policy 0, policy_version 143104 (0.0029) [2024-06-13 02:53:45,122][71000] Updated weights for policy 0, policy_version 143114 (0.0028) [2024-06-13 02:53:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.8, 300 sec: 49318.6). Total num frames: 2344812544. Throughput: 0: 49337.0. Samples: 1873701560. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:53:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:53:47,878][71000] Updated weights for policy 0, policy_version 143124 (0.0027) [2024-06-13 02:53:50,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2345091072. Throughput: 0: 49365.1. Samples: 1873843920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:53:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:53:51,354][71000] Updated weights for policy 0, policy_version 143134 (0.0024) [2024-06-13 02:53:54,727][71000] Updated weights for policy 0, policy_version 143144 (0.0030) [2024-06-13 02:53:55,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2345320448. Throughput: 0: 49264.1. Samples: 1874139260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 02:53:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:53:58,328][71000] Updated weights for policy 0, policy_version 143154 (0.0024) [2024-06-13 02:54:00,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2345582592. Throughput: 0: 49191.7. Samples: 1874431600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 02:54:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:54:01,255][71000] Updated weights for policy 0, policy_version 143164 (0.0026) [2024-06-13 02:54:05,009][71000] Updated weights for policy 0, policy_version 143174 (0.0032) [2024-06-13 02:54:05,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 2345795584. Throughput: 0: 49139.1. Samples: 1874583840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 02:54:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:54:07,993][70980] Signal inference workers to stop experience collection... 
(27800 times) [2024-06-13 02:54:07,994][70980] Signal inference workers to resume experience collection... (27800 times) [2024-06-13 02:54:08,035][71000] InferenceWorker_p0-w0: stopping experience collection (27800 times) [2024-06-13 02:54:08,036][71000] InferenceWorker_p0-w0: resuming experience collection (27800 times) [2024-06-13 02:54:08,123][71000] Updated weights for policy 0, policy_version 143184 (0.0025) [2024-06-13 02:54:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2346057728. Throughput: 0: 49192.1. Samples: 1874880680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 02:54:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:54:11,297][71000] Updated weights for policy 0, policy_version 143194 (0.0023) [2024-06-13 02:54:14,393][71000] Updated weights for policy 0, policy_version 143204 (0.0031) [2024-06-13 02:54:15,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2346303488. Throughput: 0: 49321.3. Samples: 1875183920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 02:54:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:54:17,624][71000] Updated weights for policy 0, policy_version 143214 (0.0027) [2024-06-13 02:54:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2346565632. Throughput: 0: 49299.0. Samples: 1875332280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 02:54:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:54:20,990][71000] Updated weights for policy 0, policy_version 143224 (0.0025) [2024-06-13 02:54:24,700][71000] Updated weights for policy 0, policy_version 143234 (0.0026) [2024-06-13 02:54:25,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2346795008. Throughput: 0: 49476.9. Samples: 1875627840. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 02:54:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:54:27,837][71000] Updated weights for policy 0, policy_version 143244 (0.0028) [2024-06-13 02:54:30,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2347057152. Throughput: 0: 49601.2. Samples: 1875933620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 02:54:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:54:31,001][71000] Updated weights for policy 0, policy_version 143254 (0.0028) [2024-06-13 02:54:34,187][71000] Updated weights for policy 0, policy_version 143264 (0.0025) [2024-06-13 02:54:35,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2347319296. Throughput: 0: 49758.7. Samples: 1876083060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 02:54:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 02:54:37,509][71000] Updated weights for policy 0, policy_version 143274 (0.0033) [2024-06-13 02:54:40,594][71000] Updated weights for policy 0, policy_version 143284 (0.0024) [2024-06-13 02:54:40,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.0, 300 sec: 49374.2). Total num frames: 2347565056. Throughput: 0: 49764.8. Samples: 1876378680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 02:54:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:54:44,510][71000] Updated weights for policy 0, policy_version 143294 (0.0025) [2024-06-13 02:54:45,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 2347810816. Throughput: 0: 49881.8. Samples: 1876676280. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 02:54:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:54:47,597][71000] Updated weights for policy 0, policy_version 143304 (0.0039) [2024-06-13 02:54:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2348040192. Throughput: 0: 49784.8. Samples: 1876824160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 02:54:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:54:51,068][71000] Updated weights for policy 0, policy_version 143314 (0.0025) [2024-06-13 02:54:54,123][71000] Updated weights for policy 0, policy_version 143324 (0.0026) [2024-06-13 02:54:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2348285952. Throughput: 0: 49611.6. Samples: 1877113200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 02:54:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:54:57,520][71000] Updated weights for policy 0, policy_version 143334 (0.0028) [2024-06-13 02:55:00,623][71000] Updated weights for policy 0, policy_version 143344 (0.0027) [2024-06-13 02:55:00,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2348548096. Throughput: 0: 49405.7. Samples: 1877407180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 02:55:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 02:55:04,165][71000] Updated weights for policy 0, policy_version 143354 (0.0020) [2024-06-13 02:55:05,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 2348793856. Throughput: 0: 49643.6. Samples: 1877566240. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 02:55:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:55:07,286][71000] Updated weights for policy 0, policy_version 143364 (0.0034) [2024-06-13 02:55:10,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2349023232. Throughput: 0: 49660.5. Samples: 1877862560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 02:55:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:55:11,025][71000] Updated weights for policy 0, policy_version 143374 (0.0036) [2024-06-13 02:55:13,550][70980] Signal inference workers to stop experience collection... (27850 times) [2024-06-13 02:55:13,553][70980] Signal inference workers to resume experience collection... (27850 times) [2024-06-13 02:55:13,575][71000] InferenceWorker_p0-w0: stopping experience collection (27850 times) [2024-06-13 02:55:13,576][71000] InferenceWorker_p0-w0: resuming experience collection (27850 times) [2024-06-13 02:55:13,688][71000] Updated weights for policy 0, policy_version 143384 (0.0023) [2024-06-13 02:55:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2349268992. Throughput: 0: 49452.9. Samples: 1878159000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 02:55:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:55:17,656][71000] Updated weights for policy 0, policy_version 143394 (0.0024) [2024-06-13 02:55:20,479][71000] Updated weights for policy 0, policy_version 143404 (0.0029) [2024-06-13 02:55:20,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2349547520. Throughput: 0: 49297.8. Samples: 1878301460. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 02:55:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 02:55:24,073][71000] Updated weights for policy 0, policy_version 143414 (0.0022) [2024-06-13 02:55:25,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 2349793280. Throughput: 0: 49461.5. Samples: 1878604440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 02:55:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:55:27,170][71000] Updated weights for policy 0, policy_version 143424 (0.0034) [2024-06-13 02:55:30,863][71000] Updated weights for policy 0, policy_version 143434 (0.0024) [2024-06-13 02:55:30,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2350022656. Throughput: 0: 49522.6. Samples: 1878904800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 02:55:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:55:33,671][71000] Updated weights for policy 0, policy_version 143444 (0.0027) [2024-06-13 02:55:35,942][70768] Fps is (10 sec: 47504.1, 60 sec: 49150.5, 300 sec: 49429.4). Total num frames: 2350268416. Throughput: 0: 49318.0. Samples: 1879043560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 02:55:35,942][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:55:37,460][71000] Updated weights for policy 0, policy_version 143454 (0.0030) [2024-06-13 02:55:40,337][71000] Updated weights for policy 0, policy_version 143464 (0.0031) [2024-06-13 02:55:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2350530560. Throughput: 0: 49630.6. Samples: 1879346580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 02:55:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:55:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000143465_2350530560.pth... 
[2024-06-13 02:55:41,009][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000142741_2338668544.pth [2024-06-13 02:55:43,806][71000] Updated weights for policy 0, policy_version 143474 (0.0020) [2024-06-13 02:55:45,939][70768] Fps is (10 sec: 52439.5, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 2350792704. Throughput: 0: 49754.8. Samples: 1879646140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 02:55:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:55:47,151][71000] Updated weights for policy 0, policy_version 143484 (0.0025) [2024-06-13 02:55:50,338][71000] Updated weights for policy 0, policy_version 143494 (0.0030) [2024-06-13 02:55:50,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 2351038464. Throughput: 0: 49667.0. Samples: 1879801260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 02:55:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 02:55:53,638][71000] Updated weights for policy 0, policy_version 143504 (0.0028) [2024-06-13 02:55:55,942][70768] Fps is (10 sec: 45863.4, 60 sec: 49423.0, 300 sec: 49429.3). Total num frames: 2351251456. Throughput: 0: 49605.6. Samples: 1880094940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 02:55:55,943][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:55:57,176][71000] Updated weights for policy 0, policy_version 143514 (0.0026) [2024-06-13 02:55:59,948][71000] Updated weights for policy 0, policy_version 143524 (0.0028) [2024-06-13 02:56:00,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 2351529984. Throughput: 0: 49649.9. Samples: 1880393240. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 02:56:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:56:03,775][71000] Updated weights for policy 0, policy_version 143534 (0.0035) [2024-06-13 02:56:05,940][70768] Fps is (10 sec: 54080.4, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 2351792128. Throughput: 0: 49993.8. Samples: 1880551180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 02:56:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 02:56:06,804][71000] Updated weights for policy 0, policy_version 143544 (0.0029) [2024-06-13 02:56:09,982][70980] Signal inference workers to stop experience collection... (27900 times) [2024-06-13 02:56:09,982][70980] Signal inference workers to resume experience collection... (27900 times) [2024-06-13 02:56:09,999][71000] InferenceWorker_p0-w0: stopping experience collection (27900 times) [2024-06-13 02:56:09,999][71000] InferenceWorker_p0-w0: resuming experience collection (27900 times) [2024-06-13 02:56:10,136][71000] Updated weights for policy 0, policy_version 143554 (0.0027) [2024-06-13 02:56:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 50244.2, 300 sec: 49596.3). Total num frames: 2352037888. Throughput: 0: 50015.5. Samples: 1880855140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-13 02:56:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 02:56:13,378][71000] Updated weights for policy 0, policy_version 143564 (0.0024) [2024-06-13 02:56:15,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49698.1, 300 sec: 49485.3). Total num frames: 2352250880. Throughput: 0: 49766.5. Samples: 1881144300. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-13 02:56:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:56:16,853][71000] Updated weights for policy 0, policy_version 143574 (0.0038) [2024-06-13 02:56:19,807][71000] Updated weights for policy 0, policy_version 143584 (0.0028) [2024-06-13 02:56:20,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 2352496640. Throughput: 0: 49815.5. Samples: 1881285160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-13 02:56:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:56:23,767][71000] Updated weights for policy 0, policy_version 143594 (0.0036) [2024-06-13 02:56:25,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2352758784. Throughput: 0: 49584.0. Samples: 1881577860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-13 02:56:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:56:26,287][71000] Updated weights for policy 0, policy_version 143604 (0.0028) [2024-06-13 02:56:30,116][71000] Updated weights for policy 0, policy_version 143614 (0.0031) [2024-06-13 02:56:30,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 2353004544. Throughput: 0: 49557.3. Samples: 1881876220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-13 02:56:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:56:32,957][71000] Updated weights for policy 0, policy_version 143624 (0.0027) [2024-06-13 02:56:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49426.7, 300 sec: 49429.7). Total num frames: 2353233920. Throughput: 0: 49426.8. Samples: 1882025460. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-13 02:56:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:56:36,807][71000] Updated weights for policy 0, policy_version 143634 (0.0037) [2024-06-13 02:56:39,823][71000] Updated weights for policy 0, policy_version 143644 (0.0027) [2024-06-13 02:56:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2353496064. Throughput: 0: 49553.9. Samples: 1882324740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-13 02:56:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:56:43,578][71000] Updated weights for policy 0, policy_version 143654 (0.0029) [2024-06-13 02:56:45,939][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2353758208. Throughput: 0: 49382.3. Samples: 1882615440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-13 02:56:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:56:46,341][71000] Updated weights for policy 0, policy_version 143664 (0.0030) [2024-06-13 02:56:50,025][71000] Updated weights for policy 0, policy_version 143674 (0.0033) [2024-06-13 02:56:50,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2354003968. Throughput: 0: 49270.7. Samples: 1882768360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-13 02:56:50,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-13 02:56:52,884][71000] Updated weights for policy 0, policy_version 143684 (0.0028) [2024-06-13 02:56:55,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49973.2, 300 sec: 49485.2). Total num frames: 2354249728. Throughput: 0: 49319.0. Samples: 1883074500. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-13 02:56:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:56:56,502][71000] Updated weights for policy 0, policy_version 143694 (0.0033) [2024-06-13 02:56:59,688][71000] Updated weights for policy 0, policy_version 143704 (0.0022) [2024-06-13 02:57:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 2354511872. Throughput: 0: 49668.5. Samples: 1883379380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-13 02:57:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:57:03,250][71000] Updated weights for policy 0, policy_version 143714 (0.0034) [2024-06-13 02:57:05,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 2354741248. Throughput: 0: 49743.1. Samples: 1883523600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-13 02:57:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:57:06,259][71000] Updated weights for policy 0, policy_version 143724 (0.0020) [2024-06-13 02:57:09,733][71000] Updated weights for policy 0, policy_version 143734 (0.0028) [2024-06-13 02:57:10,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2355003392. Throughput: 0: 49912.0. Samples: 1883823900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:57:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:57:12,573][71000] Updated weights for policy 0, policy_version 143744 (0.0027) [2024-06-13 02:57:15,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2355232768. Throughput: 0: 50075.8. Samples: 1884129640. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:57:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 02:57:16,244][71000] Updated weights for policy 0, policy_version 143754 (0.0025) [2024-06-13 02:57:19,543][71000] Updated weights for policy 0, policy_version 143764 (0.0034) [2024-06-13 02:57:20,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 2355478528. Throughput: 0: 49858.7. Samples: 1884269100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:57:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:57:22,865][71000] Updated weights for policy 0, policy_version 143774 (0.0025) [2024-06-13 02:57:23,610][70980] Signal inference workers to stop experience collection... (27950 times) [2024-06-13 02:57:23,660][71000] InferenceWorker_p0-w0: stopping experience collection (27950 times) [2024-06-13 02:57:23,660][70980] Signal inference workers to resume experience collection... (27950 times) [2024-06-13 02:57:23,682][71000] InferenceWorker_p0-w0: resuming experience collection (27950 times) [2024-06-13 02:57:25,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2355724288. Throughput: 0: 49895.6. Samples: 1884570040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:57:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:57:26,075][71000] Updated weights for policy 0, policy_version 143784 (0.0024) [2024-06-13 02:57:29,312][71000] Updated weights for policy 0, policy_version 143794 (0.0033) [2024-06-13 02:57:30,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 2356002816. Throughput: 0: 50130.2. Samples: 1884871300. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:57:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:57:32,463][71000] Updated weights for policy 0, policy_version 143804 (0.0029) [2024-06-13 02:57:35,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 2356232192. Throughput: 0: 49949.8. Samples: 1885016100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:57:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:57:36,035][71000] Updated weights for policy 0, policy_version 143814 (0.0029) [2024-06-13 02:57:39,164][71000] Updated weights for policy 0, policy_version 143824 (0.0038) [2024-06-13 02:57:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 2356494336. Throughput: 0: 49752.1. Samples: 1885313340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:57:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:57:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000143829_2356494336.pth... [2024-06-13 02:57:40,995][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000143102_2344583168.pth [2024-06-13 02:57:42,555][71000] Updated weights for policy 0, policy_version 143834 (0.0031) [2024-06-13 02:57:45,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49485.3). Total num frames: 2356723712. Throughput: 0: 49453.5. Samples: 1885604780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:57:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:57:45,961][71000] Updated weights for policy 0, policy_version 143844 (0.0031) [2024-06-13 02:57:49,052][71000] Updated weights for policy 0, policy_version 143854 (0.0023) [2024-06-13 02:57:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 2356985856. Throughput: 0: 49598.2. Samples: 1885755520. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:57:50,943][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:57:52,609][71000] Updated weights for policy 0, policy_version 143864 (0.0024) [2024-06-13 02:57:55,875][71000] Updated weights for policy 0, policy_version 143874 (0.0029) [2024-06-13 02:57:55,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 2357231616. Throughput: 0: 49405.2. Samples: 1886047140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:57:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:57:59,564][71000] Updated weights for policy 0, policy_version 143884 (0.0030) [2024-06-13 02:58:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 2357460992. Throughput: 0: 48891.3. Samples: 1886329740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:58:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:58:02,491][71000] Updated weights for policy 0, policy_version 143894 (0.0038) [2024-06-13 02:58:05,909][71000] Updated weights for policy 0, policy_version 143904 (0.0031) [2024-06-13 02:58:05,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 2357723136. Throughput: 0: 49400.0. Samples: 1886492100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:58:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:58:09,274][71000] Updated weights for policy 0, policy_version 143914 (0.0030) [2024-06-13 02:58:10,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49485.3). Total num frames: 2357952512. Throughput: 0: 49359.6. Samples: 1886791220. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 02:58:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:58:12,724][71000] Updated weights for policy 0, policy_version 143924 (0.0049) [2024-06-13 02:58:15,665][71000] Updated weights for policy 0, policy_version 143934 (0.0026) [2024-06-13 02:58:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.3, 300 sec: 49540.8). Total num frames: 2358214656. Throughput: 0: 49127.1. Samples: 1887082020. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 02:58:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:58:19,309][71000] Updated weights for policy 0, policy_version 143944 (0.0033) [2024-06-13 02:58:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2358444032. Throughput: 0: 49170.2. Samples: 1887228760. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 02:58:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:58:22,685][71000] Updated weights for policy 0, policy_version 143954 (0.0027) [2024-06-13 02:58:25,781][71000] Updated weights for policy 0, policy_version 143964 (0.0028) [2024-06-13 02:58:25,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 2358706176. Throughput: 0: 49140.9. Samples: 1887524680. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 02:58:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:58:28,060][70980] Signal inference workers to stop experience collection... (28000 times) [2024-06-13 02:58:28,062][70980] Signal inference workers to resume experience collection... 
(28000 times) [2024-06-13 02:58:28,089][71000] InferenceWorker_p0-w0: stopping experience collection (28000 times) [2024-06-13 02:58:28,090][71000] InferenceWorker_p0-w0: resuming experience collection (28000 times) [2024-06-13 02:58:28,950][71000] Updated weights for policy 0, policy_version 143974 (0.0036) [2024-06-13 02:58:30,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 2358951936. Throughput: 0: 49430.2. Samples: 1887829140. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 02:58:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:58:32,418][71000] Updated weights for policy 0, policy_version 143984 (0.0027) [2024-06-13 02:58:35,675][71000] Updated weights for policy 0, policy_version 143994 (0.0032) [2024-06-13 02:58:35,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2359197696. Throughput: 0: 49228.0. Samples: 1887970780. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 02:58:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:58:38,781][71000] Updated weights for policy 0, policy_version 144004 (0.0019) [2024-06-13 02:58:40,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.9, 300 sec: 49596.3). Total num frames: 2359443456. Throughput: 0: 49284.9. Samples: 1888264960. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 02:58:40,948][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:58:42,608][71000] Updated weights for policy 0, policy_version 144014 (0.0032) [2024-06-13 02:58:45,644][71000] Updated weights for policy 0, policy_version 144024 (0.0023) [2024-06-13 02:58:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 49485.3). Total num frames: 2359689216. Throughput: 0: 49503.6. Samples: 1888557400. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 02:58:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:58:49,058][71000] Updated weights for policy 0, policy_version 144034 (0.0021) [2024-06-13 02:58:50,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49151.9, 300 sec: 49540.8). Total num frames: 2359934976. Throughput: 0: 49136.7. Samples: 1888703260. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 02:58:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:58:52,291][71000] Updated weights for policy 0, policy_version 144044 (0.0025) [2024-06-13 02:58:55,466][71000] Updated weights for policy 0, policy_version 144054 (0.0028) [2024-06-13 02:58:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 2360180736. Throughput: 0: 49233.7. Samples: 1889006740. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 02:58:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 02:58:58,747][71000] Updated weights for policy 0, policy_version 144064 (0.0033) [2024-06-13 02:59:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 2360426496. Throughput: 0: 49351.4. Samples: 1889302840. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 02:59:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 02:59:02,633][71000] Updated weights for policy 0, policy_version 144074 (0.0036) [2024-06-13 02:59:05,290][71000] Updated weights for policy 0, policy_version 144084 (0.0027) [2024-06-13 02:59:05,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49151.8, 300 sec: 49540.7). Total num frames: 2360672256. Throughput: 0: 49350.4. Samples: 1889449540. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 02:59:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 02:59:09,013][71000] Updated weights for policy 0, policy_version 144094 (0.0022) [2024-06-13 02:59:10,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 2360934400. Throughput: 0: 49446.1. Samples: 1889749760. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 02:59:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:59:12,102][71000] Updated weights for policy 0, policy_version 144104 (0.0032) [2024-06-13 02:59:15,471][71000] Updated weights for policy 0, policy_version 144114 (0.0030) [2024-06-13 02:59:15,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49424.9, 300 sec: 49540.8). Total num frames: 2361180160. Throughput: 0: 49307.8. Samples: 1890048000. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 02:59:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:59:18,630][71000] Updated weights for policy 0, policy_version 144124 (0.0032) [2024-06-13 02:59:20,942][70768] Fps is (10 sec: 49140.3, 60 sec: 49696.2, 300 sec: 49595.9). Total num frames: 2361425920. Throughput: 0: 49369.3. Samples: 1890192520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 02:59:20,942][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 02:59:22,277][71000] Updated weights for policy 0, policy_version 144134 (0.0027) [2024-06-13 02:59:25,199][71000] Updated weights for policy 0, policy_version 144144 (0.0030) [2024-06-13 02:59:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49424.9, 300 sec: 49540.8). Total num frames: 2361671680. Throughput: 0: 49512.4. Samples: 1890493020. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 02:59:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 02:59:28,829][71000] Updated weights for policy 0, policy_version 144154 (0.0039) [2024-06-13 02:59:30,940][70768] Fps is (10 sec: 45885.7, 60 sec: 48878.8, 300 sec: 49374.2). Total num frames: 2361884672. Throughput: 0: 49293.7. Samples: 1890775620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 02:59:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 02:59:32,130][71000] Updated weights for policy 0, policy_version 144164 (0.0027) [2024-06-13 02:59:35,515][71000] Updated weights for policy 0, policy_version 144174 (0.0027) [2024-06-13 02:59:35,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.0, 300 sec: 49485.3). Total num frames: 2362163200. Throughput: 0: 49518.8. Samples: 1890931600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 02:59:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 02:59:38,928][71000] Updated weights for policy 0, policy_version 144184 (0.0026) [2024-06-13 02:59:39,979][70980] Signal inference workers to stop experience collection... (28050 times) [2024-06-13 02:59:39,980][70980] Signal inference workers to resume experience collection... (28050 times) [2024-06-13 02:59:40,021][71000] InferenceWorker_p0-w0: stopping experience collection (28050 times) [2024-06-13 02:59:40,021][71000] InferenceWorker_p0-w0: resuming experience collection (28050 times) [2024-06-13 02:59:40,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2362392576. Throughput: 0: 49144.9. Samples: 1891218260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 02:59:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 02:59:41,007][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000144190_2362408960.pth... 
[2024-06-13 02:59:41,050][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000143465_2350530560.pth [2024-06-13 02:59:42,242][71000] Updated weights for policy 0, policy_version 144194 (0.0039) [2024-06-13 02:59:45,355][71000] Updated weights for policy 0, policy_version 144204 (0.0029) [2024-06-13 02:59:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2362638336. Throughput: 0: 49069.0. Samples: 1891510940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 02:59:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:59:48,751][71000] Updated weights for policy 0, policy_version 144214 (0.0025) [2024-06-13 02:59:50,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 2362884096. Throughput: 0: 49245.5. Samples: 1891665580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 02:59:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:59:51,999][71000] Updated weights for policy 0, policy_version 144224 (0.0026) [2024-06-13 02:59:55,240][71000] Updated weights for policy 0, policy_version 144234 (0.0033) [2024-06-13 02:59:55,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 2363162624. Throughput: 0: 49358.7. Samples: 1891970900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 02:59:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 02:59:58,467][71000] Updated weights for policy 0, policy_version 144244 (0.0024) [2024-06-13 03:00:00,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2363392000. Throughput: 0: 49406.7. Samples: 1892271300. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 03:00:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:00:01,657][71000] Updated weights for policy 0, policy_version 144254 (0.0023) [2024-06-13 03:00:05,198][71000] Updated weights for policy 0, policy_version 144264 (0.0030) [2024-06-13 03:00:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 2363637760. Throughput: 0: 49579.1. Samples: 1892423460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 03:00:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 03:00:08,395][71000] Updated weights for policy 0, policy_version 144274 (0.0031) [2024-06-13 03:00:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 2363899904. Throughput: 0: 49395.6. Samples: 1892715820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 03:00:10,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 03:00:11,851][71000] Updated weights for policy 0, policy_version 144284 (0.0034) [2024-06-13 03:00:14,886][71000] Updated weights for policy 0, policy_version 144294 (0.0026) [2024-06-13 03:00:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 2364145664. Throughput: 0: 49555.2. Samples: 1893005600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 03:00:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:00:18,531][71000] Updated weights for policy 0, policy_version 144304 (0.0021) [2024-06-13 03:00:20,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49153.9, 300 sec: 49429.7). Total num frames: 2364375040. Throughput: 0: 49424.4. Samples: 1893155700. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:00:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:00:21,694][71000] Updated weights for policy 0, policy_version 144314 (0.0025) [2024-06-13 03:00:25,127][71000] Updated weights for policy 0, policy_version 144324 (0.0038) [2024-06-13 03:00:25,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2364637184. Throughput: 0: 49718.9. Samples: 1893455620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:00:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:00:28,296][71000] Updated weights for policy 0, policy_version 144334 (0.0035) [2024-06-13 03:00:30,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49698.1, 300 sec: 49485.5). Total num frames: 2364866560. Throughput: 0: 49864.7. Samples: 1893754860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:00:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:00:31,711][71000] Updated weights for policy 0, policy_version 144344 (0.0031) [2024-06-13 03:00:35,016][71000] Updated weights for policy 0, policy_version 144354 (0.0028) [2024-06-13 03:00:35,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2365128704. Throughput: 0: 49639.1. Samples: 1893899340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:00:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:00:38,602][71000] Updated weights for policy 0, policy_version 144364 (0.0038) [2024-06-13 03:00:40,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2365358080. Throughput: 0: 49205.7. Samples: 1894185160. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:00:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:00:41,940][71000] Updated weights for policy 0, policy_version 144374 (0.0028) [2024-06-13 03:00:44,720][70980] Signal inference workers to stop experience collection... (28100 times) [2024-06-13 03:00:44,721][70980] Signal inference workers to resume experience collection... (28100 times) [2024-06-13 03:00:44,773][71000] InferenceWorker_p0-w0: stopping experience collection (28100 times) [2024-06-13 03:00:44,773][71000] InferenceWorker_p0-w0: resuming experience collection (28100 times) [2024-06-13 03:00:45,314][71000] Updated weights for policy 0, policy_version 144384 (0.0030) [2024-06-13 03:00:45,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2365603840. Throughput: 0: 49065.5. Samples: 1894479240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:00:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:00:48,470][71000] Updated weights for policy 0, policy_version 144394 (0.0029) [2024-06-13 03:00:50,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.2, 300 sec: 49485.7). Total num frames: 2365849600. Throughput: 0: 48934.8. Samples: 1894625520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:00:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:00:51,850][71000] Updated weights for policy 0, policy_version 144404 (0.0029) [2024-06-13 03:00:54,770][71000] Updated weights for policy 0, policy_version 144414 (0.0028) [2024-06-13 03:00:55,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2366111744. Throughput: 0: 49063.5. Samples: 1894923680. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:00:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:00:58,541][71000] Updated weights for policy 0, policy_version 144424 (0.0026) [2024-06-13 03:01:00,940][70768] Fps is (10 sec: 50789.0, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2366357504. Throughput: 0: 49428.3. Samples: 1895229880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:01:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:01:01,752][71000] Updated weights for policy 0, policy_version 144434 (0.0030) [2024-06-13 03:01:05,047][71000] Updated weights for policy 0, policy_version 144444 (0.0025) [2024-06-13 03:01:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2366603264. Throughput: 0: 49315.0. Samples: 1895374880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:01:05,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 03:01:08,076][71000] Updated weights for policy 0, policy_version 144454 (0.0027) [2024-06-13 03:01:10,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 2366832640. Throughput: 0: 49351.7. Samples: 1895676440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:01:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:01:11,765][71000] Updated weights for policy 0, policy_version 144464 (0.0027) [2024-06-13 03:01:14,846][71000] Updated weights for policy 0, policy_version 144474 (0.0031) [2024-06-13 03:01:15,944][70768] Fps is (10 sec: 49132.9, 60 sec: 49148.7, 300 sec: 49484.6). Total num frames: 2367094784. Throughput: 0: 49118.1. Samples: 1895965360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:01:15,944][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:01:18,206][71000] Updated weights for policy 0, policy_version 144484 (0.0022) [2024-06-13 03:01:20,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2367356928. Throughput: 0: 49224.4. Samples: 1896114440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:01:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 03:01:21,417][71000] Updated weights for policy 0, policy_version 144494 (0.0023) [2024-06-13 03:01:25,094][71000] Updated weights for policy 0, policy_version 144504 (0.0022) [2024-06-13 03:01:25,940][70768] Fps is (10 sec: 50810.2, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2367602688. Throughput: 0: 49662.1. Samples: 1896419960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:01:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:01:28,021][71000] Updated weights for policy 0, policy_version 144514 (0.0036) [2024-06-13 03:01:30,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 2367832064. Throughput: 0: 49756.4. Samples: 1896718280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:01:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:01:31,509][71000] Updated weights for policy 0, policy_version 144524 (0.0032) [2024-06-13 03:01:34,358][71000] Updated weights for policy 0, policy_version 144534 (0.0031) [2024-06-13 03:01:35,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2368094208. Throughput: 0: 49721.7. Samples: 1896863000. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:01:35,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 03:01:38,333][71000] Updated weights for policy 0, policy_version 144544 (0.0034) [2024-06-13 03:01:40,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2368339968. Throughput: 0: 49870.2. Samples: 1897167840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:01:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:01:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000144552_2368339968.pth... [2024-06-13 03:01:41,026][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000143829_2356494336.pth [2024-06-13 03:01:41,289][71000] Updated weights for policy 0, policy_version 144554 (0.0023) [2024-06-13 03:01:44,920][71000] Updated weights for policy 0, policy_version 144564 (0.0032) [2024-06-13 03:01:45,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2368569344. Throughput: 0: 49360.6. Samples: 1897451100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:01:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 03:01:48,286][71000] Updated weights for policy 0, policy_version 144574 (0.0027) [2024-06-13 03:01:50,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2368815104. Throughput: 0: 49403.2. Samples: 1897598020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:01:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:01:51,831][71000] Updated weights for policy 0, policy_version 144584 (0.0033) [2024-06-13 03:01:54,686][71000] Updated weights for policy 0, policy_version 144594 (0.0030) [2024-06-13 03:01:55,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 2369077248. Throughput: 0: 49112.0. Samples: 1897886480. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:01:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:01:58,519][71000] Updated weights for policy 0, policy_version 144604 (0.0028) [2024-06-13 03:02:00,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2369323008. Throughput: 0: 49497.3. Samples: 1898192540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:02:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:02:01,427][71000] Updated weights for policy 0, policy_version 144614 (0.0023) [2024-06-13 03:02:05,173][71000] Updated weights for policy 0, policy_version 144624 (0.0027) [2024-06-13 03:02:05,677][70980] Signal inference workers to stop experience collection... (28150 times) [2024-06-13 03:02:05,678][70980] Signal inference workers to resume experience collection... (28150 times) [2024-06-13 03:02:05,712][71000] InferenceWorker_p0-w0: stopping experience collection (28150 times) [2024-06-13 03:02:05,712][71000] InferenceWorker_p0-w0: resuming experience collection (28150 times) [2024-06-13 03:02:05,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2369552384. Throughput: 0: 49288.1. Samples: 1898332400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:02:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 03:02:08,113][71000] Updated weights for policy 0, policy_version 144634 (0.0028) [2024-06-13 03:02:10,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2369814528. Throughput: 0: 49271.3. Samples: 1898637160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:02:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:02:11,936][71000] Updated weights for policy 0, policy_version 144644 (0.0031) [2024-06-13 03:02:14,776][71000] Updated weights for policy 0, policy_version 144654 (0.0034) [2024-06-13 03:02:15,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49428.4, 300 sec: 49429.7). Total num frames: 2370060288. Throughput: 0: 49119.6. Samples: 1898928660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:02:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 03:02:18,582][71000] Updated weights for policy 0, policy_version 144664 (0.0033) [2024-06-13 03:02:20,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2370306048. Throughput: 0: 49464.2. Samples: 1899088900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:02:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:02:21,534][71000] Updated weights for policy 0, policy_version 144674 (0.0027) [2024-06-13 03:02:25,155][71000] Updated weights for policy 0, policy_version 144684 (0.0027) [2024-06-13 03:02:25,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 2370568192. Throughput: 0: 49118.8. Samples: 1899378180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:02:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:02:27,734][71000] Updated weights for policy 0, policy_version 144694 (0.0026) [2024-06-13 03:02:30,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 2370797568. Throughput: 0: 49273.7. Samples: 1899668420. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 03:02:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:02:31,636][71000] Updated weights for policy 0, policy_version 144704 (0.0031) [2024-06-13 03:02:34,696][71000] Updated weights for policy 0, policy_version 144714 (0.0036) [2024-06-13 03:02:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2371059712. Throughput: 0: 49396.9. Samples: 1899820880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 03:02:35,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 03:02:38,453][71000] Updated weights for policy 0, policy_version 144724 (0.0027) [2024-06-13 03:02:40,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.2, 300 sec: 49374.2). Total num frames: 2371289088. Throughput: 0: 49465.4. Samples: 1900112420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 03:02:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:02:41,543][71000] Updated weights for policy 0, policy_version 144734 (0.0025) [2024-06-13 03:02:45,041][71000] Updated weights for policy 0, policy_version 144744 (0.0029) [2024-06-13 03:02:45,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2371534848. Throughput: 0: 49317.4. Samples: 1900411820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 03:02:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:02:48,251][71000] Updated weights for policy 0, policy_version 144754 (0.0030) [2024-06-13 03:02:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2371780608. Throughput: 0: 49434.1. Samples: 1900556940. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 03:02:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 03:02:51,970][71000] Updated weights for policy 0, policy_version 144764 (0.0027) [2024-06-13 03:02:54,858][71000] Updated weights for policy 0, policy_version 144774 (0.0023) [2024-06-13 03:02:55,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2372042752. Throughput: 0: 49210.6. Samples: 1900851640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 03:02:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:02:58,414][71000] Updated weights for policy 0, policy_version 144784 (0.0030) [2024-06-13 03:03:00,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2372255744. Throughput: 0: 49124.0. Samples: 1901139240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 03:03:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:03:01,480][71000] Updated weights for policy 0, policy_version 144794 (0.0026) [2024-06-13 03:03:05,098][71000] Updated weights for policy 0, policy_version 144804 (0.0032) [2024-06-13 03:03:05,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2372501504. Throughput: 0: 48735.7. Samples: 1901282000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 03:03:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:03:08,311][70980] Signal inference workers to stop experience collection... (28200 times) [2024-06-13 03:03:08,313][70980] Signal inference workers to resume experience collection... 
(28200 times) [2024-06-13 03:03:08,358][71000] InferenceWorker_p0-w0: stopping experience collection (28200 times) [2024-06-13 03:03:08,358][71000] InferenceWorker_p0-w0: resuming experience collection (28200 times) [2024-06-13 03:03:08,445][71000] Updated weights for policy 0, policy_version 144814 (0.0025) [2024-06-13 03:03:10,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49151.8, 300 sec: 49318.6). Total num frames: 2372763648. Throughput: 0: 48700.3. Samples: 1901569700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 03:03:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 03:03:11,886][71000] Updated weights for policy 0, policy_version 144824 (0.0020) [2024-06-13 03:03:14,859][71000] Updated weights for policy 0, policy_version 144834 (0.0026) [2024-06-13 03:03:15,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2373009408. Throughput: 0: 48942.4. Samples: 1901870820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 03:03:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:03:18,516][71000] Updated weights for policy 0, policy_version 144844 (0.0031) [2024-06-13 03:03:20,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48879.1, 300 sec: 49263.1). Total num frames: 2373238784. Throughput: 0: 48850.2. Samples: 1902019140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 03:03:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:03:21,613][71000] Updated weights for policy 0, policy_version 144854 (0.0035) [2024-06-13 03:03:25,190][71000] Updated weights for policy 0, policy_version 144864 (0.0029) [2024-06-13 03:03:25,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48605.8, 300 sec: 49263.1). Total num frames: 2373484544. Throughput: 0: 49012.7. Samples: 1902318000. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 03:03:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 03:03:28,162][71000] Updated weights for policy 0, policy_version 144874 (0.0021) [2024-06-13 03:03:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2373746688. Throughput: 0: 48860.9. Samples: 1902610560. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-13 03:03:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:03:31,658][71000] Updated weights for policy 0, policy_version 144884 (0.0023) [2024-06-13 03:03:34,646][71000] Updated weights for policy 0, policy_version 144894 (0.0026) [2024-06-13 03:03:35,944][70768] Fps is (10 sec: 50769.2, 60 sec: 48875.5, 300 sec: 49317.9). Total num frames: 2373992448. Throughput: 0: 49104.7. Samples: 1902766860. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-13 03:03:35,944][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:03:38,263][71000] Updated weights for policy 0, policy_version 144904 (0.0031) [2024-06-13 03:03:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2374238208. Throughput: 0: 49093.7. Samples: 1903060860. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-13 03:03:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:03:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000144912_2374238208.pth... [2024-06-13 03:03:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000144190_2362408960.pth [2024-06-13 03:03:41,482][71000] Updated weights for policy 0, policy_version 144914 (0.0032) [2024-06-13 03:03:45,265][71000] Updated weights for policy 0, policy_version 144924 (0.0024) [2024-06-13 03:03:45,940][70768] Fps is (10 sec: 49172.9, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2374483968. Throughput: 0: 49116.8. Samples: 1903349500. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-13 03:03:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 03:03:48,207][71000] Updated weights for policy 0, policy_version 144934 (0.0030) [2024-06-13 03:03:50,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2374713344. Throughput: 0: 49170.8. Samples: 1903494680. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-13 03:03:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:03:51,676][71000] Updated weights for policy 0, policy_version 144944 (0.0034) [2024-06-13 03:03:54,961][71000] Updated weights for policy 0, policy_version 144954 (0.0038) [2024-06-13 03:03:55,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.8, 300 sec: 49318.6). Total num frames: 2374975488. Throughput: 0: 49385.3. Samples: 1903792040. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-13 03:03:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:03:58,272][71000] Updated weights for policy 0, policy_version 144964 (0.0027) [2024-06-13 03:04:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2375204864. Throughput: 0: 49416.4. Samples: 1904094560. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-13 03:04:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:04:01,378][71000] Updated weights for policy 0, policy_version 144974 (0.0023) [2024-06-13 03:04:05,095][71000] Updated weights for policy 0, policy_version 144984 (0.0032) [2024-06-13 03:04:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2375467008. Throughput: 0: 49178.5. Samples: 1904232180. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-13 03:04:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:04:07,832][70980] Signal inference workers to stop experience collection... 
(28250 times) [2024-06-13 03:04:07,877][71000] InferenceWorker_p0-w0: stopping experience collection (28250 times) [2024-06-13 03:04:07,881][70980] Signal inference workers to resume experience collection... (28250 times) [2024-06-13 03:04:07,888][71000] InferenceWorker_p0-w0: resuming experience collection (28250 times) [2024-06-13 03:04:08,015][71000] Updated weights for policy 0, policy_version 144994 (0.0030) [2024-06-13 03:04:10,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2375712768. Throughput: 0: 49065.8. Samples: 1904525960. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-13 03:04:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:04:11,430][71000] Updated weights for policy 0, policy_version 145004 (0.0031) [2024-06-13 03:04:14,671][71000] Updated weights for policy 0, policy_version 145014 (0.0023) [2024-06-13 03:04:15,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49425.0, 300 sec: 49319.0). Total num frames: 2375974912. Throughput: 0: 49165.3. Samples: 1904823000. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-13 03:04:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:04:18,159][71000] Updated weights for policy 0, policy_version 145024 (0.0024) [2024-06-13 03:04:20,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2376187904. Throughput: 0: 49092.7. Samples: 1904975820. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-13 03:04:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:04:21,261][71000] Updated weights for policy 0, policy_version 145034 (0.0026) [2024-06-13 03:04:24,850][71000] Updated weights for policy 0, policy_version 145044 (0.0030) [2024-06-13 03:04:25,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2376433664. Throughput: 0: 49001.7. Samples: 1905265940. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-13 03:04:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:04:28,227][71000] Updated weights for policy 0, policy_version 145054 (0.0033) [2024-06-13 03:04:30,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2376663040. Throughput: 0: 49045.4. Samples: 1905556540. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-13 03:04:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:04:31,459][71000] Updated weights for policy 0, policy_version 145064 (0.0033) [2024-06-13 03:04:34,686][71000] Updated weights for policy 0, policy_version 145074 (0.0029) [2024-06-13 03:04:35,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49155.5, 300 sec: 49318.6). Total num frames: 2376941568. Throughput: 0: 49225.7. Samples: 1905709840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 03:04:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:04:38,133][71000] Updated weights for policy 0, policy_version 145084 (0.0025) [2024-06-13 03:04:40,939][70768] Fps is (10 sec: 52428.6, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2377187328. Throughput: 0: 49130.9. Samples: 1906002920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 03:04:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:04:41,205][71000] Updated weights for policy 0, policy_version 145094 (0.0024) [2024-06-13 03:04:44,796][71000] Updated weights for policy 0, policy_version 145104 (0.0031) [2024-06-13 03:04:45,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2377433088. Throughput: 0: 49016.1. Samples: 1906300280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 03:04:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:04:47,907][71000] Updated weights for policy 0, policy_version 145114 (0.0030) [2024-06-13 03:04:50,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2377678848. Throughput: 0: 49260.2. Samples: 1906448880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 03:04:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:04:51,173][71000] Updated weights for policy 0, policy_version 145124 (0.0024) [2024-06-13 03:04:54,624][71000] Updated weights for policy 0, policy_version 145134 (0.0032) [2024-06-13 03:04:55,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2377924608. Throughput: 0: 49531.6. Samples: 1906754880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 03:04:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:04:58,053][71000] Updated weights for policy 0, policy_version 145144 (0.0036) [2024-06-13 03:05:00,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 2378186752. Throughput: 0: 49429.3. Samples: 1907047320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 03:05:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:05:01,059][71000] Updated weights for policy 0, policy_version 145154 (0.0023) [2024-06-13 03:05:04,570][71000] Updated weights for policy 0, policy_version 145164 (0.0025) [2024-06-13 03:05:05,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.2, 300 sec: 49207.6). Total num frames: 2378416128. Throughput: 0: 49426.7. Samples: 1907200020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 03:05:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:05:07,825][71000] Updated weights for policy 0, policy_version 145174 (0.0028) [2024-06-13 03:05:10,621][70980] Signal inference workers to stop experience collection... 
(28300 times) [2024-06-13 03:05:10,664][71000] InferenceWorker_p0-w0: stopping experience collection (28300 times) [2024-06-13 03:05:10,731][70980] Signal inference workers to resume experience collection... (28300 times) [2024-06-13 03:05:10,732][71000] InferenceWorker_p0-w0: resuming experience collection (28300 times) [2024-06-13 03:05:10,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 2378678272. Throughput: 0: 49575.7. Samples: 1907496840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 03:05:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:05:11,019][71000] Updated weights for policy 0, policy_version 145184 (0.0029) [2024-06-13 03:05:14,271][71000] Updated weights for policy 0, policy_version 145194 (0.0029) [2024-06-13 03:05:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2378907648. Throughput: 0: 49530.5. Samples: 1907785420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 03:05:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:05:18,127][71000] Updated weights for policy 0, policy_version 145204 (0.0031) [2024-06-13 03:05:20,939][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49207.6). Total num frames: 2379153408. Throughput: 0: 49337.8. Samples: 1907930040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 03:05:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:05:21,190][71000] Updated weights for policy 0, policy_version 145214 (0.0037) [2024-06-13 03:05:24,667][71000] Updated weights for policy 0, policy_version 145224 (0.0024) [2024-06-13 03:05:25,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49698.3, 300 sec: 49318.7). Total num frames: 2379415552. Throughput: 0: 49620.9. Samples: 1908235860. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 03:05:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:05:27,822][71000] Updated weights for policy 0, policy_version 145234 (0.0026) [2024-06-13 03:05:30,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 2379661312. Throughput: 0: 49711.6. Samples: 1908537300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 03:05:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:05:30,969][71000] Updated weights for policy 0, policy_version 145244 (0.0025) [2024-06-13 03:05:34,048][71000] Updated weights for policy 0, policy_version 145254 (0.0027) [2024-06-13 03:05:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2379907072. Throughput: 0: 49623.5. Samples: 1908681940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 03:05:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:05:37,553][71000] Updated weights for policy 0, policy_version 145264 (0.0034) [2024-06-13 03:05:40,795][71000] Updated weights for policy 0, policy_version 145274 (0.0034) [2024-06-13 03:05:40,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 2380169216. Throughput: 0: 49395.9. Samples: 1908977700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 03:05:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:05:40,954][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000145274_2380169216.pth... [2024-06-13 03:05:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000144552_2368339968.pth [2024-06-13 03:05:44,431][71000] Updated weights for policy 0, policy_version 145284 (0.0024) [2024-06-13 03:05:45,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49971.0, 300 sec: 49429.6). Total num frames: 2380431360. Throughput: 0: 49450.5. Samples: 1909272600. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 03:05:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:05:47,884][71000] Updated weights for policy 0, policy_version 145294 (0.0027) [2024-06-13 03:05:50,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2380644352. Throughput: 0: 49382.6. Samples: 1909422240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 03:05:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:05:51,106][71000] Updated weights for policy 0, policy_version 145304 (0.0029) [2024-06-13 03:05:54,317][71000] Updated weights for policy 0, policy_version 145314 (0.0031) [2024-06-13 03:05:55,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2380890112. Throughput: 0: 49369.2. Samples: 1909718460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 03:05:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:05:57,563][71000] Updated weights for policy 0, policy_version 145324 (0.0022) [2024-06-13 03:06:00,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2381135872. Throughput: 0: 49594.4. Samples: 1910017160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 03:06:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:06:00,946][71000] Updated weights for policy 0, policy_version 145334 (0.0029) [2024-06-13 03:06:04,411][71000] Updated weights for policy 0, policy_version 145344 (0.0033) [2024-06-13 03:06:05,940][70768] Fps is (10 sec: 54067.2, 60 sec: 50244.2, 300 sec: 49485.2). Total num frames: 2381430784. Throughput: 0: 49682.6. Samples: 1910165760. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 03:06:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:06:07,937][71000] Updated weights for policy 0, policy_version 145354 (0.0034) [2024-06-13 03:06:10,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49151.9, 300 sec: 49263.7). Total num frames: 2381627392. Throughput: 0: 49476.3. Samples: 1910462300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 03:06:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:06:10,975][71000] Updated weights for policy 0, policy_version 145364 (0.0027) [2024-06-13 03:06:14,311][71000] Updated weights for policy 0, policy_version 145374 (0.0032) [2024-06-13 03:06:15,941][70768] Fps is (10 sec: 44232.3, 60 sec: 49424.2, 300 sec: 49207.4). Total num frames: 2381873152. Throughput: 0: 49440.6. Samples: 1910762180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 03:06:15,941][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:06:17,622][71000] Updated weights for policy 0, policy_version 145384 (0.0025) [2024-06-13 03:06:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49207.6). Total num frames: 2382118912. Throughput: 0: 49306.2. Samples: 1910900720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 03:06:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:06:21,361][71000] Updated weights for policy 0, policy_version 145394 (0.0034) [2024-06-13 03:06:24,319][71000] Updated weights for policy 0, policy_version 145404 (0.0030) [2024-06-13 03:06:25,940][70768] Fps is (10 sec: 52434.2, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 2382397440. Throughput: 0: 49337.0. Samples: 1911197860. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 03:06:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:06:28,105][71000] Updated weights for policy 0, policy_version 145414 (0.0028) [2024-06-13 03:06:30,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2382610432. Throughput: 0: 49489.1. Samples: 1911499600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 03:06:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:06:30,986][71000] Updated weights for policy 0, policy_version 145424 (0.0030) [2024-06-13 03:06:31,518][70980] Signal inference workers to stop experience collection... (28350 times) [2024-06-13 03:06:31,521][70980] Signal inference workers to resume experience collection... (28350 times) [2024-06-13 03:06:31,530][71000] InferenceWorker_p0-w0: stopping experience collection (28350 times) [2024-06-13 03:06:31,553][71000] InferenceWorker_p0-w0: resuming experience collection (28350 times) [2024-06-13 03:06:34,558][71000] Updated weights for policy 0, policy_version 145434 (0.0032) [2024-06-13 03:06:35,939][70768] Fps is (10 sec: 45875.5, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2382856192. Throughput: 0: 49100.1. Samples: 1911631740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 03:06:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:06:37,487][71000] Updated weights for policy 0, policy_version 145444 (0.0035) [2024-06-13 03:06:40,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.1, 300 sec: 49263.1). Total num frames: 2383101952. Throughput: 0: 49295.2. Samples: 1911936740. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 03:06:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:06:41,054][71000] Updated weights for policy 0, policy_version 145454 (0.0024) [2024-06-13 03:06:44,182][71000] Updated weights for policy 0, policy_version 145464 (0.0024) [2024-06-13 03:06:45,941][70768] Fps is (10 sec: 52419.6, 60 sec: 49150.7, 300 sec: 49373.9). Total num frames: 2383380480. Throughput: 0: 49030.0. Samples: 1912223600. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 03:06:45,942][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:06:47,902][71000] Updated weights for policy 0, policy_version 145474 (0.0031) [2024-06-13 03:06:50,887][71000] Updated weights for policy 0, policy_version 145484 (0.0026) [2024-06-13 03:06:50,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49424.9, 300 sec: 49263.1). Total num frames: 2383609856. Throughput: 0: 49273.7. Samples: 1912383080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 03:06:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:06:54,681][71000] Updated weights for policy 0, policy_version 145494 (0.0025) [2024-06-13 03:06:55,940][70768] Fps is (10 sec: 44244.3, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2383822848. Throughput: 0: 49021.8. Samples: 1912668280. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 03:06:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:06:57,600][71000] Updated weights for policy 0, policy_version 145504 (0.0029) [2024-06-13 03:07:00,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2384084992. Throughput: 0: 48839.8. Samples: 1912959920. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 03:07:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:07:01,132][71000] Updated weights for policy 0, policy_version 145514 (0.0039) [2024-06-13 03:07:04,582][71000] Updated weights for policy 0, policy_version 145524 (0.0036) [2024-06-13 03:07:05,940][70768] Fps is (10 sec: 52429.1, 60 sec: 48605.9, 300 sec: 49263.1). Total num frames: 2384347136. Throughput: 0: 48999.6. Samples: 1913105700. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 03:07:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:07:07,978][71000] Updated weights for policy 0, policy_version 145534 (0.0028) [2024-06-13 03:07:10,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2384576512. Throughput: 0: 49108.1. Samples: 1913407720. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 03:07:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:07:10,973][71000] Updated weights for policy 0, policy_version 145544 (0.0025) [2024-06-13 03:07:14,360][71000] Updated weights for policy 0, policy_version 145554 (0.0032) [2024-06-13 03:07:15,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48879.8, 300 sec: 49152.0). Total num frames: 2384805888. Throughput: 0: 48904.1. Samples: 1913700280. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 03:07:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:07:17,809][71000] Updated weights for policy 0, policy_version 145564 (0.0033) [2024-06-13 03:07:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2385051648. Throughput: 0: 49013.2. Samples: 1913837340. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 03:07:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:07:21,090][71000] Updated weights for policy 0, policy_version 145574 (0.0025) [2024-06-13 03:07:24,414][71000] Updated weights for policy 0, policy_version 145584 (0.0022) [2024-06-13 03:07:25,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2385346560. Throughput: 0: 49018.2. Samples: 1914142560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 03:07:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:07:27,731][71000] Updated weights for policy 0, policy_version 145594 (0.0024) [2024-06-13 03:07:30,925][71000] Updated weights for policy 0, policy_version 145604 (0.0034) [2024-06-13 03:07:30,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2385575936. Throughput: 0: 49105.8. Samples: 1914433280. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 03:07:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:07:34,540][71000] Updated weights for policy 0, policy_version 145614 (0.0028) [2024-06-13 03:07:35,940][70768] Fps is (10 sec: 44236.8, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2385788928. Throughput: 0: 48862.4. Samples: 1914581880. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 03:07:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:07:36,720][70980] Signal inference workers to stop experience collection... (28400 times) [2024-06-13 03:07:36,720][70980] Signal inference workers to resume experience collection... 
(28400 times) [2024-06-13 03:07:36,732][71000] InferenceWorker_p0-w0: stopping experience collection (28400 times) [2024-06-13 03:07:36,732][71000] InferenceWorker_p0-w0: resuming experience collection (28400 times) [2024-06-13 03:07:37,453][71000] Updated weights for policy 0, policy_version 145624 (0.0024) [2024-06-13 03:07:40,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2386051072. Throughput: 0: 49249.8. Samples: 1914884520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 03:07:40,948][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:07:40,962][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000145633_2386051072.pth... [2024-06-13 03:07:41,012][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000144912_2374238208.pth [2024-06-13 03:07:41,149][71000] Updated weights for policy 0, policy_version 145634 (0.0026) [2024-06-13 03:07:44,201][71000] Updated weights for policy 0, policy_version 145644 (0.0031) [2024-06-13 03:07:45,940][70768] Fps is (10 sec: 54066.6, 60 sec: 49153.3, 300 sec: 49318.6). Total num frames: 2386329600. Throughput: 0: 49155.0. Samples: 1915171900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 03:07:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:07:47,524][71000] Updated weights for policy 0, policy_version 145654 (0.0032) [2024-06-13 03:07:50,781][71000] Updated weights for policy 0, policy_version 145664 (0.0022) [2024-06-13 03:07:50,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2386558976. Throughput: 0: 49436.4. Samples: 1915330340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 03:07:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:07:54,200][71000] Updated weights for policy 0, policy_version 145674 (0.0032) [2024-06-13 03:07:55,940][70768] Fps is (10 sec: 44237.4, 60 sec: 49152.0, 300 sec: 49207.5). 
Total num frames: 2386771968. Throughput: 0: 49120.0. Samples: 1915618120. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 03:07:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:07:57,457][71000] Updated weights for policy 0, policy_version 145684 (0.0028) [2024-06-13 03:08:00,674][71000] Updated weights for policy 0, policy_version 145694 (0.0030) [2024-06-13 03:08:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2387050496. Throughput: 0: 49342.0. Samples: 1915920680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 03:08:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:08:04,050][71000] Updated weights for policy 0, policy_version 145704 (0.0023) [2024-06-13 03:08:05,940][70768] Fps is (10 sec: 54066.8, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2387312640. Throughput: 0: 49707.1. Samples: 1916074160. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 03:08:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:08:07,470][71000] Updated weights for policy 0, policy_version 145714 (0.0041) [2024-06-13 03:08:10,658][71000] Updated weights for policy 0, policy_version 145724 (0.0029) [2024-06-13 03:08:10,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 2387558400. Throughput: 0: 49514.6. Samples: 1916370720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 03:08:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:08:13,779][71000] Updated weights for policy 0, policy_version 145734 (0.0032) [2024-06-13 03:08:15,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2387771392. Throughput: 0: 49527.3. Samples: 1916662000. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 03:08:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:08:17,591][71000] Updated weights for policy 0, policy_version 145744 (0.0024) [2024-06-13 03:08:20,853][71000] Updated weights for policy 0, policy_version 145754 (0.0021) [2024-06-13 03:08:20,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 2388033536. Throughput: 0: 49147.6. Samples: 1916793520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 03:08:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:08:24,102][71000] Updated weights for policy 0, policy_version 145764 (0.0028) [2024-06-13 03:08:25,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2388295680. Throughput: 0: 49117.8. Samples: 1917094820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 03:08:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:08:27,366][71000] Updated weights for policy 0, policy_version 145774 (0.0022) [2024-06-13 03:08:30,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48879.0, 300 sec: 49208.2). Total num frames: 2388508672. Throughput: 0: 49241.0. Samples: 1917387740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 03:08:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:08:30,946][71000] Updated weights for policy 0, policy_version 145784 (0.0032) [2024-06-13 03:08:34,095][71000] Updated weights for policy 0, policy_version 145794 (0.0027) [2024-06-13 03:08:35,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49425.1, 300 sec: 49207.6). Total num frames: 2388754432. Throughput: 0: 48970.3. Samples: 1917534000. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 03:08:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:08:36,667][70980] Signal inference workers to stop experience collection... 
(28450 times) [2024-06-13 03:08:36,670][70980] Signal inference workers to resume experience collection... (28450 times) [2024-06-13 03:08:36,681][71000] InferenceWorker_p0-w0: stopping experience collection (28450 times) [2024-06-13 03:08:36,681][71000] InferenceWorker_p0-w0: resuming experience collection (28450 times) [2024-06-13 03:08:37,797][71000] Updated weights for policy 0, policy_version 145804 (0.0035) [2024-06-13 03:08:40,736][71000] Updated weights for policy 0, policy_version 145814 (0.0026) [2024-06-13 03:08:40,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2389016576. Throughput: 0: 49083.1. Samples: 1917826860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 03:08:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:08:44,203][71000] Updated weights for policy 0, policy_version 145824 (0.0027) [2024-06-13 03:08:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48606.0, 300 sec: 49263.1). Total num frames: 2389245952. Throughput: 0: 48798.4. Samples: 1918116600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-13 03:08:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:08:47,713][71000] Updated weights for policy 0, policy_version 145834 (0.0025) [2024-06-13 03:08:50,939][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 2389491712. Throughput: 0: 48741.0. Samples: 1918267500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 03:08:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:08:51,007][71000] Updated weights for policy 0, policy_version 145844 (0.0032) [2024-06-13 03:08:54,442][71000] Updated weights for policy 0, policy_version 145854 (0.0037) [2024-06-13 03:08:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2389721088. Throughput: 0: 48656.5. Samples: 1918560260. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 03:08:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:08:57,678][71000] Updated weights for policy 0, policy_version 145864 (0.0028) [2024-06-13 03:09:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48879.1, 300 sec: 49207.6). Total num frames: 2389983232. Throughput: 0: 48691.5. Samples: 1918853120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 03:09:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:09:00,948][71000] Updated weights for policy 0, policy_version 145874 (0.0026) [2024-06-13 03:09:04,294][71000] Updated weights for policy 0, policy_version 145884 (0.0028) [2024-06-13 03:09:05,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.8, 300 sec: 49207.5). Total num frames: 2390228992. Throughput: 0: 49075.8. Samples: 1919001940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 03:09:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 03:09:07,600][71000] Updated weights for policy 0, policy_version 145894 (0.0028) [2024-06-13 03:09:10,623][71000] Updated weights for policy 0, policy_version 145904 (0.0026) [2024-06-13 03:09:10,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2390491136. Throughput: 0: 49102.7. Samples: 1919304440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 03:09:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:09:14,413][71000] Updated weights for policy 0, policy_version 145914 (0.0031) [2024-06-13 03:09:15,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2390704128. Throughput: 0: 49011.2. Samples: 1919593240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 03:09:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:09:17,735][71000] Updated weights for policy 0, policy_version 145924 (0.0037) [2024-06-13 03:09:20,941][70768] Fps is (10 sec: 45870.0, 60 sec: 48604.9, 300 sec: 49207.4). Total num frames: 2390949888. Throughput: 0: 48866.8. Samples: 1919733060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 03:09:20,941][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:09:21,228][71000] Updated weights for policy 0, policy_version 145934 (0.0031) [2024-06-13 03:09:24,404][71000] Updated weights for policy 0, policy_version 145944 (0.0041) [2024-06-13 03:09:25,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48332.9, 300 sec: 49263.1). Total num frames: 2391195648. Throughput: 0: 48928.9. Samples: 1920028660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 03:09:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:09:27,925][71000] Updated weights for policy 0, policy_version 145954 (0.0032) [2024-06-13 03:09:30,739][71000] Updated weights for policy 0, policy_version 145964 (0.0036) [2024-06-13 03:09:30,940][70768] Fps is (10 sec: 52434.6, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2391474176. Throughput: 0: 49025.8. Samples: 1920322760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 03:09:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:09:34,438][71000] Updated weights for policy 0, policy_version 145974 (0.0025) [2024-06-13 03:09:35,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2391687168. Throughput: 0: 49100.8. Samples: 1920477040. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 03:09:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:09:37,847][71000] Updated weights for policy 0, policy_version 145984 (0.0031) [2024-06-13 03:09:40,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 2391932928. Throughput: 0: 48975.1. Samples: 1920764140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 03:09:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 03:09:40,960][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000145992_2391932928.pth... [2024-06-13 03:09:41,007][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000145274_2380169216.pth [2024-06-13 03:09:41,437][71000] Updated weights for policy 0, policy_version 145994 (0.0028) [2024-06-13 03:09:44,473][71000] Updated weights for policy 0, policy_version 146004 (0.0028) [2024-06-13 03:09:45,941][70768] Fps is (10 sec: 49145.3, 60 sec: 48877.8, 300 sec: 49151.8). Total num frames: 2392178688. Throughput: 0: 48911.9. Samples: 1921054220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 03:09:45,941][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:09:48,454][71000] Updated weights for policy 0, policy_version 146014 (0.0033) [2024-06-13 03:09:49,455][70980] Signal inference workers to stop experience collection... (28500 times) [2024-06-13 03:09:49,502][71000] InferenceWorker_p0-w0: stopping experience collection (28500 times) [2024-06-13 03:09:49,518][70980] Signal inference workers to resume experience collection... (28500 times) [2024-06-13 03:09:49,520][71000] InferenceWorker_p0-w0: resuming experience collection (28500 times) [2024-06-13 03:09:50,910][71000] Updated weights for policy 0, policy_version 146024 (0.0035) [2024-06-13 03:09:50,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2392457216. Throughput: 0: 48899.6. Samples: 1921202420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 03:09:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:09:54,898][71000] Updated weights for policy 0, policy_version 146034 (0.0031) [2024-06-13 03:09:55,940][70768] Fps is (10 sec: 50797.3, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2392686592. Throughput: 0: 48716.0. Samples: 1921496660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 03:09:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:09:57,799][71000] Updated weights for policy 0, policy_version 146044 (0.0026) [2024-06-13 03:10:00,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2392915968. Throughput: 0: 49037.3. Samples: 1921799920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 03:10:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:10:01,239][71000] Updated weights for policy 0, policy_version 146054 (0.0027) [2024-06-13 03:10:04,212][71000] Updated weights for policy 0, policy_version 146064 (0.0037) [2024-06-13 03:10:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48879.0, 300 sec: 49096.4). Total num frames: 2393161728. Throughput: 0: 49035.8. Samples: 1921939620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 03:10:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:10:08,334][71000] Updated weights for policy 0, policy_version 146074 (0.0024) [2024-06-13 03:10:10,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48878.9, 300 sec: 49207.6). Total num frames: 2393423872. Throughput: 0: 49056.0. Samples: 1922236180. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 03:10:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:10:11,001][71000] Updated weights for policy 0, policy_version 146084 (0.0023) [2024-06-13 03:10:14,835][71000] Updated weights for policy 0, policy_version 146094 (0.0024) [2024-06-13 03:10:15,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49698.0, 300 sec: 49263.1). Total num frames: 2393686016. Throughput: 0: 49168.7. Samples: 1922535360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 03:10:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:10:17,601][71000] Updated weights for policy 0, policy_version 146104 (0.0030) [2024-06-13 03:10:20,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.9, 300 sec: 49096.4). Total num frames: 2393899008. Throughput: 0: 49015.1. Samples: 1922682720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 03:10:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 03:10:21,370][71000] Updated weights for policy 0, policy_version 146114 (0.0023) [2024-06-13 03:10:24,185][71000] Updated weights for policy 0, policy_version 146124 (0.0024) [2024-06-13 03:10:25,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2394144768. Throughput: 0: 49092.4. Samples: 1922973300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 03:10:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:10:27,967][71000] Updated weights for policy 0, policy_version 146134 (0.0025) [2024-06-13 03:10:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2394406912. Throughput: 0: 49188.1. Samples: 1923267620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 03:10:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:10:30,987][71000] Updated weights for policy 0, policy_version 146144 (0.0030) [2024-06-13 03:10:34,610][71000] Updated weights for policy 0, policy_version 146154 (0.0030) [2024-06-13 03:10:35,939][70768] Fps is (10 sec: 50791.1, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2394652672. Throughput: 0: 49253.8. Samples: 1923418840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 03:10:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:10:37,732][71000] Updated weights for policy 0, policy_version 146164 (0.0029) [2024-06-13 03:10:40,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2394882048. Throughput: 0: 49338.7. Samples: 1923716900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 03:10:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:10:41,282][71000] Updated weights for policy 0, policy_version 146174 (0.0031) [2024-06-13 03:10:44,259][71000] Updated weights for policy 0, policy_version 146184 (0.0035) [2024-06-13 03:10:45,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48606.9, 300 sec: 48985.4). Total num frames: 2395095040. Throughput: 0: 49186.7. Samples: 1924013320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 03:10:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:10:47,922][71000] Updated weights for policy 0, policy_version 146194 (0.0030) [2024-06-13 03:10:50,700][71000] Updated weights for policy 0, policy_version 146204 (0.0033) [2024-06-13 03:10:50,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2395406336. Throughput: 0: 49385.8. Samples: 1924161980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 03:10:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:10:52,883][70980] Signal inference workers to stop experience collection... (28550 times) [2024-06-13 03:10:52,885][70980] Signal inference workers to resume experience collection... (28550 times) [2024-06-13 03:10:52,916][71000] InferenceWorker_p0-w0: stopping experience collection (28550 times) [2024-06-13 03:10:52,916][71000] InferenceWorker_p0-w0: resuming experience collection (28550 times) [2024-06-13 03:10:54,513][71000] Updated weights for policy 0, policy_version 146214 (0.0028) [2024-06-13 03:10:55,939][70768] Fps is (10 sec: 54067.6, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2395635712. Throughput: 0: 49370.7. Samples: 1924457860. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-13 03:10:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:10:57,439][71000] Updated weights for policy 0, policy_version 146224 (0.0026) [2024-06-13 03:11:00,881][71000] Updated weights for policy 0, policy_version 146234 (0.0029) [2024-06-13 03:11:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49040.9). Total num frames: 2395897856. Throughput: 0: 49510.8. Samples: 1924763340. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-13 03:11:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 03:11:04,030][71000] Updated weights for policy 0, policy_version 146244 (0.0031) [2024-06-13 03:11:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2396127232. Throughput: 0: 49370.7. Samples: 1924904400. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-13 03:11:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:11:07,325][71000] Updated weights for policy 0, policy_version 146254 (0.0036) [2024-06-13 03:11:10,823][71000] Updated weights for policy 0, policy_version 146264 (0.0026) [2024-06-13 03:11:10,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49207.7). Total num frames: 2396389376. Throughput: 0: 49537.4. Samples: 1925202480. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-13 03:11:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:11:14,126][71000] Updated weights for policy 0, policy_version 146274 (0.0029) [2024-06-13 03:11:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2396635136. Throughput: 0: 49572.4. Samples: 1925498380. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-13 03:11:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:11:17,412][71000] Updated weights for policy 0, policy_version 146284 (0.0031) [2024-06-13 03:11:20,798][71000] Updated weights for policy 0, policy_version 146294 (0.0031) [2024-06-13 03:11:20,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49971.2, 300 sec: 49152.0). Total num frames: 2396897280. Throughput: 0: 49576.8. Samples: 1925649800. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-13 03:11:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:11:24,174][71000] Updated weights for policy 0, policy_version 146304 (0.0029) [2024-06-13 03:11:25,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2397093888. Throughput: 0: 49437.8. Samples: 1925941600. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-13 03:11:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:11:27,511][71000] Updated weights for policy 0, policy_version 146314 (0.0025) [2024-06-13 03:11:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2397356032. Throughput: 0: 49178.2. Samples: 1926226340. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-13 03:11:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 03:11:31,150][71000] Updated weights for policy 0, policy_version 146324 (0.0025) [2024-06-13 03:11:34,267][71000] Updated weights for policy 0, policy_version 146334 (0.0024) [2024-06-13 03:11:35,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 2397618176. Throughput: 0: 49323.9. Samples: 1926381560. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-13 03:11:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 03:11:37,628][71000] Updated weights for policy 0, policy_version 146344 (0.0035) [2024-06-13 03:11:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49041.2). Total num frames: 2397847552. Throughput: 0: 49236.4. Samples: 1926673500. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-13 03:11:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:11:40,960][71000] Updated weights for policy 0, policy_version 146354 (0.0030) [2024-06-13 03:11:40,961][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000146354_2397863936.pth... [2024-06-13 03:11:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000145633_2386051072.pth [2024-06-13 03:11:44,094][71000] Updated weights for policy 0, policy_version 146364 (0.0024) [2024-06-13 03:11:45,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49698.1, 300 sec: 49040.9). Total num frames: 2398076928. Throughput: 0: 48945.8. Samples: 1926965900. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-13 03:11:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:11:47,519][71000] Updated weights for policy 0, policy_version 146374 (0.0033) [2024-06-13 03:11:50,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2398339072. Throughput: 0: 48984.0. Samples: 1927108680. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-13 03:11:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:11:51,325][71000] Updated weights for policy 0, policy_version 146384 (0.0032) [2024-06-13 03:11:54,323][71000] Updated weights for policy 0, policy_version 146394 (0.0033) [2024-06-13 03:11:55,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2398584832. Throughput: 0: 48880.8. Samples: 1927402120. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-13 03:11:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:11:57,721][71000] Updated weights for policy 0, policy_version 146404 (0.0023) [2024-06-13 03:12:00,761][71000] Updated weights for policy 0, policy_version 146414 (0.0029) [2024-06-13 03:12:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2398846976. Throughput: 0: 49014.2. Samples: 1927704020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 03:12:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:12:03,289][70980] Signal inference workers to stop experience collection... (28600 times) [2024-06-13 03:12:03,336][71000] InferenceWorker_p0-w0: stopping experience collection (28600 times) [2024-06-13 03:12:03,343][70980] Signal inference workers to resume experience collection... 
(28600 times) [2024-06-13 03:12:03,347][71000] InferenceWorker_p0-w0: resuming experience collection (28600 times) [2024-06-13 03:12:04,186][71000] Updated weights for policy 0, policy_version 146424 (0.0021) [2024-06-13 03:12:05,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2399076352. Throughput: 0: 49128.0. Samples: 1927860560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 03:12:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:12:07,496][71000] Updated weights for policy 0, policy_version 146434 (0.0023) [2024-06-13 03:12:10,645][71000] Updated weights for policy 0, policy_version 146444 (0.0036) [2024-06-13 03:12:10,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2399338496. Throughput: 0: 49047.0. Samples: 1928148720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 03:12:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:12:14,223][71000] Updated weights for policy 0, policy_version 146454 (0.0036) [2024-06-13 03:12:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2399584256. Throughput: 0: 49140.0. Samples: 1928437640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 03:12:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:12:17,604][71000] Updated weights for policy 0, policy_version 146464 (0.0029) [2024-06-13 03:12:20,897][71000] Updated weights for policy 0, policy_version 146474 (0.0028) [2024-06-13 03:12:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 49096.4). Total num frames: 2399830016. Throughput: 0: 49102.7. Samples: 1928591180. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 03:12:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:12:24,125][71000] Updated weights for policy 0, policy_version 146484 (0.0033) [2024-06-13 03:12:25,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2400043008. Throughput: 0: 49005.8. Samples: 1928878760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 03:12:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:12:27,638][71000] Updated weights for policy 0, policy_version 146494 (0.0026) [2024-06-13 03:12:30,776][71000] Updated weights for policy 0, policy_version 146504 (0.0033) [2024-06-13 03:12:30,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2400321536. Throughput: 0: 49136.9. Samples: 1929177060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 03:12:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:12:34,349][71000] Updated weights for policy 0, policy_version 146514 (0.0026) [2024-06-13 03:12:35,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2400567296. Throughput: 0: 49413.3. Samples: 1929332280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 03:12:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:12:37,459][71000] Updated weights for policy 0, policy_version 146524 (0.0024) [2024-06-13 03:12:40,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49151.8, 300 sec: 49040.9). Total num frames: 2400796672. Throughput: 0: 49441.2. Samples: 1929626980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 03:12:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:12:40,986][71000] Updated weights for policy 0, policy_version 146534 (0.0031) [2024-06-13 03:12:44,309][71000] Updated weights for policy 0, policy_version 146544 (0.0018) [2024-06-13 03:12:45,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2401026048. Throughput: 0: 49174.3. Samples: 1929916860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 03:12:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:12:47,820][71000] Updated weights for policy 0, policy_version 146554 (0.0039) [2024-06-13 03:12:50,793][71000] Updated weights for policy 0, policy_version 146564 (0.0025) [2024-06-13 03:12:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49424.9, 300 sec: 49263.0). Total num frames: 2401304576. Throughput: 0: 48796.2. Samples: 1930056400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 03:12:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:12:54,398][71000] Updated weights for policy 0, policy_version 146574 (0.0030) [2024-06-13 03:12:55,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2401550336. Throughput: 0: 49216.0. Samples: 1930363440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 03:12:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:12:57,471][71000] Updated weights for policy 0, policy_version 146584 (0.0040) [2024-06-13 03:13:00,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 2401763328. Throughput: 0: 49156.7. Samples: 1930649700. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 03:13:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:13:01,276][71000] Updated weights for policy 0, policy_version 146594 (0.0028) [2024-06-13 03:13:04,324][71000] Updated weights for policy 0, policy_version 146604 (0.0033) [2024-06-13 03:13:05,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2402009088. Throughput: 0: 49028.2. Samples: 1930797440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 03:13:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:13:08,028][70980] Signal inference workers to stop experience collection... (28650 times) [2024-06-13 03:13:08,072][71000] InferenceWorker_p0-w0: stopping experience collection (28650 times) [2024-06-13 03:13:08,078][70980] Signal inference workers to resume experience collection... (28650 times) [2024-06-13 03:13:08,088][71000] InferenceWorker_p0-w0: resuming experience collection (28650 times) [2024-06-13 03:13:08,090][71000] Updated weights for policy 0, policy_version 146614 (0.0022) [2024-06-13 03:13:10,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 2402254848. Throughput: 0: 48908.0. Samples: 1931079620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 03:13:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:13:11,136][71000] Updated weights for policy 0, policy_version 146624 (0.0036) [2024-06-13 03:13:14,641][71000] Updated weights for policy 0, policy_version 146634 (0.0023) [2024-06-13 03:13:15,940][70768] Fps is (10 sec: 52427.6, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2402533376. Throughput: 0: 48964.3. Samples: 1931380460. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 03:13:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:13:17,579][71000] Updated weights for policy 0, policy_version 146644 (0.0028) [2024-06-13 03:13:20,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2402746368. Throughput: 0: 48847.9. Samples: 1931530440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 03:13:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:13:21,239][71000] Updated weights for policy 0, policy_version 146654 (0.0026) [2024-06-13 03:13:24,282][71000] Updated weights for policy 0, policy_version 146664 (0.0023) [2024-06-13 03:13:25,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49698.2, 300 sec: 49207.5). Total num frames: 2403024896. Throughput: 0: 49216.2. Samples: 1931841700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 03:13:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:13:27,534][71000] Updated weights for policy 0, policy_version 146674 (0.0025) [2024-06-13 03:13:30,567][71000] Updated weights for policy 0, policy_version 146684 (0.0030) [2024-06-13 03:13:30,940][70768] Fps is (10 sec: 54067.6, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2403287040. Throughput: 0: 49671.5. Samples: 1932152080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 03:13:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:13:33,938][71000] Updated weights for policy 0, policy_version 146694 (0.0043) [2024-06-13 03:13:35,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2403532800. Throughput: 0: 50164.8. Samples: 1932313800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 03:13:35,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-13 03:13:37,059][71000] Updated weights for policy 0, policy_version 146704 (0.0025) [2024-06-13 03:13:40,526][71000] Updated weights for policy 0, policy_version 146714 (0.0029) [2024-06-13 03:13:40,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.3, 300 sec: 49263.1). Total num frames: 2403778560. Throughput: 0: 49781.8. Samples: 1932603620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 03:13:40,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-13 03:13:41,113][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000146717_2403811328.pth... [2024-06-13 03:13:41,151][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000145992_2391932928.pth [2024-06-13 03:13:43,634][71000] Updated weights for policy 0, policy_version 146724 (0.0022) [2024-06-13 03:13:45,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2404007936. Throughput: 0: 50000.6. Samples: 1932899720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 03:13:45,940][70768] Avg episode reward: [(0, '0.230')] [2024-06-13 03:13:47,107][71000] Updated weights for policy 0, policy_version 146734 (0.0030) [2024-06-13 03:13:50,451][71000] Updated weights for policy 0, policy_version 146744 (0.0031) [2024-06-13 03:13:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.3, 300 sec: 49374.2). Total num frames: 2404286464. Throughput: 0: 49962.1. Samples: 1933045740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 03:13:50,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-13 03:13:53,963][71000] Updated weights for policy 0, policy_version 146754 (0.0036) [2024-06-13 03:13:55,940][70768] Fps is (10 sec: 54066.4, 60 sec: 49971.1, 300 sec: 49374.1). Total num frames: 2404548608. Throughput: 0: 50454.0. Samples: 1933350060. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 03:13:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:13:57,003][71000] Updated weights for policy 0, policy_version 146764 (0.0031) [2024-06-13 03:14:00,305][71000] Updated weights for policy 0, policy_version 146774 (0.0026) [2024-06-13 03:14:00,940][70768] Fps is (10 sec: 49151.3, 60 sec: 50244.3, 300 sec: 49318.6). Total num frames: 2404777984. Throughput: 0: 50494.7. Samples: 1933652720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 03:14:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 03:14:03,395][71000] Updated weights for policy 0, policy_version 146784 (0.0031) [2024-06-13 03:14:05,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49971.1, 300 sec: 49207.5). Total num frames: 2405007360. Throughput: 0: 50483.2. Samples: 1933802180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 03:14:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:14:06,957][71000] Updated weights for policy 0, policy_version 146794 (0.0047) [2024-06-13 03:14:09,867][71000] Updated weights for policy 0, policy_version 146804 (0.0030) [2024-06-13 03:14:10,940][70768] Fps is (10 sec: 50791.2, 60 sec: 50517.3, 300 sec: 49429.7). Total num frames: 2405285888. Throughput: 0: 50223.5. Samples: 1934101760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 03:14:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:14:13,533][71000] Updated weights for policy 0, policy_version 146814 (0.0027) [2024-06-13 03:14:15,940][70768] Fps is (10 sec: 54067.5, 60 sec: 50244.4, 300 sec: 49485.4). Total num frames: 2405548032. Throughput: 0: 49944.1. Samples: 1934399560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 03:14:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 03:14:16,498][71000] Updated weights for policy 0, policy_version 146824 (0.0027) [2024-06-13 03:14:19,866][71000] Updated weights for policy 0, policy_version 146834 (0.0033) [2024-06-13 03:14:20,371][70980] Signal inference workers to stop experience collection... (28700 times) [2024-06-13 03:14:20,371][70980] Signal inference workers to resume experience collection... (28700 times) [2024-06-13 03:14:20,417][71000] InferenceWorker_p0-w0: stopping experience collection (28700 times) [2024-06-13 03:14:20,417][71000] InferenceWorker_p0-w0: resuming experience collection (28700 times) [2024-06-13 03:14:20,940][70768] Fps is (10 sec: 49151.6, 60 sec: 50517.4, 300 sec: 49429.7). Total num frames: 2405777408. Throughput: 0: 49866.5. Samples: 1934557800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 03:14:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:14:22,960][71000] Updated weights for policy 0, policy_version 146844 (0.0029) [2024-06-13 03:14:25,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49971.1, 300 sec: 49318.6). Total num frames: 2406023168. Throughput: 0: 49808.4. Samples: 1934845000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 03:14:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:14:26,574][71000] Updated weights for policy 0, policy_version 146854 (0.0029) [2024-06-13 03:14:29,614][71000] Updated weights for policy 0, policy_version 146864 (0.0035) [2024-06-13 03:14:30,940][70768] Fps is (10 sec: 52428.1, 60 sec: 50244.1, 300 sec: 49540.7). Total num frames: 2406301696. Throughput: 0: 50067.4. Samples: 1935152760. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 03:14:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 03:14:33,339][71000] Updated weights for policy 0, policy_version 146874 (0.0026) [2024-06-13 03:14:35,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49971.1, 300 sec: 49485.2). Total num frames: 2406531072. Throughput: 0: 50311.6. Samples: 1935309760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 03:14:35,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 03:14:35,969][71000] Updated weights for policy 0, policy_version 146884 (0.0035) [2024-06-13 03:14:39,697][71000] Updated weights for policy 0, policy_version 146894 (0.0037) [2024-06-13 03:14:40,940][70768] Fps is (10 sec: 49152.5, 60 sec: 50244.2, 300 sec: 49541.0). Total num frames: 2406793216. Throughput: 0: 50113.8. Samples: 1935605180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 03:14:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:14:42,733][71000] Updated weights for policy 0, policy_version 146904 (0.0031) [2024-06-13 03:14:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 49374.2). Total num frames: 2407022592. Throughput: 0: 50007.7. Samples: 1935903060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 03:14:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:14:46,226][71000] Updated weights for policy 0, policy_version 146914 (0.0025) [2024-06-13 03:14:49,559][71000] Updated weights for policy 0, policy_version 146924 (0.0028) [2024-06-13 03:14:50,940][70768] Fps is (10 sec: 49148.6, 60 sec: 49970.6, 300 sec: 49485.1). Total num frames: 2407284736. Throughput: 0: 50048.5. Samples: 1936054400. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 03:14:50,941][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:14:53,150][71000] Updated weights for policy 0, policy_version 146934 (0.0029) [2024-06-13 03:14:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 2407514112. Throughput: 0: 49891.9. Samples: 1936346900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 03:14:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:14:56,060][71000] Updated weights for policy 0, policy_version 146944 (0.0031) [2024-06-13 03:14:59,632][71000] Updated weights for policy 0, policy_version 146954 (0.0027) [2024-06-13 03:15:00,940][70768] Fps is (10 sec: 49155.3, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 2407776256. Throughput: 0: 49800.7. Samples: 1936640600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 03:15:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:15:02,435][71000] Updated weights for policy 0, policy_version 146964 (0.0028) [2024-06-13 03:15:05,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 2408005632. Throughput: 0: 49639.0. Samples: 1936791560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 03:15:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:15:06,120][71000] Updated weights for policy 0, policy_version 146974 (0.0046) [2024-06-13 03:15:09,007][71000] Updated weights for policy 0, policy_version 146984 (0.0023) [2024-06-13 03:15:10,944][70768] Fps is (10 sec: 50769.5, 60 sec: 49967.7, 300 sec: 49484.5). Total num frames: 2408284160. Throughput: 0: 49921.2. Samples: 1937091660. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 03:15:10,944][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 03:15:12,459][71000] Updated weights for policy 0, policy_version 146994 (0.0027) [2024-06-13 03:15:15,640][71000] Updated weights for policy 0, policy_version 147004 (0.0023) [2024-06-13 03:15:15,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2408513536. Throughput: 0: 49737.5. Samples: 1937390940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 03:15:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:15:19,147][71000] Updated weights for policy 0, policy_version 147014 (0.0026) [2024-06-13 03:15:20,054][70980] Signal inference workers to stop experience collection... (28750 times) [2024-06-13 03:15:20,055][70980] Signal inference workers to resume experience collection... (28750 times) [2024-06-13 03:15:20,100][71000] InferenceWorker_p0-w0: stopping experience collection (28750 times) [2024-06-13 03:15:20,101][71000] InferenceWorker_p0-w0: resuming experience collection (28750 times) [2024-06-13 03:15:20,939][70768] Fps is (10 sec: 47534.0, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 2408759296. Throughput: 0: 49464.9. Samples: 1937535680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 03:15:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:15:22,066][71000] Updated weights for policy 0, policy_version 147024 (0.0030) [2024-06-13 03:15:25,742][71000] Updated weights for policy 0, policy_version 147034 (0.0025) [2024-06-13 03:15:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 2409005056. Throughput: 0: 49374.3. Samples: 1937827020. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 03:15:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:15:28,743][71000] Updated weights for policy 0, policy_version 147044 (0.0030) [2024-06-13 03:15:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 2409267200. Throughput: 0: 49446.2. Samples: 1938128140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 03:15:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:15:32,422][71000] Updated weights for policy 0, policy_version 147054 (0.0027) [2024-06-13 03:15:35,386][71000] Updated weights for policy 0, policy_version 147064 (0.0028) [2024-06-13 03:15:35,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 2409512960. Throughput: 0: 49492.5. Samples: 1938281520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 03:15:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:15:38,977][71000] Updated weights for policy 0, policy_version 147074 (0.0033) [2024-06-13 03:15:40,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49152.0, 300 sec: 49651.8). Total num frames: 2409742336. Throughput: 0: 49656.3. Samples: 1938581440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 03:15:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:15:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000147080_2409758720.pth... [2024-06-13 03:15:40,991][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000146354_2397863936.pth [2024-06-13 03:15:41,851][71000] Updated weights for policy 0, policy_version 147084 (0.0021) [2024-06-13 03:15:45,493][71000] Updated weights for policy 0, policy_version 147094 (0.0021) [2024-06-13 03:15:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2410004480. Throughput: 0: 49685.9. Samples: 1938876460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 03:15:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:15:48,566][71000] Updated weights for policy 0, policy_version 147104 (0.0028) [2024-06-13 03:15:50,940][70768] Fps is (10 sec: 52429.8, 60 sec: 49698.8, 300 sec: 49596.3). Total num frames: 2410266624. Throughput: 0: 49533.6. Samples: 1939020560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 03:15:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:15:52,177][71000] Updated weights for policy 0, policy_version 147114 (0.0025) [2024-06-13 03:15:55,226][71000] Updated weights for policy 0, policy_version 147124 (0.0032) [2024-06-13 03:15:55,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49971.3, 300 sec: 49540.8). Total num frames: 2410512384. Throughput: 0: 49638.5. Samples: 1939325180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 03:15:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:15:58,831][71000] Updated weights for policy 0, policy_version 147134 (0.0025) [2024-06-13 03:16:00,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 2410725376. Throughput: 0: 49511.6. Samples: 1939618960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 03:16:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:16:01,871][71000] Updated weights for policy 0, policy_version 147144 (0.0034) [2024-06-13 03:16:05,258][71000] Updated weights for policy 0, policy_version 147154 (0.0029) [2024-06-13 03:16:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49698.3, 300 sec: 49485.2). Total num frames: 2410987520. Throughput: 0: 49470.2. Samples: 1939761840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 03:16:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:16:08,479][71000] Updated weights for policy 0, policy_version 147164 (0.0034) [2024-06-13 03:16:10,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49428.5, 300 sec: 49540.8). Total num frames: 2411249664. Throughput: 0: 49497.7. Samples: 1940054420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 03:16:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:16:12,023][71000] Updated weights for policy 0, policy_version 147174 (0.0029) [2024-06-13 03:16:15,337][71000] Updated weights for policy 0, policy_version 147184 (0.0031) [2024-06-13 03:16:15,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2411479040. Throughput: 0: 49490.3. Samples: 1940355200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 03:16:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:16:18,740][71000] Updated weights for policy 0, policy_version 147194 (0.0025) [2024-06-13 03:16:20,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49151.8, 300 sec: 49540.7). Total num frames: 2411708416. Throughput: 0: 49321.1. Samples: 1940500980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 03:16:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:16:22,243][71000] Updated weights for policy 0, policy_version 147204 (0.0024) [2024-06-13 03:16:25,241][71000] Updated weights for policy 0, policy_version 147214 (0.0022) [2024-06-13 03:16:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2411970560. Throughput: 0: 49010.0. Samples: 1940786880. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 03:16:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:16:28,734][71000] Updated weights for policy 0, policy_version 147224 (0.0030) [2024-06-13 03:16:30,940][70768] Fps is (10 sec: 50791.5, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2412216320. Throughput: 0: 48973.8. Samples: 1941080280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 03:16:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:16:32,027][71000] Updated weights for policy 0, policy_version 147234 (0.0028) [2024-06-13 03:16:35,293][71000] Updated weights for policy 0, policy_version 147244 (0.0020) [2024-06-13 03:16:35,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49424.9, 300 sec: 49596.3). Total num frames: 2412478464. Throughput: 0: 49313.6. Samples: 1941239680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 03:16:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:16:38,225][70980] Signal inference workers to stop experience collection... (28800 times) [2024-06-13 03:16:38,233][71000] InferenceWorker_p0-w0: stopping experience collection (28800 times) [2024-06-13 03:16:38,279][70980] Signal inference workers to resume experience collection... (28800 times) [2024-06-13 03:16:38,280][71000] InferenceWorker_p0-w0: resuming experience collection (28800 times) [2024-06-13 03:16:38,419][71000] Updated weights for policy 0, policy_version 147254 (0.0023) [2024-06-13 03:16:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 2412707840. Throughput: 0: 49107.0. Samples: 1941535000. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 03:16:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:16:41,971][71000] Updated weights for policy 0, policy_version 147264 (0.0027) [2024-06-13 03:16:45,083][71000] Updated weights for policy 0, policy_version 147274 (0.0024) [2024-06-13 03:16:45,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 2412953600. Throughput: 0: 48920.4. Samples: 1941820380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 03:16:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:16:48,656][71000] Updated weights for policy 0, policy_version 147284 (0.0033) [2024-06-13 03:16:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 49540.8). Total num frames: 2413199360. Throughput: 0: 49224.4. Samples: 1941976940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 03:16:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:16:51,782][71000] Updated weights for policy 0, policy_version 147294 (0.0030) [2024-06-13 03:16:55,184][71000] Updated weights for policy 0, policy_version 147304 (0.0026) [2024-06-13 03:16:55,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 2413445120. Throughput: 0: 49280.1. Samples: 1942272020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 03:16:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:16:58,370][71000] Updated weights for policy 0, policy_version 147314 (0.0030) [2024-06-13 03:17:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2413690880. Throughput: 0: 49288.4. Samples: 1942573180. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 03:17:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:17:01,958][71000] Updated weights for policy 0, policy_version 147324 (0.0028) [2024-06-13 03:17:04,853][71000] Updated weights for policy 0, policy_version 147334 (0.0032) [2024-06-13 03:17:05,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2413953024. Throughput: 0: 49332.7. Samples: 1942720940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 03:17:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 03:17:08,603][71000] Updated weights for policy 0, policy_version 147344 (0.0028) [2024-06-13 03:17:10,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.1, 300 sec: 49540.8). Total num frames: 2414198784. Throughput: 0: 49600.9. Samples: 1943018920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 03:17:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:17:11,614][71000] Updated weights for policy 0, policy_version 147354 (0.0033) [2024-06-13 03:17:15,294][71000] Updated weights for policy 0, policy_version 147364 (0.0021) [2024-06-13 03:17:15,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 2414428160. Throughput: 0: 49470.5. Samples: 1943306460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:17:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:17:18,311][71000] Updated weights for policy 0, policy_version 147374 (0.0023) [2024-06-13 03:17:20,940][70768] Fps is (10 sec: 47512.5, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 2414673920. Throughput: 0: 49100.9. Samples: 1943449220. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:17:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:17:22,191][71000] Updated weights for policy 0, policy_version 147384 (0.0039) [2024-06-13 03:17:24,922][71000] Updated weights for policy 0, policy_version 147394 (0.0022) [2024-06-13 03:17:25,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2414919680. Throughput: 0: 49172.5. Samples: 1943747760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:17:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:17:28,758][71000] Updated weights for policy 0, policy_version 147404 (0.0024) [2024-06-13 03:17:30,939][70768] Fps is (10 sec: 50791.5, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2415181824. Throughput: 0: 49406.7. Samples: 1944043680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:17:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:17:31,616][71000] Updated weights for policy 0, policy_version 147414 (0.0027) [2024-06-13 03:17:35,486][71000] Updated weights for policy 0, policy_version 147424 (0.0032) [2024-06-13 03:17:35,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.1, 300 sec: 49596.3). Total num frames: 2415427584. Throughput: 0: 49198.7. Samples: 1944190880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:17:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:17:38,500][71000] Updated weights for policy 0, policy_version 147434 (0.0030) [2024-06-13 03:17:40,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49151.9, 300 sec: 49596.3). Total num frames: 2415656960. Throughput: 0: 49257.2. Samples: 1944488600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:17:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:17:40,993][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000147441_2415673344.pth... 
[2024-06-13 03:17:41,055][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000146717_2403811328.pth [2024-06-13 03:17:41,931][71000] Updated weights for policy 0, policy_version 147444 (0.0032) [2024-06-13 03:17:44,971][71000] Updated weights for policy 0, policy_version 147454 (0.0029) [2024-06-13 03:17:45,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2415919104. Throughput: 0: 49056.4. Samples: 1944780720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:17:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:17:48,671][71000] Updated weights for policy 0, policy_version 147464 (0.0034) [2024-06-13 03:17:50,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 2416148480. Throughput: 0: 49232.7. Samples: 1944936420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:17:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:17:51,831][71000] Updated weights for policy 0, policy_version 147474 (0.0026) [2024-06-13 03:17:55,272][71000] Updated weights for policy 0, policy_version 147484 (0.0024) [2024-06-13 03:17:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49424.9, 300 sec: 49651.9). Total num frames: 2416410624. Throughput: 0: 49221.1. Samples: 1945233880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:17:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:17:58,558][70980] Signal inference workers to stop experience collection... (28850 times) [2024-06-13 03:17:58,605][71000] InferenceWorker_p0-w0: stopping experience collection (28850 times) [2024-06-13 03:17:58,612][70980] Signal inference workers to resume experience collection... 
(28850 times) [2024-06-13 03:17:58,622][71000] InferenceWorker_p0-w0: resuming experience collection (28850 times) [2024-06-13 03:17:58,625][71000] Updated weights for policy 0, policy_version 147494 (0.0029) [2024-06-13 03:18:00,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.1, 300 sec: 49651.8). Total num frames: 2416656384. Throughput: 0: 49333.9. Samples: 1945526480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:18:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:18:01,961][71000] Updated weights for policy 0, policy_version 147504 (0.0023) [2024-06-13 03:18:05,057][71000] Updated weights for policy 0, policy_version 147514 (0.0025) [2024-06-13 03:18:05,942][70768] Fps is (10 sec: 49140.6, 60 sec: 49150.0, 300 sec: 49651.4). Total num frames: 2416902144. Throughput: 0: 49445.1. Samples: 1945674360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:18:05,942][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:18:08,739][71000] Updated weights for policy 0, policy_version 147524 (0.0034) [2024-06-13 03:18:10,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 2417147904. Throughput: 0: 49304.5. Samples: 1945966460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:18:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:18:11,939][71000] Updated weights for policy 0, policy_version 147534 (0.0039) [2024-06-13 03:18:15,373][71000] Updated weights for policy 0, policy_version 147544 (0.0030) [2024-06-13 03:18:15,940][70768] Fps is (10 sec: 49163.3, 60 sec: 49425.1, 300 sec: 49651.8). Total num frames: 2417393664. Throughput: 0: 49204.7. Samples: 1946257900. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:18:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:18:18,586][71000] Updated weights for policy 0, policy_version 147554 (0.0029) [2024-06-13 03:18:20,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2417623040. Throughput: 0: 49174.5. Samples: 1946403740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 03:18:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:18:22,211][71000] Updated weights for policy 0, policy_version 147564 (0.0020) [2024-06-13 03:18:25,402][71000] Updated weights for policy 0, policy_version 147574 (0.0022) [2024-06-13 03:18:25,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2417868800. Throughput: 0: 49062.2. Samples: 1946696400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 03:18:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:18:28,564][71000] Updated weights for policy 0, policy_version 147584 (0.0029) [2024-06-13 03:18:30,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 2418114560. Throughput: 0: 49223.6. Samples: 1946995780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 03:18:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 03:18:31,804][71000] Updated weights for policy 0, policy_version 147594 (0.0027) [2024-06-13 03:18:35,251][71000] Updated weights for policy 0, policy_version 147604 (0.0029) [2024-06-13 03:18:35,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 2418360320. Throughput: 0: 49023.7. Samples: 1947142480. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 03:18:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:18:38,535][71000] Updated weights for policy 0, policy_version 147614 (0.0027) [2024-06-13 03:18:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2418622464. Throughput: 0: 49095.6. Samples: 1947443180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 03:18:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:18:42,149][71000] Updated weights for policy 0, policy_version 147624 (0.0025) [2024-06-13 03:18:44,725][71000] Updated weights for policy 0, policy_version 147634 (0.0025) [2024-06-13 03:18:45,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2418884608. Throughput: 0: 49367.1. Samples: 1947748000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 03:18:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 03:18:48,454][71000] Updated weights for policy 0, policy_version 147644 (0.0025) [2024-06-13 03:18:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 2419113984. Throughput: 0: 49388.9. Samples: 1947896740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 03:18:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:18:51,553][71000] Updated weights for policy 0, policy_version 147654 (0.0027) [2024-06-13 03:18:55,116][71000] Updated weights for policy 0, policy_version 147664 (0.0040) [2024-06-13 03:18:55,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48878.9, 300 sec: 49374.2). Total num frames: 2419343360. Throughput: 0: 49338.9. Samples: 1948186720. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 03:18:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:18:58,180][71000] Updated weights for policy 0, policy_version 147674 (0.0033) [2024-06-13 03:19:00,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2419621888. Throughput: 0: 49451.1. Samples: 1948483200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 03:19:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:19:01,778][71000] Updated weights for policy 0, policy_version 147684 (0.0023) [2024-06-13 03:19:04,783][71000] Updated weights for policy 0, policy_version 147694 (0.0026) [2024-06-13 03:19:05,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49700.0, 300 sec: 49485.2). Total num frames: 2419884032. Throughput: 0: 49524.9. Samples: 1948632360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 03:19:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:19:08,246][71000] Updated weights for policy 0, policy_version 147704 (0.0023) [2024-06-13 03:19:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2420097024. Throughput: 0: 49806.3. Samples: 1948937680. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 03:19:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:19:11,343][71000] Updated weights for policy 0, policy_version 147714 (0.0023) [2024-06-13 03:19:12,568][70980] Signal inference workers to stop experience collection... (28900 times) [2024-06-13 03:19:12,603][71000] InferenceWorker_p0-w0: stopping experience collection (28900 times) [2024-06-13 03:19:12,613][70980] Signal inference workers to resume experience collection... 
(28900 times) [2024-06-13 03:19:12,619][71000] InferenceWorker_p0-w0: resuming experience collection (28900 times) [2024-06-13 03:19:15,073][71000] Updated weights for policy 0, policy_version 147724 (0.0030) [2024-06-13 03:19:15,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2420342784. Throughput: 0: 49784.9. Samples: 1949236100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 03:19:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:19:17,958][71000] Updated weights for policy 0, policy_version 147734 (0.0030) [2024-06-13 03:19:20,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2420604928. Throughput: 0: 49578.6. Samples: 1949373520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-13 03:19:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:19:21,526][71000] Updated weights for policy 0, policy_version 147744 (0.0028) [2024-06-13 03:19:24,328][71000] Updated weights for policy 0, policy_version 147754 (0.0025) [2024-06-13 03:19:25,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 2420867072. Throughput: 0: 49638.2. Samples: 1949676900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-13 03:19:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:19:28,021][71000] Updated weights for policy 0, policy_version 147764 (0.0027) [2024-06-13 03:19:30,940][70768] Fps is (10 sec: 50787.9, 60 sec: 49970.7, 300 sec: 49429.6). Total num frames: 2421112832. Throughput: 0: 49606.9. Samples: 1949980340. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-13 03:19:30,941][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:19:31,286][71000] Updated weights for policy 0, policy_version 147774 (0.0030) [2024-06-13 03:19:35,071][71000] Updated weights for policy 0, policy_version 147784 (0.0031) [2024-06-13 03:19:35,939][70768] Fps is (10 sec: 47514.7, 60 sec: 49698.2, 300 sec: 49318.7). Total num frames: 2421342208. Throughput: 0: 49476.6. Samples: 1950123180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-13 03:19:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:19:37,778][71000] Updated weights for policy 0, policy_version 147794 (0.0029) [2024-06-13 03:19:40,940][70768] Fps is (10 sec: 47516.2, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2421587968. Throughput: 0: 49471.7. Samples: 1950412940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-13 03:19:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:19:40,978][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000147803_2421604352.pth... [2024-06-13 03:19:41,027][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000147080_2409758720.pth [2024-06-13 03:19:41,449][71000] Updated weights for policy 0, policy_version 147804 (0.0029) [2024-06-13 03:19:44,463][71000] Updated weights for policy 0, policy_version 147814 (0.0037) [2024-06-13 03:19:45,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49152.0, 300 sec: 49318.8). Total num frames: 2421833728. Throughput: 0: 49479.3. Samples: 1950709760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-13 03:19:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:19:48,038][71000] Updated weights for policy 0, policy_version 147824 (0.0025) [2024-06-13 03:19:50,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2422095872. Throughput: 0: 49621.7. Samples: 1950865340. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-13 03:19:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:19:50,954][71000] Updated weights for policy 0, policy_version 147834 (0.0027) [2024-06-13 03:19:54,612][71000] Updated weights for policy 0, policy_version 147844 (0.0027) [2024-06-13 03:19:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 2422325248. Throughput: 0: 49461.0. Samples: 1951163420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-13 03:19:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:19:57,658][71000] Updated weights for policy 0, policy_version 147854 (0.0033) [2024-06-13 03:20:00,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2422587392. Throughput: 0: 49399.6. Samples: 1951459080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-13 03:20:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:20:00,947][71000] Updated weights for policy 0, policy_version 147864 (0.0030) [2024-06-13 03:20:04,159][71000] Updated weights for policy 0, policy_version 147874 (0.0023) [2024-06-13 03:20:05,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49152.0, 300 sec: 49319.3). Total num frames: 2422833152. Throughput: 0: 49601.7. Samples: 1951605600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-13 03:20:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:20:07,451][71000] Updated weights for policy 0, policy_version 147884 (0.0027) [2024-06-13 03:20:10,705][71000] Updated weights for policy 0, policy_version 147894 (0.0041) [2024-06-13 03:20:10,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 2423095296. Throughput: 0: 49705.8. Samples: 1951913660. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-13 03:20:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:20:14,277][71000] Updated weights for policy 0, policy_version 147904 (0.0023) [2024-06-13 03:20:15,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2423308288. Throughput: 0: 49526.0. Samples: 1952208980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-13 03:20:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:20:17,639][71000] Updated weights for policy 0, policy_version 147914 (0.0021) [2024-06-13 03:20:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2423570432. Throughput: 0: 49456.2. Samples: 1952348720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-13 03:20:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:20:21,175][71000] Updated weights for policy 0, policy_version 147924 (0.0030) [2024-06-13 03:20:24,488][71000] Updated weights for policy 0, policy_version 147934 (0.0030) [2024-06-13 03:20:25,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2423816192. Throughput: 0: 49521.4. Samples: 1952641400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:20:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:20:27,917][71000] Updated weights for policy 0, policy_version 147944 (0.0022) [2024-06-13 03:20:30,906][71000] Updated weights for policy 0, policy_version 147954 (0.0028) [2024-06-13 03:20:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.5, 300 sec: 49374.1). Total num frames: 2424078336. Throughput: 0: 49463.1. Samples: 1952935600. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:20:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:20:34,254][71000] Updated weights for policy 0, policy_version 147964 (0.0032) [2024-06-13 03:20:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2424291328. Throughput: 0: 49310.0. Samples: 1953084280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:20:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:20:37,244][71000] Updated weights for policy 0, policy_version 147974 (0.0027) [2024-06-13 03:20:40,877][71000] Updated weights for policy 0, policy_version 147984 (0.0029) [2024-06-13 03:20:40,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 2424569856. Throughput: 0: 49330.5. Samples: 1953383300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:20:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:20:43,979][71000] Updated weights for policy 0, policy_version 147994 (0.0028) [2024-06-13 03:20:44,958][70980] Signal inference workers to stop experience collection... (28950 times) [2024-06-13 03:20:44,960][70980] Signal inference workers to resume experience collection... (28950 times) [2024-06-13 03:20:44,989][71000] InferenceWorker_p0-w0: stopping experience collection (28950 times) [2024-06-13 03:20:44,990][71000] InferenceWorker_p0-w0: resuming experience collection (28950 times) [2024-06-13 03:20:45,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49971.1, 300 sec: 49374.1). Total num frames: 2424832000. Throughput: 0: 49407.0. Samples: 1953682400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:20:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:20:47,594][71000] Updated weights for policy 0, policy_version 148004 (0.0026) [2024-06-13 03:20:50,762][71000] Updated weights for policy 0, policy_version 148014 (0.0028) [2024-06-13 03:20:50,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2425061376. Throughput: 0: 49503.7. Samples: 1953833260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:20:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:20:54,372][71000] Updated weights for policy 0, policy_version 148024 (0.0029) [2024-06-13 03:20:55,940][70768] Fps is (10 sec: 45874.1, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 2425290752. Throughput: 0: 49147.3. Samples: 1954125300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:20:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:20:57,431][71000] Updated weights for policy 0, policy_version 148034 (0.0028) [2024-06-13 03:21:00,878][71000] Updated weights for policy 0, policy_version 148044 (0.0025) [2024-06-13 03:21:00,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 2425552896. Throughput: 0: 49100.3. Samples: 1954418500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:21:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:21:04,393][71000] Updated weights for policy 0, policy_version 148054 (0.0030) [2024-06-13 03:21:05,939][70768] Fps is (10 sec: 50792.0, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2425798656. Throughput: 0: 49306.3. Samples: 1954567500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:21:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:21:07,707][71000] Updated weights for policy 0, policy_version 148064 (0.0032) [2024-06-13 03:21:10,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 2426028032. Throughput: 0: 49165.7. Samples: 1954853860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:21:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:21:11,097][71000] Updated weights for policy 0, policy_version 148074 (0.0026) [2024-06-13 03:21:14,452][71000] Updated weights for policy 0, policy_version 148084 (0.0040) [2024-06-13 03:21:15,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2426273792. Throughput: 0: 48980.8. Samples: 1955139740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:21:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:21:17,973][71000] Updated weights for policy 0, policy_version 148094 (0.0036) [2024-06-13 03:21:20,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2426519552. Throughput: 0: 48936.5. Samples: 1955286420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:21:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:21:21,117][71000] Updated weights for policy 0, policy_version 148104 (0.0020) [2024-06-13 03:21:24,383][71000] Updated weights for policy 0, policy_version 148114 (0.0022) [2024-06-13 03:21:25,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2426781696. Throughput: 0: 49158.8. Samples: 1955595440. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:21:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:21:27,793][71000] Updated weights for policy 0, policy_version 148124 (0.0027) [2024-06-13 03:21:30,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.9, 300 sec: 49207.6). Total num frames: 2426994688. Throughput: 0: 49104.5. Samples: 1955892100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-13 03:21:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:21:31,302][71000] Updated weights for policy 0, policy_version 148134 (0.0023) [2024-06-13 03:21:34,219][71000] Updated weights for policy 0, policy_version 148144 (0.0023) [2024-06-13 03:21:35,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.0, 300 sec: 49374.1). Total num frames: 2427273216. Throughput: 0: 48992.7. Samples: 1956037940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-13 03:21:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:21:37,862][71000] Updated weights for policy 0, policy_version 148154 (0.0029) [2024-06-13 03:21:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48606.0, 300 sec: 49263.1). Total num frames: 2427486208. Throughput: 0: 48912.2. Samples: 1956326340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-13 03:21:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:21:40,959][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000148163_2427502592.pth... [2024-06-13 03:21:41,012][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000147441_2415673344.pth [2024-06-13 03:21:41,306][71000] Updated weights for policy 0, policy_version 148164 (0.0025) [2024-06-13 03:21:44,691][71000] Updated weights for policy 0, policy_version 148174 (0.0030) [2024-06-13 03:21:45,939][70768] Fps is (10 sec: 49152.9, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 2427764736. Throughput: 0: 48944.2. Samples: 1956620980. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-13 03:21:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:21:47,691][71000] Updated weights for policy 0, policy_version 148184 (0.0032) [2024-06-13 03:21:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 49263.1). Total num frames: 2427977728. Throughput: 0: 48918.2. Samples: 1956768820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-13 03:21:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:21:51,511][71000] Updated weights for policy 0, policy_version 148194 (0.0027) [2024-06-13 03:21:54,571][71000] Updated weights for policy 0, policy_version 148204 (0.0026) [2024-06-13 03:21:55,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49152.2, 300 sec: 49318.6). Total num frames: 2428239872. Throughput: 0: 49036.9. Samples: 1957060520. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-13 03:21:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:21:58,303][71000] Updated weights for policy 0, policy_version 148214 (0.0031) [2024-06-13 03:21:59,031][70980] Signal inference workers to stop experience collection... (29000 times) [2024-06-13 03:21:59,059][71000] InferenceWorker_p0-w0: stopping experience collection (29000 times) [2024-06-13 03:21:59,079][70980] Signal inference workers to resume experience collection... (29000 times) [2024-06-13 03:21:59,079][71000] InferenceWorker_p0-w0: resuming experience collection (29000 times) [2024-06-13 03:22:00,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48606.0, 300 sec: 49207.5). Total num frames: 2428469248. Throughput: 0: 49170.4. Samples: 1957352400. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-13 03:22:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:22:01,241][71000] Updated weights for policy 0, policy_version 148224 (0.0024) [2024-06-13 03:22:04,763][71000] Updated weights for policy 0, policy_version 148234 (0.0033) [2024-06-13 03:22:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2428747776. Throughput: 0: 49374.1. Samples: 1957508260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-13 03:22:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:22:07,486][71000] Updated weights for policy 0, policy_version 148244 (0.0022) [2024-06-13 03:22:10,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2428960768. Throughput: 0: 48959.6. Samples: 1957798620. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-13 03:22:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:22:11,238][71000] Updated weights for policy 0, policy_version 148254 (0.0025) [2024-06-13 03:22:14,483][71000] Updated weights for policy 0, policy_version 148264 (0.0023) [2024-06-13 03:22:15,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 2429239296. Throughput: 0: 48883.1. Samples: 1958091840. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-13 03:22:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:22:18,031][71000] Updated weights for policy 0, policy_version 148274 (0.0033) [2024-06-13 03:22:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2429452288. Throughput: 0: 49056.6. Samples: 1958245480. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-13 03:22:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:22:21,105][71000] Updated weights for policy 0, policy_version 148284 (0.0030) [2024-06-13 03:22:24,896][71000] Updated weights for policy 0, policy_version 148294 (0.0028) [2024-06-13 03:22:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2429730816. Throughput: 0: 49212.9. Samples: 1958540920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-13 03:22:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:22:27,575][71000] Updated weights for policy 0, policy_version 148304 (0.0029) [2024-06-13 03:22:30,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2429943808. Throughput: 0: 49400.0. Samples: 1958843980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-13 03:22:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:22:31,189][71000] Updated weights for policy 0, policy_version 148314 (0.0027) [2024-06-13 03:22:34,042][71000] Updated weights for policy 0, policy_version 148324 (0.0029) [2024-06-13 03:22:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2430222336. Throughput: 0: 49361.3. Samples: 1958990080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 03:22:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:22:37,719][71000] Updated weights for policy 0, policy_version 148334 (0.0028) [2024-06-13 03:22:40,734][71000] Updated weights for policy 0, policy_version 148344 (0.0028) [2024-06-13 03:22:40,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 2430468096. Throughput: 0: 49299.2. Samples: 1959278980. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 03:22:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:22:44,606][71000] Updated weights for policy 0, policy_version 148354 (0.0022) [2024-06-13 03:22:45,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2430713856. Throughput: 0: 49537.3. Samples: 1959581580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 03:22:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:22:47,430][71000] Updated weights for policy 0, policy_version 148364 (0.0022) [2024-06-13 03:22:50,940][70768] Fps is (10 sec: 45873.9, 60 sec: 49151.8, 300 sec: 49207.5). Total num frames: 2430926848. Throughput: 0: 49160.8. Samples: 1959720500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 03:22:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:22:51,322][71000] Updated weights for policy 0, policy_version 148374 (0.0024) [2024-06-13 03:22:52,211][70980] Signal inference workers to stop experience collection... (29050 times) [2024-06-13 03:22:52,240][71000] InferenceWorker_p0-w0: stopping experience collection (29050 times) [2024-06-13 03:22:52,319][70980] Signal inference workers to resume experience collection... (29050 times) [2024-06-13 03:22:52,319][71000] InferenceWorker_p0-w0: resuming experience collection (29050 times) [2024-06-13 03:22:53,954][71000] Updated weights for policy 0, policy_version 148384 (0.0023) [2024-06-13 03:22:55,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 2431205376. Throughput: 0: 49271.8. Samples: 1960015860. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 03:22:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 03:22:57,745][71000] Updated weights for policy 0, policy_version 148394 (0.0029) [2024-06-13 03:23:00,707][71000] Updated weights for policy 0, policy_version 148404 (0.0030) [2024-06-13 03:23:00,940][70768] Fps is (10 sec: 54068.3, 60 sec: 49971.2, 300 sec: 49374.6). Total num frames: 2431467520. Throughput: 0: 49262.7. Samples: 1960308660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 03:23:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:23:04,971][71000] Updated weights for policy 0, policy_version 148414 (0.0040) [2024-06-13 03:23:05,939][70768] Fps is (10 sec: 47514.8, 60 sec: 48879.1, 300 sec: 49263.1). Total num frames: 2431680512. Throughput: 0: 48979.2. Samples: 1960449540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 03:23:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:23:07,677][71000] Updated weights for policy 0, policy_version 148424 (0.0026) [2024-06-13 03:23:10,940][70768] Fps is (10 sec: 44236.4, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2431909888. Throughput: 0: 48844.8. Samples: 1960738940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 03:23:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:23:11,722][71000] Updated weights for policy 0, policy_version 148434 (0.0027) [2024-06-13 03:23:14,352][71000] Updated weights for policy 0, policy_version 148444 (0.0026) [2024-06-13 03:23:15,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 2432188416. Throughput: 0: 48777.1. Samples: 1961038960. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 03:23:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:23:17,973][71000] Updated weights for policy 0, policy_version 148454 (0.0027) [2024-06-13 03:23:20,609][71000] Updated weights for policy 0, policy_version 148464 (0.0021) [2024-06-13 03:23:20,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 2432434176. Throughput: 0: 49150.6. Samples: 1961201860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 03:23:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:23:25,088][71000] Updated weights for policy 0, policy_version 148474 (0.0041) [2024-06-13 03:23:25,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2432663552. Throughput: 0: 49133.7. Samples: 1961490000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 03:23:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:23:27,390][71000] Updated weights for policy 0, policy_version 148484 (0.0033) [2024-06-13 03:23:30,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2432892928. Throughput: 0: 48925.6. Samples: 1961783240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 03:23:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:23:31,623][71000] Updated weights for policy 0, policy_version 148494 (0.0025) [2024-06-13 03:23:33,933][71000] Updated weights for policy 0, policy_version 148504 (0.0031) [2024-06-13 03:23:35,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2433171456. Throughput: 0: 49095.2. Samples: 1961929780. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:23:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:23:38,066][71000] Updated weights for policy 0, policy_version 148514 (0.0032) [2024-06-13 03:23:40,642][71000] Updated weights for policy 0, policy_version 148524 (0.0026) [2024-06-13 03:23:40,940][70768] Fps is (10 sec: 54067.3, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 2433433600. Throughput: 0: 49313.9. Samples: 1962234980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:23:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:23:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000148525_2433433600.pth... [2024-06-13 03:23:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000147803_2421604352.pth [2024-06-13 03:23:44,801][71000] Updated weights for policy 0, policy_version 148534 (0.0033) [2024-06-13 03:23:45,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2433646592. Throughput: 0: 49326.6. Samples: 1962528360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:23:45,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-13 03:23:47,197][71000] Updated weights for policy 0, policy_version 148544 (0.0032) [2024-06-13 03:23:50,940][70768] Fps is (10 sec: 44236.8, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2433875968. Throughput: 0: 49299.8. Samples: 1962668040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:23:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:23:51,417][71000] Updated weights for policy 0, policy_version 148554 (0.0031) [2024-06-13 03:23:53,866][71000] Updated weights for policy 0, policy_version 148564 (0.0031) [2024-06-13 03:23:55,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.1, 300 sec: 49207.6). Total num frames: 2434138112. Throughput: 0: 49295.7. Samples: 1962957240. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:23:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:23:57,998][71000] Updated weights for policy 0, policy_version 148574 (0.0035) [2024-06-13 03:24:00,553][71000] Updated weights for policy 0, policy_version 148584 (0.0041) [2024-06-13 03:24:00,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2434400256. Throughput: 0: 49386.7. Samples: 1963261360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:24:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:24:04,960][71000] Updated weights for policy 0, policy_version 148594 (0.0025) [2024-06-13 03:24:05,637][70980] Signal inference workers to stop experience collection... (29100 times) [2024-06-13 03:24:05,637][70980] Signal inference workers to resume experience collection... (29100 times) [2024-06-13 03:24:05,647][71000] InferenceWorker_p0-w0: stopping experience collection (29100 times) [2024-06-13 03:24:05,649][71000] InferenceWorker_p0-w0: resuming experience collection (29100 times) [2024-06-13 03:24:05,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 49207.6). Total num frames: 2434613248. Throughput: 0: 49044.1. Samples: 1963408840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:24:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:24:07,171][71000] Updated weights for policy 0, policy_version 148604 (0.0025) [2024-06-13 03:24:10,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2434859008. Throughput: 0: 49072.4. Samples: 1963698260. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:24:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:24:11,614][71000] Updated weights for policy 0, policy_version 148614 (0.0026) [2024-06-13 03:24:13,952][71000] Updated weights for policy 0, policy_version 148624 (0.0029) [2024-06-13 03:24:15,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2435121152. Throughput: 0: 48840.0. Samples: 1963981040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:24:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:24:18,011][71000] Updated weights for policy 0, policy_version 148634 (0.0028) [2024-06-13 03:24:20,827][71000] Updated weights for policy 0, policy_version 148644 (0.0031) [2024-06-13 03:24:20,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2435383296. Throughput: 0: 49092.2. Samples: 1964138920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:24:20,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 03:24:24,545][71000] Updated weights for policy 0, policy_version 148654 (0.0030) [2024-06-13 03:24:25,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.0, 300 sec: 49152.1). Total num frames: 2435612672. Throughput: 0: 48942.8. Samples: 1964437400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:24:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:24:27,230][71000] Updated weights for policy 0, policy_version 148664 (0.0043) [2024-06-13 03:24:30,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2435842048. Throughput: 0: 48857.4. Samples: 1964726940. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:24:30,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 03:24:31,513][71000] Updated weights for policy 0, policy_version 148674 (0.0034) [2024-06-13 03:24:33,889][71000] Updated weights for policy 0, policy_version 148684 (0.0027) [2024-06-13 03:24:35,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2436104192. Throughput: 0: 48981.8. Samples: 1964872220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:24:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:24:37,888][71000] Updated weights for policy 0, policy_version 148694 (0.0022) [2024-06-13 03:24:40,919][71000] Updated weights for policy 0, policy_version 148704 (0.0034) [2024-06-13 03:24:40,940][70768] Fps is (10 sec: 52429.0, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2436366336. Throughput: 0: 49225.8. Samples: 1965172400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 03:24:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:24:44,481][71000] Updated weights for policy 0, policy_version 148714 (0.0020) [2024-06-13 03:24:45,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2436595712. Throughput: 0: 49108.6. Samples: 1965471240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 03:24:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:24:47,606][71000] Updated weights for policy 0, policy_version 148724 (0.0026) [2024-06-13 03:24:50,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2436825088. Throughput: 0: 49127.5. Samples: 1965619580. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 03:24:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-13 03:24:51,287][71000] Updated weights for policy 0, policy_version 148734 (0.0023) [2024-06-13 03:24:54,453][71000] Updated weights for policy 0, policy_version 148744 (0.0027) [2024-06-13 03:24:55,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2437087232. Throughput: 0: 49003.2. Samples: 1965903400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 03:24:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:24:57,921][71000] Updated weights for policy 0, policy_version 148754 (0.0024) [2024-06-13 03:25:00,918][71000] Updated weights for policy 0, policy_version 148764 (0.0021) [2024-06-13 03:25:00,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2437349376. Throughput: 0: 49241.5. Samples: 1966196900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 03:25:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:25:04,487][71000] Updated weights for policy 0, policy_version 148774 (0.0031) [2024-06-13 03:25:05,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 2437595136. Throughput: 0: 49105.4. Samples: 1966348660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 03:25:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:25:07,512][71000] Updated weights for policy 0, policy_version 148784 (0.0030) [2024-06-13 03:25:10,939][70768] Fps is (10 sec: 45875.5, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2437808128. Throughput: 0: 49110.3. Samples: 1966647360. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 03:25:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:25:11,236][71000] Updated weights for policy 0, policy_version 148794 (0.0034) [2024-06-13 03:25:14,298][71000] Updated weights for policy 0, policy_version 148804 (0.0022) [2024-06-13 03:25:15,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.2, 300 sec: 49152.0). Total num frames: 2438070272. Throughput: 0: 49200.6. Samples: 1966940960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 03:25:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:25:17,956][71000] Updated weights for policy 0, policy_version 148814 (0.0022) [2024-06-13 03:25:20,803][71000] Updated weights for policy 0, policy_version 148824 (0.0028) [2024-06-13 03:25:20,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2438332416. Throughput: 0: 49288.5. Samples: 1967090200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 03:25:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:25:22,475][70980] Signal inference workers to stop experience collection... (29150 times) [2024-06-13 03:25:22,484][70980] Signal inference workers to resume experience collection... (29150 times) [2024-06-13 03:25:22,530][71000] InferenceWorker_p0-w0: stopping experience collection (29150 times) [2024-06-13 03:25:22,530][71000] InferenceWorker_p0-w0: resuming experience collection (29150 times) [2024-06-13 03:25:24,416][71000] Updated weights for policy 0, policy_version 148834 (0.0024) [2024-06-13 03:25:25,939][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2438578176. Throughput: 0: 49250.3. Samples: 1967388660. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 03:25:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:25:27,423][71000] Updated weights for policy 0, policy_version 148844 (0.0023) [2024-06-13 03:25:30,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2438791168. Throughput: 0: 49217.2. Samples: 1967686020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 03:25:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:25:31,314][71000] Updated weights for policy 0, policy_version 148854 (0.0027) [2024-06-13 03:25:33,865][71000] Updated weights for policy 0, policy_version 148864 (0.0035) [2024-06-13 03:25:35,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2439069696. Throughput: 0: 49234.6. Samples: 1967835140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 03:25:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:25:37,695][71000] Updated weights for policy 0, policy_version 148874 (0.0026) [2024-06-13 03:25:40,939][70768] Fps is (10 sec: 50791.1, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2439299072. Throughput: 0: 49553.7. Samples: 1968133320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 03:25:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:25:40,997][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000148884_2439315456.pth... [2024-06-13 03:25:40,998][71000] Updated weights for policy 0, policy_version 148884 (0.0026) [2024-06-13 03:25:41,041][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000148163_2427502592.pth [2024-06-13 03:25:44,421][71000] Updated weights for policy 0, policy_version 148894 (0.0026) [2024-06-13 03:25:45,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2439561216. Throughput: 0: 49503.6. Samples: 1968424560. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 03:25:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:25:47,654][71000] Updated weights for policy 0, policy_version 148904 (0.0026) [2024-06-13 03:25:50,855][71000] Updated weights for policy 0, policy_version 148914 (0.0027) [2024-06-13 03:25:50,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49698.1, 300 sec: 49207.6). Total num frames: 2439806976. Throughput: 0: 49442.0. Samples: 1968573560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 03:25:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:25:54,351][71000] Updated weights for policy 0, policy_version 148924 (0.0032) [2024-06-13 03:25:55,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 2440052736. Throughput: 0: 49370.8. Samples: 1968869060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 03:25:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:25:57,379][71000] Updated weights for policy 0, policy_version 148934 (0.0028) [2024-06-13 03:26:00,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.9, 300 sec: 49096.4). Total num frames: 2440282112. Throughput: 0: 49471.4. Samples: 1969167180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 03:26:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:26:01,129][71000] Updated weights for policy 0, policy_version 148944 (0.0031) [2024-06-13 03:26:04,080][71000] Updated weights for policy 0, policy_version 148954 (0.0027) [2024-06-13 03:26:05,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2440544256. Throughput: 0: 49464.0. Samples: 1969316080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 03:26:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:26:07,774][71000] Updated weights for policy 0, policy_version 148964 (0.0038) [2024-06-13 03:26:10,428][71000] Updated weights for policy 0, policy_version 148974 (0.0036) [2024-06-13 03:26:10,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49971.1, 300 sec: 49263.1). Total num frames: 2440806400. Throughput: 0: 49467.4. Samples: 1969614700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 03:26:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:26:14,318][71000] Updated weights for policy 0, policy_version 148984 (0.0037) [2024-06-13 03:26:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2441052160. Throughput: 0: 49283.2. Samples: 1969903760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 03:26:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:26:17,247][71000] Updated weights for policy 0, policy_version 148994 (0.0028) [2024-06-13 03:26:20,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2441265152. Throughput: 0: 49372.1. Samples: 1970056880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 03:26:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:26:21,003][71000] Updated weights for policy 0, policy_version 149004 (0.0033) [2024-06-13 03:26:24,174][71000] Updated weights for policy 0, policy_version 149014 (0.0029) [2024-06-13 03:26:25,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2441527296. Throughput: 0: 49219.1. Samples: 1970348180. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 03:26:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:26:27,515][71000] Updated weights for policy 0, policy_version 149024 (0.0030) [2024-06-13 03:26:30,571][71000] Updated weights for policy 0, policy_version 149034 (0.0028) [2024-06-13 03:26:30,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 2441773056. Throughput: 0: 49419.9. Samples: 1970648460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 03:26:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:26:34,282][71000] Updated weights for policy 0, policy_version 149044 (0.0026) [2024-06-13 03:26:35,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2442018816. Throughput: 0: 49293.0. Samples: 1970791740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 03:26:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:26:37,210][71000] Updated weights for policy 0, policy_version 149054 (0.0030) [2024-06-13 03:26:40,543][71000] Updated weights for policy 0, policy_version 149064 (0.0028) [2024-06-13 03:26:40,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 2442264576. Throughput: 0: 49423.5. Samples: 1971093120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-13 03:26:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:26:41,635][70980] Signal inference workers to stop experience collection... (29200 times) [2024-06-13 03:26:41,636][70980] Signal inference workers to resume experience collection... 
(29200 times) [2024-06-13 03:26:41,678][71000] InferenceWorker_p0-w0: stopping experience collection (29200 times) [2024-06-13 03:26:41,679][71000] InferenceWorker_p0-w0: resuming experience collection (29200 times) [2024-06-13 03:26:43,878][71000] Updated weights for policy 0, policy_version 149074 (0.0032) [2024-06-13 03:26:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2442510336. Throughput: 0: 49316.5. Samples: 1971386420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 03:26:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:26:47,434][71000] Updated weights for policy 0, policy_version 149084 (0.0030) [2024-06-13 03:26:50,618][71000] Updated weights for policy 0, policy_version 149094 (0.0029) [2024-06-13 03:26:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2442772480. Throughput: 0: 49246.0. Samples: 1971532160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 03:26:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:26:53,903][71000] Updated weights for policy 0, policy_version 149104 (0.0024) [2024-06-13 03:26:55,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.3, 300 sec: 49318.6). Total num frames: 2443018240. Throughput: 0: 49253.9. Samples: 1971831120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 03:26:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:26:56,964][71000] Updated weights for policy 0, policy_version 149114 (0.0031) [2024-06-13 03:27:00,471][71000] Updated weights for policy 0, policy_version 149124 (0.0036) [2024-06-13 03:27:00,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2443247616. Throughput: 0: 49523.1. Samples: 1972132300. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 03:27:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:27:03,711][71000] Updated weights for policy 0, policy_version 149134 (0.0032) [2024-06-13 03:27:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2443493376. Throughput: 0: 49368.9. Samples: 1972278480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 03:27:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:27:07,026][71000] Updated weights for policy 0, policy_version 149144 (0.0031) [2024-06-13 03:27:10,482][71000] Updated weights for policy 0, policy_version 149154 (0.0027) [2024-06-13 03:27:10,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2443771904. Throughput: 0: 49387.0. Samples: 1972570600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 03:27:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:27:13,871][71000] Updated weights for policy 0, policy_version 149164 (0.0026) [2024-06-13 03:27:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2444001280. Throughput: 0: 49277.3. Samples: 1972865940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 03:27:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:27:16,973][71000] Updated weights for policy 0, policy_version 149174 (0.0029) [2024-06-13 03:27:20,559][71000] Updated weights for policy 0, policy_version 149184 (0.0024) [2024-06-13 03:27:20,939][70768] Fps is (10 sec: 45875.6, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2444230656. Throughput: 0: 49347.1. Samples: 1973012360. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 03:27:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:27:23,642][71000] Updated weights for policy 0, policy_version 149194 (0.0023) [2024-06-13 03:27:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2444492800. Throughput: 0: 49290.4. Samples: 1973311180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 03:27:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:27:27,294][71000] Updated weights for policy 0, policy_version 149204 (0.0026) [2024-06-13 03:27:30,489][71000] Updated weights for policy 0, policy_version 149214 (0.0038) [2024-06-13 03:27:30,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2444738560. Throughput: 0: 49235.5. Samples: 1973602020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 03:27:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:27:34,010][71000] Updated weights for policy 0, policy_version 149224 (0.0030) [2024-06-13 03:27:35,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2444967936. Throughput: 0: 49300.2. Samples: 1973750660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 03:27:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:27:36,955][71000] Updated weights for policy 0, policy_version 149234 (0.0028) [2024-06-13 03:27:40,426][71000] Updated weights for policy 0, policy_version 149244 (0.0025) [2024-06-13 03:27:40,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2445213696. Throughput: 0: 49156.2. Samples: 1974043160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 03:27:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:27:41,076][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000149245_2445230080.pth... 
[2024-06-13 03:27:41,120][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000148525_2433433600.pth [2024-06-13 03:27:43,707][71000] Updated weights for policy 0, policy_version 149254 (0.0024) [2024-06-13 03:27:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 49318.7). Total num frames: 2445475840. Throughput: 0: 48899.6. Samples: 1974332780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 03:27:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:27:47,506][71000] Updated weights for policy 0, policy_version 149264 (0.0033) [2024-06-13 03:27:50,503][71000] Updated weights for policy 0, policy_version 149274 (0.0037) [2024-06-13 03:27:50,940][70768] Fps is (10 sec: 50791.4, 60 sec: 49152.2, 300 sec: 49207.6). Total num frames: 2445721600. Throughput: 0: 49014.3. Samples: 1974484120. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 03:27:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:27:53,976][71000] Updated weights for policy 0, policy_version 149284 (0.0038) [2024-06-13 03:27:55,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48605.7, 300 sec: 49040.9). Total num frames: 2445934592. Throughput: 0: 48896.4. Samples: 1974770940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 03:27:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:27:56,448][70980] Signal inference workers to stop experience collection... (29250 times) [2024-06-13 03:27:56,499][71000] InferenceWorker_p0-w0: stopping experience collection (29250 times) [2024-06-13 03:27:56,503][70980] Signal inference workers to resume experience collection... 
(29250 times) [2024-06-13 03:27:56,507][71000] InferenceWorker_p0-w0: resuming experience collection (29250 times) [2024-06-13 03:27:57,261][71000] Updated weights for policy 0, policy_version 149294 (0.0027) [2024-06-13 03:28:00,608][71000] Updated weights for policy 0, policy_version 149304 (0.0033) [2024-06-13 03:28:00,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2446196736. Throughput: 0: 48856.4. Samples: 1975064480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 03:28:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:28:04,059][71000] Updated weights for policy 0, policy_version 149314 (0.0031) [2024-06-13 03:28:05,939][70768] Fps is (10 sec: 52429.7, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2446458880. Throughput: 0: 48923.1. Samples: 1975213900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 03:28:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:28:07,406][71000] Updated weights for policy 0, policy_version 149324 (0.0025) [2024-06-13 03:28:10,604][71000] Updated weights for policy 0, policy_version 149334 (0.0023) [2024-06-13 03:28:10,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2446704640. Throughput: 0: 49051.1. Samples: 1975518480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 03:28:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:28:14,206][71000] Updated weights for policy 0, policy_version 149344 (0.0029) [2024-06-13 03:28:15,939][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 2446917632. Throughput: 0: 49072.1. Samples: 1975810260. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 03:28:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:28:17,446][71000] Updated weights for policy 0, policy_version 149354 (0.0025) [2024-06-13 03:28:20,771][71000] Updated weights for policy 0, policy_version 149364 (0.0027) [2024-06-13 03:28:20,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2447179776. Throughput: 0: 48838.6. Samples: 1975948400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 03:28:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:28:24,030][71000] Updated weights for policy 0, policy_version 149374 (0.0029) [2024-06-13 03:28:25,939][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2447441920. Throughput: 0: 49142.0. Samples: 1976254540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 03:28:25,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 03:28:27,079][71000] Updated weights for policy 0, policy_version 149384 (0.0028) [2024-06-13 03:28:30,474][71000] Updated weights for policy 0, policy_version 149394 (0.0027) [2024-06-13 03:28:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2447687680. Throughput: 0: 49407.1. Samples: 1976556100. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 03:28:30,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 03:28:33,832][71000] Updated weights for policy 0, policy_version 149404 (0.0034) [2024-06-13 03:28:35,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 2447917056. Throughput: 0: 49279.0. Samples: 1976701680. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 03:28:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:28:37,275][71000] Updated weights for policy 0, policy_version 149414 (0.0023) [2024-06-13 03:28:40,489][71000] Updated weights for policy 0, policy_version 149424 (0.0032) [2024-06-13 03:28:40,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 2448179200. Throughput: 0: 49547.2. Samples: 1977000560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 03:28:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:28:44,063][71000] Updated weights for policy 0, policy_version 149434 (0.0021) [2024-06-13 03:28:45,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2448441344. Throughput: 0: 49604.5. Samples: 1977296680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 03:28:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:28:46,951][71000] Updated weights for policy 0, policy_version 149444 (0.0023) [2024-06-13 03:28:50,270][71000] Updated weights for policy 0, policy_version 149454 (0.0025) [2024-06-13 03:28:50,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 2448687104. Throughput: 0: 49743.7. Samples: 1977452380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 03:28:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:28:53,892][71000] Updated weights for policy 0, policy_version 149464 (0.0032) [2024-06-13 03:28:55,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2448900096. Throughput: 0: 49445.5. Samples: 1977743520. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-13 03:28:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:28:57,158][71000] Updated weights for policy 0, policy_version 149474 (0.0019) [2024-06-13 03:29:00,427][71000] Updated weights for policy 0, policy_version 149484 (0.0026) [2024-06-13 03:29:00,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2449162240. Throughput: 0: 49421.7. Samples: 1978034240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-13 03:29:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:29:03,889][71000] Updated weights for policy 0, policy_version 149494 (0.0031) [2024-06-13 03:29:05,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2449408000. Throughput: 0: 49636.5. Samples: 1978182040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-13 03:29:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:29:06,960][70980] Signal inference workers to stop experience collection... (29300 times) [2024-06-13 03:29:07,009][71000] InferenceWorker_p0-w0: stopping experience collection (29300 times) [2024-06-13 03:29:07,015][70980] Signal inference workers to resume experience collection... (29300 times) [2024-06-13 03:29:07,029][71000] InferenceWorker_p0-w0: resuming experience collection (29300 times) [2024-06-13 03:29:07,180][71000] Updated weights for policy 0, policy_version 149504 (0.0044) [2024-06-13 03:29:10,352][71000] Updated weights for policy 0, policy_version 149514 (0.0021) [2024-06-13 03:29:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2449653760. Throughput: 0: 49491.9. Samples: 1978481680. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-13 03:29:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:29:13,602][71000] Updated weights for policy 0, policy_version 149524 (0.0033) [2024-06-13 03:29:15,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2449883136. Throughput: 0: 49264.5. Samples: 1978773000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-13 03:29:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:29:17,201][71000] Updated weights for policy 0, policy_version 149534 (0.0029) [2024-06-13 03:29:20,812][71000] Updated weights for policy 0, policy_version 149544 (0.0038) [2024-06-13 03:29:20,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2450128896. Throughput: 0: 49102.3. Samples: 1978911280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-13 03:29:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:29:23,866][71000] Updated weights for policy 0, policy_version 149554 (0.0029) [2024-06-13 03:29:25,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 2450407424. Throughput: 0: 49263.4. Samples: 1979217420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-13 03:29:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:29:27,202][71000] Updated weights for policy 0, policy_version 149564 (0.0025) [2024-06-13 03:29:30,462][71000] Updated weights for policy 0, policy_version 149574 (0.0031) [2024-06-13 03:29:30,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2450653184. Throughput: 0: 49493.8. Samples: 1979523900. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-13 03:29:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:29:33,932][71000] Updated weights for policy 0, policy_version 149584 (0.0022) [2024-06-13 03:29:35,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2450882560. Throughput: 0: 49151.7. Samples: 1979664200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-13 03:29:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:29:36,963][71000] Updated weights for policy 0, policy_version 149594 (0.0023) [2024-06-13 03:29:40,423][71000] Updated weights for policy 0, policy_version 149604 (0.0025) [2024-06-13 03:29:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2451128320. Throughput: 0: 49156.8. Samples: 1979955580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-13 03:29:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:29:41,050][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000149606_2451144704.pth... [2024-06-13 03:29:41,111][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000148884_2439315456.pth [2024-06-13 03:29:43,842][71000] Updated weights for policy 0, policy_version 149614 (0.0028) [2024-06-13 03:29:45,939][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2451390464. Throughput: 0: 49215.7. Samples: 1980248940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-13 03:29:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:29:47,223][71000] Updated weights for policy 0, policy_version 149624 (0.0027) [2024-06-13 03:29:50,504][71000] Updated weights for policy 0, policy_version 149634 (0.0023) [2024-06-13 03:29:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2451636224. Throughput: 0: 49300.8. Samples: 1980400580. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-13 03:29:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:29:53,799][71000] Updated weights for policy 0, policy_version 149644 (0.0033) [2024-06-13 03:29:55,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2451865600. Throughput: 0: 49326.7. Samples: 1980701380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-13 03:29:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:29:57,194][71000] Updated weights for policy 0, policy_version 149654 (0.0032) [2024-06-13 03:30:00,321][71000] Updated weights for policy 0, policy_version 149664 (0.0029) [2024-06-13 03:30:00,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2452111360. Throughput: 0: 49384.5. Samples: 1980995300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-13 03:30:00,943][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:30:03,426][71000] Updated weights for policy 0, policy_version 149674 (0.0028) [2024-06-13 03:30:03,935][70980] Signal inference workers to stop experience collection... (29350 times) [2024-06-13 03:30:03,955][71000] InferenceWorker_p0-w0: stopping experience collection (29350 times) [2024-06-13 03:30:03,990][70980] Signal inference workers to resume experience collection... (29350 times) [2024-06-13 03:30:04,000][71000] InferenceWorker_p0-w0: resuming experience collection (29350 times) [2024-06-13 03:30:05,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2452389888. Throughput: 0: 49714.9. Samples: 1981148460. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-13 03:30:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:30:06,927][71000] Updated weights for policy 0, policy_version 149684 (0.0026) [2024-06-13 03:30:10,115][71000] Updated weights for policy 0, policy_version 149694 (0.0036) [2024-06-13 03:30:10,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49698.2, 300 sec: 49374.1). Total num frames: 2452635648. Throughput: 0: 49629.5. Samples: 1981450740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-13 03:30:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:30:13,465][71000] Updated weights for policy 0, policy_version 149704 (0.0019) [2024-06-13 03:30:15,939][70768] Fps is (10 sec: 45876.2, 60 sec: 49425.1, 300 sec: 49207.6). Total num frames: 2452848640. Throughput: 0: 49203.2. Samples: 1981738040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-13 03:30:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:30:16,840][71000] Updated weights for policy 0, policy_version 149714 (0.0034) [2024-06-13 03:30:20,382][71000] Updated weights for policy 0, policy_version 149724 (0.0025) [2024-06-13 03:30:20,939][70768] Fps is (10 sec: 45875.5, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2453094400. Throughput: 0: 49199.7. Samples: 1981878180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-13 03:30:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:30:23,349][71000] Updated weights for policy 0, policy_version 149734 (0.0026) [2024-06-13 03:30:25,940][70768] Fps is (10 sec: 52427.5, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2453372928. Throughput: 0: 49340.7. Samples: 1982175920. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-13 03:30:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:30:27,170][71000] Updated weights for policy 0, policy_version 149744 (0.0031) [2024-06-13 03:30:30,107][71000] Updated weights for policy 0, policy_version 149754 (0.0028) [2024-06-13 03:30:30,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2453602304. Throughput: 0: 49361.2. Samples: 1982470200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-13 03:30:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:30:33,514][71000] Updated weights for policy 0, policy_version 149764 (0.0025) [2024-06-13 03:30:35,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2453831680. Throughput: 0: 49373.8. Samples: 1982622400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-13 03:30:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:30:36,729][71000] Updated weights for policy 0, policy_version 149774 (0.0032) [2024-06-13 03:30:40,247][71000] Updated weights for policy 0, policy_version 149784 (0.0030) [2024-06-13 03:30:40,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2454077440. Throughput: 0: 49181.6. Samples: 1982914560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-13 03:30:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:30:43,232][71000] Updated weights for policy 0, policy_version 149794 (0.0024) [2024-06-13 03:30:45,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2454355968. Throughput: 0: 49156.9. Samples: 1983207360. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-13 03:30:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:30:46,620][71000] Updated weights for policy 0, policy_version 149804 (0.0042) [2024-06-13 03:30:50,092][71000] Updated weights for policy 0, policy_version 149814 (0.0028) [2024-06-13 03:30:50,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2454601728. Throughput: 0: 49252.6. Samples: 1983364820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-13 03:30:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:30:53,080][71000] Updated weights for policy 0, policy_version 149824 (0.0028) [2024-06-13 03:30:55,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2454814720. Throughput: 0: 49180.4. Samples: 1983663860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 24.0) [2024-06-13 03:30:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:30:56,601][71000] Updated weights for policy 0, policy_version 149834 (0.0034) [2024-06-13 03:30:59,814][71000] Updated weights for policy 0, policy_version 149844 (0.0023) [2024-06-13 03:31:00,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49424.9, 300 sec: 49263.0). Total num frames: 2455076864. Throughput: 0: 49336.6. Samples: 1983958200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:31:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:31:03,346][71000] Updated weights for policy 0, policy_version 149854 (0.0027) [2024-06-13 03:31:05,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2455355392. Throughput: 0: 49397.6. Samples: 1984101080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:31:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:31:06,386][71000] Updated weights for policy 0, policy_version 149864 (0.0030) [2024-06-13 03:31:09,754][70980] Signal inference workers to stop experience collection... (29400 times) [2024-06-13 03:31:09,757][70980] Signal inference workers to resume experience collection... (29400 times) [2024-06-13 03:31:09,802][71000] InferenceWorker_p0-w0: stopping experience collection (29400 times) [2024-06-13 03:31:09,802][71000] InferenceWorker_p0-w0: resuming experience collection (29400 times) [2024-06-13 03:31:09,889][71000] Updated weights for policy 0, policy_version 149874 (0.0029) [2024-06-13 03:31:10,940][70768] Fps is (10 sec: 49152.9, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2455568384. Throughput: 0: 49353.5. Samples: 1984396820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:31:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 03:31:13,006][71000] Updated weights for policy 0, policy_version 149884 (0.0029) [2024-06-13 03:31:15,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 2455830528. Throughput: 0: 49608.9. Samples: 1984702600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:31:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:31:16,314][71000] Updated weights for policy 0, policy_version 149894 (0.0022) [2024-06-13 03:31:19,564][71000] Updated weights for policy 0, policy_version 149904 (0.0028) [2024-06-13 03:31:20,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2456059904. Throughput: 0: 49397.9. Samples: 1984845300. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:31:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:31:23,056][71000] Updated weights for policy 0, policy_version 149914 (0.0031) [2024-06-13 03:31:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2456322048. Throughput: 0: 49378.8. Samples: 1985136600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:31:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:31:26,182][71000] Updated weights for policy 0, policy_version 149924 (0.0031) [2024-06-13 03:31:29,680][71000] Updated weights for policy 0, policy_version 149934 (0.0027) [2024-06-13 03:31:30,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2456551424. Throughput: 0: 49434.1. Samples: 1985431900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:31:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:31:32,982][71000] Updated weights for policy 0, policy_version 149944 (0.0025) [2024-06-13 03:31:35,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49971.1, 300 sec: 49374.2). Total num frames: 2456829952. Throughput: 0: 49171.0. Samples: 1985577520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:31:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:31:36,267][71000] Updated weights for policy 0, policy_version 149954 (0.0022) [2024-06-13 03:31:39,624][71000] Updated weights for policy 0, policy_version 149964 (0.0024) [2024-06-13 03:31:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2457042944. Throughput: 0: 49309.2. Samples: 1985882780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:31:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:31:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000149967_2457059328.pth... 
[2024-06-13 03:31:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000149245_2445230080.pth [2024-06-13 03:31:42,961][71000] Updated weights for policy 0, policy_version 149974 (0.0029) [2024-06-13 03:31:45,940][70768] Fps is (10 sec: 45875.9, 60 sec: 48878.9, 300 sec: 49207.6). Total num frames: 2457288704. Throughput: 0: 49183.4. Samples: 1986171440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:31:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:31:46,359][71000] Updated weights for policy 0, policy_version 149984 (0.0036) [2024-06-13 03:31:49,585][71000] Updated weights for policy 0, policy_version 149994 (0.0035) [2024-06-13 03:31:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2457534464. Throughput: 0: 49228.4. Samples: 1986316360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:31:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:31:52,881][71000] Updated weights for policy 0, policy_version 150004 (0.0027) [2024-06-13 03:31:55,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 2457796608. Throughput: 0: 49283.2. Samples: 1986614560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:31:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:31:56,113][71000] Updated weights for policy 0, policy_version 150014 (0.0028) [2024-06-13 03:31:59,509][71000] Updated weights for policy 0, policy_version 150024 (0.0030) [2024-06-13 03:32:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2458042368. Throughput: 0: 49055.5. Samples: 1986910100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:32:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:32:03,058][71000] Updated weights for policy 0, policy_version 150034 (0.0037) [2024-06-13 03:32:05,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2458288128. Throughput: 0: 49143.0. Samples: 1987056740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 03:32:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:32:06,474][71000] Updated weights for policy 0, policy_version 150044 (0.0025) [2024-06-13 03:32:09,997][71000] Updated weights for policy 0, policy_version 150054 (0.0031) [2024-06-13 03:32:10,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2458501120. Throughput: 0: 48896.1. Samples: 1987336920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 03:32:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:32:13,217][71000] Updated weights for policy 0, policy_version 150064 (0.0019) [2024-06-13 03:32:15,734][70980] Signal inference workers to stop experience collection... (29450 times) [2024-06-13 03:32:15,734][70980] Signal inference workers to resume experience collection... (29450 times) [2024-06-13 03:32:15,770][71000] InferenceWorker_p0-w0: stopping experience collection (29450 times) [2024-06-13 03:32:15,770][71000] InferenceWorker_p0-w0: resuming experience collection (29450 times) [2024-06-13 03:32:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2458796032. Throughput: 0: 49113.8. Samples: 1987642020. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 03:32:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:32:16,291][71000] Updated weights for policy 0, policy_version 150074 (0.0025) [2024-06-13 03:32:19,799][71000] Updated weights for policy 0, policy_version 150084 (0.0025) [2024-06-13 03:32:20,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2459025408. Throughput: 0: 49323.2. Samples: 1987797060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 03:32:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 03:32:22,864][71000] Updated weights for policy 0, policy_version 150094 (0.0031) [2024-06-13 03:32:25,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2459254784. Throughput: 0: 49238.8. Samples: 1988098520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 03:32:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:32:26,511][71000] Updated weights for policy 0, policy_version 150104 (0.0027) [2024-06-13 03:32:29,741][71000] Updated weights for policy 0, policy_version 150114 (0.0025) [2024-06-13 03:32:30,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 49263.0). Total num frames: 2459500544. Throughput: 0: 49364.7. Samples: 1988392860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 03:32:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:32:33,051][71000] Updated weights for policy 0, policy_version 150124 (0.0027) [2024-06-13 03:32:35,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.1, 300 sec: 49318.7). Total num frames: 2459762688. Throughput: 0: 49279.7. Samples: 1988533940. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 03:32:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:32:36,480][71000] Updated weights for policy 0, policy_version 150134 (0.0024) [2024-06-13 03:32:39,569][71000] Updated weights for policy 0, policy_version 150144 (0.0026) [2024-06-13 03:32:40,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 2460024832. Throughput: 0: 49439.0. Samples: 1988839320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 03:32:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:32:42,875][71000] Updated weights for policy 0, policy_version 150154 (0.0028) [2024-06-13 03:32:45,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2460254208. Throughput: 0: 49499.7. Samples: 1989137580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 03:32:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:32:46,236][71000] Updated weights for policy 0, policy_version 150164 (0.0022) [2024-06-13 03:32:49,578][71000] Updated weights for policy 0, policy_version 150174 (0.0027) [2024-06-13 03:32:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2460499968. Throughput: 0: 49316.1. Samples: 1989275960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 03:32:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:32:53,018][71000] Updated weights for policy 0, policy_version 150184 (0.0028) [2024-06-13 03:32:55,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2460762112. Throughput: 0: 49813.7. Samples: 1989578540. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 03:32:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:32:55,948][71000] Updated weights for policy 0, policy_version 150194 (0.0018) [2024-06-13 03:32:59,335][71000] Updated weights for policy 0, policy_version 150204 (0.0028) [2024-06-13 03:33:00,940][70768] Fps is (10 sec: 55706.0, 60 sec: 50244.3, 300 sec: 49485.2). Total num frames: 2461057024. Throughput: 0: 49884.1. Samples: 1989886800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 03:33:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:33:02,635][71000] Updated weights for policy 0, policy_version 150214 (0.0030) [2024-06-13 03:33:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2461253632. Throughput: 0: 49927.2. Samples: 1990043780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 03:33:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:33:06,099][71000] Updated weights for policy 0, policy_version 150224 (0.0033) [2024-06-13 03:33:06,591][70980] Signal inference workers to stop experience collection... (29500 times) [2024-06-13 03:33:06,645][70980] Signal inference workers to resume experience collection... (29500 times) [2024-06-13 03:33:06,645][71000] InferenceWorker_p0-w0: stopping experience collection (29500 times) [2024-06-13 03:33:06,655][71000] InferenceWorker_p0-w0: resuming experience collection (29500 times) [2024-06-13 03:33:09,051][71000] Updated weights for policy 0, policy_version 150234 (0.0025) [2024-06-13 03:33:10,940][70768] Fps is (10 sec: 44236.5, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 2461499392. Throughput: 0: 49608.9. Samples: 1990330920. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-13 03:33:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 03:33:12,923][71000] Updated weights for policy 0, policy_version 150244 (0.0042) [2024-06-13 03:33:15,916][71000] Updated weights for policy 0, policy_version 150254 (0.0033) [2024-06-13 03:33:15,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2461761536. Throughput: 0: 49536.5. Samples: 1990622000. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-13 03:33:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:33:19,503][71000] Updated weights for policy 0, policy_version 150264 (0.0031) [2024-06-13 03:33:20,940][70768] Fps is (10 sec: 54067.3, 60 sec: 50244.3, 300 sec: 49485.2). Total num frames: 2462040064. Throughput: 0: 49903.5. Samples: 1990779600. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-13 03:33:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 03:33:22,499][71000] Updated weights for policy 0, policy_version 150274 (0.0023) [2024-06-13 03:33:25,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2462220288. Throughput: 0: 49616.9. Samples: 1991072080. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-13 03:33:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:33:26,397][71000] Updated weights for policy 0, policy_version 150284 (0.0022) [2024-06-13 03:33:29,092][71000] Updated weights for policy 0, policy_version 150294 (0.0032) [2024-06-13 03:33:30,940][70768] Fps is (10 sec: 42597.8, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2462466048. Throughput: 0: 49405.9. Samples: 1991360860. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-13 03:33:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:33:33,089][71000] Updated weights for policy 0, policy_version 150304 (0.0029) [2024-06-13 03:33:35,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2462728192. Throughput: 0: 49311.2. Samples: 1991494960. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-13 03:33:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:33:36,224][71000] Updated weights for policy 0, policy_version 150314 (0.0030) [2024-06-13 03:33:39,827][71000] Updated weights for policy 0, policy_version 150324 (0.0029) [2024-06-13 03:33:40,940][70768] Fps is (10 sec: 55705.9, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 2463023104. Throughput: 0: 49425.3. Samples: 1991802680. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-13 03:33:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:33:40,957][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000150331_2463023104.pth... [2024-06-13 03:33:41,006][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000149606_2451144704.pth [2024-06-13 03:33:42,473][71000] Updated weights for policy 0, policy_version 150334 (0.0023) [2024-06-13 03:33:45,940][70768] Fps is (10 sec: 45874.5, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2463186944. Throughput: 0: 49257.6. Samples: 1992103400. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-13 03:33:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:33:46,316][71000] Updated weights for policy 0, policy_version 150344 (0.0027) [2024-06-13 03:33:48,866][71000] Updated weights for policy 0, policy_version 150354 (0.0025) [2024-06-13 03:33:50,940][70768] Fps is (10 sec: 44237.0, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 2463465472. Throughput: 0: 48789.3. Samples: 1992239300. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-13 03:33:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:33:52,925][71000] Updated weights for policy 0, policy_version 150364 (0.0035) [2024-06-13 03:33:55,521][71000] Updated weights for policy 0, policy_version 150374 (0.0034) [2024-06-13 03:33:55,940][70768] Fps is (10 sec: 54067.6, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2463727616. Throughput: 0: 48896.9. Samples: 1992531280. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-13 03:33:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:33:59,339][70980] Signal inference workers to stop experience collection... (29550 times) [2024-06-13 03:33:59,382][71000] InferenceWorker_p0-w0: stopping experience collection (29550 times) [2024-06-13 03:33:59,452][70980] Signal inference workers to resume experience collection... (29550 times) [2024-06-13 03:33:59,452][71000] InferenceWorker_p0-w0: resuming experience collection (29550 times) [2024-06-13 03:33:59,596][71000] Updated weights for policy 0, policy_version 150384 (0.0029) [2024-06-13 03:34:00,939][70768] Fps is (10 sec: 54067.6, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2464006144. Throughput: 0: 49078.8. Samples: 1992830540. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-13 03:34:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:34:02,093][71000] Updated weights for policy 0, policy_version 150394 (0.0042) [2024-06-13 03:34:05,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2464186368. Throughput: 0: 49179.6. Samples: 1992992680. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-13 03:34:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:34:06,083][71000] Updated weights for policy 0, policy_version 150404 (0.0030) [2024-06-13 03:34:08,323][71000] Updated weights for policy 0, policy_version 150414 (0.0027) [2024-06-13 03:34:10,940][70768] Fps is (10 sec: 45874.4, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2464464896. Throughput: 0: 49278.1. Samples: 1993289600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-13 03:34:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:34:12,534][71000] Updated weights for policy 0, policy_version 150424 (0.0032) [2024-06-13 03:34:15,282][71000] Updated weights for policy 0, policy_version 150434 (0.0024) [2024-06-13 03:34:15,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2464710656. Throughput: 0: 49302.7. Samples: 1993579480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-13 03:34:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:34:19,302][71000] Updated weights for policy 0, policy_version 150444 (0.0024) [2024-06-13 03:34:20,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2464989184. Throughput: 0: 49832.8. Samples: 1993737440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-13 03:34:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:34:21,791][71000] Updated weights for policy 0, policy_version 150454 (0.0030) [2024-06-13 03:34:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2465185792. Throughput: 0: 49656.4. Samples: 1994037220. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-13 03:34:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:34:25,959][71000] Updated weights for policy 0, policy_version 150464 (0.0024) [2024-06-13 03:34:28,461][71000] Updated weights for policy 0, policy_version 150474 (0.0024) [2024-06-13 03:34:30,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49698.2, 300 sec: 49374.1). Total num frames: 2465447936. Throughput: 0: 49632.9. Samples: 1994336880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-13 03:34:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:34:32,330][71000] Updated weights for policy 0, policy_version 150484 (0.0030) [2024-06-13 03:34:34,883][71000] Updated weights for policy 0, policy_version 150494 (0.0025) [2024-06-13 03:34:35,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2465693696. Throughput: 0: 49796.5. Samples: 1994480140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-13 03:34:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:34:39,006][71000] Updated weights for policy 0, policy_version 150504 (0.0029) [2024-06-13 03:34:40,940][70768] Fps is (10 sec: 54067.7, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2465988608. Throughput: 0: 49944.0. Samples: 1994778760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-13 03:34:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:34:41,739][71000] Updated weights for policy 0, policy_version 150514 (0.0025) [2024-06-13 03:34:45,509][71000] Updated weights for policy 0, policy_version 150524 (0.0028) [2024-06-13 03:34:45,940][70768] Fps is (10 sec: 50789.8, 60 sec: 50244.2, 300 sec: 49374.1). Total num frames: 2466201600. Throughput: 0: 49993.6. Samples: 1995080260. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-13 03:34:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:34:46,418][70980] Signal inference workers to stop experience collection... (29600 times) [2024-06-13 03:34:46,440][71000] InferenceWorker_p0-w0: stopping experience collection (29600 times) [2024-06-13 03:34:46,473][70980] Signal inference workers to resume experience collection... (29600 times) [2024-06-13 03:34:46,474][71000] InferenceWorker_p0-w0: resuming experience collection (29600 times) [2024-06-13 03:34:48,352][71000] Updated weights for policy 0, policy_version 150534 (0.0029) [2024-06-13 03:34:50,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2466447360. Throughput: 0: 49544.0. Samples: 1995222160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-13 03:34:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 03:34:51,960][71000] Updated weights for policy 0, policy_version 150544 (0.0029) [2024-06-13 03:34:54,569][71000] Updated weights for policy 0, policy_version 150554 (0.0026) [2024-06-13 03:34:55,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49424.9, 300 sec: 49429.7). Total num frames: 2466693120. Throughput: 0: 49574.1. Samples: 1995520440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-13 03:34:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:34:58,598][71000] Updated weights for policy 0, policy_version 150564 (0.0023) [2024-06-13 03:35:00,940][70768] Fps is (10 sec: 54066.8, 60 sec: 49698.1, 300 sec: 49485.3). Total num frames: 2466988032. Throughput: 0: 49704.5. Samples: 1995816180. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-13 03:35:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:35:01,211][71000] Updated weights for policy 0, policy_version 150574 (0.0030) [2024-06-13 03:35:05,323][71000] Updated weights for policy 0, policy_version 150584 (0.0030) [2024-06-13 03:35:05,940][70768] Fps is (10 sec: 50791.7, 60 sec: 50244.2, 300 sec: 49374.2). Total num frames: 2467201024. Throughput: 0: 49834.3. Samples: 1995979980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-13 03:35:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:35:08,030][71000] Updated weights for policy 0, policy_version 150594 (0.0034) [2024-06-13 03:35:10,940][70768] Fps is (10 sec: 44236.9, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2467430400. Throughput: 0: 49785.0. Samples: 1996277540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-13 03:35:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:35:11,770][71000] Updated weights for policy 0, policy_version 150604 (0.0031) [2024-06-13 03:35:14,391][71000] Updated weights for policy 0, policy_version 150614 (0.0026) [2024-06-13 03:35:15,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2467676160. Throughput: 0: 49653.5. Samples: 1996571280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 03:35:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:35:18,310][71000] Updated weights for policy 0, policy_version 150624 (0.0027) [2024-06-13 03:35:20,940][70768] Fps is (10 sec: 54064.7, 60 sec: 49697.8, 300 sec: 49485.2). Total num frames: 2467971072. Throughput: 0: 49777.3. Samples: 1996720140. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 03:35:20,941][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:35:21,190][71000] Updated weights for policy 0, policy_version 150634 (0.0025) [2024-06-13 03:35:24,913][71000] Updated weights for policy 0, policy_version 150644 (0.0037) [2024-06-13 03:35:25,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 2468184064. Throughput: 0: 49707.4. Samples: 1997015600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 03:35:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:35:27,905][71000] Updated weights for policy 0, policy_version 150654 (0.0023) [2024-06-13 03:35:30,940][70768] Fps is (10 sec: 45877.2, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 2468429824. Throughput: 0: 49708.1. Samples: 1997317120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 03:35:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:35:31,607][71000] Updated weights for policy 0, policy_version 150664 (0.0026) [2024-06-13 03:35:34,358][71000] Updated weights for policy 0, policy_version 150674 (0.0029) [2024-06-13 03:35:35,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49698.1, 300 sec: 49485.3). Total num frames: 2468675584. Throughput: 0: 49761.3. Samples: 1997461420. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 03:35:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:35:38,136][71000] Updated weights for policy 0, policy_version 150684 (0.0030) [2024-06-13 03:35:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2468937728. Throughput: 0: 49681.5. Samples: 1997756100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 03:35:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:35:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000150692_2468937728.pth... 
[2024-06-13 03:35:40,995][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000149967_2457059328.pth [2024-06-13 03:35:41,334][71000] Updated weights for policy 0, policy_version 150694 (0.0034) [2024-06-13 03:35:44,851][71000] Updated weights for policy 0, policy_version 150704 (0.0018) [2024-06-13 03:35:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2469183488. Throughput: 0: 49576.4. Samples: 1998047120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 03:35:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:35:47,760][71000] Updated weights for policy 0, policy_version 150714 (0.0028) [2024-06-13 03:35:50,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2469412864. Throughput: 0: 49247.1. Samples: 1998196100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 03:35:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 03:35:51,515][71000] Updated weights for policy 0, policy_version 150724 (0.0031) [2024-06-13 03:35:53,667][70980] Signal inference workers to stop experience collection... (29650 times) [2024-06-13 03:35:53,668][70980] Signal inference workers to resume experience collection... (29650 times) [2024-06-13 03:35:53,710][71000] InferenceWorker_p0-w0: stopping experience collection (29650 times) [2024-06-13 03:35:53,710][71000] InferenceWorker_p0-w0: resuming experience collection (29650 times) [2024-06-13 03:35:54,722][71000] Updated weights for policy 0, policy_version 150734 (0.0032) [2024-06-13 03:35:55,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2469658624. Throughput: 0: 49008.4. Samples: 1998482920. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 03:35:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:35:58,355][71000] Updated weights for policy 0, policy_version 150744 (0.0029) [2024-06-13 03:36:00,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2469937152. Throughput: 0: 49165.5. Samples: 1998783740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 03:36:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:36:01,565][71000] Updated weights for policy 0, policy_version 150754 (0.0028) [2024-06-13 03:36:04,754][71000] Updated weights for policy 0, policy_version 150764 (0.0032) [2024-06-13 03:36:05,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2470150144. Throughput: 0: 49165.5. Samples: 1998932560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 03:36:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:36:07,870][71000] Updated weights for policy 0, policy_version 150774 (0.0027) [2024-06-13 03:36:10,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49697.9, 300 sec: 49429.7). Total num frames: 2470412288. Throughput: 0: 49165.2. Samples: 1999228040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 03:36:10,949][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 03:36:11,355][71000] Updated weights for policy 0, policy_version 150784 (0.0039) [2024-06-13 03:36:14,689][71000] Updated weights for policy 0, policy_version 150794 (0.0024) [2024-06-13 03:36:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2470658048. Throughput: 0: 49063.6. Samples: 1999524980. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 03:36:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:36:18,040][71000] Updated weights for policy 0, policy_version 150804 (0.0033) [2024-06-13 03:36:20,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48879.2, 300 sec: 49429.7). Total num frames: 2470903808. Throughput: 0: 49069.7. Samples: 1999669560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:36:20,949][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:36:21,493][71000] Updated weights for policy 0, policy_version 150814 (0.0035) [2024-06-13 03:36:24,781][71000] Updated weights for policy 0, policy_version 150824 (0.0034) [2024-06-13 03:36:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2471149568. Throughput: 0: 49159.6. Samples: 1999968280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:36:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:36:27,873][71000] Updated weights for policy 0, policy_version 150834 (0.0030) [2024-06-13 03:36:30,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2471395328. Throughput: 0: 49347.2. Samples: 2000267740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:36:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:36:31,300][71000] Updated weights for policy 0, policy_version 150844 (0.0035) [2024-06-13 03:36:34,719][71000] Updated weights for policy 0, policy_version 150854 (0.0034) [2024-06-13 03:36:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2471641088. Throughput: 0: 49305.7. Samples: 2000414860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:36:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:36:37,697][71000] Updated weights for policy 0, policy_version 150864 (0.0028) [2024-06-13 03:36:40,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 2471870464. Throughput: 0: 49393.2. Samples: 2000705620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:36:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:36:41,302][71000] Updated weights for policy 0, policy_version 150874 (0.0035) [2024-06-13 03:36:44,216][71000] Updated weights for policy 0, policy_version 150884 (0.0028) [2024-06-13 03:36:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2472132608. Throughput: 0: 49372.9. Samples: 2001005520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:36:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:36:48,114][71000] Updated weights for policy 0, policy_version 150894 (0.0029) [2024-06-13 03:36:50,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2472378368. Throughput: 0: 49365.2. Samples: 2001154000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:36:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:36:51,210][71000] Updated weights for policy 0, policy_version 150904 (0.0021) [2024-06-13 03:36:54,686][71000] Updated weights for policy 0, policy_version 150914 (0.0030) [2024-06-13 03:36:55,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2472640512. Throughput: 0: 49434.4. Samples: 2001452580. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:36:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:36:57,810][71000] Updated weights for policy 0, policy_version 150924 (0.0025) [2024-06-13 03:37:00,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48606.0, 300 sec: 49374.2). Total num frames: 2472853504. Throughput: 0: 49551.1. Samples: 2001754780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:37:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:37:01,249][71000] Updated weights for policy 0, policy_version 150934 (0.0027) [2024-06-13 03:37:04,261][71000] Updated weights for policy 0, policy_version 150944 (0.0035) [2024-06-13 03:37:05,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2473115648. Throughput: 0: 49346.3. Samples: 2001890140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:37:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:37:08,044][70980] Signal inference workers to stop experience collection... (29700 times) [2024-06-13 03:37:08,046][70980] Signal inference workers to resume experience collection... (29700 times) [2024-06-13 03:37:08,058][71000] Updated weights for policy 0, policy_version 150954 (0.0024) [2024-06-13 03:37:08,086][71000] InferenceWorker_p0-w0: stopping experience collection (29700 times) [2024-06-13 03:37:08,086][71000] InferenceWorker_p0-w0: resuming experience collection (29700 times) [2024-06-13 03:37:10,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2473377792. Throughput: 0: 49288.3. Samples: 2002186260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:37:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:37:11,186][71000] Updated weights for policy 0, policy_version 150964 (0.0027) [2024-06-13 03:37:14,842][71000] Updated weights for policy 0, policy_version 150974 (0.0025) [2024-06-13 03:37:15,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2473623552. Throughput: 0: 49030.6. Samples: 2002474120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:37:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:37:18,233][71000] Updated weights for policy 0, policy_version 150984 (0.0044) [2024-06-13 03:37:20,940][70768] Fps is (10 sec: 44237.2, 60 sec: 48605.9, 300 sec: 49374.2). Total num frames: 2473820160. Throughput: 0: 49092.5. Samples: 2002624020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 03:37:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:37:21,638][71000] Updated weights for policy 0, policy_version 150994 (0.0026) [2024-06-13 03:37:24,763][71000] Updated weights for policy 0, policy_version 151004 (0.0035) [2024-06-13 03:37:25,940][70768] Fps is (10 sec: 44237.0, 60 sec: 48605.9, 300 sec: 49374.2). Total num frames: 2474065920. Throughput: 0: 48969.5. Samples: 2002909240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 03:37:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:37:28,104][71000] Updated weights for policy 0, policy_version 151014 (0.0027) [2024-06-13 03:37:30,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 2474360832. Throughput: 0: 48966.2. Samples: 2003209000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 03:37:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:37:31,241][71000] Updated weights for policy 0, policy_version 151024 (0.0027) [2024-06-13 03:37:34,780][71000] Updated weights for policy 0, policy_version 151034 (0.0024) [2024-06-13 03:37:35,939][70768] Fps is (10 sec: 54067.5, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2474606592. Throughput: 0: 49441.8. Samples: 2003378880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 03:37:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:37:37,909][71000] Updated weights for policy 0, policy_version 151044 (0.0025) [2024-06-13 03:37:40,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49152.1, 300 sec: 49374.1). Total num frames: 2474819584. Throughput: 0: 49085.0. Samples: 2003661400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 03:37:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:37:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000151051_2474819584.pth... [2024-06-13 03:37:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000150331_2463023104.pth [2024-06-13 03:37:41,607][71000] Updated weights for policy 0, policy_version 151054 (0.0030) [2024-06-13 03:37:44,907][71000] Updated weights for policy 0, policy_version 151064 (0.0027) [2024-06-13 03:37:45,939][70768] Fps is (10 sec: 44236.7, 60 sec: 48606.0, 300 sec: 49318.6). Total num frames: 2475048960. Throughput: 0: 48555.1. Samples: 2003939760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 03:37:45,942][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:37:48,324][71000] Updated weights for policy 0, policy_version 151074 (0.0037) [2024-06-13 03:37:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2475327488. Throughput: 0: 48905.3. Samples: 2004090880. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 03:37:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:37:51,351][71000] Updated weights for policy 0, policy_version 151084 (0.0026) [2024-06-13 03:37:54,859][71000] Updated weights for policy 0, policy_version 151094 (0.0025) [2024-06-13 03:37:55,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2475573248. Throughput: 0: 49021.9. Samples: 2004392240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 03:37:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:37:58,113][71000] Updated weights for policy 0, policy_version 151104 (0.0031) [2024-06-13 03:38:00,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2475786240. Throughput: 0: 49161.8. Samples: 2004686400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 03:38:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:38:01,863][71000] Updated weights for policy 0, policy_version 151114 (0.0032) [2024-06-13 03:38:04,870][71000] Updated weights for policy 0, policy_version 151124 (0.0026) [2024-06-13 03:38:05,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.9, 300 sec: 49263.1). Total num frames: 2476032000. Throughput: 0: 48703.1. Samples: 2004815660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 03:38:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:38:08,390][71000] Updated weights for policy 0, policy_version 151134 (0.0030) [2024-06-13 03:38:09,240][70980] Signal inference workers to stop experience collection... (29750 times) [2024-06-13 03:38:09,281][71000] InferenceWorker_p0-w0: stopping experience collection (29750 times) [2024-06-13 03:38:09,293][70980] Signal inference workers to resume experience collection... 
(29750 times) [2024-06-13 03:38:09,305][71000] InferenceWorker_p0-w0: resuming experience collection (29750 times) [2024-06-13 03:38:10,940][70768] Fps is (10 sec: 52428.1, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 2476310528. Throughput: 0: 49006.0. Samples: 2005114520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 03:38:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:38:11,444][71000] Updated weights for policy 0, policy_version 151144 (0.0027) [2024-06-13 03:38:15,256][71000] Updated weights for policy 0, policy_version 151154 (0.0025) [2024-06-13 03:38:15,940][70768] Fps is (10 sec: 52428.3, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2476556288. Throughput: 0: 49107.6. Samples: 2005418840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 03:38:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:38:17,881][71000] Updated weights for policy 0, policy_version 151164 (0.0021) [2024-06-13 03:38:20,939][70768] Fps is (10 sec: 45876.2, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2476769280. Throughput: 0: 48574.2. Samples: 2005564720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 03:38:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:38:21,727][71000] Updated weights for policy 0, policy_version 151174 (0.0029) [2024-06-13 03:38:24,600][71000] Updated weights for policy 0, policy_version 151184 (0.0025) [2024-06-13 03:38:25,940][70768] Fps is (10 sec: 44237.2, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2476998656. Throughput: 0: 48799.1. Samples: 2005857360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-13 03:38:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:38:28,347][71000] Updated weights for policy 0, policy_version 151194 (0.0028) [2024-06-13 03:38:30,940][70768] Fps is (10 sec: 52428.3, 60 sec: 48879.0, 300 sec: 49374.1). Total num frames: 2477293568. Throughput: 0: 49072.8. Samples: 2006148040. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 03:38:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:38:31,603][71000] Updated weights for policy 0, policy_version 151204 (0.0031) [2024-06-13 03:38:35,295][71000] Updated weights for policy 0, policy_version 151214 (0.0030) [2024-06-13 03:38:35,939][70768] Fps is (10 sec: 54067.5, 60 sec: 48878.9, 300 sec: 49207.6). Total num frames: 2477539328. Throughput: 0: 49165.4. Samples: 2006303320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 03:38:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:38:38,134][71000] Updated weights for policy 0, policy_version 151224 (0.0031) [2024-06-13 03:38:40,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.9, 300 sec: 49374.2). Total num frames: 2477752320. Throughput: 0: 49034.1. Samples: 2006598780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 03:38:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:38:41,809][71000] Updated weights for policy 0, policy_version 151234 (0.0025) [2024-06-13 03:38:44,478][71000] Updated weights for policy 0, policy_version 151244 (0.0026) [2024-06-13 03:38:45,939][70768] Fps is (10 sec: 44236.7, 60 sec: 48878.9, 300 sec: 49207.6). Total num frames: 2477981696. Throughput: 0: 48908.5. Samples: 2006887280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 03:38:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:38:48,381][71000] Updated weights for policy 0, policy_version 151254 (0.0025) [2024-06-13 03:38:50,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2478276608. Throughput: 0: 49299.8. Samples: 2007034160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 03:38:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:38:51,710][71000] Updated weights for policy 0, policy_version 151264 (0.0028) [2024-06-13 03:38:55,367][71000] Updated weights for policy 0, policy_version 151274 (0.0023) [2024-06-13 03:38:55,940][70768] Fps is (10 sec: 52427.9, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2478505984. Throughput: 0: 49187.6. Samples: 2007327960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 03:38:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:38:58,291][71000] Updated weights for policy 0, policy_version 151284 (0.0036) [2024-06-13 03:39:00,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2478735360. Throughput: 0: 48955.6. Samples: 2007621840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 03:39:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:39:01,933][71000] Updated weights for policy 0, policy_version 151294 (0.0024) [2024-06-13 03:39:04,557][71000] Updated weights for policy 0, policy_version 151304 (0.0025) [2024-06-13 03:39:05,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2478981120. Throughput: 0: 48833.6. Samples: 2007762240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 03:39:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:39:08,537][71000] Updated weights for policy 0, policy_version 151314 (0.0034) [2024-06-13 03:39:10,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2479243264. Throughput: 0: 49058.2. Samples: 2008064980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 03:39:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:39:11,381][71000] Updated weights for policy 0, policy_version 151324 (0.0041) [2024-06-13 03:39:14,911][70980] Signal inference workers to stop experience collection... 
(29800 times) [2024-06-13 03:39:14,911][70980] Signal inference workers to resume experience collection... (29800 times) [2024-06-13 03:39:14,957][71000] InferenceWorker_p0-w0: stopping experience collection (29800 times) [2024-06-13 03:39:14,957][71000] InferenceWorker_p0-w0: resuming experience collection (29800 times) [2024-06-13 03:39:15,510][71000] Updated weights for policy 0, policy_version 151334 (0.0030) [2024-06-13 03:39:15,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 2479472640. Throughput: 0: 48918.3. Samples: 2008349360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 03:39:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:39:18,551][71000] Updated weights for policy 0, policy_version 151344 (0.0029) [2024-06-13 03:39:20,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49151.8, 300 sec: 49263.1). Total num frames: 2479718400. Throughput: 0: 48696.6. Samples: 2008494680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 03:39:20,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 03:39:22,121][71000] Updated weights for policy 0, policy_version 151354 (0.0031) [2024-06-13 03:39:24,679][71000] Updated weights for policy 0, policy_version 151364 (0.0025) [2024-06-13 03:39:25,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2479964160. Throughput: 0: 48806.2. Samples: 2008795060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 03:39:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:39:28,516][71000] Updated weights for policy 0, policy_version 151374 (0.0030) [2024-06-13 03:39:30,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2480242688. Throughput: 0: 49044.7. Samples: 2009094300. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 03:39:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:39:31,482][71000] Updated weights for policy 0, policy_version 151384 (0.0034) [2024-06-13 03:39:35,033][71000] Updated weights for policy 0, policy_version 151394 (0.0027) [2024-06-13 03:39:35,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 2480455680. Throughput: 0: 49254.8. Samples: 2009250620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 03:39:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:39:38,027][71000] Updated weights for policy 0, policy_version 151404 (0.0025) [2024-06-13 03:39:40,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2480701440. Throughput: 0: 49217.9. Samples: 2009542760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 03:39:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:39:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000151410_2480701440.pth... [2024-06-13 03:39:40,993][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000150692_2468937728.pth [2024-06-13 03:39:41,662][71000] Updated weights for policy 0, policy_version 151414 (0.0025) [2024-06-13 03:39:44,806][71000] Updated weights for policy 0, policy_version 151424 (0.0028) [2024-06-13 03:39:45,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2480947200. Throughput: 0: 49062.8. Samples: 2009829660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 03:39:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:39:48,397][71000] Updated weights for policy 0, policy_version 151434 (0.0034) [2024-06-13 03:39:50,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.1, 300 sec: 49207.6). Total num frames: 2481209344. Throughput: 0: 49285.9. Samples: 2009980100. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 03:39:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:39:51,471][71000] Updated weights for policy 0, policy_version 151444 (0.0032) [2024-06-13 03:39:55,151][71000] Updated weights for policy 0, policy_version 151454 (0.0029) [2024-06-13 03:39:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 2481438720. Throughput: 0: 49139.7. Samples: 2010276260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 03:39:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:39:58,072][71000] Updated weights for policy 0, policy_version 151464 (0.0033) [2024-06-13 03:40:00,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2481684480. Throughput: 0: 49455.2. Samples: 2010574840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 03:40:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:40:01,650][71000] Updated weights for policy 0, policy_version 151474 (0.0027) [2024-06-13 03:40:04,762][71000] Updated weights for policy 0, policy_version 151484 (0.0033) [2024-06-13 03:40:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2481930240. Throughput: 0: 49402.9. Samples: 2010717800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 03:40:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 03:40:08,537][71000] Updated weights for policy 0, policy_version 151494 (0.0024) [2024-06-13 03:40:10,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2482192384. Throughput: 0: 49152.5. Samples: 2011006920. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 03:40:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:40:11,400][71000] Updated weights for policy 0, policy_version 151504 (0.0034) [2024-06-13 03:40:15,383][71000] Updated weights for policy 0, policy_version 151514 (0.0032) [2024-06-13 03:40:15,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 48985.5). Total num frames: 2482421760. Throughput: 0: 49123.7. Samples: 2011304860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 03:40:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:40:18,289][71000] Updated weights for policy 0, policy_version 151524 (0.0034) [2024-06-13 03:40:20,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2482667520. Throughput: 0: 48730.7. Samples: 2011443500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 03:40:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:40:21,971][71000] Updated weights for policy 0, policy_version 151534 (0.0029) [2024-06-13 03:40:24,187][70980] Signal inference workers to stop experience collection... (29850 times) [2024-06-13 03:40:24,189][70980] Signal inference workers to resume experience collection... (29850 times) [2024-06-13 03:40:24,204][71000] InferenceWorker_p0-w0: stopping experience collection (29850 times) [2024-06-13 03:40:24,231][71000] InferenceWorker_p0-w0: resuming experience collection (29850 times) [2024-06-13 03:40:24,949][71000] Updated weights for policy 0, policy_version 151544 (0.0024) [2024-06-13 03:40:25,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2482929664. Throughput: 0: 48726.3. Samples: 2011735440. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 03:40:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:40:28,858][71000] Updated weights for policy 0, policy_version 151554 (0.0022) [2024-06-13 03:40:30,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48606.0, 300 sec: 49096.5). Total num frames: 2483159040. Throughput: 0: 49034.6. Samples: 2012036220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-13 03:40:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:40:31,714][71000] Updated weights for policy 0, policy_version 151564 (0.0030) [2024-06-13 03:40:35,392][71000] Updated weights for policy 0, policy_version 151574 (0.0028) [2024-06-13 03:40:35,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2483404800. Throughput: 0: 48923.9. Samples: 2012181680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 03:40:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:40:38,162][71000] Updated weights for policy 0, policy_version 151584 (0.0038) [2024-06-13 03:40:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2483634176. Throughput: 0: 48765.7. Samples: 2012470720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 03:40:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:40:41,933][71000] Updated weights for policy 0, policy_version 151594 (0.0026) [2024-06-13 03:40:45,071][71000] Updated weights for policy 0, policy_version 151604 (0.0029) [2024-06-13 03:40:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 2483912704. Throughput: 0: 48581.5. Samples: 2012761020. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 03:40:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:40:48,753][71000] Updated weights for policy 0, policy_version 151614 (0.0043) [2024-06-13 03:40:50,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2484125696. Throughput: 0: 48797.4. Samples: 2012913680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 03:40:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:40:51,783][71000] Updated weights for policy 0, policy_version 151624 (0.0031) [2024-06-13 03:40:55,391][71000] Updated weights for policy 0, policy_version 151634 (0.0029) [2024-06-13 03:40:55,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 2484387840. Throughput: 0: 49061.3. Samples: 2013214680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 03:40:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:40:58,646][71000] Updated weights for policy 0, policy_version 151644 (0.0035) [2024-06-13 03:41:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2484617216. Throughput: 0: 48892.4. Samples: 2013505020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 03:41:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:41:02,283][71000] Updated weights for policy 0, policy_version 151654 (0.0030) [2024-06-13 03:41:05,426][71000] Updated weights for policy 0, policy_version 151664 (0.0037) [2024-06-13 03:41:05,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.0, 300 sec: 49041.0). Total num frames: 2484879360. Throughput: 0: 49129.9. Samples: 2013654340. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 03:41:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:41:08,877][71000] Updated weights for policy 0, policy_version 151674 (0.0040) [2024-06-13 03:41:10,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 2485108736. Throughput: 0: 48827.9. Samples: 2013932700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 03:41:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:41:12,099][71000] Updated weights for policy 0, policy_version 151684 (0.0022) [2024-06-13 03:41:15,571][71000] Updated weights for policy 0, policy_version 151694 (0.0029) [2024-06-13 03:41:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2485370880. Throughput: 0: 48907.5. Samples: 2014237060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 03:41:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:41:18,715][71000] Updated weights for policy 0, policy_version 151704 (0.0030) [2024-06-13 03:41:20,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2485616640. Throughput: 0: 48948.1. Samples: 2014384340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 03:41:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:41:22,080][71000] Updated weights for policy 0, policy_version 151714 (0.0026) [2024-06-13 03:41:25,233][71000] Updated weights for policy 0, policy_version 151724 (0.0034) [2024-06-13 03:41:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2485862400. Throughput: 0: 49232.0. Samples: 2014686160. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 03:41:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:41:28,937][71000] Updated weights for policy 0, policy_version 151734 (0.0028) [2024-06-13 03:41:30,941][70768] Fps is (10 sec: 45869.7, 60 sec: 48604.9, 300 sec: 48929.7). Total num frames: 2486075392. Throughput: 0: 48995.8. Samples: 2014965880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 03:41:30,941][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:41:32,276][71000] Updated weights for policy 0, policy_version 151744 (0.0037) [2024-06-13 03:41:35,839][71000] Updated weights for policy 0, policy_version 151754 (0.0033) [2024-06-13 03:41:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2486337536. Throughput: 0: 48791.9. Samples: 2015109320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 03:41:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:41:38,748][71000] Updated weights for policy 0, policy_version 151764 (0.0028) [2024-06-13 03:41:40,940][70768] Fps is (10 sec: 52434.8, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 2486599680. Throughput: 0: 48839.2. Samples: 2015412440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 03:41:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:41:41,019][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000151771_2486616064.pth... [2024-06-13 03:41:41,063][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000151051_2474819584.pth [2024-06-13 03:41:42,015][71000] Updated weights for policy 0, policy_version 151774 (0.0022) [2024-06-13 03:41:43,193][70980] Signal inference workers to stop experience collection... (29900 times) [2024-06-13 03:41:43,235][71000] InferenceWorker_p0-w0: stopping experience collection (29900 times) [2024-06-13 03:41:43,254][70980] Signal inference workers to resume experience collection... 
(29900 times) [2024-06-13 03:41:43,255][71000] InferenceWorker_p0-w0: resuming experience collection (29900 times) [2024-06-13 03:41:45,481][71000] Updated weights for policy 0, policy_version 151784 (0.0027) [2024-06-13 03:41:45,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49152.2, 300 sec: 49096.5). Total num frames: 2486861824. Throughput: 0: 49172.9. Samples: 2015717800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 03:41:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:41:48,586][71000] Updated weights for policy 0, policy_version 151794 (0.0032) [2024-06-13 03:41:50,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 2487058432. Throughput: 0: 49031.9. Samples: 2015860780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 03:41:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:41:52,133][71000] Updated weights for policy 0, policy_version 151804 (0.0024) [2024-06-13 03:41:55,645][71000] Updated weights for policy 0, policy_version 151814 (0.0030) [2024-06-13 03:41:55,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2487320576. Throughput: 0: 49137.0. Samples: 2016143860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 03:41:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:41:58,804][71000] Updated weights for policy 0, policy_version 151824 (0.0025) [2024-06-13 03:42:00,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2487582720. Throughput: 0: 48997.0. Samples: 2016441920. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 03:42:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:42:02,111][71000] Updated weights for policy 0, policy_version 151834 (0.0036) [2024-06-13 03:42:05,511][71000] Updated weights for policy 0, policy_version 151844 (0.0040) [2024-06-13 03:42:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 2487828480. Throughput: 0: 49154.5. Samples: 2016596300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 03:42:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:42:08,968][71000] Updated weights for policy 0, policy_version 151854 (0.0026) [2024-06-13 03:42:10,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 2488025088. Throughput: 0: 48667.1. Samples: 2016876180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 03:42:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:42:12,478][71000] Updated weights for policy 0, policy_version 151864 (0.0024) [2024-06-13 03:42:15,563][71000] Updated weights for policy 0, policy_version 151874 (0.0025) [2024-06-13 03:42:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2488320000. Throughput: 0: 48843.8. Samples: 2017163800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 03:42:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 03:42:19,025][71000] Updated weights for policy 0, policy_version 151884 (0.0025) [2024-06-13 03:42:20,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2488549376. Throughput: 0: 49156.9. Samples: 2017321380. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 03:42:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:42:22,295][71000] Updated weights for policy 0, policy_version 151894 (0.0027) [2024-06-13 03:42:25,811][71000] Updated weights for policy 0, policy_version 151904 (0.0029) [2024-06-13 03:42:25,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 2488795136. Throughput: 0: 49096.5. Samples: 2017621780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 03:42:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:42:28,907][71000] Updated weights for policy 0, policy_version 151914 (0.0033) [2024-06-13 03:42:30,939][70768] Fps is (10 sec: 45875.4, 60 sec: 48879.9, 300 sec: 48818.8). Total num frames: 2489008128. Throughput: 0: 48988.0. Samples: 2017922260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 03:42:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:42:32,466][71000] Updated weights for policy 0, policy_version 151924 (0.0026) [2024-06-13 03:42:33,859][70980] Signal inference workers to stop experience collection... (29950 times) [2024-06-13 03:42:33,897][71000] InferenceWorker_p0-w0: stopping experience collection (29950 times) [2024-06-13 03:42:33,921][70980] Signal inference workers to resume experience collection... (29950 times) [2024-06-13 03:42:33,923][71000] InferenceWorker_p0-w0: resuming experience collection (29950 times) [2024-06-13 03:42:35,366][71000] Updated weights for policy 0, policy_version 151934 (0.0024) [2024-06-13 03:42:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2489303040. Throughput: 0: 48800.9. Samples: 2018056820. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 03:42:35,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 03:42:39,090][71000] Updated weights for policy 0, policy_version 151944 (0.0031) [2024-06-13 03:42:40,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2489548800. Throughput: 0: 49152.9. Samples: 2018355740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 03:42:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:42:42,186][71000] Updated weights for policy 0, policy_version 151954 (0.0023) [2024-06-13 03:42:45,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48332.8, 300 sec: 48929.9). Total num frames: 2489761792. Throughput: 0: 49048.0. Samples: 2018649080. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-13 03:42:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:42:45,986][71000] Updated weights for policy 0, policy_version 151964 (0.0027) [2024-06-13 03:42:48,686][71000] Updated weights for policy 0, policy_version 151974 (0.0027) [2024-06-13 03:42:50,940][70768] Fps is (10 sec: 44235.9, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 2489991168. Throughput: 0: 48687.0. Samples: 2018787220. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-13 03:42:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:42:52,601][71000] Updated weights for policy 0, policy_version 151984 (0.0030) [2024-06-13 03:42:55,366][71000] Updated weights for policy 0, policy_version 151994 (0.0032) [2024-06-13 03:42:55,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2490269696. Throughput: 0: 49028.4. Samples: 2019082460. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-13 03:42:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 03:42:59,417][71000] Updated weights for policy 0, policy_version 152004 (0.0033) [2024-06-13 03:43:00,940][70768] Fps is (10 sec: 55705.7, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 2490548224. Throughput: 0: 49274.6. Samples: 2019381160. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-13 03:43:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:43:01,755][71000] Updated weights for policy 0, policy_version 152014 (0.0028) [2024-06-13 03:43:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 2490744832. Throughput: 0: 49091.9. Samples: 2019530520. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-13 03:43:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:43:06,039][71000] Updated weights for policy 0, policy_version 152024 (0.0024) [2024-06-13 03:43:08,652][71000] Updated weights for policy 0, policy_version 152034 (0.0023) [2024-06-13 03:43:10,940][70768] Fps is (10 sec: 44236.7, 60 sec: 49424.9, 300 sec: 48929.8). Total num frames: 2490990592. Throughput: 0: 48988.2. Samples: 2019826260. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-13 03:43:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:43:12,783][71000] Updated weights for policy 0, policy_version 152044 (0.0026) [2024-06-13 03:43:15,405][71000] Updated weights for policy 0, policy_version 152054 (0.0031) [2024-06-13 03:43:15,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48879.0, 300 sec: 49096.4). Total num frames: 2491252736. Throughput: 0: 48650.2. Samples: 2020111520. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-13 03:43:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:43:19,604][71000] Updated weights for policy 0, policy_version 152064 (0.0033) [2024-06-13 03:43:20,940][70768] Fps is (10 sec: 54068.2, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2491531264. Throughput: 0: 49284.5. Samples: 2020274620. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-13 03:43:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 03:43:21,746][70980] Signal inference workers to stop experience collection... (30000 times) [2024-06-13 03:43:21,800][71000] InferenceWorker_p0-w0: stopping experience collection (30000 times) [2024-06-13 03:43:21,800][70980] Signal inference workers to resume experience collection... (30000 times) [2024-06-13 03:43:21,811][71000] InferenceWorker_p0-w0: resuming experience collection (30000 times) [2024-06-13 03:43:21,940][71000] Updated weights for policy 0, policy_version 152074 (0.0029) [2024-06-13 03:43:25,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 2491711488. Throughput: 0: 49007.9. Samples: 2020561100. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-13 03:43:25,941][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:43:26,239][71000] Updated weights for policy 0, policy_version 152084 (0.0027) [2024-06-13 03:43:28,801][71000] Updated weights for policy 0, policy_version 152094 (0.0027) [2024-06-13 03:43:30,940][70768] Fps is (10 sec: 44236.7, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 2491973632. Throughput: 0: 49172.8. Samples: 2020861860. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-13 03:43:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:43:32,819][71000] Updated weights for policy 0, policy_version 152104 (0.0035) [2024-06-13 03:43:35,339][71000] Updated weights for policy 0, policy_version 152114 (0.0021) [2024-06-13 03:43:35,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2492235776. Throughput: 0: 49292.1. Samples: 2021005360. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-13 03:43:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 03:43:39,678][71000] Updated weights for policy 0, policy_version 152124 (0.0027) [2024-06-13 03:43:40,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2492497920. Throughput: 0: 49379.2. Samples: 2021304520. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-13 03:43:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:43:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000152130_2492497920.pth... [2024-06-13 03:43:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000151410_2480701440.pth [2024-06-13 03:43:42,317][71000] Updated weights for policy 0, policy_version 152134 (0.0042) [2024-06-13 03:43:45,939][70768] Fps is (10 sec: 45875.8, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 2492694528. Throughput: 0: 49446.0. Samples: 2021606220. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-13 03:43:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:43:46,160][71000] Updated weights for policy 0, policy_version 152144 (0.0029) [2024-06-13 03:43:48,637][71000] Updated weights for policy 0, policy_version 152154 (0.0027) [2024-06-13 03:43:50,939][70768] Fps is (10 sec: 44237.0, 60 sec: 49152.2, 300 sec: 48929.9). Total num frames: 2492940288. Throughput: 0: 49020.6. Samples: 2021736440. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-13 03:43:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:43:52,665][71000] Updated weights for policy 0, policy_version 152164 (0.0028) [2024-06-13 03:43:55,866][71000] Updated weights for policy 0, policy_version 152174 (0.0022) [2024-06-13 03:43:55,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2493218816. Throughput: 0: 48805.6. Samples: 2022022500. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-13 03:43:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:43:59,822][71000] Updated weights for policy 0, policy_version 152184 (0.0027) [2024-06-13 03:44:00,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48606.0, 300 sec: 49096.5). Total num frames: 2493464576. Throughput: 0: 48958.7. Samples: 2022314660. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-13 03:44:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:44:02,660][71000] Updated weights for policy 0, policy_version 152194 (0.0024) [2024-06-13 03:44:05,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 2493677568. Throughput: 0: 48836.0. Samples: 2022472240. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-13 03:44:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:44:06,231][71000] Updated weights for policy 0, policy_version 152204 (0.0026) [2024-06-13 03:44:08,911][71000] Updated weights for policy 0, policy_version 152214 (0.0039) [2024-06-13 03:44:10,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2493923328. Throughput: 0: 49031.5. Samples: 2022767520. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-13 03:44:10,949][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:44:12,622][71000] Updated weights for policy 0, policy_version 152224 (0.0027) [2024-06-13 03:44:14,167][70980] Signal inference workers to stop experience collection... 
(30050 times) [2024-06-13 03:44:14,191][71000] InferenceWorker_p0-w0: stopping experience collection (30050 times) [2024-06-13 03:44:14,226][70980] Signal inference workers to resume experience collection... (30050 times) [2024-06-13 03:44:14,226][71000] InferenceWorker_p0-w0: resuming experience collection (30050 times) [2024-06-13 03:44:15,849][71000] Updated weights for policy 0, policy_version 152234 (0.0025) [2024-06-13 03:44:15,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2494201856. Throughput: 0: 48768.0. Samples: 2023056420. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-13 03:44:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 03:44:19,361][71000] Updated weights for policy 0, policy_version 152244 (0.0036) [2024-06-13 03:44:20,940][70768] Fps is (10 sec: 54067.1, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2494464000. Throughput: 0: 49203.5. Samples: 2023219520. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-13 03:44:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:44:22,770][71000] Updated weights for policy 0, policy_version 152254 (0.0022) [2024-06-13 03:44:25,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49698.2, 300 sec: 48985.4). Total num frames: 2494693376. Throughput: 0: 49046.7. Samples: 2023511620. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-13 03:44:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:44:25,947][71000] Updated weights for policy 0, policy_version 152264 (0.0025) [2024-06-13 03:44:29,069][71000] Updated weights for policy 0, policy_version 152274 (0.0021) [2024-06-13 03:44:30,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2494939136. Throughput: 0: 49052.4. Samples: 2023813580. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-13 03:44:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:44:32,302][71000] Updated weights for policy 0, policy_version 152284 (0.0028) [2024-06-13 03:44:35,671][71000] Updated weights for policy 0, policy_version 152294 (0.0032) [2024-06-13 03:44:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2495184896. Throughput: 0: 49451.5. Samples: 2023961760. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-13 03:44:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:44:39,068][71000] Updated weights for policy 0, policy_version 152304 (0.0031) [2024-06-13 03:44:40,943][70768] Fps is (10 sec: 52411.8, 60 sec: 49422.4, 300 sec: 49207.0). Total num frames: 2495463424. Throughput: 0: 49662.2. Samples: 2024257460. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-13 03:44:40,943][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:44:42,142][71000] Updated weights for policy 0, policy_version 152314 (0.0035) [2024-06-13 03:44:45,653][71000] Updated weights for policy 0, policy_version 152324 (0.0034) [2024-06-13 03:44:45,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.0, 300 sec: 49040.9). Total num frames: 2495676416. Throughput: 0: 49764.3. Samples: 2024554060. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-13 03:44:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:44:48,919][71000] Updated weights for policy 0, policy_version 152334 (0.0029) [2024-06-13 03:44:50,940][70768] Fps is (10 sec: 45889.9, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 2495922176. Throughput: 0: 49536.0. Samples: 2024701360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:44:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:44:52,002][71000] Updated weights for policy 0, policy_version 152344 (0.0027) [2024-06-13 03:44:55,405][71000] Updated weights for policy 0, policy_version 152354 (0.0032) [2024-06-13 03:44:55,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 2496184320. Throughput: 0: 49693.3. Samples: 2025003720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:44:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:44:58,822][71000] Updated weights for policy 0, policy_version 152364 (0.0024) [2024-06-13 03:45:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2496413696. Throughput: 0: 49653.0. Samples: 2025290800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:45:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:45:02,146][71000] Updated weights for policy 0, policy_version 152374 (0.0024) [2024-06-13 03:45:05,498][71000] Updated weights for policy 0, policy_version 152384 (0.0029) [2024-06-13 03:45:05,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49698.2, 300 sec: 49040.9). Total num frames: 2496659456. Throughput: 0: 49133.9. Samples: 2025430540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:45:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:45:08,715][71000] Updated weights for policy 0, policy_version 152394 (0.0031) [2024-06-13 03:45:10,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49698.1, 300 sec: 49096.4). Total num frames: 2496905216. Throughput: 0: 49308.7. Samples: 2025730520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:45:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:45:12,045][71000] Updated weights for policy 0, policy_version 152404 (0.0030) [2024-06-13 03:45:15,491][71000] Updated weights for policy 0, policy_version 152414 (0.0031) [2024-06-13 03:45:15,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2497183744. Throughput: 0: 49359.5. Samples: 2026034760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:45:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:45:18,528][71000] Updated weights for policy 0, policy_version 152424 (0.0035) [2024-06-13 03:45:20,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2497413120. Throughput: 0: 49267.2. Samples: 2026178780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:45:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:45:22,071][71000] Updated weights for policy 0, policy_version 152434 (0.0029) [2024-06-13 03:45:25,214][71000] Updated weights for policy 0, policy_version 152444 (0.0035) [2024-06-13 03:45:25,939][70768] Fps is (10 sec: 45875.8, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2497642496. Throughput: 0: 49301.4. Samples: 2026475860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:45:25,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 03:45:28,626][71000] Updated weights for policy 0, policy_version 152454 (0.0032) [2024-06-13 03:45:30,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49698.0, 300 sec: 49207.5). Total num frames: 2497921024. Throughput: 0: 49352.4. Samples: 2026774920. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:45:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:45:31,981][71000] Updated weights for policy 0, policy_version 152464 (0.0026) [2024-06-13 03:45:33,493][70980] Signal inference workers to stop experience collection... (30100 times) [2024-06-13 03:45:33,537][71000] InferenceWorker_p0-w0: stopping experience collection (30100 times) [2024-06-13 03:45:33,546][70980] Signal inference workers to resume experience collection... (30100 times) [2024-06-13 03:45:33,547][71000] InferenceWorker_p0-w0: resuming experience collection (30100 times) [2024-06-13 03:45:35,227][71000] Updated weights for policy 0, policy_version 152474 (0.0030) [2024-06-13 03:45:35,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2498166784. Throughput: 0: 49506.2. Samples: 2026929140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:45:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:45:38,327][71000] Updated weights for policy 0, policy_version 152484 (0.0030) [2024-06-13 03:45:40,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48881.5, 300 sec: 49096.5). Total num frames: 2498396160. Throughput: 0: 49397.0. Samples: 2027226580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:45:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:45:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000152490_2498396160.pth... [2024-06-13 03:45:40,992][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000151771_2486616064.pth [2024-06-13 03:45:41,873][71000] Updated weights for policy 0, policy_version 152494 (0.0028) [2024-06-13 03:45:45,048][71000] Updated weights for policy 0, policy_version 152504 (0.0035) [2024-06-13 03:45:45,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2498641920. Throughput: 0: 49304.9. Samples: 2027509520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:45:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:45:48,697][71000] Updated weights for policy 0, policy_version 152514 (0.0029) [2024-06-13 03:45:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2498887680. Throughput: 0: 49539.5. Samples: 2027659820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 03:45:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:45:51,754][71000] Updated weights for policy 0, policy_version 152524 (0.0025) [2024-06-13 03:45:55,384][71000] Updated weights for policy 0, policy_version 152534 (0.0029) [2024-06-13 03:45:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2499133440. Throughput: 0: 49497.0. Samples: 2027957880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 03:45:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:45:58,275][71000] Updated weights for policy 0, policy_version 152544 (0.0032) [2024-06-13 03:46:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2499379200. Throughput: 0: 49400.9. Samples: 2028257800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 03:46:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:46:02,130][71000] Updated weights for policy 0, policy_version 152554 (0.0027) [2024-06-13 03:46:04,810][71000] Updated weights for policy 0, policy_version 152564 (0.0042) [2024-06-13 03:46:05,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49207.6). Total num frames: 2499624960. Throughput: 0: 49293.8. Samples: 2028397000. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 03:46:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:46:08,699][71000] Updated weights for policy 0, policy_version 152574 (0.0028) [2024-06-13 03:46:10,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2499887104. Throughput: 0: 49240.2. Samples: 2028691680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 03:46:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:46:11,688][71000] Updated weights for policy 0, policy_version 152584 (0.0021) [2024-06-13 03:46:15,450][71000] Updated weights for policy 0, policy_version 152594 (0.0031) [2024-06-13 03:46:15,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2500149248. Throughput: 0: 49341.5. Samples: 2028995280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 03:46:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:46:17,935][71000] Updated weights for policy 0, policy_version 152604 (0.0026) [2024-06-13 03:46:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49151.8, 300 sec: 49152.0). Total num frames: 2500362240. Throughput: 0: 49100.7. Samples: 2029138680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 03:46:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:46:22,069][71000] Updated weights for policy 0, policy_version 152614 (0.0035) [2024-06-13 03:46:24,758][71000] Updated weights for policy 0, policy_version 152624 (0.0031) [2024-06-13 03:46:25,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49424.9, 300 sec: 49263.3). Total num frames: 2500608000. Throughput: 0: 48954.6. Samples: 2029429540. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 03:46:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:46:28,646][71000] Updated weights for policy 0, policy_version 152634 (0.0028) [2024-06-13 03:46:30,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 2500902912. Throughput: 0: 49389.2. Samples: 2029732040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 03:46:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:46:31,427][71000] Updated weights for policy 0, policy_version 152644 (0.0033) [2024-06-13 03:46:33,802][70980] Signal inference workers to stop experience collection... (30150 times) [2024-06-13 03:46:33,803][70980] Signal inference workers to resume experience collection... (30150 times) [2024-06-13 03:46:33,844][71000] InferenceWorker_p0-w0: stopping experience collection (30150 times) [2024-06-13 03:46:33,844][71000] InferenceWorker_p0-w0: resuming experience collection (30150 times) [2024-06-13 03:46:35,029][71000] Updated weights for policy 0, policy_version 152654 (0.0027) [2024-06-13 03:46:35,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2501115904. Throughput: 0: 49727.5. Samples: 2029897560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 03:46:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:46:37,807][71000] Updated weights for policy 0, policy_version 152664 (0.0024) [2024-06-13 03:46:40,940][70768] Fps is (10 sec: 44236.6, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2501345280. Throughput: 0: 49718.5. Samples: 2030195220. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 03:46:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:46:41,531][71000] Updated weights for policy 0, policy_version 152674 (0.0030) [2024-06-13 03:46:44,667][71000] Updated weights for policy 0, policy_version 152684 (0.0033) [2024-06-13 03:46:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2501591040. Throughput: 0: 49420.0. Samples: 2030481700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 03:46:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:46:48,298][71000] Updated weights for policy 0, policy_version 152694 (0.0037) [2024-06-13 03:46:50,940][70768] Fps is (10 sec: 54068.2, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 2501885952. Throughput: 0: 49644.4. Samples: 2030631000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 03:46:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:46:51,193][71000] Updated weights for policy 0, policy_version 152704 (0.0051) [2024-06-13 03:46:55,005][71000] Updated weights for policy 0, policy_version 152714 (0.0031) [2024-06-13 03:46:55,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2502098944. Throughput: 0: 49780.5. Samples: 2030931800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-13 03:46:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:46:57,802][71000] Updated weights for policy 0, policy_version 152724 (0.0030) [2024-06-13 03:47:00,940][70768] Fps is (10 sec: 44236.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2502328320. Throughput: 0: 49564.8. Samples: 2031225700. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 03:47:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:47:01,549][71000] Updated weights for policy 0, policy_version 152734 (0.0024) [2024-06-13 03:47:04,689][71000] Updated weights for policy 0, policy_version 152744 (0.0021) [2024-06-13 03:47:05,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2502590464. Throughput: 0: 49381.5. Samples: 2031360840. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 03:47:05,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 03:47:08,511][71000] Updated weights for policy 0, policy_version 152754 (0.0036) [2024-06-13 03:47:10,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 2502852608. Throughput: 0: 49550.3. Samples: 2031659300. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 03:47:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:47:11,191][71000] Updated weights for policy 0, policy_version 152764 (0.0023) [2024-06-13 03:47:15,032][71000] Updated weights for policy 0, policy_version 152774 (0.0031) [2024-06-13 03:47:15,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2503081984. Throughput: 0: 49504.6. Samples: 2031959740. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 03:47:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:47:17,611][71000] Updated weights for policy 0, policy_version 152784 (0.0028) [2024-06-13 03:47:20,940][70768] Fps is (10 sec: 44237.0, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2503294976. Throughput: 0: 48956.1. Samples: 2032100580. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 03:47:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:47:21,661][71000] Updated weights for policy 0, policy_version 152794 (0.0031) [2024-06-13 03:47:24,584][71000] Updated weights for policy 0, policy_version 152804 (0.0025) [2024-06-13 03:47:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 2503573504. Throughput: 0: 48979.3. Samples: 2032399280. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 03:47:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:47:28,438][71000] Updated weights for policy 0, policy_version 152814 (0.0021) [2024-06-13 03:47:30,940][70768] Fps is (10 sec: 54067.0, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2503835648. Throughput: 0: 49037.9. Samples: 2032688400. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 03:47:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:47:31,086][71000] Updated weights for policy 0, policy_version 152824 (0.0027) [2024-06-13 03:47:34,705][70980] Signal inference workers to stop experience collection... (30200 times) [2024-06-13 03:47:34,705][70980] Signal inference workers to resume experience collection... (30200 times) [2024-06-13 03:47:34,748][71000] InferenceWorker_p0-w0: stopping experience collection (30200 times) [2024-06-13 03:47:34,748][71000] InferenceWorker_p0-w0: resuming experience collection (30200 times) [2024-06-13 03:47:35,150][71000] Updated weights for policy 0, policy_version 152834 (0.0029) [2024-06-13 03:47:35,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2504081408. Throughput: 0: 49246.2. Samples: 2032847080. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 03:47:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:47:37,741][71000] Updated weights for policy 0, policy_version 152844 (0.0031) [2024-06-13 03:47:40,940][70768] Fps is (10 sec: 44236.8, 60 sec: 48879.1, 300 sec: 49207.5). Total num frames: 2504278016. Throughput: 0: 49029.9. Samples: 2033138140. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 03:47:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:47:41,051][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000152850_2504294400.pth... [2024-06-13 03:47:41,099][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000152130_2492497920.pth [2024-06-13 03:47:41,902][71000] Updated weights for policy 0, policy_version 152854 (0.0040) [2024-06-13 03:47:44,619][71000] Updated weights for policy 0, policy_version 152864 (0.0035) [2024-06-13 03:47:45,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2504540160. Throughput: 0: 48972.5. Samples: 2033429460. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 03:47:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:47:48,583][71000] Updated weights for policy 0, policy_version 152874 (0.0027) [2024-06-13 03:47:50,940][70768] Fps is (10 sec: 54067.2, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 2504818688. Throughput: 0: 49370.7. Samples: 2033582520. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 03:47:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:47:51,398][71000] Updated weights for policy 0, policy_version 152884 (0.0040) [2024-06-13 03:47:55,510][71000] Updated weights for policy 0, policy_version 152894 (0.0024) [2024-06-13 03:47:55,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.2, 300 sec: 49207.6). Total num frames: 2505064448. Throughput: 0: 49204.5. Samples: 2033873500. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 03:47:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:47:58,028][71000] Updated weights for policy 0, policy_version 152904 (0.0028) [2024-06-13 03:48:00,940][70768] Fps is (10 sec: 44236.9, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 2505261056. Throughput: 0: 49067.5. Samples: 2034167780. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 03:48:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:48:02,138][71000] Updated weights for policy 0, policy_version 152914 (0.0026) [2024-06-13 03:48:04,850][71000] Updated weights for policy 0, policy_version 152924 (0.0026) [2024-06-13 03:48:05,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2505523200. Throughput: 0: 48907.6. Samples: 2034301420. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-13 03:48:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:48:08,843][71000] Updated weights for policy 0, policy_version 152934 (0.0030) [2024-06-13 03:48:10,940][70768] Fps is (10 sec: 52429.0, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2505785344. Throughput: 0: 48786.8. Samples: 2034594680. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-13 03:48:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:48:11,546][71000] Updated weights for policy 0, policy_version 152944 (0.0033) [2024-06-13 03:48:15,827][71000] Updated weights for policy 0, policy_version 152954 (0.0024) [2024-06-13 03:48:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2505998336. Throughput: 0: 49152.9. Samples: 2034900280. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-13 03:48:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:48:18,340][71000] Updated weights for policy 0, policy_version 152964 (0.0030) [2024-06-13 03:48:20,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2506244096. Throughput: 0: 48501.8. Samples: 2035029660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-13 03:48:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:48:22,368][71000] Updated weights for policy 0, policy_version 152974 (0.0039) [2024-06-13 03:48:22,811][70980] Signal inference workers to stop experience collection... (30250 times) [2024-06-13 03:48:22,856][70980] Signal inference workers to resume experience collection... (30250 times) [2024-06-13 03:48:22,857][71000] InferenceWorker_p0-w0: stopping experience collection (30250 times) [2024-06-13 03:48:22,868][71000] InferenceWorker_p0-w0: resuming experience collection (30250 times) [2024-06-13 03:48:24,834][71000] Updated weights for policy 0, policy_version 152984 (0.0016) [2024-06-13 03:48:25,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2506506240. Throughput: 0: 48616.5. Samples: 2035325880. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-13 03:48:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:48:29,190][71000] Updated weights for policy 0, policy_version 152994 (0.0030) [2024-06-13 03:48:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48605.9, 300 sec: 49207.6). Total num frames: 2506752000. Throughput: 0: 48477.8. Samples: 2035610960. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-13 03:48:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:48:31,812][71000] Updated weights for policy 0, policy_version 153004 (0.0035) [2024-06-13 03:48:35,779][71000] Updated weights for policy 0, policy_version 153014 (0.0027) [2024-06-13 03:48:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 2506997760. Throughput: 0: 48551.5. Samples: 2035767340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-13 03:48:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:48:38,223][71000] Updated weights for policy 0, policy_version 153024 (0.0023) [2024-06-13 03:48:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2507227136. Throughput: 0: 48812.4. Samples: 2036070060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-13 03:48:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:48:42,194][71000] Updated weights for policy 0, policy_version 153034 (0.0023) [2024-06-13 03:48:44,849][71000] Updated weights for policy 0, policy_version 153044 (0.0029) [2024-06-13 03:48:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2507505664. Throughput: 0: 48641.7. Samples: 2036356660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-13 03:48:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:48:49,062][71000] Updated weights for policy 0, policy_version 153054 (0.0026) [2024-06-13 03:48:50,940][70768] Fps is (10 sec: 52427.9, 60 sec: 48878.8, 300 sec: 49263.0). Total num frames: 2507751424. Throughput: 0: 49137.0. Samples: 2036512600. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-13 03:48:50,941][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:48:51,991][71000] Updated weights for policy 0, policy_version 153064 (0.0029) [2024-06-13 03:48:55,686][71000] Updated weights for policy 0, policy_version 153074 (0.0030) [2024-06-13 03:48:55,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48605.9, 300 sec: 49207.5). Total num frames: 2507980800. Throughput: 0: 48974.6. Samples: 2036798540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-13 03:48:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:48:58,575][71000] Updated weights for policy 0, policy_version 153084 (0.0029) [2024-06-13 03:49:00,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2508210176. Throughput: 0: 48716.8. Samples: 2037092540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-13 03:49:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:49:02,387][71000] Updated weights for policy 0, policy_version 153094 (0.0032) [2024-06-13 03:49:05,098][71000] Updated weights for policy 0, policy_version 153104 (0.0027) [2024-06-13 03:49:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2508488704. Throughput: 0: 49016.0. Samples: 2037235380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:49:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:49:09,444][71000] Updated weights for policy 0, policy_version 153114 (0.0034) [2024-06-13 03:49:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2508718080. Throughput: 0: 48986.0. Samples: 2037530260. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:49:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:49:11,974][71000] Updated weights for policy 0, policy_version 153124 (0.0034) [2024-06-13 03:49:15,758][71000] Updated weights for policy 0, policy_version 153134 (0.0026) [2024-06-13 03:49:15,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 2508947456. Throughput: 0: 49354.9. Samples: 2037831940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:49:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:49:18,736][71000] Updated weights for policy 0, policy_version 153144 (0.0018) [2024-06-13 03:49:20,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2509193216. Throughput: 0: 48850.1. Samples: 2037965600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:49:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:49:22,329][71000] Updated weights for policy 0, policy_version 153154 (0.0023) [2024-06-13 03:49:25,295][71000] Updated weights for policy 0, policy_version 153164 (0.0031) [2024-06-13 03:49:25,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2509455360. Throughput: 0: 48800.1. Samples: 2038266060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:49:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:49:29,486][71000] Updated weights for policy 0, policy_version 153174 (0.0032) [2024-06-13 03:49:30,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2509684736. Throughput: 0: 49010.8. Samples: 2038562140. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:49:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:49:31,843][71000] Updated weights for policy 0, policy_version 153184 (0.0018) [2024-06-13 03:49:35,913][71000] Updated weights for policy 0, policy_version 153194 (0.0029) [2024-06-13 03:49:35,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 49041.5). Total num frames: 2509930496. Throughput: 0: 48423.8. Samples: 2038691660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:49:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:49:37,385][70980] Signal inference workers to stop experience collection... (30300 times) [2024-06-13 03:49:37,432][71000] InferenceWorker_p0-w0: stopping experience collection (30300 times) [2024-06-13 03:49:37,439][70980] Signal inference workers to resume experience collection... (30300 times) [2024-06-13 03:49:37,448][71000] InferenceWorker_p0-w0: resuming experience collection (30300 times) [2024-06-13 03:49:38,685][71000] Updated weights for policy 0, policy_version 153204 (0.0028) [2024-06-13 03:49:40,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 2510159872. Throughput: 0: 48730.5. Samples: 2038991420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:49:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:49:40,962][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000153209_2510176256.pth... [2024-06-13 03:49:41,013][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000152490_2498396160.pth [2024-06-13 03:49:42,507][71000] Updated weights for policy 0, policy_version 153214 (0.0024) [2024-06-13 03:49:45,535][71000] Updated weights for policy 0, policy_version 153224 (0.0027) [2024-06-13 03:49:45,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2510438400. Throughput: 0: 48827.5. Samples: 2039289780. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:49:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:49:49,359][71000] Updated weights for policy 0, policy_version 153234 (0.0025) [2024-06-13 03:49:50,939][70768] Fps is (10 sec: 50791.6, 60 sec: 48606.1, 300 sec: 49096.5). Total num frames: 2510667776. Throughput: 0: 49009.5. Samples: 2039440800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:49:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:49:52,085][71000] Updated weights for policy 0, policy_version 153244 (0.0020) [2024-06-13 03:49:55,734][71000] Updated weights for policy 0, policy_version 153254 (0.0039) [2024-06-13 03:49:55,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2510913536. Throughput: 0: 48955.6. Samples: 2039733260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:49:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 03:49:58,705][71000] Updated weights for policy 0, policy_version 153264 (0.0024) [2024-06-13 03:50:00,940][70768] Fps is (10 sec: 50789.1, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2511175680. Throughput: 0: 49087.5. Samples: 2040040880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:50:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:50:02,205][71000] Updated weights for policy 0, policy_version 153274 (0.0026) [2024-06-13 03:50:05,287][71000] Updated weights for policy 0, policy_version 153284 (0.0026) [2024-06-13 03:50:05,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 2511421440. Throughput: 0: 49432.1. Samples: 2040190040. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 03:50:05,940][70768] Avg episode reward: [(0, '0.266')] [2024-06-13 03:50:09,114][71000] Updated weights for policy 0, policy_version 153294 (0.0023) [2024-06-13 03:50:10,940][70768] Fps is (10 sec: 47514.6, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2511650816. Throughput: 0: 49214.2. Samples: 2040480700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 03:50:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:50:12,075][71000] Updated weights for policy 0, policy_version 153304 (0.0025) [2024-06-13 03:50:15,625][71000] Updated weights for policy 0, policy_version 153314 (0.0028) [2024-06-13 03:50:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 2511896576. Throughput: 0: 49028.3. Samples: 2040768420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 03:50:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:50:18,675][71000] Updated weights for policy 0, policy_version 153324 (0.0038) [2024-06-13 03:50:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2512142336. Throughput: 0: 49372.9. Samples: 2040913440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 03:50:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:50:22,029][71000] Updated weights for policy 0, policy_version 153334 (0.0028) [2024-06-13 03:50:25,482][71000] Updated weights for policy 0, policy_version 153344 (0.0033) [2024-06-13 03:50:25,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2512404480. Throughput: 0: 49427.7. Samples: 2041215660. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 03:50:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:50:29,071][71000] Updated weights for policy 0, policy_version 153354 (0.0025) [2024-06-13 03:50:30,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2512650240. Throughput: 0: 49640.2. Samples: 2041523580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 03:50:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 03:50:31,962][71000] Updated weights for policy 0, policy_version 153364 (0.0031) [2024-06-13 03:50:35,513][71000] Updated weights for policy 0, policy_version 153374 (0.0030) [2024-06-13 03:50:35,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 2512879616. Throughput: 0: 49413.6. Samples: 2041664420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 03:50:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:50:38,589][71000] Updated weights for policy 0, policy_version 153384 (0.0031) [2024-06-13 03:50:40,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 2513141760. Throughput: 0: 49389.7. Samples: 2041955800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 03:50:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:50:42,230][71000] Updated weights for policy 0, policy_version 153394 (0.0024) [2024-06-13 03:50:44,829][70980] Signal inference workers to stop experience collection... (30350 times) [2024-06-13 03:50:44,829][70980] Signal inference workers to resume experience collection... 
(30350 times) [2024-06-13 03:50:44,848][71000] InferenceWorker_p0-w0: stopping experience collection (30350 times) [2024-06-13 03:50:44,848][71000] InferenceWorker_p0-w0: resuming experience collection (30350 times) [2024-06-13 03:50:45,426][71000] Updated weights for policy 0, policy_version 153404 (0.0027) [2024-06-13 03:50:45,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2513387520. Throughput: 0: 49206.2. Samples: 2042255160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 03:50:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:50:48,875][71000] Updated weights for policy 0, policy_version 153414 (0.0023) [2024-06-13 03:50:50,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2513633280. Throughput: 0: 49071.3. Samples: 2042398240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 03:50:50,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 03:50:52,334][71000] Updated weights for policy 0, policy_version 153424 (0.0029) [2024-06-13 03:50:55,855][71000] Updated weights for policy 0, policy_version 153434 (0.0031) [2024-06-13 03:50:55,940][70768] Fps is (10 sec: 47514.7, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2513862656. Throughput: 0: 49108.9. Samples: 2042690600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 03:50:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:50:58,857][71000] Updated weights for policy 0, policy_version 153444 (0.0026) [2024-06-13 03:51:00,940][70768] Fps is (10 sec: 49150.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2514124800. Throughput: 0: 49258.7. Samples: 2042985060. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 03:51:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:51:02,509][71000] Updated weights for policy 0, policy_version 153454 (0.0029) [2024-06-13 03:51:05,584][71000] Updated weights for policy 0, policy_version 153464 (0.0025) [2024-06-13 03:51:05,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2514386944. Throughput: 0: 49625.7. Samples: 2043146600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 03:51:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:51:08,847][71000] Updated weights for policy 0, policy_version 153474 (0.0029) [2024-06-13 03:51:10,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2514599936. Throughput: 0: 49566.7. Samples: 2043446160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 03:51:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:51:12,207][71000] Updated weights for policy 0, policy_version 153484 (0.0034) [2024-06-13 03:51:15,493][71000] Updated weights for policy 0, policy_version 153494 (0.0033) [2024-06-13 03:51:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2514862080. Throughput: 0: 49066.4. Samples: 2043731580. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-13 03:51:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:51:19,007][71000] Updated weights for policy 0, policy_version 153504 (0.0037) [2024-06-13 03:51:20,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.1, 300 sec: 49207.6). Total num frames: 2515124224. Throughput: 0: 49395.2. Samples: 2043887200. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-13 03:51:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:51:22,266][71000] Updated weights for policy 0, policy_version 153514 (0.0023) [2024-06-13 03:51:25,451][71000] Updated weights for policy 0, policy_version 153524 (0.0026) [2024-06-13 03:51:25,939][70768] Fps is (10 sec: 49153.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2515353600. Throughput: 0: 49604.6. Samples: 2044188000. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-13 03:51:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:51:28,486][71000] Updated weights for policy 0, policy_version 153534 (0.0030) [2024-06-13 03:51:30,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49151.8, 300 sec: 49096.5). Total num frames: 2515599360. Throughput: 0: 49500.1. Samples: 2044482660. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-13 03:51:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 03:51:32,134][71000] Updated weights for policy 0, policy_version 153544 (0.0031) [2024-06-13 03:51:35,130][71000] Updated weights for policy 0, policy_version 153554 (0.0019) [2024-06-13 03:51:35,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2515845120. Throughput: 0: 49346.4. Samples: 2044618840. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-13 03:51:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:51:39,049][71000] Updated weights for policy 0, policy_version 153564 (0.0023) [2024-06-13 03:51:40,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2516123648. Throughput: 0: 49417.2. Samples: 2044914380. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-13 03:51:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:51:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000153572_2516123648.pth... 
[2024-06-13 03:51:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000152850_2504294400.pth [2024-06-13 03:51:42,178][71000] Updated weights for policy 0, policy_version 153574 (0.0022) [2024-06-13 03:51:45,661][71000] Updated weights for policy 0, policy_version 153584 (0.0027) [2024-06-13 03:51:45,758][70980] Signal inference workers to stop experience collection... (30400 times) [2024-06-13 03:51:45,802][71000] InferenceWorker_p0-w0: stopping experience collection (30400 times) [2024-06-13 03:51:45,808][70980] Signal inference workers to resume experience collection... (30400 times) [2024-06-13 03:51:45,810][71000] InferenceWorker_p0-w0: resuming experience collection (30400 times) [2024-06-13 03:51:45,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.2, 300 sec: 48985.4). Total num frames: 2516336640. Throughput: 0: 49682.4. Samples: 2045220760. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-13 03:51:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 03:51:48,715][71000] Updated weights for policy 0, policy_version 153594 (0.0027) [2024-06-13 03:51:50,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 2516582400. Throughput: 0: 49292.5. Samples: 2045364760. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-13 03:51:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:51:52,230][71000] Updated weights for policy 0, policy_version 153604 (0.0031) [2024-06-13 03:51:55,173][71000] Updated weights for policy 0, policy_version 153614 (0.0031) [2024-06-13 03:51:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2516828160. Throughput: 0: 49011.5. Samples: 2045651680. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-13 03:51:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:51:58,687][71000] Updated weights for policy 0, policy_version 153624 (0.0029) [2024-06-13 03:52:00,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.2, 300 sec: 49207.5). Total num frames: 2517106688. Throughput: 0: 49250.0. Samples: 2045947820. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-13 03:52:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:52:01,958][71000] Updated weights for policy 0, policy_version 153634 (0.0029) [2024-06-13 03:52:05,590][71000] Updated weights for policy 0, policy_version 153644 (0.0028) [2024-06-13 03:52:05,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 2517303296. Throughput: 0: 49117.4. Samples: 2046097480. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-13 03:52:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:52:08,380][71000] Updated weights for policy 0, policy_version 153654 (0.0026) [2024-06-13 03:52:10,940][70768] Fps is (10 sec: 44236.8, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2517549056. Throughput: 0: 49072.8. Samples: 2046396280. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-13 03:52:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:52:12,309][71000] Updated weights for policy 0, policy_version 153664 (0.0031) [2024-06-13 03:52:14,977][71000] Updated weights for policy 0, policy_version 153674 (0.0029) [2024-06-13 03:52:15,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2517827584. Throughput: 0: 48955.2. Samples: 2046685640. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-13 03:52:15,944][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:52:18,958][71000] Updated weights for policy 0, policy_version 153684 (0.0027) [2024-06-13 03:52:20,939][70768] Fps is (10 sec: 54067.9, 60 sec: 49425.2, 300 sec: 49207.6). Total num frames: 2518089728. Throughput: 0: 49322.5. Samples: 2046838340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 03:52:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:52:21,802][71000] Updated weights for policy 0, policy_version 153694 (0.0032) [2024-06-13 03:52:25,623][71000] Updated weights for policy 0, policy_version 153704 (0.0031) [2024-06-13 03:52:25,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.8, 300 sec: 49040.9). Total num frames: 2518302720. Throughput: 0: 49358.7. Samples: 2047135520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 03:52:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:52:28,216][71000] Updated weights for policy 0, policy_version 153714 (0.0031) [2024-06-13 03:52:30,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2518548480. Throughput: 0: 49135.6. Samples: 2047431860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 03:52:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:52:32,008][71000] Updated weights for policy 0, policy_version 153724 (0.0025) [2024-06-13 03:52:34,441][71000] Updated weights for policy 0, policy_version 153734 (0.0039) [2024-06-13 03:52:35,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 2518810624. Throughput: 0: 49329.4. Samples: 2047584580. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 03:52:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:52:38,630][71000] Updated weights for policy 0, policy_version 153744 (0.0031) [2024-06-13 03:52:39,569][70980] Signal inference workers to stop experience collection... (30450 times) [2024-06-13 03:52:39,570][70980] Signal inference workers to resume experience collection... (30450 times) [2024-06-13 03:52:39,588][71000] InferenceWorker_p0-w0: stopping experience collection (30450 times) [2024-06-13 03:52:39,589][71000] InferenceWorker_p0-w0: resuming experience collection (30450 times) [2024-06-13 03:52:40,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.1, 300 sec: 49207.5). Total num frames: 2519056384. Throughput: 0: 49442.3. Samples: 2047876580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 03:52:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:52:41,344][71000] Updated weights for policy 0, policy_version 153754 (0.0023) [2024-06-13 03:52:45,433][71000] Updated weights for policy 0, policy_version 153764 (0.0027) [2024-06-13 03:52:45,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2519285760. Throughput: 0: 49371.9. Samples: 2048169560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 03:52:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:52:48,195][71000] Updated weights for policy 0, policy_version 153774 (0.0029) [2024-06-13 03:52:50,939][70768] Fps is (10 sec: 45875.3, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2519515136. Throughput: 0: 49065.3. Samples: 2048305420. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 03:52:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:52:52,051][71000] Updated weights for policy 0, policy_version 153784 (0.0025) [2024-06-13 03:52:54,795][71000] Updated weights for policy 0, policy_version 153794 (0.0030) [2024-06-13 03:52:55,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2519793664. Throughput: 0: 49175.0. Samples: 2048609160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 03:52:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 03:52:58,488][71000] Updated weights for policy 0, policy_version 153804 (0.0030) [2024-06-13 03:53:00,940][70768] Fps is (10 sec: 52426.7, 60 sec: 48878.7, 300 sec: 49207.5). Total num frames: 2520039424. Throughput: 0: 49334.4. Samples: 2048905700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 03:53:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:53:01,329][71000] Updated weights for policy 0, policy_version 153814 (0.0029) [2024-06-13 03:53:05,267][71000] Updated weights for policy 0, policy_version 153824 (0.0028) [2024-06-13 03:53:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.0, 300 sec: 49152.0). Total num frames: 2520285184. Throughput: 0: 49409.1. Samples: 2049061760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 03:53:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:53:08,003][71000] Updated weights for policy 0, policy_version 153834 (0.0031) [2024-06-13 03:53:10,940][70768] Fps is (10 sec: 45876.3, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2520498176. Throughput: 0: 49253.4. Samples: 2049351920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 03:53:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:53:11,738][71000] Updated weights for policy 0, policy_version 153844 (0.0023) [2024-06-13 03:53:14,609][71000] Updated weights for policy 0, policy_version 153854 (0.0025) [2024-06-13 03:53:15,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2520793088. Throughput: 0: 49196.3. Samples: 2049645700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 03:53:15,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-13 03:53:18,316][71000] Updated weights for policy 0, policy_version 153864 (0.0027) [2024-06-13 03:53:20,939][70768] Fps is (10 sec: 54068.1, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2521038848. Throughput: 0: 49248.5. Samples: 2049800760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 03:53:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:53:21,146][71000] Updated weights for policy 0, policy_version 153874 (0.0027) [2024-06-13 03:53:25,086][71000] Updated weights for policy 0, policy_version 153884 (0.0028) [2024-06-13 03:53:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2521284608. Throughput: 0: 49562.5. Samples: 2050106900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-13 03:53:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:53:27,753][71000] Updated weights for policy 0, policy_version 153894 (0.0033) [2024-06-13 03:53:30,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2521497600. Throughput: 0: 49594.3. Samples: 2050401300. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-13 03:53:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:53:31,584][71000] Updated weights for policy 0, policy_version 153904 (0.0039) [2024-06-13 03:53:34,416][71000] Updated weights for policy 0, policy_version 153914 (0.0027) [2024-06-13 03:53:35,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2521776128. Throughput: 0: 49716.4. Samples: 2050542660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-13 03:53:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:53:38,085][71000] Updated weights for policy 0, policy_version 153924 (0.0029) [2024-06-13 03:53:40,426][70980] Signal inference workers to stop experience collection... (30500 times) [2024-06-13 03:53:40,451][71000] InferenceWorker_p0-w0: stopping experience collection (30500 times) [2024-06-13 03:53:40,486][70980] Signal inference workers to resume experience collection... (30500 times) [2024-06-13 03:53:40,486][71000] InferenceWorker_p0-w0: resuming experience collection (30500 times) [2024-06-13 03:53:40,761][71000] Updated weights for policy 0, policy_version 153934 (0.0026) [2024-06-13 03:53:40,940][70768] Fps is (10 sec: 55705.8, 60 sec: 49971.2, 300 sec: 49318.6). Total num frames: 2522054656. Throughput: 0: 49738.4. Samples: 2050847380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-13 03:53:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:53:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000153934_2522054656.pth... [2024-06-13 03:53:40,988][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000153209_2510176256.pth [2024-06-13 03:53:44,591][71000] Updated weights for policy 0, policy_version 153944 (0.0023) [2024-06-13 03:53:45,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49971.3, 300 sec: 49263.1). Total num frames: 2522284032. Throughput: 0: 49709.3. Samples: 2051142600. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-13 03:53:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:53:47,862][71000] Updated weights for policy 0, policy_version 153954 (0.0033) [2024-06-13 03:53:50,940][70768] Fps is (10 sec: 44236.3, 60 sec: 49698.0, 300 sec: 49207.5). Total num frames: 2522497024. Throughput: 0: 49311.1. Samples: 2051280760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-13 03:53:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:53:51,391][71000] Updated weights for policy 0, policy_version 153964 (0.0031) [2024-06-13 03:53:54,663][71000] Updated weights for policy 0, policy_version 153974 (0.0026) [2024-06-13 03:53:55,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2522759168. Throughput: 0: 49373.5. Samples: 2051573720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-13 03:53:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:53:58,151][71000] Updated weights for policy 0, policy_version 153984 (0.0022) [2024-06-13 03:54:00,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49698.4, 300 sec: 49263.1). Total num frames: 2523021312. Throughput: 0: 49387.3. Samples: 2051868120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-13 03:54:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:54:01,147][71000] Updated weights for policy 0, policy_version 153994 (0.0033) [2024-06-13 03:54:04,762][71000] Updated weights for policy 0, policy_version 154004 (0.0041) [2024-06-13 03:54:05,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2523250688. Throughput: 0: 49391.3. Samples: 2052023380. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-13 03:54:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:54:07,977][71000] Updated weights for policy 0, policy_version 154014 (0.0032) [2024-06-13 03:54:10,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2523480064. Throughput: 0: 48937.9. Samples: 2052309100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-13 03:54:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:54:11,437][71000] Updated weights for policy 0, policy_version 154024 (0.0027) [2024-06-13 03:54:14,760][71000] Updated weights for policy 0, policy_version 154034 (0.0027) [2024-06-13 03:54:15,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2523742208. Throughput: 0: 49036.8. Samples: 2052607960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-13 03:54:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 03:54:18,045][71000] Updated weights for policy 0, policy_version 154044 (0.0029) [2024-06-13 03:54:20,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2523987968. Throughput: 0: 49224.5. Samples: 2052757760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-13 03:54:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:54:21,207][71000] Updated weights for policy 0, policy_version 154054 (0.0037) [2024-06-13 03:54:24,733][71000] Updated weights for policy 0, policy_version 154064 (0.0026) [2024-06-13 03:54:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2524233728. Throughput: 0: 49135.1. Samples: 2053058460. 
Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-13 03:54:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:54:28,220][71000] Updated weights for policy 0, policy_version 154074 (0.0035) [2024-06-13 03:54:28,765][70980] Signal inference workers to stop experience collection... (30550 times) [2024-06-13 03:54:28,765][70980] Signal inference workers to resume experience collection... (30550 times) [2024-06-13 03:54:28,783][71000] InferenceWorker_p0-w0: stopping experience collection (30550 times) [2024-06-13 03:54:28,783][71000] InferenceWorker_p0-w0: resuming experience collection (30550 times) [2024-06-13 03:54:30,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2524463104. Throughput: 0: 48919.9. Samples: 2053344000. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-13 03:54:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:54:31,409][71000] Updated weights for policy 0, policy_version 154084 (0.0026) [2024-06-13 03:54:35,051][71000] Updated weights for policy 0, policy_version 154094 (0.0030) [2024-06-13 03:54:35,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 2524708864. Throughput: 0: 48988.5. Samples: 2053485240. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-13 03:54:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:54:38,570][71000] Updated weights for policy 0, policy_version 154104 (0.0024) [2024-06-13 03:54:40,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48605.8, 300 sec: 49263.1). Total num frames: 2524971008. Throughput: 0: 49167.5. Samples: 2053786260. 
Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-13 03:54:40,952][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 03:54:41,590][71000] Updated weights for policy 0, policy_version 154114 (0.0032) [2024-06-13 03:54:45,175][71000] Updated weights for policy 0, policy_version 154124 (0.0023) [2024-06-13 03:54:45,940][70768] Fps is (10 sec: 50789.1, 60 sec: 48878.7, 300 sec: 49318.6). Total num frames: 2525216768. Throughput: 0: 49150.8. Samples: 2054079920. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-13 03:54:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:54:48,773][71000] Updated weights for policy 0, policy_version 154134 (0.0030) [2024-06-13 03:54:50,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2525429760. Throughput: 0: 48875.7. Samples: 2054222780. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-13 03:54:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:54:51,578][71000] Updated weights for policy 0, policy_version 154144 (0.0028) [2024-06-13 03:54:55,227][71000] Updated weights for policy 0, policy_version 154154 (0.0031) [2024-06-13 03:54:55,940][70768] Fps is (10 sec: 45876.4, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 2525675520. Throughput: 0: 48800.9. Samples: 2054505140. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-13 03:54:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:54:58,362][71000] Updated weights for policy 0, policy_version 154164 (0.0025) [2024-06-13 03:55:00,940][70768] Fps is (10 sec: 52427.9, 60 sec: 48878.8, 300 sec: 49263.1). Total num frames: 2525954048. Throughput: 0: 48747.9. Samples: 2054801620. 
Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-13 03:55:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:55:02,365][71000] Updated weights for policy 0, policy_version 154174 (0.0029) [2024-06-13 03:55:05,179][71000] Updated weights for policy 0, policy_version 154184 (0.0030) [2024-06-13 03:55:05,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.1, 300 sec: 49263.1). Total num frames: 2526183424. Throughput: 0: 48632.0. Samples: 2054946200. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-13 03:55:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:55:09,058][71000] Updated weights for policy 0, policy_version 154194 (0.0028) [2024-06-13 03:55:10,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2526412800. Throughput: 0: 48752.4. Samples: 2055252320. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-13 03:55:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:55:11,827][71000] Updated weights for policy 0, policy_version 154204 (0.0024) [2024-06-13 03:55:15,631][71000] Updated weights for policy 0, policy_version 154214 (0.0035) [2024-06-13 03:55:15,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48606.0, 300 sec: 49207.5). Total num frames: 2526658560. Throughput: 0: 48958.4. Samples: 2055547120. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-13 03:55:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:55:18,456][71000] Updated weights for policy 0, policy_version 154224 (0.0022) [2024-06-13 03:55:20,939][70768] Fps is (10 sec: 50791.1, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2526920704. Throughput: 0: 48931.7. Samples: 2055687160. 
Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-13 03:55:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:55:22,187][71000] Updated weights for policy 0, policy_version 154234 (0.0027) [2024-06-13 03:55:25,344][71000] Updated weights for policy 0, policy_version 154244 (0.0029) [2024-06-13 03:55:25,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2527150080. Throughput: 0: 48608.4. Samples: 2055973640. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-13 03:55:25,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 03:55:29,107][71000] Updated weights for policy 0, policy_version 154254 (0.0026) [2024-06-13 03:55:30,939][70768] Fps is (10 sec: 47513.5, 60 sec: 48879.1, 300 sec: 49207.6). Total num frames: 2527395840. Throughput: 0: 48685.3. Samples: 2056270740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:55:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:55:32,151][71000] Updated weights for policy 0, policy_version 154264 (0.0029) [2024-06-13 03:55:35,727][71000] Updated weights for policy 0, policy_version 154274 (0.0023) [2024-06-13 03:55:35,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 2527625216. Throughput: 0: 48710.2. Samples: 2056414740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:55:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 03:55:38,648][71000] Updated weights for policy 0, policy_version 154284 (0.0038) [2024-06-13 03:55:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2527887360. Throughput: 0: 48984.4. Samples: 2056709440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:55:40,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 03:55:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000154290_2527887360.pth... 
[2024-06-13 03:55:40,990][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000153572_2516123648.pth [2024-06-13 03:55:42,237][71000] Updated weights for policy 0, policy_version 154294 (0.0025) [2024-06-13 03:55:42,902][70980] Signal inference workers to stop experience collection... (30600 times) [2024-06-13 03:55:42,902][70980] Signal inference workers to resume experience collection... (30600 times) [2024-06-13 03:55:42,933][71000] InferenceWorker_p0-w0: stopping experience collection (30600 times) [2024-06-13 03:55:42,933][71000] InferenceWorker_p0-w0: resuming experience collection (30600 times) [2024-06-13 03:55:45,121][71000] Updated weights for policy 0, policy_version 154304 (0.0032) [2024-06-13 03:55:45,940][70768] Fps is (10 sec: 50789.3, 60 sec: 48606.0, 300 sec: 49151.9). Total num frames: 2528133120. Throughput: 0: 48958.2. Samples: 2057004740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:55:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:55:48,691][71000] Updated weights for policy 0, policy_version 154314 (0.0034) [2024-06-13 03:55:50,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 2528411648. Throughput: 0: 49272.8. Samples: 2057163480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:55:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:55:51,867][71000] Updated weights for policy 0, policy_version 154324 (0.0025) [2024-06-13 03:55:55,525][71000] Updated weights for policy 0, policy_version 154334 (0.0025) [2024-06-13 03:55:55,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2528624640. Throughput: 0: 48918.2. Samples: 2057453640. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:55:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:55:58,616][71000] Updated weights for policy 0, policy_version 154344 (0.0038) [2024-06-13 03:56:00,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48606.0, 300 sec: 49096.5). Total num frames: 2528870400. Throughput: 0: 48781.2. Samples: 2057742280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:56:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:56:02,605][71000] Updated weights for policy 0, policy_version 154354 (0.0043) [2024-06-13 03:56:05,107][71000] Updated weights for policy 0, policy_version 154364 (0.0025) [2024-06-13 03:56:05,939][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2529132544. Throughput: 0: 49005.8. Samples: 2057892420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:56:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 03:56:08,847][71000] Updated weights for policy 0, policy_version 154374 (0.0024) [2024-06-13 03:56:10,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 49207.6). Total num frames: 2529378304. Throughput: 0: 49404.5. Samples: 2058196840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:56:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:56:11,688][71000] Updated weights for policy 0, policy_version 154384 (0.0036) [2024-06-13 03:56:15,656][71000] Updated weights for policy 0, policy_version 154394 (0.0025) [2024-06-13 03:56:15,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2529607680. Throughput: 0: 49135.8. Samples: 2058481860. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:56:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:56:18,427][71000] Updated weights for policy 0, policy_version 154404 (0.0037) [2024-06-13 03:56:20,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2529853440. Throughput: 0: 49246.0. Samples: 2058630820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:56:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:56:22,531][71000] Updated weights for policy 0, policy_version 154414 (0.0030) [2024-06-13 03:56:25,091][71000] Updated weights for policy 0, policy_version 154424 (0.0027) [2024-06-13 03:56:25,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49425.1, 300 sec: 49207.6). Total num frames: 2530115584. Throughput: 0: 49217.4. Samples: 2058924220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:56:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:56:29,095][71000] Updated weights for policy 0, policy_version 154434 (0.0030) [2024-06-13 03:56:30,939][70768] Fps is (10 sec: 49153.2, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2530344960. Throughput: 0: 49270.5. Samples: 2059221900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 03:56:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:56:31,704][71000] Updated weights for policy 0, policy_version 154444 (0.0025) [2024-06-13 03:56:35,525][71000] Updated weights for policy 0, policy_version 154454 (0.0023) [2024-06-13 03:56:35,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 2530590720. Throughput: 0: 48786.6. Samples: 2059358880. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:56:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 03:56:38,447][71000] Updated weights for policy 0, policy_version 154464 (0.0035) [2024-06-13 03:56:40,939][70768] Fps is (10 sec: 47513.4, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2530820096. Throughput: 0: 48909.5. Samples: 2059654560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:56:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:56:42,387][71000] Updated weights for policy 0, policy_version 154474 (0.0023) [2024-06-13 03:56:43,555][70980] Signal inference workers to stop experience collection... (30650 times) [2024-06-13 03:56:43,555][70980] Signal inference workers to resume experience collection... (30650 times) [2024-06-13 03:56:43,574][71000] InferenceWorker_p0-w0: stopping experience collection (30650 times) [2024-06-13 03:56:43,574][71000] InferenceWorker_p0-w0: resuming experience collection (30650 times) [2024-06-13 03:56:45,035][71000] Updated weights for policy 0, policy_version 154484 (0.0029) [2024-06-13 03:56:45,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49698.3, 300 sec: 49263.1). Total num frames: 2531115008. Throughput: 0: 49085.4. Samples: 2059951120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:56:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:56:49,379][71000] Updated weights for policy 0, policy_version 154494 (0.0038) [2024-06-13 03:56:50,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2531328000. Throughput: 0: 49177.3. Samples: 2060105400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:56:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:56:51,950][71000] Updated weights for policy 0, policy_version 154504 (0.0032) [2024-06-13 03:56:55,940][70768] Fps is (10 sec: 42598.4, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 2531540992. 
Throughput: 0: 48735.6. Samples: 2060389940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:56:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:56:55,974][71000] Updated weights for policy 0, policy_version 154514 (0.0030) [2024-06-13 03:56:58,663][71000] Updated weights for policy 0, policy_version 154524 (0.0029) [2024-06-13 03:57:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2531803136. Throughput: 0: 48877.4. Samples: 2060681340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:57:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:57:02,708][71000] Updated weights for policy 0, policy_version 154534 (0.0026) [2024-06-13 03:57:05,320][71000] Updated weights for policy 0, policy_version 154544 (0.0046) [2024-06-13 03:57:05,939][70768] Fps is (10 sec: 52429.0, 60 sec: 48878.9, 300 sec: 49207.6). Total num frames: 2532065280. Throughput: 0: 48860.2. Samples: 2060829520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:57:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:57:09,574][71000] Updated weights for policy 0, policy_version 154554 (0.0035) [2024-06-13 03:57:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2532294656. Throughput: 0: 48914.2. Samples: 2061125360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:57:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:57:12,082][71000] Updated weights for policy 0, policy_version 154564 (0.0031) [2024-06-13 03:57:15,940][70768] Fps is (10 sec: 42597.9, 60 sec: 48059.8, 300 sec: 48818.7). Total num frames: 2532491264. Throughput: 0: 48465.6. Samples: 2061402860. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:57:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:57:16,571][71000] Updated weights for policy 0, policy_version 154574 (0.0035) [2024-06-13 03:57:19,064][71000] Updated weights for policy 0, policy_version 154584 (0.0031) [2024-06-13 03:57:20,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2532769792. Throughput: 0: 48529.3. Samples: 2061542700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:57:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:57:22,927][71000] Updated weights for policy 0, policy_version 154594 (0.0029) [2024-06-13 03:57:25,434][71000] Updated weights for policy 0, policy_version 154604 (0.0029) [2024-06-13 03:57:25,940][70768] Fps is (10 sec: 55705.9, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2533048320. Throughput: 0: 48752.4. Samples: 2061848420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:57:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:57:29,510][71000] Updated weights for policy 0, policy_version 154614 (0.0032) [2024-06-13 03:57:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 2533277696. Throughput: 0: 48899.1. Samples: 2062151580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:57:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:57:31,755][71000] Updated weights for policy 0, policy_version 154624 (0.0028) [2024-06-13 03:57:35,940][70768] Fps is (10 sec: 44236.9, 60 sec: 48332.9, 300 sec: 48929.8). Total num frames: 2533490688. Throughput: 0: 48550.3. Samples: 2062290160. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 03:57:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:57:36,196][71000] Updated weights for policy 0, policy_version 154634 (0.0036) [2024-06-13 03:57:38,740][71000] Updated weights for policy 0, policy_version 154644 (0.0026) [2024-06-13 03:57:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2533769216. Throughput: 0: 48878.2. Samples: 2062589460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 03:57:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:57:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000154649_2533769216.pth... [2024-06-13 03:57:41,013][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000153934_2522054656.pth [2024-06-13 03:57:42,508][71000] Updated weights for policy 0, policy_version 154654 (0.0029) [2024-06-13 03:57:45,281][71000] Updated weights for policy 0, policy_version 154664 (0.0035) [2024-06-13 03:57:45,940][70768] Fps is (10 sec: 54066.9, 60 sec: 48605.8, 300 sec: 49207.5). Total num frames: 2534031360. Throughput: 0: 48983.1. Samples: 2062885580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 03:57:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:57:49,523][71000] Updated weights for policy 0, policy_version 154674 (0.0027) [2024-06-13 03:57:50,193][70980] Signal inference workers to stop experience collection... (30700 times) [2024-06-13 03:57:50,245][71000] InferenceWorker_p0-w0: stopping experience collection (30700 times) [2024-06-13 03:57:50,248][70980] Signal inference workers to resume experience collection... (30700 times) [2024-06-13 03:57:50,266][71000] InferenceWorker_p0-w0: resuming experience collection (30700 times) [2024-06-13 03:57:50,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2534260736. Throughput: 0: 49116.7. Samples: 2063039780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 03:57:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 03:57:52,034][71000] Updated weights for policy 0, policy_version 154684 (0.0030) [2024-06-13 03:57:55,940][70768] Fps is (10 sec: 44236.9, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 2534473728. Throughput: 0: 48889.8. Samples: 2063325400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 03:57:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 03:57:56,357][71000] Updated weights for policy 0, policy_version 154694 (0.0032) [2024-06-13 03:57:58,774][71000] Updated weights for policy 0, policy_version 154704 (0.0032) [2024-06-13 03:58:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2534752256. Throughput: 0: 49135.1. Samples: 2063613940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 03:58:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:58:02,936][71000] Updated weights for policy 0, policy_version 154714 (0.0025) [2024-06-13 03:58:05,276][71000] Updated weights for policy 0, policy_version 154724 (0.0021) [2024-06-13 03:58:05,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2535014400. Throughput: 0: 49477.7. Samples: 2063769200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 03:58:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:58:09,559][71000] Updated weights for policy 0, policy_version 154734 (0.0028) [2024-06-13 03:58:10,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2535260160. Throughput: 0: 49391.1. Samples: 2064071020. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 03:58:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:58:12,019][71000] Updated weights for policy 0, policy_version 154744 (0.0039) [2024-06-13 03:58:15,940][70768] Fps is (10 sec: 44236.8, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 2535456768. Throughput: 0: 49128.4. Samples: 2064362360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 03:58:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:58:16,329][71000] Updated weights for policy 0, policy_version 154754 (0.0025) [2024-06-13 03:58:18,951][71000] Updated weights for policy 0, policy_version 154764 (0.0026) [2024-06-13 03:58:20,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 2535718912. Throughput: 0: 49087.0. Samples: 2064499080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 03:58:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:58:23,151][71000] Updated weights for policy 0, policy_version 154774 (0.0028) [2024-06-13 03:58:25,352][71000] Updated weights for policy 0, policy_version 154784 (0.0025) [2024-06-13 03:58:25,940][70768] Fps is (10 sec: 52429.4, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2535981056. Throughput: 0: 48905.8. Samples: 2064790220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 03:58:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:58:29,662][71000] Updated weights for policy 0, policy_version 154794 (0.0029) [2024-06-13 03:58:30,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 2536243200. Throughput: 0: 49072.3. Samples: 2065093840. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 03:58:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 03:58:32,191][71000] Updated weights for policy 0, policy_version 154804 (0.0031) [2024-06-13 03:58:35,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 2536439808. Throughput: 0: 48723.7. Samples: 2065232340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 03:58:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:58:36,338][71000] Updated weights for policy 0, policy_version 154814 (0.0030) [2024-06-13 03:58:38,880][71000] Updated weights for policy 0, policy_version 154824 (0.0032) [2024-06-13 03:58:40,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 2536718336. Throughput: 0: 48833.6. Samples: 2065522920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:58:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 03:58:42,788][71000] Updated weights for policy 0, policy_version 154834 (0.0022) [2024-06-13 03:58:45,335][71000] Updated weights for policy 0, policy_version 154844 (0.0026) [2024-06-13 03:58:45,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2536964096. Throughput: 0: 49227.6. Samples: 2065829180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:58:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:58:49,515][71000] Updated weights for policy 0, policy_version 154854 (0.0026) [2024-06-13 03:58:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2537226240. Throughput: 0: 49345.3. Samples: 2065989740. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:58:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:58:52,101][71000] Updated weights for policy 0, policy_version 154864 (0.0028) [2024-06-13 03:58:55,940][70768] Fps is (10 sec: 45874.4, 60 sec: 49151.9, 300 sec: 48818.7). Total num frames: 2537422848. Throughput: 0: 48896.3. Samples: 2066271360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:58:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 03:58:56,248][70980] Signal inference workers to stop experience collection... (30750 times) [2024-06-13 03:58:56,278][71000] InferenceWorker_p0-w0: stopping experience collection (30750 times) [2024-06-13 03:58:56,312][70980] Signal inference workers to resume experience collection... (30750 times) [2024-06-13 03:58:56,313][71000] InferenceWorker_p0-w0: resuming experience collection (30750 times) [2024-06-13 03:58:56,316][71000] Updated weights for policy 0, policy_version 154874 (0.0027) [2024-06-13 03:58:58,727][71000] Updated weights for policy 0, policy_version 154884 (0.0038) [2024-06-13 03:59:00,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 2537684992. Throughput: 0: 48799.1. Samples: 2066558320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:59:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:59:02,778][71000] Updated weights for policy 0, policy_version 154894 (0.0025) [2024-06-13 03:59:05,479][71000] Updated weights for policy 0, policy_version 154904 (0.0031) [2024-06-13 03:59:05,940][70768] Fps is (10 sec: 52429.7, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2537947136. Throughput: 0: 49244.5. Samples: 2066715080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:59:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 03:59:09,442][71000] Updated weights for policy 0, policy_version 154914 (0.0031) [2024-06-13 03:59:10,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2538192896. Throughput: 0: 49281.3. Samples: 2067007880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:59:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 03:59:11,995][71000] Updated weights for policy 0, policy_version 154924 (0.0028) [2024-06-13 03:59:15,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.2, 300 sec: 48929.8). Total num frames: 2538422272. Throughput: 0: 49075.3. Samples: 2067302220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:59:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:59:16,247][71000] Updated weights for policy 0, policy_version 154934 (0.0030) [2024-06-13 03:59:18,959][71000] Updated weights for policy 0, policy_version 154944 (0.0027) [2024-06-13 03:59:20,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 2538651648. Throughput: 0: 49076.9. Samples: 2067440800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:59:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 03:59:22,765][71000] Updated weights for policy 0, policy_version 154954 (0.0033) [2024-06-13 03:59:25,499][71000] Updated weights for policy 0, policy_version 154964 (0.0028) [2024-06-13 03:59:25,940][70768] Fps is (10 sec: 50787.8, 60 sec: 49151.6, 300 sec: 49040.9). Total num frames: 2538930176. Throughput: 0: 49246.7. Samples: 2067739040. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:59:25,941][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:59:29,048][71000] Updated weights for policy 0, policy_version 154974 (0.0026) [2024-06-13 03:59:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 2539159552. Throughput: 0: 49194.2. Samples: 2068042920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:59:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 03:59:32,050][71000] Updated weights for policy 0, policy_version 154984 (0.0029) [2024-06-13 03:59:35,819][71000] Updated weights for policy 0, policy_version 154994 (0.0024) [2024-06-13 03:59:35,939][70768] Fps is (10 sec: 49154.7, 60 sec: 49698.2, 300 sec: 48985.4). Total num frames: 2539421696. Throughput: 0: 48888.2. Samples: 2068189700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:59:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 03:59:38,909][71000] Updated weights for policy 0, policy_version 155004 (0.0019) [2024-06-13 03:59:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.1, 300 sec: 48929.9). Total num frames: 2539651072. Throughput: 0: 49093.1. Samples: 2068480540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 03:59:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:59:41,074][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000155009_2539667456.pth... [2024-06-13 03:59:41,127][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000154290_2527887360.pth [2024-06-13 03:59:42,253][71000] Updated weights for policy 0, policy_version 155014 (0.0030) [2024-06-13 03:59:45,517][71000] Updated weights for policy 0, policy_version 155024 (0.0031) [2024-06-13 03:59:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2539913216. Throughput: 0: 49257.4. Samples: 2068774900. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 03:59:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 03:59:48,903][71000] Updated weights for policy 0, policy_version 155034 (0.0040) [2024-06-13 03:59:50,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2540158976. Throughput: 0: 49168.3. Samples: 2068927660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 03:59:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 03:59:52,240][71000] Updated weights for policy 0, policy_version 155044 (0.0027) [2024-06-13 03:59:55,863][71000] Updated weights for policy 0, policy_version 155054 (0.0036) [2024-06-13 03:59:55,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49698.2, 300 sec: 48985.4). Total num frames: 2540404736. Throughput: 0: 49135.4. Samples: 2069218980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 03:59:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 03:59:59,212][71000] Updated weights for policy 0, policy_version 155064 (0.0032) [2024-06-13 04:00:00,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2540634112. Throughput: 0: 49058.2. Samples: 2069509840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 04:00:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:00:02,841][71000] Updated weights for policy 0, policy_version 155074 (0.0030) [2024-06-13 04:00:05,333][70980] Signal inference workers to stop experience collection... (30800 times) [2024-06-13 04:00:05,373][71000] InferenceWorker_p0-w0: stopping experience collection (30800 times) [2024-06-13 04:00:05,389][70980] Signal inference workers to resume experience collection... 
(30800 times) [2024-06-13 04:00:05,391][71000] InferenceWorker_p0-w0: resuming experience collection (30800 times) [2024-06-13 04:00:05,691][71000] Updated weights for policy 0, policy_version 155084 (0.0025) [2024-06-13 04:00:05,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2540896256. Throughput: 0: 49204.0. Samples: 2069654980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 04:00:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:00:09,092][71000] Updated weights for policy 0, policy_version 155094 (0.0031) [2024-06-13 04:00:10,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2541142016. Throughput: 0: 49276.0. Samples: 2069956440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 04:00:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:00:12,444][71000] Updated weights for policy 0, policy_version 155104 (0.0027) [2024-06-13 04:00:15,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2541371392. Throughput: 0: 49073.4. Samples: 2070251220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 04:00:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:00:15,991][71000] Updated weights for policy 0, policy_version 155114 (0.0023) [2024-06-13 04:00:19,563][71000] Updated weights for policy 0, policy_version 155124 (0.0030) [2024-06-13 04:00:20,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 2541617152. Throughput: 0: 49023.0. Samples: 2070395740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 04:00:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:00:22,401][71000] Updated weights for policy 0, policy_version 155134 (0.0030) [2024-06-13 04:00:25,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48879.3, 300 sec: 49040.9). Total num frames: 2541862912. Throughput: 0: 49107.9. Samples: 2070690400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 04:00:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:00:26,011][71000] Updated weights for policy 0, policy_version 155144 (0.0021) [2024-06-13 04:00:28,969][71000] Updated weights for policy 0, policy_version 155154 (0.0032) [2024-06-13 04:00:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2542125056. Throughput: 0: 49275.1. Samples: 2070992280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 04:00:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:00:32,561][71000] Updated weights for policy 0, policy_version 155164 (0.0041) [2024-06-13 04:00:35,672][71000] Updated weights for policy 0, policy_version 155174 (0.0034) [2024-06-13 04:00:35,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2542370816. Throughput: 0: 49324.1. Samples: 2071147240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 04:00:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:00:39,033][71000] Updated weights for policy 0, policy_version 155184 (0.0028) [2024-06-13 04:00:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49424.9, 300 sec: 49096.5). Total num frames: 2542616576. Throughput: 0: 49341.3. Samples: 2071439340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 04:00:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:00:42,444][71000] Updated weights for policy 0, policy_version 155194 (0.0036) [2024-06-13 04:00:45,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 2542845952. Throughput: 0: 49237.2. Samples: 2071725520. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 04:00:45,943][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:00:45,953][71000] Updated weights for policy 0, policy_version 155204 (0.0032) [2024-06-13 04:00:48,962][71000] Updated weights for policy 0, policy_version 155214 (0.0032) [2024-06-13 04:00:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2543108096. Throughput: 0: 49324.7. Samples: 2071874600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 04:00:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:00:52,664][71000] Updated weights for policy 0, policy_version 155224 (0.0028) [2024-06-13 04:00:55,770][71000] Updated weights for policy 0, policy_version 155234 (0.0029) [2024-06-13 04:00:55,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 2543353856. Throughput: 0: 49246.6. Samples: 2072172540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 04:00:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:00:59,530][71000] Updated weights for policy 0, policy_version 155244 (0.0025) [2024-06-13 04:01:00,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2543583232. Throughput: 0: 49137.7. Samples: 2072462420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 04:01:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:01:02,622][71000] Updated weights for policy 0, policy_version 155254 (0.0024) [2024-06-13 04:01:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 2543828992. Throughput: 0: 49032.3. Samples: 2072602200. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 04:01:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:01:06,023][71000] Updated weights for policy 0, policy_version 155264 (0.0027) [2024-06-13 04:01:09,220][71000] Updated weights for policy 0, policy_version 155274 (0.0023) [2024-06-13 04:01:10,941][70768] Fps is (10 sec: 50785.0, 60 sec: 49151.2, 300 sec: 49096.3). Total num frames: 2544091136. Throughput: 0: 49006.5. Samples: 2072895740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 04:01:10,941][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:01:12,763][71000] Updated weights for policy 0, policy_version 155284 (0.0029) [2024-06-13 04:01:15,939][70768] Fps is (10 sec: 47514.3, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2544304128. Throughput: 0: 48940.0. Samples: 2073194580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 04:01:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:01:16,065][70980] Signal inference workers to stop experience collection... (30850 times) [2024-06-13 04:01:16,066][70980] Signal inference workers to resume experience collection... (30850 times) [2024-06-13 04:01:16,108][71000] InferenceWorker_p0-w0: stopping experience collection (30850 times) [2024-06-13 04:01:16,108][71000] InferenceWorker_p0-w0: resuming experience collection (30850 times) [2024-06-13 04:01:16,225][71000] Updated weights for policy 0, policy_version 155294 (0.0039) [2024-06-13 04:01:19,769][71000] Updated weights for policy 0, policy_version 155304 (0.0026) [2024-06-13 04:01:20,940][70768] Fps is (10 sec: 47518.8, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2544566272. Throughput: 0: 48763.1. Samples: 2073341580. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 04:01:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:01:22,762][71000] Updated weights for policy 0, policy_version 155314 (0.0027) [2024-06-13 04:01:25,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2544795648. Throughput: 0: 48581.8. Samples: 2073625520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 04:01:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:01:26,317][71000] Updated weights for policy 0, policy_version 155324 (0.0029) [2024-06-13 04:01:29,383][71000] Updated weights for policy 0, policy_version 155334 (0.0030) [2024-06-13 04:01:30,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2545074176. Throughput: 0: 48735.7. Samples: 2073918620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 04:01:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:01:32,754][71000] Updated weights for policy 0, policy_version 155344 (0.0023) [2024-06-13 04:01:35,802][71000] Updated weights for policy 0, policy_version 155354 (0.0023) [2024-06-13 04:01:35,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2545319936. Throughput: 0: 49004.1. Samples: 2074079780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 04:01:35,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 04:01:39,520][71000] Updated weights for policy 0, policy_version 155364 (0.0030) [2024-06-13 04:01:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2545565696. Throughput: 0: 49045.4. Samples: 2074379580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 04:01:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:01:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000155369_2545565696.pth... 
[2024-06-13 04:01:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000154649_2533769216.pth [2024-06-13 04:01:42,433][71000] Updated weights for policy 0, policy_version 155374 (0.0029) [2024-06-13 04:01:45,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2545778688. Throughput: 0: 49064.8. Samples: 2074670340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 04:01:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:01:46,518][71000] Updated weights for policy 0, policy_version 155384 (0.0030) [2024-06-13 04:01:48,879][71000] Updated weights for policy 0, policy_version 155394 (0.0029) [2024-06-13 04:01:50,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.2, 300 sec: 49207.5). Total num frames: 2546057216. Throughput: 0: 49278.4. Samples: 2074819720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 04:01:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:01:52,948][71000] Updated weights for policy 0, policy_version 155404 (0.0026) [2024-06-13 04:01:55,870][71000] Updated weights for policy 0, policy_version 155414 (0.0033) [2024-06-13 04:01:55,939][70768] Fps is (10 sec: 52429.7, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2546302976. Throughput: 0: 49257.2. Samples: 2075112260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 04:01:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:01:59,795][71000] Updated weights for policy 0, policy_version 155424 (0.0026) [2024-06-13 04:02:00,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2546548736. Throughput: 0: 49211.1. Samples: 2075409080. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 04:02:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:02:02,525][71000] Updated weights for policy 0, policy_version 155434 (0.0029) [2024-06-13 04:02:05,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2546761728. Throughput: 0: 49002.2. Samples: 2075546680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 04:02:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:02:06,373][71000] Updated weights for policy 0, policy_version 155444 (0.0030) [2024-06-13 04:02:09,291][71000] Updated weights for policy 0, policy_version 155454 (0.0031) [2024-06-13 04:02:10,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48879.8, 300 sec: 49263.1). Total num frames: 2547023872. Throughput: 0: 49382.3. Samples: 2075847720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 04:02:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:02:13,184][71000] Updated weights for policy 0, policy_version 155464 (0.0029) [2024-06-13 04:02:15,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 2547269632. Throughput: 0: 49347.4. Samples: 2076139260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 04:02:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:02:16,011][71000] Updated weights for policy 0, policy_version 155474 (0.0033) [2024-06-13 04:02:19,881][71000] Updated weights for policy 0, policy_version 155484 (0.0036) [2024-06-13 04:02:20,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2547515392. Throughput: 0: 49003.0. Samples: 2076284920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 04:02:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:02:22,199][70980] Signal inference workers to stop experience collection... 
(30900 times) [2024-06-13 04:02:22,200][70980] Signal inference workers to resume experience collection... (30900 times) [2024-06-13 04:02:22,227][71000] InferenceWorker_p0-w0: stopping experience collection (30900 times) [2024-06-13 04:02:22,227][71000] InferenceWorker_p0-w0: resuming experience collection (30900 times) [2024-06-13 04:02:22,722][71000] Updated weights for policy 0, policy_version 155494 (0.0022) [2024-06-13 04:02:25,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2547728384. Throughput: 0: 48700.9. Samples: 2076571120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 04:02:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:02:26,511][71000] Updated weights for policy 0, policy_version 155504 (0.0032) [2024-06-13 04:02:29,195][71000] Updated weights for policy 0, policy_version 155514 (0.0028) [2024-06-13 04:02:30,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2548023296. Throughput: 0: 48893.9. Samples: 2076870560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 04:02:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:02:33,063][71000] Updated weights for policy 0, policy_version 155524 (0.0024) [2024-06-13 04:02:35,916][71000] Updated weights for policy 0, policy_version 155534 (0.0039) [2024-06-13 04:02:35,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2548269056. Throughput: 0: 49055.3. Samples: 2077027220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 04:02:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:02:39,687][71000] Updated weights for policy 0, policy_version 155544 (0.0023) [2024-06-13 04:02:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2548498432. Throughput: 0: 49083.9. Samples: 2077321040. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 04:02:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:02:42,729][71000] Updated weights for policy 0, policy_version 155554 (0.0026) [2024-06-13 04:02:45,939][70768] Fps is (10 sec: 42599.4, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 2548695040. Throughput: 0: 48833.8. Samples: 2077606600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 04:02:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:02:46,424][71000] Updated weights for policy 0, policy_version 155564 (0.0029) [2024-06-13 04:02:49,655][71000] Updated weights for policy 0, policy_version 155574 (0.0032) [2024-06-13 04:02:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2548989952. Throughput: 0: 48841.7. Samples: 2077744560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 04:02:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:02:53,005][71000] Updated weights for policy 0, policy_version 155584 (0.0019) [2024-06-13 04:02:55,918][71000] Updated weights for policy 0, policy_version 155594 (0.0027) [2024-06-13 04:02:55,940][70768] Fps is (10 sec: 55704.2, 60 sec: 49151.8, 300 sec: 49152.0). Total num frames: 2549252096. Throughput: 0: 48857.2. Samples: 2078046300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 04:02:55,941][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:02:59,747][71000] Updated weights for policy 0, policy_version 155604 (0.0036) [2024-06-13 04:03:00,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 2549465088. Throughput: 0: 48891.7. Samples: 2078339380. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 04:03:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 04:03:02,742][71000] Updated weights for policy 0, policy_version 155614 (0.0037) [2024-06-13 04:03:05,939][70768] Fps is (10 sec: 44237.9, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 2549694464. Throughput: 0: 48839.4. Samples: 2078482680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 04:03:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:03:06,511][71000] Updated weights for policy 0, policy_version 155624 (0.0026) [2024-06-13 04:03:09,514][71000] Updated weights for policy 0, policy_version 155634 (0.0034) [2024-06-13 04:03:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2549956608. Throughput: 0: 48890.2. Samples: 2078771180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 04:03:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:03:13,319][71000] Updated weights for policy 0, policy_version 155644 (0.0020) [2024-06-13 04:03:15,887][71000] Updated weights for policy 0, policy_version 155654 (0.0040) [2024-06-13 04:03:15,939][70768] Fps is (10 sec: 54066.9, 60 sec: 49425.2, 300 sec: 49207.6). Total num frames: 2550235136. Throughput: 0: 48951.7. Samples: 2079073380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 04:03:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:03:19,656][71000] Updated weights for policy 0, policy_version 155664 (0.0033) [2024-06-13 04:03:20,939][70768] Fps is (10 sec: 45875.9, 60 sec: 48333.0, 300 sec: 48929.9). Total num frames: 2550415360. Throughput: 0: 48779.8. Samples: 2079222300. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 04:03:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:03:22,598][71000] Updated weights for policy 0, policy_version 155674 (0.0025) [2024-06-13 04:03:25,940][70768] Fps is (10 sec: 44236.0, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 2550677504. Throughput: 0: 48609.7. Samples: 2079508480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 04:03:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:03:26,507][71000] Updated weights for policy 0, policy_version 155684 (0.0043) [2024-06-13 04:03:29,918][71000] Updated weights for policy 0, policy_version 155694 (0.0020) [2024-06-13 04:03:30,940][70768] Fps is (10 sec: 52427.8, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 2550939648. Throughput: 0: 48685.1. Samples: 2079797440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 04:03:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:03:33,285][71000] Updated weights for policy 0, policy_version 155704 (0.0030) [2024-06-13 04:03:35,530][70980] Signal inference workers to stop experience collection... (30950 times) [2024-06-13 04:03:35,532][70980] Signal inference workers to resume experience collection... (30950 times) [2024-06-13 04:03:35,577][71000] InferenceWorker_p0-w0: stopping experience collection (30950 times) [2024-06-13 04:03:35,577][71000] InferenceWorker_p0-w0: resuming experience collection (30950 times) [2024-06-13 04:03:35,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48606.0, 300 sec: 49040.9). Total num frames: 2551185408. Throughput: 0: 48942.7. Samples: 2079946980. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 04:03:35,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 04:03:36,314][71000] Updated weights for policy 0, policy_version 155714 (0.0032) [2024-06-13 04:03:39,889][71000] Updated weights for policy 0, policy_version 155724 (0.0035) [2024-06-13 04:03:40,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 2551414784. Throughput: 0: 48977.9. Samples: 2080250300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 04:03:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:03:41,062][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000155727_2551431168.pth... [2024-06-13 04:03:41,103][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000155009_2539667456.pth [2024-06-13 04:03:42,965][71000] Updated weights for policy 0, policy_version 155734 (0.0030) [2024-06-13 04:03:45,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49698.1, 300 sec: 48985.4). Total num frames: 2551676928. Throughput: 0: 48830.3. Samples: 2080536740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 04:03:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:03:46,619][71000] Updated weights for policy 0, policy_version 155744 (0.0025) [2024-06-13 04:03:50,003][71000] Updated weights for policy 0, policy_version 155754 (0.0025) [2024-06-13 04:03:50,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48606.0, 300 sec: 49096.5). Total num frames: 2551906304. Throughput: 0: 48973.7. Samples: 2080686500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 04:03:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:03:53,474][71000] Updated weights for policy 0, policy_version 155764 (0.0036) [2024-06-13 04:03:55,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48606.0, 300 sec: 49096.5). Total num frames: 2552168448. Throughput: 0: 49119.6. Samples: 2080981560. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 04:03:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:03:56,526][71000] Updated weights for policy 0, policy_version 155774 (0.0030) [2024-06-13 04:04:00,034][71000] Updated weights for policy 0, policy_version 155784 (0.0029) [2024-06-13 04:04:00,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 2552381440. Throughput: 0: 48846.1. Samples: 2081271460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 04:04:00,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 04:04:03,307][71000] Updated weights for policy 0, policy_version 155794 (0.0036) [2024-06-13 04:04:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 2552659968. Throughput: 0: 48774.2. Samples: 2081417140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 04:04:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:04:06,384][71000] Updated weights for policy 0, policy_version 155804 (0.0037) [2024-06-13 04:04:10,026][71000] Updated weights for policy 0, policy_version 155814 (0.0029) [2024-06-13 04:04:10,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 2552872960. Throughput: 0: 49022.6. Samples: 2081714500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 04:04:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:04:13,691][71000] Updated weights for policy 0, policy_version 155824 (0.0032) [2024-06-13 04:04:15,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48332.8, 300 sec: 49096.5). Total num frames: 2553135104. Throughput: 0: 49014.4. Samples: 2082003080. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 04:04:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:04:16,547][71000] Updated weights for policy 0, policy_version 155834 (0.0036) [2024-06-13 04:04:20,083][71000] Updated weights for policy 0, policy_version 155844 (0.0028) [2024-06-13 04:04:20,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 2553364480. Throughput: 0: 49001.8. Samples: 2082152060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 04:04:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:04:23,342][71000] Updated weights for policy 0, policy_version 155854 (0.0024) [2024-06-13 04:04:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2553626624. Throughput: 0: 48660.9. Samples: 2082440040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 04:04:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:04:26,776][71000] Updated weights for policy 0, policy_version 155864 (0.0023) [2024-06-13 04:04:30,189][71000] Updated weights for policy 0, policy_version 155874 (0.0033) [2024-06-13 04:04:30,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 2553856000. Throughput: 0: 48698.6. Samples: 2082728180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 04:04:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:04:33,580][71000] Updated weights for policy 0, policy_version 155884 (0.0041) [2024-06-13 04:04:35,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 2554101760. Throughput: 0: 48697.8. Samples: 2082877900. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 04:04:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:04:36,694][71000] Updated weights for policy 0, policy_version 155894 (0.0029) [2024-06-13 04:04:37,919][70980] Signal inference workers to stop experience collection... (31000 times) [2024-06-13 04:04:37,920][70980] Signal inference workers to resume experience collection... (31000 times) [2024-06-13 04:04:37,955][71000] InferenceWorker_p0-w0: stopping experience collection (31000 times) [2024-06-13 04:04:37,955][71000] InferenceWorker_p0-w0: resuming experience collection (31000 times) [2024-06-13 04:04:39,991][71000] Updated weights for policy 0, policy_version 155904 (0.0023) [2024-06-13 04:04:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 2554347520. Throughput: 0: 48676.4. Samples: 2083172000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 04:04:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:04:43,252][71000] Updated weights for policy 0, policy_version 155914 (0.0035) [2024-06-13 04:04:45,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2554626048. Throughput: 0: 48967.6. Samples: 2083475000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 04:04:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:04:46,461][71000] Updated weights for policy 0, policy_version 155924 (0.0028) [2024-06-13 04:04:50,045][71000] Updated weights for policy 0, policy_version 155934 (0.0031) [2024-06-13 04:04:50,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 2554839040. Throughput: 0: 49196.0. Samples: 2083630960. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 04:04:50,944][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:04:53,207][71000] Updated weights for policy 0, policy_version 155944 (0.0022) [2024-06-13 04:04:55,942][70768] Fps is (10 sec: 47504.3, 60 sec: 48877.3, 300 sec: 49040.6). Total num frames: 2555101184. Throughput: 0: 49305.1. Samples: 2083933320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 04:04:55,942][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:04:56,487][71000] Updated weights for policy 0, policy_version 155954 (0.0024) [2024-06-13 04:04:59,639][71000] Updated weights for policy 0, policy_version 155964 (0.0022) [2024-06-13 04:05:00,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 2555330560. Throughput: 0: 49148.1. Samples: 2084214740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 04:05:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:05:03,280][71000] Updated weights for policy 0, policy_version 155974 (0.0028) [2024-06-13 04:05:05,940][70768] Fps is (10 sec: 50800.2, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2555609088. Throughput: 0: 49127.1. Samples: 2084362780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 04:05:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:05:06,477][71000] Updated weights for policy 0, policy_version 155984 (0.0028) [2024-06-13 04:05:10,082][71000] Updated weights for policy 0, policy_version 155994 (0.0028) [2024-06-13 04:05:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.2, 300 sec: 48985.4). Total num frames: 2555822080. Throughput: 0: 49326.3. Samples: 2084659720. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 04:05:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:05:13,174][71000] Updated weights for policy 0, policy_version 156004 (0.0032) [2024-06-13 04:05:15,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2556100608. Throughput: 0: 49653.0. Samples: 2084962560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 04:05:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:05:16,665][71000] Updated weights for policy 0, policy_version 156014 (0.0031) [2024-06-13 04:05:19,679][71000] Updated weights for policy 0, policy_version 156024 (0.0026) [2024-06-13 04:05:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2556313600. Throughput: 0: 49555.0. Samples: 2085107880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 04:05:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:05:23,119][71000] Updated weights for policy 0, policy_version 156034 (0.0036) [2024-06-13 04:05:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 2556592128. Throughput: 0: 49573.4. Samples: 2085402800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 04:05:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:05:26,240][71000] Updated weights for policy 0, policy_version 156044 (0.0036) [2024-06-13 04:05:29,762][71000] Updated weights for policy 0, policy_version 156054 (0.0029) [2024-06-13 04:05:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 2556821504. Throughput: 0: 49359.9. Samples: 2085696200. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 04:05:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:05:33,049][71000] Updated weights for policy 0, policy_version 156064 (0.0033) [2024-06-13 04:05:35,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 2557067264. Throughput: 0: 49011.2. Samples: 2085836460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 04:05:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:05:36,633][71000] Updated weights for policy 0, policy_version 156074 (0.0023) [2024-06-13 04:05:39,998][71000] Updated weights for policy 0, policy_version 156084 (0.0031) [2024-06-13 04:05:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2557296640. Throughput: 0: 48850.0. Samples: 2086131480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 04:05:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:05:41,062][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000156086_2557313024.pth... [2024-06-13 04:05:41,110][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000155369_2545565696.pth [2024-06-13 04:05:42,125][70980] Signal inference workers to stop experience collection... (31050 times) [2024-06-13 04:05:42,174][71000] InferenceWorker_p0-w0: stopping experience collection (31050 times) [2024-06-13 04:05:42,177][70980] Signal inference workers to resume experience collection... (31050 times) [2024-06-13 04:05:42,194][71000] InferenceWorker_p0-w0: resuming experience collection (31050 times) [2024-06-13 04:05:43,097][71000] Updated weights for policy 0, policy_version 156094 (0.0034) [2024-06-13 04:05:45,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49041.0). Total num frames: 2557575168. Throughput: 0: 49171.6. Samples: 2086427460. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 04:05:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:05:46,326][71000] Updated weights for policy 0, policy_version 156104 (0.0027) [2024-06-13 04:05:49,923][71000] Updated weights for policy 0, policy_version 156114 (0.0027) [2024-06-13 04:05:50,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 2557804544. Throughput: 0: 49368.9. Samples: 2086584380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 04:05:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:05:53,310][71000] Updated weights for policy 0, policy_version 156124 (0.0027) [2024-06-13 04:05:55,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49153.6, 300 sec: 49040.9). Total num frames: 2558050304. Throughput: 0: 49280.9. Samples: 2086877360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 04:05:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:05:56,815][71000] Updated weights for policy 0, policy_version 156134 (0.0037) [2024-06-13 04:06:00,147][71000] Updated weights for policy 0, policy_version 156144 (0.0023) [2024-06-13 04:06:00,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 2558279680. Throughput: 0: 49004.7. Samples: 2087167780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 04:06:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:06:03,294][71000] Updated weights for policy 0, policy_version 156154 (0.0026) [2024-06-13 04:06:05,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.1, 300 sec: 49041.1). Total num frames: 2558558208. Throughput: 0: 49112.9. Samples: 2087317960. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 04:06:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:06:06,429][71000] Updated weights for policy 0, policy_version 156164 (0.0027) [2024-06-13 04:06:09,967][71000] Updated weights for policy 0, policy_version 156174 (0.0030) [2024-06-13 04:06:10,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49971.1, 300 sec: 49207.5). Total num frames: 2558820352. Throughput: 0: 49448.3. Samples: 2087627980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-13 04:06:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:06:12,886][71000] Updated weights for policy 0, policy_version 156184 (0.0035) [2024-06-13 04:06:15,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2559033344. Throughput: 0: 49351.2. Samples: 2087917000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-13 04:06:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:06:16,559][71000] Updated weights for policy 0, policy_version 156194 (0.0037) [2024-06-13 04:06:19,771][71000] Updated weights for policy 0, policy_version 156204 (0.0033) [2024-06-13 04:06:20,940][70768] Fps is (10 sec: 44237.0, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2559262720. Throughput: 0: 49468.3. Samples: 2088062540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-13 04:06:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:06:23,056][71000] Updated weights for policy 0, policy_version 156214 (0.0036) [2024-06-13 04:06:25,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49425.0, 300 sec: 49096.4). Total num frames: 2559557632. Throughput: 0: 49619.5. Samples: 2088364360. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-13 04:06:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:06:26,092][71000] Updated weights for policy 0, policy_version 156224 (0.0020) [2024-06-13 04:06:29,567][71000] Updated weights for policy 0, policy_version 156234 (0.0025) [2024-06-13 04:06:30,940][70768] Fps is (10 sec: 55705.1, 60 sec: 49971.1, 300 sec: 49152.0). Total num frames: 2559819776. Throughput: 0: 49714.4. Samples: 2088664620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-13 04:06:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:06:32,697][71000] Updated weights for policy 0, policy_version 156244 (0.0027) [2024-06-13 04:06:35,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 2560049152. Throughput: 0: 49643.1. Samples: 2088818320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-13 04:06:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:06:36,065][71000] Updated weights for policy 0, policy_version 156254 (0.0030) [2024-06-13 04:06:39,681][71000] Updated weights for policy 0, policy_version 156264 (0.0026) [2024-06-13 04:06:40,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 2560278528. Throughput: 0: 49627.4. Samples: 2089110600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-13 04:06:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:06:42,738][71000] Updated weights for policy 0, policy_version 156274 (0.0032) [2024-06-13 04:06:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 49096.4). Total num frames: 2560540672. Throughput: 0: 49811.2. Samples: 2089409280. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-13 04:06:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:06:46,102][71000] Updated weights for policy 0, policy_version 156284 (0.0030) [2024-06-13 04:06:49,544][71000] Updated weights for policy 0, policy_version 156294 (0.0043) [2024-06-13 04:06:50,451][70980] Signal inference workers to stop experience collection... (31100 times) [2024-06-13 04:06:50,497][71000] InferenceWorker_p0-w0: stopping experience collection (31100 times) [2024-06-13 04:06:50,503][70980] Signal inference workers to resume experience collection... (31100 times) [2024-06-13 04:06:50,514][71000] InferenceWorker_p0-w0: resuming experience collection (31100 times) [2024-06-13 04:06:50,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49971.0, 300 sec: 49152.0). Total num frames: 2560802816. Throughput: 0: 49739.3. Samples: 2089556240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-13 04:06:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:06:52,805][71000] Updated weights for policy 0, policy_version 156304 (0.0034) [2024-06-13 04:06:55,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 2561032192. Throughput: 0: 49503.3. Samples: 2089855620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-13 04:06:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:06:55,993][71000] Updated weights for policy 0, policy_version 156314 (0.0032) [2024-06-13 04:06:59,487][71000] Updated weights for policy 0, policy_version 156324 (0.0033) [2024-06-13 04:07:00,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49971.3, 300 sec: 49207.5). Total num frames: 2561277952. Throughput: 0: 49624.9. Samples: 2090150120. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-13 04:07:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:07:02,773][71000] Updated weights for policy 0, policy_version 156334 (0.0030) [2024-06-13 04:07:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 2561507328. Throughput: 0: 49590.3. Samples: 2090294100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-13 04:07:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:07:06,210][71000] Updated weights for policy 0, policy_version 156344 (0.0025) [2024-06-13 04:07:09,411][71000] Updated weights for policy 0, policy_version 156354 (0.0030) [2024-06-13 04:07:10,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2561785856. Throughput: 0: 49587.2. Samples: 2090595780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-13 04:07:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:07:12,866][71000] Updated weights for policy 0, policy_version 156364 (0.0028) [2024-06-13 04:07:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 2562015232. Throughput: 0: 49512.2. Samples: 2090892660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 04:07:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:07:16,078][71000] Updated weights for policy 0, policy_version 156374 (0.0022) [2024-06-13 04:07:19,514][71000] Updated weights for policy 0, policy_version 156384 (0.0026) [2024-06-13 04:07:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 2562260992. Throughput: 0: 49254.5. Samples: 2091034780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 04:07:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:07:22,723][71000] Updated weights for policy 0, policy_version 156394 (0.0028) [2024-06-13 04:07:25,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2562490368. Throughput: 0: 49304.0. Samples: 2091329280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 04:07:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:07:26,235][71000] Updated weights for policy 0, policy_version 156404 (0.0030) [2024-06-13 04:07:29,456][71000] Updated weights for policy 0, policy_version 156414 (0.0035) [2024-06-13 04:07:30,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2562768896. Throughput: 0: 49125.3. Samples: 2091619920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 04:07:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:07:32,909][71000] Updated weights for policy 0, policy_version 156424 (0.0034) [2024-06-13 04:07:35,939][70768] Fps is (10 sec: 49153.0, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2562981888. Throughput: 0: 49132.3. Samples: 2091767180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 04:07:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:07:36,276][71000] Updated weights for policy 0, policy_version 156434 (0.0028) [2024-06-13 04:07:39,516][71000] Updated weights for policy 0, policy_version 156444 (0.0034) [2024-06-13 04:07:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 2563260416. Throughput: 0: 49355.0. Samples: 2092076600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 04:07:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:07:40,960][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000156449_2563260416.pth... 
[2024-06-13 04:07:41,028][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000155727_2551431168.pth [2024-06-13 04:07:42,626][71000] Updated weights for policy 0, policy_version 156454 (0.0033) [2024-06-13 04:07:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2563473408. Throughput: 0: 49222.2. Samples: 2092365120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 04:07:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:07:46,361][71000] Updated weights for policy 0, policy_version 156464 (0.0027) [2024-06-13 04:07:49,521][71000] Updated weights for policy 0, policy_version 156474 (0.0030) [2024-06-13 04:07:50,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 2563735552. Throughput: 0: 49296.0. Samples: 2092512420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 04:07:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:07:52,819][71000] Updated weights for policy 0, policy_version 156484 (0.0025) [2024-06-13 04:07:55,932][71000] Updated weights for policy 0, policy_version 156494 (0.0029) [2024-06-13 04:07:55,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2563997696. Throughput: 0: 49176.0. Samples: 2092808700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 04:07:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:07:59,605][71000] Updated weights for policy 0, policy_version 156504 (0.0028) [2024-06-13 04:08:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2564243456. Throughput: 0: 49098.2. Samples: 2093102080. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 04:08:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:08:02,656][71000] Updated weights for policy 0, policy_version 156514 (0.0027) [2024-06-13 04:08:05,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2564456448. Throughput: 0: 49257.8. Samples: 2093251380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 04:08:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:08:06,111][71000] Updated weights for policy 0, policy_version 156524 (0.0031) [2024-06-13 04:08:06,610][70980] Signal inference workers to stop experience collection... (31150 times) [2024-06-13 04:08:06,611][70980] Signal inference workers to resume experience collection... (31150 times) [2024-06-13 04:08:06,627][71000] InferenceWorker_p0-w0: stopping experience collection (31150 times) [2024-06-13 04:08:06,627][71000] InferenceWorker_p0-w0: resuming experience collection (31150 times) [2024-06-13 04:08:09,310][71000] Updated weights for policy 0, policy_version 156534 (0.0030) [2024-06-13 04:08:10,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48878.9, 300 sec: 49096.4). Total num frames: 2564718592. Throughput: 0: 49404.5. Samples: 2093552480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 04:08:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 04:08:12,899][71000] Updated weights for policy 0, policy_version 156544 (0.0032) [2024-06-13 04:08:15,939][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2564964352. Throughput: 0: 49256.1. Samples: 2093836440. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 04:08:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:08:15,979][71000] Updated weights for policy 0, policy_version 156554 (0.0031) [2024-06-13 04:08:19,484][71000] Updated weights for policy 0, policy_version 156564 (0.0030) [2024-06-13 04:08:20,939][70768] Fps is (10 sec: 52430.1, 60 sec: 49698.4, 300 sec: 49374.2). Total num frames: 2565242880. Throughput: 0: 49453.0. Samples: 2093992560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 04:08:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:08:22,313][71000] Updated weights for policy 0, policy_version 156574 (0.0032) [2024-06-13 04:08:25,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2565439488. Throughput: 0: 49099.2. Samples: 2094286060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 04:08:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:08:26,215][71000] Updated weights for policy 0, policy_version 156584 (0.0027) [2024-06-13 04:08:28,879][71000] Updated weights for policy 0, policy_version 156594 (0.0022) [2024-06-13 04:08:30,940][70768] Fps is (10 sec: 44236.3, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2565685248. Throughput: 0: 49202.7. Samples: 2094579240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 04:08:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:08:32,904][71000] Updated weights for policy 0, policy_version 156604 (0.0031) [2024-06-13 04:08:35,795][71000] Updated weights for policy 0, policy_version 156614 (0.0034) [2024-06-13 04:08:35,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 2565963776. Throughput: 0: 49211.5. Samples: 2094726940. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 04:08:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:08:39,557][71000] Updated weights for policy 0, policy_version 156624 (0.0030) [2024-06-13 04:08:40,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2566225920. Throughput: 0: 49297.3. Samples: 2095027080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 04:08:40,941][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:08:42,130][71000] Updated weights for policy 0, policy_version 156634 (0.0024) [2024-06-13 04:08:45,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2566438912. Throughput: 0: 49428.5. Samples: 2095326360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 04:08:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:08:46,044][71000] Updated weights for policy 0, policy_version 156644 (0.0030) [2024-06-13 04:08:48,848][71000] Updated weights for policy 0, policy_version 156654 (0.0023) [2024-06-13 04:08:50,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2566684672. Throughput: 0: 49145.9. Samples: 2095462940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 04:08:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:08:52,726][71000] Updated weights for policy 0, policy_version 156664 (0.0023) [2024-06-13 04:08:55,391][71000] Updated weights for policy 0, policy_version 156674 (0.0027) [2024-06-13 04:08:55,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2566963200. Throughput: 0: 49168.5. Samples: 2095765060. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 04:08:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:08:59,406][71000] Updated weights for policy 0, policy_version 156684 (0.0029) [2024-06-13 04:09:00,941][70768] Fps is (10 sec: 52421.4, 60 sec: 49423.9, 300 sec: 49318.4). Total num frames: 2567208960. Throughput: 0: 49302.8. Samples: 2096055140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 04:09:00,941][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 04:09:01,961][71000] Updated weights for policy 0, policy_version 156694 (0.0028) [2024-06-13 04:09:05,912][71000] Updated weights for policy 0, policy_version 156704 (0.0035) [2024-06-13 04:09:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 2567438336. Throughput: 0: 49374.0. Samples: 2096214400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 04:09:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:09:08,548][71000] Updated weights for policy 0, policy_version 156714 (0.0030) [2024-06-13 04:09:10,940][70768] Fps is (10 sec: 45881.5, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2567667712. Throughput: 0: 49493.7. Samples: 2096513280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 04:09:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 04:09:12,324][71000] Updated weights for policy 0, policy_version 156724 (0.0033) [2024-06-13 04:09:12,728][70980] Signal inference workers to stop experience collection... (31200 times) [2024-06-13 04:09:12,729][70980] Signal inference workers to resume experience collection... 
(31200 times) [2024-06-13 04:09:12,739][71000] InferenceWorker_p0-w0: stopping experience collection (31200 times) [2024-06-13 04:09:12,761][71000] InferenceWorker_p0-w0: resuming experience collection (31200 times) [2024-06-13 04:09:15,198][71000] Updated weights for policy 0, policy_version 156734 (0.0023) [2024-06-13 04:09:15,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49697.9, 300 sec: 49429.7). Total num frames: 2567946240. Throughput: 0: 49500.7. Samples: 2096806780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 04:09:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:09:18,971][71000] Updated weights for policy 0, policy_version 156744 (0.0031) [2024-06-13 04:09:20,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 2568192000. Throughput: 0: 49892.0. Samples: 2096972080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 04:09:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:09:21,607][71000] Updated weights for policy 0, policy_version 156754 (0.0029) [2024-06-13 04:09:25,505][71000] Updated weights for policy 0, policy_version 156764 (0.0030) [2024-06-13 04:09:25,940][70768] Fps is (10 sec: 50791.2, 60 sec: 50244.3, 300 sec: 49485.2). Total num frames: 2568454144. Throughput: 0: 49933.4. Samples: 2097274080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 04:09:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:09:28,098][71000] Updated weights for policy 0, policy_version 156774 (0.0034) [2024-06-13 04:09:30,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 2568683520. Throughput: 0: 49827.4. Samples: 2097568600. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 04:09:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:09:32,067][71000] Updated weights for policy 0, policy_version 156784 (0.0027) [2024-06-13 04:09:34,481][71000] Updated weights for policy 0, policy_version 156794 (0.0024) [2024-06-13 04:09:35,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2568929280. Throughput: 0: 49928.7. Samples: 2097709740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 04:09:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:09:38,548][71000] Updated weights for policy 0, policy_version 156804 (0.0026) [2024-06-13 04:09:40,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2569207808. Throughput: 0: 49887.0. Samples: 2098009980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 04:09:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:09:41,075][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000156813_2569224192.pth... [2024-06-13 04:09:41,118][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000156086_2557313024.pth [2024-06-13 04:09:41,307][71000] Updated weights for policy 0, policy_version 156814 (0.0032) [2024-06-13 04:09:45,242][71000] Updated weights for policy 0, policy_version 156824 (0.0032) [2024-06-13 04:09:45,940][70768] Fps is (10 sec: 52429.7, 60 sec: 50244.2, 300 sec: 49540.8). Total num frames: 2569453568. Throughput: 0: 50168.3. Samples: 2098312640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 04:09:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:09:47,998][71000] Updated weights for policy 0, policy_version 156834 (0.0021) [2024-06-13 04:09:50,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49971.2, 300 sec: 49430.0). Total num frames: 2569682944. Throughput: 0: 49800.1. Samples: 2098455400. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 04:09:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:09:51,750][71000] Updated weights for policy 0, policy_version 156844 (0.0025) [2024-06-13 04:09:54,596][71000] Updated weights for policy 0, policy_version 156854 (0.0032) [2024-06-13 04:09:55,940][70768] Fps is (10 sec: 45871.2, 60 sec: 49151.3, 300 sec: 49429.5). Total num frames: 2569912320. Throughput: 0: 49818.2. Samples: 2098755140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 04:09:55,941][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:09:58,136][71000] Updated weights for policy 0, policy_version 156864 (0.0031) [2024-06-13 04:10:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49699.3, 300 sec: 49429.7). Total num frames: 2570190848. Throughput: 0: 49876.6. Samples: 2099051220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 04:10:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:10:01,297][71000] Updated weights for policy 0, policy_version 156874 (0.0025) [2024-06-13 04:10:04,802][71000] Updated weights for policy 0, policy_version 156884 (0.0027) [2024-06-13 04:10:05,940][70768] Fps is (10 sec: 52432.7, 60 sec: 49971.2, 300 sec: 49540.7). Total num frames: 2570436608. Throughput: 0: 49762.1. Samples: 2099211380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 04:10:05,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 04:10:05,941][70980] Saving new best policy, reward=0.294! [2024-06-13 04:10:07,566][71000] Updated weights for policy 0, policy_version 156894 (0.0030) [2024-06-13 04:10:09,985][70980] Signal inference workers to stop experience collection... (31250 times) [2024-06-13 04:10:10,034][70980] Signal inference workers to resume experience collection... 
(31250 times) [2024-06-13 04:10:10,034][71000] InferenceWorker_p0-w0: stopping experience collection (31250 times) [2024-06-13 04:10:10,062][71000] InferenceWorker_p0-w0: resuming experience collection (31250 times) [2024-06-13 04:10:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 49429.7). Total num frames: 2570682368. Throughput: 0: 49562.2. Samples: 2099504380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 04:10:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:10:11,430][71000] Updated weights for policy 0, policy_version 156904 (0.0028) [2024-06-13 04:10:14,514][71000] Updated weights for policy 0, policy_version 156914 (0.0023) [2024-06-13 04:10:15,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 2570911744. Throughput: 0: 49524.1. Samples: 2099797180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 04:10:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:10:18,063][71000] Updated weights for policy 0, policy_version 156924 (0.0043) [2024-06-13 04:10:20,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 2571190272. Throughput: 0: 49630.9. Samples: 2099943120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 04:10:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:10:21,035][71000] Updated weights for policy 0, policy_version 156934 (0.0036) [2024-06-13 04:10:24,414][71000] Updated weights for policy 0, policy_version 156944 (0.0029) [2024-06-13 04:10:25,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2571419648. Throughput: 0: 49685.0. Samples: 2100245800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 04:10:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:10:27,819][71000] Updated weights for policy 0, policy_version 156954 (0.0022) [2024-06-13 04:10:30,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49698.3, 300 sec: 49485.2). Total num frames: 2571665408. Throughput: 0: 49630.3. Samples: 2100546000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 04:10:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:10:31,092][71000] Updated weights for policy 0, policy_version 156964 (0.0030) [2024-06-13 04:10:34,592][71000] Updated weights for policy 0, policy_version 156974 (0.0029) [2024-06-13 04:10:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 2571894784. Throughput: 0: 49558.7. Samples: 2100685540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 04:10:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:10:37,861][71000] Updated weights for policy 0, policy_version 156984 (0.0032) [2024-06-13 04:10:40,940][70768] Fps is (10 sec: 50789.1, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 2572173312. Throughput: 0: 49539.0. Samples: 2100984360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 04:10:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:10:41,108][71000] Updated weights for policy 0, policy_version 156994 (0.0028) [2024-06-13 04:10:44,330][71000] Updated weights for policy 0, policy_version 157004 (0.0034) [2024-06-13 04:10:45,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2572402688. Throughput: 0: 49496.5. Samples: 2101278560. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 04:10:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:10:47,375][71000] Updated weights for policy 0, policy_version 157014 (0.0034) [2024-06-13 04:10:50,757][71000] Updated weights for policy 0, policy_version 157024 (0.0029) [2024-06-13 04:10:50,940][70768] Fps is (10 sec: 52429.8, 60 sec: 50244.3, 300 sec: 49651.9). Total num frames: 2572697600. Throughput: 0: 49393.1. Samples: 2101434060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 04:10:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:10:54,280][71000] Updated weights for policy 0, policy_version 157034 (0.0032) [2024-06-13 04:10:55,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.8, 300 sec: 49485.3). Total num frames: 2572877824. Throughput: 0: 49482.3. Samples: 2101731080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 04:10:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:10:57,292][71000] Updated weights for policy 0, policy_version 157044 (0.0030) [2024-06-13 04:11:00,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2573156352. Throughput: 0: 49563.5. Samples: 2102027540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 04:11:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:11:01,201][71000] Updated weights for policy 0, policy_version 157054 (0.0035) [2024-06-13 04:11:03,989][71000] Updated weights for policy 0, policy_version 157064 (0.0030) [2024-06-13 04:11:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2573385728. Throughput: 0: 49722.2. Samples: 2102180620. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 04:11:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:11:07,809][71000] Updated weights for policy 0, policy_version 157074 (0.0021) [2024-06-13 04:11:09,374][70980] Signal inference workers to stop experience collection... (31300 times) [2024-06-13 04:11:09,379][70980] Signal inference workers to resume experience collection... (31300 times) [2024-06-13 04:11:09,384][71000] InferenceWorker_p0-w0: stopping experience collection (31300 times) [2024-06-13 04:11:09,404][71000] InferenceWorker_p0-w0: resuming experience collection (31300 times) [2024-06-13 04:11:10,791][71000] Updated weights for policy 0, policy_version 157084 (0.0025) [2024-06-13 04:11:10,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 2573664256. Throughput: 0: 49602.7. Samples: 2102477920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 04:11:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:11:14,400][71000] Updated weights for policy 0, policy_version 157094 (0.0031) [2024-06-13 04:11:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 2573877248. Throughput: 0: 49695.9. Samples: 2102782320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 04:11:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:11:17,186][71000] Updated weights for policy 0, policy_version 157104 (0.0027) [2024-06-13 04:11:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2574139392. Throughput: 0: 49716.4. Samples: 2102922780. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 04:11:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:11:21,102][71000] Updated weights for policy 0, policy_version 157114 (0.0023) [2024-06-13 04:11:24,068][71000] Updated weights for policy 0, policy_version 157124 (0.0036) [2024-06-13 04:11:25,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2574401536. Throughput: 0: 49554.2. Samples: 2103214300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-13 04:11:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:11:27,826][71000] Updated weights for policy 0, policy_version 157134 (0.0031) [2024-06-13 04:11:30,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2574630912. Throughput: 0: 49538.7. Samples: 2103507800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-13 04:11:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:11:31,080][71000] Updated weights for policy 0, policy_version 157144 (0.0032) [2024-06-13 04:11:34,529][71000] Updated weights for policy 0, policy_version 157154 (0.0030) [2024-06-13 04:11:35,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 2574893056. Throughput: 0: 49487.5. Samples: 2103661000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-13 04:11:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:11:37,501][71000] Updated weights for policy 0, policy_version 157164 (0.0024) [2024-06-13 04:11:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2575122432. Throughput: 0: 49355.4. Samples: 2103952080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-13 04:11:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:11:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000157173_2575122432.pth... 
[2024-06-13 04:11:41,033][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000156449_2563260416.pth [2024-06-13 04:11:41,188][71000] Updated weights for policy 0, policy_version 157174 (0.0025) [2024-06-13 04:11:44,049][71000] Updated weights for policy 0, policy_version 157184 (0.0032) [2024-06-13 04:11:45,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2575384576. Throughput: 0: 49176.1. Samples: 2104240460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-13 04:11:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:11:47,746][71000] Updated weights for policy 0, policy_version 157194 (0.0029) [2024-06-13 04:11:50,898][71000] Updated weights for policy 0, policy_version 157204 (0.0027) [2024-06-13 04:11:50,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48878.9, 300 sec: 49485.2). Total num frames: 2575630336. Throughput: 0: 49157.3. Samples: 2104392700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-13 04:11:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:11:54,151][71000] Updated weights for policy 0, policy_version 157214 (0.0027) [2024-06-13 04:11:55,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49971.0, 300 sec: 49485.2). Total num frames: 2575876096. Throughput: 0: 49157.1. Samples: 2104690000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-13 04:11:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:11:57,355][71000] Updated weights for policy 0, policy_version 157224 (0.0033) [2024-06-13 04:12:00,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 2576105472. Throughput: 0: 49268.0. Samples: 2104999380. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-13 04:12:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:12:00,958][71000] Updated weights for policy 0, policy_version 157234 (0.0024) [2024-06-13 04:12:03,721][71000] Updated weights for policy 0, policy_version 157244 (0.0026) [2024-06-13 04:12:05,941][70768] Fps is (10 sec: 49145.9, 60 sec: 49696.9, 300 sec: 49429.5). Total num frames: 2576367616. Throughput: 0: 49337.1. Samples: 2105143020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-13 04:12:05,942][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:12:07,555][71000] Updated weights for policy 0, policy_version 157254 (0.0036) [2024-06-13 04:12:10,703][71000] Updated weights for policy 0, policy_version 157264 (0.0022) [2024-06-13 04:12:10,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2576613376. Throughput: 0: 49153.0. Samples: 2105426180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-13 04:12:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:12:12,110][70980] Signal inference workers to stop experience collection... (31350 times) [2024-06-13 04:12:12,162][71000] InferenceWorker_p0-w0: stopping experience collection (31350 times) [2024-06-13 04:12:12,167][70980] Signal inference workers to resume experience collection... (31350 times) [2024-06-13 04:12:12,175][71000] InferenceWorker_p0-w0: resuming experience collection (31350 times) [2024-06-13 04:12:14,478][71000] Updated weights for policy 0, policy_version 157274 (0.0027) [2024-06-13 04:12:15,940][70768] Fps is (10 sec: 49159.0, 60 sec: 49698.1, 300 sec: 49485.3). Total num frames: 2576859136. Throughput: 0: 49248.0. Samples: 2105723960. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-13 04:12:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:12:17,143][71000] Updated weights for policy 0, policy_version 157284 (0.0030) [2024-06-13 04:12:20,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 2577088512. Throughput: 0: 49204.0. Samples: 2105875180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-13 04:12:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:12:21,043][71000] Updated weights for policy 0, policy_version 157294 (0.0027) [2024-06-13 04:12:24,059][71000] Updated weights for policy 0, policy_version 157304 (0.0020) [2024-06-13 04:12:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2577350656. Throughput: 0: 49483.1. Samples: 2106178820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-13 04:12:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:12:27,421][71000] Updated weights for policy 0, policy_version 157314 (0.0028) [2024-06-13 04:12:30,677][71000] Updated weights for policy 0, policy_version 157324 (0.0031) [2024-06-13 04:12:30,939][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 2577612800. Throughput: 0: 49623.6. Samples: 2106473520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-13 04:12:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:12:33,971][71000] Updated weights for policy 0, policy_version 157334 (0.0037) [2024-06-13 04:12:35,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2577842176. Throughput: 0: 49603.2. Samples: 2106624840. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:12:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:12:37,296][71000] Updated weights for policy 0, policy_version 157344 (0.0039) [2024-06-13 04:12:40,907][71000] Updated weights for policy 0, policy_version 157354 (0.0038) [2024-06-13 04:12:40,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.2, 300 sec: 49540.8). Total num frames: 2578087936. Throughput: 0: 49389.6. Samples: 2106912520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:12:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:12:43,860][71000] Updated weights for policy 0, policy_version 157364 (0.0030) [2024-06-13 04:12:45,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.9, 300 sec: 49485.2). Total num frames: 2578333696. Throughput: 0: 49210.1. Samples: 2107213840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:12:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:12:47,218][71000] Updated weights for policy 0, policy_version 157374 (0.0030) [2024-06-13 04:12:50,347][71000] Updated weights for policy 0, policy_version 157384 (0.0026) [2024-06-13 04:12:50,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.1, 300 sec: 49485.3). Total num frames: 2578595840. Throughput: 0: 49287.5. Samples: 2107360880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:12:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:12:54,013][71000] Updated weights for policy 0, policy_version 157394 (0.0033) [2024-06-13 04:12:55,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 2578841600. Throughput: 0: 49629.3. Samples: 2107659500. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:12:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:12:56,959][71000] Updated weights for policy 0, policy_version 157404 (0.0031) [2024-06-13 04:13:00,536][71000] Updated weights for policy 0, policy_version 157414 (0.0022) [2024-06-13 04:13:00,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49698.0, 300 sec: 49596.3). Total num frames: 2579087360. Throughput: 0: 49802.1. Samples: 2107965060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:13:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:13:03,294][71000] Updated weights for policy 0, policy_version 157424 (0.0028) [2024-06-13 04:13:05,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49153.3, 300 sec: 49485.3). Total num frames: 2579316736. Throughput: 0: 49617.8. Samples: 2108107980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:13:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:13:07,022][71000] Updated weights for policy 0, policy_version 157434 (0.0025) [2024-06-13 04:13:09,996][71000] Updated weights for policy 0, policy_version 157444 (0.0031) [2024-06-13 04:13:10,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2579578880. Throughput: 0: 49537.4. Samples: 2108408000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:13:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:13:13,676][71000] Updated weights for policy 0, policy_version 157454 (0.0033) [2024-06-13 04:13:15,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 2579841024. Throughput: 0: 49427.0. Samples: 2108697740. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:13:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:13:16,668][71000] Updated weights for policy 0, policy_version 157464 (0.0019) [2024-06-13 04:13:20,341][71000] Updated weights for policy 0, policy_version 157474 (0.0023) [2024-06-13 04:13:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 2580070400. Throughput: 0: 49462.1. Samples: 2108850640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:13:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:13:23,317][71000] Updated weights for policy 0, policy_version 157484 (0.0036) [2024-06-13 04:13:25,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 2580316160. Throughput: 0: 49684.3. Samples: 2109148320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:13:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:13:27,025][71000] Updated weights for policy 0, policy_version 157494 (0.0032) [2024-06-13 04:13:27,518][70980] Signal inference workers to stop experience collection... (31400 times) [2024-06-13 04:13:27,518][70980] Signal inference workers to resume experience collection... (31400 times) [2024-06-13 04:13:27,529][71000] InferenceWorker_p0-w0: stopping experience collection (31400 times) [2024-06-13 04:13:27,559][71000] InferenceWorker_p0-w0: resuming experience collection (31400 times) [2024-06-13 04:13:30,275][71000] Updated weights for policy 0, policy_version 157504 (0.0029) [2024-06-13 04:13:30,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2580561920. Throughput: 0: 49395.6. Samples: 2109436640. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:13:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:13:33,869][71000] Updated weights for policy 0, policy_version 157514 (0.0029) [2024-06-13 04:13:35,939][70768] Fps is (10 sec: 52429.6, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 2580840448. Throughput: 0: 49473.7. Samples: 2109587200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:13:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:13:36,842][71000] Updated weights for policy 0, policy_version 157524 (0.0028) [2024-06-13 04:13:40,735][71000] Updated weights for policy 0, policy_version 157534 (0.0021) [2024-06-13 04:13:40,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.0, 300 sec: 49540.7). Total num frames: 2581053440. Throughput: 0: 49341.2. Samples: 2109879860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 04:13:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:13:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000157535_2581053440.pth... [2024-06-13 04:13:41,024][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000156813_2569224192.pth [2024-06-13 04:13:42,994][71000] Updated weights for policy 0, policy_version 157544 (0.0036) [2024-06-13 04:13:45,939][70768] Fps is (10 sec: 44236.9, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 2581282816. Throughput: 0: 49226.4. Samples: 2110180240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 04:13:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:13:47,364][71000] Updated weights for policy 0, policy_version 157554 (0.0025) [2024-06-13 04:13:50,241][71000] Updated weights for policy 0, policy_version 157564 (0.0036) [2024-06-13 04:13:50,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2581544960. Throughput: 0: 49036.8. Samples: 2110314640. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 04:13:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:13:54,006][71000] Updated weights for policy 0, policy_version 157574 (0.0030) [2024-06-13 04:13:55,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49425.1, 300 sec: 49485.5). Total num frames: 2581807104. Throughput: 0: 49016.9. Samples: 2110613760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 04:13:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:13:56,754][71000] Updated weights for policy 0, policy_version 157584 (0.0030) [2024-06-13 04:14:00,526][71000] Updated weights for policy 0, policy_version 157594 (0.0025) [2024-06-13 04:14:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2582036480. Throughput: 0: 49105.8. Samples: 2110907500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 04:14:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:14:03,587][71000] Updated weights for policy 0, policy_version 157604 (0.0039) [2024-06-13 04:14:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 2582282240. Throughput: 0: 48911.1. Samples: 2111051640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 04:14:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:14:07,279][71000] Updated weights for policy 0, policy_version 157614 (0.0031) [2024-06-13 04:14:09,959][71000] Updated weights for policy 0, policy_version 157624 (0.0022) [2024-06-13 04:14:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2582528000. Throughput: 0: 49057.3. Samples: 2111355900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 04:14:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:14:13,797][71000] Updated weights for policy 0, policy_version 157634 (0.0026) [2024-06-13 04:14:15,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2582790144. Throughput: 0: 49028.4. Samples: 2111642920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 04:14:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:14:16,645][71000] Updated weights for policy 0, policy_version 157644 (0.0034) [2024-06-13 04:14:20,472][71000] Updated weights for policy 0, policy_version 157654 (0.0037) [2024-06-13 04:14:20,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2583019520. Throughput: 0: 49294.1. Samples: 2111805440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 04:14:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:14:23,554][71000] Updated weights for policy 0, policy_version 157664 (0.0024) [2024-06-13 04:14:24,748][70980] Signal inference workers to stop experience collection... (31450 times) [2024-06-13 04:14:24,749][70980] Signal inference workers to resume experience collection... (31450 times) [2024-06-13 04:14:24,778][71000] InferenceWorker_p0-w0: stopping experience collection (31450 times) [2024-06-13 04:14:24,778][71000] InferenceWorker_p0-w0: resuming experience collection (31450 times) [2024-06-13 04:14:25,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2583265280. Throughput: 0: 49247.1. Samples: 2112095980. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 04:14:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:14:26,827][71000] Updated weights for policy 0, policy_version 157674 (0.0026) [2024-06-13 04:14:30,085][71000] Updated weights for policy 0, policy_version 157684 (0.0024) [2024-06-13 04:14:30,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49151.8, 300 sec: 49429.7). Total num frames: 2583511040. Throughput: 0: 49130.4. Samples: 2112391120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 04:14:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:14:33,603][71000] Updated weights for policy 0, policy_version 157694 (0.0035) [2024-06-13 04:14:35,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48878.9, 300 sec: 49374.2). Total num frames: 2583773184. Throughput: 0: 49597.3. Samples: 2112546520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 04:14:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:14:36,802][71000] Updated weights for policy 0, policy_version 157704 (0.0031) [2024-06-13 04:14:40,435][71000] Updated weights for policy 0, policy_version 157714 (0.0032) [2024-06-13 04:14:40,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.1, 300 sec: 49374.1). Total num frames: 2584018944. Throughput: 0: 49325.2. Samples: 2112833400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 04:14:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:14:43,621][71000] Updated weights for policy 0, policy_version 157724 (0.0030) [2024-06-13 04:14:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2584248320. Throughput: 0: 49342.7. Samples: 2113127920. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:14:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:14:46,874][71000] Updated weights for policy 0, policy_version 157734 (0.0025) [2024-06-13 04:14:50,113][71000] Updated weights for policy 0, policy_version 157744 (0.0027) [2024-06-13 04:14:50,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49151.9, 300 sec: 49429.8). Total num frames: 2584494080. Throughput: 0: 49258.6. Samples: 2113268280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:14:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:14:53,394][71000] Updated weights for policy 0, policy_version 157754 (0.0032) [2024-06-13 04:14:55,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 2584772608. Throughput: 0: 49414.8. Samples: 2113579560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:14:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 04:14:56,652][71000] Updated weights for policy 0, policy_version 157764 (0.0026) [2024-06-13 04:14:59,953][71000] Updated weights for policy 0, policy_version 157774 (0.0028) [2024-06-13 04:15:00,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2585018368. Throughput: 0: 49673.7. Samples: 2113878240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:15:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:15:03,536][71000] Updated weights for policy 0, policy_version 157784 (0.0031) [2024-06-13 04:15:05,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2585247744. Throughput: 0: 49382.3. Samples: 2114027640. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:15:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:15:06,510][71000] Updated weights for policy 0, policy_version 157794 (0.0030) [2024-06-13 04:15:09,826][71000] Updated weights for policy 0, policy_version 157804 (0.0029) [2024-06-13 04:15:10,943][70768] Fps is (10 sec: 47496.0, 60 sec: 49422.1, 300 sec: 49429.1). Total num frames: 2585493504. Throughput: 0: 49600.9. Samples: 2114328200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:15:10,944][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:15:12,777][71000] Updated weights for policy 0, policy_version 157814 (0.0033) [2024-06-13 04:15:15,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2585755648. Throughput: 0: 49681.0. Samples: 2114626760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:15:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:15:16,173][71000] Updated weights for policy 0, policy_version 157824 (0.0029) [2024-06-13 04:15:19,743][71000] Updated weights for policy 0, policy_version 157834 (0.0029) [2024-06-13 04:15:20,940][70768] Fps is (10 sec: 50809.2, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2586001408. Throughput: 0: 49557.8. Samples: 2114776620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:15:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:15:22,850][71000] Updated weights for policy 0, policy_version 157844 (0.0031) [2024-06-13 04:15:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2586247168. Throughput: 0: 49864.4. Samples: 2115077300. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:15:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 04:15:26,574][71000] Updated weights for policy 0, policy_version 157854 (0.0032) [2024-06-13 04:15:29,608][71000] Updated weights for policy 0, policy_version 157864 (0.0032) [2024-06-13 04:15:30,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2586476544. Throughput: 0: 49870.1. Samples: 2115372080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:15:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:15:31,430][70980] Signal inference workers to stop experience collection... (31500 times) [2024-06-13 04:15:31,431][70980] Signal inference workers to resume experience collection... (31500 times) [2024-06-13 04:15:31,486][71000] InferenceWorker_p0-w0: stopping experience collection (31500 times) [2024-06-13 04:15:31,487][71000] InferenceWorker_p0-w0: resuming experience collection (31500 times) [2024-06-13 04:15:32,928][71000] Updated weights for policy 0, policy_version 157874 (0.0034) [2024-06-13 04:15:35,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2586755072. Throughput: 0: 49937.9. Samples: 2115515480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:15:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 04:15:36,315][71000] Updated weights for policy 0, policy_version 157884 (0.0026) [2024-06-13 04:15:39,467][71000] Updated weights for policy 0, policy_version 157894 (0.0030) [2024-06-13 04:15:40,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 2587000832. Throughput: 0: 49642.2. Samples: 2115813460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:15:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:15:41,051][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000157899_2587017216.pth... 
[2024-06-13 04:15:41,098][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000157173_2575122432.pth [2024-06-13 04:15:43,204][71000] Updated weights for policy 0, policy_version 157904 (0.0023) [2024-06-13 04:15:45,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2587230208. Throughput: 0: 49519.6. Samples: 2116106620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:15:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:15:46,459][71000] Updated weights for policy 0, policy_version 157914 (0.0029) [2024-06-13 04:15:49,955][71000] Updated weights for policy 0, policy_version 157924 (0.0029) [2024-06-13 04:15:50,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 2587475968. Throughput: 0: 49455.1. Samples: 2116253120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-13 04:15:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:15:53,015][71000] Updated weights for policy 0, policy_version 157934 (0.0034) [2024-06-13 04:15:55,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 2587721728. Throughput: 0: 49336.8. Samples: 2116548180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-13 04:15:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:15:56,367][71000] Updated weights for policy 0, policy_version 157944 (0.0026) [2024-06-13 04:15:59,671][71000] Updated weights for policy 0, policy_version 157954 (0.0032) [2024-06-13 04:16:00,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 2587967488. Throughput: 0: 49148.4. Samples: 2116838440. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-13 04:16:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:16:03,073][71000] Updated weights for policy 0, policy_version 157964 (0.0039) [2024-06-13 04:16:05,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2588213248. Throughput: 0: 49135.2. Samples: 2116987700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-13 04:16:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:16:06,609][71000] Updated weights for policy 0, policy_version 157974 (0.0029) [2024-06-13 04:16:09,818][71000] Updated weights for policy 0, policy_version 157984 (0.0029) [2024-06-13 04:16:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49428.1, 300 sec: 49429.7). Total num frames: 2588459008. Throughput: 0: 49002.7. Samples: 2117282420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-13 04:16:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:16:13,187][71000] Updated weights for policy 0, policy_version 157994 (0.0031) [2024-06-13 04:16:15,940][70768] Fps is (10 sec: 49150.6, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 2588704768. Throughput: 0: 49104.4. Samples: 2117581780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-13 04:16:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:16:16,295][71000] Updated weights for policy 0, policy_version 158004 (0.0029) [2024-06-13 04:16:19,770][71000] Updated weights for policy 0, policy_version 158014 (0.0030) [2024-06-13 04:16:20,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2588950528. Throughput: 0: 49063.1. Samples: 2117723320. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-13 04:16:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:16:23,168][71000] Updated weights for policy 0, policy_version 158024 (0.0029) [2024-06-13 04:16:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2589196288. Throughput: 0: 48941.2. Samples: 2118015820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-13 04:16:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:16:26,424][71000] Updated weights for policy 0, policy_version 158034 (0.0026) [2024-06-13 04:16:29,789][71000] Updated weights for policy 0, policy_version 158044 (0.0035) [2024-06-13 04:16:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.3, 300 sec: 49374.2). Total num frames: 2589458432. Throughput: 0: 49090.6. Samples: 2118315700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-13 04:16:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:16:32,973][71000] Updated weights for policy 0, policy_version 158054 (0.0028) [2024-06-13 04:16:35,940][70768] Fps is (10 sec: 49152.9, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 2589687808. Throughput: 0: 49052.4. Samples: 2118460480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-13 04:16:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:16:36,163][71000] Updated weights for policy 0, policy_version 158064 (0.0023) [2024-06-13 04:16:39,991][71000] Updated weights for policy 0, policy_version 158074 (0.0034) [2024-06-13 04:16:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2589933568. Throughput: 0: 49084.2. Samples: 2118756960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-13 04:16:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:16:42,361][70980] Signal inference workers to stop experience collection... 
(31550 times) [2024-06-13 04:16:42,364][70980] Signal inference workers to resume experience collection... (31550 times) [2024-06-13 04:16:42,371][71000] InferenceWorker_p0-w0: stopping experience collection (31550 times) [2024-06-13 04:16:42,395][71000] InferenceWorker_p0-w0: resuming experience collection (31550 times) [2024-06-13 04:16:43,131][71000] Updated weights for policy 0, policy_version 158084 (0.0031) [2024-06-13 04:16:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2590162944. Throughput: 0: 48918.4. Samples: 2119039760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-13 04:16:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:16:46,590][71000] Updated weights for policy 0, policy_version 158094 (0.0029) [2024-06-13 04:16:49,958][71000] Updated weights for policy 0, policy_version 158104 (0.0034) [2024-06-13 04:16:50,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2590408704. Throughput: 0: 48987.4. Samples: 2119192140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 04:16:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:16:53,207][71000] Updated weights for policy 0, policy_version 158114 (0.0024) [2024-06-13 04:16:55,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48606.0, 300 sec: 49263.1). Total num frames: 2590638080. Throughput: 0: 48859.7. Samples: 2119481100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 04:16:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:16:56,543][71000] Updated weights for policy 0, policy_version 158124 (0.0027) [2024-06-13 04:16:59,832][71000] Updated weights for policy 0, policy_version 158134 (0.0027) [2024-06-13 04:17:00,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.1, 300 sec: 49318.9). Total num frames: 2590916608. Throughput: 0: 48729.2. Samples: 2119774580. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 04:17:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:17:03,578][71000] Updated weights for policy 0, policy_version 158144 (0.0032) [2024-06-13 04:17:05,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2591162368. Throughput: 0: 48792.9. Samples: 2119919000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 04:17:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:17:06,732][71000] Updated weights for policy 0, policy_version 158154 (0.0035) [2024-06-13 04:17:10,458][71000] Updated weights for policy 0, policy_version 158164 (0.0025) [2024-06-13 04:17:10,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48606.0, 300 sec: 49207.5). Total num frames: 2591375360. Throughput: 0: 48933.5. Samples: 2120217820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 04:17:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:17:13,457][71000] Updated weights for policy 0, policy_version 158174 (0.0022) [2024-06-13 04:17:15,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48606.1, 300 sec: 49263.1). Total num frames: 2591621120. Throughput: 0: 48693.4. Samples: 2120506900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 04:17:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:17:17,067][71000] Updated weights for policy 0, policy_version 158184 (0.0023) [2024-06-13 04:17:20,194][71000] Updated weights for policy 0, policy_version 158194 (0.0039) [2024-06-13 04:17:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 49207.6). Total num frames: 2591866880. Throughput: 0: 48609.3. Samples: 2120647900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 04:17:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:17:23,455][71000] Updated weights for policy 0, policy_version 158204 (0.0026) [2024-06-13 04:17:25,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2592129024. Throughput: 0: 48535.0. Samples: 2120941040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 04:17:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:17:26,801][71000] Updated weights for policy 0, policy_version 158214 (0.0031) [2024-06-13 04:17:30,490][71000] Updated weights for policy 0, policy_version 158224 (0.0030) [2024-06-13 04:17:30,939][70768] Fps is (10 sec: 50790.9, 60 sec: 48605.9, 300 sec: 49263.1). Total num frames: 2592374784. Throughput: 0: 49038.3. Samples: 2121246480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 04:17:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:17:33,380][71000] Updated weights for policy 0, policy_version 158234 (0.0029) [2024-06-13 04:17:35,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.8, 300 sec: 49207.5). Total num frames: 2592604160. Throughput: 0: 48856.0. Samples: 2121390660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 04:17:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:17:36,971][71000] Updated weights for policy 0, policy_version 158244 (0.0022) [2024-06-13 04:17:40,074][71000] Updated weights for policy 0, policy_version 158254 (0.0020) [2024-06-13 04:17:40,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48605.8, 300 sec: 49207.5). Total num frames: 2592849920. Throughput: 0: 49028.3. Samples: 2121687380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 04:17:40,943][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:17:41,055][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000158256_2592866304.pth... 
[2024-06-13 04:17:41,109][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000157535_2581053440.pth [2024-06-13 04:17:43,263][71000] Updated weights for policy 0, policy_version 158264 (0.0035) [2024-06-13 04:17:45,939][70768] Fps is (10 sec: 52429.6, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2593128448. Throughput: 0: 49172.9. Samples: 2121987360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 04:17:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:17:46,481][71000] Updated weights for policy 0, policy_version 158274 (0.0022) [2024-06-13 04:17:50,235][71000] Updated weights for policy 0, policy_version 158284 (0.0029) [2024-06-13 04:17:50,548][70980] Signal inference workers to stop experience collection... (31600 times) [2024-06-13 04:17:50,548][70980] Signal inference workers to resume experience collection... (31600 times) [2024-06-13 04:17:50,559][71000] InferenceWorker_p0-w0: stopping experience collection (31600 times) [2024-06-13 04:17:50,559][71000] InferenceWorker_p0-w0: resuming experience collection (31600 times) [2024-06-13 04:17:50,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49424.9, 300 sec: 49263.0). Total num frames: 2593374208. Throughput: 0: 49325.5. Samples: 2122138660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 04:17:50,941][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:17:53,419][71000] Updated weights for policy 0, policy_version 158294 (0.0031) [2024-06-13 04:17:55,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2593587200. Throughput: 0: 48998.6. Samples: 2122422760. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:17:55,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 04:17:56,900][71000] Updated weights for policy 0, policy_version 158304 (0.0029) [2024-06-13 04:17:59,936][71000] Updated weights for policy 0, policy_version 158314 (0.0026) [2024-06-13 04:18:00,940][70768] Fps is (10 sec: 45876.5, 60 sec: 48605.8, 300 sec: 49207.5). Total num frames: 2593832960. Throughput: 0: 49153.3. Samples: 2122718800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:18:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:18:03,415][71000] Updated weights for policy 0, policy_version 158324 (0.0023) [2024-06-13 04:18:05,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2594111488. Throughput: 0: 49383.6. Samples: 2122870160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:18:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:18:06,536][71000] Updated weights for policy 0, policy_version 158334 (0.0030) [2024-06-13 04:18:10,164][71000] Updated weights for policy 0, policy_version 158344 (0.0023) [2024-06-13 04:18:10,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2594357248. Throughput: 0: 49687.1. Samples: 2123176960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:18:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:18:12,944][71000] Updated weights for policy 0, policy_version 158354 (0.0027) [2024-06-13 04:18:15,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2594586624. Throughput: 0: 49371.8. Samples: 2123468220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:18:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:18:16,663][71000] Updated weights for policy 0, policy_version 158364 (0.0023) [2024-06-13 04:18:19,705][71000] Updated weights for policy 0, policy_version 158374 (0.0036) [2024-06-13 04:18:20,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2594816000. Throughput: 0: 49268.9. Samples: 2123607760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:18:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:18:23,106][71000] Updated weights for policy 0, policy_version 158384 (0.0024) [2024-06-13 04:18:25,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2595094528. Throughput: 0: 49480.1. Samples: 2123913980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:18:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:18:26,532][71000] Updated weights for policy 0, policy_version 158394 (0.0029) [2024-06-13 04:18:29,863][71000] Updated weights for policy 0, policy_version 158404 (0.0030) [2024-06-13 04:18:30,940][70768] Fps is (10 sec: 55705.8, 60 sec: 49971.1, 300 sec: 49263.1). Total num frames: 2595373056. Throughput: 0: 49488.8. Samples: 2124214360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:18:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:18:32,918][71000] Updated weights for policy 0, policy_version 158414 (0.0040) [2024-06-13 04:18:35,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.2, 300 sec: 49207.6). Total num frames: 2595569664. Throughput: 0: 49316.8. Samples: 2124357900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:18:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:18:36,328][71000] Updated weights for policy 0, policy_version 158424 (0.0031) [2024-06-13 04:18:39,376][71000] Updated weights for policy 0, policy_version 158434 (0.0030) [2024-06-13 04:18:40,940][70768] Fps is (10 sec: 44236.8, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2595815424. Throughput: 0: 49489.4. Samples: 2124649780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:18:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:18:42,845][71000] Updated weights for policy 0, policy_version 158444 (0.0035) [2024-06-13 04:18:45,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2596077568. Throughput: 0: 49421.3. Samples: 2124942760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:18:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:18:46,524][71000] Updated weights for policy 0, policy_version 158454 (0.0031) [2024-06-13 04:18:49,641][71000] Updated weights for policy 0, policy_version 158464 (0.0029) [2024-06-13 04:18:50,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.3, 300 sec: 49263.1). Total num frames: 2596339712. Throughput: 0: 49531.1. Samples: 2125099060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:18:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:18:52,823][71000] Updated weights for policy 0, policy_version 158474 (0.0030) [2024-06-13 04:18:55,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49698.0, 300 sec: 49263.1). Total num frames: 2596569088. Throughput: 0: 49293.2. Samples: 2125395160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:18:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 04:18:56,176][71000] Updated weights for policy 0, policy_version 158484 (0.0035) [2024-06-13 04:18:57,587][70980] Signal inference workers to stop experience collection... 
(31650 times) [2024-06-13 04:18:57,612][71000] InferenceWorker_p0-w0: stopping experience collection (31650 times) [2024-06-13 04:18:57,645][70980] Signal inference workers to resume experience collection... (31650 times) [2024-06-13 04:18:57,645][71000] InferenceWorker_p0-w0: resuming experience collection (31650 times) [2024-06-13 04:18:59,757][71000] Updated weights for policy 0, policy_version 158494 (0.0028) [2024-06-13 04:19:00,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49698.0, 300 sec: 49263.1). Total num frames: 2596814848. Throughput: 0: 49391.5. Samples: 2125690840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 04:19:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:19:02,863][71000] Updated weights for policy 0, policy_version 158504 (0.0026) [2024-06-13 04:19:05,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2597060608. Throughput: 0: 49434.3. Samples: 2125832300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 04:19:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:19:06,303][71000] Updated weights for policy 0, policy_version 158514 (0.0026) [2024-06-13 04:19:09,479][71000] Updated weights for policy 0, policy_version 158524 (0.0024) [2024-06-13 04:19:10,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49424.9, 300 sec: 49263.0). Total num frames: 2597322752. Throughput: 0: 49438.9. Samples: 2126138740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 04:19:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:19:13,087][71000] Updated weights for policy 0, policy_version 158534 (0.0033) [2024-06-13 04:19:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2597552128. Throughput: 0: 49400.9. Samples: 2126437400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 04:19:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:19:16,148][71000] Updated weights for policy 0, policy_version 158544 (0.0026) [2024-06-13 04:19:19,468][71000] Updated weights for policy 0, policy_version 158554 (0.0028) [2024-06-13 04:19:20,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2597797888. Throughput: 0: 49373.7. Samples: 2126579720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 04:19:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:19:22,862][71000] Updated weights for policy 0, policy_version 158564 (0.0037) [2024-06-13 04:19:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2598043648. Throughput: 0: 49344.4. Samples: 2126870280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 04:19:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:19:26,650][71000] Updated weights for policy 0, policy_version 158574 (0.0023) [2024-06-13 04:19:29,697][71000] Updated weights for policy 0, policy_version 158584 (0.0025) [2024-06-13 04:19:30,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2598305792. Throughput: 0: 49244.9. Samples: 2127158780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 04:19:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:19:33,307][71000] Updated weights for policy 0, policy_version 158594 (0.0026) [2024-06-13 04:19:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49207.6). Total num frames: 2598535168. Throughput: 0: 49299.5. Samples: 2127317540. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 04:19:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:19:36,220][71000] Updated weights for policy 0, policy_version 158604 (0.0028) [2024-06-13 04:19:39,910][71000] Updated weights for policy 0, policy_version 158614 (0.0034) [2024-06-13 04:19:40,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2598780928. Throughput: 0: 49241.5. Samples: 2127611020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 04:19:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:19:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000158617_2598780928.pth... [2024-06-13 04:19:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000157899_2587017216.pth [2024-06-13 04:19:43,165][71000] Updated weights for policy 0, policy_version 158624 (0.0033) [2024-06-13 04:19:45,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2599010304. Throughput: 0: 48900.9. Samples: 2127891380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 04:19:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:19:46,609][71000] Updated weights for policy 0, policy_version 158634 (0.0030) [2024-06-13 04:19:49,929][71000] Updated weights for policy 0, policy_version 158644 (0.0032) [2024-06-13 04:19:50,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2599288832. Throughput: 0: 48895.5. Samples: 2128032600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 04:19:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:19:53,460][71000] Updated weights for policy 0, policy_version 158654 (0.0026) [2024-06-13 04:19:55,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2599518208. Throughput: 0: 48750.4. Samples: 2128332500. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 04:19:55,950][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:19:56,499][71000] Updated weights for policy 0, policy_version 158664 (0.0022) [2024-06-13 04:20:00,279][71000] Updated weights for policy 0, policy_version 158674 (0.0029) [2024-06-13 04:20:00,939][70768] Fps is (10 sec: 45875.3, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2599747584. Throughput: 0: 48584.9. Samples: 2128623720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 04:20:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:20:02,987][71000] Updated weights for policy 0, policy_version 158684 (0.0032) [2024-06-13 04:20:05,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48605.8, 300 sec: 49097.1). Total num frames: 2599976960. Throughput: 0: 48501.8. Samples: 2128762300. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-13 04:20:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:20:06,919][71000] Updated weights for policy 0, policy_version 158694 (0.0024) [2024-06-13 04:20:08,251][70980] Signal inference workers to stop experience collection... (31700 times) [2024-06-13 04:20:08,298][71000] InferenceWorker_p0-w0: stopping experience collection (31700 times) [2024-06-13 04:20:08,305][70980] Signal inference workers to resume experience collection... (31700 times) [2024-06-13 04:20:08,311][71000] InferenceWorker_p0-w0: resuming experience collection (31700 times) [2024-06-13 04:20:09,838][71000] Updated weights for policy 0, policy_version 158704 (0.0023) [2024-06-13 04:20:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2600255488. Throughput: 0: 48732.5. Samples: 2129063240. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-13 04:20:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:20:13,559][71000] Updated weights for policy 0, policy_version 158714 (0.0025) [2024-06-13 04:20:15,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2600501248. Throughput: 0: 48881.6. Samples: 2129358460. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-13 04:20:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:20:16,360][71000] Updated weights for policy 0, policy_version 158724 (0.0028) [2024-06-13 04:20:20,277][71000] Updated weights for policy 0, policy_version 158734 (0.0029) [2024-06-13 04:20:20,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2600714240. Throughput: 0: 48588.9. Samples: 2129504040. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-13 04:20:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:20:23,299][71000] Updated weights for policy 0, policy_version 158744 (0.0030) [2024-06-13 04:20:25,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48605.8, 300 sec: 49096.5). Total num frames: 2600960000. Throughput: 0: 48534.1. Samples: 2129795060. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-13 04:20:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:20:26,948][71000] Updated weights for policy 0, policy_version 158754 (0.0029) [2024-06-13 04:20:29,775][71000] Updated weights for policy 0, policy_version 158764 (0.0030) [2024-06-13 04:20:30,940][70768] Fps is (10 sec: 52428.0, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 2601238528. Throughput: 0: 48986.2. Samples: 2130095760. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-13 04:20:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:20:33,925][71000] Updated weights for policy 0, policy_version 158774 (0.0036) [2024-06-13 04:20:35,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49151.8, 300 sec: 49096.4). Total num frames: 2601484288. Throughput: 0: 49219.3. Samples: 2130247480. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-13 04:20:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:20:36,502][71000] Updated weights for policy 0, policy_version 158784 (0.0026) [2024-06-13 04:20:40,578][71000] Updated weights for policy 0, policy_version 158794 (0.0022) [2024-06-13 04:20:40,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 2601697280. Throughput: 0: 49114.7. Samples: 2130542660. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-13 04:20:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:20:42,883][71000] Updated weights for policy 0, policy_version 158804 (0.0034) [2024-06-13 04:20:45,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 2601959424. Throughput: 0: 49283.4. Samples: 2130841480. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-13 04:20:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:20:47,100][71000] Updated weights for policy 0, policy_version 158814 (0.0028) [2024-06-13 04:20:49,655][71000] Updated weights for policy 0, policy_version 158824 (0.0034) [2024-06-13 04:20:50,940][70768] Fps is (10 sec: 52429.1, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2602221568. Throughput: 0: 49437.8. Samples: 2130987000. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-13 04:20:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:20:53,589][71000] Updated weights for policy 0, policy_version 158834 (0.0025) [2024-06-13 04:20:55,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2602483712. Throughput: 0: 49427.0. Samples: 2131287460. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-13 04:20:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:20:56,071][71000] Updated weights for policy 0, policy_version 158844 (0.0029) [2024-06-13 04:21:00,219][71000] Updated weights for policy 0, policy_version 158854 (0.0032) [2024-06-13 04:21:00,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2602696704. Throughput: 0: 49441.9. Samples: 2131583340. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-13 04:21:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:21:02,774][71000] Updated weights for policy 0, policy_version 158864 (0.0030) [2024-06-13 04:21:05,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2602942464. Throughput: 0: 49277.2. Samples: 2131721520. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-13 04:21:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:21:06,985][71000] Updated weights for policy 0, policy_version 158874 (0.0025) [2024-06-13 04:21:09,443][71000] Updated weights for policy 0, policy_version 158884 (0.0030) [2024-06-13 04:21:10,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2603204608. Throughput: 0: 49400.0. Samples: 2132018060. 
Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-06-13 04:21:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:21:13,545][71000] Updated weights for policy 0, policy_version 158894 (0.0032) [2024-06-13 04:21:15,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2603450368. Throughput: 0: 49259.3. Samples: 2132312420. Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-06-13 04:21:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:21:16,256][71000] Updated weights for policy 0, policy_version 158904 (0.0028) [2024-06-13 04:21:20,265][71000] Updated weights for policy 0, policy_version 158914 (0.0038) [2024-06-13 04:21:20,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49424.9, 300 sec: 49096.5). Total num frames: 2603679744. Throughput: 0: 49262.3. Samples: 2132464280. Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-06-13 04:21:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:21:21,014][70980] Signal inference workers to stop experience collection... (31750 times) [2024-06-13 04:21:21,045][71000] InferenceWorker_p0-w0: stopping experience collection (31750 times) [2024-06-13 04:21:21,060][70980] Signal inference workers to resume experience collection... (31750 times) [2024-06-13 04:21:21,064][71000] InferenceWorker_p0-w0: resuming experience collection (31750 times) [2024-06-13 04:21:22,732][71000] Updated weights for policy 0, policy_version 158924 (0.0031) [2024-06-13 04:21:25,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2603909120. Throughput: 0: 49250.8. Samples: 2132758940. 
Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-06-13 04:21:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 04:21:26,801][71000] Updated weights for policy 0, policy_version 158934 (0.0028) [2024-06-13 04:21:29,032][71000] Updated weights for policy 0, policy_version 158944 (0.0028) [2024-06-13 04:21:30,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2604187648. Throughput: 0: 49286.7. Samples: 2133059380. Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-06-13 04:21:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:21:33,164][71000] Updated weights for policy 0, policy_version 158954 (0.0026) [2024-06-13 04:21:35,940][70768] Fps is (10 sec: 54067.3, 60 sec: 49425.3, 300 sec: 49207.5). Total num frames: 2604449792. Throughput: 0: 49410.7. Samples: 2133210480. Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-06-13 04:21:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:21:36,132][71000] Updated weights for policy 0, policy_version 158964 (0.0023) [2024-06-13 04:21:39,875][71000] Updated weights for policy 0, policy_version 158974 (0.0029) [2024-06-13 04:21:40,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 2604695552. Throughput: 0: 49544.9. Samples: 2133516980. Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-06-13 04:21:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 04:21:40,960][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000158978_2604695552.pth... [2024-06-13 04:21:41,004][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000158256_2592866304.pth [2024-06-13 04:21:42,830][71000] Updated weights for policy 0, policy_version 158984 (0.0028) [2024-06-13 04:21:45,940][70768] Fps is (10 sec: 44236.2, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2604892160. Throughput: 0: 49407.8. Samples: 2133806700. 
Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-06-13 04:21:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:21:46,609][71000] Updated weights for policy 0, policy_version 158994 (0.0020) [2024-06-13 04:21:49,259][71000] Updated weights for policy 0, policy_version 159004 (0.0022) [2024-06-13 04:21:50,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2605170688. Throughput: 0: 49477.8. Samples: 2133948020. Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-06-13 04:21:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:21:53,061][71000] Updated weights for policy 0, policy_version 159014 (0.0025) [2024-06-13 04:21:55,939][70768] Fps is (10 sec: 54068.1, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2605432832. Throughput: 0: 49418.4. Samples: 2134241880. Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-06-13 04:21:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:21:55,968][71000] Updated weights for policy 0, policy_version 159024 (0.0024) [2024-06-13 04:21:59,765][71000] Updated weights for policy 0, policy_version 159034 (0.0032) [2024-06-13 04:22:00,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2605662208. Throughput: 0: 49421.8. Samples: 2134536400. Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-06-13 04:22:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:22:02,937][71000] Updated weights for policy 0, policy_version 159044 (0.0039) [2024-06-13 04:22:05,939][70768] Fps is (10 sec: 44236.7, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2605875200. Throughput: 0: 49118.5. Samples: 2134674600. 
Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-06-13 04:22:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:22:06,600][71000] Updated weights for policy 0, policy_version 159054 (0.0028) [2024-06-13 04:22:09,427][71000] Updated weights for policy 0, policy_version 159064 (0.0042) [2024-06-13 04:22:10,940][70768] Fps is (10 sec: 47512.6, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2606137344. Throughput: 0: 49245.1. Samples: 2134974980. Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-06-13 04:22:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:22:13,265][71000] Updated weights for policy 0, policy_version 159074 (0.0043) [2024-06-13 04:22:15,940][70768] Fps is (10 sec: 54066.2, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2606415872. Throughput: 0: 49175.9. Samples: 2135272300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:22:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:22:16,185][71000] Updated weights for policy 0, policy_version 159084 (0.0029) [2024-06-13 04:22:19,474][71000] Updated weights for policy 0, policy_version 159094 (0.0030) [2024-06-13 04:22:20,939][70768] Fps is (10 sec: 50791.8, 60 sec: 49425.3, 300 sec: 49207.6). Total num frames: 2606645248. Throughput: 0: 49271.6. Samples: 2135427700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:22:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:22:22,696][71000] Updated weights for policy 0, policy_version 159104 (0.0028) [2024-06-13 04:22:25,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2606874624. Throughput: 0: 48953.5. Samples: 2135719880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:22:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:22:26,314][71000] Updated weights for policy 0, policy_version 159114 (0.0023) [2024-06-13 04:22:29,323][71000] Updated weights for policy 0, policy_version 159124 (0.0030) [2024-06-13 04:22:30,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2607120384. Throughput: 0: 49032.0. Samples: 2136013140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:22:30,949][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 04:22:32,354][70980] Signal inference workers to stop experience collection... (31800 times) [2024-06-13 04:22:32,354][70980] Signal inference workers to resume experience collection... (31800 times) [2024-06-13 04:22:32,374][71000] InferenceWorker_p0-w0: stopping experience collection (31800 times) [2024-06-13 04:22:32,374][71000] InferenceWorker_p0-w0: resuming experience collection (31800 times) [2024-06-13 04:22:33,031][71000] Updated weights for policy 0, policy_version 159134 (0.0035) [2024-06-13 04:22:35,878][71000] Updated weights for policy 0, policy_version 159144 (0.0030) [2024-06-13 04:22:35,940][70768] Fps is (10 sec: 54066.3, 60 sec: 49424.9, 300 sec: 49374.1). Total num frames: 2607415296. Throughput: 0: 49360.8. Samples: 2136169260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:22:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:22:39,559][71000] Updated weights for policy 0, policy_version 159154 (0.0025) [2024-06-13 04:22:40,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2607628288. Throughput: 0: 49097.1. Samples: 2136451260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:22:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 04:22:42,962][71000] Updated weights for policy 0, policy_version 159164 (0.0033) [2024-06-13 04:22:45,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 2607874048. Throughput: 0: 49120.5. Samples: 2136746820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:22:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:22:46,421][71000] Updated weights for policy 0, policy_version 159174 (0.0031) [2024-06-13 04:22:49,613][71000] Updated weights for policy 0, policy_version 159184 (0.0030) [2024-06-13 04:22:50,940][70768] Fps is (10 sec: 47514.4, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2608103424. Throughput: 0: 49060.4. Samples: 2136882320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:22:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:22:53,130][71000] Updated weights for policy 0, policy_version 159194 (0.0021) [2024-06-13 04:22:55,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2608381952. Throughput: 0: 49194.4. Samples: 2137188720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:22:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:22:56,327][71000] Updated weights for policy 0, policy_version 159204 (0.0029) [2024-06-13 04:22:59,673][71000] Updated weights for policy 0, policy_version 159214 (0.0022) [2024-06-13 04:23:00,944][70768] Fps is (10 sec: 50768.4, 60 sec: 49148.5, 300 sec: 49151.3). Total num frames: 2608611328. Throughput: 0: 49109.2. Samples: 2137482420. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:23:00,944][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:23:02,607][71000] Updated weights for policy 0, policy_version 159224 (0.0033) [2024-06-13 04:23:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49698.0, 300 sec: 49152.0). Total num frames: 2608857088. Throughput: 0: 49025.1. Samples: 2137633840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:23:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:23:06,148][71000] Updated weights for policy 0, policy_version 159234 (0.0029) [2024-06-13 04:23:09,305][71000] Updated weights for policy 0, policy_version 159244 (0.0026) [2024-06-13 04:23:10,939][70768] Fps is (10 sec: 47534.3, 60 sec: 49152.2, 300 sec: 49152.0). Total num frames: 2609086464. Throughput: 0: 49125.8. Samples: 2137930540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 04:23:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:23:12,917][71000] Updated weights for policy 0, policy_version 159254 (0.0025) [2024-06-13 04:23:15,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2609364992. Throughput: 0: 49077.9. Samples: 2138221640. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 04:23:15,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 04:23:16,015][71000] Updated weights for policy 0, policy_version 159264 (0.0032) [2024-06-13 04:23:19,655][71000] Updated weights for policy 0, policy_version 159274 (0.0028) [2024-06-13 04:23:20,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2609627136. Throughput: 0: 49149.5. Samples: 2138380980. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 04:23:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:23:22,330][71000] Updated weights for policy 0, policy_version 159284 (0.0028) [2024-06-13 04:23:25,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 2609840128. Throughput: 0: 49501.0. Samples: 2138678800. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 04:23:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:23:26,251][71000] Updated weights for policy 0, policy_version 159294 (0.0032) [2024-06-13 04:23:29,150][71000] Updated weights for policy 0, policy_version 159304 (0.0030) [2024-06-13 04:23:30,940][70768] Fps is (10 sec: 44236.8, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2610069504. Throughput: 0: 49379.6. Samples: 2138968900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 04:23:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:23:32,950][71000] Updated weights for policy 0, policy_version 159314 (0.0028) [2024-06-13 04:23:35,762][71000] Updated weights for policy 0, policy_version 159324 (0.0030) [2024-06-13 04:23:35,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2610364416. Throughput: 0: 49625.6. Samples: 2139115480. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 04:23:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:23:39,585][70980] Signal inference workers to stop experience collection... (31850 times) [2024-06-13 04:23:39,588][70980] Signal inference workers to resume experience collection... 
(31850 times) [2024-06-13 04:23:39,592][71000] Updated weights for policy 0, policy_version 159334 (0.0027) [2024-06-13 04:23:39,600][71000] InferenceWorker_p0-w0: stopping experience collection (31850 times) [2024-06-13 04:23:39,600][71000] InferenceWorker_p0-w0: resuming experience collection (31850 times) [2024-06-13 04:23:40,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49698.3, 300 sec: 49263.1). Total num frames: 2610610176. Throughput: 0: 49564.5. Samples: 2139419120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 04:23:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:23:40,956][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000159339_2610610176.pth... [2024-06-13 04:23:40,995][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000158617_2598780928.pth [2024-06-13 04:23:42,634][71000] Updated weights for policy 0, policy_version 159344 (0.0028) [2024-06-13 04:23:45,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2610839552. Throughput: 0: 49576.6. Samples: 2139713160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 04:23:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:23:46,119][71000] Updated weights for policy 0, policy_version 159354 (0.0024) [2024-06-13 04:23:49,188][71000] Updated weights for policy 0, policy_version 159364 (0.0025) [2024-06-13 04:23:50,939][70768] Fps is (10 sec: 45875.3, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2611068928. Throughput: 0: 49439.3. Samples: 2139858600. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 04:23:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:23:52,553][71000] Updated weights for policy 0, policy_version 159374 (0.0033) [2024-06-13 04:23:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2611331072. Throughput: 0: 49276.7. Samples: 2140148000. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 04:23:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:23:56,079][71000] Updated weights for policy 0, policy_version 159384 (0.0022) [2024-06-13 04:23:59,658][71000] Updated weights for policy 0, policy_version 159394 (0.0029) [2024-06-13 04:24:00,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49701.8, 300 sec: 49263.1). Total num frames: 2611593216. Throughput: 0: 49257.4. Samples: 2140438220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 04:24:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:24:03,046][71000] Updated weights for policy 0, policy_version 159404 (0.0038) [2024-06-13 04:24:05,939][70768] Fps is (10 sec: 47514.6, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2611806208. Throughput: 0: 49141.4. Samples: 2140592340. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 04:24:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:24:06,196][71000] Updated weights for policy 0, policy_version 159414 (0.0035) [2024-06-13 04:24:09,836][71000] Updated weights for policy 0, policy_version 159424 (0.0029) [2024-06-13 04:24:10,940][70768] Fps is (10 sec: 44236.1, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2612035584. Throughput: 0: 49075.1. Samples: 2140887180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 04:24:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:24:13,050][71000] Updated weights for policy 0, policy_version 159434 (0.0030) [2024-06-13 04:24:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2612297728. Throughput: 0: 48986.6. Samples: 2141173300. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 04:24:15,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 04:24:16,455][71000] Updated weights for policy 0, policy_version 159444 (0.0026) [2024-06-13 04:24:19,614][71000] Updated weights for policy 0, policy_version 159454 (0.0027) [2024-06-13 04:24:20,940][70768] Fps is (10 sec: 54067.5, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2612576256. Throughput: 0: 49149.9. Samples: 2141327220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-13 04:24:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:24:22,923][71000] Updated weights for policy 0, policy_version 159464 (0.0027) [2024-06-13 04:24:25,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2612805632. Throughput: 0: 49070.5. Samples: 2141627300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-13 04:24:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:24:26,306][71000] Updated weights for policy 0, policy_version 159474 (0.0027) [2024-06-13 04:24:29,829][71000] Updated weights for policy 0, policy_version 159484 (0.0021) [2024-06-13 04:24:30,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2613035008. Throughput: 0: 49059.7. Samples: 2141920840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-13 04:24:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:24:32,941][71000] Updated weights for policy 0, policy_version 159494 (0.0032) [2024-06-13 04:24:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2613280768. Throughput: 0: 48992.8. Samples: 2142063280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-13 04:24:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:24:36,438][71000] Updated weights for policy 0, policy_version 159504 (0.0025) [2024-06-13 04:24:39,491][71000] Updated weights for policy 0, policy_version 159514 (0.0032) [2024-06-13 04:24:39,973][70980] Signal inference workers to stop experience collection... (31900 times) [2024-06-13 04:24:40,015][71000] InferenceWorker_p0-w0: stopping experience collection (31900 times) [2024-06-13 04:24:40,023][70980] Signal inference workers to resume experience collection... (31900 times) [2024-06-13 04:24:40,032][71000] InferenceWorker_p0-w0: resuming experience collection (31900 times) [2024-06-13 04:24:40,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2613559296. Throughput: 0: 49117.8. Samples: 2142358300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-13 04:24:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 04:24:43,403][71000] Updated weights for policy 0, policy_version 159524 (0.0031) [2024-06-13 04:24:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2613788672. Throughput: 0: 49308.7. Samples: 2142657120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-13 04:24:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:24:46,089][71000] Updated weights for policy 0, policy_version 159534 (0.0028) [2024-06-13 04:24:49,992][71000] Updated weights for policy 0, policy_version 159544 (0.0023) [2024-06-13 04:24:50,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2614034432. Throughput: 0: 49106.2. Samples: 2142802120. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-13 04:24:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:24:52,826][71000] Updated weights for policy 0, policy_version 159554 (0.0032) [2024-06-13 04:24:55,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2614263808. Throughput: 0: 49046.2. Samples: 2143094260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-13 04:24:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:24:56,487][71000] Updated weights for policy 0, policy_version 159564 (0.0032) [2024-06-13 04:24:59,514][71000] Updated weights for policy 0, policy_version 159574 (0.0035) [2024-06-13 04:25:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49151.9, 300 sec: 49374.2). Total num frames: 2614542336. Throughput: 0: 49441.3. Samples: 2143398160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-13 04:25:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:25:03,096][71000] Updated weights for policy 0, policy_version 159584 (0.0030) [2024-06-13 04:25:05,850][71000] Updated weights for policy 0, policy_version 159594 (0.0032) [2024-06-13 04:25:05,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.0, 300 sec: 49263.1). Total num frames: 2614788096. Throughput: 0: 49333.3. Samples: 2143547220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-13 04:25:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:25:09,619][71000] Updated weights for policy 0, policy_version 159604 (0.0026) [2024-06-13 04:25:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 2615033856. Throughput: 0: 49277.3. Samples: 2143844780. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-13 04:25:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:25:12,708][71000] Updated weights for policy 0, policy_version 159614 (0.0035) [2024-06-13 04:25:15,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2615246848. Throughput: 0: 49271.9. Samples: 2144138080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-13 04:25:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:25:16,257][71000] Updated weights for policy 0, policy_version 159624 (0.0032) [2024-06-13 04:25:19,234][71000] Updated weights for policy 0, policy_version 159634 (0.0034) [2024-06-13 04:25:20,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 2615508992. Throughput: 0: 49433.0. Samples: 2144287760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-13 04:25:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:25:22,889][71000] Updated weights for policy 0, policy_version 159644 (0.0029) [2024-06-13 04:25:25,732][71000] Updated weights for policy 0, policy_version 159654 (0.0032) [2024-06-13 04:25:25,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2615771136. Throughput: 0: 49326.8. Samples: 2144578000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:25:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:25:29,373][71000] Updated weights for policy 0, policy_version 159664 (0.0032) [2024-06-13 04:25:30,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49207.6). Total num frames: 2616000512. Throughput: 0: 49360.1. Samples: 2144878320. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:25:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:25:32,339][71000] Updated weights for policy 0, policy_version 159674 (0.0025) [2024-06-13 04:25:35,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2616229888. Throughput: 0: 49249.2. Samples: 2145018340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:25:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:25:36,096][71000] Updated weights for policy 0, policy_version 159684 (0.0027) [2024-06-13 04:25:39,238][71000] Updated weights for policy 0, policy_version 159694 (0.0028) [2024-06-13 04:25:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48606.0, 300 sec: 49207.6). Total num frames: 2616475648. Throughput: 0: 49213.8. Samples: 2145308880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:25:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:25:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000159697_2616475648.pth... [2024-06-13 04:25:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000158978_2604695552.pth [2024-06-13 04:25:42,921][71000] Updated weights for policy 0, policy_version 159704 (0.0029) [2024-06-13 04:25:45,766][71000] Updated weights for policy 0, policy_version 159714 (0.0025) [2024-06-13 04:25:45,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 2616754176. Throughput: 0: 49036.0. Samples: 2145604780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:25:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:25:49,350][71000] Updated weights for policy 0, policy_version 159724 (0.0028) [2024-06-13 04:25:50,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2616983552. Throughput: 0: 49358.3. Samples: 2145768340. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:25:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:25:52,182][71000] Updated weights for policy 0, policy_version 159734 (0.0024) [2024-06-13 04:25:55,803][71000] Updated weights for policy 0, policy_version 159744 (0.0033) [2024-06-13 04:25:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 2617245696. Throughput: 0: 49343.7. Samples: 2146065240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:25:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:25:58,520][70980] Signal inference workers to stop experience collection... (31950 times) [2024-06-13 04:25:58,571][71000] InferenceWorker_p0-w0: stopping experience collection (31950 times) [2024-06-13 04:25:58,631][70980] Signal inference workers to resume experience collection... (31950 times) [2024-06-13 04:25:58,631][71000] InferenceWorker_p0-w0: resuming experience collection (31950 times) [2024-06-13 04:25:59,006][71000] Updated weights for policy 0, policy_version 159754 (0.0024) [2024-06-13 04:26:00,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.8, 300 sec: 49207.5). Total num frames: 2617458688. Throughput: 0: 49261.3. Samples: 2146354840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:26:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:26:02,752][71000] Updated weights for policy 0, policy_version 159764 (0.0039) [2024-06-13 04:26:05,671][71000] Updated weights for policy 0, policy_version 159774 (0.0028) [2024-06-13 04:26:05,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2617753600. Throughput: 0: 49102.2. Samples: 2146497360. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:26:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:26:09,342][71000] Updated weights for policy 0, policy_version 159784 (0.0024) [2024-06-13 04:26:10,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2617966592. Throughput: 0: 49235.1. Samples: 2146793580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:26:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:26:12,584][71000] Updated weights for policy 0, policy_version 159794 (0.0029) [2024-06-13 04:26:15,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2618212352. Throughput: 0: 49091.1. Samples: 2147087420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:26:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:26:16,123][71000] Updated weights for policy 0, policy_version 159804 (0.0025) [2024-06-13 04:26:19,323][71000] Updated weights for policy 0, policy_version 159814 (0.0026) [2024-06-13 04:26:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2618441728. Throughput: 0: 49166.3. Samples: 2147230820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:26:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:26:22,579][71000] Updated weights for policy 0, policy_version 159824 (0.0034) [2024-06-13 04:26:25,852][71000] Updated weights for policy 0, policy_version 159834 (0.0026) [2024-06-13 04:26:25,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2618720256. Throughput: 0: 49118.7. Samples: 2147519220. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:26:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:26:29,502][71000] Updated weights for policy 0, policy_version 159844 (0.0031) [2024-06-13 04:26:30,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2618949632. Throughput: 0: 49058.2. Samples: 2147812400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 04:26:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 04:26:32,457][71000] Updated weights for policy 0, policy_version 159854 (0.0029) [2024-06-13 04:26:35,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2619195392. Throughput: 0: 48816.8. Samples: 2147965100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 04:26:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:26:35,949][71000] Updated weights for policy 0, policy_version 159864 (0.0028) [2024-06-13 04:26:39,246][71000] Updated weights for policy 0, policy_version 159874 (0.0026) [2024-06-13 04:26:40,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2619424768. Throughput: 0: 48859.0. Samples: 2148263900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 04:26:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:26:42,722][71000] Updated weights for policy 0, policy_version 159884 (0.0031) [2024-06-13 04:26:45,717][71000] Updated weights for policy 0, policy_version 159894 (0.0024) [2024-06-13 04:26:45,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2619719680. Throughput: 0: 49166.6. Samples: 2148567340. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 04:26:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:26:49,357][71000] Updated weights for policy 0, policy_version 159904 (0.0029) [2024-06-13 04:26:50,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2619949056. Throughput: 0: 49391.9. Samples: 2148720000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 04:26:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:26:52,139][71000] Updated weights for policy 0, policy_version 159914 (0.0024) [2024-06-13 04:26:55,858][71000] Updated weights for policy 0, policy_version 159924 (0.0029) [2024-06-13 04:26:55,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2620194816. Throughput: 0: 49444.5. Samples: 2149018580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 04:26:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:26:59,074][71000] Updated weights for policy 0, policy_version 159934 (0.0020) [2024-06-13 04:27:00,942][70768] Fps is (10 sec: 47501.6, 60 sec: 49423.0, 300 sec: 49318.2). Total num frames: 2620424192. Throughput: 0: 49411.8. Samples: 2149311080. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 04:27:00,943][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:27:02,414][71000] Updated weights for policy 0, policy_version 159944 (0.0031) [2024-06-13 04:27:05,722][71000] Updated weights for policy 0, policy_version 159954 (0.0034) [2024-06-13 04:27:05,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2620702720. Throughput: 0: 49514.3. Samples: 2149458960. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 04:27:05,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 04:27:09,191][71000] Updated weights for policy 0, policy_version 159964 (0.0027) [2024-06-13 04:27:10,940][70768] Fps is (10 sec: 52442.8, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2620948480. Throughput: 0: 49823.5. Samples: 2149761280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 04:27:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:27:12,034][71000] Updated weights for policy 0, policy_version 159974 (0.0027) [2024-06-13 04:27:15,732][71000] Updated weights for policy 0, policy_version 159984 (0.0031) [2024-06-13 04:27:15,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2621177856. Throughput: 0: 49840.8. Samples: 2150055240. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 04:27:15,941][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:27:19,051][71000] Updated weights for policy 0, policy_version 159994 (0.0044) [2024-06-13 04:27:20,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2621407232. Throughput: 0: 49469.5. Samples: 2150191220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 04:27:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:27:22,465][70980] Signal inference workers to stop experience collection... (32000 times) [2024-06-13 04:27:22,465][70980] Signal inference workers to resume experience collection... 
(32000 times) [2024-06-13 04:27:22,487][71000] InferenceWorker_p0-w0: stopping experience collection (32000 times) [2024-06-13 04:27:22,488][71000] InferenceWorker_p0-w0: resuming experience collection (32000 times) [2024-06-13 04:27:22,595][71000] Updated weights for policy 0, policy_version 160004 (0.0037) [2024-06-13 04:27:25,653][71000] Updated weights for policy 0, policy_version 160014 (0.0031) [2024-06-13 04:27:25,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2621669376. Throughput: 0: 49274.0. Samples: 2150481220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 04:27:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:27:29,328][71000] Updated weights for policy 0, policy_version 160024 (0.0022) [2024-06-13 04:27:30,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2621915136. Throughput: 0: 49159.3. Samples: 2150779500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 04:27:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:27:32,290][71000] Updated weights for policy 0, policy_version 160034 (0.0024) [2024-06-13 04:27:35,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2622144512. Throughput: 0: 49107.2. Samples: 2150929820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 04:27:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:27:36,080][71000] Updated weights for policy 0, policy_version 160044 (0.0039) [2024-06-13 04:27:38,879][71000] Updated weights for policy 0, policy_version 160054 (0.0025) [2024-06-13 04:27:40,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.2, 300 sec: 49207.5). Total num frames: 2622390272. Throughput: 0: 48693.3. Samples: 2151209780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 04:27:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:27:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000160058_2622390272.pth... [2024-06-13 04:27:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000159339_2610610176.pth [2024-06-13 04:27:42,903][71000] Updated weights for policy 0, policy_version 160064 (0.0042) [2024-06-13 04:27:45,843][71000] Updated weights for policy 0, policy_version 160074 (0.0029) [2024-06-13 04:27:45,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.1, 300 sec: 49318.6). Total num frames: 2622652416. Throughput: 0: 48526.9. Samples: 2151494660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 04:27:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 04:27:49,335][71000] Updated weights for policy 0, policy_version 160084 (0.0032) [2024-06-13 04:27:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2622881792. Throughput: 0: 48852.5. Samples: 2151657320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 04:27:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:27:52,446][71000] Updated weights for policy 0, policy_version 160094 (0.0031) [2024-06-13 04:27:55,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48878.9, 300 sec: 49208.3). Total num frames: 2623127552. Throughput: 0: 48751.9. Samples: 2151955120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 04:27:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:27:56,055][71000] Updated weights for policy 0, policy_version 160104 (0.0040) [2024-06-13 04:27:59,206][71000] Updated weights for policy 0, policy_version 160114 (0.0037) [2024-06-13 04:28:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49154.1, 300 sec: 49207.5). Total num frames: 2623373312. Throughput: 0: 48708.0. Samples: 2152247100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 04:28:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:28:02,842][71000] Updated weights for policy 0, policy_version 160124 (0.0026) [2024-06-13 04:28:05,776][71000] Updated weights for policy 0, policy_version 160134 (0.0024) [2024-06-13 04:28:05,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2623635456. Throughput: 0: 48946.3. Samples: 2152393800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 04:28:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:28:09,209][71000] Updated weights for policy 0, policy_version 160144 (0.0022) [2024-06-13 04:28:10,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2623881216. Throughput: 0: 49027.1. Samples: 2152687440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 04:28:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:28:12,395][71000] Updated weights for policy 0, policy_version 160154 (0.0034) [2024-06-13 04:28:15,852][71000] Updated weights for policy 0, policy_version 160164 (0.0035) [2024-06-13 04:28:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2624126976. Throughput: 0: 49027.9. Samples: 2152985760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 04:28:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:28:19,404][71000] Updated weights for policy 0, policy_version 160174 (0.0035) [2024-06-13 04:28:20,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2624339968. Throughput: 0: 48790.2. Samples: 2153125380. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 04:28:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:28:22,620][71000] Updated weights for policy 0, policy_version 160184 (0.0023) [2024-06-13 04:28:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2624602112. Throughput: 0: 49164.8. Samples: 2153422200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 04:28:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:28:26,006][71000] Updated weights for policy 0, policy_version 160194 (0.0031) [2024-06-13 04:28:29,488][71000] Updated weights for policy 0, policy_version 160204 (0.0032) [2024-06-13 04:28:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2624847872. Throughput: 0: 49374.2. Samples: 2153716500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 04:28:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:28:33,083][71000] Updated weights for policy 0, policy_version 160214 (0.0031) [2024-06-13 04:28:35,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2625110016. Throughput: 0: 49102.3. Samples: 2153866920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 04:28:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:28:35,946][71000] Updated weights for policy 0, policy_version 160224 (0.0023) [2024-06-13 04:28:36,851][70980] Signal inference workers to stop experience collection... (32050 times) [2024-06-13 04:28:36,852][70980] Signal inference workers to resume experience collection... 
(32050 times) [2024-06-13 04:28:36,884][71000] InferenceWorker_p0-w0: stopping experience collection (32050 times) [2024-06-13 04:28:36,884][71000] InferenceWorker_p0-w0: resuming experience collection (32050 times) [2024-06-13 04:28:39,451][71000] Updated weights for policy 0, policy_version 160234 (0.0024) [2024-06-13 04:28:40,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2625323008. Throughput: 0: 48942.2. Samples: 2154157520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 04:28:40,941][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:28:42,577][71000] Updated weights for policy 0, policy_version 160244 (0.0039) [2024-06-13 04:28:45,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 2625568768. Throughput: 0: 48889.9. Samples: 2154447140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 04:28:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:28:46,203][71000] Updated weights for policy 0, policy_version 160254 (0.0033) [2024-06-13 04:28:49,658][71000] Updated weights for policy 0, policy_version 160264 (0.0037) [2024-06-13 04:28:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2625830912. Throughput: 0: 48996.7. Samples: 2154598660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 04:28:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:28:53,038][71000] Updated weights for policy 0, policy_version 160274 (0.0034) [2024-06-13 04:28:55,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2626076672. Throughput: 0: 49100.5. Samples: 2154896960. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 04:28:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:28:55,968][71000] Updated weights for policy 0, policy_version 160284 (0.0028) [2024-06-13 04:28:59,378][71000] Updated weights for policy 0, policy_version 160294 (0.0028) [2024-06-13 04:29:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2626338816. Throughput: 0: 49136.3. Samples: 2155196900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 04:29:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:29:02,870][71000] Updated weights for policy 0, policy_version 160304 (0.0037) [2024-06-13 04:29:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.8, 300 sec: 49207.6). Total num frames: 2626551808. Throughput: 0: 49292.5. Samples: 2155343540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 04:29:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:29:06,196][71000] Updated weights for policy 0, policy_version 160314 (0.0026) [2024-06-13 04:29:09,661][71000] Updated weights for policy 0, policy_version 160324 (0.0027) [2024-06-13 04:29:10,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 2626797568. Throughput: 0: 49137.2. Samples: 2155633380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 04:29:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:29:12,778][71000] Updated weights for policy 0, policy_version 160334 (0.0026) [2024-06-13 04:29:15,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2627059712. Throughput: 0: 49162.2. Samples: 2155928800. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 04:29:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:29:16,003][71000] Updated weights for policy 0, policy_version 160344 (0.0022) [2024-06-13 04:29:19,368][71000] Updated weights for policy 0, policy_version 160354 (0.0031) [2024-06-13 04:29:20,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.1, 300 sec: 49207.6). Total num frames: 2627321856. Throughput: 0: 49262.1. Samples: 2156083720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 04:29:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:29:22,516][71000] Updated weights for policy 0, policy_version 160364 (0.0031) [2024-06-13 04:29:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2627551232. Throughput: 0: 49346.3. Samples: 2156378100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 04:29:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:29:26,018][71000] Updated weights for policy 0, policy_version 160374 (0.0026) [2024-06-13 04:29:29,319][71000] Updated weights for policy 0, policy_version 160384 (0.0027) [2024-06-13 04:29:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49424.9, 300 sec: 49263.1). Total num frames: 2627813376. Throughput: 0: 49539.9. Samples: 2156676440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 04:29:30,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 04:29:32,757][71000] Updated weights for policy 0, policy_version 160394 (0.0025) [2024-06-13 04:29:35,711][71000] Updated weights for policy 0, policy_version 160404 (0.0034) [2024-06-13 04:29:35,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2628059136. Throughput: 0: 49496.9. Samples: 2156826020. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 04:29:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:29:39,249][71000] Updated weights for policy 0, policy_version 160414 (0.0027) [2024-06-13 04:29:40,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 2628321280. Throughput: 0: 49447.0. Samples: 2157122080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 04:29:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:29:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000160420_2628321280.pth... [2024-06-13 04:29:40,997][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000159697_2616475648.pth [2024-06-13 04:29:42,594][71000] Updated weights for policy 0, policy_version 160424 (0.0035) [2024-06-13 04:29:43,452][70980] Signal inference workers to stop experience collection... (32100 times) [2024-06-13 04:29:43,453][70980] Signal inference workers to resume experience collection... (32100 times) [2024-06-13 04:29:43,470][71000] InferenceWorker_p0-w0: stopping experience collection (32100 times) [2024-06-13 04:29:43,470][71000] InferenceWorker_p0-w0: resuming experience collection (32100 times) [2024-06-13 04:29:45,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2628517888. Throughput: 0: 49419.7. Samples: 2157420780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 04:29:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:29:46,153][71000] Updated weights for policy 0, policy_version 160434 (0.0026) [2024-06-13 04:29:49,505][71000] Updated weights for policy 0, policy_version 160444 (0.0031) [2024-06-13 04:29:50,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2628780032. Throughput: 0: 49105.3. Samples: 2157553280. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 04:29:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:29:52,875][71000] Updated weights for policy 0, policy_version 160454 (0.0032) [2024-06-13 04:29:55,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2629025792. Throughput: 0: 49222.2. Samples: 2157848380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 04:29:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:29:56,000][71000] Updated weights for policy 0, policy_version 160464 (0.0035) [2024-06-13 04:29:59,419][71000] Updated weights for policy 0, policy_version 160474 (0.0026) [2024-06-13 04:30:00,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2629287936. Throughput: 0: 49232.5. Samples: 2158144260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 04:30:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:30:02,326][71000] Updated weights for policy 0, policy_version 160484 (0.0025) [2024-06-13 04:30:05,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2629500928. Throughput: 0: 49280.4. Samples: 2158301340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 04:30:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:30:06,152][71000] Updated weights for policy 0, policy_version 160494 (0.0029) [2024-06-13 04:30:09,126][71000] Updated weights for policy 0, policy_version 160504 (0.0036) [2024-06-13 04:30:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.2, 300 sec: 49207.6). Total num frames: 2629763072. Throughput: 0: 49220.9. Samples: 2158593040. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 04:30:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:30:12,771][71000] Updated weights for policy 0, policy_version 160514 (0.0028) [2024-06-13 04:30:15,939][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2630008832. Throughput: 0: 49059.3. Samples: 2158884100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 04:30:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:30:15,965][71000] Updated weights for policy 0, policy_version 160524 (0.0031) [2024-06-13 04:30:19,290][71000] Updated weights for policy 0, policy_version 160534 (0.0029) [2024-06-13 04:30:20,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2630270976. Throughput: 0: 49006.2. Samples: 2159031300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 04:30:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:30:22,711][71000] Updated weights for policy 0, policy_version 160544 (0.0029) [2024-06-13 04:30:25,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2630500352. Throughput: 0: 48980.5. Samples: 2159326200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 04:30:25,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 04:30:26,049][71000] Updated weights for policy 0, policy_version 160554 (0.0023) [2024-06-13 04:30:29,341][71000] Updated weights for policy 0, policy_version 160564 (0.0019) [2024-06-13 04:30:30,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48606.0, 300 sec: 49152.0). Total num frames: 2630729728. Throughput: 0: 48938.6. Samples: 2159623020. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 04:30:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:30:32,507][71000] Updated weights for policy 0, policy_version 160574 (0.0029) [2024-06-13 04:30:35,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2630975488. Throughput: 0: 49194.2. Samples: 2159767020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 04:30:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:30:36,181][71000] Updated weights for policy 0, policy_version 160584 (0.0029) [2024-06-13 04:30:39,459][71000] Updated weights for policy 0, policy_version 160594 (0.0028) [2024-06-13 04:30:40,940][70768] Fps is (10 sec: 52429.1, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2631254016. Throughput: 0: 49080.6. Samples: 2160057000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 04:30:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:30:42,663][71000] Updated weights for policy 0, policy_version 160604 (0.0028) [2024-06-13 04:30:45,854][71000] Updated weights for policy 0, policy_version 160614 (0.0030) [2024-06-13 04:30:45,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2631499776. Throughput: 0: 49289.7. Samples: 2160362300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 04:30:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:30:49,550][71000] Updated weights for policy 0, policy_version 160624 (0.0028) [2024-06-13 04:30:50,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2631712768. Throughput: 0: 48996.1. Samples: 2160506160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 04:30:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:30:52,415][71000] Updated weights for policy 0, policy_version 160634 (0.0027) [2024-06-13 04:30:55,786][70980] Signal inference workers to stop experience collection... (32150 times) [2024-06-13 04:30:55,823][71000] InferenceWorker_p0-w0: stopping experience collection (32150 times) [2024-06-13 04:30:55,847][70980] Signal inference workers to resume experience collection... (32150 times) [2024-06-13 04:30:55,848][71000] InferenceWorker_p0-w0: resuming experience collection (32150 times) [2024-06-13 04:30:55,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2631958528. Throughput: 0: 48964.3. Samples: 2160796440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 04:30:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:30:56,147][71000] Updated weights for policy 0, policy_version 160644 (0.0033) [2024-06-13 04:30:59,348][71000] Updated weights for policy 0, policy_version 160654 (0.0029) [2024-06-13 04:31:00,939][70768] Fps is (10 sec: 50790.8, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2632220672. Throughput: 0: 49044.9. Samples: 2161091120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 04:31:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:31:02,619][71000] Updated weights for policy 0, policy_version 160664 (0.0038) [2024-06-13 04:31:05,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2632466432. Throughput: 0: 49049.0. Samples: 2161238500. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 04:31:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:31:06,054][71000] Updated weights for policy 0, policy_version 160674 (0.0030) [2024-06-13 04:31:09,213][71000] Updated weights for policy 0, policy_version 160684 (0.0023) [2024-06-13 04:31:10,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 2632695808. Throughput: 0: 49037.1. Samples: 2161532880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 04:31:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 04:31:12,555][71000] Updated weights for policy 0, policy_version 160694 (0.0027) [2024-06-13 04:31:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2632957952. Throughput: 0: 49053.3. Samples: 2161830420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 04:31:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:31:16,014][71000] Updated weights for policy 0, policy_version 160704 (0.0035) [2024-06-13 04:31:19,439][71000] Updated weights for policy 0, policy_version 160714 (0.0032) [2024-06-13 04:31:20,940][70768] Fps is (10 sec: 50791.2, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2633203712. Throughput: 0: 49200.1. Samples: 2161981020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 04:31:20,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 04:31:22,601][71000] Updated weights for policy 0, policy_version 160724 (0.0029) [2024-06-13 04:31:25,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2633433088. Throughput: 0: 49219.5. Samples: 2162271880. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 04:31:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:31:26,128][71000] Updated weights for policy 0, policy_version 160734 (0.0023) [2024-06-13 04:31:29,080][71000] Updated weights for policy 0, policy_version 160744 (0.0034) [2024-06-13 04:31:30,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2633678848. Throughput: 0: 48979.6. Samples: 2162566380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 04:31:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:31:32,541][71000] Updated weights for policy 0, policy_version 160754 (0.0021) [2024-06-13 04:31:35,795][71000] Updated weights for policy 0, policy_version 160764 (0.0032) [2024-06-13 04:31:35,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2633957376. Throughput: 0: 48979.6. Samples: 2162710240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 04:31:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:31:39,120][71000] Updated weights for policy 0, policy_version 160774 (0.0034) [2024-06-13 04:31:40,940][70768] Fps is (10 sec: 49150.9, 60 sec: 48605.7, 300 sec: 48985.4). Total num frames: 2634170368. Throughput: 0: 49087.0. Samples: 2163005360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 04:31:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:31:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000160777_2634170368.pth... [2024-06-13 04:31:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000160058_2622390272.pth [2024-06-13 04:31:42,541][71000] Updated weights for policy 0, policy_version 160784 (0.0026) [2024-06-13 04:31:45,785][71000] Updated weights for policy 0, policy_version 160794 (0.0031) [2024-06-13 04:31:45,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 49152.0). 
Total num frames: 2634448896. Throughput: 0: 49382.9. Samples: 2163313360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 04:31:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:31:48,974][71000] Updated weights for policy 0, policy_version 160804 (0.0030) [2024-06-13 04:31:50,940][70768] Fps is (10 sec: 52430.1, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 2634694656. Throughput: 0: 49428.5. Samples: 2163462780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 04:31:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:31:51,923][70980] Signal inference workers to stop experience collection... (32200 times) [2024-06-13 04:31:51,923][70980] Signal inference workers to resume experience collection... (32200 times) [2024-06-13 04:31:51,964][71000] InferenceWorker_p0-w0: stopping experience collection (32200 times) [2024-06-13 04:31:51,964][71000] InferenceWorker_p0-w0: resuming experience collection (32200 times) [2024-06-13 04:31:52,595][71000] Updated weights for policy 0, policy_version 160814 (0.0029) [2024-06-13 04:31:55,751][71000] Updated weights for policy 0, policy_version 160824 (0.0031) [2024-06-13 04:31:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 49208.0). Total num frames: 2634940416. Throughput: 0: 49492.4. Samples: 2163760040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 04:31:55,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 04:31:59,213][71000] Updated weights for policy 0, policy_version 160834 (0.0037) [2024-06-13 04:32:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2635169792. Throughput: 0: 49292.9. Samples: 2164048600. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 04:32:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:32:02,418][71000] Updated weights for policy 0, policy_version 160844 (0.0029) [2024-06-13 04:32:05,939][70768] Fps is (10 sec: 47514.6, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2635415552. Throughput: 0: 49366.3. Samples: 2164202500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 04:32:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:32:06,089][71000] Updated weights for policy 0, policy_version 160854 (0.0028) [2024-06-13 04:32:09,059][71000] Updated weights for policy 0, policy_version 160864 (0.0034) [2024-06-13 04:32:10,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 2635677696. Throughput: 0: 49187.8. Samples: 2164485340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 04:32:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:32:12,793][71000] Updated weights for policy 0, policy_version 160874 (0.0028) [2024-06-13 04:32:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2635907072. Throughput: 0: 49184.0. Samples: 2164779660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 04:32:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:32:16,075][71000] Updated weights for policy 0, policy_version 160884 (0.0034) [2024-06-13 04:32:19,564][71000] Updated weights for policy 0, policy_version 160894 (0.0030) [2024-06-13 04:32:20,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2636152832. Throughput: 0: 49185.4. Samples: 2164923580. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 04:32:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:32:22,593][71000] Updated weights for policy 0, policy_version 160904 (0.0025) [2024-06-13 04:32:25,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2636398592. Throughput: 0: 49327.4. Samples: 2165225080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 04:32:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:32:26,006][71000] Updated weights for policy 0, policy_version 160914 (0.0027) [2024-06-13 04:32:29,384][71000] Updated weights for policy 0, policy_version 160924 (0.0024) [2024-06-13 04:32:30,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 2636677120. Throughput: 0: 49222.3. Samples: 2165528360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 04:32:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:32:32,408][71000] Updated weights for policy 0, policy_version 160934 (0.0018) [2024-06-13 04:32:35,940][70768] Fps is (10 sec: 49150.9, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2636890112. Throughput: 0: 49239.8. Samples: 2165678580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 04:32:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:32:36,113][71000] Updated weights for policy 0, policy_version 160944 (0.0046) [2024-06-13 04:32:39,161][71000] Updated weights for policy 0, policy_version 160954 (0.0026) [2024-06-13 04:32:40,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49698.4, 300 sec: 49152.0). Total num frames: 2637152256. Throughput: 0: 49029.1. Samples: 2165966340. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 04:32:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:32:42,531][71000] Updated weights for policy 0, policy_version 160964 (0.0025) [2024-06-13 04:32:45,626][71000] Updated weights for policy 0, policy_version 160974 (0.0023) [2024-06-13 04:32:45,940][70768] Fps is (10 sec: 52429.7, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 2637414400. Throughput: 0: 49444.0. Samples: 2166273580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 04:32:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:32:49,000][71000] Updated weights for policy 0, policy_version 160984 (0.0033) [2024-06-13 04:32:50,940][70768] Fps is (10 sec: 50789.2, 60 sec: 49424.9, 300 sec: 49263.1). Total num frames: 2637660160. Throughput: 0: 49325.1. Samples: 2166422140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 04:32:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:32:52,522][71000] Updated weights for policy 0, policy_version 160994 (0.0026) [2024-06-13 04:32:55,590][71000] Updated weights for policy 0, policy_version 161004 (0.0030) [2024-06-13 04:32:55,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2637889536. Throughput: 0: 49565.1. Samples: 2166715760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 04:32:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:32:58,948][71000] Updated weights for policy 0, policy_version 161014 (0.0026) [2024-06-13 04:33:00,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 2638135296. Throughput: 0: 49507.4. Samples: 2167007500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 04:33:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:33:02,461][71000] Updated weights for policy 0, policy_version 161024 (0.0033) [2024-06-13 04:33:05,574][71000] Updated weights for policy 0, policy_version 161034 (0.0030) [2024-06-13 04:33:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2638381056. Throughput: 0: 49660.0. Samples: 2167158280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 04:33:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:33:07,228][70980] Signal inference workers to stop experience collection... (32250 times) [2024-06-13 04:33:07,230][70980] Signal inference workers to resume experience collection... (32250 times) [2024-06-13 04:33:07,273][71000] InferenceWorker_p0-w0: stopping experience collection (32250 times) [2024-06-13 04:33:07,273][71000] InferenceWorker_p0-w0: resuming experience collection (32250 times) [2024-06-13 04:33:09,163][71000] Updated weights for policy 0, policy_version 161044 (0.0030) [2024-06-13 04:33:10,939][70768] Fps is (10 sec: 50791.7, 60 sec: 49425.3, 300 sec: 49207.6). Total num frames: 2638643200. Throughput: 0: 49585.4. Samples: 2167456420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 04:33:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:33:12,029][71000] Updated weights for policy 0, policy_version 161054 (0.0022) [2024-06-13 04:33:15,889][71000] Updated weights for policy 0, policy_version 161064 (0.0021) [2024-06-13 04:33:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2638872576. Throughput: 0: 49320.9. Samples: 2167747800. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 04:33:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:33:18,755][71000] Updated weights for policy 0, policy_version 161074 (0.0022) [2024-06-13 04:33:20,940][70768] Fps is (10 sec: 47512.6, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2639118336. Throughput: 0: 49135.6. Samples: 2167889680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 04:33:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:33:22,509][71000] Updated weights for policy 0, policy_version 161084 (0.0027) [2024-06-13 04:33:25,301][71000] Updated weights for policy 0, policy_version 161094 (0.0033) [2024-06-13 04:33:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2639364096. Throughput: 0: 49258.6. Samples: 2168182980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 04:33:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:33:29,148][71000] Updated weights for policy 0, policy_version 161104 (0.0027) [2024-06-13 04:33:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2639626240. Throughput: 0: 49131.0. Samples: 2168484480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 04:33:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:33:32,015][71000] Updated weights for policy 0, policy_version 161114 (0.0022) [2024-06-13 04:33:35,822][71000] Updated weights for policy 0, policy_version 161124 (0.0030) [2024-06-13 04:33:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 2639855616. Throughput: 0: 49461.5. Samples: 2168647900. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 04:33:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:33:38,468][71000] Updated weights for policy 0, policy_version 161134 (0.0030) [2024-06-13 04:33:40,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49151.8, 300 sec: 49263.0). Total num frames: 2640101376. Throughput: 0: 49204.6. Samples: 2168929980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 04:33:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:33:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000161139_2640101376.pth... [2024-06-13 04:33:40,999][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000160420_2628321280.pth [2024-06-13 04:33:42,628][71000] Updated weights for policy 0, policy_version 161144 (0.0033) [2024-06-13 04:33:45,370][71000] Updated weights for policy 0, policy_version 161154 (0.0038) [2024-06-13 04:33:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2640347136. Throughput: 0: 49141.4. Samples: 2169218860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 04:33:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:33:49,381][71000] Updated weights for policy 0, policy_version 161164 (0.0026) [2024-06-13 04:33:50,940][70768] Fps is (10 sec: 52430.1, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2640625664. Throughput: 0: 49305.7. Samples: 2169377040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 04:33:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:33:51,786][71000] Updated weights for policy 0, policy_version 161174 (0.0020) [2024-06-13 04:33:55,867][71000] Updated weights for policy 0, policy_version 161184 (0.0031) [2024-06-13 04:33:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2640838656. Throughput: 0: 49175.3. Samples: 2169669320. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 04:33:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:33:58,906][71000] Updated weights for policy 0, policy_version 161194 (0.0031) [2024-06-13 04:34:00,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2641084416. Throughput: 0: 49046.7. Samples: 2169954900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:34:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:34:02,889][71000] Updated weights for policy 0, policy_version 161204 (0.0034) [2024-06-13 04:34:03,463][70980] Signal inference workers to stop experience collection... (32300 times) [2024-06-13 04:34:03,463][70980] Signal inference workers to resume experience collection... (32300 times) [2024-06-13 04:34:03,473][71000] InferenceWorker_p0-w0: stopping experience collection (32300 times) [2024-06-13 04:34:03,473][71000] InferenceWorker_p0-w0: resuming experience collection (32300 times) [2024-06-13 04:34:05,939][70768] Fps is (10 sec: 45875.9, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2641297408. Throughput: 0: 48922.0. Samples: 2170091160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:34:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:34:06,204][71000] Updated weights for policy 0, policy_version 161214 (0.0025) [2024-06-13 04:34:09,564][71000] Updated weights for policy 0, policy_version 161224 (0.0026) [2024-06-13 04:34:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2641592320. Throughput: 0: 49072.9. Samples: 2170391260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:34:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:34:12,455][71000] Updated weights for policy 0, policy_version 161234 (0.0025) [2024-06-13 04:34:15,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2641788928. Throughput: 0: 48990.9. 
Samples: 2170689060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:34:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:34:16,192][71000] Updated weights for policy 0, policy_version 161244 (0.0026) [2024-06-13 04:34:19,205][71000] Updated weights for policy 0, policy_version 161254 (0.0042) [2024-06-13 04:34:20,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2642051072. Throughput: 0: 48351.6. Samples: 2170823720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:34:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:34:22,853][71000] Updated weights for policy 0, policy_version 161264 (0.0040) [2024-06-13 04:34:25,940][70768] Fps is (10 sec: 49150.8, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 2642280448. Throughput: 0: 48548.1. Samples: 2171114640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:34:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:34:26,295][71000] Updated weights for policy 0, policy_version 161274 (0.0031) [2024-06-13 04:34:29,629][71000] Updated weights for policy 0, policy_version 161284 (0.0038) [2024-06-13 04:34:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2642558976. Throughput: 0: 48837.4. Samples: 2171416540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:34:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:34:32,711][71000] Updated weights for policy 0, policy_version 161294 (0.0031) [2024-06-13 04:34:35,939][70768] Fps is (10 sec: 50791.6, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2642788352. Throughput: 0: 48766.3. Samples: 2171571520. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:34:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:34:36,395][71000] Updated weights for policy 0, policy_version 161304 (0.0026) [2024-06-13 04:34:39,226][71000] Updated weights for policy 0, policy_version 161314 (0.0021) [2024-06-13 04:34:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48879.1, 300 sec: 49207.5). Total num frames: 2643034112. Throughput: 0: 48844.0. Samples: 2171867300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:34:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:34:42,674][71000] Updated weights for policy 0, policy_version 161324 (0.0029) [2024-06-13 04:34:45,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2643279872. Throughput: 0: 48817.6. Samples: 2172151700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:34:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:34:46,174][71000] Updated weights for policy 0, policy_version 161334 (0.0030) [2024-06-13 04:34:49,570][71000] Updated weights for policy 0, policy_version 161344 (0.0024) [2024-06-13 04:34:50,939][70768] Fps is (10 sec: 50791.0, 60 sec: 48605.9, 300 sec: 49207.6). Total num frames: 2643542016. Throughput: 0: 49183.1. Samples: 2172304400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:34:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:34:52,863][71000] Updated weights for policy 0, policy_version 161354 (0.0032) [2024-06-13 04:34:55,939][70768] Fps is (10 sec: 47514.5, 60 sec: 48606.0, 300 sec: 49040.9). Total num frames: 2643755008. Throughput: 0: 48932.5. Samples: 2172593220. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:34:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:34:56,390][71000] Updated weights for policy 0, policy_version 161364 (0.0027) [2024-06-13 04:34:59,681][71000] Updated weights for policy 0, policy_version 161374 (0.0035) [2024-06-13 04:35:00,940][70768] Fps is (10 sec: 45874.0, 60 sec: 48605.7, 300 sec: 49152.0). Total num frames: 2644000768. Throughput: 0: 48944.1. Samples: 2172891560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-13 04:35:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:35:02,611][71000] Updated weights for policy 0, policy_version 161384 (0.0020) [2024-06-13 04:35:05,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 2644262912. Throughput: 0: 49166.9. Samples: 2173036240. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-13 04:35:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:35:06,095][71000] Updated weights for policy 0, policy_version 161394 (0.0032) [2024-06-13 04:35:07,916][70980] Signal inference workers to stop experience collection... (32350 times) [2024-06-13 04:35:07,958][71000] InferenceWorker_p0-w0: stopping experience collection (32350 times) [2024-06-13 04:35:07,965][70980] Signal inference workers to resume experience collection... (32350 times) [2024-06-13 04:35:07,976][71000] InferenceWorker_p0-w0: resuming experience collection (32350 times) [2024-06-13 04:35:09,104][71000] Updated weights for policy 0, policy_version 161404 (0.0029) [2024-06-13 04:35:10,940][70768] Fps is (10 sec: 52429.5, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2644525056. Throughput: 0: 49243.6. Samples: 2173330600. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-13 04:35:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:35:12,807][71000] Updated weights for policy 0, policy_version 161414 (0.0028) [2024-06-13 04:35:15,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2644754432. Throughput: 0: 49217.4. Samples: 2173631320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-13 04:35:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:35:15,976][71000] Updated weights for policy 0, policy_version 161424 (0.0025) [2024-06-13 04:35:19,027][71000] Updated weights for policy 0, policy_version 161434 (0.0020) [2024-06-13 04:35:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2645016576. Throughput: 0: 49170.1. Samples: 2173784180. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-13 04:35:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:35:21,993][71000] Updated weights for policy 0, policy_version 161444 (0.0022) [2024-06-13 04:35:25,571][71000] Updated weights for policy 0, policy_version 161454 (0.0023) [2024-06-13 04:35:25,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.3, 300 sec: 49263.1). Total num frames: 2645262336. Throughput: 0: 49452.1. Samples: 2174092640. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-13 04:35:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:35:28,932][71000] Updated weights for policy 0, policy_version 161464 (0.0031) [2024-06-13 04:35:30,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2645508096. Throughput: 0: 49388.9. Samples: 2174374200. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-13 04:35:30,949][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:35:32,751][71000] Updated weights for policy 0, policy_version 161474 (0.0037) [2024-06-13 04:35:35,819][71000] Updated weights for policy 0, policy_version 161484 (0.0031) [2024-06-13 04:35:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2645753856. Throughput: 0: 49244.9. Samples: 2174520420. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-13 04:35:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:35:39,219][71000] Updated weights for policy 0, policy_version 161494 (0.0026) [2024-06-13 04:35:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2645999616. Throughput: 0: 49386.6. Samples: 2174815620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-13 04:35:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:35:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000161499_2645999616.pth... [2024-06-13 04:35:41,010][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000160777_2634170368.pth [2024-06-13 04:35:42,323][71000] Updated weights for policy 0, policy_version 161504 (0.0025) [2024-06-13 04:35:45,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2646228992. Throughput: 0: 49291.7. Samples: 2175109680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-13 04:35:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:35:45,945][71000] Updated weights for policy 0, policy_version 161514 (0.0025) [2024-06-13 04:35:49,159][71000] Updated weights for policy 0, policy_version 161524 (0.0025) [2024-06-13 04:35:50,940][70768] Fps is (10 sec: 45874.5, 60 sec: 48605.7, 300 sec: 49152.0). Total num frames: 2646458368. Throughput: 0: 49378.6. Samples: 2175258280. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-13 04:35:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:35:52,985][71000] Updated weights for policy 0, policy_version 161534 (0.0028) [2024-06-13 04:35:55,871][71000] Updated weights for policy 0, policy_version 161544 (0.0018) [2024-06-13 04:35:55,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2646736896. Throughput: 0: 49421.9. Samples: 2175554580. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-13 04:35:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:35:59,334][71000] Updated weights for policy 0, policy_version 161554 (0.0028) [2024-06-13 04:36:00,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49698.3, 300 sec: 49207.5). Total num frames: 2646982656. Throughput: 0: 49404.4. Samples: 2175854520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-13 04:36:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:36:02,295][71000] Updated weights for policy 0, policy_version 161564 (0.0028) [2024-06-13 04:36:05,808][71000] Updated weights for policy 0, policy_version 161574 (0.0022) [2024-06-13 04:36:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 2647228416. Throughput: 0: 49213.5. Samples: 2175998780. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-13 04:36:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:36:08,123][70980] Signal inference workers to stop experience collection... (32400 times) [2024-06-13 04:36:08,123][70980] Signal inference workers to resume experience collection... 
(32400 times) [2024-06-13 04:36:08,163][71000] InferenceWorker_p0-w0: stopping experience collection (32400 times) [2024-06-13 04:36:08,163][71000] InferenceWorker_p0-w0: resuming experience collection (32400 times) [2024-06-13 04:36:08,905][71000] Updated weights for policy 0, policy_version 161584 (0.0026) [2024-06-13 04:36:10,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2647457792. Throughput: 0: 48905.6. Samples: 2176293400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 04:36:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:36:12,510][71000] Updated weights for policy 0, policy_version 161594 (0.0021) [2024-06-13 04:36:15,550][71000] Updated weights for policy 0, policy_version 161604 (0.0033) [2024-06-13 04:36:15,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2647736320. Throughput: 0: 49198.3. Samples: 2176588120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 04:36:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:36:19,177][71000] Updated weights for policy 0, policy_version 161614 (0.0032) [2024-06-13 04:36:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2647965696. Throughput: 0: 49186.0. Samples: 2176733800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 04:36:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:36:22,264][71000] Updated weights for policy 0, policy_version 161624 (0.0030) [2024-06-13 04:36:25,607][71000] Updated weights for policy 0, policy_version 161634 (0.0039) [2024-06-13 04:36:25,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2648227840. Throughput: 0: 49283.2. Samples: 2177033360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 04:36:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:36:28,854][71000] Updated weights for policy 0, policy_version 161644 (0.0030) [2024-06-13 04:36:30,939][70768] Fps is (10 sec: 47514.5, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2648440832. Throughput: 0: 49433.5. Samples: 2177334180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 04:36:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:36:32,333][71000] Updated weights for policy 0, policy_version 161654 (0.0021) [2024-06-13 04:36:35,429][71000] Updated weights for policy 0, policy_version 161664 (0.0032) [2024-06-13 04:36:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 2648735744. Throughput: 0: 49487.3. Samples: 2177485200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 04:36:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:36:39,001][71000] Updated weights for policy 0, policy_version 161674 (0.0029) [2024-06-13 04:36:40,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2648948736. Throughput: 0: 49291.3. Samples: 2177772700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 04:36:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:36:42,271][71000] Updated weights for policy 0, policy_version 161684 (0.0038) [2024-06-13 04:36:45,564][71000] Updated weights for policy 0, policy_version 161694 (0.0024) [2024-06-13 04:36:45,940][70768] Fps is (10 sec: 47512.0, 60 sec: 49697.9, 300 sec: 49207.5). Total num frames: 2649210880. Throughput: 0: 49277.9. Samples: 2178072040. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 04:36:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:36:48,930][71000] Updated weights for policy 0, policy_version 161704 (0.0029) [2024-06-13 04:36:50,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 2649423872. Throughput: 0: 49279.5. Samples: 2178216360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 04:36:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:36:52,367][71000] Updated weights for policy 0, policy_version 161714 (0.0024) [2024-06-13 04:36:55,643][71000] Updated weights for policy 0, policy_version 161724 (0.0031) [2024-06-13 04:36:55,940][70768] Fps is (10 sec: 49153.8, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2649702400. Throughput: 0: 49298.4. Samples: 2178511820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 04:36:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:36:59,073][71000] Updated weights for policy 0, policy_version 161734 (0.0034) [2024-06-13 04:37:00,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2649931776. Throughput: 0: 49304.0. Samples: 2178806800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 04:37:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:37:02,038][71000] Updated weights for policy 0, policy_version 161744 (0.0027) [2024-06-13 04:37:05,916][71000] Updated weights for policy 0, policy_version 161754 (0.0035) [2024-06-13 04:37:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2650177536. Throughput: 0: 49301.5. Samples: 2178952360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 04:37:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:37:08,719][71000] Updated weights for policy 0, policy_version 161764 (0.0030) [2024-06-13 04:37:10,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2650406912. Throughput: 0: 48944.9. Samples: 2179235880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 04:37:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:37:12,507][71000] Updated weights for policy 0, policy_version 161774 (0.0032) [2024-06-13 04:37:15,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2650652672. Throughput: 0: 48774.2. Samples: 2179529020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:37:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:37:15,992][71000] Updated weights for policy 0, policy_version 161784 (0.0034) [2024-06-13 04:37:19,118][71000] Updated weights for policy 0, policy_version 161794 (0.0028) [2024-06-13 04:37:20,942][70768] Fps is (10 sec: 50776.7, 60 sec: 49149.9, 300 sec: 49207.1). Total num frames: 2650914816. Throughput: 0: 49028.2. Samples: 2179691600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:37:20,943][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:37:22,436][71000] Updated weights for policy 0, policy_version 161804 (0.0027) [2024-06-13 04:37:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 2651144192. Throughput: 0: 48974.4. Samples: 2179976540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:37:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:37:26,040][71000] Updated weights for policy 0, policy_version 161814 (0.0039) [2024-06-13 04:37:26,714][70980] Signal inference workers to stop experience collection... 
(32450 times) [2024-06-13 04:37:26,718][70980] Signal inference workers to resume experience collection... (32450 times) [2024-06-13 04:37:26,757][71000] InferenceWorker_p0-w0: stopping experience collection (32450 times) [2024-06-13 04:37:26,757][71000] InferenceWorker_p0-w0: resuming experience collection (32450 times) [2024-06-13 04:37:28,995][71000] Updated weights for policy 0, policy_version 161824 (0.0029) [2024-06-13 04:37:30,940][70768] Fps is (10 sec: 45887.5, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2651373568. Throughput: 0: 48620.8. Samples: 2180259960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:37:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:37:32,787][71000] Updated weights for policy 0, policy_version 161834 (0.0034) [2024-06-13 04:37:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48332.7, 300 sec: 49096.4). Total num frames: 2651635712. Throughput: 0: 48675.4. Samples: 2180406760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:37:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:37:36,203][71000] Updated weights for policy 0, policy_version 161844 (0.0023) [2024-06-13 04:37:39,249][71000] Updated weights for policy 0, policy_version 161854 (0.0026) [2024-06-13 04:37:40,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.2, 300 sec: 49096.5). Total num frames: 2651897856. Throughput: 0: 48819.1. Samples: 2180708680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:37:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:37:41,013][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000161860_2651914240.pth... [2024-06-13 04:37:41,055][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000161139_2640101376.pth [2024-06-13 04:37:42,659][71000] Updated weights for policy 0, policy_version 161864 (0.0026) [2024-06-13 04:37:45,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48606.2, 300 sec: 49041.0). 
Total num frames: 2652127232. Throughput: 0: 49002.7. Samples: 2181011920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:37:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:37:46,112][71000] Updated weights for policy 0, policy_version 161874 (0.0026) [2024-06-13 04:37:49,251][71000] Updated weights for policy 0, policy_version 161884 (0.0028) [2024-06-13 04:37:50,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2652372992. Throughput: 0: 48842.7. Samples: 2181150280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:37:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 04:37:52,714][71000] Updated weights for policy 0, policy_version 161894 (0.0024) [2024-06-13 04:37:55,765][71000] Updated weights for policy 0, policy_version 161904 (0.0035) [2024-06-13 04:37:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2652635136. Throughput: 0: 49173.3. Samples: 2181448680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:37:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:37:59,284][71000] Updated weights for policy 0, policy_version 161914 (0.0026) [2024-06-13 04:38:00,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 2652897280. Throughput: 0: 49336.2. Samples: 2181749160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:38:00,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 04:38:02,591][71000] Updated weights for policy 0, policy_version 161924 (0.0024) [2024-06-13 04:38:05,679][71000] Updated weights for policy 0, policy_version 161934 (0.0036) [2024-06-13 04:38:05,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2653126656. Throughput: 0: 49233.5. Samples: 2181906980. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:38:05,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 04:38:09,242][71000] Updated weights for policy 0, policy_version 161944 (0.0030) [2024-06-13 04:38:10,940][70768] Fps is (10 sec: 47514.6, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2653372416. Throughput: 0: 49343.2. Samples: 2182196980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:38:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:38:12,833][71000] Updated weights for policy 0, policy_version 161954 (0.0030) [2024-06-13 04:38:15,746][71000] Updated weights for policy 0, policy_version 161964 (0.0033) [2024-06-13 04:38:15,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2653618176. Throughput: 0: 49411.1. Samples: 2182483460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 04:38:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:38:19,289][71000] Updated weights for policy 0, policy_version 161974 (0.0031) [2024-06-13 04:38:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49427.2, 300 sec: 49207.5). Total num frames: 2653880320. Throughput: 0: 49578.8. Samples: 2182637800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-13 04:38:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:38:22,359][71000] Updated weights for policy 0, policy_version 161984 (0.0027) [2024-06-13 04:38:25,574][71000] Updated weights for policy 0, policy_version 161994 (0.0026) [2024-06-13 04:38:25,859][70980] Signal inference workers to stop experience collection... (32500 times) [2024-06-13 04:38:25,859][70980] Signal inference workers to resume experience collection... 
(32500 times) [2024-06-13 04:38:25,880][71000] InferenceWorker_p0-w0: stopping experience collection (32500 times) [2024-06-13 04:38:25,880][71000] InferenceWorker_p0-w0: resuming experience collection (32500 times) [2024-06-13 04:38:25,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 2654126080. Throughput: 0: 49718.2. Samples: 2182946000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-13 04:38:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:38:29,052][71000] Updated weights for policy 0, policy_version 162004 (0.0030) [2024-06-13 04:38:30,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 2654355456. Throughput: 0: 49622.3. Samples: 2183244920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-13 04:38:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:38:32,167][71000] Updated weights for policy 0, policy_version 162014 (0.0027) [2024-06-13 04:38:35,406][71000] Updated weights for policy 0, policy_version 162024 (0.0033) [2024-06-13 04:38:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2654601216. Throughput: 0: 49718.3. Samples: 2183387600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-13 04:38:35,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 04:38:38,891][71000] Updated weights for policy 0, policy_version 162034 (0.0033) [2024-06-13 04:38:40,940][70768] Fps is (10 sec: 50789.1, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 2654863360. Throughput: 0: 49739.8. Samples: 2183686980. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-13 04:38:40,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 04:38:42,265][71000] Updated weights for policy 0, policy_version 162044 (0.0031) [2024-06-13 04:38:45,513][71000] Updated weights for policy 0, policy_version 162054 (0.0035) [2024-06-13 04:38:45,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49698.0, 300 sec: 49096.4). Total num frames: 2655109120. Throughput: 0: 49569.8. Samples: 2183979800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-13 04:38:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:38:49,197][71000] Updated weights for policy 0, policy_version 162064 (0.0036) [2024-06-13 04:38:50,939][70768] Fps is (10 sec: 49153.2, 60 sec: 49698.2, 300 sec: 49207.6). Total num frames: 2655354880. Throughput: 0: 49419.4. Samples: 2184130840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-13 04:38:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:38:51,821][71000] Updated weights for policy 0, policy_version 162074 (0.0031) [2024-06-13 04:38:55,673][71000] Updated weights for policy 0, policy_version 162084 (0.0030) [2024-06-13 04:38:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 2655600640. Throughput: 0: 49388.3. Samples: 2184419460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-13 04:38:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:38:59,010][71000] Updated weights for policy 0, policy_version 162094 (0.0033) [2024-06-13 04:39:00,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49425.2, 300 sec: 49374.1). Total num frames: 2655862784. Throughput: 0: 49430.6. Samples: 2184707840. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-13 04:39:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:39:02,552][71000] Updated weights for policy 0, policy_version 162104 (0.0026) [2024-06-13 04:39:05,617][71000] Updated weights for policy 0, policy_version 162114 (0.0041) [2024-06-13 04:39:05,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2656092160. Throughput: 0: 49493.9. Samples: 2184865020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-13 04:39:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:39:09,278][71000] Updated weights for policy 0, policy_version 162124 (0.0026) [2024-06-13 04:39:10,939][70768] Fps is (10 sec: 45876.1, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2656321536. Throughput: 0: 49044.1. Samples: 2185152980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-13 04:39:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:39:12,212][71000] Updated weights for policy 0, policy_version 162134 (0.0032) [2024-06-13 04:39:15,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2656550912. Throughput: 0: 48734.6. Samples: 2185437980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-13 04:39:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:39:15,978][71000] Updated weights for policy 0, policy_version 162144 (0.0032) [2024-06-13 04:39:18,943][71000] Updated weights for policy 0, policy_version 162154 (0.0031) [2024-06-13 04:39:20,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2656813056. Throughput: 0: 48788.4. Samples: 2185583080. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-13 04:39:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:39:22,832][71000] Updated weights for policy 0, policy_version 162164 (0.0036) [2024-06-13 04:39:25,457][71000] Updated weights for policy 0, policy_version 162174 (0.0026) [2024-06-13 04:39:25,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2657058816. Throughput: 0: 48613.8. Samples: 2185874600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 04:39:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:39:29,434][71000] Updated weights for policy 0, policy_version 162184 (0.0029) [2024-06-13 04:39:30,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2657304576. Throughput: 0: 48975.2. Samples: 2186183680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 04:39:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:39:32,081][71000] Updated weights for policy 0, policy_version 162194 (0.0024) [2024-06-13 04:39:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2657533952. Throughput: 0: 48746.0. Samples: 2186324420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 04:39:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:39:36,118][71000] Updated weights for policy 0, policy_version 162204 (0.0031) [2024-06-13 04:39:39,124][71000] Updated weights for policy 0, policy_version 162214 (0.0032) [2024-06-13 04:39:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 2657796096. Throughput: 0: 48429.5. Samples: 2186598780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 04:39:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:39:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000162219_2657796096.pth... 
[2024-06-13 04:39:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000161499_2645999616.pth [2024-06-13 04:39:42,807][71000] Updated weights for policy 0, policy_version 162224 (0.0029) [2024-06-13 04:39:44,255][70980] Signal inference workers to stop experience collection... (32550 times) [2024-06-13 04:39:44,255][70980] Signal inference workers to resume experience collection... (32550 times) [2024-06-13 04:39:44,277][71000] InferenceWorker_p0-w0: stopping experience collection (32550 times) [2024-06-13 04:39:44,277][71000] InferenceWorker_p0-w0: resuming experience collection (32550 times) [2024-06-13 04:39:45,730][71000] Updated weights for policy 0, policy_version 162234 (0.0029) [2024-06-13 04:39:45,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2658041856. Throughput: 0: 48696.1. Samples: 2186899160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 04:39:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:39:49,570][71000] Updated weights for policy 0, policy_version 162244 (0.0026) [2024-06-13 04:39:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.8, 300 sec: 49263.0). Total num frames: 2658287616. Throughput: 0: 48454.5. Samples: 2187045480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 04:39:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:39:52,566][71000] Updated weights for policy 0, policy_version 162254 (0.0027) [2024-06-13 04:39:55,937][71000] Updated weights for policy 0, policy_version 162264 (0.0024) [2024-06-13 04:39:55,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2658533376. Throughput: 0: 48632.2. Samples: 2187341440. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 04:39:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:39:59,336][71000] Updated weights for policy 0, policy_version 162274 (0.0031) [2024-06-13 04:40:00,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48332.8, 300 sec: 49152.0). Total num frames: 2658762752. Throughput: 0: 48805.3. Samples: 2187634220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 04:40:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:40:02,639][71000] Updated weights for policy 0, policy_version 162284 (0.0035) [2024-06-13 04:40:05,815][71000] Updated weights for policy 0, policy_version 162294 (0.0027) [2024-06-13 04:40:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2659024896. Throughput: 0: 48918.5. Samples: 2187784420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 04:40:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:40:09,379][71000] Updated weights for policy 0, policy_version 162304 (0.0035) [2024-06-13 04:40:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2659254272. Throughput: 0: 48899.2. Samples: 2188075060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 04:40:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:40:12,587][71000] Updated weights for policy 0, policy_version 162314 (0.0024) [2024-06-13 04:40:15,943][70768] Fps is (10 sec: 45861.5, 60 sec: 48876.4, 300 sec: 49040.4). Total num frames: 2659483648. Throughput: 0: 48733.9. Samples: 2188376860. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 04:40:15,943][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:40:16,134][71000] Updated weights for policy 0, policy_version 162324 (0.0028) [2024-06-13 04:40:18,991][71000] Updated weights for policy 0, policy_version 162334 (0.0022) [2024-06-13 04:40:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 2659729408. Throughput: 0: 48915.6. Samples: 2188525620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 04:40:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:40:22,850][71000] Updated weights for policy 0, policy_version 162344 (0.0024) [2024-06-13 04:40:25,909][71000] Updated weights for policy 0, policy_version 162354 (0.0027) [2024-06-13 04:40:25,940][70768] Fps is (10 sec: 52445.4, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2660007936. Throughput: 0: 49263.6. Samples: 2188815640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 04:40:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:40:29,554][71000] Updated weights for policy 0, policy_version 162364 (0.0024) [2024-06-13 04:40:30,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 2660237312. Throughput: 0: 49174.1. Samples: 2189112000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 04:40:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:40:32,534][71000] Updated weights for policy 0, policy_version 162374 (0.0037) [2024-06-13 04:40:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2660483072. Throughput: 0: 49174.4. Samples: 2189258320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 04:40:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:40:36,064][71000] Updated weights for policy 0, policy_version 162384 (0.0026) [2024-06-13 04:40:39,097][71000] Updated weights for policy 0, policy_version 162394 (0.0025) [2024-06-13 04:40:40,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2660728832. Throughput: 0: 49193.9. Samples: 2189555160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 04:40:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:40:42,726][71000] Updated weights for policy 0, policy_version 162404 (0.0027) [2024-06-13 04:40:45,632][71000] Updated weights for policy 0, policy_version 162414 (0.0031) [2024-06-13 04:40:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2660990976. Throughput: 0: 49356.9. Samples: 2189855280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 04:40:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:40:49,494][71000] Updated weights for policy 0, policy_version 162424 (0.0030) [2024-06-13 04:40:50,942][70768] Fps is (10 sec: 50776.3, 60 sec: 49149.8, 300 sec: 49151.5). Total num frames: 2661236736. Throughput: 0: 49289.6. Samples: 2190002580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 04:40:50,943][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:40:52,359][71000] Updated weights for policy 0, policy_version 162434 (0.0022) [2024-06-13 04:40:55,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2661466112. Throughput: 0: 49554.2. Samples: 2190305000. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 04:40:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:40:55,953][71000] Updated weights for policy 0, policy_version 162444 (0.0024) [2024-06-13 04:40:56,482][70980] Signal inference workers to stop experience collection... (32600 times) [2024-06-13 04:40:56,483][70980] Signal inference workers to resume experience collection... (32600 times) [2024-06-13 04:40:56,522][71000] InferenceWorker_p0-w0: stopping experience collection (32600 times) [2024-06-13 04:40:56,522][71000] InferenceWorker_p0-w0: resuming experience collection (32600 times) [2024-06-13 04:40:59,027][71000] Updated weights for policy 0, policy_version 162454 (0.0032) [2024-06-13 04:41:00,940][70768] Fps is (10 sec: 49165.2, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2661728256. Throughput: 0: 49486.0. Samples: 2190603580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 04:41:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 04:41:02,615][71000] Updated weights for policy 0, policy_version 162464 (0.0022) [2024-06-13 04:41:05,617][71000] Updated weights for policy 0, policy_version 162474 (0.0030) [2024-06-13 04:41:05,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2661974016. Throughput: 0: 49392.9. Samples: 2190748300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 04:41:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:41:09,181][71000] Updated weights for policy 0, policy_version 162484 (0.0035) [2024-06-13 04:41:10,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 2662219776. Throughput: 0: 49377.0. Samples: 2191037600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 04:41:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:41:12,032][71000] Updated weights for policy 0, policy_version 162494 (0.0030) [2024-06-13 04:41:15,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49427.7, 300 sec: 49096.5). Total num frames: 2662449152. Throughput: 0: 49317.1. Samples: 2191331260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 04:41:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:41:15,958][71000] Updated weights for policy 0, policy_version 162504 (0.0032) [2024-06-13 04:41:19,250][71000] Updated weights for policy 0, policy_version 162514 (0.0036) [2024-06-13 04:41:20,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2662678528. Throughput: 0: 49224.5. Samples: 2191473420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 04:41:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:41:22,401][71000] Updated weights for policy 0, policy_version 162524 (0.0021) [2024-06-13 04:41:25,931][71000] Updated weights for policy 0, policy_version 162534 (0.0025) [2024-06-13 04:41:25,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2662957056. Throughput: 0: 49289.4. Samples: 2191773180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 04:41:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:41:29,046][71000] Updated weights for policy 0, policy_version 162544 (0.0029) [2024-06-13 04:41:30,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.2, 300 sec: 49040.9). Total num frames: 2663202816. Throughput: 0: 49171.6. Samples: 2192068000. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 04:41:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:41:32,513][71000] Updated weights for policy 0, policy_version 162554 (0.0031) [2024-06-13 04:41:35,774][71000] Updated weights for policy 0, policy_version 162564 (0.0022) [2024-06-13 04:41:35,944][70768] Fps is (10 sec: 49130.7, 60 sec: 49421.5, 300 sec: 49151.3). Total num frames: 2663448576. Throughput: 0: 49220.5. Samples: 2192217580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-13 04:41:35,944][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:41:39,146][71000] Updated weights for policy 0, policy_version 162574 (0.0027) [2024-06-13 04:41:40,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2663694336. Throughput: 0: 49051.7. Samples: 2192512320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-13 04:41:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:41:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000162579_2663694336.pth... [2024-06-13 04:41:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000161860_2651914240.pth [2024-06-13 04:41:42,212][71000] Updated weights for policy 0, policy_version 162584 (0.0031) [2024-06-13 04:41:45,940][70768] Fps is (10 sec: 47534.2, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2663923712. Throughput: 0: 48733.5. Samples: 2192796580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-13 04:41:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:41:46,109][71000] Updated weights for policy 0, policy_version 162594 (0.0039) [2024-06-13 04:41:49,074][71000] Updated weights for policy 0, policy_version 162604 (0.0027) [2024-06-13 04:41:50,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48881.2, 300 sec: 49040.9). Total num frames: 2664169472. Throughput: 0: 48855.1. Samples: 2192946780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-13 04:41:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:41:52,906][71000] Updated weights for policy 0, policy_version 162614 (0.0027) [2024-06-13 04:41:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2664415232. Throughput: 0: 48867.5. Samples: 2193236640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-13 04:41:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:41:55,993][71000] Updated weights for policy 0, policy_version 162624 (0.0029) [2024-06-13 04:41:59,243][71000] Updated weights for policy 0, policy_version 162634 (0.0026) [2024-06-13 04:42:00,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2664677376. Throughput: 0: 49219.9. Samples: 2193546160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-13 04:42:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:42:02,315][71000] Updated weights for policy 0, policy_version 162644 (0.0026) [2024-06-13 04:42:05,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2664906752. Throughput: 0: 49312.7. Samples: 2193692500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-13 04:42:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:42:06,071][71000] Updated weights for policy 0, policy_version 162654 (0.0030) [2024-06-13 04:42:08,814][71000] Updated weights for policy 0, policy_version 162664 (0.0036) [2024-06-13 04:42:10,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.8, 300 sec: 49207.5). Total num frames: 2665168896. Throughput: 0: 49019.8. Samples: 2193979080. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-13 04:42:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:42:12,845][71000] Updated weights for policy 0, policy_version 162674 (0.0024) [2024-06-13 04:42:15,461][71000] Updated weights for policy 0, policy_version 162684 (0.0030) [2024-06-13 04:42:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49424.9, 300 sec: 49152.4). Total num frames: 2665414656. Throughput: 0: 49109.6. Samples: 2194277940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-13 04:42:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:42:16,501][70980] Signal inference workers to stop experience collection... (32650 times) [2024-06-13 04:42:16,525][71000] InferenceWorker_p0-w0: stopping experience collection (32650 times) [2024-06-13 04:42:16,560][70980] Signal inference workers to resume experience collection... (32650 times) [2024-06-13 04:42:16,560][71000] InferenceWorker_p0-w0: resuming experience collection (32650 times) [2024-06-13 04:42:19,365][71000] Updated weights for policy 0, policy_version 162694 (0.0026) [2024-06-13 04:42:20,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2665660416. Throughput: 0: 49282.9. Samples: 2194435100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-13 04:42:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 04:42:22,367][71000] Updated weights for policy 0, policy_version 162704 (0.0028) [2024-06-13 04:42:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2665889792. Throughput: 0: 49386.4. Samples: 2194734720. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-13 04:42:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:42:26,114][71000] Updated weights for policy 0, policy_version 162714 (0.0035) [2024-06-13 04:42:28,868][71000] Updated weights for policy 0, policy_version 162724 (0.0025) [2024-06-13 04:42:30,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2666151936. Throughput: 0: 49612.5. Samples: 2195029140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-13 04:42:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:42:32,526][71000] Updated weights for policy 0, policy_version 162734 (0.0031) [2024-06-13 04:42:35,453][71000] Updated weights for policy 0, policy_version 162744 (0.0030) [2024-06-13 04:42:35,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49155.5, 300 sec: 49152.0). Total num frames: 2666397696. Throughput: 0: 49570.2. Samples: 2195177440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-13 04:42:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:42:39,515][71000] Updated weights for policy 0, policy_version 162754 (0.0033) [2024-06-13 04:42:40,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2666659840. Throughput: 0: 49965.3. Samples: 2195485080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 04:42:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:42:42,164][71000] Updated weights for policy 0, policy_version 162764 (0.0024) [2024-06-13 04:42:45,929][71000] Updated weights for policy 0, policy_version 162774 (0.0032) [2024-06-13 04:42:45,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2666889216. Throughput: 0: 49377.3. Samples: 2195768140. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 04:42:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:42:48,987][71000] Updated weights for policy 0, policy_version 162784 (0.0022) [2024-06-13 04:42:50,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 2667134976. Throughput: 0: 49267.5. Samples: 2195909540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 04:42:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:42:52,727][71000] Updated weights for policy 0, policy_version 162794 (0.0025) [2024-06-13 04:42:55,601][71000] Updated weights for policy 0, policy_version 162804 (0.0027) [2024-06-13 04:42:55,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2667380736. Throughput: 0: 49487.2. Samples: 2196206000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 04:42:55,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 04:42:59,131][71000] Updated weights for policy 0, policy_version 162814 (0.0022) [2024-06-13 04:43:00,939][70768] Fps is (10 sec: 54068.1, 60 sec: 49971.3, 300 sec: 49318.6). Total num frames: 2667675648. Throughput: 0: 49658.0. Samples: 2196512540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 04:43:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:43:02,173][71000] Updated weights for policy 0, policy_version 162824 (0.0025) [2024-06-13 04:43:05,782][71000] Updated weights for policy 0, policy_version 162834 (0.0025) [2024-06-13 04:43:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2667872256. Throughput: 0: 49461.4. Samples: 2196660860. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 04:43:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:43:08,671][71000] Updated weights for policy 0, policy_version 162844 (0.0035) [2024-06-13 04:43:10,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49425.2, 300 sec: 49207.5). Total num frames: 2668134400. Throughput: 0: 49337.0. Samples: 2196954880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 04:43:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:43:12,254][71000] Updated weights for policy 0, policy_version 162854 (0.0037) [2024-06-13 04:43:15,341][71000] Updated weights for policy 0, policy_version 162864 (0.0040) [2024-06-13 04:43:15,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2668380160. Throughput: 0: 49343.1. Samples: 2197249580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 04:43:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:43:19,057][71000] Updated weights for policy 0, policy_version 162874 (0.0026) [2024-06-13 04:43:20,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2668625920. Throughput: 0: 49291.1. Samples: 2197395540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 04:43:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:43:21,965][71000] Updated weights for policy 0, policy_version 162884 (0.0033) [2024-06-13 04:43:25,823][71000] Updated weights for policy 0, policy_version 162894 (0.0042) [2024-06-13 04:43:25,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2668855296. Throughput: 0: 49028.1. Samples: 2197691340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 04:43:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:43:26,448][70980] Signal inference workers to stop experience collection... 
(32700 times) [2024-06-13 04:43:26,448][70980] Signal inference workers to resume experience collection... (32700 times) [2024-06-13 04:43:26,470][71000] InferenceWorker_p0-w0: stopping experience collection (32700 times) [2024-06-13 04:43:26,470][71000] InferenceWorker_p0-w0: resuming experience collection (32700 times) [2024-06-13 04:43:28,723][71000] Updated weights for policy 0, policy_version 162904 (0.0026) [2024-06-13 04:43:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2669101056. Throughput: 0: 49281.9. Samples: 2197985820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 04:43:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:43:32,359][71000] Updated weights for policy 0, policy_version 162914 (0.0037) [2024-06-13 04:43:35,479][71000] Updated weights for policy 0, policy_version 162924 (0.0027) [2024-06-13 04:43:35,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2669363200. Throughput: 0: 49331.7. Samples: 2198129460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 04:43:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:43:38,988][71000] Updated weights for policy 0, policy_version 162934 (0.0026) [2024-06-13 04:43:40,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2669608960. Throughput: 0: 49277.4. Samples: 2198423480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:43:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:43:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000162940_2669608960.pth... [2024-06-13 04:43:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000162219_2657796096.pth [2024-06-13 04:43:42,367][71000] Updated weights for policy 0, policy_version 162944 (0.0031) [2024-06-13 04:43:45,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48879.0, 300 sec: 49040.9). 
Total num frames: 2669821952. Throughput: 0: 48917.3. Samples: 2198713820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:43:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:43:45,948][71000] Updated weights for policy 0, policy_version 162954 (0.0026) [2024-06-13 04:43:49,129][71000] Updated weights for policy 0, policy_version 162964 (0.0026) [2024-06-13 04:43:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2670084096. Throughput: 0: 48895.6. Samples: 2198861160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:43:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 04:43:52,734][71000] Updated weights for policy 0, policy_version 162974 (0.0030) [2024-06-13 04:43:55,751][71000] Updated weights for policy 0, policy_version 162984 (0.0031) [2024-06-13 04:43:55,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2670329856. Throughput: 0: 48828.0. Samples: 2199152140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:43:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:43:59,293][71000] Updated weights for policy 0, policy_version 162994 (0.0035) [2024-06-13 04:44:00,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 2670592000. Throughput: 0: 48958.6. Samples: 2199452720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:44:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:44:02,115][71000] Updated weights for policy 0, policy_version 163004 (0.0027) [2024-06-13 04:44:05,788][71000] Updated weights for policy 0, policy_version 163014 (0.0026) [2024-06-13 04:44:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2670821376. Throughput: 0: 49008.9. Samples: 2199600940. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:44:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:44:09,113][71000] Updated weights for policy 0, policy_version 163024 (0.0026) [2024-06-13 04:44:10,941][70768] Fps is (10 sec: 47507.5, 60 sec: 48877.8, 300 sec: 49207.3). Total num frames: 2671067136. Throughput: 0: 48968.6. Samples: 2199895000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:44:10,942][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:44:12,355][71000] Updated weights for policy 0, policy_version 163034 (0.0024) [2024-06-13 04:44:15,710][71000] Updated weights for policy 0, policy_version 163044 (0.0028) [2024-06-13 04:44:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2671312896. Throughput: 0: 48997.8. Samples: 2200190720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:44:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:44:19,033][71000] Updated weights for policy 0, policy_version 163054 (0.0030) [2024-06-13 04:44:20,940][70768] Fps is (10 sec: 49158.6, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2671558656. Throughput: 0: 49123.5. Samples: 2200340020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:44:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:44:22,262][71000] Updated weights for policy 0, policy_version 163064 (0.0031) [2024-06-13 04:44:25,663][71000] Updated weights for policy 0, policy_version 163074 (0.0024) [2024-06-13 04:44:25,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2671820800. Throughput: 0: 49173.7. Samples: 2200636300. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:44:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:44:29,025][71000] Updated weights for policy 0, policy_version 163084 (0.0035) [2024-06-13 04:44:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2672050176. Throughput: 0: 49124.8. Samples: 2200924440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:44:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:44:32,271][71000] Updated weights for policy 0, policy_version 163094 (0.0039) [2024-06-13 04:44:35,842][71000] Updated weights for policy 0, policy_version 163104 (0.0026) [2024-06-13 04:44:35,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2672295936. Throughput: 0: 49063.1. Samples: 2201069000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:44:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:44:38,791][71000] Updated weights for policy 0, policy_version 163114 (0.0025) [2024-06-13 04:44:40,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2672541696. Throughput: 0: 49261.8. Samples: 2201368920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 04:44:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:44:42,297][71000] Updated weights for policy 0, policy_version 163124 (0.0034) [2024-06-13 04:44:45,483][71000] Updated weights for policy 0, policy_version 163134 (0.0030) [2024-06-13 04:44:45,560][70980] Signal inference workers to stop experience collection... (32750 times) [2024-06-13 04:44:45,588][71000] InferenceWorker_p0-w0: stopping experience collection (32750 times) [2024-06-13 04:44:45,666][70980] Signal inference workers to resume experience collection... 
(32750 times) [2024-06-13 04:44:45,666][71000] InferenceWorker_p0-w0: resuming experience collection (32750 times) [2024-06-13 04:44:45,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 2672820224. Throughput: 0: 49509.9. Samples: 2201680660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 04:44:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:44:48,891][71000] Updated weights for policy 0, policy_version 163144 (0.0032) [2024-06-13 04:44:50,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.1, 300 sec: 49207.6). Total num frames: 2673049600. Throughput: 0: 49544.1. Samples: 2201830420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 04:44:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:44:51,911][71000] Updated weights for policy 0, policy_version 163154 (0.0028) [2024-06-13 04:44:55,685][71000] Updated weights for policy 0, policy_version 163164 (0.0029) [2024-06-13 04:44:55,939][70768] Fps is (10 sec: 45875.5, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2673278976. Throughput: 0: 49489.1. Samples: 2202121940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 04:44:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:44:58,611][71000] Updated weights for policy 0, policy_version 163174 (0.0027) [2024-06-13 04:45:00,939][70768] Fps is (10 sec: 47513.3, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2673524736. Throughput: 0: 49362.7. Samples: 2202412040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 04:45:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:45:02,436][71000] Updated weights for policy 0, policy_version 163184 (0.0025) [2024-06-13 04:45:05,194][71000] Updated weights for policy 0, policy_version 163194 (0.0034) [2024-06-13 04:45:05,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 2673819648. Throughput: 0: 49328.1. Samples: 2202559780. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 04:45:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:45:08,763][71000] Updated weights for policy 0, policy_version 163204 (0.0023) [2024-06-13 04:45:10,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49426.2, 300 sec: 49319.1). Total num frames: 2674032640. Throughput: 0: 49606.7. Samples: 2202868600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 04:45:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:45:11,723][71000] Updated weights for policy 0, policy_version 163214 (0.0036) [2024-06-13 04:45:15,692][71000] Updated weights for policy 0, policy_version 163224 (0.0039) [2024-06-13 04:45:15,940][70768] Fps is (10 sec: 44236.7, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2674262016. Throughput: 0: 49602.4. Samples: 2203156540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 04:45:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:45:18,441][71000] Updated weights for policy 0, policy_version 163234 (0.0027) [2024-06-13 04:45:20,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2674524160. Throughput: 0: 49654.2. Samples: 2203303440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 04:45:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:45:22,209][71000] Updated weights for policy 0, policy_version 163244 (0.0035) [2024-06-13 04:45:25,074][71000] Updated weights for policy 0, policy_version 163254 (0.0028) [2024-06-13 04:45:25,939][70768] Fps is (10 sec: 54067.3, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 2674802688. Throughput: 0: 49653.8. Samples: 2203603340. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 04:45:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:45:28,910][71000] Updated weights for policy 0, policy_version 163264 (0.0024) [2024-06-13 04:45:30,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49425.0, 300 sec: 49263.0). Total num frames: 2675015680. Throughput: 0: 49240.2. Samples: 2203896480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 04:45:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:45:31,648][71000] Updated weights for policy 0, policy_version 163274 (0.0034) [2024-06-13 04:45:34,466][70980] Signal inference workers to stop experience collection... (32800 times) [2024-06-13 04:45:34,467][70980] Signal inference workers to resume experience collection... (32800 times) [2024-06-13 04:45:34,508][71000] InferenceWorker_p0-w0: stopping experience collection (32800 times) [2024-06-13 04:45:34,508][71000] InferenceWorker_p0-w0: resuming experience collection (32800 times) [2024-06-13 04:45:35,206][71000] Updated weights for policy 0, policy_version 163284 (0.0027) [2024-06-13 04:45:35,940][70768] Fps is (10 sec: 44236.6, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2675245056. Throughput: 0: 49206.1. Samples: 2204044700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 04:45:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:45:38,123][71000] Updated weights for policy 0, policy_version 163294 (0.0026) [2024-06-13 04:45:40,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2675490816. Throughput: 0: 49202.1. Samples: 2204336040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 04:45:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:45:41,073][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000163301_2675523584.pth... 
[2024-06-13 04:45:41,123][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000162579_2663694336.pth [2024-06-13 04:45:42,098][71000] Updated weights for policy 0, policy_version 163304 (0.0031) [2024-06-13 04:45:44,721][71000] Updated weights for policy 0, policy_version 163314 (0.0023) [2024-06-13 04:45:45,940][70768] Fps is (10 sec: 54067.3, 60 sec: 49425.1, 300 sec: 49319.1). Total num frames: 2675785728. Throughput: 0: 49258.2. Samples: 2204628660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 04:45:45,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 04:45:49,098][71000] Updated weights for policy 0, policy_version 163324 (0.0031) [2024-06-13 04:45:50,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2675998720. Throughput: 0: 49372.5. Samples: 2204781540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:45:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:45:51,465][71000] Updated weights for policy 0, policy_version 163334 (0.0022) [2024-06-13 04:45:55,530][71000] Updated weights for policy 0, policy_version 163344 (0.0028) [2024-06-13 04:45:55,940][70768] Fps is (10 sec: 44236.2, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2676228096. Throughput: 0: 49082.1. Samples: 2205077300. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:45:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:45:57,910][71000] Updated weights for policy 0, policy_version 163354 (0.0031) [2024-06-13 04:46:00,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2676473856. Throughput: 0: 49167.1. Samples: 2205369060. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:46:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:46:01,895][71000] Updated weights for policy 0, policy_version 163364 (0.0029) [2024-06-13 04:46:04,827][71000] Updated weights for policy 0, policy_version 163374 (0.0024) [2024-06-13 04:46:05,939][70768] Fps is (10 sec: 54068.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2676768768. Throughput: 0: 49380.5. Samples: 2205525560. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:46:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:46:08,780][71000] Updated weights for policy 0, policy_version 163384 (0.0029) [2024-06-13 04:46:10,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 2677014528. Throughput: 0: 49184.8. Samples: 2205816660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:46:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:46:11,322][71000] Updated weights for policy 0, policy_version 163394 (0.0025) [2024-06-13 04:46:15,572][71000] Updated weights for policy 0, policy_version 163404 (0.0028) [2024-06-13 04:46:15,940][70768] Fps is (10 sec: 44236.3, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2677211136. Throughput: 0: 49167.8. Samples: 2206109020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:46:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:46:17,038][70980] Signal inference workers to stop experience collection... (32850 times) [2024-06-13 04:46:17,073][71000] InferenceWorker_p0-w0: stopping experience collection (32850 times) [2024-06-13 04:46:17,095][70980] Signal inference workers to resume experience collection... 
(32850 times) [2024-06-13 04:46:17,096][71000] InferenceWorker_p0-w0: resuming experience collection (32850 times) [2024-06-13 04:46:18,274][71000] Updated weights for policy 0, policy_version 163414 (0.0031) [2024-06-13 04:46:20,940][70768] Fps is (10 sec: 44236.9, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2677456896. Throughput: 0: 48938.2. Samples: 2206246920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:46:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:46:22,017][71000] Updated weights for policy 0, policy_version 163424 (0.0021) [2024-06-13 04:46:24,823][71000] Updated weights for policy 0, policy_version 163434 (0.0025) [2024-06-13 04:46:25,940][70768] Fps is (10 sec: 54067.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2677751808. Throughput: 0: 49137.4. Samples: 2206547220. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:46:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:46:28,698][71000] Updated weights for policy 0, policy_version 163444 (0.0023) [2024-06-13 04:46:30,940][70768] Fps is (10 sec: 54066.4, 60 sec: 49698.2, 300 sec: 49319.3). Total num frames: 2677997568. Throughput: 0: 49326.0. Samples: 2206848340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:46:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:46:31,388][71000] Updated weights for policy 0, policy_version 163454 (0.0030) [2024-06-13 04:46:35,571][71000] Updated weights for policy 0, policy_version 163464 (0.0024) [2024-06-13 04:46:35,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2678210560. Throughput: 0: 48950.1. Samples: 2206984300. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:46:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:46:38,100][71000] Updated weights for policy 0, policy_version 163474 (0.0025) [2024-06-13 04:46:40,940][70768] Fps is (10 sec: 44237.5, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2678439936. Throughput: 0: 49006.8. Samples: 2207282600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:46:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:46:42,287][71000] Updated weights for policy 0, policy_version 163484 (0.0031) [2024-06-13 04:46:44,880][71000] Updated weights for policy 0, policy_version 163494 (0.0036) [2024-06-13 04:46:45,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49151.8, 300 sec: 49374.1). Total num frames: 2678734848. Throughput: 0: 49051.4. Samples: 2207576380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:46:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:46:48,954][71000] Updated weights for policy 0, policy_version 163504 (0.0027) [2024-06-13 04:46:50,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2678964224. Throughput: 0: 48979.9. Samples: 2207729660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:46:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:46:51,539][71000] Updated weights for policy 0, policy_version 163514 (0.0032) [2024-06-13 04:46:55,458][71000] Updated weights for policy 0, policy_version 163524 (0.0034) [2024-06-13 04:46:55,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49425.2, 300 sec: 49207.5). Total num frames: 2679193600. Throughput: 0: 49149.8. Samples: 2208028400. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:46:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:46:58,395][71000] Updated weights for policy 0, policy_version 163534 (0.0035) [2024-06-13 04:47:00,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2679422976. Throughput: 0: 49035.0. Samples: 2208315600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:47:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:47:02,371][71000] Updated weights for policy 0, policy_version 163544 (0.0032) [2024-06-13 04:47:04,934][71000] Updated weights for policy 0, policy_version 163554 (0.0038) [2024-06-13 04:47:05,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.8, 300 sec: 49263.1). Total num frames: 2679701504. Throughput: 0: 49220.3. Samples: 2208461840. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:47:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:47:09,018][71000] Updated weights for policy 0, policy_version 163564 (0.0037) [2024-06-13 04:47:10,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48605.9, 300 sec: 49207.6). Total num frames: 2679930880. Throughput: 0: 49026.2. Samples: 2208753400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:47:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:47:11,873][71000] Updated weights for policy 0, policy_version 163574 (0.0033) [2024-06-13 04:47:14,863][70980] Signal inference workers to stop experience collection... (32900 times) [2024-06-13 04:47:14,864][70980] Signal inference workers to resume experience collection... 
(32900 times) [2024-06-13 04:47:14,879][71000] InferenceWorker_p0-w0: stopping experience collection (32900 times) [2024-06-13 04:47:14,879][71000] InferenceWorker_p0-w0: resuming experience collection (32900 times) [2024-06-13 04:47:15,560][71000] Updated weights for policy 0, policy_version 163584 (0.0025) [2024-06-13 04:47:15,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2680160256. Throughput: 0: 48983.8. Samples: 2209052600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:47:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:47:18,578][71000] Updated weights for policy 0, policy_version 163594 (0.0030) [2024-06-13 04:47:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2680406016. Throughput: 0: 49192.1. Samples: 2209197940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:47:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:47:21,985][71000] Updated weights for policy 0, policy_version 163604 (0.0024) [2024-06-13 04:47:25,206][71000] Updated weights for policy 0, policy_version 163614 (0.0027) [2024-06-13 04:47:25,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2680684544. Throughput: 0: 49196.5. Samples: 2209496440. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:47:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:47:28,821][71000] Updated weights for policy 0, policy_version 163624 (0.0026) [2024-06-13 04:47:30,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48606.0, 300 sec: 49207.5). Total num frames: 2680913920. Throughput: 0: 49243.3. Samples: 2209792320. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:47:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:47:31,741][71000] Updated weights for policy 0, policy_version 163634 (0.0025) [2024-06-13 04:47:35,523][71000] Updated weights for policy 0, policy_version 163644 (0.0031) [2024-06-13 04:47:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2681159680. Throughput: 0: 49151.1. Samples: 2209941460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:47:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:47:38,636][71000] Updated weights for policy 0, policy_version 163654 (0.0035) [2024-06-13 04:47:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2681405440. Throughput: 0: 48971.1. Samples: 2210232100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:47:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:47:41,014][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000163661_2681421824.pth... [2024-06-13 04:47:41,053][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000162940_2669608960.pth [2024-06-13 04:47:41,930][71000] Updated weights for policy 0, policy_version 163664 (0.0026) [2024-06-13 04:47:45,200][71000] Updated weights for policy 0, policy_version 163674 (0.0027) [2024-06-13 04:47:45,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.9, 300 sec: 49207.5). Total num frames: 2681651200. Throughput: 0: 49240.9. Samples: 2210531440. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:47:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:47:48,405][71000] Updated weights for policy 0, policy_version 163684 (0.0026) [2024-06-13 04:47:50,939][70768] Fps is (10 sec: 49152.7, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 2681896960. Throughput: 0: 49464.7. Samples: 2210687740. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:47:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:47:51,495][71000] Updated weights for policy 0, policy_version 163694 (0.0031) [2024-06-13 04:47:54,957][71000] Updated weights for policy 0, policy_version 163704 (0.0028) [2024-06-13 04:47:55,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 2682175488. Throughput: 0: 49588.0. Samples: 2210984860. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-13 04:47:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:47:58,279][71000] Updated weights for policy 0, policy_version 163714 (0.0039) [2024-06-13 04:48:00,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49971.3, 300 sec: 49318.6). Total num frames: 2682421248. Throughput: 0: 49605.3. Samples: 2211284840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-13 04:48:00,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 04:48:01,470][71000] Updated weights for policy 0, policy_version 163724 (0.0023) [2024-06-13 04:48:04,968][71000] Updated weights for policy 0, policy_version 163734 (0.0024) [2024-06-13 04:48:05,939][70768] Fps is (10 sec: 45875.4, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2682634240. Throughput: 0: 49635.6. Samples: 2211431540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-13 04:48:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:48:08,186][71000] Updated weights for policy 0, policy_version 163744 (0.0025) [2024-06-13 04:48:10,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2682880000. Throughput: 0: 49392.4. Samples: 2211719100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-13 04:48:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:48:11,856][71000] Updated weights for policy 0, policy_version 163754 (0.0026) [2024-06-13 04:48:11,868][70980] Signal inference workers to stop experience collection... (32950 times) [2024-06-13 04:48:11,868][70980] Signal inference workers to resume experience collection... (32950 times) [2024-06-13 04:48:11,887][71000] InferenceWorker_p0-w0: stopping experience collection (32950 times) [2024-06-13 04:48:11,887][71000] InferenceWorker_p0-w0: resuming experience collection (32950 times) [2024-06-13 04:48:14,723][71000] Updated weights for policy 0, policy_version 163764 (0.0025) [2024-06-13 04:48:15,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49971.1, 300 sec: 49263.1). Total num frames: 2683158528. Throughput: 0: 49459.0. Samples: 2212017980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-13 04:48:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:48:18,548][71000] Updated weights for policy 0, policy_version 163774 (0.0028) [2024-06-13 04:48:20,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49971.1, 300 sec: 49318.6). Total num frames: 2683404288. Throughput: 0: 49582.5. Samples: 2212172680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-13 04:48:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:48:21,451][71000] Updated weights for policy 0, policy_version 163784 (0.0024) [2024-06-13 04:48:25,114][71000] Updated weights for policy 0, policy_version 163794 (0.0031) [2024-06-13 04:48:25,939][70768] Fps is (10 sec: 45875.9, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2683617280. Throughput: 0: 49617.0. Samples: 2212464860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-13 04:48:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:48:28,193][71000] Updated weights for policy 0, policy_version 163804 (0.0028) [2024-06-13 04:48:30,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2683879424. Throughput: 0: 49330.7. Samples: 2212751320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-13 04:48:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:48:31,975][71000] Updated weights for policy 0, policy_version 163814 (0.0029) [2024-06-13 04:48:34,820][71000] Updated weights for policy 0, policy_version 163824 (0.0029) [2024-06-13 04:48:35,940][70768] Fps is (10 sec: 52427.6, 60 sec: 49697.9, 300 sec: 49263.0). Total num frames: 2684141568. Throughput: 0: 49268.5. Samples: 2212904840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-13 04:48:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:48:38,509][71000] Updated weights for policy 0, policy_version 163834 (0.0026) [2024-06-13 04:48:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 2684387328. Throughput: 0: 49440.7. Samples: 2213209700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-13 04:48:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 04:48:41,230][71000] Updated weights for policy 0, policy_version 163844 (0.0030) [2024-06-13 04:48:45,014][71000] Updated weights for policy 0, policy_version 163854 (0.0029) [2024-06-13 04:48:45,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2684616704. Throughput: 0: 49307.5. Samples: 2213503680. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-13 04:48:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:48:47,946][71000] Updated weights for policy 0, policy_version 163864 (0.0036) [2024-06-13 04:48:50,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2684862464. Throughput: 0: 49179.9. Samples: 2213644640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-13 04:48:50,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-13 04:48:51,749][71000] Updated weights for policy 0, policy_version 163874 (0.0032) [2024-06-13 04:48:54,424][71000] Updated weights for policy 0, policy_version 163884 (0.0032) [2024-06-13 04:48:55,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2685124608. Throughput: 0: 49515.2. Samples: 2213947280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-13 04:48:55,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:48:58,557][71000] Updated weights for policy 0, policy_version 163894 (0.0025) [2024-06-13 04:49:00,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2685370368. Throughput: 0: 49375.7. Samples: 2214239880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-13 04:49:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:49:01,168][71000] Updated weights for policy 0, policy_version 163904 (0.0027) [2024-06-13 04:49:05,382][71000] Updated weights for policy 0, policy_version 163914 (0.0025) [2024-06-13 04:49:05,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 49318.8). Total num frames: 2685616128. Throughput: 0: 49344.6. Samples: 2214393180. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 04:49:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:49:07,829][71000] Updated weights for policy 0, policy_version 163924 (0.0022) [2024-06-13 04:49:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2685845504. Throughput: 0: 49250.6. Samples: 2214681140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 04:49:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:49:11,882][71000] Updated weights for policy 0, policy_version 163934 (0.0028) [2024-06-13 04:49:14,459][70980] Signal inference workers to stop experience collection... (33000 times) [2024-06-13 04:49:14,460][70980] Signal inference workers to resume experience collection... (33000 times) [2024-06-13 04:49:14,481][71000] InferenceWorker_p0-w0: stopping experience collection (33000 times) [2024-06-13 04:49:14,482][71000] InferenceWorker_p0-w0: resuming experience collection (33000 times) [2024-06-13 04:49:14,602][71000] Updated weights for policy 0, policy_version 163944 (0.0027) [2024-06-13 04:49:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2686107648. Throughput: 0: 49218.7. Samples: 2214966160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 04:49:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 04:49:18,591][71000] Updated weights for policy 0, policy_version 163954 (0.0032) [2024-06-13 04:49:20,911][71000] Updated weights for policy 0, policy_version 163964 (0.0024) [2024-06-13 04:49:20,940][70768] Fps is (10 sec: 54066.3, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 2686386176. Throughput: 0: 49508.0. Samples: 2215132700. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 04:49:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:49:25,417][71000] Updated weights for policy 0, policy_version 163974 (0.0024) [2024-06-13 04:49:25,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2686582784. Throughput: 0: 49095.2. Samples: 2215418980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 04:49:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:49:27,622][71000] Updated weights for policy 0, policy_version 163984 (0.0027) [2024-06-13 04:49:30,940][70768] Fps is (10 sec: 44237.3, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2686828544. Throughput: 0: 49137.8. Samples: 2215714880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 04:49:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:49:31,778][71000] Updated weights for policy 0, policy_version 163994 (0.0031) [2024-06-13 04:49:34,199][71000] Updated weights for policy 0, policy_version 164004 (0.0027) [2024-06-13 04:49:35,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2687090688. Throughput: 0: 49295.0. Samples: 2215862920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 04:49:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:49:38,501][71000] Updated weights for policy 0, policy_version 164014 (0.0030) [2024-06-13 04:49:40,642][71000] Updated weights for policy 0, policy_version 164024 (0.0032) [2024-06-13 04:49:40,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 2687369216. Throughput: 0: 49464.2. Samples: 2216173180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 04:49:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:49:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000164024_2687369216.pth... 
[2024-06-13 04:49:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000163301_2675523584.pth [2024-06-13 04:49:44,988][71000] Updated weights for policy 0, policy_version 164034 (0.0035) [2024-06-13 04:49:45,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2687565824. Throughput: 0: 49422.5. Samples: 2216463900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 04:49:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:49:47,148][71000] Updated weights for policy 0, policy_version 164044 (0.0026) [2024-06-13 04:49:50,939][70768] Fps is (10 sec: 42599.2, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2687795200. Throughput: 0: 49094.7. Samples: 2216602440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 04:49:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:49:51,832][71000] Updated weights for policy 0, policy_version 164054 (0.0035) [2024-06-13 04:49:54,178][71000] Updated weights for policy 0, policy_version 164064 (0.0031) [2024-06-13 04:49:55,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2688073728. Throughput: 0: 49059.6. Samples: 2216888820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 04:49:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:49:58,417][71000] Updated weights for policy 0, policy_version 164074 (0.0029) [2024-06-13 04:50:00,811][71000] Updated weights for policy 0, policy_version 164084 (0.0028) [2024-06-13 04:50:00,940][70768] Fps is (10 sec: 55704.8, 60 sec: 49698.0, 300 sec: 49263.1). Total num frames: 2688352256. Throughput: 0: 49269.7. Samples: 2217183300. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 04:50:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:50:05,274][71000] Updated weights for policy 0, policy_version 164094 (0.0037) [2024-06-13 04:50:05,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2688532480. Throughput: 0: 48854.9. Samples: 2217331160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-13 04:50:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:50:06,708][70980] Signal inference workers to stop experience collection... (33050 times) [2024-06-13 04:50:06,708][70980] Signal inference workers to resume experience collection... (33050 times) [2024-06-13 04:50:06,733][71000] InferenceWorker_p0-w0: stopping experience collection (33050 times) [2024-06-13 04:50:06,734][71000] InferenceWorker_p0-w0: resuming experience collection (33050 times) [2024-06-13 04:50:07,785][71000] Updated weights for policy 0, policy_version 164104 (0.0034) [2024-06-13 04:50:10,939][70768] Fps is (10 sec: 42599.0, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2688778240. Throughput: 0: 48996.1. Samples: 2217623800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-13 04:50:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:50:11,715][71000] Updated weights for policy 0, policy_version 164114 (0.0026) [2024-06-13 04:50:14,536][71000] Updated weights for policy 0, policy_version 164124 (0.0023) [2024-06-13 04:50:15,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2689056768. Throughput: 0: 48865.8. Samples: 2217913840. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-13 04:50:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:50:18,693][71000] Updated weights for policy 0, policy_version 164134 (0.0035) [2024-06-13 04:50:20,940][70768] Fps is (10 sec: 54067.1, 60 sec: 48879.1, 300 sec: 49207.5). Total num frames: 2689318912. 
Throughput: 0: 49045.0. Samples: 2218069940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-13 04:50:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 04:50:21,075][71000] Updated weights for policy 0, policy_version 164144 (0.0031) [2024-06-13 04:50:25,436][71000] Updated weights for policy 0, policy_version 164154 (0.0026) [2024-06-13 04:50:25,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2689515520. Throughput: 0: 48594.0. Samples: 2218359900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-13 04:50:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:50:27,891][71000] Updated weights for policy 0, policy_version 164164 (0.0022) [2024-06-13 04:50:30,939][70768] Fps is (10 sec: 42598.7, 60 sec: 48606.0, 300 sec: 49152.0). Total num frames: 2689744896. Throughput: 0: 48643.4. Samples: 2218652840. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-13 04:50:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:50:32,226][71000] Updated weights for policy 0, policy_version 164174 (0.0026) [2024-06-13 04:50:34,219][71000] Updated weights for policy 0, policy_version 164184 (0.0038) [2024-06-13 04:50:35,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.2, 300 sec: 49318.6). Total num frames: 2690039808. Throughput: 0: 48871.6. Samples: 2218801660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-13 04:50:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:50:38,596][71000] Updated weights for policy 0, policy_version 164194 (0.0032) [2024-06-13 04:50:40,940][70768] Fps is (10 sec: 55705.1, 60 sec: 48879.1, 300 sec: 49207.5). Total num frames: 2690301952. Throughput: 0: 49296.4. Samples: 2219107160. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-13 04:50:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:50:41,234][71000] Updated weights for policy 0, policy_version 164204 (0.0024) [2024-06-13 04:50:45,248][71000] Updated weights for policy 0, policy_version 164214 (0.0027) [2024-06-13 04:50:45,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2690514944. Throughput: 0: 49328.6. Samples: 2219403080. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-13 04:50:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:50:47,662][71000] Updated weights for policy 0, policy_version 164224 (0.0024) [2024-06-13 04:50:50,940][70768] Fps is (10 sec: 44235.9, 60 sec: 49151.8, 300 sec: 49207.5). Total num frames: 2690744320. Throughput: 0: 49066.0. Samples: 2219539140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-13 04:50:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:50:51,859][71000] Updated weights for policy 0, policy_version 164234 (0.0025) [2024-06-13 04:50:54,381][71000] Updated weights for policy 0, policy_version 164244 (0.0034) [2024-06-13 04:50:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2691022848. Throughput: 0: 49220.0. Samples: 2219838700. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-13 04:50:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:50:58,316][71000] Updated weights for policy 0, policy_version 164254 (0.0023) [2024-06-13 04:51:00,940][70768] Fps is (10 sec: 54068.1, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2691284992. Throughput: 0: 49518.6. Samples: 2220142180. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-13 04:51:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:51:01,179][71000] Updated weights for policy 0, policy_version 164264 (0.0032) [2024-06-13 04:51:05,041][71000] Updated weights for policy 0, policy_version 164274 (0.0037) [2024-06-13 04:51:05,491][70980] Signal inference workers to stop experience collection... (33100 times) [2024-06-13 04:51:05,491][70980] Signal inference workers to resume experience collection... (33100 times) [2024-06-13 04:51:05,527][71000] InferenceWorker_p0-w0: stopping experience collection (33100 times) [2024-06-13 04:51:05,527][71000] InferenceWorker_p0-w0: resuming experience collection (33100 times) [2024-06-13 04:51:05,943][70768] Fps is (10 sec: 49136.5, 60 sec: 49695.5, 300 sec: 49151.5). Total num frames: 2691514368. Throughput: 0: 49428.1. Samples: 2220294360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-13 04:51:05,943][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:51:07,710][71000] Updated weights for policy 0, policy_version 164284 (0.0029) [2024-06-13 04:51:10,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2691743744. Throughput: 0: 49467.5. Samples: 2220585940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-13 04:51:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:51:11,851][71000] Updated weights for policy 0, policy_version 164294 (0.0029) [2024-06-13 04:51:14,429][71000] Updated weights for policy 0, policy_version 164304 (0.0032) [2024-06-13 04:51:15,943][70768] Fps is (10 sec: 49152.8, 60 sec: 49149.6, 300 sec: 49318.1). Total num frames: 2692005888. Throughput: 0: 49301.5. Samples: 2220871560. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 04:51:15,943][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:51:18,483][71000] Updated weights for policy 0, policy_version 164314 (0.0028) [2024-06-13 04:51:20,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2692251648. Throughput: 0: 49553.1. Samples: 2221031560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 04:51:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:51:21,553][71000] Updated weights for policy 0, policy_version 164324 (0.0035) [2024-06-13 04:51:25,172][71000] Updated weights for policy 0, policy_version 164334 (0.0034) [2024-06-13 04:51:25,940][70768] Fps is (10 sec: 47527.4, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2692481024. Throughput: 0: 49098.6. Samples: 2221316600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 04:51:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:51:28,122][71000] Updated weights for policy 0, policy_version 164344 (0.0022) [2024-06-13 04:51:30,939][70768] Fps is (10 sec: 45876.1, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2692710400. Throughput: 0: 49088.5. Samples: 2221612060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 04:51:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:51:31,602][71000] Updated weights for policy 0, policy_version 164354 (0.0021) [2024-06-13 04:51:34,454][71000] Updated weights for policy 0, policy_version 164364 (0.0027) [2024-06-13 04:51:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2692988928. Throughput: 0: 49421.5. Samples: 2221763100. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 04:51:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:51:38,232][71000] Updated weights for policy 0, policy_version 164374 (0.0025) [2024-06-13 04:51:40,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2693234688. Throughput: 0: 49357.3. Samples: 2222059780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 04:51:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:51:41,004][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000164383_2693251072.pth... [2024-06-13 04:51:41,040][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000163661_2681421824.pth [2024-06-13 04:51:41,386][71000] Updated weights for policy 0, policy_version 164384 (0.0038) [2024-06-13 04:51:44,895][71000] Updated weights for policy 0, policy_version 164394 (0.0026) [2024-06-13 04:51:45,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2693464064. Throughput: 0: 48989.4. Samples: 2222346700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 04:51:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:51:48,360][71000] Updated weights for policy 0, policy_version 164404 (0.0031) [2024-06-13 04:51:50,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.3, 300 sec: 49207.5). Total num frames: 2693709824. Throughput: 0: 48931.5. Samples: 2222496120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 04:51:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:51:51,510][71000] Updated weights for policy 0, policy_version 164414 (0.0027) [2024-06-13 04:51:54,870][71000] Updated weights for policy 0, policy_version 164424 (0.0023) [2024-06-13 04:51:55,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2693971968. Throughput: 0: 49113.7. Samples: 2222796060. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 04:51:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:51:58,318][71000] Updated weights for policy 0, policy_version 164434 (0.0033) [2024-06-13 04:52:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 2694217728. Throughput: 0: 49112.1. Samples: 2223081460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 04:52:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:52:01,392][71000] Updated weights for policy 0, policy_version 164444 (0.0030) [2024-06-13 04:52:05,079][71000] Updated weights for policy 0, policy_version 164454 (0.0030) [2024-06-13 04:52:05,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49154.6, 300 sec: 49263.1). Total num frames: 2694463488. Throughput: 0: 48761.1. Samples: 2223225800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 04:52:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:52:08,491][71000] Updated weights for policy 0, policy_version 164464 (0.0034) [2024-06-13 04:52:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2694692864. Throughput: 0: 48981.0. Samples: 2223520740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 04:52:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:52:11,683][71000] Updated weights for policy 0, policy_version 164474 (0.0022) [2024-06-13 04:52:15,074][71000] Updated weights for policy 0, policy_version 164484 (0.0031) [2024-06-13 04:52:15,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49154.5, 300 sec: 49318.6). Total num frames: 2694955008. Throughput: 0: 48929.8. Samples: 2223813900. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 04:52:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:52:18,390][71000] Updated weights for policy 0, policy_version 164494 (0.0029) [2024-06-13 04:52:20,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2695184384. Throughput: 0: 48922.1. Samples: 2223964600. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-13 04:52:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:52:21,862][71000] Updated weights for policy 0, policy_version 164504 (0.0026) [2024-06-13 04:52:25,230][71000] Updated weights for policy 0, policy_version 164514 (0.0035) [2024-06-13 04:52:25,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2695413760. Throughput: 0: 48671.5. Samples: 2224250000. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-13 04:52:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:52:26,099][70980] Signal inference workers to stop experience collection... (33150 times) [2024-06-13 04:52:26,099][70980] Signal inference workers to resume experience collection... (33150 times) [2024-06-13 04:52:26,146][71000] InferenceWorker_p0-w0: stopping experience collection (33150 times) [2024-06-13 04:52:26,146][71000] InferenceWorker_p0-w0: resuming experience collection (33150 times) [2024-06-13 04:52:28,565][71000] Updated weights for policy 0, policy_version 164524 (0.0022) [2024-06-13 04:52:30,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 2695675904. Throughput: 0: 48847.4. Samples: 2224544840. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-13 04:52:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:52:31,586][71000] Updated weights for policy 0, policy_version 164534 (0.0026) [2024-06-13 04:52:35,101][71000] Updated weights for policy 0, policy_version 164544 (0.0023) [2024-06-13 04:52:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2695921664. Throughput: 0: 48867.5. Samples: 2224695160. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-13 04:52:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:52:38,135][71000] Updated weights for policy 0, policy_version 164554 (0.0025) [2024-06-13 04:52:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2696167424. Throughput: 0: 48946.6. Samples: 2224998660. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-13 04:52:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:52:42,004][71000] Updated weights for policy 0, policy_version 164564 (0.0034) [2024-06-13 04:52:44,869][71000] Updated weights for policy 0, policy_version 164574 (0.0026) [2024-06-13 04:52:45,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2696413184. Throughput: 0: 48931.6. Samples: 2225283380. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-13 04:52:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:52:48,534][71000] Updated weights for policy 0, policy_version 164584 (0.0031) [2024-06-13 04:52:50,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2696675328. Throughput: 0: 49216.9. Samples: 2225440560. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-13 04:52:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:52:51,564][71000] Updated weights for policy 0, policy_version 164594 (0.0022) [2024-06-13 04:52:54,773][71000] Updated weights for policy 0, policy_version 164604 (0.0029) [2024-06-13 04:52:55,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2696921088. Throughput: 0: 49387.0. Samples: 2225743160. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-13 04:52:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:52:57,766][71000] Updated weights for policy 0, policy_version 164614 (0.0024) [2024-06-13 04:53:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2697150464. Throughput: 0: 49398.6. Samples: 2226036840. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-13 04:53:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 04:53:01,874][71000] Updated weights for policy 0, policy_version 164624 (0.0032) [2024-06-13 04:53:04,487][71000] Updated weights for policy 0, policy_version 164634 (0.0032) [2024-06-13 04:53:05,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49151.7, 300 sec: 49263.0). Total num frames: 2697412608. Throughput: 0: 49240.3. Samples: 2226180420. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-13 04:53:05,941][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:53:08,656][71000] Updated weights for policy 0, policy_version 164644 (0.0041) [2024-06-13 04:53:10,939][70768] Fps is (10 sec: 52429.4, 60 sec: 49698.2, 300 sec: 49207.6). Total num frames: 2697674752. Throughput: 0: 49492.2. Samples: 2226477140. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-13 04:53:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:53:11,011][71000] Updated weights for policy 0, policy_version 164654 (0.0024) [2024-06-13 04:53:14,854][71000] Updated weights for policy 0, policy_version 164664 (0.0030) [2024-06-13 04:53:15,940][70768] Fps is (10 sec: 49153.2, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2697904128. Throughput: 0: 49638.3. Samples: 2226778560. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-13 04:53:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:53:17,758][71000] Updated weights for policy 0, policy_version 164674 (0.0030) [2024-06-13 04:53:20,940][70768] Fps is (10 sec: 45874.5, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2698133504. Throughput: 0: 49410.2. Samples: 2226918620. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-13 04:53:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:53:22,031][71000] Updated weights for policy 0, policy_version 164684 (0.0033) [2024-06-13 04:53:24,622][71000] Updated weights for policy 0, policy_version 164694 (0.0034) [2024-06-13 04:53:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.2, 300 sec: 49207.5). Total num frames: 2698395648. Throughput: 0: 49112.1. Samples: 2227208700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 04:53:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:53:28,611][71000] Updated weights for policy 0, policy_version 164704 (0.0036) [2024-06-13 04:53:28,649][70980] Signal inference workers to stop experience collection... (33200 times) [2024-06-13 04:53:28,649][70980] Signal inference workers to resume experience collection... 
(33200 times) [2024-06-13 04:53:28,658][71000] InferenceWorker_p0-w0: stopping experience collection (33200 times) [2024-06-13 04:53:28,658][71000] InferenceWorker_p0-w0: resuming experience collection (33200 times) [2024-06-13 04:53:30,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.2, 300 sec: 49207.6). Total num frames: 2698657792. Throughput: 0: 49323.9. Samples: 2227502960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 04:53:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:53:31,030][71000] Updated weights for policy 0, policy_version 164714 (0.0025) [2024-06-13 04:53:35,083][71000] Updated weights for policy 0, policy_version 164724 (0.0027) [2024-06-13 04:53:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2698887168. Throughput: 0: 49416.0. Samples: 2227664280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 04:53:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:53:37,581][71000] Updated weights for policy 0, policy_version 164734 (0.0032) [2024-06-13 04:53:40,940][70768] Fps is (10 sec: 45874.5, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2699116544. Throughput: 0: 49201.2. Samples: 2227957220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 04:53:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:53:41,039][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000164742_2699132928.pth... [2024-06-13 04:53:41,087][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000164024_2687369216.pth [2024-06-13 04:53:41,777][71000] Updated weights for policy 0, policy_version 164744 (0.0036) [2024-06-13 04:53:44,402][71000] Updated weights for policy 0, policy_version 164754 (0.0022) [2024-06-13 04:53:45,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 2699378688. Throughput: 0: 49004.3. Samples: 2228242040. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 04:53:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:53:48,440][71000] Updated weights for policy 0, policy_version 164764 (0.0028) [2024-06-13 04:53:50,880][71000] Updated weights for policy 0, policy_version 164774 (0.0040) [2024-06-13 04:53:50,940][70768] Fps is (10 sec: 54068.3, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2699657216. Throughput: 0: 49424.4. Samples: 2228404500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 04:53:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:53:54,966][71000] Updated weights for policy 0, policy_version 164784 (0.0024) [2024-06-13 04:53:55,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2699886592. Throughput: 0: 49478.5. Samples: 2228703680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 04:53:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:53:57,363][71000] Updated weights for policy 0, policy_version 164794 (0.0025) [2024-06-13 04:54:00,940][70768] Fps is (10 sec: 44236.7, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2700099584. Throughput: 0: 49493.0. Samples: 2229005740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 04:54:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:54:01,622][71000] Updated weights for policy 0, policy_version 164804 (0.0027) [2024-06-13 04:54:03,976][71000] Updated weights for policy 0, policy_version 164814 (0.0023) [2024-06-13 04:54:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.2, 300 sec: 49207.5). Total num frames: 2700361728. Throughput: 0: 49336.9. Samples: 2229138780. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 04:54:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:54:08,192][71000] Updated weights for policy 0, policy_version 164824 (0.0033) [2024-06-13 04:54:10,531][71000] Updated weights for policy 0, policy_version 164834 (0.0025) [2024-06-13 04:54:10,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49424.9, 300 sec: 49263.1). Total num frames: 2700640256. Throughput: 0: 49689.8. Samples: 2229444740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 04:54:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:54:14,782][71000] Updated weights for policy 0, policy_version 164844 (0.0024) [2024-06-13 04:54:15,941][70768] Fps is (10 sec: 52419.7, 60 sec: 49696.7, 300 sec: 49151.7). Total num frames: 2700886016. Throughput: 0: 49892.7. Samples: 2229748220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 04:54:15,942][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:54:17,233][71000] Updated weights for policy 0, policy_version 164854 (0.0025) [2024-06-13 04:54:20,940][70768] Fps is (10 sec: 44237.1, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2701082624. Throughput: 0: 49361.8. Samples: 2229885560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 04:54:20,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 04:54:21,329][71000] Updated weights for policy 0, policy_version 164864 (0.0031) [2024-06-13 04:54:23,921][71000] Updated weights for policy 0, policy_version 164874 (0.0028) [2024-06-13 04:54:25,940][70768] Fps is (10 sec: 45882.7, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2701344768. Throughput: 0: 49254.7. Samples: 2230173680. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 04:54:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:54:28,212][71000] Updated weights for policy 0, policy_version 164884 (0.0036) [2024-06-13 04:54:29,999][70980] Signal inference workers to stop experience collection... (33250 times) [2024-06-13 04:54:30,000][70980] Signal inference workers to resume experience collection... (33250 times) [2024-06-13 04:54:30,026][71000] InferenceWorker_p0-w0: stopping experience collection (33250 times) [2024-06-13 04:54:30,027][71000] InferenceWorker_p0-w0: resuming experience collection (33250 times) [2024-06-13 04:54:30,526][71000] Updated weights for policy 0, policy_version 164894 (0.0031) [2024-06-13 04:54:30,940][70768] Fps is (10 sec: 54067.3, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2701623296. Throughput: 0: 49404.7. Samples: 2230465240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 04:54:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:54:34,763][71000] Updated weights for policy 0, policy_version 164904 (0.0034) [2024-06-13 04:54:35,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49152.0, 300 sec: 49041.0). Total num frames: 2701836288. Throughput: 0: 49474.3. Samples: 2230630840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 04:54:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:54:37,288][71000] Updated weights for policy 0, policy_version 164914 (0.0031) [2024-06-13 04:54:40,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49425.2, 300 sec: 49207.6). Total num frames: 2702082048. Throughput: 0: 49419.5. Samples: 2230927560. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 04:54:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:54:41,223][71000] Updated weights for policy 0, policy_version 164924 (0.0028) [2024-06-13 04:54:43,887][71000] Updated weights for policy 0, policy_version 164934 (0.0043) [2024-06-13 04:54:45,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2702327808. Throughput: 0: 49131.9. Samples: 2231216680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 04:54:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:54:47,815][71000] Updated weights for policy 0, policy_version 164944 (0.0026) [2024-06-13 04:54:50,298][71000] Updated weights for policy 0, policy_version 164954 (0.0027) [2024-06-13 04:54:50,940][70768] Fps is (10 sec: 54067.4, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2702622720. Throughput: 0: 49580.1. Samples: 2231369880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 04:54:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:54:54,357][71000] Updated weights for policy 0, policy_version 164964 (0.0029) [2024-06-13 04:54:55,939][70768] Fps is (10 sec: 52429.5, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2702852096. Throughput: 0: 49672.5. Samples: 2231680000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 04:54:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:54:57,128][71000] Updated weights for policy 0, policy_version 164974 (0.0024) [2024-06-13 04:55:00,935][71000] Updated weights for policy 0, policy_version 164984 (0.0023) [2024-06-13 04:55:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 2703097856. Throughput: 0: 49429.5. Samples: 2231972460. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 04:55:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:55:03,911][71000] Updated weights for policy 0, policy_version 164994 (0.0033) [2024-06-13 04:55:05,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2703310848. Throughput: 0: 49336.0. Samples: 2232105680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 04:55:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:55:07,783][71000] Updated weights for policy 0, policy_version 165004 (0.0028) [2024-06-13 04:55:10,526][71000] Updated weights for policy 0, policy_version 165014 (0.0027) [2024-06-13 04:55:10,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2703605760. Throughput: 0: 49437.4. Samples: 2232398360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 04:55:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:55:14,381][71000] Updated weights for policy 0, policy_version 165024 (0.0032) [2024-06-13 04:55:15,939][70768] Fps is (10 sec: 52429.4, 60 sec: 49153.6, 300 sec: 49207.6). Total num frames: 2703835136. Throughput: 0: 49738.3. Samples: 2232703460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 04:55:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:55:17,194][71000] Updated weights for policy 0, policy_version 165034 (0.0028) [2024-06-13 04:55:20,925][71000] Updated weights for policy 0, policy_version 165044 (0.0029) [2024-06-13 04:55:20,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 2704080896. Throughput: 0: 49257.3. Samples: 2232847420. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 04:55:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 04:55:23,766][71000] Updated weights for policy 0, policy_version 165054 (0.0033) [2024-06-13 04:55:25,940][70768] Fps is (10 sec: 45874.2, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2704293888. Throughput: 0: 49140.4. Samples: 2233138880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 04:55:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:55:27,607][71000] Updated weights for policy 0, policy_version 165064 (0.0045) [2024-06-13 04:55:30,043][70980] Signal inference workers to stop experience collection... (33300 times) [2024-06-13 04:55:30,045][70980] Signal inference workers to resume experience collection... (33300 times) [2024-06-13 04:55:30,056][71000] InferenceWorker_p0-w0: stopping experience collection (33300 times) [2024-06-13 04:55:30,088][71000] InferenceWorker_p0-w0: resuming experience collection (33300 times) [2024-06-13 04:55:30,516][71000] Updated weights for policy 0, policy_version 165074 (0.0035) [2024-06-13 04:55:30,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2704588800. Throughput: 0: 49376.5. Samples: 2233438620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 04:55:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:55:34,437][71000] Updated weights for policy 0, policy_version 165084 (0.0028) [2024-06-13 04:55:35,940][70768] Fps is (10 sec: 54067.4, 60 sec: 49971.1, 300 sec: 49263.1). Total num frames: 2704834560. Throughput: 0: 49392.4. Samples: 2233592540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 26.0) [2024-06-13 04:55:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:55:37,014][71000] Updated weights for policy 0, policy_version 165094 (0.0021) [2024-06-13 04:55:40,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2705047552. Throughput: 0: 49174.0. 
Samples: 2233892840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 26.0) [2024-06-13 04:55:40,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-13 04:55:40,954][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000165103_2705047552.pth... [2024-06-13 04:55:41,026][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000164383_2693251072.pth [2024-06-13 04:55:41,167][71000] Updated weights for policy 0, policy_version 165104 (0.0037) [2024-06-13 04:55:43,678][71000] Updated weights for policy 0, policy_version 165114 (0.0027) [2024-06-13 04:55:45,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2705293312. Throughput: 0: 48984.7. Samples: 2234176780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 26.0) [2024-06-13 04:55:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:55:47,620][71000] Updated weights for policy 0, policy_version 165124 (0.0030) [2024-06-13 04:55:50,708][71000] Updated weights for policy 0, policy_version 165134 (0.0031) [2024-06-13 04:55:50,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2705571840. Throughput: 0: 49358.5. Samples: 2234326820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 26.0) [2024-06-13 04:55:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:55:54,585][71000] Updated weights for policy 0, policy_version 165144 (0.0032) [2024-06-13 04:55:55,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2705801216. Throughput: 0: 49320.6. Samples: 2234617780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 26.0) [2024-06-13 04:55:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:55:57,462][71000] Updated weights for policy 0, policy_version 165154 (0.0022) [2024-06-13 04:56:00,939][70768] Fps is (10 sec: 44237.3, 60 sec: 48605.9, 300 sec: 49152.5). Total num frames: 2706014208. Throughput: 0: 49139.0. Samples: 2234914720. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 26.0) [2024-06-13 04:56:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:56:01,126][71000] Updated weights for policy 0, policy_version 165164 (0.0029) [2024-06-13 04:56:03,958][71000] Updated weights for policy 0, policy_version 165174 (0.0022) [2024-06-13 04:56:05,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2706276352. Throughput: 0: 49196.9. Samples: 2235061280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 26.0) [2024-06-13 04:56:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:56:07,663][71000] Updated weights for policy 0, policy_version 165184 (0.0022) [2024-06-13 04:56:10,450][71000] Updated weights for policy 0, policy_version 165194 (0.0032) [2024-06-13 04:56:10,940][70768] Fps is (10 sec: 54066.1, 60 sec: 49152.0, 300 sec: 49319.1). Total num frames: 2706554880. Throughput: 0: 49273.2. Samples: 2235356180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 26.0) [2024-06-13 04:56:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:56:14,693][71000] Updated weights for policy 0, policy_version 165204 (0.0033) [2024-06-13 04:56:15,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2706784256. Throughput: 0: 49106.3. Samples: 2235648400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 26.0) [2024-06-13 04:56:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:56:17,313][71000] Updated weights for policy 0, policy_version 165214 (0.0022) [2024-06-13 04:56:20,940][70768] Fps is (10 sec: 44237.6, 60 sec: 48605.9, 300 sec: 49207.5). Total num frames: 2706997248. Throughput: 0: 48940.5. Samples: 2235794860. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 26.0) [2024-06-13 04:56:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:56:21,112][71000] Updated weights for policy 0, policy_version 165224 (0.0035) [2024-06-13 04:56:23,818][71000] Updated weights for policy 0, policy_version 165234 (0.0032) [2024-06-13 04:56:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49698.2, 300 sec: 49374.1). Total num frames: 2707275776. Throughput: 0: 48874.8. Samples: 2236092200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 26.0) [2024-06-13 04:56:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:56:27,626][71000] Updated weights for policy 0, policy_version 165244 (0.0029) [2024-06-13 04:56:30,703][71000] Updated weights for policy 0, policy_version 165254 (0.0028) [2024-06-13 04:56:30,940][70768] Fps is (10 sec: 52428.2, 60 sec: 48878.8, 300 sec: 49263.1). Total num frames: 2707521536. Throughput: 0: 49155.1. Samples: 2236388760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 26.0) [2024-06-13 04:56:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 04:56:34,266][71000] Updated weights for policy 0, policy_version 165264 (0.0026) [2024-06-13 04:56:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2707767296. Throughput: 0: 49101.9. Samples: 2236536400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 26.0) [2024-06-13 04:56:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:56:37,588][71000] Updated weights for policy 0, policy_version 165274 (0.0032) [2024-06-13 04:56:40,829][70980] Signal inference workers to stop experience collection... (33350 times) [2024-06-13 04:56:40,833][70980] Signal inference workers to resume experience collection... 
(33350 times) [2024-06-13 04:56:40,862][71000] InferenceWorker_p0-w0: stopping experience collection (33350 times) [2024-06-13 04:56:40,862][71000] InferenceWorker_p0-w0: resuming experience collection (33350 times) [2024-06-13 04:56:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49263.0). Total num frames: 2707996672. Throughput: 0: 49228.6. Samples: 2236833080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 04:56:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 04:56:40,966][71000] Updated weights for policy 0, policy_version 165284 (0.0040) [2024-06-13 04:56:44,401][71000] Updated weights for policy 0, policy_version 165294 (0.0031) [2024-06-13 04:56:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2708258816. Throughput: 0: 49437.8. Samples: 2237139420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 04:56:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:56:47,277][71000] Updated weights for policy 0, policy_version 165304 (0.0025) [2024-06-13 04:56:50,670][71000] Updated weights for policy 0, policy_version 165314 (0.0022) [2024-06-13 04:56:50,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2708504576. Throughput: 0: 49408.2. Samples: 2237284660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 04:56:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:56:53,586][71000] Updated weights for policy 0, policy_version 165324 (0.0028) [2024-06-13 04:56:55,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2708766720. Throughput: 0: 49548.6. Samples: 2237585860. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 04:56:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:56:57,219][71000] Updated weights for policy 0, policy_version 165334 (0.0028) [2024-06-13 04:57:00,325][71000] Updated weights for policy 0, policy_version 165344 (0.0040) [2024-06-13 04:57:00,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49971.1, 300 sec: 49318.6). Total num frames: 2709012480. Throughput: 0: 49530.1. Samples: 2237877260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 04:57:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:57:04,109][71000] Updated weights for policy 0, policy_version 165354 (0.0025) [2024-06-13 04:57:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 2709274624. Throughput: 0: 49725.3. Samples: 2238032500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 04:57:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:57:06,937][71000] Updated weights for policy 0, policy_version 165364 (0.0028) [2024-06-13 04:57:10,539][71000] Updated weights for policy 0, policy_version 165374 (0.0031) [2024-06-13 04:57:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2709487616. Throughput: 0: 49730.7. Samples: 2238330080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 04:57:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:57:13,579][71000] Updated weights for policy 0, policy_version 165384 (0.0027) [2024-06-13 04:57:15,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49424.9, 300 sec: 49374.2). Total num frames: 2709749760. Throughput: 0: 49522.2. Samples: 2238617260. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 04:57:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:57:17,298][71000] Updated weights for policy 0, policy_version 165394 (0.0039) [2024-06-13 04:57:20,217][71000] Updated weights for policy 0, policy_version 165404 (0.0038) [2024-06-13 04:57:20,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 49429.7). Total num frames: 2709995520. Throughput: 0: 49776.4. Samples: 2238776340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 04:57:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:57:23,987][71000] Updated weights for policy 0, policy_version 165414 (0.0026) [2024-06-13 04:57:25,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2710241280. Throughput: 0: 49774.5. Samples: 2239072920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 04:57:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 04:57:26,919][71000] Updated weights for policy 0, policy_version 165424 (0.0022) [2024-06-13 04:57:30,613][71000] Updated weights for policy 0, policy_version 165434 (0.0030) [2024-06-13 04:57:30,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2710470656. Throughput: 0: 49395.4. Samples: 2239362220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 04:57:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:57:33,401][71000] Updated weights for policy 0, policy_version 165444 (0.0021) [2024-06-13 04:57:35,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2710732800. Throughput: 0: 49388.1. Samples: 2239507120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 04:57:35,942][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:57:37,546][71000] Updated weights for policy 0, policy_version 165454 (0.0034) [2024-06-13 04:57:40,434][71000] Updated weights for policy 0, policy_version 165464 (0.0028) [2024-06-13 04:57:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.2, 300 sec: 49374.1). Total num frames: 2710978560. Throughput: 0: 49087.9. Samples: 2239794820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 04:57:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:57:41,006][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000165466_2710994944.pth... [2024-06-13 04:57:41,042][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000164742_2699132928.pth [2024-06-13 04:57:44,624][71000] Updated weights for policy 0, policy_version 165474 (0.0032) [2024-06-13 04:57:45,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2711224320. Throughput: 0: 49126.1. Samples: 2240087940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 04:57:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:57:47,146][71000] Updated weights for policy 0, policy_version 165484 (0.0024) [2024-06-13 04:57:50,939][70768] Fps is (10 sec: 45876.1, 60 sec: 48879.1, 300 sec: 49207.6). Total num frames: 2711437312. Throughput: 0: 48767.3. Samples: 2240227020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 04:57:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 04:57:51,054][71000] Updated weights for policy 0, policy_version 165494 (0.0031) [2024-06-13 04:57:53,762][71000] Updated weights for policy 0, policy_version 165504 (0.0030) [2024-06-13 04:57:55,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.8, 300 sec: 49318.6). Total num frames: 2711699456. Throughput: 0: 48534.6. Samples: 2240514140. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 04:57:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:57:57,681][71000] Updated weights for policy 0, policy_version 165514 (0.0030) [2024-06-13 04:57:58,740][70980] Signal inference workers to stop experience collection... (33400 times) [2024-06-13 04:57:58,744][70980] Signal inference workers to resume experience collection... (33400 times) [2024-06-13 04:57:58,767][71000] InferenceWorker_p0-w0: stopping experience collection (33400 times) [2024-06-13 04:57:58,768][71000] InferenceWorker_p0-w0: resuming experience collection (33400 times) [2024-06-13 04:58:00,541][71000] Updated weights for policy 0, policy_version 165524 (0.0023) [2024-06-13 04:58:00,940][70768] Fps is (10 sec: 52427.6, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2711961600. Throughput: 0: 48825.8. Samples: 2240814420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 04:58:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:58:04,429][71000] Updated weights for policy 0, policy_version 165534 (0.0029) [2024-06-13 04:58:05,940][70768] Fps is (10 sec: 50791.1, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2712207360. Throughput: 0: 48822.2. Samples: 2240973340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 04:58:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:58:06,764][71000] Updated weights for policy 0, policy_version 165544 (0.0020) [2024-06-13 04:58:10,857][71000] Updated weights for policy 0, policy_version 165554 (0.0026) [2024-06-13 04:58:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2712436736. Throughput: 0: 48824.7. Samples: 2241270040. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 04:58:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:58:13,406][71000] Updated weights for policy 0, policy_version 165564 (0.0037) [2024-06-13 04:58:15,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.1, 300 sec: 49318.6). Total num frames: 2712682496. Throughput: 0: 48918.4. Samples: 2241563540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 04:58:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:58:17,479][71000] Updated weights for policy 0, policy_version 165574 (0.0023) [2024-06-13 04:58:19,835][71000] Updated weights for policy 0, policy_version 165584 (0.0020) [2024-06-13 04:58:20,940][70768] Fps is (10 sec: 52425.0, 60 sec: 49424.4, 300 sec: 49374.0). Total num frames: 2712961024. Throughput: 0: 49069.3. Samples: 2241715280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 04:58:20,941][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:58:24,113][71000] Updated weights for policy 0, policy_version 165594 (0.0033) [2024-06-13 04:58:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2713174016. Throughput: 0: 49236.1. Samples: 2242010440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 04:58:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 04:58:26,665][71000] Updated weights for policy 0, policy_version 165604 (0.0025) [2024-06-13 04:58:30,578][71000] Updated weights for policy 0, policy_version 165614 (0.0031) [2024-06-13 04:58:30,940][70768] Fps is (10 sec: 47517.5, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2713436160. Throughput: 0: 49379.2. Samples: 2242310000. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 04:58:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 04:58:33,337][71000] Updated weights for policy 0, policy_version 165624 (0.0029) [2024-06-13 04:58:35,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48878.8, 300 sec: 49318.6). Total num frames: 2713665536. Throughput: 0: 49299.3. Samples: 2242445500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 04:58:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:58:37,539][71000] Updated weights for policy 0, policy_version 165634 (0.0027) [2024-06-13 04:58:39,762][71000] Updated weights for policy 0, policy_version 165644 (0.0021) [2024-06-13 04:58:40,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.1, 300 sec: 49318.7). Total num frames: 2713927680. Throughput: 0: 49653.5. Samples: 2242748540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 04:58:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:58:43,909][71000] Updated weights for policy 0, policy_version 165654 (0.0028) [2024-06-13 04:58:45,939][70768] Fps is (10 sec: 50791.8, 60 sec: 49152.2, 300 sec: 49207.5). Total num frames: 2714173440. Throughput: 0: 49673.1. Samples: 2243049700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-13 04:58:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:58:46,479][71000] Updated weights for policy 0, policy_version 165664 (0.0025) [2024-06-13 04:58:48,759][70980] Signal inference workers to stop experience collection... (33450 times) [2024-06-13 04:58:48,792][71000] InferenceWorker_p0-w0: stopping experience collection (33450 times) [2024-06-13 04:58:48,871][70980] Signal inference workers to resume experience collection... 
(33450 times) [2024-06-13 04:58:48,871][71000] InferenceWorker_p0-w0: resuming experience collection (33450 times) [2024-06-13 04:58:50,883][71000] Updated weights for policy 0, policy_version 165674 (0.0030) [2024-06-13 04:58:50,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 2714402816. Throughput: 0: 49274.6. Samples: 2243190700. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-13 04:58:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 04:58:53,224][71000] Updated weights for policy 0, policy_version 165684 (0.0023) [2024-06-13 04:58:55,939][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2714648576. Throughput: 0: 49067.7. Samples: 2243478080. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-13 04:58:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:58:57,640][71000] Updated weights for policy 0, policy_version 165694 (0.0026) [2024-06-13 04:59:00,163][71000] Updated weights for policy 0, policy_version 165704 (0.0024) [2024-06-13 04:59:00,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2714910720. Throughput: 0: 48969.2. Samples: 2243767160. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-13 04:59:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:59:04,205][71000] Updated weights for policy 0, policy_version 165714 (0.0027) [2024-06-13 04:59:05,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2715140096. Throughput: 0: 49082.7. Samples: 2243923960. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-13 04:59:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 04:59:06,798][71000] Updated weights for policy 0, policy_version 165724 (0.0024) [2024-06-13 04:59:10,803][71000] Updated weights for policy 0, policy_version 165734 (0.0026) [2024-06-13 04:59:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.1, 300 sec: 49152.3). Total num frames: 2715385856. Throughput: 0: 49019.1. Samples: 2244216300. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-13 04:59:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 04:59:13,691][71000] Updated weights for policy 0, policy_version 165744 (0.0020) [2024-06-13 04:59:15,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2715615232. Throughput: 0: 48954.8. Samples: 2244512960. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-13 04:59:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:59:17,303][71000] Updated weights for policy 0, policy_version 165754 (0.0024) [2024-06-13 04:59:20,501][71000] Updated weights for policy 0, policy_version 165764 (0.0032) [2024-06-13 04:59:20,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48879.6, 300 sec: 49318.6). Total num frames: 2715893760. Throughput: 0: 49195.3. Samples: 2244659280. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-13 04:59:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 04:59:23,923][71000] Updated weights for policy 0, policy_version 165774 (0.0029) [2024-06-13 04:59:25,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2716155904. Throughput: 0: 49228.4. Samples: 2244963820. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-13 04:59:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 04:59:26,840][71000] Updated weights for policy 0, policy_version 165784 (0.0032) [2024-06-13 04:59:30,587][71000] Updated weights for policy 0, policy_version 165794 (0.0026) [2024-06-13 04:59:30,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2716385280. Throughput: 0: 49231.0. Samples: 2245265100. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-13 04:59:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:59:33,263][71000] Updated weights for policy 0, policy_version 165804 (0.0031) [2024-06-13 04:59:35,939][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.2, 300 sec: 49263.1). Total num frames: 2716614656. Throughput: 0: 49117.9. Samples: 2245401000. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-13 04:59:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 04:59:37,131][71000] Updated weights for policy 0, policy_version 165814 (0.0035) [2024-06-13 04:59:40,284][71000] Updated weights for policy 0, policy_version 165824 (0.0033) [2024-06-13 04:59:40,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2716876800. Throughput: 0: 49200.8. Samples: 2245692120. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-13 04:59:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:59:41,048][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000165826_2716893184.pth... [2024-06-13 04:59:41,099][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000165103_2705047552.pth [2024-06-13 04:59:43,787][71000] Updated weights for policy 0, policy_version 165834 (0.0028) [2024-06-13 04:59:45,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2717138944. Throughput: 0: 49556.5. Samples: 2245997200. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-13 04:59:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 04:59:46,618][71000] Updated weights for policy 0, policy_version 165844 (0.0029) [2024-06-13 04:59:50,427][71000] Updated weights for policy 0, policy_version 165854 (0.0031) [2024-06-13 04:59:50,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2717368320. Throughput: 0: 49512.8. Samples: 2246152040. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-13 04:59:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 04:59:53,192][71000] Updated weights for policy 0, policy_version 165864 (0.0024) [2024-06-13 04:59:55,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2717614080. Throughput: 0: 49699.0. Samples: 2246452760. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-13 04:59:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 04:59:56,817][71000] Updated weights for policy 0, policy_version 165874 (0.0024) [2024-06-13 04:59:59,859][71000] Updated weights for policy 0, policy_version 165884 (0.0031) [2024-06-13 05:00:00,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2717876224. Throughput: 0: 49784.0. Samples: 2246753240. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-13 05:00:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:00:03,415][71000] Updated weights for policy 0, policy_version 165894 (0.0031) [2024-06-13 05:00:05,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 2718138368. Throughput: 0: 49716.0. Samples: 2246896500. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-13 05:00:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:00:06,943][71000] Updated weights for policy 0, policy_version 165904 (0.0032) [2024-06-13 05:00:08,636][70980] Signal inference workers to stop experience collection... (33500 times) [2024-06-13 05:00:08,636][70980] Signal inference workers to resume experience collection... (33500 times) [2024-06-13 05:00:08,670][71000] InferenceWorker_p0-w0: stopping experience collection (33500 times) [2024-06-13 05:00:08,670][71000] InferenceWorker_p0-w0: resuming experience collection (33500 times) [2024-06-13 05:00:10,459][71000] Updated weights for policy 0, policy_version 165914 (0.0032) [2024-06-13 05:00:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2718351360. Throughput: 0: 49215.5. Samples: 2247178520. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-13 05:00:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:00:13,527][71000] Updated weights for policy 0, policy_version 165924 (0.0039) [2024-06-13 05:00:15,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2718597120. Throughput: 0: 49043.1. Samples: 2247472040. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-13 05:00:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:00:17,206][71000] Updated weights for policy 0, policy_version 165934 (0.0032) [2024-06-13 05:00:20,370][71000] Updated weights for policy 0, policy_version 165944 (0.0028) [2024-06-13 05:00:20,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2718842880. Throughput: 0: 49268.9. Samples: 2247618100. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-13 05:00:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:00:23,826][71000] Updated weights for policy 0, policy_version 165954 (0.0020) [2024-06-13 05:00:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2719088640. Throughput: 0: 49268.1. Samples: 2247909180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-13 05:00:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:00:26,854][71000] Updated weights for policy 0, policy_version 165964 (0.0021) [2024-06-13 05:00:30,562][71000] Updated weights for policy 0, policy_version 165974 (0.0031) [2024-06-13 05:00:30,940][70768] Fps is (10 sec: 50788.8, 60 sec: 49424.8, 300 sec: 49207.5). Total num frames: 2719350784. Throughput: 0: 49258.8. Samples: 2248213860. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-13 05:00:30,941][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:00:33,855][71000] Updated weights for policy 0, policy_version 165984 (0.0030) [2024-06-13 05:00:35,940][70768] Fps is (10 sec: 49150.6, 60 sec: 49424.8, 300 sec: 49263.1). Total num frames: 2719580160. Throughput: 0: 48861.1. Samples: 2248350800. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-13 05:00:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:00:37,111][71000] Updated weights for policy 0, policy_version 165994 (0.0028) [2024-06-13 05:00:40,499][71000] Updated weights for policy 0, policy_version 166004 (0.0021) [2024-06-13 05:00:40,939][70768] Fps is (10 sec: 45876.7, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 2719809536. Throughput: 0: 48865.5. Samples: 2248651700. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-13 05:00:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:00:43,541][71000] Updated weights for policy 0, policy_version 166014 (0.0028) [2024-06-13 05:00:45,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2720071680. Throughput: 0: 48687.3. Samples: 2248944180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-13 05:00:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:00:46,934][71000] Updated weights for policy 0, policy_version 166024 (0.0032) [2024-06-13 05:00:50,238][71000] Updated weights for policy 0, policy_version 166034 (0.0028) [2024-06-13 05:00:50,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.2, 300 sec: 49207.5). Total num frames: 2720317440. Throughput: 0: 48929.9. Samples: 2249098340. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-13 05:00:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:00:53,776][71000] Updated weights for policy 0, policy_version 166044 (0.0030) [2024-06-13 05:00:55,939][70768] Fps is (10 sec: 49153.2, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2720563200. Throughput: 0: 49215.2. Samples: 2249393200. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-13 05:00:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:00:56,987][71000] Updated weights for policy 0, policy_version 166054 (0.0037) [2024-06-13 05:01:00,782][71000] Updated weights for policy 0, policy_version 166064 (0.0025) [2024-06-13 05:01:00,940][70768] Fps is (10 sec: 47512.5, 60 sec: 48605.7, 300 sec: 49207.5). Total num frames: 2720792576. Throughput: 0: 49203.0. Samples: 2249686180. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 05:01:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:01:03,631][71000] Updated weights for policy 0, policy_version 166074 (0.0022) [2024-06-13 05:01:05,939][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2721054720. Throughput: 0: 49053.3. Samples: 2249825500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 05:01:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:01:07,241][71000] Updated weights for policy 0, policy_version 166084 (0.0024) [2024-06-13 05:01:10,378][71000] Updated weights for policy 0, policy_version 166094 (0.0032) [2024-06-13 05:01:10,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.0, 300 sec: 49263.0). Total num frames: 2721316864. Throughput: 0: 49331.0. Samples: 2250129080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 05:01:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:01:14,075][71000] Updated weights for policy 0, policy_version 166104 (0.0036) [2024-06-13 05:01:15,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2721529856. Throughput: 0: 48892.8. Samples: 2250414020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 05:01:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 05:01:17,059][71000] Updated weights for policy 0, policy_version 166114 (0.0028) [2024-06-13 05:01:20,940][70768] Fps is (10 sec: 44237.3, 60 sec: 48605.8, 300 sec: 49096.5). Total num frames: 2721759232. Throughput: 0: 48972.7. Samples: 2250554560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 05:01:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:01:21,157][71000] Updated weights for policy 0, policy_version 166124 (0.0033) [2024-06-13 05:01:21,360][70980] Signal inference workers to stop experience collection... 
(33550 times) [2024-06-13 05:01:21,404][71000] InferenceWorker_p0-w0: stopping experience collection (33550 times) [2024-06-13 05:01:21,418][70980] Signal inference workers to resume experience collection... (33550 times) [2024-06-13 05:01:21,423][71000] InferenceWorker_p0-w0: resuming experience collection (33550 times) [2024-06-13 05:01:23,543][71000] Updated weights for policy 0, policy_version 166134 (0.0027) [2024-06-13 05:01:25,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2722037760. Throughput: 0: 48765.2. Samples: 2250846140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 05:01:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:01:27,633][71000] Updated weights for policy 0, policy_version 166144 (0.0028) [2024-06-13 05:01:30,540][71000] Updated weights for policy 0, policy_version 166154 (0.0031) [2024-06-13 05:01:30,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48879.1, 300 sec: 49207.5). Total num frames: 2722283520. Throughput: 0: 48860.6. Samples: 2251142900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 05:01:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:01:34,385][71000] Updated weights for policy 0, policy_version 166164 (0.0023) [2024-06-13 05:01:35,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2722512896. Throughput: 0: 48955.7. Samples: 2251301360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 05:01:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:01:37,120][71000] Updated weights for policy 0, policy_version 166174 (0.0025) [2024-06-13 05:01:40,905][71000] Updated weights for policy 0, policy_version 166184 (0.0030) [2024-06-13 05:01:40,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49151.8, 300 sec: 49152.0). Total num frames: 2722758656. Throughput: 0: 48668.6. Samples: 2251583300. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 05:01:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:01:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000166184_2722758656.pth... [2024-06-13 05:01:40,990][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000165466_2710994944.pth [2024-06-13 05:01:43,825][71000] Updated weights for policy 0, policy_version 166194 (0.0024) [2024-06-13 05:01:45,940][70768] Fps is (10 sec: 50789.0, 60 sec: 49151.8, 300 sec: 49207.5). Total num frames: 2723020800. Throughput: 0: 48692.6. Samples: 2251877360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 05:01:45,941][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:01:47,451][71000] Updated weights for policy 0, policy_version 166204 (0.0025) [2024-06-13 05:01:50,159][71000] Updated weights for policy 0, policy_version 166214 (0.0028) [2024-06-13 05:01:50,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49151.8, 300 sec: 49152.0). Total num frames: 2723266560. Throughput: 0: 49147.4. Samples: 2252037140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 05:01:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:01:53,988][71000] Updated weights for policy 0, policy_version 166224 (0.0026) [2024-06-13 05:01:55,939][70768] Fps is (10 sec: 49154.5, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2723512320. Throughput: 0: 49205.1. Samples: 2252343300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 05:01:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:01:56,634][71000] Updated weights for policy 0, policy_version 166234 (0.0032) [2024-06-13 05:02:00,487][71000] Updated weights for policy 0, policy_version 166244 (0.0033) [2024-06-13 05:02:00,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 2723758080. Throughput: 0: 49395.5. Samples: 2252636820. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 05:02:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:02:03,200][71000] Updated weights for policy 0, policy_version 166254 (0.0029) [2024-06-13 05:02:05,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2724003840. Throughput: 0: 49341.4. Samples: 2252774920. Policy #0 lag: (min: 1.0, avg: 9.0, max: 24.0) [2024-06-13 05:02:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:02:07,103][71000] Updated weights for policy 0, policy_version 166264 (0.0028) [2024-06-13 05:02:09,823][71000] Updated weights for policy 0, policy_version 166274 (0.0027) [2024-06-13 05:02:10,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2724249600. Throughput: 0: 49738.8. Samples: 2253084380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 24.0) [2024-06-13 05:02:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:02:10,963][70980] Signal inference workers to stop experience collection... (33600 times) [2024-06-13 05:02:11,018][71000] InferenceWorker_p0-w0: stopping experience collection (33600 times) [2024-06-13 05:02:11,021][70980] Signal inference workers to resume experience collection... (33600 times) [2024-06-13 05:02:11,028][71000] InferenceWorker_p0-w0: resuming experience collection (33600 times) [2024-06-13 05:02:13,465][71000] Updated weights for policy 0, policy_version 166284 (0.0030) [2024-06-13 05:02:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2724511744. Throughput: 0: 49751.6. Samples: 2253381720. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 24.0) [2024-06-13 05:02:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:02:16,407][71000] Updated weights for policy 0, policy_version 166294 (0.0022) [2024-06-13 05:02:20,185][71000] Updated weights for policy 0, policy_version 166304 (0.0032) [2024-06-13 05:02:20,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2724724736. Throughput: 0: 49340.6. Samples: 2253521680. Policy #0 lag: (min: 1.0, avg: 9.0, max: 24.0) [2024-06-13 05:02:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:02:23,331][71000] Updated weights for policy 0, policy_version 166314 (0.0027) [2024-06-13 05:02:25,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2724986880. Throughput: 0: 49584.5. Samples: 2253814600. Policy #0 lag: (min: 1.0, avg: 9.0, max: 24.0) [2024-06-13 05:02:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:02:26,811][71000] Updated weights for policy 0, policy_version 166324 (0.0033) [2024-06-13 05:02:29,776][71000] Updated weights for policy 0, policy_version 166334 (0.0023) [2024-06-13 05:02:30,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2725249024. Throughput: 0: 49645.4. Samples: 2254111380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 24.0) [2024-06-13 05:02:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:02:33,211][71000] Updated weights for policy 0, policy_version 166344 (0.0026) [2024-06-13 05:02:35,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49698.3, 300 sec: 49207.6). Total num frames: 2725494784. Throughput: 0: 49477.0. Samples: 2254263600. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 24.0) [2024-06-13 05:02:35,950][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:02:36,343][71000] Updated weights for policy 0, policy_version 166354 (0.0024) [2024-06-13 05:02:39,734][71000] Updated weights for policy 0, policy_version 166364 (0.0030) [2024-06-13 05:02:40,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2725724160. Throughput: 0: 49321.2. Samples: 2254562760. Policy #0 lag: (min: 1.0, avg: 9.0, max: 24.0) [2024-06-13 05:02:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:02:43,318][71000] Updated weights for policy 0, policy_version 166374 (0.0025) [2024-06-13 05:02:45,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49152.3, 300 sec: 49263.0). Total num frames: 2725969920. Throughput: 0: 49246.5. Samples: 2254852920. Policy #0 lag: (min: 1.0, avg: 9.0, max: 24.0) [2024-06-13 05:02:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:02:46,719][71000] Updated weights for policy 0, policy_version 166384 (0.0031) [2024-06-13 05:02:49,801][71000] Updated weights for policy 0, policy_version 166394 (0.0027) [2024-06-13 05:02:50,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 2726232064. Throughput: 0: 49432.9. Samples: 2254999400. Policy #0 lag: (min: 1.0, avg: 9.0, max: 24.0) [2024-06-13 05:02:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:02:53,314][71000] Updated weights for policy 0, policy_version 166404 (0.0022) [2024-06-13 05:02:55,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2726494208. Throughput: 0: 49257.2. Samples: 2255300960. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 24.0) [2024-06-13 05:02:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:02:56,481][71000] Updated weights for policy 0, policy_version 166414 (0.0036) [2024-06-13 05:02:59,866][71000] Updated weights for policy 0, policy_version 166424 (0.0034) [2024-06-13 05:03:00,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2726707200. Throughput: 0: 48997.3. Samples: 2255586600. Policy #0 lag: (min: 1.0, avg: 9.0, max: 24.0) [2024-06-13 05:03:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:03:03,231][71000] Updated weights for policy 0, policy_version 166434 (0.0026) [2024-06-13 05:03:05,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2726952960. Throughput: 0: 49053.8. Samples: 2255729100. Policy #0 lag: (min: 1.0, avg: 9.0, max: 24.0) [2024-06-13 05:03:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:03:06,422][71000] Updated weights for policy 0, policy_version 166444 (0.0026) [2024-06-13 05:03:09,843][71000] Updated weights for policy 0, policy_version 166454 (0.0029) [2024-06-13 05:03:10,795][70980] Signal inference workers to stop experience collection... (33650 times) [2024-06-13 05:03:10,795][70980] Signal inference workers to resume experience collection... (33650 times) [2024-06-13 05:03:10,813][71000] InferenceWorker_p0-w0: stopping experience collection (33650 times) [2024-06-13 05:03:10,813][71000] InferenceWorker_p0-w0: resuming experience collection (33650 times) [2024-06-13 05:03:10,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 2727231488. Throughput: 0: 49342.3. Samples: 2256035000. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:03:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:03:13,391][71000] Updated weights for policy 0, policy_version 166464 (0.0034) [2024-06-13 05:03:15,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.0, 300 sec: 49207.7). Total num frames: 2727477248. Throughput: 0: 49612.4. Samples: 2256343940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:03:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:03:16,162][71000] Updated weights for policy 0, policy_version 166474 (0.0036) [2024-06-13 05:03:19,825][71000] Updated weights for policy 0, policy_version 166484 (0.0039) [2024-06-13 05:03:20,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2727690240. Throughput: 0: 49519.6. Samples: 2256491980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:03:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:03:23,146][71000] Updated weights for policy 0, policy_version 166494 (0.0031) [2024-06-13 05:03:25,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49971.3, 300 sec: 49318.6). Total num frames: 2727985152. Throughput: 0: 49325.3. Samples: 2256782400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:03:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:03:26,315][71000] Updated weights for policy 0, policy_version 166504 (0.0027) [2024-06-13 05:03:29,411][71000] Updated weights for policy 0, policy_version 166514 (0.0025) [2024-06-13 05:03:30,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49425.1, 300 sec: 49318.7). Total num frames: 2728214528. Throughput: 0: 49644.2. Samples: 2257086900. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:03:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:03:33,142][71000] Updated weights for policy 0, policy_version 166524 (0.0035) [2024-06-13 05:03:35,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2728460288. Throughput: 0: 49671.8. Samples: 2257234640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:03:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:03:36,166][71000] Updated weights for policy 0, policy_version 166534 (0.0038) [2024-06-13 05:03:39,961][71000] Updated weights for policy 0, policy_version 166544 (0.0031) [2024-06-13 05:03:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2728706048. Throughput: 0: 49648.9. Samples: 2257535160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:03:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:03:41,067][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000166548_2728722432.pth... [2024-06-13 05:03:41,118][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000165826_2716893184.pth [2024-06-13 05:03:42,814][71000] Updated weights for policy 0, policy_version 166554 (0.0036) [2024-06-13 05:03:45,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49698.3, 300 sec: 49318.6). Total num frames: 2728951808. Throughput: 0: 49842.8. Samples: 2257829520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:03:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:03:46,448][71000] Updated weights for policy 0, policy_version 166564 (0.0030) [2024-06-13 05:03:49,340][71000] Updated weights for policy 0, policy_version 166574 (0.0028) [2024-06-13 05:03:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2729197568. Throughput: 0: 50135.1. Samples: 2257985180. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:03:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:03:52,642][71000] Updated weights for policy 0, policy_version 166584 (0.0027) [2024-06-13 05:03:55,942][70768] Fps is (10 sec: 50775.7, 60 sec: 49422.8, 300 sec: 49318.2). Total num frames: 2729459712. Throughput: 0: 50092.9. Samples: 2258289320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:03:55,943][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:03:56,067][71000] Updated weights for policy 0, policy_version 166594 (0.0026) [2024-06-13 05:03:59,546][71000] Updated weights for policy 0, policy_version 166604 (0.0023) [2024-06-13 05:04:00,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49971.3, 300 sec: 49374.2). Total num frames: 2729705472. Throughput: 0: 49640.1. Samples: 2258577740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:04:00,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 05:04:02,658][71000] Updated weights for policy 0, policy_version 166614 (0.0027) [2024-06-13 05:04:05,940][70768] Fps is (10 sec: 49166.0, 60 sec: 49971.2, 300 sec: 49374.2). Total num frames: 2729951232. Throughput: 0: 49543.6. Samples: 2258721440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:04:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:04:05,987][71000] Updated weights for policy 0, policy_version 166624 (0.0033) [2024-06-13 05:04:09,469][71000] Updated weights for policy 0, policy_version 166634 (0.0026) [2024-06-13 05:04:10,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2730196992. Throughput: 0: 49802.9. Samples: 2259023520. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:04:10,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 05:04:12,760][71000] Updated weights for policy 0, policy_version 166644 (0.0036) [2024-06-13 05:04:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2730442752. Throughput: 0: 49739.5. Samples: 2259325180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 05:04:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:04:15,947][71000] Updated weights for policy 0, policy_version 166654 (0.0029) [2024-06-13 05:04:19,173][71000] Updated weights for policy 0, policy_version 166664 (0.0034) [2024-06-13 05:04:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 50244.3, 300 sec: 49318.6). Total num frames: 2730704896. Throughput: 0: 49658.8. Samples: 2259469280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 05:04:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:04:22,440][71000] Updated weights for policy 0, policy_version 166674 (0.0031) [2024-06-13 05:04:25,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2730934272. Throughput: 0: 49607.1. Samples: 2259767480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 05:04:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:04:26,022][71000] Updated weights for policy 0, policy_version 166684 (0.0030) [2024-06-13 05:04:28,586][70980] Signal inference workers to stop experience collection... (33700 times) [2024-06-13 05:04:28,588][70980] Signal inference workers to resume experience collection... 
(33700 times) [2024-06-13 05:04:28,605][71000] InferenceWorker_p0-w0: stopping experience collection (33700 times) [2024-06-13 05:04:28,605][71000] InferenceWorker_p0-w0: resuming experience collection (33700 times) [2024-06-13 05:04:29,047][71000] Updated weights for policy 0, policy_version 166694 (0.0027) [2024-06-13 05:04:30,944][70768] Fps is (10 sec: 49129.7, 60 sec: 49694.3, 300 sec: 49428.9). Total num frames: 2731196416. Throughput: 0: 49610.0. Samples: 2260062200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 05:04:30,945][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:04:32,775][71000] Updated weights for policy 0, policy_version 166704 (0.0026) [2024-06-13 05:04:35,878][71000] Updated weights for policy 0, policy_version 166714 (0.0026) [2024-06-13 05:04:35,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49698.1, 300 sec: 49374.1). Total num frames: 2731442176. Throughput: 0: 49373.2. Samples: 2260206980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 05:04:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:04:39,186][71000] Updated weights for policy 0, policy_version 166724 (0.0023) [2024-06-13 05:04:40,940][70768] Fps is (10 sec: 49173.5, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 2731687936. Throughput: 0: 49297.6. Samples: 2260507580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 05:04:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:04:42,661][71000] Updated weights for policy 0, policy_version 166734 (0.0024) [2024-06-13 05:04:45,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2731917312. Throughput: 0: 49465.3. Samples: 2260803680. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 05:04:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:04:46,064][71000] Updated weights for policy 0, policy_version 166744 (0.0030) [2024-06-13 05:04:49,078][71000] Updated weights for policy 0, policy_version 166754 (0.0028) [2024-06-13 05:04:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 2732195840. Throughput: 0: 49706.5. Samples: 2260958240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 05:04:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:04:52,614][71000] Updated weights for policy 0, policy_version 166764 (0.0032) [2024-06-13 05:04:55,670][71000] Updated weights for policy 0, policy_version 166774 (0.0040) [2024-06-13 05:04:55,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49700.4, 300 sec: 49374.1). Total num frames: 2732441600. Throughput: 0: 49441.6. Samples: 2261248400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 05:04:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:04:59,272][71000] Updated weights for policy 0, policy_version 166784 (0.0024) [2024-06-13 05:05:00,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2732670976. Throughput: 0: 49322.2. Samples: 2261544680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 05:05:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:05:02,520][71000] Updated weights for policy 0, policy_version 166794 (0.0040) [2024-06-13 05:05:05,775][71000] Updated weights for policy 0, policy_version 166804 (0.0024) [2024-06-13 05:05:05,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 49374.2). Total num frames: 2732916736. Throughput: 0: 49192.9. Samples: 2261682960. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 05:05:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:05:08,971][71000] Updated weights for policy 0, policy_version 166814 (0.0029) [2024-06-13 05:05:10,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.0, 300 sec: 49429.7). Total num frames: 2733178880. Throughput: 0: 49349.7. Samples: 2261988220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 05:05:10,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 05:05:12,552][71000] Updated weights for policy 0, policy_version 166824 (0.0039) [2024-06-13 05:05:15,825][71000] Updated weights for policy 0, policy_version 166834 (0.0029) [2024-06-13 05:05:15,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2733408256. Throughput: 0: 49482.8. Samples: 2262288700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-13 05:05:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:05:19,100][71000] Updated weights for policy 0, policy_version 166844 (0.0022) [2024-06-13 05:05:20,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49151.8, 300 sec: 49374.1). Total num frames: 2733654016. Throughput: 0: 49383.9. Samples: 2262429260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-13 05:05:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:05:22,601][71000] Updated weights for policy 0, policy_version 166854 (0.0031) [2024-06-13 05:05:25,825][71000] Updated weights for policy 0, policy_version 166864 (0.0029) [2024-06-13 05:05:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49318.7). Total num frames: 2733899776. Throughput: 0: 49103.7. Samples: 2262717240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-13 05:05:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:05:28,219][70980] Signal inference workers to stop experience collection... 
(33750 times) [2024-06-13 05:05:28,263][71000] InferenceWorker_p0-w0: stopping experience collection (33750 times) [2024-06-13 05:05:28,267][70980] Signal inference workers to resume experience collection... (33750 times) [2024-06-13 05:05:28,275][71000] InferenceWorker_p0-w0: resuming experience collection (33750 times) [2024-06-13 05:05:29,087][71000] Updated weights for policy 0, policy_version 166874 (0.0027) [2024-06-13 05:05:30,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49428.7, 300 sec: 49429.7). Total num frames: 2734161920. Throughput: 0: 48955.0. Samples: 2263006660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-13 05:05:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:05:32,452][71000] Updated weights for policy 0, policy_version 166884 (0.0024) [2024-06-13 05:05:35,804][71000] Updated weights for policy 0, policy_version 166894 (0.0036) [2024-06-13 05:05:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 2734391296. Throughput: 0: 49075.2. Samples: 2263166620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-13 05:05:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:05:39,131][71000] Updated weights for policy 0, policy_version 166904 (0.0040) [2024-06-13 05:05:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2734637056. Throughput: 0: 49090.2. Samples: 2263457460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-13 05:05:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:05:41,066][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000166910_2734653440.pth... [2024-06-13 05:05:41,109][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000166184_2722758656.pth [2024-06-13 05:05:42,557][71000] Updated weights for policy 0, policy_version 166914 (0.0030) [2024-06-13 05:05:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49318.6). 
Total num frames: 2734866432. Throughput: 0: 49116.0. Samples: 2263754900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-13 05:05:45,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 05:05:45,982][71000] Updated weights for policy 0, policy_version 166924 (0.0025) [2024-06-13 05:05:49,158][71000] Updated weights for policy 0, policy_version 166934 (0.0025) [2024-06-13 05:05:50,942][70768] Fps is (10 sec: 50779.6, 60 sec: 49150.3, 300 sec: 49429.3). Total num frames: 2735144960. Throughput: 0: 49400.2. Samples: 2263906080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-13 05:05:50,942][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:05:52,707][71000] Updated weights for policy 0, policy_version 166944 (0.0025) [2024-06-13 05:05:55,803][71000] Updated weights for policy 0, policy_version 166954 (0.0033) [2024-06-13 05:05:55,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 2735374336. Throughput: 0: 49167.1. Samples: 2264200740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-13 05:05:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:05:59,159][71000] Updated weights for policy 0, policy_version 166964 (0.0038) [2024-06-13 05:06:00,940][70768] Fps is (10 sec: 45885.4, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 2735603712. Throughput: 0: 48892.9. Samples: 2264488880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-13 05:06:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:06:02,487][71000] Updated weights for policy 0, policy_version 166974 (0.0031) [2024-06-13 05:06:05,626][71000] Updated weights for policy 0, policy_version 166984 (0.0027) [2024-06-13 05:06:05,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.8, 300 sec: 49318.6). Total num frames: 2735865856. Throughput: 0: 49025.8. Samples: 2264635420. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-13 05:06:05,952][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:06:09,298][71000] Updated weights for policy 0, policy_version 166994 (0.0026) [2024-06-13 05:06:10,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 2736128000. Throughput: 0: 49195.9. Samples: 2264931060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-13 05:06:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:06:12,458][71000] Updated weights for policy 0, policy_version 167004 (0.0026) [2024-06-13 05:06:15,940][70768] Fps is (10 sec: 47514.7, 60 sec: 48878.9, 300 sec: 49429.7). Total num frames: 2736340992. Throughput: 0: 49308.6. Samples: 2265225540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-13 05:06:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:06:15,979][71000] Updated weights for policy 0, policy_version 167014 (0.0029) [2024-06-13 05:06:19,306][71000] Updated weights for policy 0, policy_version 167024 (0.0028) [2024-06-13 05:06:20,940][70768] Fps is (10 sec: 44237.2, 60 sec: 48606.0, 300 sec: 49263.1). Total num frames: 2736570368. Throughput: 0: 48851.5. Samples: 2265364940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-13 05:06:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:06:22,735][71000] Updated weights for policy 0, policy_version 167034 (0.0027) [2024-06-13 05:06:25,788][71000] Updated weights for policy 0, policy_version 167044 (0.0028) [2024-06-13 05:06:25,940][70768] Fps is (10 sec: 50789.2, 60 sec: 49151.9, 300 sec: 49374.1). Total num frames: 2736848896. Throughput: 0: 48763.0. Samples: 2265651800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:06:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:06:28,573][70980] Signal inference workers to stop experience collection... 
(33800 times) [2024-06-13 05:06:28,623][71000] InferenceWorker_p0-w0: stopping experience collection (33800 times) [2024-06-13 05:06:28,682][70980] Signal inference workers to resume experience collection... (33800 times) [2024-06-13 05:06:28,682][71000] InferenceWorker_p0-w0: resuming experience collection (33800 times) [2024-06-13 05:06:29,361][71000] Updated weights for policy 0, policy_version 167054 (0.0029) [2024-06-13 05:06:30,940][70768] Fps is (10 sec: 52429.1, 60 sec: 48879.1, 300 sec: 49429.7). Total num frames: 2737094656. Throughput: 0: 48679.1. Samples: 2265945460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:06:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:06:32,883][71000] Updated weights for policy 0, policy_version 167064 (0.0032) [2024-06-13 05:06:35,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48605.8, 300 sec: 49318.6). Total num frames: 2737307648. Throughput: 0: 48590.4. Samples: 2266092540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:06:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:06:36,282][71000] Updated weights for policy 0, policy_version 167074 (0.0032) [2024-06-13 05:06:39,433][71000] Updated weights for policy 0, policy_version 167084 (0.0030) [2024-06-13 05:06:40,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48332.9, 300 sec: 49207.6). Total num frames: 2737537024. Throughput: 0: 48376.0. Samples: 2266377660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:06:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:06:42,912][71000] Updated weights for policy 0, policy_version 167094 (0.0028) [2024-06-13 05:06:45,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2737815552. Throughput: 0: 48524.0. Samples: 2266672460. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:06:45,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 05:06:46,227][71000] Updated weights for policy 0, policy_version 167104 (0.0026) [2024-06-13 05:06:49,600][71000] Updated weights for policy 0, policy_version 167114 (0.0031) [2024-06-13 05:06:50,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48334.5, 300 sec: 49263.0). Total num frames: 2738044928. Throughput: 0: 48529.8. Samples: 2266819260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:06:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:06:52,892][71000] Updated weights for policy 0, policy_version 167124 (0.0030) [2024-06-13 05:06:55,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48605.8, 300 sec: 49263.1). Total num frames: 2738290688. Throughput: 0: 48735.5. Samples: 2267124160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:06:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:06:56,485][71000] Updated weights for policy 0, policy_version 167134 (0.0028) [2024-06-13 05:06:59,577][71000] Updated weights for policy 0, policy_version 167144 (0.0035) [2024-06-13 05:07:00,940][70768] Fps is (10 sec: 45876.1, 60 sec: 48332.8, 300 sec: 49152.0). Total num frames: 2738503680. Throughput: 0: 48500.4. Samples: 2267408060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:07:00,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 05:07:03,249][71000] Updated weights for policy 0, policy_version 167154 (0.0024) [2024-06-13 05:07:05,939][70768] Fps is (10 sec: 50791.3, 60 sec: 48879.1, 300 sec: 49318.6). Total num frames: 2738798592. Throughput: 0: 48582.3. Samples: 2267551140. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:07:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:07:06,285][71000] Updated weights for policy 0, policy_version 167164 (0.0025) [2024-06-13 05:07:09,845][71000] Updated weights for policy 0, policy_version 167174 (0.0027) [2024-06-13 05:07:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 47786.8, 300 sec: 49096.5). Total num frames: 2738995200. Throughput: 0: 48719.8. Samples: 2267844180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:07:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:07:13,073][71000] Updated weights for policy 0, policy_version 167184 (0.0023) [2024-06-13 05:07:15,940][70768] Fps is (10 sec: 47512.6, 60 sec: 48878.8, 300 sec: 49318.6). Total num frames: 2739273728. Throughput: 0: 48828.2. Samples: 2268142740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:07:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:07:16,573][71000] Updated weights for policy 0, policy_version 167194 (0.0028) [2024-06-13 05:07:19,402][71000] Updated weights for policy 0, policy_version 167204 (0.0034) [2024-06-13 05:07:20,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2739486720. Throughput: 0: 48845.0. Samples: 2268290560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:07:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:07:22,908][71000] Updated weights for policy 0, policy_version 167214 (0.0022) [2024-06-13 05:07:25,923][71000] Updated weights for policy 0, policy_version 167224 (0.0025) [2024-06-13 05:07:25,940][70768] Fps is (10 sec: 52429.7, 60 sec: 49152.2, 300 sec: 49318.6). Total num frames: 2739798016. Throughput: 0: 49259.1. Samples: 2268594320. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-13 05:07:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:07:29,292][71000] Updated weights for policy 0, policy_version 167234 (0.0034) [2024-06-13 05:07:30,940][70768] Fps is (10 sec: 52427.8, 60 sec: 48605.7, 300 sec: 49207.5). Total num frames: 2740011008. Throughput: 0: 49436.3. Samples: 2268897100. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-13 05:07:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:07:32,307][71000] Updated weights for policy 0, policy_version 167244 (0.0021) [2024-06-13 05:07:35,142][70980] Signal inference workers to stop experience collection... (33850 times) [2024-06-13 05:07:35,142][70980] Signal inference workers to resume experience collection... (33850 times) [2024-06-13 05:07:35,187][71000] InferenceWorker_p0-w0: stopping experience collection (33850 times) [2024-06-13 05:07:35,187][71000] InferenceWorker_p0-w0: resuming experience collection (33850 times) [2024-06-13 05:07:35,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2740273152. Throughput: 0: 49517.8. Samples: 2269047560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-13 05:07:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:07:36,103][71000] Updated weights for policy 0, policy_version 167254 (0.0030) [2024-06-13 05:07:39,018][71000] Updated weights for policy 0, policy_version 167264 (0.0030) [2024-06-13 05:07:40,941][70768] Fps is (10 sec: 49147.3, 60 sec: 49424.2, 300 sec: 49262.9). Total num frames: 2740502528. Throughput: 0: 49121.2. Samples: 2269334660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-13 05:07:40,941][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:07:40,995][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000167268_2740518912.pth... 
[2024-06-13 05:07:41,041][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000166548_2728722432.pth [2024-06-13 05:07:42,503][71000] Updated weights for policy 0, policy_version 167274 (0.0028) [2024-06-13 05:07:45,743][71000] Updated weights for policy 0, policy_version 167284 (0.0034) [2024-06-13 05:07:45,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2740781056. Throughput: 0: 49384.7. Samples: 2269630380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-13 05:07:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:07:49,334][71000] Updated weights for policy 0, policy_version 167294 (0.0029) [2024-06-13 05:07:50,940][70768] Fps is (10 sec: 49157.5, 60 sec: 49152.2, 300 sec: 49152.0). Total num frames: 2740994048. Throughput: 0: 49410.7. Samples: 2269774620. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-13 05:07:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:07:52,658][71000] Updated weights for policy 0, policy_version 167304 (0.0024) [2024-06-13 05:07:55,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2741256192. Throughput: 0: 49506.3. Samples: 2270071960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-13 05:07:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 05:07:56,220][71000] Updated weights for policy 0, policy_version 167314 (0.0025) [2024-06-13 05:07:59,383][71000] Updated weights for policy 0, policy_version 167324 (0.0031) [2024-06-13 05:08:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2741485568. Throughput: 0: 49450.0. Samples: 2270367980. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-13 05:08:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:08:02,885][71000] Updated weights for policy 0, policy_version 167334 (0.0029) [2024-06-13 05:08:05,838][71000] Updated weights for policy 0, policy_version 167344 (0.0021) [2024-06-13 05:08:05,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2741764096. Throughput: 0: 49553.6. Samples: 2270520480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-13 05:08:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:08:09,451][71000] Updated weights for policy 0, policy_version 167354 (0.0030) [2024-06-13 05:08:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2741960704. Throughput: 0: 49192.4. Samples: 2270807980. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-13 05:08:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:08:12,892][71000] Updated weights for policy 0, policy_version 167364 (0.0030) [2024-06-13 05:08:15,944][70768] Fps is (10 sec: 45855.9, 60 sec: 49148.6, 300 sec: 49262.4). Total num frames: 2742222848. Throughput: 0: 49004.4. Samples: 2271102500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-13 05:08:15,944][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:08:16,267][71000] Updated weights for policy 0, policy_version 167374 (0.0022) [2024-06-13 05:08:19,341][71000] Updated weights for policy 0, policy_version 167384 (0.0023) [2024-06-13 05:08:20,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 2742468608. Throughput: 0: 48951.3. Samples: 2271250360. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-13 05:08:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:08:22,758][71000] Updated weights for policy 0, policy_version 167394 (0.0023) [2024-06-13 05:08:25,355][70980] Signal inference workers to stop experience collection... (33900 times) [2024-06-13 05:08:25,358][70980] Signal inference workers to resume experience collection... (33900 times) [2024-06-13 05:08:25,396][71000] InferenceWorker_p0-w0: stopping experience collection (33900 times) [2024-06-13 05:08:25,396][71000] InferenceWorker_p0-w0: resuming experience collection (33900 times) [2024-06-13 05:08:25,910][71000] Updated weights for policy 0, policy_version 167404 (0.0039) [2024-06-13 05:08:25,940][70768] Fps is (10 sec: 52451.3, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2742747136. Throughput: 0: 49280.3. Samples: 2271552220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-13 05:08:25,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 05:08:29,886][71000] Updated weights for policy 0, policy_version 167414 (0.0036) [2024-06-13 05:08:30,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2742960128. Throughput: 0: 49033.3. Samples: 2271836880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 05:08:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:08:32,944][71000] Updated weights for policy 0, policy_version 167424 (0.0025) [2024-06-13 05:08:35,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2743205888. Throughput: 0: 48934.2. Samples: 2271976660. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 05:08:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:08:36,387][71000] Updated weights for policy 0, policy_version 167434 (0.0026) [2024-06-13 05:08:39,451][71000] Updated weights for policy 0, policy_version 167444 (0.0030) [2024-06-13 05:08:40,939][70768] Fps is (10 sec: 49153.2, 60 sec: 49153.0, 300 sec: 49152.0). Total num frames: 2743451648. Throughput: 0: 48985.8. Samples: 2272276320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 05:08:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:08:43,238][71000] Updated weights for policy 0, policy_version 167454 (0.0032) [2024-06-13 05:08:45,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48332.9, 300 sec: 49096.5). Total num frames: 2743681024. Throughput: 0: 48832.5. Samples: 2272565440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 05:08:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:08:46,601][71000] Updated weights for policy 0, policy_version 167464 (0.0026) [2024-06-13 05:08:50,078][71000] Updated weights for policy 0, policy_version 167474 (0.0034) [2024-06-13 05:08:50,939][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.9, 300 sec: 48985.9). Total num frames: 2743910400. Throughput: 0: 48577.0. Samples: 2272706440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 05:08:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:08:53,459][71000] Updated weights for policy 0, policy_version 167484 (0.0027) [2024-06-13 05:08:55,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48332.6, 300 sec: 48985.4). Total num frames: 2744156160. Throughput: 0: 48448.7. Samples: 2272988180. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 05:08:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 05:08:56,882][71000] Updated weights for policy 0, policy_version 167494 (0.0030) [2024-06-13 05:09:00,127][71000] Updated weights for policy 0, policy_version 167504 (0.0024) [2024-06-13 05:09:00,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2744434688. Throughput: 0: 48474.7. Samples: 2273283660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 05:09:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:09:03,781][71000] Updated weights for policy 0, policy_version 167514 (0.0028) [2024-06-13 05:09:05,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48059.8, 300 sec: 48985.4). Total num frames: 2744647680. Throughput: 0: 48633.7. Samples: 2273438880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 05:09:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:09:06,799][71000] Updated weights for policy 0, policy_version 167524 (0.0024) [2024-06-13 05:09:10,417][71000] Updated weights for policy 0, policy_version 167534 (0.0025) [2024-06-13 05:09:10,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2744893440. Throughput: 0: 48372.0. Samples: 2273728960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 05:09:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:09:13,507][71000] Updated weights for policy 0, policy_version 167544 (0.0030) [2024-06-13 05:09:15,939][70768] Fps is (10 sec: 49152.7, 60 sec: 48609.4, 300 sec: 48929.9). Total num frames: 2745139200. Throughput: 0: 48347.4. Samples: 2274012500. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 05:09:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:09:17,379][71000] Updated weights for policy 0, policy_version 167554 (0.0020) [2024-06-13 05:09:20,387][71000] Updated weights for policy 0, policy_version 167564 (0.0032) [2024-06-13 05:09:20,939][70768] Fps is (10 sec: 50790.8, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2745401344. Throughput: 0: 48518.8. Samples: 2274160000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 05:09:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:09:24,149][71000] Updated weights for policy 0, policy_version 167574 (0.0026) [2024-06-13 05:09:25,940][70768] Fps is (10 sec: 47512.7, 60 sec: 47786.6, 300 sec: 48875.0). Total num frames: 2745614336. Throughput: 0: 48339.8. Samples: 2274451620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 05:09:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:09:27,022][71000] Updated weights for policy 0, policy_version 167584 (0.0032) [2024-06-13 05:09:27,502][70980] Signal inference workers to stop experience collection... (33950 times) [2024-06-13 05:09:27,503][70980] Signal inference workers to resume experience collection... (33950 times) [2024-06-13 05:09:27,523][71000] InferenceWorker_p0-w0: stopping experience collection (33950 times) [2024-06-13 05:09:27,523][71000] InferenceWorker_p0-w0: resuming experience collection (33950 times) [2024-06-13 05:09:30,856][71000] Updated weights for policy 0, policy_version 167594 (0.0036) [2024-06-13 05:09:30,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 2745860096. Throughput: 0: 48439.6. Samples: 2274745220. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 05:09:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:09:33,864][71000] Updated weights for policy 0, policy_version 167604 (0.0040) [2024-06-13 05:09:35,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48605.8, 300 sec: 48929.9). Total num frames: 2746122240. Throughput: 0: 48598.9. Samples: 2274893400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-13 05:09:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:09:37,665][71000] Updated weights for policy 0, policy_version 167614 (0.0035) [2024-06-13 05:09:40,614][71000] Updated weights for policy 0, policy_version 167624 (0.0031) [2024-06-13 05:09:40,939][70768] Fps is (10 sec: 50790.8, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2746368000. Throughput: 0: 48859.8. Samples: 2275186860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-13 05:09:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:09:41,040][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000167626_2746384384.pth... [2024-06-13 05:09:41,116][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000166910_2734653440.pth [2024-06-13 05:09:44,242][71000] Updated weights for policy 0, policy_version 167634 (0.0032) [2024-06-13 05:09:45,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48332.7, 300 sec: 48763.2). Total num frames: 2746580992. Throughput: 0: 48691.6. Samples: 2275474780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-13 05:09:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:09:47,264][71000] Updated weights for policy 0, policy_version 167644 (0.0022) [2024-06-13 05:09:50,773][71000] Updated weights for policy 0, policy_version 167654 (0.0031) [2024-06-13 05:09:50,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 2746843136. Throughput: 0: 48456.0. Samples: 2275619400. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-13 05:09:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:09:53,864][71000] Updated weights for policy 0, policy_version 167664 (0.0031) [2024-06-13 05:09:55,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 2747105280. Throughput: 0: 48524.4. Samples: 2275912560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-13 05:09:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:09:57,758][71000] Updated weights for policy 0, policy_version 167674 (0.0027) [2024-06-13 05:10:00,618][71000] Updated weights for policy 0, policy_version 167684 (0.0023) [2024-06-13 05:10:00,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48606.0, 300 sec: 48929.8). Total num frames: 2747351040. Throughput: 0: 48645.7. Samples: 2276201560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-13 05:10:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:10:04,672][71000] Updated weights for policy 0, policy_version 167694 (0.0028) [2024-06-13 05:10:05,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 2747580416. Throughput: 0: 48656.4. Samples: 2276349540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-13 05:10:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:10:07,348][71000] Updated weights for policy 0, policy_version 167704 (0.0024) [2024-06-13 05:10:10,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 2747809792. Throughput: 0: 48702.4. Samples: 2276643220. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-13 05:10:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:10:11,190][71000] Updated weights for policy 0, policy_version 167714 (0.0023) [2024-06-13 05:10:14,179][71000] Updated weights for policy 0, policy_version 167724 (0.0026) [2024-06-13 05:10:15,940][70768] Fps is (10 sec: 50789.3, 60 sec: 49151.8, 300 sec: 48929.8). Total num frames: 2748088320. Throughput: 0: 48761.1. Samples: 2276939480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-13 05:10:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:10:17,746][71000] Updated weights for policy 0, policy_version 167734 (0.0033) [2024-06-13 05:10:20,731][71000] Updated weights for policy 0, policy_version 167744 (0.0028) [2024-06-13 05:10:20,940][70768] Fps is (10 sec: 52428.1, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 2748334080. Throughput: 0: 48866.7. Samples: 2277092400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-13 05:10:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:10:24,353][71000] Updated weights for policy 0, policy_version 167754 (0.0030) [2024-06-13 05:10:25,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 2748547072. Throughput: 0: 48896.8. Samples: 2277387220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-13 05:10:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:10:27,141][71000] Updated weights for policy 0, policy_version 167764 (0.0034) [2024-06-13 05:10:30,833][71000] Updated weights for policy 0, policy_version 167774 (0.0030) [2024-06-13 05:10:30,939][70768] Fps is (10 sec: 47514.4, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 2748809216. Throughput: 0: 49148.6. Samples: 2277686460. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-13 05:10:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:10:33,742][71000] Updated weights for policy 0, policy_version 167784 (0.0032) [2024-06-13 05:10:35,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 2749071360. Throughput: 0: 49306.2. Samples: 2277838180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-13 05:10:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:10:37,629][71000] Updated weights for policy 0, policy_version 167794 (0.0029) [2024-06-13 05:10:40,752][71000] Updated weights for policy 0, policy_version 167804 (0.0032) [2024-06-13 05:10:40,940][70768] Fps is (10 sec: 50789.2, 60 sec: 49151.8, 300 sec: 48985.4). Total num frames: 2749317120. Throughput: 0: 49235.8. Samples: 2278128180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 05:10:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:10:44,345][71000] Updated weights for policy 0, policy_version 167814 (0.0030) [2024-06-13 05:10:45,837][70980] Signal inference workers to stop experience collection... (34000 times) [2024-06-13 05:10:45,881][71000] InferenceWorker_p0-w0: stopping experience collection (34000 times) [2024-06-13 05:10:45,889][70980] Signal inference workers to resume experience collection... (34000 times) [2024-06-13 05:10:45,890][71000] InferenceWorker_p0-w0: resuming experience collection (34000 times) [2024-06-13 05:10:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 48819.1). Total num frames: 2749546496. Throughput: 0: 49482.2. Samples: 2278428260. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 05:10:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:10:47,243][71000] Updated weights for policy 0, policy_version 167824 (0.0035) [2024-06-13 05:10:50,853][71000] Updated weights for policy 0, policy_version 167834 (0.0023) [2024-06-13 05:10:50,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 2749792256. Throughput: 0: 49229.2. Samples: 2278564860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 05:10:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:10:53,761][71000] Updated weights for policy 0, policy_version 167844 (0.0026) [2024-06-13 05:10:55,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 2750054400. Throughput: 0: 49361.2. Samples: 2278864480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 05:10:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:10:57,482][71000] Updated weights for policy 0, policy_version 167854 (0.0032) [2024-06-13 05:11:00,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 2750267392. Throughput: 0: 49296.2. Samples: 2279157800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 05:11:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:11:00,954][71000] Updated weights for policy 0, policy_version 167864 (0.0027) [2024-06-13 05:11:04,250][71000] Updated weights for policy 0, policy_version 167874 (0.0032) [2024-06-13 05:11:05,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48605.7, 300 sec: 48707.7). Total num frames: 2750496768. Throughput: 0: 49004.4. Samples: 2279297600. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 05:11:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:11:07,445][71000] Updated weights for policy 0, policy_version 167884 (0.0021) [2024-06-13 05:11:10,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 2750758912. Throughput: 0: 48950.7. Samples: 2279590000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 05:11:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 05:11:10,974][71000] Updated weights for policy 0, policy_version 167894 (0.0035) [2024-06-13 05:11:13,873][71000] Updated weights for policy 0, policy_version 167904 (0.0033) [2024-06-13 05:11:15,940][70768] Fps is (10 sec: 54067.4, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2751037440. Throughput: 0: 49000.7. Samples: 2279891500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 05:11:15,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 05:11:17,634][71000] Updated weights for policy 0, policy_version 167914 (0.0033) [2024-06-13 05:11:20,698][71000] Updated weights for policy 0, policy_version 167924 (0.0030) [2024-06-13 05:11:20,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 2751266816. Throughput: 0: 49013.4. Samples: 2280043780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 05:11:20,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 05:11:24,136][71000] Updated weights for policy 0, policy_version 167934 (0.0038) [2024-06-13 05:11:25,939][70768] Fps is (10 sec: 47514.5, 60 sec: 49425.1, 300 sec: 48874.3). Total num frames: 2751512576. Throughput: 0: 49068.7. Samples: 2280336260. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 05:11:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 05:11:27,601][71000] Updated weights for policy 0, policy_version 167944 (0.0024) [2024-06-13 05:11:30,884][71000] Updated weights for policy 0, policy_version 167954 (0.0029) [2024-06-13 05:11:30,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49151.8, 300 sec: 48985.4). Total num frames: 2751758336. Throughput: 0: 48978.1. Samples: 2280632280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 05:11:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:11:34,050][71000] Updated weights for policy 0, policy_version 167964 (0.0026) [2024-06-13 05:11:35,939][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2752020480. Throughput: 0: 49262.7. Samples: 2280781680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 05:11:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:11:37,665][71000] Updated weights for policy 0, policy_version 167974 (0.0027) [2024-06-13 05:11:40,805][71000] Updated weights for policy 0, policy_version 167984 (0.0031) [2024-06-13 05:11:40,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49152.2, 300 sec: 48985.4). Total num frames: 2752266240. Throughput: 0: 49269.1. Samples: 2281081580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 05:11:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:11:40,946][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000167985_2752266240.pth... [2024-06-13 05:11:40,989][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000167268_2740518912.pth [2024-06-13 05:11:44,152][71000] Updated weights for policy 0, policy_version 167994 (0.0026) [2024-06-13 05:11:45,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2752495616. Throughput: 0: 49398.6. Samples: 2281380740. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:11:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:11:47,432][71000] Updated weights for policy 0, policy_version 168004 (0.0031) [2024-06-13 05:11:50,571][71000] Updated weights for policy 0, policy_version 168014 (0.0032) [2024-06-13 05:11:50,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 2752741376. Throughput: 0: 49304.5. Samples: 2281516300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:11:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:11:54,018][71000] Updated weights for policy 0, policy_version 168024 (0.0032) [2024-06-13 05:11:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2753003520. Throughput: 0: 49513.7. Samples: 2281818120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:11:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:11:57,463][71000] Updated weights for policy 0, policy_version 168034 (0.0033) [2024-06-13 05:12:00,526][71000] Updated weights for policy 0, policy_version 168044 (0.0028) [2024-06-13 05:12:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.0, 300 sec: 48985.3). Total num frames: 2753249280. Throughput: 0: 49531.9. Samples: 2282120440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:12:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:12:04,096][71000] Updated weights for policy 0, policy_version 168054 (0.0035) [2024-06-13 05:12:05,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49971.4, 300 sec: 49152.0). Total num frames: 2753495040. Throughput: 0: 49444.0. Samples: 2282268760. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:12:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:12:07,237][71000] Updated weights for policy 0, policy_version 168064 (0.0028) [2024-06-13 05:12:10,675][71000] Updated weights for policy 0, policy_version 168074 (0.0026) [2024-06-13 05:12:10,941][70768] Fps is (10 sec: 47506.0, 60 sec: 49423.6, 300 sec: 48985.1). Total num frames: 2753724416. Throughput: 0: 49406.0. Samples: 2282559620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:12:10,942][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:12:13,773][71000] Updated weights for policy 0, policy_version 168084 (0.0028) [2024-06-13 05:12:15,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2754002944. Throughput: 0: 49332.9. Samples: 2282852260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:12:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:12:17,667][71000] Updated weights for policy 0, policy_version 168094 (0.0025) [2024-06-13 05:12:20,700][71000] Updated weights for policy 0, policy_version 168104 (0.0030) [2024-06-13 05:12:20,940][70768] Fps is (10 sec: 50799.1, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 2754232320. Throughput: 0: 49243.9. Samples: 2282997660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:12:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:12:24,329][71000] Updated weights for policy 0, policy_version 168114 (0.0029) [2024-06-13 05:12:25,481][70980] Signal inference workers to stop experience collection... (34050 times) [2024-06-13 05:12:25,481][70980] Signal inference workers to resume experience collection... 
(34050 times) [2024-06-13 05:12:25,523][71000] InferenceWorker_p0-w0: stopping experience collection (34050 times) [2024-06-13 05:12:25,523][71000] InferenceWorker_p0-w0: resuming experience collection (34050 times) [2024-06-13 05:12:25,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 2754461696. Throughput: 0: 49237.6. Samples: 2283297280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:12:25,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 05:12:27,218][71000] Updated weights for policy 0, policy_version 168124 (0.0033) [2024-06-13 05:12:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 2754691072. Throughput: 0: 48984.8. Samples: 2283585060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:12:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 05:12:30,970][71000] Updated weights for policy 0, policy_version 168134 (0.0034) [2024-06-13 05:12:34,037][71000] Updated weights for policy 0, policy_version 168144 (0.0027) [2024-06-13 05:12:35,942][70768] Fps is (10 sec: 49139.4, 60 sec: 48876.7, 300 sec: 48985.1). Total num frames: 2754953216. Throughput: 0: 49232.8. Samples: 2283731900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:12:35,943][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:12:37,662][71000] Updated weights for policy 0, policy_version 168154 (0.0022) [2024-06-13 05:12:40,706][71000] Updated weights for policy 0, policy_version 168164 (0.0031) [2024-06-13 05:12:40,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 2755198976. Throughput: 0: 48841.7. Samples: 2284016000. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:12:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:12:44,342][71000] Updated weights for policy 0, policy_version 168174 (0.0022) [2024-06-13 05:12:45,940][70768] Fps is (10 sec: 49164.8, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2755444736. Throughput: 0: 48904.6. Samples: 2284321140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:12:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:12:47,291][71000] Updated weights for policy 0, policy_version 168184 (0.0019) [2024-06-13 05:12:50,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 2755657728. Throughput: 0: 48680.8. Samples: 2284459400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:12:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:12:51,263][71000] Updated weights for policy 0, policy_version 168194 (0.0034) [2024-06-13 05:12:53,918][71000] Updated weights for policy 0, policy_version 168204 (0.0025) [2024-06-13 05:12:55,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2755936256. Throughput: 0: 48867.3. Samples: 2284758560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:12:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:12:57,698][71000] Updated weights for policy 0, policy_version 168214 (0.0034) [2024-06-13 05:13:00,770][71000] Updated weights for policy 0, policy_version 168224 (0.0027) [2024-06-13 05:13:00,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 2756182016. Throughput: 0: 49002.3. Samples: 2285057360. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:13:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:13:04,103][71000] Updated weights for policy 0, policy_version 168234 (0.0028) [2024-06-13 05:13:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2756427776. Throughput: 0: 49077.0. Samples: 2285206120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:13:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:13:07,195][71000] Updated weights for policy 0, policy_version 168244 (0.0025) [2024-06-13 05:13:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48880.4, 300 sec: 48930.6). Total num frames: 2756657152. Throughput: 0: 48914.8. Samples: 2285498440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:13:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:13:11,022][71000] Updated weights for policy 0, policy_version 168254 (0.0026) [2024-06-13 05:13:13,866][71000] Updated weights for policy 0, policy_version 168264 (0.0030) [2024-06-13 05:13:15,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 2756919296. Throughput: 0: 48808.5. Samples: 2285781440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:13:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:13:17,851][71000] Updated weights for policy 0, policy_version 168274 (0.0023) [2024-06-13 05:13:20,849][71000] Updated weights for policy 0, policy_version 168284 (0.0025) [2024-06-13 05:13:20,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 2757165056. Throughput: 0: 48804.5. Samples: 2285927980. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:13:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:13:24,622][71000] Updated weights for policy 0, policy_version 168294 (0.0024) [2024-06-13 05:13:25,939][70768] Fps is (10 sec: 45875.3, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 2757378048. Throughput: 0: 49029.9. Samples: 2286222340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:13:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:13:27,535][71000] Updated weights for policy 0, policy_version 168304 (0.0034) [2024-06-13 05:13:30,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 2757623808. Throughput: 0: 48791.2. Samples: 2286516740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:13:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:13:31,254][71000] Updated weights for policy 0, policy_version 168314 (0.0034) [2024-06-13 05:13:34,038][71000] Updated weights for policy 0, policy_version 168324 (0.0031) [2024-06-13 05:13:35,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49154.1, 300 sec: 48985.4). Total num frames: 2757902336. Throughput: 0: 48940.4. Samples: 2286661720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:13:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:13:37,793][71000] Updated weights for policy 0, policy_version 168334 (0.0031) [2024-06-13 05:13:39,986][70980] Signal inference workers to stop experience collection... (34100 times) [2024-06-13 05:13:40,028][71000] InferenceWorker_p0-w0: stopping experience collection (34100 times) [2024-06-13 05:13:40,033][70980] Signal inference workers to resume experience collection... 
(34100 times) [2024-06-13 05:13:40,046][71000] InferenceWorker_p0-w0: resuming experience collection (34100 times) [2024-06-13 05:13:40,723][71000] Updated weights for policy 0, policy_version 168344 (0.0034) [2024-06-13 05:13:40,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2758148096. Throughput: 0: 48919.2. Samples: 2286959920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:13:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:13:40,998][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000168345_2758164480.pth... [2024-06-13 05:13:41,040][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000167626_2746384384.pth [2024-06-13 05:13:44,500][71000] Updated weights for policy 0, policy_version 168354 (0.0027) [2024-06-13 05:13:45,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2758361088. Throughput: 0: 48721.0. Samples: 2287249800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:13:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:13:47,566][71000] Updated weights for policy 0, policy_version 168364 (0.0027) [2024-06-13 05:13:50,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2758606848. Throughput: 0: 48421.7. Samples: 2287385100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:13:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:13:51,252][71000] Updated weights for policy 0, policy_version 168374 (0.0031) [2024-06-13 05:13:54,310][71000] Updated weights for policy 0, policy_version 168384 (0.0028) [2024-06-13 05:13:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 2758868992. Throughput: 0: 48612.0. Samples: 2287685980. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:13:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:13:57,916][71000] Updated weights for policy 0, policy_version 168394 (0.0023) [2024-06-13 05:14:00,823][71000] Updated weights for policy 0, policy_version 168404 (0.0031) [2024-06-13 05:14:00,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2759131136. Throughput: 0: 48968.8. Samples: 2287985040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:14:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:14:04,527][71000] Updated weights for policy 0, policy_version 168414 (0.0023) [2024-06-13 05:14:05,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2759344128. Throughput: 0: 48952.2. Samples: 2288130820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:14:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:14:07,526][71000] Updated weights for policy 0, policy_version 168424 (0.0035) [2024-06-13 05:14:10,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2759589888. Throughput: 0: 48897.8. Samples: 2288422740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:14:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:14:11,099][71000] Updated weights for policy 0, policy_version 168434 (0.0032) [2024-06-13 05:14:14,174][71000] Updated weights for policy 0, policy_version 168444 (0.0025) [2024-06-13 05:14:15,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 2759835648. Throughput: 0: 48916.1. Samples: 2288717960. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:14:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:14:18,074][71000] Updated weights for policy 0, policy_version 168454 (0.0033) [2024-06-13 05:14:20,769][71000] Updated weights for policy 0, policy_version 168464 (0.0033) [2024-06-13 05:14:20,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2760114176. Throughput: 0: 49029.7. Samples: 2288868060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:14:20,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 05:14:24,665][71000] Updated weights for policy 0, policy_version 168474 (0.0032) [2024-06-13 05:14:25,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2760327168. Throughput: 0: 48884.4. Samples: 2289159720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:14:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:14:27,618][71000] Updated weights for policy 0, policy_version 168484 (0.0028) [2024-06-13 05:14:30,939][70768] Fps is (10 sec: 45876.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2760572928. Throughput: 0: 48784.1. Samples: 2289445080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:14:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:14:31,315][71000] Updated weights for policy 0, policy_version 168494 (0.0026) [2024-06-13 05:14:34,382][71000] Updated weights for policy 0, policy_version 168504 (0.0027) [2024-06-13 05:14:35,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2760818688. Throughput: 0: 49028.9. Samples: 2289591400. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:14:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:14:38,292][71000] Updated weights for policy 0, policy_version 168514 (0.0032) [2024-06-13 05:14:40,914][71000] Updated weights for policy 0, policy_version 168524 (0.0024) [2024-06-13 05:14:40,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2761097216. Throughput: 0: 49021.2. Samples: 2289891940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:14:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:14:44,789][71000] Updated weights for policy 0, policy_version 168534 (0.0026) [2024-06-13 05:14:45,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2761310208. Throughput: 0: 48957.9. Samples: 2290188140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:14:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:14:47,523][71000] Updated weights for policy 0, policy_version 168544 (0.0038) [2024-06-13 05:14:50,940][70768] Fps is (10 sec: 45871.9, 60 sec: 49151.4, 300 sec: 48985.3). Total num frames: 2761555968. Throughput: 0: 48854.6. Samples: 2290329320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:14:50,941][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:14:51,444][71000] Updated weights for policy 0, policy_version 168554 (0.0028) [2024-06-13 05:14:54,124][70980] Signal inference workers to stop experience collection... (34150 times) [2024-06-13 05:14:54,124][70980] Signal inference workers to resume experience collection... 
(34150 times) [2024-06-13 05:14:54,176][71000] InferenceWorker_p0-w0: stopping experience collection (34150 times) [2024-06-13 05:14:54,176][71000] InferenceWorker_p0-w0: resuming experience collection (34150 times) [2024-06-13 05:14:54,258][71000] Updated weights for policy 0, policy_version 168564 (0.0023) [2024-06-13 05:14:55,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2761801728. Throughput: 0: 48774.7. Samples: 2290617600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:14:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:14:57,867][71000] Updated weights for policy 0, policy_version 168574 (0.0034) [2024-06-13 05:15:00,940][70768] Fps is (10 sec: 50794.5, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2762063872. Throughput: 0: 48979.0. Samples: 2290922020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 05:15:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:15:00,994][71000] Updated weights for policy 0, policy_version 168584 (0.0027) [2024-06-13 05:15:04,609][71000] Updated weights for policy 0, policy_version 168594 (0.0023) [2024-06-13 05:15:05,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2762293248. Throughput: 0: 49104.7. Samples: 2291077760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 05:15:05,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 05:15:07,823][71000] Updated weights for policy 0, policy_version 168604 (0.0028) [2024-06-13 05:15:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2762539008. Throughput: 0: 49040.4. Samples: 2291366540. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 05:15:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:15:11,570][71000] Updated weights for policy 0, policy_version 168614 (0.0026) [2024-06-13 05:15:14,678][71000] Updated weights for policy 0, policy_version 168624 (0.0035) [2024-06-13 05:15:15,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2762784768. Throughput: 0: 49159.2. Samples: 2291657240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 05:15:15,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 05:15:17,950][71000] Updated weights for policy 0, policy_version 168634 (0.0024) [2024-06-13 05:15:20,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2763046912. Throughput: 0: 49173.4. Samples: 2291804200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 05:15:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:15:21,192][71000] Updated weights for policy 0, policy_version 168644 (0.0025) [2024-06-13 05:15:24,477][71000] Updated weights for policy 0, policy_version 168654 (0.0030) [2024-06-13 05:15:25,939][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2763292672. Throughput: 0: 49197.1. Samples: 2292105800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 05:15:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 05:15:28,048][71000] Updated weights for policy 0, policy_version 168664 (0.0032) [2024-06-13 05:15:30,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2763522048. Throughput: 0: 49263.5. Samples: 2292405000. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 05:15:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:15:31,280][71000] Updated weights for policy 0, policy_version 168674 (0.0028) [2024-06-13 05:15:34,786][71000] Updated weights for policy 0, policy_version 168684 (0.0022) [2024-06-13 05:15:35,944][70768] Fps is (10 sec: 47492.7, 60 sec: 49148.5, 300 sec: 48984.7). Total num frames: 2763767808. Throughput: 0: 49194.0. Samples: 2292543220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 05:15:35,944][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:15:37,740][71000] Updated weights for policy 0, policy_version 168694 (0.0023) [2024-06-13 05:15:40,939][70768] Fps is (10 sec: 50791.0, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 2764029952. Throughput: 0: 49460.5. Samples: 2292843320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 05:15:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 05:15:41,054][71000] Updated weights for policy 0, policy_version 168704 (0.0031) [2024-06-13 05:15:41,056][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000168704_2764046336.pth... [2024-06-13 05:15:41,103][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000167985_2752266240.pth [2024-06-13 05:15:44,427][71000] Updated weights for policy 0, policy_version 168714 (0.0031) [2024-06-13 05:15:45,940][70768] Fps is (10 sec: 50812.4, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2764275712. Throughput: 0: 49341.8. Samples: 2293142400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 05:15:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:15:47,604][71000] Updated weights for policy 0, policy_version 168724 (0.0023) [2024-06-13 05:15:50,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49152.7, 300 sec: 48985.4). Total num frames: 2764505088. Throughput: 0: 49236.8. Samples: 2293293420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 05:15:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:15:51,267][71000] Updated weights for policy 0, policy_version 168734 (0.0028) [2024-06-13 05:15:54,695][71000] Updated weights for policy 0, policy_version 168744 (0.0035) [2024-06-13 05:15:55,299][70980] Signal inference workers to stop experience collection... (34200 times) [2024-06-13 05:15:55,343][71000] InferenceWorker_p0-w0: stopping experience collection (34200 times) [2024-06-13 05:15:55,351][70980] Signal inference workers to resume experience collection... (34200 times) [2024-06-13 05:15:55,353][71000] InferenceWorker_p0-w0: resuming experience collection (34200 times) [2024-06-13 05:15:55,940][70768] Fps is (10 sec: 47512.7, 60 sec: 49151.8, 300 sec: 49096.4). Total num frames: 2764750848. Throughput: 0: 49314.5. Samples: 2293585700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 05:15:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:15:57,691][71000] Updated weights for policy 0, policy_version 168754 (0.0035) [2024-06-13 05:16:00,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2765012992. Throughput: 0: 49411.5. Samples: 2293880760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 05:16:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:16:01,113][71000] Updated weights for policy 0, policy_version 168764 (0.0032) [2024-06-13 05:16:04,397][71000] Updated weights for policy 0, policy_version 168774 (0.0032) [2024-06-13 05:16:05,939][70768] Fps is (10 sec: 54068.5, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 2765291520. Throughput: 0: 49459.6. Samples: 2294029880. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 05:16:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:16:07,598][71000] Updated weights for policy 0, policy_version 168784 (0.0023) [2024-06-13 05:16:10,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49041.0). Total num frames: 2765504512. Throughput: 0: 49304.0. Samples: 2294324480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 05:16:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:16:10,948][71000] Updated weights for policy 0, policy_version 168794 (0.0032) [2024-06-13 05:16:14,863][71000] Updated weights for policy 0, policy_version 168804 (0.0028) [2024-06-13 05:16:15,939][70768] Fps is (10 sec: 42598.3, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2765717504. Throughput: 0: 49012.9. Samples: 2294610580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 05:16:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:16:17,737][71000] Updated weights for policy 0, policy_version 168814 (0.0024) [2024-06-13 05:16:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2765979648. Throughput: 0: 49367.9. Samples: 2294764560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 05:16:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:16:21,326][71000] Updated weights for policy 0, policy_version 168824 (0.0023) [2024-06-13 05:16:24,229][71000] Updated weights for policy 0, policy_version 168834 (0.0031) [2024-06-13 05:16:25,939][70768] Fps is (10 sec: 54067.5, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2766258176. Throughput: 0: 49268.4. Samples: 2295060400. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 05:16:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:16:27,842][71000] Updated weights for policy 0, policy_version 168844 (0.0028) [2024-06-13 05:16:30,880][71000] Updated weights for policy 0, policy_version 168854 (0.0027) [2024-06-13 05:16:30,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49698.2, 300 sec: 49096.5). Total num frames: 2766503936. Throughput: 0: 49226.8. Samples: 2295357600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 05:16:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:16:34,776][71000] Updated weights for policy 0, policy_version 168864 (0.0030) [2024-06-13 05:16:35,939][70768] Fps is (10 sec: 45875.1, 60 sec: 49155.6, 300 sec: 48985.4). Total num frames: 2766716928. Throughput: 0: 49065.0. Samples: 2295501340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 05:16:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:16:37,687][71000] Updated weights for policy 0, policy_version 168874 (0.0023) [2024-06-13 05:16:40,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 2766962688. Throughput: 0: 49044.2. Samples: 2295792680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 05:16:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:16:41,315][71000] Updated weights for policy 0, policy_version 168884 (0.0027) [2024-06-13 05:16:44,210][71000] Updated weights for policy 0, policy_version 168894 (0.0030) [2024-06-13 05:16:45,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2767241216. Throughput: 0: 49107.1. Samples: 2296090580. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 05:16:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:16:48,005][71000] Updated weights for policy 0, policy_version 168904 (0.0027) [2024-06-13 05:16:50,915][71000] Updated weights for policy 0, policy_version 168914 (0.0026) [2024-06-13 05:16:50,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.2, 300 sec: 49096.5). Total num frames: 2767486976. Throughput: 0: 49379.5. Samples: 2296251960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 05:16:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:16:51,570][70980] Signal inference workers to stop experience collection... (34250 times) [2024-06-13 05:16:51,571][70980] Signal inference workers to resume experience collection... (34250 times) [2024-06-13 05:16:51,609][71000] InferenceWorker_p0-w0: stopping experience collection (34250 times) [2024-06-13 05:16:51,609][71000] InferenceWorker_p0-w0: resuming experience collection (34250 times) [2024-06-13 05:16:54,510][71000] Updated weights for policy 0, policy_version 168924 (0.0022) [2024-06-13 05:16:55,939][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.2, 300 sec: 48985.4). Total num frames: 2767699968. Throughput: 0: 49437.8. Samples: 2296549180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 05:16:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:16:57,449][71000] Updated weights for policy 0, policy_version 168934 (0.0036) [2024-06-13 05:17:00,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2767962112. Throughput: 0: 49435.2. Samples: 2296835160. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 05:17:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:17:01,489][71000] Updated weights for policy 0, policy_version 168944 (0.0020) [2024-06-13 05:17:03,987][71000] Updated weights for policy 0, policy_version 168954 (0.0033) [2024-06-13 05:17:05,939][70768] Fps is (10 sec: 52428.7, 60 sec: 48879.0, 300 sec: 49152.3). Total num frames: 2768224256. Throughput: 0: 49344.5. Samples: 2296985060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-13 05:17:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:17:07,768][71000] Updated weights for policy 0, policy_version 168964 (0.0030) [2024-06-13 05:17:10,546][71000] Updated weights for policy 0, policy_version 168974 (0.0026) [2024-06-13 05:17:10,940][70768] Fps is (10 sec: 50787.6, 60 sec: 49424.6, 300 sec: 49040.9). Total num frames: 2768470016. Throughput: 0: 49531.8. Samples: 2297289360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 05:17:10,941][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:17:14,384][71000] Updated weights for policy 0, policy_version 168984 (0.0037) [2024-06-13 05:17:15,939][70768] Fps is (10 sec: 47513.5, 60 sec: 49698.2, 300 sec: 49041.0). Total num frames: 2768699392. Throughput: 0: 49651.5. Samples: 2297591920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 05:17:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:17:17,017][71000] Updated weights for policy 0, policy_version 168994 (0.0030) [2024-06-13 05:17:20,939][70768] Fps is (10 sec: 47516.1, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2768945152. Throughput: 0: 49419.5. Samples: 2297725220. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 05:17:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:17:20,981][71000] Updated weights for policy 0, policy_version 169004 (0.0021) [2024-06-13 05:17:23,939][71000] Updated weights for policy 0, policy_version 169014 (0.0024) [2024-06-13 05:17:25,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2769207296. Throughput: 0: 49410.3. Samples: 2298016140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 05:17:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:17:27,666][71000] Updated weights for policy 0, policy_version 169024 (0.0021) [2024-06-13 05:17:30,813][71000] Updated weights for policy 0, policy_version 169034 (0.0024) [2024-06-13 05:17:30,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49152.5). Total num frames: 2769453056. Throughput: 0: 49597.3. Samples: 2298322460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 05:17:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:17:34,524][71000] Updated weights for policy 0, policy_version 169044 (0.0025) [2024-06-13 05:17:35,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 2769698816. Throughput: 0: 49374.3. Samples: 2298473800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 05:17:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:17:37,154][71000] Updated weights for policy 0, policy_version 169054 (0.0028) [2024-06-13 05:17:40,853][71000] Updated weights for policy 0, policy_version 169064 (0.0023) [2024-06-13 05:17:40,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 2769944576. Throughput: 0: 49361.1. Samples: 2298770440. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 05:17:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:17:40,954][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000169064_2769944576.pth... [2024-06-13 05:17:41,000][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000168345_2758164480.pth [2024-06-13 05:17:43,713][71000] Updated weights for policy 0, policy_version 169074 (0.0026) [2024-06-13 05:17:45,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2770190336. Throughput: 0: 49526.6. Samples: 2299063860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 05:17:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:17:47,275][71000] Updated weights for policy 0, policy_version 169084 (0.0029) [2024-06-13 05:17:48,290][70980] Signal inference workers to stop experience collection... (34300 times) [2024-06-13 05:17:48,291][70980] Signal inference workers to resume experience collection... (34300 times) [2024-06-13 05:17:48,303][71000] InferenceWorker_p0-w0: stopping experience collection (34300 times) [2024-06-13 05:17:48,304][71000] InferenceWorker_p0-w0: resuming experience collection (34300 times) [2024-06-13 05:17:50,498][71000] Updated weights for policy 0, policy_version 169094 (0.0027) [2024-06-13 05:17:50,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.1, 300 sec: 49207.6). Total num frames: 2770452480. Throughput: 0: 49732.9. Samples: 2299223040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 05:17:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:17:54,252][71000] Updated weights for policy 0, policy_version 169104 (0.0031) [2024-06-13 05:17:55,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49971.2, 300 sec: 49207.6). Total num frames: 2770698240. Throughput: 0: 49617.9. Samples: 2299522140. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 05:17:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 05:17:57,039][71000] Updated weights for policy 0, policy_version 169114 (0.0031) [2024-06-13 05:18:00,776][71000] Updated weights for policy 0, policy_version 169124 (0.0030) [2024-06-13 05:18:00,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2770927616. Throughput: 0: 49409.7. Samples: 2299815360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 05:18:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:18:03,569][71000] Updated weights for policy 0, policy_version 169134 (0.0021) [2024-06-13 05:18:05,939][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2771189760. Throughput: 0: 49787.5. Samples: 2299965660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 05:18:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:18:07,365][71000] Updated weights for policy 0, policy_version 169144 (0.0024) [2024-06-13 05:18:10,519][71000] Updated weights for policy 0, policy_version 169154 (0.0031) [2024-06-13 05:18:10,939][70768] Fps is (10 sec: 52429.0, 60 sec: 49698.5, 300 sec: 49263.1). Total num frames: 2771451904. Throughput: 0: 49911.0. Samples: 2300262140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 05:18:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:18:13,669][71000] Updated weights for policy 0, policy_version 169164 (0.0021) [2024-06-13 05:18:15,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2771664896. Throughput: 0: 49730.6. Samples: 2300560340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:18:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:18:16,805][71000] Updated weights for policy 0, policy_version 169174 (0.0031) [2024-06-13 05:18:20,520][71000] Updated weights for policy 0, policy_version 169184 (0.0027) [2024-06-13 05:18:20,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 2771927040. Throughput: 0: 49526.1. Samples: 2300702480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:18:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:18:23,776][71000] Updated weights for policy 0, policy_version 169194 (0.0034) [2024-06-13 05:18:25,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.1, 300 sec: 49374.2). Total num frames: 2772189184. Throughput: 0: 49593.4. Samples: 2301002140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:18:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:18:27,097][71000] Updated weights for policy 0, policy_version 169204 (0.0039) [2024-06-13 05:18:30,090][71000] Updated weights for policy 0, policy_version 169214 (0.0031) [2024-06-13 05:18:30,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2772434944. Throughput: 0: 49776.9. Samples: 2301303820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:18:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:18:33,719][71000] Updated weights for policy 0, policy_version 169224 (0.0028) [2024-06-13 05:18:35,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2772664320. Throughput: 0: 49496.9. Samples: 2301450400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:18:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:18:36,950][71000] Updated weights for policy 0, policy_version 169234 (0.0035) [2024-06-13 05:18:40,271][71000] Updated weights for policy 0, policy_version 169244 (0.0020) [2024-06-13 05:18:40,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.2, 300 sec: 49318.6). Total num frames: 2772910080. Throughput: 0: 49297.3. Samples: 2301740520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:18:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:18:43,686][71000] Updated weights for policy 0, policy_version 169254 (0.0024) [2024-06-13 05:18:45,942][70768] Fps is (10 sec: 50778.3, 60 sec: 49696.2, 300 sec: 49373.8). Total num frames: 2773172224. Throughput: 0: 49321.5. Samples: 2302034940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:18:45,942][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 05:18:46,888][71000] Updated weights for policy 0, policy_version 169264 (0.0021) [2024-06-13 05:18:47,910][70980] Signal inference workers to stop experience collection... (34350 times) [2024-06-13 05:18:47,956][71000] InferenceWorker_p0-w0: stopping experience collection (34350 times) [2024-06-13 05:18:47,956][70980] Signal inference workers to resume experience collection... (34350 times) [2024-06-13 05:18:47,968][71000] InferenceWorker_p0-w0: resuming experience collection (34350 times) [2024-06-13 05:18:50,259][71000] Updated weights for policy 0, policy_version 169274 (0.0029) [2024-06-13 05:18:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2773417984. Throughput: 0: 49389.3. Samples: 2302188180. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:18:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:18:53,545][71000] Updated weights for policy 0, policy_version 169284 (0.0037) [2024-06-13 05:18:55,939][70768] Fps is (10 sec: 47524.8, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2773647360. Throughput: 0: 49478.2. Samples: 2302488660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:18:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:18:56,737][71000] Updated weights for policy 0, policy_version 169294 (0.0032) [2024-06-13 05:19:00,293][71000] Updated weights for policy 0, policy_version 169304 (0.0028) [2024-06-13 05:19:00,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 2773909504. Throughput: 0: 49343.6. Samples: 2302780800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:19:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:19:03,587][71000] Updated weights for policy 0, policy_version 169314 (0.0033) [2024-06-13 05:19:05,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2774138880. Throughput: 0: 49526.3. Samples: 2302931160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:19:05,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 05:19:06,834][71000] Updated weights for policy 0, policy_version 169324 (0.0035) [2024-06-13 05:19:10,365][71000] Updated weights for policy 0, policy_version 169334 (0.0019) [2024-06-13 05:19:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2774401024. Throughput: 0: 49333.8. Samples: 2303222160. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:19:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:19:13,507][71000] Updated weights for policy 0, policy_version 169344 (0.0026) [2024-06-13 05:19:15,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2774646784. Throughput: 0: 49175.6. Samples: 2303516720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:19:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:19:16,942][71000] Updated weights for policy 0, policy_version 169354 (0.0034) [2024-06-13 05:19:20,217][71000] Updated weights for policy 0, policy_version 169364 (0.0035) [2024-06-13 05:19:20,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.2, 300 sec: 49374.2). Total num frames: 2774892544. Throughput: 0: 49204.9. Samples: 2303664620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:19:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 05:19:23,685][71000] Updated weights for policy 0, policy_version 169374 (0.0034) [2024-06-13 05:19:25,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2775154688. Throughput: 0: 49416.0. Samples: 2303964240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:19:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:19:26,589][71000] Updated weights for policy 0, policy_version 169384 (0.0026) [2024-06-13 05:19:30,278][71000] Updated weights for policy 0, policy_version 169394 (0.0030) [2024-06-13 05:19:30,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2775367680. Throughput: 0: 49490.2. Samples: 2304261880. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:19:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:19:33,239][71000] Updated weights for policy 0, policy_version 169404 (0.0029) [2024-06-13 05:19:35,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2775629824. Throughput: 0: 49243.7. Samples: 2304404140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:19:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:19:36,850][71000] Updated weights for policy 0, policy_version 169414 (0.0030) [2024-06-13 05:19:39,881][71000] Updated weights for policy 0, policy_version 169424 (0.0021) [2024-06-13 05:19:40,940][70768] Fps is (10 sec: 52427.1, 60 sec: 49697.9, 300 sec: 49429.6). Total num frames: 2775891968. Throughput: 0: 49176.1. Samples: 2304701600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:19:40,941][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:19:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000169427_2775891968.pth... [2024-06-13 05:19:41,010][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000168704_2764046336.pth [2024-06-13 05:19:43,748][71000] Updated weights for policy 0, policy_version 169434 (0.0033) [2024-06-13 05:19:45,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49154.0, 300 sec: 49374.3). Total num frames: 2776121344. Throughput: 0: 49061.9. Samples: 2304988580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:19:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 05:19:46,659][71000] Updated weights for policy 0, policy_version 169444 (0.0030) [2024-06-13 05:19:50,046][71000] Updated weights for policy 0, policy_version 169454 (0.0029) [2024-06-13 05:19:50,939][70768] Fps is (10 sec: 47514.8, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2776367104. Throughput: 0: 49072.9. Samples: 2305139440. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:19:50,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 05:19:53,285][71000] Updated weights for policy 0, policy_version 169464 (0.0026) [2024-06-13 05:19:55,940][70768] Fps is (10 sec: 49150.6, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 2776612864. Throughput: 0: 49222.0. Samples: 2305437160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:19:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 05:19:56,754][71000] Updated weights for policy 0, policy_version 169474 (0.0033) [2024-06-13 05:20:00,015][71000] Updated weights for policy 0, policy_version 169484 (0.0030) [2024-06-13 05:20:00,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2776875008. Throughput: 0: 49287.1. Samples: 2305734640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:20:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:20:03,348][70980] Signal inference workers to stop experience collection... (34400 times) [2024-06-13 05:20:03,349][70980] Signal inference workers to resume experience collection... (34400 times) [2024-06-13 05:20:03,388][71000] InferenceWorker_p0-w0: stopping experience collection (34400 times) [2024-06-13 05:20:03,388][71000] InferenceWorker_p0-w0: resuming experience collection (34400 times) [2024-06-13 05:20:03,486][71000] Updated weights for policy 0, policy_version 169494 (0.0029) [2024-06-13 05:20:05,940][70768] Fps is (10 sec: 50791.4, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 2777120768. Throughput: 0: 49442.2. Samples: 2305889520. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:20:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:20:06,405][71000] Updated weights for policy 0, policy_version 169504 (0.0024) [2024-06-13 05:20:10,169][71000] Updated weights for policy 0, policy_version 169514 (0.0023) [2024-06-13 05:20:10,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2777350144. Throughput: 0: 49176.0. Samples: 2306177160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:20:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:20:13,163][71000] Updated weights for policy 0, policy_version 169524 (0.0027) [2024-06-13 05:20:15,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2777595904. Throughput: 0: 49083.9. Samples: 2306470660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:20:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:20:16,827][71000] Updated weights for policy 0, policy_version 169534 (0.0031) [2024-06-13 05:20:19,817][71000] Updated weights for policy 0, policy_version 169544 (0.0029) [2024-06-13 05:20:20,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 49374.2). Total num frames: 2777858048. Throughput: 0: 49229.3. Samples: 2306619460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:20:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:20:23,367][71000] Updated weights for policy 0, policy_version 169554 (0.0029) [2024-06-13 05:20:25,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.9, 300 sec: 49318.6). Total num frames: 2778071040. Throughput: 0: 49196.3. Samples: 2306915420. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 05:20:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:20:26,611][71000] Updated weights for policy 0, policy_version 169564 (0.0036) [2024-06-13 05:20:30,117][71000] Updated weights for policy 0, policy_version 169574 (0.0034) [2024-06-13 05:20:30,939][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 49319.3). Total num frames: 2778316800. Throughput: 0: 49299.1. Samples: 2307207040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 05:20:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:20:33,151][71000] Updated weights for policy 0, policy_version 169584 (0.0022) [2024-06-13 05:20:35,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2778578944. Throughput: 0: 49247.2. Samples: 2307355560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 05:20:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:20:36,655][71000] Updated weights for policy 0, policy_version 169594 (0.0026) [2024-06-13 05:20:39,819][71000] Updated weights for policy 0, policy_version 169604 (0.0025) [2024-06-13 05:20:40,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 2778857472. Throughput: 0: 49562.8. Samples: 2307667480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 05:20:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:20:43,334][71000] Updated weights for policy 0, policy_version 169614 (0.0025) [2024-06-13 05:20:45,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 2779070464. Throughput: 0: 49413.4. Samples: 2307958240. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 05:20:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:20:46,345][71000] Updated weights for policy 0, policy_version 169624 (0.0026) [2024-06-13 05:20:49,899][71000] Updated weights for policy 0, policy_version 169634 (0.0033) [2024-06-13 05:20:50,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2779332608. Throughput: 0: 49141.4. Samples: 2308100880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 05:20:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:20:53,033][71000] Updated weights for policy 0, policy_version 169644 (0.0027) [2024-06-13 05:20:55,941][70768] Fps is (10 sec: 50782.7, 60 sec: 49424.0, 300 sec: 49373.9). Total num frames: 2779578368. Throughput: 0: 49236.6. Samples: 2308392880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 05:20:55,942][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:20:56,504][71000] Updated weights for policy 0, policy_version 169654 (0.0032) [2024-06-13 05:20:59,754][71000] Updated weights for policy 0, policy_version 169664 (0.0034) [2024-06-13 05:21:00,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2779840512. Throughput: 0: 49248.8. Samples: 2308686860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 05:21:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:21:03,042][71000] Updated weights for policy 0, policy_version 169674 (0.0024) [2024-06-13 05:21:05,939][70768] Fps is (10 sec: 49159.5, 60 sec: 49152.1, 300 sec: 49374.2). Total num frames: 2780069888. Throughput: 0: 49351.5. Samples: 2308840280. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 05:21:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:21:06,372][71000] Updated weights for policy 0, policy_version 169684 (0.0026) [2024-06-13 05:21:09,500][71000] Updated weights for policy 0, policy_version 169694 (0.0035) [2024-06-13 05:21:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 2780315648. Throughput: 0: 49471.5. Samples: 2309141640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 05:21:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:21:12,899][71000] Updated weights for policy 0, policy_version 169704 (0.0031) [2024-06-13 05:21:15,939][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 2780577792. Throughput: 0: 49650.2. Samples: 2309441300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 05:21:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:21:16,291][71000] Updated weights for policy 0, policy_version 169714 (0.0027) [2024-06-13 05:21:19,330][70980] Signal inference workers to stop experience collection... (34450 times) [2024-06-13 05:21:19,376][70980] Signal inference workers to resume experience collection... (34450 times) [2024-06-13 05:21:19,376][71000] InferenceWorker_p0-w0: stopping experience collection (34450 times) [2024-06-13 05:21:19,390][71000] InferenceWorker_p0-w0: resuming experience collection (34450 times) [2024-06-13 05:21:19,525][71000] Updated weights for policy 0, policy_version 169724 (0.0029) [2024-06-13 05:21:20,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2780823552. Throughput: 0: 49586.2. Samples: 2309586940. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 05:21:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:21:22,852][71000] Updated weights for policy 0, policy_version 169734 (0.0027) [2024-06-13 05:21:25,939][70768] Fps is (10 sec: 45875.2, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2781036544. Throughput: 0: 49305.4. Samples: 2309886220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 05:21:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:21:26,289][71000] Updated weights for policy 0, policy_version 169744 (0.0027) [2024-06-13 05:21:29,509][71000] Updated weights for policy 0, policy_version 169754 (0.0042) [2024-06-13 05:21:30,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 2781298688. Throughput: 0: 49298.7. Samples: 2310176680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:21:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:21:32,837][71000] Updated weights for policy 0, policy_version 169764 (0.0021) [2024-06-13 05:21:35,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 2781544448. Throughput: 0: 49320.9. Samples: 2310320320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:21:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:21:36,188][71000] Updated weights for policy 0, policy_version 169774 (0.0032) [2024-06-13 05:21:39,653][71000] Updated weights for policy 0, policy_version 169784 (0.0030) [2024-06-13 05:21:40,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2781806592. Throughput: 0: 49407.8. Samples: 2310616160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:21:40,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 05:21:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000169788_2781806592.pth... 
[2024-06-13 05:21:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000169064_2769944576.pth [2024-06-13 05:21:43,008][71000] Updated weights for policy 0, policy_version 169794 (0.0018) [2024-06-13 05:21:45,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2782019584. Throughput: 0: 49528.9. Samples: 2310915660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:21:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:21:46,521][71000] Updated weights for policy 0, policy_version 169804 (0.0033) [2024-06-13 05:21:49,495][71000] Updated weights for policy 0, policy_version 169814 (0.0029) [2024-06-13 05:21:50,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48878.9, 300 sec: 49374.1). Total num frames: 2782265344. Throughput: 0: 49109.3. Samples: 2311050200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:21:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:21:53,091][71000] Updated weights for policy 0, policy_version 169824 (0.0030) [2024-06-13 05:21:55,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49153.2, 300 sec: 49374.1). Total num frames: 2782527488. Throughput: 0: 48940.9. Samples: 2311343980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:21:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:21:56,370][71000] Updated weights for policy 0, policy_version 169834 (0.0030) [2024-06-13 05:21:59,738][71000] Updated weights for policy 0, policy_version 169844 (0.0026) [2024-06-13 05:22:00,939][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2782773248. Throughput: 0: 48776.4. Samples: 2311636240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:22:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:22:03,062][71000] Updated weights for policy 0, policy_version 169854 (0.0035) [2024-06-13 05:22:05,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48605.9, 300 sec: 49207.6). Total num frames: 2782986240. Throughput: 0: 48992.5. Samples: 2311791600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:22:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:22:06,588][71000] Updated weights for policy 0, policy_version 169864 (0.0035) [2024-06-13 05:22:09,739][71000] Updated weights for policy 0, policy_version 169874 (0.0030) [2024-06-13 05:22:10,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2783264768. Throughput: 0: 48668.4. Samples: 2312076300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:22:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:22:13,378][71000] Updated weights for policy 0, policy_version 169884 (0.0038) [2024-06-13 05:22:15,939][70768] Fps is (10 sec: 52428.8, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 2783510528. Throughput: 0: 48798.2. Samples: 2312372600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:22:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:22:16,183][71000] Updated weights for policy 0, policy_version 169894 (0.0031) [2024-06-13 05:22:19,874][71000] Updated weights for policy 0, policy_version 169904 (0.0027) [2024-06-13 05:22:20,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 2783756288. Throughput: 0: 49082.2. Samples: 2312529020. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:22:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:22:22,864][71000] Updated weights for policy 0, policy_version 169914 (0.0029) [2024-06-13 05:22:25,944][70768] Fps is (10 sec: 45855.2, 60 sec: 48875.4, 300 sec: 49206.8). Total num frames: 2783969280. Throughput: 0: 48910.9. Samples: 2312817360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:22:25,944][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:22:26,533][71000] Updated weights for policy 0, policy_version 169924 (0.0027) [2024-06-13 05:22:29,369][71000] Updated weights for policy 0, policy_version 169934 (0.0035) [2024-06-13 05:22:30,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2784247808. Throughput: 0: 48979.2. Samples: 2313119720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:22:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 05:22:33,295][71000] Updated weights for policy 0, policy_version 169944 (0.0035) [2024-06-13 05:22:33,521][70980] Signal inference workers to stop experience collection... (34500 times) [2024-06-13 05:22:33,564][71000] InferenceWorker_p0-w0: stopping experience collection (34500 times) [2024-06-13 05:22:33,574][70980] Signal inference workers to resume experience collection... (34500 times) [2024-06-13 05:22:33,582][71000] InferenceWorker_p0-w0: resuming experience collection (34500 times) [2024-06-13 05:22:35,940][70768] Fps is (10 sec: 54089.8, 60 sec: 49424.9, 300 sec: 49374.2). Total num frames: 2784509952. Throughput: 0: 49332.8. Samples: 2313270180. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:22:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:22:35,978][71000] Updated weights for policy 0, policy_version 169954 (0.0028) [2024-06-13 05:22:39,883][71000] Updated weights for policy 0, policy_version 169964 (0.0034) [2024-06-13 05:22:40,939][70768] Fps is (10 sec: 49151.8, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2784739328. Throughput: 0: 49561.8. Samples: 2313574260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:22:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:22:42,530][71000] Updated weights for policy 0, policy_version 169974 (0.0022) [2024-06-13 05:22:45,939][70768] Fps is (10 sec: 45875.9, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2784968704. Throughput: 0: 49451.6. Samples: 2313861560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:22:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:22:46,533][71000] Updated weights for policy 0, policy_version 169984 (0.0025) [2024-06-13 05:22:49,151][71000] Updated weights for policy 0, policy_version 169994 (0.0038) [2024-06-13 05:22:50,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2785214464. Throughput: 0: 49079.1. Samples: 2314000160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:22:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:22:53,502][71000] Updated weights for policy 0, policy_version 170004 (0.0033) [2024-06-13 05:22:55,939][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2785476608. Throughput: 0: 49107.1. Samples: 2314286120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:22:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:22:56,506][71000] Updated weights for policy 0, policy_version 170014 (0.0041) [2024-06-13 05:23:00,215][71000] Updated weights for policy 0, policy_version 170024 (0.0037) [2024-06-13 05:23:00,939][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2785722368. Throughput: 0: 49171.5. Samples: 2314585320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:23:00,942][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:23:02,994][71000] Updated weights for policy 0, policy_version 170034 (0.0033) [2024-06-13 05:23:05,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2785935360. Throughput: 0: 48707.0. Samples: 2314720840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:23:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 05:23:06,894][71000] Updated weights for policy 0, policy_version 170044 (0.0031) [2024-06-13 05:23:09,316][71000] Updated weights for policy 0, policy_version 170054 (0.0030) [2024-06-13 05:23:10,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2786197504. Throughput: 0: 48828.2. Samples: 2315014420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:23:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:23:13,514][71000] Updated weights for policy 0, policy_version 170064 (0.0023) [2024-06-13 05:23:15,939][70768] Fps is (10 sec: 54067.9, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2786476032. Throughput: 0: 48974.7. Samples: 2315323580. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:23:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:23:16,079][71000] Updated weights for policy 0, policy_version 170074 (0.0034) [2024-06-13 05:23:20,391][71000] Updated weights for policy 0, policy_version 170084 (0.0027) [2024-06-13 05:23:20,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 2786672640. Throughput: 0: 48961.9. Samples: 2315473460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:23:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:23:22,798][71000] Updated weights for policy 0, policy_version 170094 (0.0027) [2024-06-13 05:23:25,940][70768] Fps is (10 sec: 44236.4, 60 sec: 49155.5, 300 sec: 49096.5). Total num frames: 2786918400. Throughput: 0: 48592.0. Samples: 2315760900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:23:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:23:27,141][71000] Updated weights for policy 0, policy_version 170104 (0.0028) [2024-06-13 05:23:29,308][71000] Updated weights for policy 0, policy_version 170114 (0.0033) [2024-06-13 05:23:30,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2787180544. Throughput: 0: 48693.6. Samples: 2316052780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:23:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:23:33,822][71000] Updated weights for policy 0, policy_version 170124 (0.0029) [2024-06-13 05:23:34,063][70980] Signal inference workers to stop experience collection... (34550 times) [2024-06-13 05:23:34,064][70980] Signal inference workers to resume experience collection... 
(34550 times) [2024-06-13 05:23:34,093][71000] InferenceWorker_p0-w0: stopping experience collection (34550 times) [2024-06-13 05:23:34,093][71000] InferenceWorker_p0-w0: resuming experience collection (34550 times) [2024-06-13 05:23:35,940][70768] Fps is (10 sec: 52427.6, 60 sec: 48878.8, 300 sec: 49263.0). Total num frames: 2787442688. Throughput: 0: 49052.5. Samples: 2316207540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-13 05:23:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:23:36,448][71000] Updated weights for policy 0, policy_version 170134 (0.0021) [2024-06-13 05:23:40,305][71000] Updated weights for policy 0, policy_version 170144 (0.0028) [2024-06-13 05:23:40,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 49152.4). Total num frames: 2787672064. Throughput: 0: 49312.3. Samples: 2316505180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 05:23:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:23:41,020][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000170147_2787688448.pth... [2024-06-13 05:23:41,081][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000169427_2775891968.pth [2024-06-13 05:23:42,915][71000] Updated weights for policy 0, policy_version 170154 (0.0024) [2024-06-13 05:23:45,940][70768] Fps is (10 sec: 44236.2, 60 sec: 48605.5, 300 sec: 49040.9). Total num frames: 2787885056. Throughput: 0: 49197.3. Samples: 2316799220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 05:23:45,941][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:23:47,056][71000] Updated weights for policy 0, policy_version 170164 (0.0030) [2024-06-13 05:23:49,486][71000] Updated weights for policy 0, policy_version 170174 (0.0025) [2024-06-13 05:23:50,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.7, 300 sec: 49152.0). Total num frames: 2788147200. Throughput: 0: 49087.5. Samples: 2316929780. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 05:23:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:23:53,715][71000] Updated weights for policy 0, policy_version 170184 (0.0023) [2024-06-13 05:23:55,939][70768] Fps is (10 sec: 54069.8, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2788425728. Throughput: 0: 49260.6. Samples: 2317231140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 05:23:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:23:56,707][71000] Updated weights for policy 0, policy_version 170194 (0.0028) [2024-06-13 05:24:00,516][71000] Updated weights for policy 0, policy_version 170204 (0.0023) [2024-06-13 05:24:00,940][70768] Fps is (10 sec: 50791.1, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2788655104. Throughput: 0: 49143.0. Samples: 2317535020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 05:24:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:24:03,033][71000] Updated weights for policy 0, policy_version 170214 (0.0031) [2024-06-13 05:24:05,939][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2788884480. Throughput: 0: 48897.3. Samples: 2317673840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 05:24:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 05:24:07,032][71000] Updated weights for policy 0, policy_version 170224 (0.0032) [2024-06-13 05:24:09,887][71000] Updated weights for policy 0, policy_version 170234 (0.0021) [2024-06-13 05:24:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 49096.4). Total num frames: 2789130240. Throughput: 0: 48864.4. Samples: 2317959800. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 05:24:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:24:13,739][71000] Updated weights for policy 0, policy_version 170244 (0.0037) [2024-06-13 05:24:15,940][70768] Fps is (10 sec: 52428.3, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2789408768. Throughput: 0: 48824.9. Samples: 2318249900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 05:24:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:24:16,864][71000] Updated weights for policy 0, policy_version 170254 (0.0024) [2024-06-13 05:24:20,491][71000] Updated weights for policy 0, policy_version 170264 (0.0028) [2024-06-13 05:24:20,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2789605376. Throughput: 0: 48875.9. Samples: 2318406940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 05:24:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:24:23,427][71000] Updated weights for policy 0, policy_version 170274 (0.0028) [2024-06-13 05:24:25,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 2789851136. Throughput: 0: 48650.2. Samples: 2318694440. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 05:24:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:24:27,176][71000] Updated weights for policy 0, policy_version 170284 (0.0032) [2024-06-13 05:24:30,138][71000] Updated weights for policy 0, policy_version 170294 (0.0031) [2024-06-13 05:24:30,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48879.0, 300 sec: 49096.4). Total num frames: 2790113280. Throughput: 0: 48447.5. Samples: 2318979340. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 05:24:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:24:33,955][71000] Updated weights for policy 0, policy_version 170304 (0.0027) [2024-06-13 05:24:35,192][70980] Signal inference workers to stop experience collection... 
(34600 times) [2024-06-13 05:24:35,197][70980] Signal inference workers to resume experience collection... (34600 times) [2024-06-13 05:24:35,212][71000] InferenceWorker_p0-w0: stopping experience collection (34600 times) [2024-06-13 05:24:35,239][71000] InferenceWorker_p0-w0: resuming experience collection (34600 times) [2024-06-13 05:24:35,940][70768] Fps is (10 sec: 54067.8, 60 sec: 49152.2, 300 sec: 49152.0). Total num frames: 2790391808. Throughput: 0: 49113.0. Samples: 2319139860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 05:24:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:24:36,784][71000] Updated weights for policy 0, policy_version 170314 (0.0027) [2024-06-13 05:24:40,242][71000] Updated weights for policy 0, policy_version 170324 (0.0024) [2024-06-13 05:24:40,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2790604800. Throughput: 0: 49015.1. Samples: 2319436820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 05:24:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:24:43,313][71000] Updated weights for policy 0, policy_version 170334 (0.0035) [2024-06-13 05:24:45,940][70768] Fps is (10 sec: 44236.8, 60 sec: 49152.3, 300 sec: 49040.9). Total num frames: 2790834176. Throughput: 0: 48749.7. Samples: 2319728760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 05:24:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:24:47,322][71000] Updated weights for policy 0, policy_version 170344 (0.0032) [2024-06-13 05:24:49,811][71000] Updated weights for policy 0, policy_version 170354 (0.0029) [2024-06-13 05:24:50,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2791096320. Throughput: 0: 48871.8. Samples: 2319873080. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 05:24:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:24:53,766][71000] Updated weights for policy 0, policy_version 170364 (0.0029) [2024-06-13 05:24:55,939][70768] Fps is (10 sec: 54067.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2791374848. Throughput: 0: 49140.5. Samples: 2320171120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 05:24:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:24:56,444][71000] Updated weights for policy 0, policy_version 170374 (0.0023) [2024-06-13 05:25:00,313][71000] Updated weights for policy 0, policy_version 170384 (0.0028) [2024-06-13 05:25:00,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 2791587840. Throughput: 0: 49283.9. Samples: 2320467680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 05:25:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:25:03,054][71000] Updated weights for policy 0, policy_version 170394 (0.0031) [2024-06-13 05:25:05,939][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2791833600. Throughput: 0: 48874.6. Samples: 2320606300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 05:25:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:25:06,951][71000] Updated weights for policy 0, policy_version 170404 (0.0026) [2024-06-13 05:25:09,600][71000] Updated weights for policy 0, policy_version 170414 (0.0038) [2024-06-13 05:25:10,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2792062976. Throughput: 0: 49045.0. Samples: 2320901460. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 05:25:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:25:13,587][71000] Updated weights for policy 0, policy_version 170424 (0.0023) [2024-06-13 05:25:15,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2792357888. Throughput: 0: 49344.8. Samples: 2321199860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 05:25:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:25:16,404][71000] Updated weights for policy 0, policy_version 170434 (0.0026) [2024-06-13 05:25:20,353][71000] Updated weights for policy 0, policy_version 170444 (0.0032) [2024-06-13 05:25:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 2792570880. Throughput: 0: 49096.0. Samples: 2321349180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 05:25:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:25:23,213][71000] Updated weights for policy 0, policy_version 170454 (0.0024) [2024-06-13 05:25:25,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2792816640. Throughput: 0: 49170.6. Samples: 2321649500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 05:25:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:25:26,942][71000] Updated weights for policy 0, policy_version 170464 (0.0024) [2024-06-13 05:25:29,667][71000] Updated weights for policy 0, policy_version 170474 (0.0026) [2024-06-13 05:25:30,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2793046016. Throughput: 0: 49041.4. Samples: 2321935620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 05:25:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:25:33,577][71000] Updated weights for policy 0, policy_version 170484 (0.0019) [2024-06-13 05:25:35,710][70980] Signal inference workers to stop experience collection... 
(34650 times) [2024-06-13 05:25:35,713][70980] Signal inference workers to resume experience collection... (34650 times) [2024-06-13 05:25:35,730][71000] InferenceWorker_p0-w0: stopping experience collection (34650 times) [2024-06-13 05:25:35,730][71000] InferenceWorker_p0-w0: resuming experience collection (34650 times) [2024-06-13 05:25:35,939][70768] Fps is (10 sec: 52429.4, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2793340928. Throughput: 0: 49322.1. Samples: 2322092560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 05:25:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:25:36,410][71000] Updated weights for policy 0, policy_version 170494 (0.0030) [2024-06-13 05:25:40,072][71000] Updated weights for policy 0, policy_version 170504 (0.0027) [2024-06-13 05:25:40,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2793553920. Throughput: 0: 49278.1. Samples: 2322388640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 05:25:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:25:40,981][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000170506_2793570304.pth... [2024-06-13 05:25:41,026][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000169788_2781806592.pth [2024-06-13 05:25:43,323][71000] Updated weights for policy 0, policy_version 170514 (0.0035) [2024-06-13 05:25:45,940][70768] Fps is (10 sec: 44235.4, 60 sec: 49151.9, 300 sec: 48985.3). Total num frames: 2793783296. Throughput: 0: 49047.9. Samples: 2322674840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 05:25:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:25:46,961][71000] Updated weights for policy 0, policy_version 170524 (0.0036) [2024-06-13 05:25:50,295][71000] Updated weights for policy 0, policy_version 170534 (0.0031) [2024-06-13 05:25:50,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 49041.1). 
Total num frames: 2794045440. Throughput: 0: 49041.2. Samples: 2322813160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-13 05:25:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:25:53,710][71000] Updated weights for policy 0, policy_version 170544 (0.0025) [2024-06-13 05:25:55,939][70768] Fps is (10 sec: 50791.7, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2794291200. Throughput: 0: 49100.5. Samples: 2323110980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-13 05:25:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:25:56,711][71000] Updated weights for policy 0, policy_version 170554 (0.0035) [2024-06-13 05:26:00,160][71000] Updated weights for policy 0, policy_version 170564 (0.0027) [2024-06-13 05:26:00,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2794536960. Throughput: 0: 49166.8. Samples: 2323412360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-13 05:26:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:26:03,074][71000] Updated weights for policy 0, policy_version 170574 (0.0029) [2024-06-13 05:26:05,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2794782720. Throughput: 0: 49046.2. Samples: 2323556260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-13 05:26:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:26:06,822][71000] Updated weights for policy 0, policy_version 170584 (0.0023) [2024-06-13 05:26:10,273][71000] Updated weights for policy 0, policy_version 170594 (0.0038) [2024-06-13 05:26:10,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.1, 300 sec: 49040.9). Total num frames: 2795044864. Throughput: 0: 49117.2. Samples: 2323859780. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-13 05:26:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:26:13,645][71000] Updated weights for policy 0, policy_version 170604 (0.0030) [2024-06-13 05:26:15,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2795274240. Throughput: 0: 49243.1. Samples: 2324151560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-13 05:26:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:26:16,954][71000] Updated weights for policy 0, policy_version 170614 (0.0028) [2024-06-13 05:26:20,156][71000] Updated weights for policy 0, policy_version 170624 (0.0035) [2024-06-13 05:26:20,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2795520000. Throughput: 0: 49167.0. Samples: 2324305080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-13 05:26:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:26:23,552][71000] Updated weights for policy 0, policy_version 170634 (0.0032) [2024-06-13 05:26:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2795765760. Throughput: 0: 49047.5. Samples: 2324595780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-13 05:26:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:26:26,778][71000] Updated weights for policy 0, policy_version 170644 (0.0031) [2024-06-13 05:26:30,358][71000] Updated weights for policy 0, policy_version 170654 (0.0035) [2024-06-13 05:26:30,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 2796027904. Throughput: 0: 49301.2. Samples: 2324893380. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-13 05:26:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:26:33,774][71000] Updated weights for policy 0, policy_version 170664 (0.0029) [2024-06-13 05:26:35,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.7, 300 sec: 48985.4). Total num frames: 2796257280. Throughput: 0: 49278.3. Samples: 2325030680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-13 05:26:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 05:26:37,357][71000] Updated weights for policy 0, policy_version 170674 (0.0030) [2024-06-13 05:26:40,598][71000] Updated weights for policy 0, policy_version 170684 (0.0037) [2024-06-13 05:26:40,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2796503040. Throughput: 0: 48973.7. Samples: 2325314800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-13 05:26:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:26:44,097][71000] Updated weights for policy 0, policy_version 170694 (0.0021) [2024-06-13 05:26:44,241][70980] Signal inference workers to stop experience collection... (34700 times) [2024-06-13 05:26:44,241][70980] Signal inference workers to resume experience collection... (34700 times) [2024-06-13 05:26:44,252][71000] InferenceWorker_p0-w0: stopping experience collection (34700 times) [2024-06-13 05:26:44,252][71000] InferenceWorker_p0-w0: resuming experience collection (34700 times) [2024-06-13 05:26:45,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.3, 300 sec: 49096.5). Total num frames: 2796748800. Throughput: 0: 48891.6. Samples: 2325612480. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-13 05:26:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:26:47,207][71000] Updated weights for policy 0, policy_version 170704 (0.0037) [2024-06-13 05:26:50,673][71000] Updated weights for policy 0, policy_version 170714 (0.0048) [2024-06-13 05:26:50,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2796978176. Throughput: 0: 49015.1. Samples: 2325761940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-13 05:26:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 05:26:53,987][71000] Updated weights for policy 0, policy_version 170724 (0.0036) [2024-06-13 05:26:55,939][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 2797207552. Throughput: 0: 48468.1. Samples: 2326040840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 05:26:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:26:57,451][71000] Updated weights for policy 0, policy_version 170734 (0.0027) [2024-06-13 05:27:00,586][71000] Updated weights for policy 0, policy_version 170744 (0.0027) [2024-06-13 05:27:00,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2797486080. Throughput: 0: 48627.1. Samples: 2326339780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 05:27:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:27:04,020][71000] Updated weights for policy 0, policy_version 170754 (0.0030) [2024-06-13 05:27:05,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2797731840. Throughput: 0: 48681.8. Samples: 2326495760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 05:27:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:27:07,227][71000] Updated weights for policy 0, policy_version 170764 (0.0029) [2024-06-13 05:27:10,541][71000] Updated weights for policy 0, policy_version 170774 (0.0033) [2024-06-13 05:27:10,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 2797977600. Throughput: 0: 48797.9. Samples: 2326791680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 05:27:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:27:13,883][71000] Updated weights for policy 0, policy_version 170784 (0.0027) [2024-06-13 05:27:15,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 2798190592. Throughput: 0: 48538.5. Samples: 2327077620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 05:27:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:27:17,401][71000] Updated weights for policy 0, policy_version 170794 (0.0028) [2024-06-13 05:27:20,595][71000] Updated weights for policy 0, policy_version 170804 (0.0023) [2024-06-13 05:27:20,940][70768] Fps is (10 sec: 47512.6, 60 sec: 48878.8, 300 sec: 49097.2). Total num frames: 2798452736. Throughput: 0: 48826.6. Samples: 2327227880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 05:27:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:27:23,934][71000] Updated weights for policy 0, policy_version 170814 (0.0033) [2024-06-13 05:27:25,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2798714880. Throughput: 0: 49072.7. Samples: 2327523080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 05:27:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:27:27,143][71000] Updated weights for policy 0, policy_version 170824 (0.0025) [2024-06-13 05:27:30,377][71000] Updated weights for policy 0, policy_version 170834 (0.0029) [2024-06-13 05:27:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 2798960640. Throughput: 0: 49174.1. Samples: 2327825320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 05:27:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:27:34,025][71000] Updated weights for policy 0, policy_version 170844 (0.0026) [2024-06-13 05:27:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 2799190016. Throughput: 0: 49147.9. Samples: 2327973600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 05:27:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:27:37,351][71000] Updated weights for policy 0, policy_version 170854 (0.0022) [2024-06-13 05:27:40,580][71000] Updated weights for policy 0, policy_version 170864 (0.0021) [2024-06-13 05:27:40,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2799435776. Throughput: 0: 49309.8. Samples: 2328259780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 05:27:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:27:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000170864_2799435776.pth... [2024-06-13 05:27:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000170147_2787688448.pth [2024-06-13 05:27:44,159][71000] Updated weights for policy 0, policy_version 170874 (0.0028) [2024-06-13 05:27:44,543][70980] Signal inference workers to stop experience collection... 
(34750 times) [2024-06-13 05:27:44,593][71000] InferenceWorker_p0-w0: stopping experience collection (34750 times) [2024-06-13 05:27:44,657][70980] Signal inference workers to resume experience collection... (34750 times) [2024-06-13 05:27:44,657][71000] InferenceWorker_p0-w0: resuming experience collection (34750 times) [2024-06-13 05:27:45,940][70768] Fps is (10 sec: 50789.0, 60 sec: 49151.6, 300 sec: 49096.4). Total num frames: 2799697920. Throughput: 0: 49270.2. Samples: 2328556960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 05:27:45,941][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:27:47,271][71000] Updated weights for policy 0, policy_version 170884 (0.0021) [2024-06-13 05:27:50,637][71000] Updated weights for policy 0, policy_version 170894 (0.0023) [2024-06-13 05:27:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2799943680. Throughput: 0: 49160.8. Samples: 2328708000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 05:27:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:27:53,730][71000] Updated weights for policy 0, policy_version 170904 (0.0028) [2024-06-13 05:27:55,940][70768] Fps is (10 sec: 47515.3, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 2800173056. Throughput: 0: 49214.9. Samples: 2329006360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 05:27:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 05:27:57,273][71000] Updated weights for policy 0, policy_version 170914 (0.0030) [2024-06-13 05:28:00,590][71000] Updated weights for policy 0, policy_version 170924 (0.0022) [2024-06-13 05:28:00,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2800418816. Throughput: 0: 49255.3. Samples: 2329294100. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-13 05:28:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 05:28:03,938][71000] Updated weights for policy 0, policy_version 170934 (0.0025) [2024-06-13 05:28:05,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2800680960. Throughput: 0: 49555.6. Samples: 2329457880. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-13 05:28:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:28:07,319][71000] Updated weights for policy 0, policy_version 170944 (0.0028) [2024-06-13 05:28:10,585][71000] Updated weights for policy 0, policy_version 170954 (0.0024) [2024-06-13 05:28:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 2800910336. Throughput: 0: 49462.4. Samples: 2329748880. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-13 05:28:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:28:13,811][71000] Updated weights for policy 0, policy_version 170964 (0.0033) [2024-06-13 05:28:15,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49096.4). Total num frames: 2801156096. Throughput: 0: 49217.8. Samples: 2330040120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-13 05:28:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:28:17,230][71000] Updated weights for policy 0, policy_version 170974 (0.0028) [2024-06-13 05:28:20,303][71000] Updated weights for policy 0, policy_version 170984 (0.0029) [2024-06-13 05:28:20,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2801418240. Throughput: 0: 49101.1. Samples: 2330183140. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-13 05:28:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 05:28:24,020][71000] Updated weights for policy 0, policy_version 170994 (0.0031) [2024-06-13 05:28:25,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2801664000. Throughput: 0: 49393.8. Samples: 2330482500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-13 05:28:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:28:27,204][71000] Updated weights for policy 0, policy_version 171004 (0.0028) [2024-06-13 05:28:30,534][71000] Updated weights for policy 0, policy_version 171014 (0.0023) [2024-06-13 05:28:30,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2801893376. Throughput: 0: 49451.2. Samples: 2330782240. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-13 05:28:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:28:33,770][71000] Updated weights for policy 0, policy_version 171024 (0.0025) [2024-06-13 05:28:35,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2802122752. Throughput: 0: 49250.6. Samples: 2330924280. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-13 05:28:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:28:37,446][71000] Updated weights for policy 0, policy_version 171034 (0.0035) [2024-06-13 05:28:40,361][71000] Updated weights for policy 0, policy_version 171044 (0.0026) [2024-06-13 05:28:40,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49425.0, 300 sec: 49207.6). Total num frames: 2802401280. Throughput: 0: 48889.8. Samples: 2331206400. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-13 05:28:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:28:43,831][71000] Updated weights for policy 0, policy_version 171054 (0.0032) [2024-06-13 05:28:45,939][70768] Fps is (10 sec: 52429.5, 60 sec: 49152.4, 300 sec: 49152.0). Total num frames: 2802647040. Throughput: 0: 49301.3. Samples: 2331512660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-13 05:28:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:28:46,936][71000] Updated weights for policy 0, policy_version 171064 (0.0021) [2024-06-13 05:28:50,690][71000] Updated weights for policy 0, policy_version 171074 (0.0034) [2024-06-13 05:28:50,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2802876416. Throughput: 0: 48878.2. Samples: 2331657400. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-13 05:28:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:28:53,610][71000] Updated weights for policy 0, policy_version 171084 (0.0028) [2024-06-13 05:28:55,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2803122176. Throughput: 0: 48964.1. Samples: 2331952260. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-13 05:28:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:28:57,383][71000] Updated weights for policy 0, policy_version 171094 (0.0034) [2024-06-13 05:29:00,379][71000] Updated weights for policy 0, policy_version 171104 (0.0032) [2024-06-13 05:29:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2803367936. Throughput: 0: 48792.4. Samples: 2332235780. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-13 05:29:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:29:01,478][70980] Signal inference workers to stop experience collection... 
(34800 times) [2024-06-13 05:29:01,478][70980] Signal inference workers to resume experience collection... (34800 times) [2024-06-13 05:29:01,493][71000] InferenceWorker_p0-w0: stopping experience collection (34800 times) [2024-06-13 05:29:01,493][71000] InferenceWorker_p0-w0: resuming experience collection (34800 times) [2024-06-13 05:29:04,276][71000] Updated weights for policy 0, policy_version 171114 (0.0036) [2024-06-13 05:29:05,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2803630080. Throughput: 0: 49084.4. Samples: 2332391940. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-13 05:29:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:29:07,043][71000] Updated weights for policy 0, policy_version 171124 (0.0026) [2024-06-13 05:29:10,520][71000] Updated weights for policy 0, policy_version 171134 (0.0023) [2024-06-13 05:29:10,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2803859456. Throughput: 0: 49029.8. Samples: 2332688840. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-13 05:29:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:29:13,456][71000] Updated weights for policy 0, policy_version 171144 (0.0025) [2024-06-13 05:29:15,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2804105216. Throughput: 0: 49079.5. Samples: 2332990820. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-13 05:29:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:29:17,302][71000] Updated weights for policy 0, policy_version 171154 (0.0035) [2024-06-13 05:29:20,365][71000] Updated weights for policy 0, policy_version 171164 (0.0032) [2024-06-13 05:29:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2804367360. Throughput: 0: 49092.5. Samples: 2333133440. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-13 05:29:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:29:24,147][71000] Updated weights for policy 0, policy_version 171174 (0.0024) [2024-06-13 05:29:25,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2804580352. Throughput: 0: 49071.6. Samples: 2333414620. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-13 05:29:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:29:26,929][71000] Updated weights for policy 0, policy_version 171184 (0.0031) [2024-06-13 05:29:30,632][71000] Updated weights for policy 0, policy_version 171194 (0.0026) [2024-06-13 05:29:30,941][70768] Fps is (10 sec: 47509.3, 60 sec: 49151.2, 300 sec: 48985.2). Total num frames: 2804842496. Throughput: 0: 48866.0. Samples: 2333711680. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-13 05:29:30,941][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 05:29:33,582][71000] Updated weights for policy 0, policy_version 171204 (0.0031) [2024-06-13 05:29:35,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49425.0, 300 sec: 49096.4). Total num frames: 2805088256. Throughput: 0: 48870.6. Samples: 2333856580. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-13 05:29:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:29:37,182][71000] Updated weights for policy 0, policy_version 171214 (0.0031) [2024-06-13 05:29:40,050][71000] Updated weights for policy 0, policy_version 171224 (0.0022) [2024-06-13 05:29:40,940][70768] Fps is (10 sec: 50795.3, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2805350400. Throughput: 0: 49100.3. Samples: 2334161780. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-13 05:29:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:29:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000171225_2805350400.pth... 
[2024-06-13 05:29:41,004][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000170506_2793570304.pth [2024-06-13 05:29:44,156][71000] Updated weights for policy 0, policy_version 171234 (0.0036) [2024-06-13 05:29:45,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48878.8, 300 sec: 49096.5). Total num frames: 2805579776. Throughput: 0: 49524.0. Samples: 2334464360. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-13 05:29:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:29:46,987][71000] Updated weights for policy 0, policy_version 171244 (0.0029) [2024-06-13 05:29:50,895][71000] Updated weights for policy 0, policy_version 171254 (0.0035) [2024-06-13 05:29:50,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2805825536. Throughput: 0: 49067.1. Samples: 2334599960. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-13 05:29:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:29:53,088][70980] Signal inference workers to stop experience collection... (34850 times) [2024-06-13 05:29:53,138][71000] InferenceWorker_p0-w0: stopping experience collection (34850 times) [2024-06-13 05:29:53,144][70980] Signal inference workers to resume experience collection... (34850 times) [2024-06-13 05:29:53,151][71000] InferenceWorker_p0-w0: resuming experience collection (34850 times) [2024-06-13 05:29:53,567][71000] Updated weights for policy 0, policy_version 171264 (0.0027) [2024-06-13 05:29:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 2806071296. Throughput: 0: 49037.3. Samples: 2334895520. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-13 05:29:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:29:57,193][71000] Updated weights for policy 0, policy_version 171274 (0.0032) [2024-06-13 05:30:00,203][71000] Updated weights for policy 0, policy_version 171284 (0.0031) [2024-06-13 05:30:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2806333440. Throughput: 0: 48743.0. Samples: 2335184260. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-13 05:30:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:30:04,055][71000] Updated weights for policy 0, policy_version 171294 (0.0027) [2024-06-13 05:30:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2806562816. Throughput: 0: 48897.0. Samples: 2335333800. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-13 05:30:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:30:07,085][71000] Updated weights for policy 0, policy_version 171304 (0.0032) [2024-06-13 05:30:10,727][71000] Updated weights for policy 0, policy_version 171314 (0.0030) [2024-06-13 05:30:10,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2806808576. Throughput: 0: 49268.9. Samples: 2335631720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 05:30:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:30:13,422][71000] Updated weights for policy 0, policy_version 171324 (0.0022) [2024-06-13 05:30:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2807054336. Throughput: 0: 49211.3. Samples: 2335926140. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 05:30:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:30:17,155][71000] Updated weights for policy 0, policy_version 171334 (0.0024) [2024-06-13 05:30:20,343][71000] Updated weights for policy 0, policy_version 171344 (0.0028) [2024-06-13 05:30:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2807316480. Throughput: 0: 49361.0. Samples: 2336077820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 05:30:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:30:24,267][71000] Updated weights for policy 0, policy_version 171354 (0.0033) [2024-06-13 05:30:25,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2807513088. Throughput: 0: 48960.4. Samples: 2336365000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 05:30:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:30:27,064][71000] Updated weights for policy 0, policy_version 171364 (0.0025) [2024-06-13 05:30:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48879.7, 300 sec: 48929.8). Total num frames: 2807775232. Throughput: 0: 48686.7. Samples: 2336655260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 05:30:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:30:31,217][71000] Updated weights for policy 0, policy_version 171374 (0.0025) [2024-06-13 05:30:33,883][71000] Updated weights for policy 0, policy_version 171384 (0.0025) [2024-06-13 05:30:35,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2808037376. Throughput: 0: 48991.2. Samples: 2336804560. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 05:30:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:30:37,815][71000] Updated weights for policy 0, policy_version 171394 (0.0028) [2024-06-13 05:30:40,623][71000] Updated weights for policy 0, policy_version 171404 (0.0021) [2024-06-13 05:30:40,939][70768] Fps is (10 sec: 50790.6, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2808283136. Throughput: 0: 49104.1. Samples: 2337105200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 05:30:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:30:44,362][71000] Updated weights for policy 0, policy_version 171414 (0.0034) [2024-06-13 05:30:45,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2808496128. Throughput: 0: 49122.7. Samples: 2337394780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 05:30:45,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 05:30:47,447][71000] Updated weights for policy 0, policy_version 171424 (0.0025) [2024-06-13 05:30:50,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2808741888. Throughput: 0: 48746.2. Samples: 2337527380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 05:30:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:30:51,559][71000] Updated weights for policy 0, policy_version 171434 (0.0025) [2024-06-13 05:30:54,278][71000] Updated weights for policy 0, policy_version 171444 (0.0031) [2024-06-13 05:30:55,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2809020416. Throughput: 0: 48679.0. Samples: 2337822280. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 05:30:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:30:58,109][71000] Updated weights for policy 0, policy_version 171454 (0.0037) [2024-06-13 05:31:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2809249792. Throughput: 0: 48715.5. Samples: 2338118340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 05:31:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:31:01,106][71000] Updated weights for policy 0, policy_version 171464 (0.0031) [2024-06-13 05:31:02,674][70980] Signal inference workers to stop experience collection... (34900 times) [2024-06-13 05:31:02,674][70980] Signal inference workers to resume experience collection... (34900 times) [2024-06-13 05:31:02,714][71000] InferenceWorker_p0-w0: stopping experience collection (34900 times) [2024-06-13 05:31:02,714][71000] InferenceWorker_p0-w0: resuming experience collection (34900 times) [2024-06-13 05:31:04,777][71000] Updated weights for policy 0, policy_version 171474 (0.0030) [2024-06-13 05:31:05,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 2809479168. Throughput: 0: 48481.4. Samples: 2338259480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 05:31:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:31:07,726][71000] Updated weights for policy 0, policy_version 171484 (0.0024) [2024-06-13 05:31:10,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48332.8, 300 sec: 48929.8). Total num frames: 2809708544. Throughput: 0: 48668.9. Samples: 2338555100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 05:31:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:31:11,439][71000] Updated weights for policy 0, policy_version 171494 (0.0027) [2024-06-13 05:31:14,443][71000] Updated weights for policy 0, policy_version 171504 (0.0023) [2024-06-13 05:31:15,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 2810003456. Throughput: 0: 48808.4. Samples: 2338851640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 05:31:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:31:17,983][71000] Updated weights for policy 0, policy_version 171514 (0.0026) [2024-06-13 05:31:20,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48332.8, 300 sec: 48985.4). Total num frames: 2810216448. Throughput: 0: 48875.5. Samples: 2339003960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 05:31:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:31:21,168][71000] Updated weights for policy 0, policy_version 171524 (0.0038) [2024-06-13 05:31:24,674][71000] Updated weights for policy 0, policy_version 171534 (0.0029) [2024-06-13 05:31:25,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 2810462208. Throughput: 0: 48790.6. Samples: 2339300780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 05:31:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:31:27,633][71000] Updated weights for policy 0, policy_version 171544 (0.0039) [2024-06-13 05:31:30,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2810707968. Throughput: 0: 48844.1. Samples: 2339592760. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 05:31:30,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 05:31:31,435][71000] Updated weights for policy 0, policy_version 171554 (0.0028) [2024-06-13 05:31:34,589][71000] Updated weights for policy 0, policy_version 171564 (0.0028) [2024-06-13 05:31:35,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2810970112. Throughput: 0: 49100.5. Samples: 2339736900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 05:31:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:31:38,050][71000] Updated weights for policy 0, policy_version 171574 (0.0025) [2024-06-13 05:31:40,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.8, 300 sec: 48929.8). Total num frames: 2811183104. Throughput: 0: 48862.4. Samples: 2340021080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 05:31:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:31:40,971][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000171582_2811199488.pth... [2024-06-13 05:31:41,012][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000170864_2799435776.pth [2024-06-13 05:31:41,474][71000] Updated weights for policy 0, policy_version 171584 (0.0034) [2024-06-13 05:31:45,033][71000] Updated weights for policy 0, policy_version 171594 (0.0027) [2024-06-13 05:31:45,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2811428864. Throughput: 0: 48701.8. Samples: 2340309920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 05:31:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:31:48,342][71000] Updated weights for policy 0, policy_version 171604 (0.0030) [2024-06-13 05:31:50,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2811658240. Throughput: 0: 48756.5. Samples: 2340453520. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 05:31:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:31:51,985][71000] Updated weights for policy 0, policy_version 171614 (0.0024) [2024-06-13 05:31:54,729][71000] Updated weights for policy 0, policy_version 171624 (0.0031) [2024-06-13 05:31:55,939][70768] Fps is (10 sec: 50791.0, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 2811936768. Throughput: 0: 48864.5. Samples: 2340754000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 05:31:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:31:58,407][71000] Updated weights for policy 0, policy_version 171634 (0.0026) [2024-06-13 05:32:00,940][70768] Fps is (10 sec: 50789.3, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 2812166144. Throughput: 0: 48787.4. Samples: 2341047080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 05:32:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:32:01,505][71000] Updated weights for policy 0, policy_version 171644 (0.0031) [2024-06-13 05:32:05,005][71000] Updated weights for policy 0, policy_version 171654 (0.0026) [2024-06-13 05:32:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 2812411904. Throughput: 0: 48644.0. Samples: 2341192940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 05:32:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:32:08,257][71000] Updated weights for policy 0, policy_version 171664 (0.0037) [2024-06-13 05:32:08,719][70980] Signal inference workers to stop experience collection... (34950 times) [2024-06-13 05:32:08,729][71000] InferenceWorker_p0-w0: stopping experience collection (34950 times) [2024-06-13 05:32:08,775][70980] Signal inference workers to resume experience collection... 
(34950 times) [2024-06-13 05:32:08,775][71000] InferenceWorker_p0-w0: resuming experience collection (34950 times) [2024-06-13 05:32:10,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2812657664. Throughput: 0: 48524.8. Samples: 2341484400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 05:32:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:32:12,019][71000] Updated weights for policy 0, policy_version 171674 (0.0031) [2024-06-13 05:32:14,812][71000] Updated weights for policy 0, policy_version 171684 (0.0035) [2024-06-13 05:32:15,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2812936192. Throughput: 0: 48734.6. Samples: 2341785820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 05:32:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:32:18,505][71000] Updated weights for policy 0, policy_version 171694 (0.0032) [2024-06-13 05:32:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 2813149184. Throughput: 0: 49046.1. Samples: 2341943980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 05:32:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:32:21,546][71000] Updated weights for policy 0, policy_version 171704 (0.0030) [2024-06-13 05:32:25,038][71000] Updated weights for policy 0, policy_version 171714 (0.0030) [2024-06-13 05:32:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2813411328. Throughput: 0: 48964.9. Samples: 2342224500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 05:32:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:32:28,369][71000] Updated weights for policy 0, policy_version 171724 (0.0033) [2024-06-13 05:32:30,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2813657088. Throughput: 0: 49052.1. Samples: 2342517260. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 05:32:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:32:31,950][71000] Updated weights for policy 0, policy_version 171734 (0.0038) [2024-06-13 05:32:35,015][71000] Updated weights for policy 0, policy_version 171744 (0.0023) [2024-06-13 05:32:35,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2813902848. Throughput: 0: 49171.1. Samples: 2342666220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 05:32:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:32:38,920][71000] Updated weights for policy 0, policy_version 171754 (0.0039) [2024-06-13 05:32:40,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 2814132224. Throughput: 0: 48959.2. Samples: 2342957160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 05:32:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:32:41,909][71000] Updated weights for policy 0, policy_version 171764 (0.0030) [2024-06-13 05:32:45,545][71000] Updated weights for policy 0, policy_version 171774 (0.0023) [2024-06-13 05:32:45,940][70768] Fps is (10 sec: 45874.5, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 2814361600. Throughput: 0: 49085.4. Samples: 2343255920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 05:32:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:32:48,703][71000] Updated weights for policy 0, policy_version 171784 (0.0022) [2024-06-13 05:32:50,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49424.9, 300 sec: 48985.4). Total num frames: 2814623744. Throughput: 0: 48996.8. Samples: 2343397800. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 05:32:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:32:51,874][71000] Updated weights for policy 0, policy_version 171794 (0.0024) [2024-06-13 05:32:55,361][71000] Updated weights for policy 0, policy_version 171804 (0.0023) [2024-06-13 05:32:55,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2814885888. Throughput: 0: 48996.2. Samples: 2343689220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 05:32:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:32:58,847][71000] Updated weights for policy 0, policy_version 171814 (0.0028) [2024-06-13 05:33:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 2815131648. Throughput: 0: 48896.8. Samples: 2343986180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 05:33:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:33:01,962][71000] Updated weights for policy 0, policy_version 171824 (0.0029) [2024-06-13 05:33:05,495][71000] Updated weights for policy 0, policy_version 171834 (0.0027) [2024-06-13 05:33:05,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 2815344640. Throughput: 0: 48532.0. Samples: 2344127920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 05:33:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:33:08,824][71000] Updated weights for policy 0, policy_version 171844 (0.0026) [2024-06-13 05:33:10,939][70768] Fps is (10 sec: 45875.9, 60 sec: 48879.1, 300 sec: 48929.9). Total num frames: 2815590400. Throughput: 0: 48780.0. Samples: 2344419600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 05:33:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:33:12,083][70980] Signal inference workers to stop experience collection... 
(35000 times) [2024-06-13 05:33:12,084][70980] Signal inference workers to resume experience collection... (35000 times) [2024-06-13 05:33:12,089][71000] Updated weights for policy 0, policy_version 171854 (0.0028) [2024-06-13 05:33:12,134][71000] InferenceWorker_p0-w0: stopping experience collection (35000 times) [2024-06-13 05:33:12,134][71000] InferenceWorker_p0-w0: resuming experience collection (35000 times) [2024-06-13 05:33:15,022][71000] Updated weights for policy 0, policy_version 171864 (0.0027) [2024-06-13 05:33:15,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 2815836160. Throughput: 0: 48898.7. Samples: 2344717700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 05:33:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 05:33:18,700][71000] Updated weights for policy 0, policy_version 171874 (0.0021) [2024-06-13 05:33:20,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 2816098304. Throughput: 0: 48980.9. Samples: 2344870360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 05:33:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:33:21,770][71000] Updated weights for policy 0, policy_version 171884 (0.0040) [2024-06-13 05:33:25,104][71000] Updated weights for policy 0, policy_version 171894 (0.0034) [2024-06-13 05:33:25,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 2816327680. Throughput: 0: 49184.3. Samples: 2345170460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 05:33:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:33:28,398][71000] Updated weights for policy 0, policy_version 171904 (0.0032) [2024-06-13 05:33:30,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2816606208. Throughput: 0: 48962.7. Samples: 2345459240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 05:33:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:33:31,794][71000] Updated weights for policy 0, policy_version 171914 (0.0036) [2024-06-13 05:33:34,775][71000] Updated weights for policy 0, policy_version 171924 (0.0024) [2024-06-13 05:33:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 2816835584. Throughput: 0: 49221.4. Samples: 2345612760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 05:33:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:33:38,192][71000] Updated weights for policy 0, policy_version 171934 (0.0029) [2024-06-13 05:33:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49424.9, 300 sec: 48985.4). Total num frames: 2817097728. Throughput: 0: 49509.7. Samples: 2345917160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 05:33:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:33:41,024][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000171943_2817114112.pth... [2024-06-13 05:33:41,073][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000171225_2805350400.pth [2024-06-13 05:33:41,217][71000] Updated weights for policy 0, policy_version 171944 (0.0033) [2024-06-13 05:33:44,919][71000] Updated weights for policy 0, policy_version 171954 (0.0029) [2024-06-13 05:33:45,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 2817310720. Throughput: 0: 49130.0. Samples: 2346197020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 05:33:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:33:48,136][71000] Updated weights for policy 0, policy_version 171964 (0.0033) [2024-06-13 05:33:50,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2817572864. Throughput: 0: 49157.4. Samples: 2346340000. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 05:33:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:33:51,577][71000] Updated weights for policy 0, policy_version 171974 (0.0027) [2024-06-13 05:33:54,690][71000] Updated weights for policy 0, policy_version 171984 (0.0028) [2024-06-13 05:33:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 2817802240. Throughput: 0: 49429.7. Samples: 2346643940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 05:33:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:33:58,224][71000] Updated weights for policy 0, policy_version 171994 (0.0034) [2024-06-13 05:34:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 2818048000. Throughput: 0: 49312.0. Samples: 2346936740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 05:34:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:34:01,578][71000] Updated weights for policy 0, policy_version 172004 (0.0024) [2024-06-13 05:34:04,940][71000] Updated weights for policy 0, policy_version 172014 (0.0028) [2024-06-13 05:34:05,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.2, 300 sec: 48985.4). Total num frames: 2818310144. Throughput: 0: 49161.8. Samples: 2347082640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 05:34:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:34:08,251][71000] Updated weights for policy 0, policy_version 172024 (0.0037) [2024-06-13 05:34:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 2818539520. Throughput: 0: 48993.8. Samples: 2347375180. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 05:34:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:34:11,634][71000] Updated weights for policy 0, policy_version 172034 (0.0032) [2024-06-13 05:34:15,143][71000] Updated weights for policy 0, policy_version 172044 (0.0029) [2024-06-13 05:34:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 2818785280. Throughput: 0: 48869.4. Samples: 2347658360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 05:34:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 05:34:18,576][71000] Updated weights for policy 0, policy_version 172054 (0.0032) [2024-06-13 05:34:19,187][70980] Signal inference workers to stop experience collection... (35050 times) [2024-06-13 05:34:19,187][70980] Signal inference workers to resume experience collection... (35050 times) [2024-06-13 05:34:19,229][71000] InferenceWorker_p0-w0: stopping experience collection (35050 times) [2024-06-13 05:34:19,229][71000] InferenceWorker_p0-w0: resuming experience collection (35050 times) [2024-06-13 05:34:20,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2819031040. Throughput: 0: 48798.8. Samples: 2347808700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 05:34:20,942][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:34:22,039][71000] Updated weights for policy 0, policy_version 172064 (0.0026) [2024-06-13 05:34:25,216][71000] Updated weights for policy 0, policy_version 172074 (0.0029) [2024-06-13 05:34:25,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49698.2, 300 sec: 49041.1). Total num frames: 2819309568. Throughput: 0: 48685.0. Samples: 2348107980. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-13 05:34:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:34:28,621][71000] Updated weights for policy 0, policy_version 172084 (0.0036) [2024-06-13 05:34:30,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 2819506176. Throughput: 0: 48719.4. Samples: 2348389400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:34:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:34:32,300][71000] Updated weights for policy 0, policy_version 172094 (0.0025) [2024-06-13 05:34:35,272][71000] Updated weights for policy 0, policy_version 172104 (0.0023) [2024-06-13 05:34:35,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 2819768320. Throughput: 0: 48580.8. Samples: 2348526140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:34:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:34:38,676][71000] Updated weights for policy 0, policy_version 172114 (0.0030) [2024-06-13 05:34:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 2819997696. Throughput: 0: 48451.9. Samples: 2348824280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:34:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:34:42,352][71000] Updated weights for policy 0, policy_version 172124 (0.0029) [2024-06-13 05:34:45,351][71000] Updated weights for policy 0, policy_version 172134 (0.0024) [2024-06-13 05:34:45,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49698.0, 300 sec: 49040.9). Total num frames: 2820292608. Throughput: 0: 48725.7. Samples: 2349129400. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:34:45,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 05:34:48,940][71000] Updated weights for policy 0, policy_version 172144 (0.0026) [2024-06-13 05:34:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 2820489216. Throughput: 0: 48782.1. Samples: 2349277840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:34:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:34:52,314][71000] Updated weights for policy 0, policy_version 172154 (0.0033) [2024-06-13 05:34:55,644][71000] Updated weights for policy 0, policy_version 172164 (0.0028) [2024-06-13 05:34:55,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 2820751360. Throughput: 0: 48574.2. Samples: 2349561020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:34:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:34:59,003][71000] Updated weights for policy 0, policy_version 172174 (0.0025) [2024-06-13 05:35:00,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 2820980736. Throughput: 0: 48748.1. Samples: 2349852020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:35:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:35:02,422][71000] Updated weights for policy 0, policy_version 172184 (0.0037) [2024-06-13 05:35:05,693][71000] Updated weights for policy 0, policy_version 172194 (0.0030) [2024-06-13 05:35:05,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 2821226496. Throughput: 0: 48610.2. Samples: 2349996160. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:35:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:35:09,099][71000] Updated weights for policy 0, policy_version 172204 (0.0024) [2024-06-13 05:35:10,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 2821472256. Throughput: 0: 48510.1. Samples: 2350290940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:35:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:35:12,571][71000] Updated weights for policy 0, policy_version 172214 (0.0029) [2024-06-13 05:35:15,797][71000] Updated weights for policy 0, policy_version 172224 (0.0030) [2024-06-13 05:35:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 2821718016. Throughput: 0: 48852.0. Samples: 2350587740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:35:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:35:18,900][70980] Signal inference workers to stop experience collection... (35100 times) [2024-06-13 05:35:18,900][70980] Signal inference workers to resume experience collection... (35100 times) [2024-06-13 05:35:18,933][71000] InferenceWorker_p0-w0: stopping experience collection (35100 times) [2024-06-13 05:35:18,933][71000] InferenceWorker_p0-w0: resuming experience collection (35100 times) [2024-06-13 05:35:19,046][71000] Updated weights for policy 0, policy_version 172234 (0.0026) [2024-06-13 05:35:20,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2821963776. Throughput: 0: 49065.4. Samples: 2350734080. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:35:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:35:22,585][71000] Updated weights for policy 0, policy_version 172244 (0.0026) [2024-06-13 05:35:25,826][71000] Updated weights for policy 0, policy_version 172254 (0.0028) [2024-06-13 05:35:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48332.8, 300 sec: 48929.8). Total num frames: 2822209536. Throughput: 0: 49114.3. Samples: 2351034420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:35:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:35:29,563][71000] Updated weights for policy 0, policy_version 172264 (0.0031) [2024-06-13 05:35:30,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 2822455296. Throughput: 0: 48719.7. Samples: 2351321780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-13 05:35:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:35:32,543][71000] Updated weights for policy 0, policy_version 172274 (0.0030) [2024-06-13 05:35:35,829][71000] Updated weights for policy 0, policy_version 172284 (0.0022) [2024-06-13 05:35:35,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 2822701056. Throughput: 0: 48632.0. Samples: 2351466280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 05:35:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:35:38,920][71000] Updated weights for policy 0, policy_version 172294 (0.0026) [2024-06-13 05:35:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2822946816. Throughput: 0: 48962.6. Samples: 2351764340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 05:35:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:35:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000172299_2822946816.pth... 
[2024-06-13 05:35:41,014][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000171582_2811199488.pth [2024-06-13 05:35:42,708][71000] Updated weights for policy 0, policy_version 172304 (0.0032) [2024-06-13 05:35:45,600][71000] Updated weights for policy 0, policy_version 172314 (0.0022) [2024-06-13 05:35:45,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2823208960. Throughput: 0: 49171.5. Samples: 2352064740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 05:35:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:35:49,329][71000] Updated weights for policy 0, policy_version 172324 (0.0035) [2024-06-13 05:35:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 2823438336. Throughput: 0: 49226.6. Samples: 2352211360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 05:35:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:35:52,326][71000] Updated weights for policy 0, policy_version 172334 (0.0024) [2024-06-13 05:35:55,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48332.7, 300 sec: 48818.8). Total num frames: 2823651328. Throughput: 0: 49058.1. Samples: 2352498560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 05:35:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:35:56,404][71000] Updated weights for policy 0, policy_version 172344 (0.0042) [2024-06-13 05:35:59,250][71000] Updated weights for policy 0, policy_version 172354 (0.0026) [2024-06-13 05:36:00,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 2823946240. Throughput: 0: 48972.4. Samples: 2352791500. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 05:36:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:36:02,809][71000] Updated weights for policy 0, policy_version 172364 (0.0024) [2024-06-13 05:36:05,883][71000] Updated weights for policy 0, policy_version 172374 (0.0032) [2024-06-13 05:36:05,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2824175616. Throughput: 0: 49328.4. Samples: 2352953860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 05:36:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:36:09,772][71000] Updated weights for policy 0, policy_version 172384 (0.0026) [2024-06-13 05:36:10,939][70768] Fps is (10 sec: 44237.3, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 2824388608. Throughput: 0: 48932.0. Samples: 2353236360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 05:36:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:36:12,673][71000] Updated weights for policy 0, policy_version 172394 (0.0021) [2024-06-13 05:36:15,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 2824634368. Throughput: 0: 48847.9. Samples: 2353519940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 05:36:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:36:16,713][71000] Updated weights for policy 0, policy_version 172404 (0.0028) [2024-06-13 05:36:19,437][71000] Updated weights for policy 0, policy_version 172414 (0.0028) [2024-06-13 05:36:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 2824896512. Throughput: 0: 48927.6. Samples: 2353668020. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 05:36:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:36:23,115][71000] Updated weights for policy 0, policy_version 172424 (0.0020) [2024-06-13 05:36:25,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 2825142272. Throughput: 0: 48788.4. Samples: 2353959820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 05:36:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:36:26,083][71000] Updated weights for policy 0, policy_version 172434 (0.0038) [2024-06-13 05:36:29,965][71000] Updated weights for policy 0, policy_version 172444 (0.0026) [2024-06-13 05:36:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 2825371648. Throughput: 0: 48639.2. Samples: 2354253500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 05:36:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:36:32,432][70980] Signal inference workers to stop experience collection... (35150 times) [2024-06-13 05:36:32,432][70980] Signal inference workers to resume experience collection... (35150 times) [2024-06-13 05:36:32,447][71000] InferenceWorker_p0-w0: stopping experience collection (35150 times) [2024-06-13 05:36:32,447][71000] InferenceWorker_p0-w0: resuming experience collection (35150 times) [2024-06-13 05:36:32,881][71000] Updated weights for policy 0, policy_version 172454 (0.0027) [2024-06-13 05:36:35,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 2825617408. Throughput: 0: 48504.0. Samples: 2354394040. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 05:36:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:36:36,434][71000] Updated weights for policy 0, policy_version 172464 (0.0032) [2024-06-13 05:36:39,567][71000] Updated weights for policy 0, policy_version 172474 (0.0027) [2024-06-13 05:36:40,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2825879552. Throughput: 0: 48741.9. Samples: 2354691940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 05:36:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:36:43,279][71000] Updated weights for policy 0, policy_version 172484 (0.0030) [2024-06-13 05:36:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48332.8, 300 sec: 48985.4). Total num frames: 2826108928. Throughput: 0: 48764.1. Samples: 2354985880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:36:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:36:46,434][71000] Updated weights for policy 0, policy_version 172494 (0.0044) [2024-06-13 05:36:50,035][71000] Updated weights for policy 0, policy_version 172504 (0.0027) [2024-06-13 05:36:50,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48605.7, 300 sec: 48874.3). Total num frames: 2826354688. Throughput: 0: 48309.6. Samples: 2355127800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:36:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:36:52,973][71000] Updated weights for policy 0, policy_version 172514 (0.0036) [2024-06-13 05:36:55,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 2826600448. Throughput: 0: 48547.4. Samples: 2355421000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:36:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:36:56,463][71000] Updated weights for policy 0, policy_version 172524 (0.0036) [2024-06-13 05:36:59,628][71000] Updated weights for policy 0, policy_version 172534 (0.0022) [2024-06-13 05:37:00,939][70768] Fps is (10 sec: 49153.4, 60 sec: 48332.9, 300 sec: 48929.9). Total num frames: 2826846208. Throughput: 0: 48886.4. Samples: 2355719820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:37:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:37:03,199][71000] Updated weights for policy 0, policy_version 172544 (0.0023) [2024-06-13 05:37:05,940][70768] Fps is (10 sec: 50791.1, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2827108352. Throughput: 0: 49057.0. Samples: 2355875580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:37:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:37:06,125][71000] Updated weights for policy 0, policy_version 172554 (0.0028) [2024-06-13 05:37:09,793][71000] Updated weights for policy 0, policy_version 172564 (0.0025) [2024-06-13 05:37:10,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 2827321344. Throughput: 0: 49018.7. Samples: 2356165660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:37:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:37:13,118][71000] Updated weights for policy 0, policy_version 172574 (0.0032) [2024-06-13 05:37:15,944][70768] Fps is (10 sec: 47491.7, 60 sec: 49148.3, 300 sec: 48929.1). Total num frames: 2827583488. Throughput: 0: 48975.9. Samples: 2356457640. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:37:15,944][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:37:16,346][71000] Updated weights for policy 0, policy_version 172584 (0.0028) [2024-06-13 05:37:19,565][71000] Updated weights for policy 0, policy_version 172594 (0.0027) [2024-06-13 05:37:20,940][70768] Fps is (10 sec: 52427.1, 60 sec: 49151.8, 300 sec: 48929.8). Total num frames: 2827845632. Throughput: 0: 49242.3. Samples: 2356609960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:37:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:37:23,252][71000] Updated weights for policy 0, policy_version 172604 (0.0029) [2024-06-13 05:37:25,940][70768] Fps is (10 sec: 50813.6, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 2828091392. Throughput: 0: 49191.2. Samples: 2356905540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:37:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:37:26,094][71000] Updated weights for policy 0, policy_version 172614 (0.0020) [2024-06-13 05:37:29,830][71000] Updated weights for policy 0, policy_version 172624 (0.0033) [2024-06-13 05:37:30,939][70768] Fps is (10 sec: 45876.7, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 2828304384. Throughput: 0: 49090.3. Samples: 2357194940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:37:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:37:33,107][71000] Updated weights for policy 0, policy_version 172634 (0.0032) [2024-06-13 05:37:35,940][70768] Fps is (10 sec: 47512.2, 60 sec: 49151.7, 300 sec: 48929.8). Total num frames: 2828566528. Throughput: 0: 49108.3. Samples: 2357337680. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:37:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:37:36,312][71000] Updated weights for policy 0, policy_version 172644 (0.0025) [2024-06-13 05:37:39,698][71000] Updated weights for policy 0, policy_version 172654 (0.0031) [2024-06-13 05:37:40,940][70768] Fps is (10 sec: 50789.3, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 2828812288. Throughput: 0: 49328.8. Samples: 2357640800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:37:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:37:41,012][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000172658_2828828672.pth... [2024-06-13 05:37:41,062][70980] Signal inference workers to stop experience collection... (35200 times) [2024-06-13 05:37:41,072][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000171943_2817114112.pth [2024-06-13 05:37:41,076][71000] InferenceWorker_p0-w0: stopping experience collection (35200 times) [2024-06-13 05:37:41,178][70980] Signal inference workers to resume experience collection... (35200 times) [2024-06-13 05:37:41,179][71000] InferenceWorker_p0-w0: resuming experience collection (35200 times) [2024-06-13 05:37:42,954][71000] Updated weights for policy 0, policy_version 172664 (0.0026) [2024-06-13 05:37:45,940][70768] Fps is (10 sec: 49153.4, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 2829058048. Throughput: 0: 49225.7. Samples: 2357934980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 05:37:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:37:46,191][71000] Updated weights for policy 0, policy_version 172674 (0.0028) [2024-06-13 05:37:49,841][71000] Updated weights for policy 0, policy_version 172684 (0.0037) [2024-06-13 05:37:50,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 2829287424. Throughput: 0: 48798.6. Samples: 2358071520. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-13 05:37:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 05:37:53,385][71000] Updated weights for policy 0, policy_version 172694 (0.0028) [2024-06-13 05:37:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 2829549568. Throughput: 0: 48987.1. Samples: 2358370080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-13 05:37:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:37:56,283][71000] Updated weights for policy 0, policy_version 172704 (0.0028) [2024-06-13 05:37:59,847][71000] Updated weights for policy 0, policy_version 172714 (0.0025) [2024-06-13 05:38:00,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2829795328. Throughput: 0: 49149.9. Samples: 2358669160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-13 05:38:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:38:03,049][71000] Updated weights for policy 0, policy_version 172724 (0.0021) [2024-06-13 05:38:05,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 2830024704. Throughput: 0: 49045.2. Samples: 2358816980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-13 05:38:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:38:06,552][71000] Updated weights for policy 0, policy_version 172734 (0.0026) [2024-06-13 05:38:09,971][71000] Updated weights for policy 0, policy_version 172744 (0.0026) [2024-06-13 05:38:10,939][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 2830270464. Throughput: 0: 48767.6. Samples: 2359100080. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-13 05:38:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:38:13,244][71000] Updated weights for policy 0, policy_version 172754 (0.0035) [2024-06-13 05:38:15,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49155.7, 300 sec: 48929.8). Total num frames: 2830532608. Throughput: 0: 48859.0. Samples: 2359393600. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-13 05:38:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:38:16,304][71000] Updated weights for policy 0, policy_version 172764 (0.0029) [2024-06-13 05:38:20,001][71000] Updated weights for policy 0, policy_version 172774 (0.0022) [2024-06-13 05:38:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 2830778368. Throughput: 0: 49186.0. Samples: 2359551040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-13 05:38:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:38:22,777][71000] Updated weights for policy 0, policy_version 172784 (0.0035) [2024-06-13 05:38:25,939][70768] Fps is (10 sec: 49152.8, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 2831024128. Throughput: 0: 49094.9. Samples: 2359850060. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-13 05:38:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:38:26,624][71000] Updated weights for policy 0, policy_version 172794 (0.0023) [2024-06-13 05:38:29,782][71000] Updated weights for policy 0, policy_version 172804 (0.0035) [2024-06-13 05:38:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 2831253504. Throughput: 0: 48901.7. Samples: 2360135560. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-13 05:38:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:38:33,207][71000] Updated weights for policy 0, policy_version 172814 (0.0025) [2024-06-13 05:38:35,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 2831515648. Throughput: 0: 48935.9. Samples: 2360273640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-13 05:38:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:38:36,290][71000] Updated weights for policy 0, policy_version 172824 (0.0020) [2024-06-13 05:38:40,235][71000] Updated weights for policy 0, policy_version 172834 (0.0033) [2024-06-13 05:38:40,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 2831728640. Throughput: 0: 48916.4. Samples: 2360571320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-13 05:38:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:38:43,792][71000] Updated weights for policy 0, policy_version 172844 (0.0037) [2024-06-13 05:38:45,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 2832007168. Throughput: 0: 48764.4. Samples: 2360863560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-13 05:38:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:38:46,934][71000] Updated weights for policy 0, policy_version 172854 (0.0023) [2024-06-13 05:38:50,293][71000] Updated weights for policy 0, policy_version 172864 (0.0032) [2024-06-13 05:38:50,556][70980] Signal inference workers to stop experience collection... (35250 times) [2024-06-13 05:38:50,557][70980] Signal inference workers to resume experience collection... 
(35250 times) [2024-06-13 05:38:50,580][71000] InferenceWorker_p0-w0: stopping experience collection (35250 times) [2024-06-13 05:38:50,580][71000] InferenceWorker_p0-w0: resuming experience collection (35250 times) [2024-06-13 05:38:50,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 2832252928. Throughput: 0: 48834.7. Samples: 2361014540. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-13 05:38:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:38:53,469][71000] Updated weights for policy 0, policy_version 172874 (0.0023) [2024-06-13 05:38:55,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2832498688. Throughput: 0: 49101.8. Samples: 2361309660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:38:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:38:56,815][71000] Updated weights for policy 0, policy_version 172884 (0.0025) [2024-06-13 05:39:00,081][71000] Updated weights for policy 0, policy_version 172894 (0.0020) [2024-06-13 05:39:00,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 2832711680. Throughput: 0: 49033.5. Samples: 2361600100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:39:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:39:03,729][71000] Updated weights for policy 0, policy_version 172904 (0.0038) [2024-06-13 05:39:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 2832990208. Throughput: 0: 48803.2. Samples: 2361747180. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:39:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 05:39:06,905][71000] Updated weights for policy 0, policy_version 172914 (0.0036) [2024-06-13 05:39:10,515][71000] Updated weights for policy 0, policy_version 172924 (0.0039) [2024-06-13 05:39:10,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 2833203200. Throughput: 0: 48542.0. Samples: 2362034460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:39:10,949][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:39:13,524][71000] Updated weights for policy 0, policy_version 172934 (0.0030) [2024-06-13 05:39:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2833481728. Throughput: 0: 48841.8. Samples: 2362333440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:39:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:39:17,144][71000] Updated weights for policy 0, policy_version 172944 (0.0028) [2024-06-13 05:39:20,228][71000] Updated weights for policy 0, policy_version 172954 (0.0031) [2024-06-13 05:39:20,939][70768] Fps is (10 sec: 49153.0, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 2833694720. Throughput: 0: 48951.8. Samples: 2362476460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:39:20,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 05:39:23,581][71000] Updated weights for policy 0, policy_version 172964 (0.0036) [2024-06-13 05:39:25,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2833956864. Throughput: 0: 48974.7. Samples: 2362775180. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:39:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:39:26,793][71000] Updated weights for policy 0, policy_version 172974 (0.0030) [2024-06-13 05:39:30,168][71000] Updated weights for policy 0, policy_version 172984 (0.0028) [2024-06-13 05:39:30,940][70768] Fps is (10 sec: 50788.6, 60 sec: 49151.8, 300 sec: 48929.8). Total num frames: 2834202624. Throughput: 0: 49124.1. Samples: 2363074160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:39:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 05:39:33,619][71000] Updated weights for policy 0, policy_version 172994 (0.0033) [2024-06-13 05:39:35,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2834464768. Throughput: 0: 49099.4. Samples: 2363224020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:39:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:39:36,922][71000] Updated weights for policy 0, policy_version 173004 (0.0033) [2024-06-13 05:39:40,253][71000] Updated weights for policy 0, policy_version 173014 (0.0027) [2024-06-13 05:39:40,940][70768] Fps is (10 sec: 49153.4, 60 sec: 49425.1, 300 sec: 48818.8). Total num frames: 2834694144. Throughput: 0: 48999.5. Samples: 2363514640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:39:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:39:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000173016_2834694144.pth... [2024-06-13 05:39:41,004][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000172299_2822946816.pth [2024-06-13 05:39:43,522][71000] Updated weights for policy 0, policy_version 173024 (0.0022) [2024-06-13 05:39:45,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 2834923520. Throughput: 0: 48868.9. Samples: 2363799200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:39:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:39:47,138][71000] Updated weights for policy 0, policy_version 173034 (0.0041) [2024-06-13 05:39:50,468][71000] Updated weights for policy 0, policy_version 173044 (0.0025) [2024-06-13 05:39:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 2835185664. Throughput: 0: 48895.1. Samples: 2363947460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:39:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:39:53,715][71000] Updated weights for policy 0, policy_version 173054 (0.0030) [2024-06-13 05:39:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 2835415040. Throughput: 0: 49088.1. Samples: 2364243420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 05:39:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:39:57,095][71000] Updated weights for policy 0, policy_version 173064 (0.0028) [2024-06-13 05:40:00,392][71000] Updated weights for policy 0, policy_version 173074 (0.0023) [2024-06-13 05:40:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 2835660800. Throughput: 0: 49102.7. Samples: 2364543060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 05:40:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:40:03,536][71000] Updated weights for policy 0, policy_version 173084 (0.0031) [2024-06-13 05:40:05,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 2835906560. Throughput: 0: 49143.1. Samples: 2364687900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 05:40:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:40:07,057][71000] Updated weights for policy 0, policy_version 173094 (0.0036) [2024-06-13 05:40:10,158][70980] Signal inference workers to stop experience collection... 
(35300 times) [2024-06-13 05:40:10,214][71000] InferenceWorker_p0-w0: stopping experience collection (35300 times) [2024-06-13 05:40:10,216][70980] Signal inference workers to resume experience collection... (35300 times) [2024-06-13 05:40:10,228][71000] InferenceWorker_p0-w0: resuming experience collection (35300 times) [2024-06-13 05:40:10,375][71000] Updated weights for policy 0, policy_version 173104 (0.0026) [2024-06-13 05:40:10,939][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.3, 300 sec: 49040.9). Total num frames: 2836185088. Throughput: 0: 49011.6. Samples: 2364980700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 05:40:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:40:13,904][71000] Updated weights for policy 0, policy_version 173114 (0.0033) [2024-06-13 05:40:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 2836381696. Throughput: 0: 48668.7. Samples: 2365264240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 05:40:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:40:17,155][71000] Updated weights for policy 0, policy_version 173124 (0.0031) [2024-06-13 05:40:20,638][71000] Updated weights for policy 0, policy_version 173134 (0.0028) [2024-06-13 05:40:20,941][70768] Fps is (10 sec: 45869.6, 60 sec: 49151.0, 300 sec: 48929.6). Total num frames: 2836643840. Throughput: 0: 48669.1. Samples: 2365414180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 05:40:20,941][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:40:23,729][71000] Updated weights for policy 0, policy_version 173144 (0.0032) [2024-06-13 05:40:25,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 2836889600. Throughput: 0: 48661.3. Samples: 2365704400. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 05:40:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:40:27,199][71000] Updated weights for policy 0, policy_version 173154 (0.0026) [2024-06-13 05:40:30,625][71000] Updated weights for policy 0, policy_version 173164 (0.0030) [2024-06-13 05:40:30,940][70768] Fps is (10 sec: 49157.9, 60 sec: 48879.2, 300 sec: 48929.9). Total num frames: 2837135360. Throughput: 0: 49004.5. Samples: 2366004400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 05:40:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:40:34,182][71000] Updated weights for policy 0, policy_version 173174 (0.0036) [2024-06-13 05:40:35,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48333.0, 300 sec: 48874.3). Total num frames: 2837364736. Throughput: 0: 48862.8. Samples: 2366146280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 05:40:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:40:37,113][71000] Updated weights for policy 0, policy_version 173184 (0.0037) [2024-06-13 05:40:40,676][71000] Updated weights for policy 0, policy_version 173194 (0.0027) [2024-06-13 05:40:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 2837626880. Throughput: 0: 48796.1. Samples: 2366439240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 05:40:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:40:43,676][71000] Updated weights for policy 0, policy_version 173204 (0.0031) [2024-06-13 05:40:45,939][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 2837872640. Throughput: 0: 48745.0. Samples: 2366736580. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 05:40:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:40:47,283][71000] Updated weights for policy 0, policy_version 173214 (0.0032) [2024-06-13 05:40:50,479][71000] Updated weights for policy 0, policy_version 173224 (0.0028) [2024-06-13 05:40:50,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 49041.0). Total num frames: 2838118400. Throughput: 0: 48842.2. Samples: 2366885800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 05:40:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:40:53,811][71000] Updated weights for policy 0, policy_version 173234 (0.0034) [2024-06-13 05:40:55,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 2838347776. Throughput: 0: 48767.1. Samples: 2367175220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 05:40:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:40:57,259][71000] Updated weights for policy 0, policy_version 173244 (0.0030) [2024-06-13 05:41:00,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 2838577152. Throughput: 0: 48930.7. Samples: 2367466120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 05:41:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:41:00,949][71000] Updated weights for policy 0, policy_version 173254 (0.0027) [2024-06-13 05:41:03,868][71000] Updated weights for policy 0, policy_version 173264 (0.0031) [2024-06-13 05:41:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2838839296. Throughput: 0: 49011.4. Samples: 2367619640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-13 05:41:05,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 05:41:07,254][70980] Signal inference workers to stop experience collection... 
(35350 times) [2024-06-13 05:41:07,257][70980] Signal inference workers to resume experience collection... (35350 times) [2024-06-13 05:41:07,264][71000] InferenceWorker_p0-w0: stopping experience collection (35350 times) [2024-06-13 05:41:07,295][71000] InferenceWorker_p0-w0: resuming experience collection (35350 times) [2024-06-13 05:41:07,396][71000] Updated weights for policy 0, policy_version 173274 (0.0025) [2024-06-13 05:41:10,607][71000] Updated weights for policy 0, policy_version 173284 (0.0026) [2024-06-13 05:41:10,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 2839101440. Throughput: 0: 49050.6. Samples: 2367911680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-13 05:41:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:41:14,246][71000] Updated weights for policy 0, policy_version 173294 (0.0029) [2024-06-13 05:41:15,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 2839330816. Throughput: 0: 48831.1. Samples: 2368201800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-13 05:41:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:41:17,299][71000] Updated weights for policy 0, policy_version 173304 (0.0033) [2024-06-13 05:41:20,814][71000] Updated weights for policy 0, policy_version 173314 (0.0035) [2024-06-13 05:41:20,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.9, 300 sec: 48929.8). Total num frames: 2839576576. Throughput: 0: 48929.2. Samples: 2368348100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-13 05:41:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:41:23,975][71000] Updated weights for policy 0, policy_version 173324 (0.0028) [2024-06-13 05:41:25,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2839822336. Throughput: 0: 49025.7. Samples: 2368645400. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-13 05:41:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:41:27,543][71000] Updated weights for policy 0, policy_version 173334 (0.0030) [2024-06-13 05:41:30,437][71000] Updated weights for policy 0, policy_version 173344 (0.0028) [2024-06-13 05:41:30,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2840084480. Throughput: 0: 49087.9. Samples: 2368945540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-13 05:41:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:41:33,778][71000] Updated weights for policy 0, policy_version 173354 (0.0029) [2024-06-13 05:41:35,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 2840330240. Throughput: 0: 49201.8. Samples: 2369099880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-13 05:41:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:41:37,112][71000] Updated weights for policy 0, policy_version 173364 (0.0034) [2024-06-13 05:41:40,289][71000] Updated weights for policy 0, policy_version 173374 (0.0022) [2024-06-13 05:41:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2840559616. Throughput: 0: 49174.6. Samples: 2369388080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-13 05:41:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:41:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000173374_2840559616.pth... [2024-06-13 05:41:40,991][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000172658_2828828672.pth [2024-06-13 05:41:44,114][71000] Updated weights for policy 0, policy_version 173384 (0.0023) [2024-06-13 05:41:45,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 2840805376. Throughput: 0: 49392.4. Samples: 2369688780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-13 05:41:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:41:47,378][71000] Updated weights for policy 0, policy_version 173394 (0.0022) [2024-06-13 05:41:50,482][71000] Updated weights for policy 0, policy_version 173404 (0.0033) [2024-06-13 05:41:50,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2841067520. Throughput: 0: 49287.6. Samples: 2369837580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-13 05:41:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:41:53,687][71000] Updated weights for policy 0, policy_version 173414 (0.0029) [2024-06-13 05:41:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 2841313280. Throughput: 0: 49345.3. Samples: 2370132220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-13 05:41:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:41:57,130][71000] Updated weights for policy 0, policy_version 173424 (0.0026) [2024-06-13 05:42:00,623][71000] Updated weights for policy 0, policy_version 173434 (0.0029) [2024-06-13 05:42:00,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 2841542656. Throughput: 0: 49485.7. Samples: 2370428660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-13 05:42:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:42:03,867][71000] Updated weights for policy 0, policy_version 173444 (0.0027) [2024-06-13 05:42:05,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2841804800. Throughput: 0: 49451.6. Samples: 2370573420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-13 05:42:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:42:07,417][71000] Updated weights for policy 0, policy_version 173454 (0.0033) [2024-06-13 05:42:10,570][71000] Updated weights for policy 0, policy_version 173464 (0.0025) [2024-06-13 05:42:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 48986.1). Total num frames: 2842034176. Throughput: 0: 49398.3. Samples: 2370868320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 05:42:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:42:13,822][71000] Updated weights for policy 0, policy_version 173474 (0.0031) [2024-06-13 05:42:15,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 2842296320. Throughput: 0: 49375.2. Samples: 2371167420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 05:42:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:42:17,103][71000] Updated weights for policy 0, policy_version 173484 (0.0021) [2024-06-13 05:42:20,505][71000] Updated weights for policy 0, policy_version 173494 (0.0029) [2024-06-13 05:42:20,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 2842542080. Throughput: 0: 49134.7. Samples: 2371310940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 05:42:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:42:23,962][71000] Updated weights for policy 0, policy_version 173504 (0.0035) [2024-06-13 05:42:25,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2842787840. Throughput: 0: 49142.3. Samples: 2371599480. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 05:42:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:42:27,118][71000] Updated weights for policy 0, policy_version 173514 (0.0035) [2024-06-13 05:42:30,751][71000] Updated weights for policy 0, policy_version 173524 (0.0036) [2024-06-13 05:42:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2843017216. Throughput: 0: 49032.5. Samples: 2371895240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 05:42:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:42:33,980][71000] Updated weights for policy 0, policy_version 173534 (0.0027) [2024-06-13 05:42:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2843279360. Throughput: 0: 49022.7. Samples: 2372043600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 05:42:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:42:37,153][71000] Updated weights for policy 0, policy_version 173544 (0.0027) [2024-06-13 05:42:40,426][71000] Updated weights for policy 0, policy_version 173554 (0.0029) [2024-06-13 05:42:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2843525120. Throughput: 0: 49039.1. Samples: 2372338980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 05:42:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:42:43,825][71000] Updated weights for policy 0, policy_version 173564 (0.0025) [2024-06-13 05:42:45,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2843754496. Throughput: 0: 49046.7. Samples: 2372635760. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 05:42:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:42:47,250][71000] Updated weights for policy 0, policy_version 173574 (0.0022) [2024-06-13 05:42:48,244][70980] Signal inference workers to stop experience collection... (35400 times) [2024-06-13 05:42:48,245][70980] Signal inference workers to resume experience collection... (35400 times) [2024-06-13 05:42:48,285][71000] InferenceWorker_p0-w0: stopping experience collection (35400 times) [2024-06-13 05:42:48,285][71000] InferenceWorker_p0-w0: resuming experience collection (35400 times) [2024-06-13 05:42:50,672][71000] Updated weights for policy 0, policy_version 173584 (0.0030) [2024-06-13 05:42:50,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2844000256. Throughput: 0: 49129.6. Samples: 2372784260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 05:42:50,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 05:42:53,890][71000] Updated weights for policy 0, policy_version 173594 (0.0042) [2024-06-13 05:42:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2844262400. Throughput: 0: 48921.3. Samples: 2373069780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 05:42:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:42:57,468][71000] Updated weights for policy 0, policy_version 173604 (0.0026) [2024-06-13 05:43:00,638][71000] Updated weights for policy 0, policy_version 173614 (0.0031) [2024-06-13 05:43:00,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2844491776. Throughput: 0: 48839.1. Samples: 2373365180. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 05:43:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:43:03,867][71000] Updated weights for policy 0, policy_version 173624 (0.0036) [2024-06-13 05:43:05,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2844737536. Throughput: 0: 49044.0. Samples: 2373517920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 05:43:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:43:06,970][71000] Updated weights for policy 0, policy_version 173634 (0.0022) [2024-06-13 05:43:10,382][71000] Updated weights for policy 0, policy_version 173644 (0.0027) [2024-06-13 05:43:10,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 2844999680. Throughput: 0: 49416.3. Samples: 2373823220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 05:43:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:43:13,829][71000] Updated weights for policy 0, policy_version 173654 (0.0030) [2024-06-13 05:43:15,943][70768] Fps is (10 sec: 49132.9, 60 sec: 48875.7, 300 sec: 48984.8). Total num frames: 2845229056. Throughput: 0: 48988.7. Samples: 2374099920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 05:43:15,944][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:43:17,311][71000] Updated weights for policy 0, policy_version 173664 (0.0022) [2024-06-13 05:43:20,730][71000] Updated weights for policy 0, policy_version 173674 (0.0028) [2024-06-13 05:43:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.8, 300 sec: 48985.3). Total num frames: 2845474816. Throughput: 0: 48911.0. Samples: 2374244600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 05:43:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:43:24,203][71000] Updated weights for policy 0, policy_version 173684 (0.0029) [2024-06-13 05:43:25,939][70768] Fps is (10 sec: 47532.1, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2845704192. Throughput: 0: 48877.4. Samples: 2374538460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 05:43:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:43:27,198][71000] Updated weights for policy 0, policy_version 173694 (0.0032) [2024-06-13 05:43:30,924][71000] Updated weights for policy 0, policy_version 173704 (0.0027) [2024-06-13 05:43:30,939][70768] Fps is (10 sec: 49153.1, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2845966336. Throughput: 0: 49005.0. Samples: 2374840980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 05:43:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:43:33,838][71000] Updated weights for policy 0, policy_version 173714 (0.0026) [2024-06-13 05:43:35,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2846212096. Throughput: 0: 48720.4. Samples: 2374976680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 05:43:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:43:37,574][71000] Updated weights for policy 0, policy_version 173724 (0.0028) [2024-06-13 05:43:40,582][71000] Updated weights for policy 0, policy_version 173734 (0.0026) [2024-06-13 05:43:40,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2846457856. Throughput: 0: 49049.7. Samples: 2375277020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 05:43:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:43:41,060][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000173735_2846474240.pth... 
[2024-06-13 05:43:41,107][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000173016_2834694144.pth [2024-06-13 05:43:44,332][71000] Updated weights for policy 0, policy_version 173744 (0.0036) [2024-06-13 05:43:45,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2846703616. Throughput: 0: 49080.3. Samples: 2375573800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 05:43:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:43:47,334][71000] Updated weights for policy 0, policy_version 173754 (0.0024) [2024-06-13 05:43:50,895][71000] Updated weights for policy 0, policy_version 173764 (0.0028) [2024-06-13 05:43:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2846949376. Throughput: 0: 48676.8. Samples: 2375708380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 05:43:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:43:54,323][71000] Updated weights for policy 0, policy_version 173774 (0.0025) [2024-06-13 05:43:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2847195136. Throughput: 0: 48530.7. Samples: 2376007100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 05:43:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:43:57,722][71000] Updated weights for policy 0, policy_version 173784 (0.0031) [2024-06-13 05:44:00,847][71000] Updated weights for policy 0, policy_version 173794 (0.0025) [2024-06-13 05:44:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 2847440896. Throughput: 0: 48852.5. Samples: 2376298100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 05:44:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:44:01,509][70980] Signal inference workers to stop experience collection... 
(35450 times) [2024-06-13 05:44:01,540][71000] InferenceWorker_p0-w0: stopping experience collection (35450 times) [2024-06-13 05:44:01,570][70980] Signal inference workers to resume experience collection... (35450 times) [2024-06-13 05:44:01,571][71000] InferenceWorker_p0-w0: resuming experience collection (35450 times) [2024-06-13 05:44:04,390][71000] Updated weights for policy 0, policy_version 173804 (0.0026) [2024-06-13 05:44:05,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48879.0, 300 sec: 49041.0). Total num frames: 2847670272. Throughput: 0: 49099.8. Samples: 2376454080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 05:44:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:44:07,617][71000] Updated weights for policy 0, policy_version 173814 (0.0030) [2024-06-13 05:44:10,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 2847916032. Throughput: 0: 49133.2. Samples: 2376749460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 05:44:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:44:11,102][71000] Updated weights for policy 0, policy_version 173824 (0.0027) [2024-06-13 05:44:14,223][71000] Updated weights for policy 0, policy_version 173834 (0.0032) [2024-06-13 05:44:15,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49155.1, 300 sec: 49096.4). Total num frames: 2848178176. Throughput: 0: 48787.9. Samples: 2377036440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 05:44:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:44:17,602][71000] Updated weights for policy 0, policy_version 173844 (0.0023) [2024-06-13 05:44:20,782][71000] Updated weights for policy 0, policy_version 173854 (0.0027) [2024-06-13 05:44:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2848423936. Throughput: 0: 49059.1. Samples: 2377184340. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 05:44:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:44:24,864][71000] Updated weights for policy 0, policy_version 173864 (0.0033) [2024-06-13 05:44:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2848653312. Throughput: 0: 49130.8. Samples: 2377487900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 05:44:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:44:27,515][71000] Updated weights for policy 0, policy_version 173874 (0.0030) [2024-06-13 05:44:30,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 2848882688. Throughput: 0: 48843.6. Samples: 2377771760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 05:44:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:44:31,439][71000] Updated weights for policy 0, policy_version 173884 (0.0036) [2024-06-13 05:44:34,103][71000] Updated weights for policy 0, policy_version 173894 (0.0028) [2024-06-13 05:44:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2849144832. Throughput: 0: 49060.5. Samples: 2377916100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 05:44:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:44:38,431][71000] Updated weights for policy 0, policy_version 173904 (0.0022) [2024-06-13 05:44:40,734][71000] Updated weights for policy 0, policy_version 173914 (0.0024) [2024-06-13 05:44:40,941][70768] Fps is (10 sec: 52418.9, 60 sec: 49150.5, 300 sec: 49096.1). Total num frames: 2849406976. Throughput: 0: 48939.8. Samples: 2378209480. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 05:44:40,942][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:44:45,190][71000] Updated weights for policy 0, policy_version 173924 (0.0022) [2024-06-13 05:44:45,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2849652736. Throughput: 0: 49259.7. Samples: 2378514780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 05:44:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:44:47,598][71000] Updated weights for policy 0, policy_version 173934 (0.0020) [2024-06-13 05:44:50,940][70768] Fps is (10 sec: 45883.6, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2849865728. Throughput: 0: 48879.4. Samples: 2378653660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 05:44:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:44:51,615][71000] Updated weights for policy 0, policy_version 173944 (0.0037) [2024-06-13 05:44:54,035][71000] Updated weights for policy 0, policy_version 173954 (0.0029) [2024-06-13 05:44:55,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2850144256. Throughput: 0: 48725.4. Samples: 2378942100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 05:44:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:44:58,506][71000] Updated weights for policy 0, policy_version 173964 (0.0036) [2024-06-13 05:45:00,758][71000] Updated weights for policy 0, policy_version 173974 (0.0030) [2024-06-13 05:45:00,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2850390016. Throughput: 0: 48936.0. Samples: 2379238560. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 05:45:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:45:05,117][71000] Updated weights for policy 0, policy_version 173984 (0.0029) [2024-06-13 05:45:05,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 2850619392. Throughput: 0: 48899.5. Samples: 2379384820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 05:45:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:45:05,942][70980] Signal inference workers to stop experience collection... (35500 times) [2024-06-13 05:45:05,942][70980] Signal inference workers to resume experience collection... (35500 times) [2024-06-13 05:45:05,975][71000] InferenceWorker_p0-w0: stopping experience collection (35500 times) [2024-06-13 05:45:05,975][71000] InferenceWorker_p0-w0: resuming experience collection (35500 times) [2024-06-13 05:45:07,730][71000] Updated weights for policy 0, policy_version 173994 (0.0026) [2024-06-13 05:45:10,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 2850848768. Throughput: 0: 48571.4. Samples: 2379673620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 05:45:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:45:11,876][71000] Updated weights for policy 0, policy_version 174004 (0.0028) [2024-06-13 05:45:14,367][71000] Updated weights for policy 0, policy_version 174014 (0.0027) [2024-06-13 05:45:15,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 48985.6). Total num frames: 2851094528. Throughput: 0: 48793.3. Samples: 2379967460. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 05:45:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:45:18,667][71000] Updated weights for policy 0, policy_version 174024 (0.0031) [2024-06-13 05:45:20,883][71000] Updated weights for policy 0, policy_version 174034 (0.0022) [2024-06-13 05:45:20,939][70768] Fps is (10 sec: 52429.9, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2851373056. Throughput: 0: 48908.1. Samples: 2380116960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 05:45:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:45:25,329][71000] Updated weights for policy 0, policy_version 174044 (0.0028) [2024-06-13 05:45:25,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48605.7, 300 sec: 48929.8). Total num frames: 2851569664. Throughput: 0: 48845.9. Samples: 2380407460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-13 05:45:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:45:27,900][71000] Updated weights for policy 0, policy_version 174054 (0.0033) [2024-06-13 05:45:30,940][70768] Fps is (10 sec: 45874.4, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2851831808. Throughput: 0: 48529.2. Samples: 2380698600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-13 05:45:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:45:32,060][71000] Updated weights for policy 0, policy_version 174064 (0.0030) [2024-06-13 05:45:34,440][71000] Updated weights for policy 0, policy_version 174074 (0.0031) [2024-06-13 05:45:35,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 2852061184. Throughput: 0: 48622.7. Samples: 2380841680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-13 05:45:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:45:38,515][71000] Updated weights for policy 0, policy_version 174084 (0.0026) [2024-06-13 05:45:40,944][70768] Fps is (10 sec: 50769.1, 60 sec: 48877.0, 300 sec: 49040.2). Total num frames: 2852339712. Throughput: 0: 48885.1. Samples: 2381142140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-13 05:45:40,944][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:45:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000174093_2852339712.pth... [2024-06-13 05:45:41,006][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000173374_2840559616.pth [2024-06-13 05:45:41,187][71000] Updated weights for policy 0, policy_version 174094 (0.0024) [2024-06-13 05:45:45,239][71000] Updated weights for policy 0, policy_version 174104 (0.0032) [2024-06-13 05:45:45,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48332.8, 300 sec: 48929.8). Total num frames: 2852552704. Throughput: 0: 48808.5. Samples: 2381434940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-13 05:45:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:45:47,853][71000] Updated weights for policy 0, policy_version 174114 (0.0024) [2024-06-13 05:45:50,940][70768] Fps is (10 sec: 45894.4, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2852798464. Throughput: 0: 48550.6. Samples: 2381569600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-13 05:45:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:45:52,224][71000] Updated weights for policy 0, policy_version 174124 (0.0026) [2024-06-13 05:45:54,817][71000] Updated weights for policy 0, policy_version 174134 (0.0025) [2024-06-13 05:45:55,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48059.6, 300 sec: 48985.4). Total num frames: 2853027840. Throughput: 0: 48561.8. Samples: 2381858900. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-13 05:45:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:45:59,035][71000] Updated weights for policy 0, policy_version 174144 (0.0023) [2024-06-13 05:46:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48332.7, 300 sec: 48985.4). Total num frames: 2853289984. Throughput: 0: 48580.8. Samples: 2382153600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-13 05:46:00,944][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:46:01,478][71000] Updated weights for policy 0, policy_version 174154 (0.0029) [2024-06-13 05:46:05,518][71000] Updated weights for policy 0, policy_version 174164 (0.0025) [2024-06-13 05:46:05,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 2853535744. Throughput: 0: 48598.5. Samples: 2382303900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-13 05:46:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:46:07,948][71000] Updated weights for policy 0, policy_version 174174 (0.0033) [2024-06-13 05:46:10,502][70980] Signal inference workers to stop experience collection... (35550 times) [2024-06-13 05:46:10,532][71000] InferenceWorker_p0-w0: stopping experience collection (35550 times) [2024-06-13 05:46:10,559][70980] Signal inference workers to resume experience collection... (35550 times) [2024-06-13 05:46:10,560][71000] InferenceWorker_p0-w0: resuming experience collection (35550 times) [2024-06-13 05:46:10,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2853781504. Throughput: 0: 48729.9. Samples: 2382600300. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-13 05:46:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:46:12,146][71000] Updated weights for policy 0, policy_version 174184 (0.0026) [2024-06-13 05:46:14,843][71000] Updated weights for policy 0, policy_version 174194 (0.0028) [2024-06-13 05:46:15,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 2854010880. Throughput: 0: 48692.4. Samples: 2382889760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-13 05:46:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:46:18,868][71000] Updated weights for policy 0, policy_version 174204 (0.0037) [2024-06-13 05:46:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48059.7, 300 sec: 48929.9). Total num frames: 2854256640. Throughput: 0: 48732.9. Samples: 2383034660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-13 05:46:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:46:21,756][71000] Updated weights for policy 0, policy_version 174214 (0.0031) [2024-06-13 05:46:25,393][71000] Updated weights for policy 0, policy_version 174224 (0.0020) [2024-06-13 05:46:25,939][70768] Fps is (10 sec: 49153.1, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 2854502400. Throughput: 0: 48635.4. Samples: 2383330520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-13 05:46:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:46:28,234][71000] Updated weights for policy 0, policy_version 174234 (0.0031) [2024-06-13 05:46:30,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48332.7, 300 sec: 48818.7). Total num frames: 2854731776. Throughput: 0: 48589.5. Samples: 2383621480. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-06-13 05:46:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:46:32,293][71000] Updated weights for policy 0, policy_version 174244 (0.0034) [2024-06-13 05:46:35,277][71000] Updated weights for policy 0, policy_version 174254 (0.0024) [2024-06-13 05:46:35,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 2854977536. Throughput: 0: 48666.3. Samples: 2383759580. Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-06-13 05:46:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:46:38,662][71000] Updated weights for policy 0, policy_version 174264 (0.0032) [2024-06-13 05:46:40,940][70768] Fps is (10 sec: 49153.0, 60 sec: 48063.2, 300 sec: 48874.3). Total num frames: 2855223296. Throughput: 0: 48712.6. Samples: 2384050960. Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-06-13 05:46:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:46:42,034][71000] Updated weights for policy 0, policy_version 174274 (0.0036) [2024-06-13 05:46:45,557][71000] Updated weights for policy 0, policy_version 174284 (0.0025) [2024-06-13 05:46:45,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.7, 300 sec: 48818.7). Total num frames: 2855469056. Throughput: 0: 48758.6. Samples: 2384347740. Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-06-13 05:46:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:46:48,739][71000] Updated weights for policy 0, policy_version 174294 (0.0031) [2024-06-13 05:46:50,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 2855714816. Throughput: 0: 48699.1. Samples: 2384495360. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-06-13 05:46:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:46:52,108][71000] Updated weights for policy 0, policy_version 174304 (0.0040) [2024-06-13 05:46:55,313][71000] Updated weights for policy 0, policy_version 174314 (0.0030) [2024-06-13 05:46:55,939][70768] Fps is (10 sec: 49153.2, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 2855960576. Throughput: 0: 48707.2. Samples: 2384792120. Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-06-13 05:46:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:46:58,804][71000] Updated weights for policy 0, policy_version 174324 (0.0027) [2024-06-13 05:47:00,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 2856222720. Throughput: 0: 48829.0. Samples: 2385087060. Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-06-13 05:47:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:47:02,346][71000] Updated weights for policy 0, policy_version 174334 (0.0037) [2024-06-13 05:47:05,351][71000] Updated weights for policy 0, policy_version 174344 (0.0034) [2024-06-13 05:47:05,939][70768] Fps is (10 sec: 49151.7, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 2856452096. Throughput: 0: 48918.3. Samples: 2385235980. Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-06-13 05:47:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:47:08,897][71000] Updated weights for policy 0, policy_version 174354 (0.0032) [2024-06-13 05:47:10,940][70768] Fps is (10 sec: 45874.5, 60 sec: 48332.7, 300 sec: 48763.2). Total num frames: 2856681472. Throughput: 0: 48648.2. Samples: 2385519700. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-06-13 05:47:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:47:12,417][71000] Updated weights for policy 0, policy_version 174364 (0.0030) [2024-06-13 05:47:15,841][71000] Updated weights for policy 0, policy_version 174374 (0.0027) [2024-06-13 05:47:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48879.0, 300 sec: 48818.7). Total num frames: 2856943616. Throughput: 0: 48541.5. Samples: 2385805840. Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-06-13 05:47:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:47:18,655][70980] Signal inference workers to stop experience collection... (35600 times) [2024-06-13 05:47:18,656][70980] Signal inference workers to resume experience collection... (35600 times) [2024-06-13 05:47:18,664][71000] InferenceWorker_p0-w0: stopping experience collection (35600 times) [2024-06-13 05:47:18,664][71000] InferenceWorker_p0-w0: resuming experience collection (35600 times) [2024-06-13 05:47:19,319][71000] Updated weights for policy 0, policy_version 174384 (0.0024) [2024-06-13 05:47:20,940][70768] Fps is (10 sec: 50791.4, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 2857189376. Throughput: 0: 48766.7. Samples: 2385954080. Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-06-13 05:47:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:47:22,751][71000] Updated weights for policy 0, policy_version 174394 (0.0029) [2024-06-13 05:47:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 2857418752. Throughput: 0: 48776.4. Samples: 2386245900. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-06-13 05:47:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:47:25,984][71000] Updated weights for policy 0, policy_version 174404 (0.0038) [2024-06-13 05:47:29,454][71000] Updated weights for policy 0, policy_version 174414 (0.0023) [2024-06-13 05:47:30,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 2857648128. Throughput: 0: 48525.5. Samples: 2386531380. Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-06-13 05:47:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:47:32,856][71000] Updated weights for policy 0, policy_version 174424 (0.0023) [2024-06-13 05:47:35,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 2857893888. Throughput: 0: 48428.9. Samples: 2386674660. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 05:47:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:47:36,458][71000] Updated weights for policy 0, policy_version 174434 (0.0027) [2024-06-13 05:47:39,502][71000] Updated weights for policy 0, policy_version 174444 (0.0033) [2024-06-13 05:47:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 2858156032. Throughput: 0: 48455.9. Samples: 2386972640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 05:47:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:47:41,036][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000174449_2858172416.pth... [2024-06-13 05:47:41,091][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000173735_2846474240.pth [2024-06-13 05:47:43,051][71000] Updated weights for policy 0, policy_version 174454 (0.0021) [2024-06-13 05:47:45,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 2858401792. Throughput: 0: 48325.4. Samples: 2387261700. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 05:47:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:47:46,212][71000] Updated weights for policy 0, policy_version 174464 (0.0024) [2024-06-13 05:47:50,086][71000] Updated weights for policy 0, policy_version 174474 (0.0029) [2024-06-13 05:47:50,940][70768] Fps is (10 sec: 47511.5, 60 sec: 48605.6, 300 sec: 48707.6). Total num frames: 2858631168. Throughput: 0: 48122.6. Samples: 2387401520. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 05:47:50,941][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:47:52,828][71000] Updated weights for policy 0, policy_version 174484 (0.0026) [2024-06-13 05:47:55,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48332.6, 300 sec: 48707.7). Total num frames: 2858860544. Throughput: 0: 48356.5. Samples: 2387695740. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 05:47:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:47:56,731][71000] Updated weights for policy 0, policy_version 174494 (0.0035) [2024-06-13 05:47:59,661][71000] Updated weights for policy 0, policy_version 174504 (0.0030) [2024-06-13 05:48:00,939][70768] Fps is (10 sec: 49154.5, 60 sec: 48332.9, 300 sec: 48763.2). Total num frames: 2859122688. Throughput: 0: 48395.7. Samples: 2387983640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 05:48:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:48:03,417][71000] Updated weights for policy 0, policy_version 174514 (0.0023) [2024-06-13 05:48:05,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 2859368448. Throughput: 0: 48745.3. Samples: 2388147620. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 05:48:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:48:06,288][71000] Updated weights for policy 0, policy_version 174524 (0.0030) [2024-06-13 05:48:09,940][71000] Updated weights for policy 0, policy_version 174534 (0.0024) [2024-06-13 05:48:10,939][70768] Fps is (10 sec: 45875.2, 60 sec: 48333.0, 300 sec: 48652.8). Total num frames: 2859581440. Throughput: 0: 48687.2. Samples: 2388436820. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 05:48:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:48:12,844][71000] Updated weights for policy 0, policy_version 174544 (0.0029) [2024-06-13 05:48:15,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48059.7, 300 sec: 48652.2). Total num frames: 2859827200. Throughput: 0: 48651.0. Samples: 2388720680. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 05:48:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:48:16,880][71000] Updated weights for policy 0, policy_version 174554 (0.0028) [2024-06-13 05:48:19,826][71000] Updated weights for policy 0, policy_version 174564 (0.0023) [2024-06-13 05:48:20,939][70768] Fps is (10 sec: 52428.6, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 2860105728. Throughput: 0: 48817.1. Samples: 2388871420. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 05:48:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:48:23,374][71000] Updated weights for policy 0, policy_version 174574 (0.0027) [2024-06-13 05:48:25,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 2860335104. Throughput: 0: 48704.4. Samples: 2389164340. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 05:48:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:48:26,725][71000] Updated weights for policy 0, policy_version 174584 (0.0025) [2024-06-13 05:48:27,205][70980] Signal inference workers to stop experience collection... (35650 times) [2024-06-13 05:48:27,247][71000] InferenceWorker_p0-w0: stopping experience collection (35650 times) [2024-06-13 05:48:27,253][70980] Signal inference workers to resume experience collection... (35650 times) [2024-06-13 05:48:27,258][71000] InferenceWorker_p0-w0: resuming experience collection (35650 times) [2024-06-13 05:48:30,246][71000] Updated weights for policy 0, policy_version 174594 (0.0029) [2024-06-13 05:48:30,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 2860564480. Throughput: 0: 48629.8. Samples: 2389450040. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 05:48:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:48:33,314][71000] Updated weights for policy 0, policy_version 174604 (0.0025) [2024-06-13 05:48:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48652.2). Total num frames: 2860810240. Throughput: 0: 48665.8. Samples: 2389591460. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 05:48:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:48:37,113][71000] Updated weights for policy 0, policy_version 174614 (0.0034) [2024-06-13 05:48:39,769][71000] Updated weights for policy 0, policy_version 174624 (0.0029) [2024-06-13 05:48:40,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 2861088768. Throughput: 0: 48966.8. Samples: 2389899240. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:48:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:48:43,607][71000] Updated weights for policy 0, policy_version 174634 (0.0022) [2024-06-13 05:48:45,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48332.8, 300 sec: 48652.2). Total num frames: 2861301760. Throughput: 0: 48985.7. Samples: 2390188000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:48:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 05:48:46,642][71000] Updated weights for policy 0, policy_version 174644 (0.0023) [2024-06-13 05:48:50,316][71000] Updated weights for policy 0, policy_version 174654 (0.0032) [2024-06-13 05:48:50,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48606.3, 300 sec: 48652.2). Total num frames: 2861547520. Throughput: 0: 48429.9. Samples: 2390326960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:48:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:48:53,479][71000] Updated weights for policy 0, policy_version 174664 (0.0025) [2024-06-13 05:48:55,940][70768] Fps is (10 sec: 49150.4, 60 sec: 48878.8, 300 sec: 48652.1). Total num frames: 2861793280. Throughput: 0: 48623.6. Samples: 2390624900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:48:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:48:57,065][71000] Updated weights for policy 0, policy_version 174674 (0.0031) [2024-06-13 05:49:00,033][71000] Updated weights for policy 0, policy_version 174684 (0.0042) [2024-06-13 05:49:00,939][70768] Fps is (10 sec: 54067.0, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 2862088192. Throughput: 0: 48985.1. Samples: 2390925000. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:49:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:49:03,767][71000] Updated weights for policy 0, policy_version 174694 (0.0026) [2024-06-13 05:49:05,940][70768] Fps is (10 sec: 49153.6, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 2862284800. Throughput: 0: 49076.9. Samples: 2391079880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:49:05,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-13 05:49:06,770][71000] Updated weights for policy 0, policy_version 174704 (0.0025) [2024-06-13 05:49:10,358][71000] Updated weights for policy 0, policy_version 174714 (0.0033) [2024-06-13 05:49:10,939][70768] Fps is (10 sec: 44237.0, 60 sec: 49152.0, 300 sec: 48652.2). Total num frames: 2862530560. Throughput: 0: 49026.8. Samples: 2391370540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:49:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 05:49:13,238][71000] Updated weights for policy 0, policy_version 174724 (0.0033) [2024-06-13 05:49:15,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.2, 300 sec: 48707.7). Total num frames: 2862792704. Throughput: 0: 49125.8. Samples: 2391660700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:49:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:49:16,925][71000] Updated weights for policy 0, policy_version 174734 (0.0027) [2024-06-13 05:49:19,968][71000] Updated weights for policy 0, policy_version 174744 (0.0027) [2024-06-13 05:49:20,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49425.1, 300 sec: 48874.3). Total num frames: 2863071232. Throughput: 0: 49384.5. Samples: 2391813760. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:49:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:49:24,039][71000] Updated weights for policy 0, policy_version 174754 (0.0028) [2024-06-13 05:49:25,940][70768] Fps is (10 sec: 47512.6, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 2863267840. Throughput: 0: 49157.2. Samples: 2392111320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:49:25,941][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:49:26,571][71000] Updated weights for policy 0, policy_version 174764 (0.0027) [2024-06-13 05:49:30,473][71000] Updated weights for policy 0, policy_version 174774 (0.0031) [2024-06-13 05:49:30,940][70768] Fps is (10 sec: 44236.0, 60 sec: 49151.9, 300 sec: 48707.7). Total num frames: 2863513600. Throughput: 0: 49159.3. Samples: 2392400180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:49:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:49:31,916][70980] Signal inference workers to stop experience collection... (35700 times) [2024-06-13 05:49:31,919][70980] Signal inference workers to resume experience collection... (35700 times) [2024-06-13 05:49:31,940][71000] InferenceWorker_p0-w0: stopping experience collection (35700 times) [2024-06-13 05:49:31,941][71000] InferenceWorker_p0-w0: resuming experience collection (35700 times) [2024-06-13 05:49:33,456][71000] Updated weights for policy 0, policy_version 174784 (0.0028) [2024-06-13 05:49:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49151.9, 300 sec: 48652.4). Total num frames: 2863759360. Throughput: 0: 49179.4. Samples: 2392540040. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:49:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:49:37,355][71000] Updated weights for policy 0, policy_version 174794 (0.0029) [2024-06-13 05:49:40,135][71000] Updated weights for policy 0, policy_version 174804 (0.0032) [2024-06-13 05:49:40,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 2864037888. Throughput: 0: 49127.9. Samples: 2392835640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 05:49:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:49:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000174807_2864037888.pth... [2024-06-13 05:49:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000174093_2852339712.pth [2024-06-13 05:49:44,065][71000] Updated weights for policy 0, policy_version 174814 (0.0024) [2024-06-13 05:49:45,939][70768] Fps is (10 sec: 47514.3, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 2864234496. Throughput: 0: 49017.3. Samples: 2393130780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:49:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:49:46,698][71000] Updated weights for policy 0, policy_version 174824 (0.0021) [2024-06-13 05:49:50,244][71000] Updated weights for policy 0, policy_version 174834 (0.0027) [2024-06-13 05:49:50,939][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.0, 300 sec: 48652.2). Total num frames: 2864496640. Throughput: 0: 48809.8. Samples: 2393276320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:49:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:49:53,261][71000] Updated weights for policy 0, policy_version 174844 (0.0026) [2024-06-13 05:49:55,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49152.2, 300 sec: 48652.1). Total num frames: 2864742400. Throughput: 0: 48931.4. Samples: 2393572460. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:49:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:49:56,976][71000] Updated weights for policy 0, policy_version 174854 (0.0035) [2024-06-13 05:50:00,102][71000] Updated weights for policy 0, policy_version 174864 (0.0030) [2024-06-13 05:50:00,939][70768] Fps is (10 sec: 52428.7, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 2865020928. Throughput: 0: 49284.0. Samples: 2393878480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:50:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:50:03,818][71000] Updated weights for policy 0, policy_version 174874 (0.0032) [2024-06-13 05:50:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 48763.3). Total num frames: 2865233920. Throughput: 0: 49135.5. Samples: 2394024860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:50:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:50:06,730][71000] Updated weights for policy 0, policy_version 174884 (0.0028) [2024-06-13 05:50:10,378][71000] Updated weights for policy 0, policy_version 174894 (0.0027) [2024-06-13 05:50:10,939][70768] Fps is (10 sec: 44236.8, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 2865463296. Throughput: 0: 48834.9. Samples: 2394308880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:50:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:50:13,567][71000] Updated weights for policy 0, policy_version 174904 (0.0029) [2024-06-13 05:50:15,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.0, 300 sec: 48652.2). Total num frames: 2865725440. Throughput: 0: 48706.5. Samples: 2394591960. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:50:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:50:17,454][71000] Updated weights for policy 0, policy_version 174914 (0.0023) [2024-06-13 05:50:20,531][71000] Updated weights for policy 0, policy_version 174924 (0.0030) [2024-06-13 05:50:20,940][70768] Fps is (10 sec: 50789.1, 60 sec: 48332.6, 300 sec: 48818.8). Total num frames: 2865971200. Throughput: 0: 49016.3. Samples: 2394745780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:50:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:50:23,954][71000] Updated weights for policy 0, policy_version 174934 (0.0031) [2024-06-13 05:50:25,939][70768] Fps is (10 sec: 45875.1, 60 sec: 48606.0, 300 sec: 48652.2). Total num frames: 2866184192. Throughput: 0: 48918.3. Samples: 2395036960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:50:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:50:27,039][71000] Updated weights for policy 0, policy_version 174944 (0.0027) [2024-06-13 05:50:30,746][71000] Updated weights for policy 0, policy_version 174954 (0.0025) [2024-06-13 05:50:30,940][70768] Fps is (10 sec: 47514.6, 60 sec: 48879.1, 300 sec: 48763.2). Total num frames: 2866446336. Throughput: 0: 48865.7. Samples: 2395329740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:50:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:50:33,535][71000] Updated weights for policy 0, policy_version 174964 (0.0036) [2024-06-13 05:50:35,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49152.0, 300 sec: 48708.4). Total num frames: 2866708480. Throughput: 0: 48757.6. Samples: 2395470420. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:50:35,952][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:50:37,278][71000] Updated weights for policy 0, policy_version 174974 (0.0027) [2024-06-13 05:50:40,594][71000] Updated weights for policy 0, policy_version 174984 (0.0034) [2024-06-13 05:50:40,944][70768] Fps is (10 sec: 50768.4, 60 sec: 48602.4, 300 sec: 48818.0). Total num frames: 2866954240. Throughput: 0: 48743.4. Samples: 2395766120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:50:40,944][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:50:43,865][71000] Updated weights for policy 0, policy_version 174994 (0.0039) [2024-06-13 05:50:45,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 2867167232. Throughput: 0: 48403.0. Samples: 2396056620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 05:50:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:50:47,159][71000] Updated weights for policy 0, policy_version 175004 (0.0032) [2024-06-13 05:50:50,765][71000] Updated weights for policy 0, policy_version 175014 (0.0033) [2024-06-13 05:50:50,940][70768] Fps is (10 sec: 47534.3, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 2867429376. Throughput: 0: 48200.5. Samples: 2396193880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 05:50:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:50:53,410][70980] Signal inference workers to stop experience collection... (35750 times) [2024-06-13 05:50:53,413][70980] Signal inference workers to resume experience collection... 
(35750 times) [2024-06-13 05:50:53,432][71000] InferenceWorker_p0-w0: stopping experience collection (35750 times) [2024-06-13 05:50:53,432][71000] InferenceWorker_p0-w0: resuming experience collection (35750 times) [2024-06-13 05:50:53,719][71000] Updated weights for policy 0, policy_version 175024 (0.0025) [2024-06-13 05:50:55,939][70768] Fps is (10 sec: 50791.2, 60 sec: 48879.1, 300 sec: 48763.3). Total num frames: 2867675136. Throughput: 0: 48517.4. Samples: 2396492160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 05:50:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:50:57,548][71000] Updated weights for policy 0, policy_version 175034 (0.0026) [2024-06-13 05:51:00,539][71000] Updated weights for policy 0, policy_version 175044 (0.0025) [2024-06-13 05:51:00,939][70768] Fps is (10 sec: 50790.6, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 2867937280. Throughput: 0: 48865.3. Samples: 2396790900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 05:51:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:51:04,356][71000] Updated weights for policy 0, policy_version 175054 (0.0018) [2024-06-13 05:51:05,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 2868166656. Throughput: 0: 48789.1. Samples: 2396941280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 05:51:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:51:07,257][71000] Updated weights for policy 0, policy_version 175064 (0.0025) [2024-06-13 05:51:10,939][70768] Fps is (10 sec: 44236.9, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 2868379648. Throughput: 0: 48802.7. Samples: 2397233080. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 05:51:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:51:11,267][71000] Updated weights for policy 0, policy_version 175074 (0.0030) [2024-06-13 05:51:13,861][71000] Updated weights for policy 0, policy_version 175084 (0.0036) [2024-06-13 05:51:15,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 2868674560. Throughput: 0: 48678.6. Samples: 2397520280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 05:51:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:51:17,747][71000] Updated weights for policy 0, policy_version 175094 (0.0027) [2024-06-13 05:51:20,642][71000] Updated weights for policy 0, policy_version 175104 (0.0023) [2024-06-13 05:51:20,940][70768] Fps is (10 sec: 54066.3, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 2868920320. Throughput: 0: 49021.4. Samples: 2397676380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 05:51:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:51:24,399][71000] Updated weights for policy 0, policy_version 175114 (0.0034) [2024-06-13 05:51:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.0, 300 sec: 48929.9). Total num frames: 2869166080. Throughput: 0: 49342.0. Samples: 2397986300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 05:51:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:51:26,953][71000] Updated weights for policy 0, policy_version 175124 (0.0025) [2024-06-13 05:51:30,764][71000] Updated weights for policy 0, policy_version 175134 (0.0029) [2024-06-13 05:51:30,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 2869395456. Throughput: 0: 49254.2. Samples: 2398273060. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 05:51:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:51:33,899][71000] Updated weights for policy 0, policy_version 175144 (0.0024) [2024-06-13 05:51:35,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 2869657600. Throughput: 0: 49533.2. Samples: 2398422880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 05:51:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:51:37,644][71000] Updated weights for policy 0, policy_version 175154 (0.0034) [2024-06-13 05:51:40,442][71000] Updated weights for policy 0, policy_version 175164 (0.0026) [2024-06-13 05:51:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48882.5, 300 sec: 48874.3). Total num frames: 2869886976. Throughput: 0: 49419.4. Samples: 2398716040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 05:51:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 05:51:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000175165_2869903360.pth... [2024-06-13 05:51:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000174449_2858172416.pth [2024-06-13 05:51:44,219][71000] Updated weights for policy 0, policy_version 175174 (0.0025) [2024-06-13 05:51:45,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49698.1, 300 sec: 48929.9). Total num frames: 2870149120. Throughput: 0: 49283.9. Samples: 2399008680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 05:51:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:51:47,329][71000] Updated weights for policy 0, policy_version 175184 (0.0026) [2024-06-13 05:51:50,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 48818.7). Total num frames: 2870362112. Throughput: 0: 49297.4. Samples: 2399159660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 05:51:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:51:51,168][71000] Updated weights for policy 0, policy_version 175194 (0.0035) [2024-06-13 05:51:54,203][71000] Updated weights for policy 0, policy_version 175204 (0.0023) [2024-06-13 05:51:55,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 2870640640. Throughput: 0: 49277.8. Samples: 2399450580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 05:51:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:51:57,673][71000] Updated weights for policy 0, policy_version 175214 (0.0028) [2024-06-13 05:52:00,935][71000] Updated weights for policy 0, policy_version 175224 (0.0026) [2024-06-13 05:52:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 2870870016. Throughput: 0: 49383.1. Samples: 2399742520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 05:52:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:52:02,255][70980] Signal inference workers to stop experience collection... (35800 times) [2024-06-13 05:52:02,285][71000] InferenceWorker_p0-w0: stopping experience collection (35800 times) [2024-06-13 05:52:02,314][70980] Signal inference workers to resume experience collection... (35800 times) [2024-06-13 05:52:02,314][71000] InferenceWorker_p0-w0: resuming experience collection (35800 times) [2024-06-13 05:52:04,506][71000] Updated weights for policy 0, policy_version 175234 (0.0021) [2024-06-13 05:52:05,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 2871115776. Throughput: 0: 49186.3. Samples: 2399889760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 05:52:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:52:07,813][71000] Updated weights for policy 0, policy_version 175244 (0.0025) [2024-06-13 05:52:10,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 48818.8). Total num frames: 2871345152. Throughput: 0: 48809.3. Samples: 2400182720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 05:52:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:52:10,977][71000] Updated weights for policy 0, policy_version 175254 (0.0025) [2024-06-13 05:52:14,246][71000] Updated weights for policy 0, policy_version 175264 (0.0033) [2024-06-13 05:52:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 2871623680. Throughput: 0: 49174.6. Samples: 2400485920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 05:52:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:52:17,477][71000] Updated weights for policy 0, policy_version 175274 (0.0029) [2024-06-13 05:52:20,539][71000] Updated weights for policy 0, policy_version 175284 (0.0035) [2024-06-13 05:52:20,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 2871853056. Throughput: 0: 49142.2. Samples: 2400634280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 05:52:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:52:24,007][71000] Updated weights for policy 0, policy_version 175294 (0.0027) [2024-06-13 05:52:25,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2872115200. Throughput: 0: 49220.0. Samples: 2400930940. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 05:52:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:52:27,322][71000] Updated weights for policy 0, policy_version 175304 (0.0037) [2024-06-13 05:52:30,895][71000] Updated weights for policy 0, policy_version 175314 (0.0036) [2024-06-13 05:52:30,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 2872344576. Throughput: 0: 49108.3. Samples: 2401218560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 05:52:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:52:34,295][71000] Updated weights for policy 0, policy_version 175324 (0.0027) [2024-06-13 05:52:35,939][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.1, 300 sec: 48929.9). Total num frames: 2872590336. Throughput: 0: 48975.2. Samples: 2401363540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 05:52:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:52:37,349][71000] Updated weights for policy 0, policy_version 175334 (0.0028) [2024-06-13 05:52:40,835][71000] Updated weights for policy 0, policy_version 175344 (0.0035) [2024-06-13 05:52:40,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 2872836096. Throughput: 0: 49137.7. Samples: 2401661780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 05:52:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:52:43,870][71000] Updated weights for policy 0, policy_version 175354 (0.0023) [2024-06-13 05:52:45,940][70768] Fps is (10 sec: 50788.6, 60 sec: 49151.8, 300 sec: 49040.9). Total num frames: 2873098240. Throughput: 0: 49165.1. Samples: 2401954960. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 05:52:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:52:47,491][71000] Updated weights for policy 0, policy_version 175364 (0.0029) [2024-06-13 05:52:50,779][71000] Updated weights for policy 0, policy_version 175374 (0.0034) [2024-06-13 05:52:50,940][70768] Fps is (10 sec: 49150.5, 60 sec: 49424.8, 300 sec: 49040.9). Total num frames: 2873327616. Throughput: 0: 49213.0. Samples: 2402104360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 05:52:50,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 05:52:54,265][71000] Updated weights for policy 0, policy_version 175384 (0.0035) [2024-06-13 05:52:55,940][70768] Fps is (10 sec: 44237.5, 60 sec: 48332.6, 300 sec: 48874.3). Total num frames: 2873540608. Throughput: 0: 49058.5. Samples: 2402390360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 05:52:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:52:57,358][71000] Updated weights for policy 0, policy_version 175394 (0.0029) [2024-06-13 05:53:00,761][71000] Updated weights for policy 0, policy_version 175404 (0.0024) [2024-06-13 05:53:00,940][70768] Fps is (10 sec: 49153.6, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2873819136. Throughput: 0: 49042.7. Samples: 2402692840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:53:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:53:04,037][71000] Updated weights for policy 0, policy_version 175414 (0.0025) [2024-06-13 05:53:05,940][70768] Fps is (10 sec: 54068.1, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2874081280. Throughput: 0: 49044.2. Samples: 2402841260. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:53:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:53:07,495][71000] Updated weights for policy 0, policy_version 175424 (0.0029) [2024-06-13 05:53:10,506][71000] Updated weights for policy 0, policy_version 175434 (0.0031) [2024-06-13 05:53:10,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 2874327040. Throughput: 0: 49156.5. Samples: 2403142980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:53:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:53:14,022][71000] Updated weights for policy 0, policy_version 175444 (0.0035) [2024-06-13 05:53:15,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2874556416. Throughput: 0: 49443.7. Samples: 2403443520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:53:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:53:16,973][70980] Signal inference workers to stop experience collection... (35850 times) [2024-06-13 05:53:16,975][70980] Signal inference workers to resume experience collection... (35850 times) [2024-06-13 05:53:16,994][71000] InferenceWorker_p0-w0: stopping experience collection (35850 times) [2024-06-13 05:53:16,994][71000] InferenceWorker_p0-w0: resuming experience collection (35850 times) [2024-06-13 05:53:17,123][71000] Updated weights for policy 0, policy_version 175454 (0.0028) [2024-06-13 05:53:20,530][71000] Updated weights for policy 0, policy_version 175464 (0.0030) [2024-06-13 05:53:20,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49425.1, 300 sec: 49096.4). Total num frames: 2874818560. Throughput: 0: 49394.9. Samples: 2403586320. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:53:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:53:23,512][71000] Updated weights for policy 0, policy_version 175474 (0.0028) [2024-06-13 05:53:25,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2875064320. Throughput: 0: 49516.3. Samples: 2403890020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:53:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:53:27,220][71000] Updated weights for policy 0, policy_version 175484 (0.0027) [2024-06-13 05:53:30,361][71000] Updated weights for policy 0, policy_version 175494 (0.0029) [2024-06-13 05:53:30,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2875310080. Throughput: 0: 49428.9. Samples: 2404179240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:53:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:53:33,926][71000] Updated weights for policy 0, policy_version 175504 (0.0029) [2024-06-13 05:53:35,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2875539456. Throughput: 0: 49359.0. Samples: 2404325500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:53:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:53:36,993][71000] Updated weights for policy 0, policy_version 175514 (0.0024) [2024-06-13 05:53:40,587][71000] Updated weights for policy 0, policy_version 175524 (0.0031) [2024-06-13 05:53:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2875785216. Throughput: 0: 49421.1. Samples: 2404614300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:53:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:53:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000175524_2875785216.pth... 
[2024-06-13 05:53:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000174807_2864037888.pth [2024-06-13 05:53:43,655][71000] Updated weights for policy 0, policy_version 175534 (0.0020) [2024-06-13 05:53:45,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.3, 300 sec: 49152.0). Total num frames: 2876047360. Throughput: 0: 49243.1. Samples: 2404908780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:53:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:53:47,332][71000] Updated weights for policy 0, policy_version 175544 (0.0031) [2024-06-13 05:53:50,350][71000] Updated weights for policy 0, policy_version 175554 (0.0038) [2024-06-13 05:53:50,940][70768] Fps is (10 sec: 50789.2, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2876293120. Throughput: 0: 49288.2. Samples: 2405059240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:53:50,941][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:53:53,856][71000] Updated weights for policy 0, policy_version 175564 (0.0041) [2024-06-13 05:53:55,939][70768] Fps is (10 sec: 44237.0, 60 sec: 49152.2, 300 sec: 48818.8). Total num frames: 2876489728. Throughput: 0: 49124.0. Samples: 2405353560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:53:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 05:53:57,145][71000] Updated weights for policy 0, policy_version 175574 (0.0034) [2024-06-13 05:54:00,759][71000] Updated weights for policy 0, policy_version 175584 (0.0039) [2024-06-13 05:54:00,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49151.8, 300 sec: 49096.4). Total num frames: 2876768256. Throughput: 0: 48856.7. Samples: 2405642080. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 05:54:00,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 05:54:03,781][71000] Updated weights for policy 0, policy_version 175594 (0.0032) [2024-06-13 05:54:05,939][70768] Fps is (10 sec: 52428.8, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2877014016. Throughput: 0: 48962.9. Samples: 2405789640. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 05:54:05,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 05:54:07,793][71000] Updated weights for policy 0, policy_version 175604 (0.0030) [2024-06-13 05:54:10,605][71000] Updated weights for policy 0, policy_version 175614 (0.0045) [2024-06-13 05:54:10,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2877259776. Throughput: 0: 48770.8. Samples: 2406084700. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 05:54:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:54:14,326][71000] Updated weights for policy 0, policy_version 175624 (0.0025) [2024-06-13 05:54:15,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 2877489152. Throughput: 0: 48869.3. Samples: 2406378360. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 05:54:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:54:17,333][71000] Updated weights for policy 0, policy_version 175634 (0.0033) [2024-06-13 05:54:19,768][70980] Signal inference workers to stop experience collection... (35900 times) [2024-06-13 05:54:19,770][70980] Signal inference workers to resume experience collection... 
(35900 times) [2024-06-13 05:54:19,816][71000] InferenceWorker_p0-w0: stopping experience collection (35900 times) [2024-06-13 05:54:19,816][71000] InferenceWorker_p0-w0: resuming experience collection (35900 times) [2024-06-13 05:54:20,659][71000] Updated weights for policy 0, policy_version 175644 (0.0032) [2024-06-13 05:54:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 2877751296. Throughput: 0: 48745.3. Samples: 2406519040. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 05:54:20,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 05:54:23,995][71000] Updated weights for policy 0, policy_version 175654 (0.0026) [2024-06-13 05:54:25,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2878013440. Throughput: 0: 48801.7. Samples: 2406810380. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 05:54:25,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 05:54:27,718][71000] Updated weights for policy 0, policy_version 175664 (0.0026) [2024-06-13 05:54:30,688][71000] Updated weights for policy 0, policy_version 175674 (0.0025) [2024-06-13 05:54:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2878259200. Throughput: 0: 49091.1. Samples: 2407117880. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 05:54:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:54:34,457][71000] Updated weights for policy 0, policy_version 175684 (0.0027) [2024-06-13 05:54:35,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 2878472192. Throughput: 0: 48873.6. Samples: 2407258540. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 05:54:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:54:37,247][71000] Updated weights for policy 0, policy_version 175694 (0.0025) [2024-06-13 05:54:40,919][71000] Updated weights for policy 0, policy_version 175704 (0.0028) [2024-06-13 05:54:40,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2878734336. Throughput: 0: 48851.9. Samples: 2407551900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 05:54:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:54:44,056][71000] Updated weights for policy 0, policy_version 175714 (0.0041) [2024-06-13 05:54:45,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2878996480. Throughput: 0: 48996.2. Samples: 2407846900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 05:54:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:54:48,138][71000] Updated weights for policy 0, policy_version 175724 (0.0027) [2024-06-13 05:54:50,768][71000] Updated weights for policy 0, policy_version 175734 (0.0046) [2024-06-13 05:54:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 2879225856. Throughput: 0: 49331.0. Samples: 2408009540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 05:54:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:54:54,545][71000] Updated weights for policy 0, policy_version 175744 (0.0024) [2024-06-13 05:54:55,942][70768] Fps is (10 sec: 45862.3, 60 sec: 49422.7, 300 sec: 48929.4). Total num frames: 2879455232. Throughput: 0: 49417.9. Samples: 2408308640. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 05:54:55,943][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:54:57,314][71000] Updated weights for policy 0, policy_version 175754 (0.0027) [2024-06-13 05:55:00,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2879700992. Throughput: 0: 49200.8. Samples: 2408592400. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 05:55:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:55:01,016][71000] Updated weights for policy 0, policy_version 175764 (0.0030) [2024-06-13 05:55:03,727][71000] Updated weights for policy 0, policy_version 175774 (0.0022) [2024-06-13 05:55:05,940][70768] Fps is (10 sec: 55720.0, 60 sec: 49971.0, 300 sec: 49318.6). Total num frames: 2880012288. Throughput: 0: 49443.8. Samples: 2408744020. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-13 05:55:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:55:07,862][71000] Updated weights for policy 0, policy_version 175784 (0.0027) [2024-06-13 05:55:10,560][71000] Updated weights for policy 0, policy_version 175794 (0.0026) [2024-06-13 05:55:10,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 2880208896. Throughput: 0: 49551.6. Samples: 2409040200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 05:55:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:55:14,485][71000] Updated weights for policy 0, policy_version 175804 (0.0028) [2024-06-13 05:55:15,940][70768] Fps is (10 sec: 42599.0, 60 sec: 49151.9, 300 sec: 49041.0). Total num frames: 2880438272. Throughput: 0: 49299.9. Samples: 2409336380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 05:55:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:55:17,266][71000] Updated weights for policy 0, policy_version 175814 (0.0028) [2024-06-13 05:55:19,121][70980] Signal inference workers to stop experience collection... (35950 times) [2024-06-13 05:55:19,121][70980] Signal inference workers to resume experience collection... (35950 times) [2024-06-13 05:55:19,162][71000] InferenceWorker_p0-w0: stopping experience collection (35950 times) [2024-06-13 05:55:19,162][71000] InferenceWorker_p0-w0: resuming experience collection (35950 times) [2024-06-13 05:55:20,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2880684032. Throughput: 0: 49202.3. Samples: 2409472640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 05:55:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:55:21,283][71000] Updated weights for policy 0, policy_version 175824 (0.0027) [2024-06-13 05:55:23,891][71000] Updated weights for policy 0, policy_version 175834 (0.0032) [2024-06-13 05:55:25,942][70768] Fps is (10 sec: 54052.2, 60 sec: 49422.8, 300 sec: 49262.6). Total num frames: 2880978944. Throughput: 0: 49240.9. Samples: 2409767880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 05:55:25,943][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:55:27,721][71000] Updated weights for policy 0, policy_version 175844 (0.0029) [2024-06-13 05:55:30,511][71000] Updated weights for policy 0, policy_version 175854 (0.0029) [2024-06-13 05:55:30,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2881208320. Throughput: 0: 49276.8. Samples: 2410064360. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 05:55:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:55:34,300][71000] Updated weights for policy 0, policy_version 175864 (0.0025) [2024-06-13 05:55:35,940][70768] Fps is (10 sec: 45887.9, 60 sec: 49425.1, 300 sec: 49097.2). Total num frames: 2881437696. Throughput: 0: 48829.7. Samples: 2410206880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 05:55:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 05:55:37,109][71000] Updated weights for policy 0, policy_version 175874 (0.0027) [2024-06-13 05:55:40,940][70768] Fps is (10 sec: 45874.4, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2881667072. Throughput: 0: 48750.3. Samples: 2410502280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 05:55:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:55:40,974][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000175884_2881683456.pth... [2024-06-13 05:55:40,978][71000] Updated weights for policy 0, policy_version 175884 (0.0023) [2024-06-13 05:55:41,029][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000175165_2869903360.pth [2024-06-13 05:55:43,762][71000] Updated weights for policy 0, policy_version 175894 (0.0024) [2024-06-13 05:55:45,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2881945600. Throughput: 0: 48780.9. Samples: 2410787540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 05:55:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:55:47,939][71000] Updated weights for policy 0, policy_version 175904 (0.0030) [2024-06-13 05:55:50,699][71000] Updated weights for policy 0, policy_version 175914 (0.0030) [2024-06-13 05:55:50,939][70768] Fps is (10 sec: 50791.5, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2882174976. Throughput: 0: 48858.4. Samples: 2410942640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 05:55:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 05:55:54,104][71000] Updated weights for policy 0, policy_version 175924 (0.0029) [2024-06-13 05:55:55,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49427.4, 300 sec: 49096.5). Total num frames: 2882420736. Throughput: 0: 48937.4. Samples: 2411242380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 05:55:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:55:57,337][71000] Updated weights for policy 0, policy_version 175934 (0.0023) [2024-06-13 05:56:00,903][71000] Updated weights for policy 0, policy_version 175944 (0.0031) [2024-06-13 05:56:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2882666496. Throughput: 0: 48989.3. Samples: 2411540900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 05:56:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:56:03,958][71000] Updated weights for policy 0, policy_version 175954 (0.0025) [2024-06-13 05:56:05,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48606.0, 300 sec: 49318.6). Total num frames: 2882928640. Throughput: 0: 49209.3. Samples: 2411687060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 05:56:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:56:07,626][71000] Updated weights for policy 0, policy_version 175964 (0.0036) [2024-06-13 05:56:10,450][71000] Updated weights for policy 0, policy_version 175974 (0.0024) [2024-06-13 05:56:10,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2883174400. Throughput: 0: 49431.6. Samples: 2411992160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-13 05:56:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:56:14,218][71000] Updated weights for policy 0, policy_version 175984 (0.0035) [2024-06-13 05:56:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2883403776. Throughput: 0: 49453.3. Samples: 2412289760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:56:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:56:17,070][71000] Updated weights for policy 0, policy_version 175994 (0.0035) [2024-06-13 05:56:20,711][71000] Updated weights for policy 0, policy_version 176004 (0.0029) [2024-06-13 05:56:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2883649536. Throughput: 0: 49321.8. Samples: 2412426360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:56:20,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 05:56:21,605][70980] Signal inference workers to stop experience collection... (36000 times) [2024-06-13 05:56:21,635][71000] InferenceWorker_p0-w0: stopping experience collection (36000 times) [2024-06-13 05:56:21,663][70980] Signal inference workers to resume experience collection... (36000 times) [2024-06-13 05:56:21,664][71000] InferenceWorker_p0-w0: resuming experience collection (36000 times) [2024-06-13 05:56:24,046][71000] Updated weights for policy 0, policy_version 176014 (0.0028) [2024-06-13 05:56:25,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48881.1, 300 sec: 49207.5). Total num frames: 2883911680. Throughput: 0: 49301.9. Samples: 2412720860. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:56:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:56:27,305][71000] Updated weights for policy 0, policy_version 176024 (0.0031) [2024-06-13 05:56:30,646][71000] Updated weights for policy 0, policy_version 176034 (0.0027) [2024-06-13 05:56:30,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2884157440. Throughput: 0: 49711.2. Samples: 2413024540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:56:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:56:34,367][71000] Updated weights for policy 0, policy_version 176044 (0.0031) [2024-06-13 05:56:35,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2884386816. Throughput: 0: 49407.8. Samples: 2413166000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:56:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:56:37,264][71000] Updated weights for policy 0, policy_version 176054 (0.0028) [2024-06-13 05:56:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 2884632576. Throughput: 0: 49211.0. Samples: 2413456880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:56:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:56:40,943][71000] Updated weights for policy 0, policy_version 176064 (0.0025) [2024-06-13 05:56:43,634][71000] Updated weights for policy 0, policy_version 176074 (0.0031) [2024-06-13 05:56:45,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2884878336. Throughput: 0: 49191.1. Samples: 2413754500. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:56:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:56:47,462][71000] Updated weights for policy 0, policy_version 176084 (0.0042) [2024-06-13 05:56:50,642][71000] Updated weights for policy 0, policy_version 176094 (0.0037) [2024-06-13 05:56:50,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2885124096. Throughput: 0: 49245.4. Samples: 2413903100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:56:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:56:54,370][71000] Updated weights for policy 0, policy_version 176104 (0.0036) [2024-06-13 05:56:55,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2885369856. Throughput: 0: 48971.0. Samples: 2414195860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:56:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:56:57,331][71000] Updated weights for policy 0, policy_version 176114 (0.0022) [2024-06-13 05:57:00,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48878.9, 300 sec: 49096.4). Total num frames: 2885599232. Throughput: 0: 48809.3. Samples: 2414486180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:57:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:57:00,967][71000] Updated weights for policy 0, policy_version 176124 (0.0033) [2024-06-13 05:57:03,994][71000] Updated weights for policy 0, policy_version 176134 (0.0027) [2024-06-13 05:57:05,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2885861376. Throughput: 0: 49208.4. Samples: 2414640740. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:57:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:57:07,599][71000] Updated weights for policy 0, policy_version 176144 (0.0028) [2024-06-13 05:57:10,646][71000] Updated weights for policy 0, policy_version 176154 (0.0028) [2024-06-13 05:57:10,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2886107136. Throughput: 0: 49065.8. Samples: 2414928820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:57:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:57:14,449][71000] Updated weights for policy 0, policy_version 176164 (0.0032) [2024-06-13 05:57:15,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2886352896. Throughput: 0: 48726.7. Samples: 2415217240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 05:57:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 05:57:17,507][71000] Updated weights for policy 0, policy_version 176174 (0.0024) [2024-06-13 05:57:20,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 2886582272. Throughput: 0: 48815.6. Samples: 2415362700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 05:57:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:57:21,076][71000] Updated weights for policy 0, policy_version 176184 (0.0032) [2024-06-13 05:57:24,019][71000] Updated weights for policy 0, policy_version 176194 (0.0024) [2024-06-13 05:57:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2886844416. Throughput: 0: 48998.3. Samples: 2415661800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 05:57:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:57:27,682][71000] Updated weights for policy 0, policy_version 176204 (0.0029) [2024-06-13 05:57:29,849][70980] Signal inference workers to stop experience collection... 
(36050 times) [2024-06-13 05:57:29,879][71000] InferenceWorker_p0-w0: stopping experience collection (36050 times) [2024-06-13 05:57:29,899][70980] Signal inference workers to resume experience collection... (36050 times) [2024-06-13 05:57:29,900][71000] InferenceWorker_p0-w0: resuming experience collection (36050 times) [2024-06-13 05:57:30,561][71000] Updated weights for policy 0, policy_version 176214 (0.0040) [2024-06-13 05:57:30,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 2887090176. Throughput: 0: 48889.7. Samples: 2415954540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 05:57:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:57:34,370][71000] Updated weights for policy 0, policy_version 176224 (0.0030) [2024-06-13 05:57:35,942][70768] Fps is (10 sec: 49139.6, 60 sec: 49150.1, 300 sec: 49151.6). Total num frames: 2887335936. Throughput: 0: 48869.2. Samples: 2416102340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 05:57:35,943][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:57:37,577][71000] Updated weights for policy 0, policy_version 176234 (0.0033) [2024-06-13 05:57:40,890][71000] Updated weights for policy 0, policy_version 176244 (0.0032) [2024-06-13 05:57:40,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2887581696. Throughput: 0: 48989.4. Samples: 2416400380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 05:57:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:57:41,039][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000176245_2887598080.pth... [2024-06-13 05:57:41,081][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000175524_2875785216.pth [2024-06-13 05:57:43,936][71000] Updated weights for policy 0, policy_version 176254 (0.0028) [2024-06-13 05:57:45,940][70768] Fps is (10 sec: 50803.3, 60 sec: 49425.1, 300 sec: 49207.6). 
Total num frames: 2887843840. Throughput: 0: 49241.0. Samples: 2416702020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 05:57:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:57:47,443][71000] Updated weights for policy 0, policy_version 176264 (0.0019) [2024-06-13 05:57:50,451][71000] Updated weights for policy 0, policy_version 176274 (0.0028) [2024-06-13 05:57:50,940][70768] Fps is (10 sec: 50788.7, 60 sec: 49424.7, 300 sec: 49318.6). Total num frames: 2888089600. Throughput: 0: 49093.4. Samples: 2416849960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 05:57:50,941][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 05:57:54,051][71000] Updated weights for policy 0, policy_version 176284 (0.0028) [2024-06-13 05:57:55,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2888318976. Throughput: 0: 49303.6. Samples: 2417147480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 05:57:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:57:57,446][71000] Updated weights for policy 0, policy_version 176294 (0.0023) [2024-06-13 05:58:00,591][71000] Updated weights for policy 0, policy_version 176304 (0.0026) [2024-06-13 05:58:00,940][70768] Fps is (10 sec: 47515.5, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 2888564736. Throughput: 0: 49450.2. Samples: 2417442500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 05:58:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 05:58:03,883][71000] Updated weights for policy 0, policy_version 176314 (0.0032) [2024-06-13 05:58:05,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2888810496. Throughput: 0: 49517.6. Samples: 2417590980. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 05:58:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:58:07,263][71000] Updated weights for policy 0, policy_version 176324 (0.0020) [2024-06-13 05:58:10,575][71000] Updated weights for policy 0, policy_version 176334 (0.0028) [2024-06-13 05:58:10,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2889072640. Throughput: 0: 49403.9. Samples: 2417884980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 05:58:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:58:14,012][71000] Updated weights for policy 0, policy_version 176344 (0.0028) [2024-06-13 05:58:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2889302016. Throughput: 0: 49295.7. Samples: 2418172840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 05:58:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:58:17,104][71000] Updated weights for policy 0, policy_version 176354 (0.0026) [2024-06-13 05:58:20,553][71000] Updated weights for policy 0, policy_version 176364 (0.0035) [2024-06-13 05:58:20,939][70768] Fps is (10 sec: 47514.6, 60 sec: 49425.3, 300 sec: 49096.5). Total num frames: 2889547776. Throughput: 0: 49320.2. Samples: 2418321620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 05:58:20,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 05:58:23,934][71000] Updated weights for policy 0, policy_version 176374 (0.0029) [2024-06-13 05:58:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2889793536. Throughput: 0: 49295.2. Samples: 2418618660. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 05:58:25,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 05:58:27,188][71000] Updated weights for policy 0, policy_version 176384 (0.0024) [2024-06-13 05:58:30,551][71000] Updated weights for policy 0, policy_version 176394 (0.0029) [2024-06-13 05:58:30,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 2890055680. Throughput: 0: 49209.6. Samples: 2418916460. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-13 05:58:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:58:33,853][71000] Updated weights for policy 0, policy_version 176404 (0.0035) [2024-06-13 05:58:35,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49154.1, 300 sec: 49152.0). Total num frames: 2890285056. Throughput: 0: 49384.5. Samples: 2419072240. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-13 05:58:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:58:37,314][71000] Updated weights for policy 0, policy_version 176414 (0.0036) [2024-06-13 05:58:39,923][70980] Signal inference workers to stop experience collection... (36100 times) [2024-06-13 05:58:39,970][71000] InferenceWorker_p0-w0: stopping experience collection (36100 times) [2024-06-13 05:58:39,978][70980] Signal inference workers to resume experience collection... (36100 times) [2024-06-13 05:58:39,979][71000] InferenceWorker_p0-w0: resuming experience collection (36100 times) [2024-06-13 05:58:40,481][71000] Updated weights for policy 0, policy_version 176424 (0.0021) [2024-06-13 05:58:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2890563584. Throughput: 0: 49259.0. Samples: 2419364140. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-13 05:58:40,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 05:58:43,859][71000] Updated weights for policy 0, policy_version 176434 (0.0026) [2024-06-13 05:58:45,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.8, 300 sec: 49041.0). Total num frames: 2890760192. Throughput: 0: 49077.7. Samples: 2419651000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-13 05:58:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 05:58:47,100][71000] Updated weights for policy 0, policy_version 176444 (0.0028) [2024-06-13 05:58:50,726][71000] Updated weights for policy 0, policy_version 176454 (0.0028) [2024-06-13 05:58:50,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48879.0, 300 sec: 49263.0). Total num frames: 2891022336. Throughput: 0: 48962.3. Samples: 2419794300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-13 05:58:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 05:58:53,814][71000] Updated weights for policy 0, policy_version 176464 (0.0035) [2024-06-13 05:58:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2891268096. Throughput: 0: 49048.2. Samples: 2420092140. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-13 05:58:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 05:58:57,335][71000] Updated weights for policy 0, policy_version 176474 (0.0019) [2024-06-13 05:59:00,411][71000] Updated weights for policy 0, policy_version 176484 (0.0023) [2024-06-13 05:59:00,940][70768] Fps is (10 sec: 52429.7, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2891546624. Throughput: 0: 49302.6. Samples: 2420391460. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-13 05:59:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:59:03,862][71000] Updated weights for policy 0, policy_version 176494 (0.0029) [2024-06-13 05:59:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2891759616. Throughput: 0: 49258.6. Samples: 2420538260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-13 05:59:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:59:07,192][71000] Updated weights for policy 0, policy_version 176504 (0.0022) [2024-06-13 05:59:10,825][71000] Updated weights for policy 0, policy_version 176514 (0.0036) [2024-06-13 05:59:10,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2892005376. Throughput: 0: 49135.0. Samples: 2420829740. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-13 05:59:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:59:14,071][71000] Updated weights for policy 0, policy_version 176524 (0.0034) [2024-06-13 05:59:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2892251136. Throughput: 0: 49101.0. Samples: 2421126000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-13 05:59:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:59:17,433][71000] Updated weights for policy 0, policy_version 176534 (0.0034) [2024-06-13 05:59:20,369][71000] Updated weights for policy 0, policy_version 176544 (0.0026) [2024-06-13 05:59:20,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2892529664. Throughput: 0: 49001.2. Samples: 2421277300. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-13 05:59:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:59:24,042][71000] Updated weights for policy 0, policy_version 176554 (0.0035) [2024-06-13 05:59:25,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2892742656. Throughput: 0: 49020.1. Samples: 2421570040. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-13 05:59:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:59:27,051][71000] Updated weights for policy 0, policy_version 176564 (0.0032) [2024-06-13 05:59:30,673][71000] Updated weights for policy 0, policy_version 176574 (0.0027) [2024-06-13 05:59:30,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2892988416. Throughput: 0: 49432.5. Samples: 2421875460. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-13 05:59:30,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 05:59:33,761][71000] Updated weights for policy 0, policy_version 176584 (0.0024) [2024-06-13 05:59:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2893234176. Throughput: 0: 49430.9. Samples: 2422018680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-13 05:59:35,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 05:59:37,510][71000] Updated weights for policy 0, policy_version 176594 (0.0027) [2024-06-13 05:59:40,404][71000] Updated weights for policy 0, policy_version 176604 (0.0039) [2024-06-13 05:59:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2893496320. Throughput: 0: 49348.9. Samples: 2422312840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-13 05:59:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 05:59:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000176605_2893496320.pth... 
[2024-06-13 05:59:41,019][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000175884_2881683456.pth [2024-06-13 05:59:43,943][71000] Updated weights for policy 0, policy_version 176614 (0.0038) [2024-06-13 05:59:45,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2893725696. Throughput: 0: 49230.2. Samples: 2422606820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-13 05:59:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 05:59:47,143][70980] Signal inference workers to stop experience collection... (36150 times) [2024-06-13 05:59:47,149][70980] Signal inference workers to resume experience collection... (36150 times) [2024-06-13 05:59:47,152][71000] InferenceWorker_p0-w0: stopping experience collection (36150 times) [2024-06-13 05:59:47,176][71000] InferenceWorker_p0-w0: resuming experience collection (36150 times) [2024-06-13 05:59:47,285][71000] Updated weights for policy 0, policy_version 176624 (0.0032) [2024-06-13 05:59:50,891][71000] Updated weights for policy 0, policy_version 176634 (0.0031) [2024-06-13 05:59:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.2, 300 sec: 49208.0). Total num frames: 2893971456. Throughput: 0: 48905.3. Samples: 2422739000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-13 05:59:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 05:59:53,838][71000] Updated weights for policy 0, policy_version 176644 (0.0033) [2024-06-13 05:59:55,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2894217216. Throughput: 0: 49118.8. Samples: 2423040080. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-13 05:59:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 05:59:57,421][71000] Updated weights for policy 0, policy_version 176654 (0.0031) [2024-06-13 06:00:00,428][71000] Updated weights for policy 0, policy_version 176664 (0.0038) [2024-06-13 06:00:00,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48879.0, 300 sec: 49041.0). Total num frames: 2894479360. Throughput: 0: 49032.1. Samples: 2423332440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-13 06:00:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:00:04,284][71000] Updated weights for policy 0, policy_version 176674 (0.0034) [2024-06-13 06:00:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2894692352. Throughput: 0: 48874.2. Samples: 2423476640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-13 06:00:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:00:07,199][71000] Updated weights for policy 0, policy_version 176684 (0.0030) [2024-06-13 06:00:10,847][71000] Updated weights for policy 0, policy_version 176694 (0.0023) [2024-06-13 06:00:10,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2894954496. Throughput: 0: 48958.6. Samples: 2423773180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-13 06:00:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:00:14,150][71000] Updated weights for policy 0, policy_version 176704 (0.0031) [2024-06-13 06:00:15,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2895183872. Throughput: 0: 48549.4. Samples: 2424060180. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-13 06:00:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:00:18,155][71000] Updated weights for policy 0, policy_version 176714 (0.0033) [2024-06-13 06:00:20,849][71000] Updated weights for policy 0, policy_version 176724 (0.0027) [2024-06-13 06:00:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.8, 300 sec: 49041.4). Total num frames: 2895446016. Throughput: 0: 48620.3. Samples: 2424206600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-13 06:00:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:00:24,817][71000] Updated weights for policy 0, policy_version 176734 (0.0021) [2024-06-13 06:00:25,939][70768] Fps is (10 sec: 45875.3, 60 sec: 48332.9, 300 sec: 48929.9). Total num frames: 2895642624. Throughput: 0: 48568.1. Samples: 2424498400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-13 06:00:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:00:27,642][71000] Updated weights for policy 0, policy_version 176744 (0.0027) [2024-06-13 06:00:30,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 2895904768. Throughput: 0: 48402.6. Samples: 2424784940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-13 06:00:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:00:31,661][71000] Updated weights for policy 0, policy_version 176754 (0.0031) [2024-06-13 06:00:34,147][71000] Updated weights for policy 0, policy_version 176764 (0.0032) [2024-06-13 06:00:35,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2896183296. Throughput: 0: 48926.3. Samples: 2424940680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-13 06:00:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:00:38,001][71000] Updated weights for policy 0, policy_version 176774 (0.0022) [2024-06-13 06:00:40,780][71000] Updated weights for policy 0, policy_version 176784 (0.0035) [2024-06-13 06:00:40,939][70768] Fps is (10 sec: 52430.0, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2896429056. Throughput: 0: 48976.0. Samples: 2425244000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 06:00:40,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 06:00:44,754][71000] Updated weights for policy 0, policy_version 176794 (0.0028) [2024-06-13 06:00:45,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2896642048. Throughput: 0: 48997.3. Samples: 2425537320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 06:00:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:00:47,472][71000] Updated weights for policy 0, policy_version 176804 (0.0024) [2024-06-13 06:00:50,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2896904192. Throughput: 0: 48889.3. Samples: 2425676660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 06:00:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:00:51,061][71000] Updated weights for policy 0, policy_version 176814 (0.0021) [2024-06-13 06:00:53,993][71000] Updated weights for policy 0, policy_version 176824 (0.0025) [2024-06-13 06:00:55,939][70768] Fps is (10 sec: 52429.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2897166336. Throughput: 0: 48884.2. Samples: 2425972960. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 06:00:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 06:00:57,833][71000] Updated weights for policy 0, policy_version 176834 (0.0022) [2024-06-13 06:00:58,773][70980] Signal inference workers to stop experience collection... (36200 times) [2024-06-13 06:00:58,774][70980] Signal inference workers to resume experience collection... (36200 times) [2024-06-13 06:00:58,820][71000] InferenceWorker_p0-w0: stopping experience collection (36200 times) [2024-06-13 06:00:58,820][71000] InferenceWorker_p0-w0: resuming experience collection (36200 times) [2024-06-13 06:01:00,708][71000] Updated weights for policy 0, policy_version 176844 (0.0031) [2024-06-13 06:01:00,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 2897412096. Throughput: 0: 49179.4. Samples: 2426273260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 06:01:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:01:04,694][71000] Updated weights for policy 0, policy_version 176854 (0.0023) [2024-06-13 06:01:05,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2897625088. Throughput: 0: 49105.0. Samples: 2426416320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 06:01:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:01:07,400][71000] Updated weights for policy 0, policy_version 176864 (0.0025) [2024-06-13 06:01:10,939][70768] Fps is (10 sec: 47514.4, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 2897887232. Throughput: 0: 49130.2. Samples: 2426709260. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 06:01:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:01:11,032][71000] Updated weights for policy 0, policy_version 176874 (0.0028) [2024-06-13 06:01:14,045][71000] Updated weights for policy 0, policy_version 176884 (0.0036) [2024-06-13 06:01:15,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49698.0, 300 sec: 49207.5). Total num frames: 2898165760. Throughput: 0: 49461.9. Samples: 2427010720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 06:01:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:01:18,121][71000] Updated weights for policy 0, policy_version 176894 (0.0024) [2024-06-13 06:01:20,731][71000] Updated weights for policy 0, policy_version 176904 (0.0027) [2024-06-13 06:01:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2898395136. Throughput: 0: 49433.7. Samples: 2427165200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 06:01:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:01:24,557][71000] Updated weights for policy 0, policy_version 176914 (0.0026) [2024-06-13 06:01:25,939][70768] Fps is (10 sec: 44237.2, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 2898608128. Throughput: 0: 49278.2. Samples: 2427461520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 06:01:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:01:27,393][71000] Updated weights for policy 0, policy_version 176924 (0.0033) [2024-06-13 06:01:30,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2898853888. Throughput: 0: 49060.9. Samples: 2427745060. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 06:01:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:01:31,139][71000] Updated weights for policy 0, policy_version 176934 (0.0032) [2024-06-13 06:01:34,097][71000] Updated weights for policy 0, policy_version 176944 (0.0029) [2024-06-13 06:01:35,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2899148800. Throughput: 0: 49411.5. Samples: 2427900180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 06:01:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:01:37,779][71000] Updated weights for policy 0, policy_version 176954 (0.0025) [2024-06-13 06:01:40,939][70768] Fps is (10 sec: 50790.9, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2899361792. Throughput: 0: 49392.0. Samples: 2428195600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 06:01:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:01:41,059][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000176964_2899378176.pth... [2024-06-13 06:01:41,071][71000] Updated weights for policy 0, policy_version 176964 (0.0028) [2024-06-13 06:01:41,116][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000176245_2887598080.pth [2024-06-13 06:01:44,299][71000] Updated weights for policy 0, policy_version 176974 (0.0028) [2024-06-13 06:01:45,940][70768] Fps is (10 sec: 44236.8, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2899591168. Throughput: 0: 49386.7. Samples: 2428495660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-13 06:01:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:01:47,406][71000] Updated weights for policy 0, policy_version 176984 (0.0028) [2024-06-13 06:01:50,776][71000] Updated weights for policy 0, policy_version 176994 (0.0035) [2024-06-13 06:01:50,939][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49152.0). 
Total num frames: 2899869696. Throughput: 0: 49296.0. Samples: 2428634640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-13 06:01:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:01:54,135][71000] Updated weights for policy 0, policy_version 177004 (0.0027) [2024-06-13 06:01:55,942][70768] Fps is (10 sec: 54055.8, 60 sec: 49423.2, 300 sec: 49262.7). Total num frames: 2900131840. Throughput: 0: 49417.5. Samples: 2428933160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-13 06:01:55,942][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:01:57,210][71000] Updated weights for policy 0, policy_version 177014 (0.0030) [2024-06-13 06:02:00,659][71000] Updated weights for policy 0, policy_version 177024 (0.0034) [2024-06-13 06:02:00,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2900361216. Throughput: 0: 49434.3. Samples: 2429235260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-13 06:02:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:02:04,081][71000] Updated weights for policy 0, policy_version 177034 (0.0027) [2024-06-13 06:02:05,940][70768] Fps is (10 sec: 45885.2, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2900590592. Throughput: 0: 49106.7. Samples: 2429375000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-13 06:02:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:02:07,377][71000] Updated weights for policy 0, policy_version 177044 (0.0035) [2024-06-13 06:02:10,506][71000] Updated weights for policy 0, policy_version 177054 (0.0024) [2024-06-13 06:02:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2900869120. Throughput: 0: 49274.6. Samples: 2429678880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-13 06:02:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:02:12,559][70980] Signal inference workers to stop experience collection... 
(36250 times) [2024-06-13 06:02:12,559][70980] Signal inference workers to resume experience collection... (36250 times) [2024-06-13 06:02:12,594][71000] InferenceWorker_p0-w0: stopping experience collection (36250 times) [2024-06-13 06:02:12,595][71000] InferenceWorker_p0-w0: resuming experience collection (36250 times) [2024-06-13 06:02:13,718][71000] Updated weights for policy 0, policy_version 177064 (0.0031) [2024-06-13 06:02:15,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2901114880. Throughput: 0: 49549.5. Samples: 2429974780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-13 06:02:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:02:17,255][71000] Updated weights for policy 0, policy_version 177074 (0.0030) [2024-06-13 06:02:20,458][71000] Updated weights for policy 0, policy_version 177084 (0.0034) [2024-06-13 06:02:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2901344256. Throughput: 0: 49588.9. Samples: 2430131680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-13 06:02:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:02:23,819][71000] Updated weights for policy 0, policy_version 177094 (0.0034) [2024-06-13 06:02:25,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49698.0, 300 sec: 49152.0). Total num frames: 2901590016. Throughput: 0: 49392.7. Samples: 2430418280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-13 06:02:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:02:27,471][71000] Updated weights for policy 0, policy_version 177104 (0.0031) [2024-06-13 06:02:30,311][71000] Updated weights for policy 0, policy_version 177114 (0.0028) [2024-06-13 06:02:30,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49971.3, 300 sec: 49208.0). Total num frames: 2901852160. Throughput: 0: 49237.0. Samples: 2430711320. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-13 06:02:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:02:33,898][71000] Updated weights for policy 0, policy_version 177124 (0.0025) [2024-06-13 06:02:35,943][70768] Fps is (10 sec: 52413.7, 60 sec: 49422.7, 300 sec: 49262.6). Total num frames: 2902114304. Throughput: 0: 49710.5. Samples: 2430871760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-13 06:02:35,943][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 06:02:36,743][71000] Updated weights for policy 0, policy_version 177134 (0.0036) [2024-06-13 06:02:40,758][71000] Updated weights for policy 0, policy_version 177144 (0.0032) [2024-06-13 06:02:40,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2902327296. Throughput: 0: 49690.5. Samples: 2431169120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-13 06:02:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:02:43,629][71000] Updated weights for policy 0, policy_version 177154 (0.0024) [2024-06-13 06:02:45,939][70768] Fps is (10 sec: 45889.2, 60 sec: 49698.3, 300 sec: 49096.5). Total num frames: 2902573056. Throughput: 0: 49380.5. Samples: 2431457380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-13 06:02:45,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 06:02:47,404][71000] Updated weights for policy 0, policy_version 177164 (0.0031) [2024-06-13 06:02:50,230][71000] Updated weights for policy 0, policy_version 177174 (0.0028) [2024-06-13 06:02:50,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2902818816. Throughput: 0: 49371.1. Samples: 2431596700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 06:02:50,948][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:02:54,044][71000] Updated weights for policy 0, policy_version 177184 (0.0028) [2024-06-13 06:02:55,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48880.6, 300 sec: 49152.0). Total num frames: 2903064576. Throughput: 0: 49215.4. Samples: 2431893580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 06:02:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:02:57,234][71000] Updated weights for policy 0, policy_version 177194 (0.0033) [2024-06-13 06:03:00,937][71000] Updated weights for policy 0, policy_version 177204 (0.0027) [2024-06-13 06:03:00,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2903310336. Throughput: 0: 49214.2. Samples: 2432189420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 06:03:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:03:03,954][71000] Updated weights for policy 0, policy_version 177214 (0.0042) [2024-06-13 06:03:05,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2903556096. Throughput: 0: 48786.7. Samples: 2432327080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 06:03:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:03:07,586][71000] Updated weights for policy 0, policy_version 177224 (0.0026) [2024-06-13 06:03:10,473][71000] Updated weights for policy 0, policy_version 177234 (0.0031) [2024-06-13 06:03:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2903818240. Throughput: 0: 48882.7. Samples: 2432618000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 06:03:10,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 06:03:14,426][70980] Signal inference workers to stop experience collection... 
(36300 times) [2024-06-13 06:03:14,426][70980] Signal inference workers to resume experience collection... (36300 times) [2024-06-13 06:03:14,442][71000] InferenceWorker_p0-w0: stopping experience collection (36300 times) [2024-06-13 06:03:14,442][71000] InferenceWorker_p0-w0: resuming experience collection (36300 times) [2024-06-13 06:03:14,588][71000] Updated weights for policy 0, policy_version 177244 (0.0021) [2024-06-13 06:03:15,941][70768] Fps is (10 sec: 49144.1, 60 sec: 48877.6, 300 sec: 49151.7). Total num frames: 2904047616. Throughput: 0: 48777.3. Samples: 2432906380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 06:03:15,942][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:03:17,103][71000] Updated weights for policy 0, policy_version 177254 (0.0026) [2024-06-13 06:03:20,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2904276992. Throughput: 0: 48626.3. Samples: 2433059800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 06:03:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:03:21,004][71000] Updated weights for policy 0, policy_version 177264 (0.0022) [2024-06-13 06:03:23,929][71000] Updated weights for policy 0, policy_version 177274 (0.0041) [2024-06-13 06:03:25,940][70768] Fps is (10 sec: 49159.3, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2904539136. Throughput: 0: 48511.7. Samples: 2433352160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 06:03:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:03:27,827][71000] Updated weights for policy 0, policy_version 177284 (0.0031) [2024-06-13 06:03:30,543][71000] Updated weights for policy 0, policy_version 177294 (0.0025) [2024-06-13 06:03:30,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2904801280. Throughput: 0: 48709.3. Samples: 2433649300. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 06:03:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:03:34,262][71000] Updated weights for policy 0, policy_version 177304 (0.0023) [2024-06-13 06:03:35,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48608.2, 300 sec: 49040.9). Total num frames: 2905030656. Throughput: 0: 48955.0. Samples: 2433799680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 06:03:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:03:37,270][71000] Updated weights for policy 0, policy_version 177314 (0.0020) [2024-06-13 06:03:40,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2905260032. Throughput: 0: 48803.7. Samples: 2434089740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 06:03:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:03:41,024][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000177324_2905276416.pth... [2024-06-13 06:03:41,034][71000] Updated weights for policy 0, policy_version 177324 (0.0028) [2024-06-13 06:03:41,055][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000176605_2893496320.pth [2024-06-13 06:03:43,947][71000] Updated weights for policy 0, policy_version 177334 (0.0029) [2024-06-13 06:03:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2905522176. Throughput: 0: 48863.0. Samples: 2434388260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 06:03:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:03:47,445][71000] Updated weights for policy 0, policy_version 177344 (0.0027) [2024-06-13 06:03:50,638][71000] Updated weights for policy 0, policy_version 177354 (0.0034) [2024-06-13 06:03:50,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2905767936. Throughput: 0: 49237.8. Samples: 2434542780. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 06:03:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:03:54,180][71000] Updated weights for policy 0, policy_version 177364 (0.0029) [2024-06-13 06:03:55,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2906013696. Throughput: 0: 49309.4. Samples: 2434836920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 06:03:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:03:57,359][71000] Updated weights for policy 0, policy_version 177374 (0.0025) [2024-06-13 06:04:00,715][71000] Updated weights for policy 0, policy_version 177384 (0.0035) [2024-06-13 06:04:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2906259456. Throughput: 0: 49428.4. Samples: 2435130580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 06:04:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:04:04,069][71000] Updated weights for policy 0, policy_version 177394 (0.0029) [2024-06-13 06:04:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2906488832. Throughput: 0: 49122.7. Samples: 2435270320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 06:04:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:04:07,703][71000] Updated weights for policy 0, policy_version 177404 (0.0034) [2024-06-13 06:04:10,870][71000] Updated weights for policy 0, policy_version 177414 (0.0024) [2024-06-13 06:04:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2906750976. Throughput: 0: 49200.5. Samples: 2435566180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 06:04:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:04:14,102][70980] Signal inference workers to stop experience collection... 
(36350 times) [2024-06-13 06:04:14,108][70980] Signal inference workers to resume experience collection... (36350 times) [2024-06-13 06:04:14,130][71000] InferenceWorker_p0-w0: stopping experience collection (36350 times) [2024-06-13 06:04:14,131][71000] InferenceWorker_p0-w0: resuming experience collection (36350 times) [2024-06-13 06:04:14,254][71000] Updated weights for policy 0, policy_version 177424 (0.0035) [2024-06-13 06:04:15,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48880.2, 300 sec: 48985.4). Total num frames: 2906980352. Throughput: 0: 49253.2. Samples: 2435865700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 06:04:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:04:17,387][71000] Updated weights for policy 0, policy_version 177434 (0.0028) [2024-06-13 06:04:20,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2907226112. Throughput: 0: 48935.6. Samples: 2436001780. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 06:04:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:04:21,100][71000] Updated weights for policy 0, policy_version 177444 (0.0034) [2024-06-13 06:04:24,133][71000] Updated weights for policy 0, policy_version 177454 (0.0026) [2024-06-13 06:04:25,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2907471872. Throughput: 0: 49097.7. Samples: 2436299140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 06:04:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:04:27,782][71000] Updated weights for policy 0, policy_version 177464 (0.0028) [2024-06-13 06:04:30,699][71000] Updated weights for policy 0, policy_version 177474 (0.0028) [2024-06-13 06:04:30,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2907750400. Throughput: 0: 49094.8. Samples: 2436597520. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 06:04:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 06:04:34,435][71000] Updated weights for policy 0, policy_version 177484 (0.0025) [2024-06-13 06:04:35,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2907963392. Throughput: 0: 48985.6. Samples: 2436747140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 06:04:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:04:37,189][71000] Updated weights for policy 0, policy_version 177494 (0.0036) [2024-06-13 06:04:40,940][70768] Fps is (10 sec: 45874.2, 60 sec: 49151.8, 300 sec: 49096.5). Total num frames: 2908209152. Throughput: 0: 48988.7. Samples: 2437041420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 06:04:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:04:41,131][71000] Updated weights for policy 0, policy_version 177504 (0.0035) [2024-06-13 06:04:44,082][71000] Updated weights for policy 0, policy_version 177514 (0.0031) [2024-06-13 06:04:45,939][70768] Fps is (10 sec: 47514.7, 60 sec: 48606.0, 300 sec: 49040.9). Total num frames: 2908438528. Throughput: 0: 48873.9. Samples: 2437329900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 06:04:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:04:48,053][71000] Updated weights for policy 0, policy_version 177524 (0.0031) [2024-06-13 06:04:50,778][71000] Updated weights for policy 0, policy_version 177534 (0.0034) [2024-06-13 06:04:50,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2908717056. Throughput: 0: 49001.3. Samples: 2437475380. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 06:04:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:04:54,824][71000] Updated weights for policy 0, policy_version 177544 (0.0034) [2024-06-13 06:04:55,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48605.7, 300 sec: 48985.4). Total num frames: 2908930048. Throughput: 0: 48869.3. Samples: 2437765300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-13 06:04:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:04:57,767][71000] Updated weights for policy 0, policy_version 177554 (0.0037) [2024-06-13 06:05:00,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 2909175808. Throughput: 0: 48606.3. Samples: 2438052980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:05:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:05:01,512][71000] Updated weights for policy 0, policy_version 177564 (0.0026) [2024-06-13 06:05:04,317][71000] Updated weights for policy 0, policy_version 177574 (0.0038) [2024-06-13 06:05:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2909421568. Throughput: 0: 48909.2. Samples: 2438202700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:05:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:05:07,855][71000] Updated weights for policy 0, policy_version 177584 (0.0035) [2024-06-13 06:05:10,641][70980] Signal inference workers to stop experience collection... (36400 times) [2024-06-13 06:05:10,691][70980] Signal inference workers to resume experience collection... 
(36400 times) [2024-06-13 06:05:10,692][71000] InferenceWorker_p0-w0: stopping experience collection (36400 times) [2024-06-13 06:05:10,703][71000] InferenceWorker_p0-w0: resuming experience collection (36400 times) [2024-06-13 06:05:10,827][71000] Updated weights for policy 0, policy_version 177594 (0.0023) [2024-06-13 06:05:10,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2909700096. Throughput: 0: 48914.1. Samples: 2438500280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:05:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:05:14,644][71000] Updated weights for policy 0, policy_version 177604 (0.0035) [2024-06-13 06:05:15,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2909913088. Throughput: 0: 49006.1. Samples: 2438802800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:05:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:05:17,494][71000] Updated weights for policy 0, policy_version 177614 (0.0024) [2024-06-13 06:05:20,939][70768] Fps is (10 sec: 45876.1, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2910158848. Throughput: 0: 48710.4. Samples: 2438939100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:05:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:05:21,305][71000] Updated weights for policy 0, policy_version 177624 (0.0029) [2024-06-13 06:05:24,585][71000] Updated weights for policy 0, policy_version 177634 (0.0033) [2024-06-13 06:05:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2910404608. Throughput: 0: 48668.5. Samples: 2439231500. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:05:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:05:27,887][71000] Updated weights for policy 0, policy_version 177644 (0.0022) [2024-06-13 06:05:30,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48605.8, 300 sec: 49096.5). Total num frames: 2910666752. Throughput: 0: 48925.7. Samples: 2439531560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:05:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:05:31,066][71000] Updated weights for policy 0, policy_version 177654 (0.0031) [2024-06-13 06:05:35,110][71000] Updated weights for policy 0, policy_version 177664 (0.0029) [2024-06-13 06:05:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2910896128. Throughput: 0: 48936.4. Samples: 2439677520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:05:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 06:05:37,656][71000] Updated weights for policy 0, policy_version 177674 (0.0023) [2024-06-13 06:05:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2911141888. Throughput: 0: 49014.7. Samples: 2439970960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:05:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:05:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000177683_2911158272.pth... [2024-06-13 06:05:40,987][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000176964_2899378176.pth [2024-06-13 06:05:41,662][71000] Updated weights for policy 0, policy_version 177684 (0.0030) [2024-06-13 06:05:44,849][71000] Updated weights for policy 0, policy_version 177694 (0.0029) [2024-06-13 06:05:45,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2911404032. Throughput: 0: 49113.9. Samples: 2440263100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:05:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:05:48,146][71000] Updated weights for policy 0, policy_version 177704 (0.0025) [2024-06-13 06:05:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2911633408. Throughput: 0: 48998.8. Samples: 2440407640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:05:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:05:51,687][71000] Updated weights for policy 0, policy_version 177714 (0.0039) [2024-06-13 06:05:54,630][71000] Updated weights for policy 0, policy_version 177724 (0.0037) [2024-06-13 06:05:55,939][70768] Fps is (10 sec: 44236.8, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 2911846400. Throughput: 0: 49064.2. Samples: 2440708160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:05:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:05:58,198][71000] Updated weights for policy 0, policy_version 177734 (0.0026) [2024-06-13 06:06:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2912124928. Throughput: 0: 48668.4. Samples: 2440992880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:06:00,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:06:01,622][71000] Updated weights for policy 0, policy_version 177744 (0.0041) [2024-06-13 06:06:04,882][71000] Updated weights for policy 0, policy_version 177754 (0.0031) [2024-06-13 06:06:05,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2912370688. Throughput: 0: 49067.1. Samples: 2441147120. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 06:06:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:06:08,374][71000] Updated weights for policy 0, policy_version 177764 (0.0026) [2024-06-13 06:06:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.9, 300 sec: 48929.8). Total num frames: 2912600064. Throughput: 0: 49051.6. Samples: 2441438820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 06:06:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:06:11,684][70980] Signal inference workers to stop experience collection... (36450 times) [2024-06-13 06:06:11,685][70980] Signal inference workers to resume experience collection... (36450 times) [2024-06-13 06:06:11,700][71000] InferenceWorker_p0-w0: stopping experience collection (36450 times) [2024-06-13 06:06:11,700][71000] InferenceWorker_p0-w0: resuming experience collection (36450 times) [2024-06-13 06:06:11,832][71000] Updated weights for policy 0, policy_version 177774 (0.0033) [2024-06-13 06:06:15,200][71000] Updated weights for policy 0, policy_version 177784 (0.0032) [2024-06-13 06:06:15,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 2912845824. Throughput: 0: 48880.3. Samples: 2441731180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 06:06:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:06:18,514][71000] Updated weights for policy 0, policy_version 177794 (0.0025) [2024-06-13 06:06:20,939][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2913107968. Throughput: 0: 49019.3. Samples: 2441883380. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 06:06:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:06:21,898][71000] Updated weights for policy 0, policy_version 177804 (0.0033) [2024-06-13 06:06:25,275][71000] Updated weights for policy 0, policy_version 177814 (0.0023) [2024-06-13 06:06:25,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2913353728. Throughput: 0: 48883.1. Samples: 2442170700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 06:06:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:06:28,398][71000] Updated weights for policy 0, policy_version 177824 (0.0033) [2024-06-13 06:06:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 2913583104. Throughput: 0: 48728.8. Samples: 2442455900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 06:06:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:06:31,994][71000] Updated weights for policy 0, policy_version 177834 (0.0029) [2024-06-13 06:06:35,015][71000] Updated weights for policy 0, policy_version 177844 (0.0029) [2024-06-13 06:06:35,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2913812480. Throughput: 0: 48728.5. Samples: 2442600420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 06:06:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:06:38,704][71000] Updated weights for policy 0, policy_version 177854 (0.0032) [2024-06-13 06:06:40,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 2914091008. Throughput: 0: 48693.3. Samples: 2442899360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 06:06:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:06:41,755][71000] Updated weights for policy 0, policy_version 177864 (0.0024) [2024-06-13 06:06:45,366][71000] Updated weights for policy 0, policy_version 177874 (0.0025) [2024-06-13 06:06:45,939][70768] Fps is (10 sec: 50790.9, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2914320384. Throughput: 0: 48910.8. Samples: 2443193860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 06:06:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:06:48,427][71000] Updated weights for policy 0, policy_version 177884 (0.0035) [2024-06-13 06:06:50,939][70768] Fps is (10 sec: 45875.2, 60 sec: 48605.9, 300 sec: 48874.7). Total num frames: 2914549760. Throughput: 0: 48560.4. Samples: 2443332340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 06:06:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:06:51,946][71000] Updated weights for policy 0, policy_version 177894 (0.0031) [2024-06-13 06:06:54,723][71000] Updated weights for policy 0, policy_version 177904 (0.0030) [2024-06-13 06:06:55,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 2914795520. Throughput: 0: 48667.6. Samples: 2443628860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 06:06:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:06:58,645][71000] Updated weights for policy 0, policy_version 177914 (0.0025) [2024-06-13 06:07:00,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2915074048. Throughput: 0: 48737.8. Samples: 2443924380. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 06:07:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:07:01,617][71000] Updated weights for policy 0, policy_version 177924 (0.0033) [2024-06-13 06:07:05,467][71000] Updated weights for policy 0, policy_version 177934 (0.0026) [2024-06-13 06:07:05,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48605.7, 300 sec: 48874.3). Total num frames: 2915287040. Throughput: 0: 48752.2. Samples: 2444077240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 06:07:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:07:08,202][71000] Updated weights for policy 0, policy_version 177944 (0.0023) [2024-06-13 06:07:10,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 2915532800. Throughput: 0: 48770.6. Samples: 2444365380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 06:07:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:07:12,046][71000] Updated weights for policy 0, policy_version 177954 (0.0035) [2024-06-13 06:07:14,915][71000] Updated weights for policy 0, policy_version 177964 (0.0030) [2024-06-13 06:07:15,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 2915778560. Throughput: 0: 48744.8. Samples: 2444649420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 06:07:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:07:17,636][70980] Signal inference workers to stop experience collection... (36500 times) [2024-06-13 06:07:17,637][70980] Signal inference workers to resume experience collection... 
(36500 times) [2024-06-13 06:07:17,681][71000] InferenceWorker_p0-w0: stopping experience collection (36500 times) [2024-06-13 06:07:17,682][71000] InferenceWorker_p0-w0: resuming experience collection (36500 times) [2024-06-13 06:07:18,623][71000] Updated weights for policy 0, policy_version 177974 (0.0024) [2024-06-13 06:07:20,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2916040704. Throughput: 0: 49062.2. Samples: 2444808220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 06:07:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:07:21,454][71000] Updated weights for policy 0, policy_version 177984 (0.0022) [2024-06-13 06:07:25,327][71000] Updated weights for policy 0, policy_version 177994 (0.0036) [2024-06-13 06:07:25,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 2916286464. Throughput: 0: 49011.1. Samples: 2445104860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 06:07:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:07:28,103][71000] Updated weights for policy 0, policy_version 178004 (0.0044) [2024-06-13 06:07:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.9, 300 sec: 48763.7). Total num frames: 2916499456. Throughput: 0: 49200.8. Samples: 2445407900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 06:07:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:07:32,052][71000] Updated weights for policy 0, policy_version 178014 (0.0037) [2024-06-13 06:07:34,659][71000] Updated weights for policy 0, policy_version 178024 (0.0028) [2024-06-13 06:07:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 2916777984. Throughput: 0: 49048.4. Samples: 2445539520. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 06:07:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:07:38,442][71000] Updated weights for policy 0, policy_version 178034 (0.0036) [2024-06-13 06:07:40,940][70768] Fps is (10 sec: 52427.7, 60 sec: 48878.8, 300 sec: 48985.3). Total num frames: 2917023744. Throughput: 0: 49195.8. Samples: 2445842680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 06:07:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:07:40,966][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000178042_2917040128.pth... [2024-06-13 06:07:41,007][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000177324_2905276416.pth [2024-06-13 06:07:41,562][71000] Updated weights for policy 0, policy_version 178044 (0.0026) [2024-06-13 06:07:45,294][71000] Updated weights for policy 0, policy_version 178054 (0.0028) [2024-06-13 06:07:45,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 2917269504. Throughput: 0: 49147.2. Samples: 2446136000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 06:07:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:07:48,303][71000] Updated weights for policy 0, policy_version 178064 (0.0038) [2024-06-13 06:07:50,939][70768] Fps is (10 sec: 42599.4, 60 sec: 48332.8, 300 sec: 48763.3). Total num frames: 2917449728. Throughput: 0: 48872.3. Samples: 2446276480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 06:07:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:07:52,005][71000] Updated weights for policy 0, policy_version 178074 (0.0032) [2024-06-13 06:07:54,972][71000] Updated weights for policy 0, policy_version 178084 (0.0026) [2024-06-13 06:07:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 2917761024. Throughput: 0: 48928.5. Samples: 2446567160. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 06:07:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:07:58,655][71000] Updated weights for policy 0, policy_version 178094 (0.0027) [2024-06-13 06:08:00,940][70768] Fps is (10 sec: 55704.4, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2918006784. Throughput: 0: 49196.3. Samples: 2446863260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 06:08:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:08:01,834][71000] Updated weights for policy 0, policy_version 178104 (0.0027) [2024-06-13 06:08:05,438][71000] Updated weights for policy 0, policy_version 178114 (0.0023) [2024-06-13 06:08:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.2, 300 sec: 48929.9). Total num frames: 2918252544. Throughput: 0: 49056.4. Samples: 2447015760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 06:08:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:08:08,604][71000] Updated weights for policy 0, policy_version 178124 (0.0034) [2024-06-13 06:08:10,940][70768] Fps is (10 sec: 42599.0, 60 sec: 48332.8, 300 sec: 48763.5). Total num frames: 2918432768. Throughput: 0: 48656.0. Samples: 2447294380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 06:08:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:08:11,595][70980] Signal inference workers to stop experience collection... (36550 times) [2024-06-13 06:08:11,641][71000] InferenceWorker_p0-w0: stopping experience collection (36550 times) [2024-06-13 06:08:11,655][70980] Signal inference workers to resume experience collection... 
(36550 times) [2024-06-13 06:08:11,656][71000] InferenceWorker_p0-w0: resuming experience collection (36550 times) [2024-06-13 06:08:12,114][71000] Updated weights for policy 0, policy_version 178134 (0.0028) [2024-06-13 06:08:15,158][71000] Updated weights for policy 0, policy_version 178144 (0.0035) [2024-06-13 06:08:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2918744064. Throughput: 0: 48474.7. Samples: 2447589260. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-13 06:08:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:08:18,702][71000] Updated weights for policy 0, policy_version 178154 (0.0031) [2024-06-13 06:08:20,940][70768] Fps is (10 sec: 54067.1, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 2918973440. Throughput: 0: 48931.1. Samples: 2447741420. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-13 06:08:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:08:21,889][71000] Updated weights for policy 0, policy_version 178164 (0.0027) [2024-06-13 06:08:25,571][71000] Updated weights for policy 0, policy_version 178174 (0.0025) [2024-06-13 06:08:25,940][70768] Fps is (10 sec: 47512.5, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 2919219200. Throughput: 0: 48828.9. Samples: 2448039980. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-13 06:08:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:08:28,664][71000] Updated weights for policy 0, policy_version 178184 (0.0026) [2024-06-13 06:08:30,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 2919432192. Throughput: 0: 48799.5. Samples: 2448331980. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-13 06:08:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:08:32,374][71000] Updated weights for policy 0, policy_version 178194 (0.0027) [2024-06-13 06:08:35,402][71000] Updated weights for policy 0, policy_version 178204 (0.0027) [2024-06-13 06:08:35,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2919727104. Throughput: 0: 48843.9. Samples: 2448474460. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-13 06:08:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:08:38,961][71000] Updated weights for policy 0, policy_version 178214 (0.0028) [2024-06-13 06:08:40,940][70768] Fps is (10 sec: 52427.9, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 2919956480. Throughput: 0: 48909.6. Samples: 2448768100. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-13 06:08:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:08:41,821][71000] Updated weights for policy 0, policy_version 178224 (0.0034) [2024-06-13 06:08:45,615][71000] Updated weights for policy 0, policy_version 178234 (0.0030) [2024-06-13 06:08:45,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 2920202240. Throughput: 0: 48932.6. Samples: 2449065220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-13 06:08:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:08:48,762][71000] Updated weights for policy 0, policy_version 178244 (0.0037) [2024-06-13 06:08:50,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49698.0, 300 sec: 48874.3). Total num frames: 2920431616. Throughput: 0: 48683.9. Samples: 2449206540. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-13 06:08:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:08:52,481][71000] Updated weights for policy 0, policy_version 178254 (0.0029) [2024-06-13 06:08:55,557][71000] Updated weights for policy 0, policy_version 178264 (0.0024) [2024-06-13 06:08:55,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 2920693760. Throughput: 0: 49039.9. Samples: 2449501180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-13 06:08:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:08:59,271][71000] Updated weights for policy 0, policy_version 178274 (0.0023) [2024-06-13 06:09:00,939][70768] Fps is (10 sec: 50791.1, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 2920939520. Throughput: 0: 49016.5. Samples: 2449795000. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-13 06:09:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 06:09:02,151][71000] Updated weights for policy 0, policy_version 178284 (0.0029) [2024-06-13 06:09:05,940][70768] Fps is (10 sec: 44237.2, 60 sec: 48059.7, 300 sec: 48763.2). Total num frames: 2921136128. Throughput: 0: 48975.2. Samples: 2449945300. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-13 06:09:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:09:06,222][71000] Updated weights for policy 0, policy_version 178294 (0.0029) [2024-06-13 06:09:07,216][70980] Signal inference workers to stop experience collection... (36600 times) [2024-06-13 06:09:07,216][70980] Signal inference workers to resume experience collection... 
(36600 times) [2024-06-13 06:09:07,235][71000] InferenceWorker_p0-w0: stopping experience collection (36600 times) [2024-06-13 06:09:07,254][71000] InferenceWorker_p0-w0: resuming experience collection (36600 times) [2024-06-13 06:09:08,933][71000] Updated weights for policy 0, policy_version 178304 (0.0026) [2024-06-13 06:09:10,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49698.1, 300 sec: 48929.9). Total num frames: 2921414656. Throughput: 0: 48809.5. Samples: 2450236400. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-13 06:09:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:09:12,813][71000] Updated weights for policy 0, policy_version 178314 (0.0031) [2024-06-13 06:09:15,740][71000] Updated weights for policy 0, policy_version 178324 (0.0027) [2024-06-13 06:09:15,940][70768] Fps is (10 sec: 54067.1, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2921676800. Throughput: 0: 48988.5. Samples: 2450536460. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-13 06:09:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:09:19,310][71000] Updated weights for policy 0, policy_version 178334 (0.0032) [2024-06-13 06:09:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2921922560. Throughput: 0: 49128.0. Samples: 2450685220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 06:09:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:09:22,221][71000] Updated weights for policy 0, policy_version 178344 (0.0035) [2024-06-13 06:09:25,940][70768] Fps is (10 sec: 44236.8, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 2922119168. Throughput: 0: 49039.8. Samples: 2450974880. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 06:09:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:09:26,162][71000] Updated weights for policy 0, policy_version 178354 (0.0040) [2024-06-13 06:09:29,140][71000] Updated weights for policy 0, policy_version 178364 (0.0028) [2024-06-13 06:09:30,940][70768] Fps is (10 sec: 47512.2, 60 sec: 49424.8, 300 sec: 48929.8). Total num frames: 2922397696. Throughput: 0: 48799.2. Samples: 2451261200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 06:09:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:09:32,971][71000] Updated weights for policy 0, policy_version 178374 (0.0032) [2024-06-13 06:09:35,815][71000] Updated weights for policy 0, policy_version 178384 (0.0023) [2024-06-13 06:09:35,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 2922643456. Throughput: 0: 49108.0. Samples: 2451416400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 06:09:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:09:39,476][71000] Updated weights for policy 0, policy_version 178394 (0.0036) [2024-06-13 06:09:40,940][70768] Fps is (10 sec: 49153.5, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 2922889216. Throughput: 0: 49051.2. Samples: 2451708480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 06:09:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:09:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000178399_2922889216.pth... [2024-06-13 06:09:41,015][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000177683_2911158272.pth [2024-06-13 06:09:42,658][71000] Updated weights for policy 0, policy_version 178404 (0.0022) [2024-06-13 06:09:45,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 2923102208. Throughput: 0: 48777.2. Samples: 2451989980. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 06:09:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:09:46,393][71000] Updated weights for policy 0, policy_version 178414 (0.0033) [2024-06-13 06:09:49,577][71000] Updated weights for policy 0, policy_version 178424 (0.0029) [2024-06-13 06:09:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2923380736. Throughput: 0: 48600.3. Samples: 2452132320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 06:09:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:09:53,066][71000] Updated weights for policy 0, policy_version 178434 (0.0031) [2024-06-13 06:09:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 2923610112. Throughput: 0: 48861.3. Samples: 2452435160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 06:09:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:09:56,066][71000] Updated weights for policy 0, policy_version 178444 (0.0030) [2024-06-13 06:09:59,711][71000] Updated weights for policy 0, policy_version 178454 (0.0023) [2024-06-13 06:10:00,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.8, 300 sec: 48929.9). Total num frames: 2923855872. Throughput: 0: 48936.9. Samples: 2452738620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 06:10:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:10:02,483][71000] Updated weights for policy 0, policy_version 178464 (0.0029) [2024-06-13 06:10:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 48818.8). Total num frames: 2924101632. Throughput: 0: 48955.6. Samples: 2452888220. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 06:10:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:10:06,225][71000] Updated weights for policy 0, policy_version 178474 (0.0033) [2024-06-13 06:10:09,238][71000] Updated weights for policy 0, policy_version 178484 (0.0028) [2024-06-13 06:10:10,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2924380160. Throughput: 0: 49027.5. Samples: 2453181120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 06:10:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:10:12,828][71000] Updated weights for policy 0, policy_version 178494 (0.0033) [2024-06-13 06:10:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 2924593152. Throughput: 0: 48900.8. Samples: 2453461720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 06:10:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:10:16,018][71000] Updated weights for policy 0, policy_version 178504 (0.0020) [2024-06-13 06:10:19,501][71000] Updated weights for policy 0, policy_version 178514 (0.0028) [2024-06-13 06:10:20,939][70768] Fps is (10 sec: 44237.2, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 2924822528. Throughput: 0: 48638.3. Samples: 2453605120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 06:10:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:10:22,811][71000] Updated weights for policy 0, policy_version 178524 (0.0024) [2024-06-13 06:10:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 2925084672. Throughput: 0: 48812.0. Samples: 2453905020. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 06:10:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:10:26,366][71000] Updated weights for policy 0, policy_version 178534 (0.0026) [2024-06-13 06:10:27,174][70980] Signal inference workers to stop experience collection... (36650 times) [2024-06-13 06:10:27,175][70980] Signal inference workers to resume experience collection... (36650 times) [2024-06-13 06:10:27,194][71000] InferenceWorker_p0-w0: stopping experience collection (36650 times) [2024-06-13 06:10:27,194][71000] InferenceWorker_p0-w0: resuming experience collection (36650 times) [2024-06-13 06:10:29,238][71000] Updated weights for policy 0, policy_version 178544 (0.0030) [2024-06-13 06:10:30,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49152.3, 300 sec: 48985.4). Total num frames: 2925346816. Throughput: 0: 49133.3. Samples: 2454200980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:10:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:10:33,171][71000] Updated weights for policy 0, policy_version 178554 (0.0035) [2024-06-13 06:10:35,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 2925559808. Throughput: 0: 49177.4. Samples: 2454345300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:10:35,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 06:10:36,357][71000] Updated weights for policy 0, policy_version 178564 (0.0029) [2024-06-13 06:10:39,843][71000] Updated weights for policy 0, policy_version 178574 (0.0023) [2024-06-13 06:10:40,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48605.8, 300 sec: 48818.7). Total num frames: 2925805568. Throughput: 0: 49055.0. Samples: 2454642640. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:10:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:10:42,775][71000] Updated weights for policy 0, policy_version 178584 (0.0032) [2024-06-13 06:10:45,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 2926067712. Throughput: 0: 48823.1. Samples: 2454935660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:10:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:10:46,246][71000] Updated weights for policy 0, policy_version 178594 (0.0037) [2024-06-13 06:10:49,287][71000] Updated weights for policy 0, policy_version 178604 (0.0029) [2024-06-13 06:10:50,940][70768] Fps is (10 sec: 50791.1, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2926313472. Throughput: 0: 48808.9. Samples: 2455084620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:10:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:10:52,881][71000] Updated weights for policy 0, policy_version 178614 (0.0032) [2024-06-13 06:10:55,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 2926559232. Throughput: 0: 48964.9. Samples: 2455384540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:10:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 06:10:56,089][71000] Updated weights for policy 0, policy_version 178624 (0.0022) [2024-06-13 06:10:59,626][71000] Updated weights for policy 0, policy_version 178634 (0.0030) [2024-06-13 06:11:00,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 2926788608. Throughput: 0: 49380.9. Samples: 2455683860. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:11:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:11:02,569][71000] Updated weights for policy 0, policy_version 178644 (0.0040) [2024-06-13 06:11:05,942][70768] Fps is (10 sec: 49140.4, 60 sec: 49150.1, 300 sec: 48985.0). Total num frames: 2927050752. Throughput: 0: 49375.6. Samples: 2455827140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:11:05,943][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:11:05,992][71000] Updated weights for policy 0, policy_version 178654 (0.0032) [2024-06-13 06:11:09,463][71000] Updated weights for policy 0, policy_version 178664 (0.0039) [2024-06-13 06:11:10,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2927312896. Throughput: 0: 49236.9. Samples: 2456120680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:11:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:11:13,032][71000] Updated weights for policy 0, policy_version 178674 (0.0027) [2024-06-13 06:11:15,940][70768] Fps is (10 sec: 47523.7, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 2927525888. Throughput: 0: 48934.0. Samples: 2456403020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:11:15,941][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:11:16,458][71000] Updated weights for policy 0, policy_version 178684 (0.0027) [2024-06-13 06:11:19,621][71000] Updated weights for policy 0, policy_version 178694 (0.0032) [2024-06-13 06:11:20,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 2927771648. Throughput: 0: 48757.8. Samples: 2456539400. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:11:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:11:22,904][71000] Updated weights for policy 0, policy_version 178704 (0.0027) [2024-06-13 06:11:25,939][70768] Fps is (10 sec: 49153.4, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 2928017408. Throughput: 0: 48805.6. Samples: 2456838880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:11:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:11:26,233][71000] Updated weights for policy 0, policy_version 178714 (0.0035) [2024-06-13 06:11:29,690][71000] Updated weights for policy 0, policy_version 178724 (0.0031) [2024-06-13 06:11:30,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2928279552. Throughput: 0: 48929.7. Samples: 2457137500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:11:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:11:31,563][70980] Signal inference workers to stop experience collection... (36700 times) [2024-06-13 06:11:31,565][70980] Signal inference workers to resume experience collection... (36700 times) [2024-06-13 06:11:31,612][71000] InferenceWorker_p0-w0: stopping experience collection (36700 times) [2024-06-13 06:11:31,612][71000] InferenceWorker_p0-w0: resuming experience collection (36700 times) [2024-06-13 06:11:33,103][71000] Updated weights for policy 0, policy_version 178734 (0.0032) [2024-06-13 06:11:35,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49151.8, 300 sec: 48874.3). Total num frames: 2928508928. Throughput: 0: 48815.3. Samples: 2457281320. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 06:11:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:11:36,455][71000] Updated weights for policy 0, policy_version 178744 (0.0033) [2024-06-13 06:11:39,769][71000] Updated weights for policy 0, policy_version 178754 (0.0043) [2024-06-13 06:11:40,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 2928738304. Throughput: 0: 48504.0. Samples: 2457567220. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 06:11:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:11:41,004][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000178757_2928754688.pth... [2024-06-13 06:11:41,061][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000178042_2917040128.pth [2024-06-13 06:11:43,157][71000] Updated weights for policy 0, policy_version 178764 (0.0023) [2024-06-13 06:11:45,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 2929000448. Throughput: 0: 48410.9. Samples: 2457862360. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 06:11:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:11:46,626][71000] Updated weights for policy 0, policy_version 178774 (0.0027) [2024-06-13 06:11:49,817][71000] Updated weights for policy 0, policy_version 178784 (0.0041) [2024-06-13 06:11:50,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 2929229824. Throughput: 0: 48767.5. Samples: 2458021560. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 06:11:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:11:53,108][71000] Updated weights for policy 0, policy_version 178794 (0.0025) [2024-06-13 06:11:55,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 2929459200. Throughput: 0: 48605.4. Samples: 2458307920. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 06:11:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:11:56,693][71000] Updated weights for policy 0, policy_version 178804 (0.0032) [2024-06-13 06:11:59,637][71000] Updated weights for policy 0, policy_version 178814 (0.0029) [2024-06-13 06:12:00,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 2929721344. Throughput: 0: 48661.1. Samples: 2458592760. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 06:12:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:12:03,272][71000] Updated weights for policy 0, policy_version 178824 (0.0036) [2024-06-13 06:12:05,940][70768] Fps is (10 sec: 52426.9, 60 sec: 48880.6, 300 sec: 48985.3). Total num frames: 2929983488. Throughput: 0: 49015.2. Samples: 2458745100. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 06:12:05,949][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:12:06,609][71000] Updated weights for policy 0, policy_version 178834 (0.0027) [2024-06-13 06:12:10,053][71000] Updated weights for policy 0, policy_version 178844 (0.0032) [2024-06-13 06:12:10,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 2930229248. Throughput: 0: 49203.9. Samples: 2459053060. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 06:12:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:12:13,120][71000] Updated weights for policy 0, policy_version 178854 (0.0027) [2024-06-13 06:12:15,940][70768] Fps is (10 sec: 47515.1, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 2930458624. Throughput: 0: 48975.3. Samples: 2459341380. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 06:12:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:12:16,621][71000] Updated weights for policy 0, policy_version 178864 (0.0032) [2024-06-13 06:12:19,725][71000] Updated weights for policy 0, policy_version 178874 (0.0020) [2024-06-13 06:12:20,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 2930720768. Throughput: 0: 48862.4. Samples: 2459480120. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 06:12:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:12:23,553][71000] Updated weights for policy 0, policy_version 178884 (0.0020) [2024-06-13 06:12:25,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2930966528. Throughput: 0: 49148.8. Samples: 2459778920. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 06:12:25,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:12:26,299][71000] Updated weights for policy 0, policy_version 178894 (0.0025) [2024-06-13 06:12:30,020][71000] Updated weights for policy 0, policy_version 178904 (0.0027) [2024-06-13 06:12:30,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 2931195904. Throughput: 0: 49301.6. Samples: 2460080920. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 06:12:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:12:33,080][71000] Updated weights for policy 0, policy_version 178914 (0.0030) [2024-06-13 06:12:35,939][70768] Fps is (10 sec: 45875.9, 60 sec: 48606.1, 300 sec: 48818.8). Total num frames: 2931425280. Throughput: 0: 48899.2. Samples: 2460222020. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-13 06:12:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:12:36,864][71000] Updated weights for policy 0, policy_version 178924 (0.0026) [2024-06-13 06:12:40,061][71000] Updated weights for policy 0, policy_version 178934 (0.0029) [2024-06-13 06:12:40,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 2931703808. Throughput: 0: 49067.5. Samples: 2460515960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:12:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:12:43,591][71000] Updated weights for policy 0, policy_version 178944 (0.0028) [2024-06-13 06:12:45,940][70768] Fps is (10 sec: 52427.6, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2931949568. Throughput: 0: 49223.4. Samples: 2460807820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:12:45,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 06:12:46,590][71000] Updated weights for policy 0, policy_version 178954 (0.0032) [2024-06-13 06:12:49,574][70980] Signal inference workers to stop experience collection... (36750 times) [2024-06-13 06:12:49,575][70980] Signal inference workers to resume experience collection... (36750 times) [2024-06-13 06:12:49,611][71000] InferenceWorker_p0-w0: stopping experience collection (36750 times) [2024-06-13 06:12:49,611][71000] InferenceWorker_p0-w0: resuming experience collection (36750 times) [2024-06-13 06:12:50,089][71000] Updated weights for policy 0, policy_version 178964 (0.0030) [2024-06-13 06:12:50,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 2932178944. Throughput: 0: 49126.1. Samples: 2460955760. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:12:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:12:53,243][71000] Updated weights for policy 0, policy_version 178974 (0.0032) [2024-06-13 06:12:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.0, 300 sec: 48929.8). Total num frames: 2932441088. Throughput: 0: 49033.3. Samples: 2461259560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:12:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:12:56,679][71000] Updated weights for policy 0, policy_version 178984 (0.0025) [2024-06-13 06:12:59,809][71000] Updated weights for policy 0, policy_version 178994 (0.0019) [2024-06-13 06:13:00,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 2932686848. Throughput: 0: 49263.1. Samples: 2461558220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:13:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:13:03,418][71000] Updated weights for policy 0, policy_version 179004 (0.0025) [2024-06-13 06:13:05,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.2, 300 sec: 49152.0). Total num frames: 2932932608. Throughput: 0: 49628.0. Samples: 2461713380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:13:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:13:06,318][71000] Updated weights for policy 0, policy_version 179014 (0.0028) [2024-06-13 06:13:09,915][71000] Updated weights for policy 0, policy_version 179024 (0.0027) [2024-06-13 06:13:10,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.2, 300 sec: 48985.4). Total num frames: 2933194752. Throughput: 0: 49608.2. Samples: 2462011280. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:13:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:13:12,951][71000] Updated weights for policy 0, policy_version 179034 (0.0029) [2024-06-13 06:13:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 2933424128. Throughput: 0: 49388.3. Samples: 2462303400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:13:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:13:16,529][71000] Updated weights for policy 0, policy_version 179044 (0.0035) [2024-06-13 06:13:19,418][71000] Updated weights for policy 0, policy_version 179054 (0.0026) [2024-06-13 06:13:20,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2933669888. Throughput: 0: 49575.0. Samples: 2462452900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:13:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:13:23,152][71000] Updated weights for policy 0, policy_version 179064 (0.0032) [2024-06-13 06:13:25,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2933932032. Throughput: 0: 49614.2. Samples: 2462748600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:13:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:13:26,240][71000] Updated weights for policy 0, policy_version 179074 (0.0028) [2024-06-13 06:13:29,748][71000] Updated weights for policy 0, policy_version 179084 (0.0021) [2024-06-13 06:13:30,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.1, 300 sec: 48985.4). Total num frames: 2934177792. Throughput: 0: 49794.4. Samples: 2463048560. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:13:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:13:32,854][71000] Updated weights for policy 0, policy_version 179094 (0.0027) [2024-06-13 06:13:35,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49698.0, 300 sec: 48985.4). Total num frames: 2934407168. Throughput: 0: 49724.9. Samples: 2463193380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:13:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:13:36,453][71000] Updated weights for policy 0, policy_version 179104 (0.0037) [2024-06-13 06:13:39,378][71000] Updated weights for policy 0, policy_version 179114 (0.0031) [2024-06-13 06:13:40,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2934669312. Throughput: 0: 49427.3. Samples: 2463483780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:13:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:13:41,019][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000179119_2934685696.pth... [2024-06-13 06:13:41,063][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000178399_2922889216.pth [2024-06-13 06:13:43,118][71000] Updated weights for policy 0, policy_version 179124 (0.0029) [2024-06-13 06:13:45,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2934898688. Throughput: 0: 49299.1. Samples: 2463776680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:13:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:13:46,205][71000] Updated weights for policy 0, policy_version 179134 (0.0033) [2024-06-13 06:13:50,109][71000] Updated weights for policy 0, policy_version 179144 (0.0030) [2024-06-13 06:13:50,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 2935144448. Throughput: 0: 48895.1. Samples: 2463913660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:13:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:13:52,708][70980] Signal inference workers to stop experience collection... (36800 times) [2024-06-13 06:13:52,730][71000] InferenceWorker_p0-w0: stopping experience collection (36800 times) [2024-06-13 06:13:52,764][70980] Signal inference workers to resume experience collection... (36800 times) [2024-06-13 06:13:52,764][71000] InferenceWorker_p0-w0: resuming experience collection (36800 times) [2024-06-13 06:13:52,900][71000] Updated weights for policy 0, policy_version 179154 (0.0025) [2024-06-13 06:13:55,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 48985.3). Total num frames: 2935390208. Throughput: 0: 48936.7. Samples: 2464213440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:13:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:13:56,468][71000] Updated weights for policy 0, policy_version 179164 (0.0033) [2024-06-13 06:13:59,678][71000] Updated weights for policy 0, policy_version 179174 (0.0034) [2024-06-13 06:14:00,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2935635968. Throughput: 0: 49001.7. Samples: 2464508480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:14:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 06:14:03,431][71000] Updated weights for policy 0, policy_version 179184 (0.0034) [2024-06-13 06:14:05,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2935881728. Throughput: 0: 48969.8. Samples: 2464656540. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:14:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:14:06,292][71000] Updated weights for policy 0, policy_version 179194 (0.0028) [2024-06-13 06:14:10,098][71000] Updated weights for policy 0, policy_version 179204 (0.0030) [2024-06-13 06:14:10,939][70768] Fps is (10 sec: 49152.8, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2936127488. Throughput: 0: 48973.4. Samples: 2464952400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:14:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:14:12,909][71000] Updated weights for policy 0, policy_version 179214 (0.0027) [2024-06-13 06:14:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 2936373248. Throughput: 0: 48882.6. Samples: 2465248280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:14:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:14:16,807][71000] Updated weights for policy 0, policy_version 179224 (0.0030) [2024-06-13 06:14:19,483][71000] Updated weights for policy 0, policy_version 179234 (0.0023) [2024-06-13 06:14:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2936619008. Throughput: 0: 48820.5. Samples: 2465390300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:14:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:14:23,202][71000] Updated weights for policy 0, policy_version 179244 (0.0029) [2024-06-13 06:14:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 49041.0). Total num frames: 2936864768. Throughput: 0: 49037.6. Samples: 2465690480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:14:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:14:26,258][71000] Updated weights for policy 0, policy_version 179254 (0.0037) [2024-06-13 06:14:29,877][71000] Updated weights for policy 0, policy_version 179264 (0.0021) [2024-06-13 06:14:30,940][70768] Fps is (10 sec: 49150.7, 60 sec: 48878.7, 300 sec: 49040.9). Total num frames: 2937110528. Throughput: 0: 49256.6. Samples: 2465993240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:14:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:14:32,817][71000] Updated weights for policy 0, policy_version 179274 (0.0027) [2024-06-13 06:14:35,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2937339904. Throughput: 0: 49287.0. Samples: 2466131580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:14:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:14:36,676][71000] Updated weights for policy 0, policy_version 179284 (0.0036) [2024-06-13 06:14:39,458][71000] Updated weights for policy 0, policy_version 179294 (0.0038) [2024-06-13 06:14:40,940][70768] Fps is (10 sec: 50791.7, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2937618432. Throughput: 0: 49211.7. Samples: 2466427960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:14:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:14:43,034][71000] Updated weights for policy 0, policy_version 179304 (0.0031) [2024-06-13 06:14:45,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2937847808. Throughput: 0: 49295.7. Samples: 2466726780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:14:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:14:46,279][71000] Updated weights for policy 0, policy_version 179314 (0.0031) [2024-06-13 06:14:49,917][71000] Updated weights for policy 0, policy_version 179324 (0.0027) [2024-06-13 06:14:50,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2938093568. Throughput: 0: 49283.9. Samples: 2466874320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:14:50,950][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:14:52,608][71000] Updated weights for policy 0, policy_version 179334 (0.0034) [2024-06-13 06:14:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2938339328. Throughput: 0: 49274.6. Samples: 2467169760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:14:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:14:56,397][71000] Updated weights for policy 0, policy_version 179344 (0.0035) [2024-06-13 06:14:59,517][71000] Updated weights for policy 0, policy_version 179354 (0.0025) [2024-06-13 06:15:00,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49698.2, 300 sec: 49207.5). Total num frames: 2938617856. Throughput: 0: 49315.2. Samples: 2467467460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:15:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:15:02,689][71000] Updated weights for policy 0, policy_version 179364 (0.0031) [2024-06-13 06:15:05,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2938847232. Throughput: 0: 49696.0. Samples: 2467626620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:15:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:15:06,199][71000] Updated weights for policy 0, policy_version 179374 (0.0025) [2024-06-13 06:15:09,379][71000] Updated weights for policy 0, policy_version 179384 (0.0025) [2024-06-13 06:15:10,832][70980] Signal inference workers to stop experience collection... (36850 times) [2024-06-13 06:15:10,833][70980] Signal inference workers to resume experience collection... (36850 times) [2024-06-13 06:15:10,884][71000] InferenceWorker_p0-w0: stopping experience collection (36850 times) [2024-06-13 06:15:10,884][71000] InferenceWorker_p0-w0: resuming experience collection (36850 times) [2024-06-13 06:15:10,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2939092992. Throughput: 0: 49711.3. Samples: 2467927480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:15:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:15:12,472][71000] Updated weights for policy 0, policy_version 179394 (0.0025) [2024-06-13 06:15:15,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2939322368. Throughput: 0: 49570.0. Samples: 2468223880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:15:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:15:16,273][71000] Updated weights for policy 0, policy_version 179404 (0.0027) [2024-06-13 06:15:19,241][71000] Updated weights for policy 0, policy_version 179414 (0.0029) [2024-06-13 06:15:20,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49971.1, 300 sec: 49263.1). Total num frames: 2939617280. Throughput: 0: 49636.9. Samples: 2468365240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:15:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:15:22,521][71000] Updated weights for policy 0, policy_version 179424 (0.0028) [2024-06-13 06:15:25,666][71000] Updated weights for policy 0, policy_version 179434 (0.0027) [2024-06-13 06:15:25,939][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.3, 300 sec: 49152.0). Total num frames: 2939846656. Throughput: 0: 49860.5. Samples: 2468671680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:15:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:15:29,243][71000] Updated weights for policy 0, policy_version 179444 (0.0032) [2024-06-13 06:15:30,939][70768] Fps is (10 sec: 47514.5, 60 sec: 49698.4, 300 sec: 49263.1). Total num frames: 2940092416. Throughput: 0: 49841.0. Samples: 2468969620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:15:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 06:15:32,566][71000] Updated weights for policy 0, policy_version 179454 (0.0023) [2024-06-13 06:15:35,940][70768] Fps is (10 sec: 47512.9, 60 sec: 49698.2, 300 sec: 49207.6). Total num frames: 2940321792. Throughput: 0: 49650.3. Samples: 2469108580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:15:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:15:36,077][71000] Updated weights for policy 0, policy_version 179464 (0.0031) [2024-06-13 06:15:39,043][71000] Updated weights for policy 0, policy_version 179474 (0.0033) [2024-06-13 06:15:40,939][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2940600320. Throughput: 0: 49777.4. Samples: 2469409740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:15:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 06:15:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000179480_2940600320.pth... 
[2024-06-13 06:15:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000178757_2928754688.pth [2024-06-13 06:15:42,298][71000] Updated weights for policy 0, policy_version 179484 (0.0026) [2024-06-13 06:15:45,419][71000] Updated weights for policy 0, policy_version 179494 (0.0023) [2024-06-13 06:15:45,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2940829696. Throughput: 0: 49813.3. Samples: 2469709060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:15:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:15:49,083][71000] Updated weights for policy 0, policy_version 179504 (0.0028) [2024-06-13 06:15:50,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49971.3, 300 sec: 49263.1). Total num frames: 2941091840. Throughput: 0: 49649.8. Samples: 2469860860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:15:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:15:52,465][71000] Updated weights for policy 0, policy_version 179514 (0.0025) [2024-06-13 06:15:55,795][71000] Updated weights for policy 0, policy_version 179524 (0.0028) [2024-06-13 06:15:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 2941321216. Throughput: 0: 49296.3. Samples: 2470145820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 06:15:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:15:58,906][71000] Updated weights for policy 0, policy_version 179534 (0.0036) [2024-06-13 06:16:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49263.5). Total num frames: 2941583360. Throughput: 0: 49451.2. Samples: 2470449180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 06:16:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:16:02,065][71000] Updated weights for policy 0, policy_version 179544 (0.0027) [2024-06-13 06:16:05,622][71000] Updated weights for policy 0, policy_version 179554 (0.0026) [2024-06-13 06:16:05,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49698.2, 300 sec: 49207.5). Total num frames: 2941829120. Throughput: 0: 49713.9. Samples: 2470602360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 06:16:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:16:08,692][71000] Updated weights for policy 0, policy_version 179564 (0.0027) [2024-06-13 06:16:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 49318.7). Total num frames: 2942074880. Throughput: 0: 49379.9. Samples: 2470893780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 06:16:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:16:12,375][71000] Updated weights for policy 0, policy_version 179574 (0.0028) [2024-06-13 06:16:15,567][71000] Updated weights for policy 0, policy_version 179584 (0.0030) [2024-06-13 06:16:15,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 2942304256. Throughput: 0: 49239.9. Samples: 2471185420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 06:16:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:16:18,882][71000] Updated weights for policy 0, policy_version 179594 (0.0034) [2024-06-13 06:16:20,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2942566400. Throughput: 0: 49385.0. Samples: 2471330900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 06:16:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:16:21,516][70980] Signal inference workers to stop experience collection... 
(36900 times) [2024-06-13 06:16:21,516][70980] Signal inference workers to resume experience collection... (36900 times) [2024-06-13 06:16:21,529][71000] InferenceWorker_p0-w0: stopping experience collection (36900 times) [2024-06-13 06:16:21,529][71000] InferenceWorker_p0-w0: resuming experience collection (36900 times) [2024-06-13 06:16:22,042][71000] Updated weights for policy 0, policy_version 179604 (0.0038) [2024-06-13 06:16:25,536][71000] Updated weights for policy 0, policy_version 179614 (0.0025) [2024-06-13 06:16:25,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2942812160. Throughput: 0: 49310.2. Samples: 2471628700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 06:16:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 06:16:28,498][71000] Updated weights for policy 0, policy_version 179624 (0.0033) [2024-06-13 06:16:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 2943057920. Throughput: 0: 49289.7. Samples: 2471927100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 06:16:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:16:32,201][71000] Updated weights for policy 0, policy_version 179634 (0.0025) [2024-06-13 06:16:35,338][71000] Updated weights for policy 0, policy_version 179644 (0.0028) [2024-06-13 06:16:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 2943303680. Throughput: 0: 49275.1. Samples: 2472078240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 06:16:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:16:38,771][71000] Updated weights for policy 0, policy_version 179654 (0.0031) [2024-06-13 06:16:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2943549440. Throughput: 0: 49447.1. Samples: 2472370940. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 06:16:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:16:41,806][71000] Updated weights for policy 0, policy_version 179664 (0.0026) [2024-06-13 06:16:45,382][71000] Updated weights for policy 0, policy_version 179674 (0.0031) [2024-06-13 06:16:45,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2943778816. Throughput: 0: 49160.5. Samples: 2472661400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 06:16:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:16:48,447][71000] Updated weights for policy 0, policy_version 179684 (0.0030) [2024-06-13 06:16:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 2944040960. Throughput: 0: 48918.6. Samples: 2472803700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 06:16:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:16:52,283][71000] Updated weights for policy 0, policy_version 179694 (0.0033) [2024-06-13 06:16:55,806][71000] Updated weights for policy 0, policy_version 179704 (0.0035) [2024-06-13 06:16:55,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2944270336. Throughput: 0: 49106.6. Samples: 2473103580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 06:16:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:16:59,062][71000] Updated weights for policy 0, policy_version 179714 (0.0028) [2024-06-13 06:17:00,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48605.9, 300 sec: 49207.6). Total num frames: 2944499712. Throughput: 0: 49133.8. Samples: 2473396440. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 06:17:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:17:02,373][71000] Updated weights for policy 0, policy_version 179724 (0.0031) [2024-06-13 06:17:05,730][71000] Updated weights for policy 0, policy_version 179734 (0.0022) [2024-06-13 06:17:05,939][70768] Fps is (10 sec: 49152.6, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2944761856. Throughput: 0: 49041.3. Samples: 2473537760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 06:17:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:17:08,890][71000] Updated weights for policy 0, policy_version 179744 (0.0030) [2024-06-13 06:17:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2945007616. Throughput: 0: 48770.7. Samples: 2473823380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 06:17:10,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 06:17:12,223][71000] Updated weights for policy 0, policy_version 179754 (0.0036) [2024-06-13 06:17:15,481][71000] Updated weights for policy 0, policy_version 179764 (0.0024) [2024-06-13 06:17:15,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2945253376. Throughput: 0: 48575.3. Samples: 2474112980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 06:17:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:17:18,913][71000] Updated weights for policy 0, policy_version 179774 (0.0042) [2024-06-13 06:17:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.8, 300 sec: 49207.5). Total num frames: 2945482752. Throughput: 0: 48622.2. Samples: 2474266240. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 06:17:20,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 06:17:22,515][71000] Updated weights for policy 0, policy_version 179784 (0.0030) [2024-06-13 06:17:25,242][71000] Updated weights for policy 0, policy_version 179794 (0.0026) [2024-06-13 06:17:25,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 2945744896. Throughput: 0: 48817.9. Samples: 2474567740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 06:17:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:17:29,030][71000] Updated weights for policy 0, policy_version 179804 (0.0028) [2024-06-13 06:17:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48879.0, 300 sec: 49374.1). Total num frames: 2945990656. Throughput: 0: 48798.5. Samples: 2474857340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 06:17:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 06:17:32,412][71000] Updated weights for policy 0, policy_version 179814 (0.0027) [2024-06-13 06:17:33,648][70980] Signal inference workers to stop experience collection... (36950 times) [2024-06-13 06:17:33,657][71000] InferenceWorker_p0-w0: stopping experience collection (36950 times) [2024-06-13 06:17:33,704][70980] Signal inference workers to resume experience collection... (36950 times) [2024-06-13 06:17:33,704][71000] InferenceWorker_p0-w0: resuming experience collection (36950 times) [2024-06-13 06:17:35,623][71000] Updated weights for policy 0, policy_version 179824 (0.0031) [2024-06-13 06:17:35,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2946252800. Throughput: 0: 49023.2. Samples: 2475009740. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 06:17:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:17:38,830][71000] Updated weights for policy 0, policy_version 179834 (0.0029) [2024-06-13 06:17:40,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2946482176. Throughput: 0: 48835.7. Samples: 2475301180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 06:17:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:17:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000179839_2946482176.pth... [2024-06-13 06:17:41,024][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000179119_2934685696.pth [2024-06-13 06:17:42,299][71000] Updated weights for policy 0, policy_version 179844 (0.0036) [2024-06-13 06:17:45,698][71000] Updated weights for policy 0, policy_version 179854 (0.0036) [2024-06-13 06:17:45,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49151.8, 300 sec: 49318.6). Total num frames: 2946727936. Throughput: 0: 48922.9. Samples: 2475597980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 06:17:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:17:48,858][71000] Updated weights for policy 0, policy_version 179864 (0.0032) [2024-06-13 06:17:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 2946973696. Throughput: 0: 49291.5. Samples: 2475755880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 06:17:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:17:52,233][71000] Updated weights for policy 0, policy_version 179874 (0.0028) [2024-06-13 06:17:55,649][71000] Updated weights for policy 0, policy_version 179884 (0.0028) [2024-06-13 06:17:55,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 2947235840. Throughput: 0: 49307.8. Samples: 2476042240. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 06:17:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:17:59,146][71000] Updated weights for policy 0, policy_version 179894 (0.0028) [2024-06-13 06:18:00,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2947448832. Throughput: 0: 49394.1. Samples: 2476335720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 06:18:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:18:02,527][71000] Updated weights for policy 0, policy_version 179904 (0.0025) [2024-06-13 06:18:05,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2947694592. Throughput: 0: 48791.1. Samples: 2476461840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 06:18:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:18:06,110][71000] Updated weights for policy 0, policy_version 179914 (0.0028) [2024-06-13 06:18:09,271][71000] Updated weights for policy 0, policy_version 179924 (0.0034) [2024-06-13 06:18:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 2947956736. Throughput: 0: 48766.6. Samples: 2476762240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 06:18:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:18:12,505][71000] Updated weights for policy 0, policy_version 179934 (0.0040) [2024-06-13 06:18:15,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.7, 300 sec: 49207.5). Total num frames: 2948186112. Throughput: 0: 48820.3. Samples: 2477054260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 06:18:15,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:18:16,073][71000] Updated weights for policy 0, policy_version 179944 (0.0024) [2024-06-13 06:18:19,907][71000] Updated weights for policy 0, policy_version 179954 (0.0029) [2024-06-13 06:18:20,939][70768] Fps is (10 sec: 45875.8, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2948415488. Throughput: 0: 48667.2. Samples: 2477199760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 06:18:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:18:22,538][71000] Updated weights for policy 0, policy_version 179964 (0.0027) [2024-06-13 06:18:25,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2948677632. Throughput: 0: 48783.5. Samples: 2477496440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 06:18:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:18:26,308][71000] Updated weights for policy 0, policy_version 179974 (0.0023) [2024-06-13 06:18:29,454][71000] Updated weights for policy 0, policy_version 179984 (0.0023) [2024-06-13 06:18:30,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2948939776. Throughput: 0: 48748.5. Samples: 2477791660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 06:18:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:18:32,807][71000] Updated weights for policy 0, policy_version 179994 (0.0024) [2024-06-13 06:18:35,926][71000] Updated weights for policy 0, policy_version 180004 (0.0025) [2024-06-13 06:18:35,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2949185536. Throughput: 0: 48577.4. Samples: 2477941860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 06:18:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:18:39,681][71000] Updated weights for policy 0, policy_version 180014 (0.0027) [2024-06-13 06:18:40,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 2949398528. Throughput: 0: 48881.5. Samples: 2478241900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 06:18:40,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:18:42,401][71000] Updated weights for policy 0, policy_version 180024 (0.0020) [2024-06-13 06:18:43,209][70980] Signal inference workers to stop experience collection... (37000 times) [2024-06-13 06:18:43,210][70980] Signal inference workers to resume experience collection... (37000 times) [2024-06-13 06:18:43,219][71000] InferenceWorker_p0-w0: stopping experience collection (37000 times) [2024-06-13 06:18:43,219][71000] InferenceWorker_p0-w0: resuming experience collection (37000 times) [2024-06-13 06:18:45,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48606.0, 300 sec: 49152.0). Total num frames: 2949644288. Throughput: 0: 49020.9. Samples: 2478541660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 06:18:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 06:18:46,253][71000] Updated weights for policy 0, policy_version 180034 (0.0029) [2024-06-13 06:18:49,163][71000] Updated weights for policy 0, policy_version 180044 (0.0031) [2024-06-13 06:18:50,940][70768] Fps is (10 sec: 54067.5, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 2949939200. Throughput: 0: 49358.3. Samples: 2478682960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 06:18:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:18:52,881][71000] Updated weights for policy 0, policy_version 180054 (0.0033) [2024-06-13 06:18:55,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48606.1, 300 sec: 49207.6). Total num frames: 2950152192. 
Throughput: 0: 49362.4. Samples: 2478983540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 06:18:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:18:55,950][71000] Updated weights for policy 0, policy_version 180064 (0.0028) [2024-06-13 06:18:59,966][71000] Updated weights for policy 0, policy_version 180074 (0.0030) [2024-06-13 06:19:00,940][70768] Fps is (10 sec: 44236.4, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2950381568. Throughput: 0: 49085.9. Samples: 2479263120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 06:19:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:19:02,855][71000] Updated weights for policy 0, policy_version 180084 (0.0034) [2024-06-13 06:19:05,940][70768] Fps is (10 sec: 47511.8, 60 sec: 48878.7, 300 sec: 49151.9). Total num frames: 2950627328. Throughput: 0: 49032.0. Samples: 2479406220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 06:19:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 06:19:06,488][71000] Updated weights for policy 0, policy_version 180094 (0.0030) [2024-06-13 06:19:09,406][71000] Updated weights for policy 0, policy_version 180104 (0.0024) [2024-06-13 06:19:10,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2950905856. Throughput: 0: 49128.5. Samples: 2479707220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 06:19:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:19:13,088][71000] Updated weights for policy 0, policy_version 180114 (0.0021) [2024-06-13 06:19:15,940][70768] Fps is (10 sec: 49153.5, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2951118848. Throughput: 0: 49041.4. Samples: 2479998520. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 06:19:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:19:16,374][71000] Updated weights for policy 0, policy_version 180124 (0.0033) [2024-06-13 06:19:19,728][71000] Updated weights for policy 0, policy_version 180134 (0.0035) [2024-06-13 06:19:20,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48878.8, 300 sec: 49096.5). Total num frames: 2951348224. Throughput: 0: 48881.7. Samples: 2480141540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 06:19:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:19:23,023][71000] Updated weights for policy 0, policy_version 180144 (0.0024) [2024-06-13 06:19:25,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48605.8, 300 sec: 49096.5). Total num frames: 2951593984. Throughput: 0: 48467.0. Samples: 2480422920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 06:19:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:19:26,811][71000] Updated weights for policy 0, policy_version 180154 (0.0031) [2024-06-13 06:19:29,754][71000] Updated weights for policy 0, policy_version 180164 (0.0035) [2024-06-13 06:19:30,940][70768] Fps is (10 sec: 52429.0, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2951872512. Throughput: 0: 48554.2. Samples: 2480726600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 06:19:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:19:33,198][71000] Updated weights for policy 0, policy_version 180174 (0.0025) [2024-06-13 06:19:35,940][70768] Fps is (10 sec: 52429.2, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2952118272. Throughput: 0: 48820.4. Samples: 2480879880. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 06:19:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:19:36,275][71000] Updated weights for policy 0, policy_version 180184 (0.0028) [2024-06-13 06:19:40,082][71000] Updated weights for policy 0, policy_version 180194 (0.0030) [2024-06-13 06:19:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2952347648. Throughput: 0: 48855.1. Samples: 2481182020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 06:19:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:19:40,996][70980] Signal inference workers to stop experience collection... (37050 times) [2024-06-13 06:19:41,043][70980] Signal inference workers to resume experience collection... (37050 times) [2024-06-13 06:19:41,044][71000] InferenceWorker_p0-w0: stopping experience collection (37050 times) [2024-06-13 06:19:41,044][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000180198_2952364032.pth... [2024-06-13 06:19:41,053][71000] InferenceWorker_p0-w0: resuming experience collection (37050 times) [2024-06-13 06:19:41,093][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000179480_2940600320.pth [2024-06-13 06:19:43,275][71000] Updated weights for policy 0, policy_version 180204 (0.0026) [2024-06-13 06:19:45,940][70768] Fps is (10 sec: 44236.2, 60 sec: 48605.7, 300 sec: 49040.9). Total num frames: 2952560640. Throughput: 0: 48715.0. Samples: 2481455300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 06:19:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:19:47,063][71000] Updated weights for policy 0, policy_version 180214 (0.0029) [2024-06-13 06:19:49,564][71000] Updated weights for policy 0, policy_version 180224 (0.0039) [2024-06-13 06:19:50,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48605.9, 300 sec: 49207.5). Total num frames: 2952855552. Throughput: 0: 48877.2. Samples: 2481605680. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 06:19:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:19:53,485][71000] Updated weights for policy 0, policy_version 180234 (0.0031) [2024-06-13 06:19:55,940][70768] Fps is (10 sec: 52430.0, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2953084928. Throughput: 0: 48791.1. Samples: 2481902820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 06:19:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:19:56,617][71000] Updated weights for policy 0, policy_version 180244 (0.0030) [2024-06-13 06:20:00,315][71000] Updated weights for policy 0, policy_version 180254 (0.0032) [2024-06-13 06:20:00,940][70768] Fps is (10 sec: 47510.6, 60 sec: 49151.5, 300 sec: 49096.4). Total num frames: 2953330688. Throughput: 0: 48964.2. Samples: 2482201940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 06:20:00,941][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:20:03,088][71000] Updated weights for policy 0, policy_version 180264 (0.0027) [2024-06-13 06:20:05,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48606.1, 300 sec: 48985.4). Total num frames: 2953543680. Throughput: 0: 48845.9. Samples: 2482339600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 06:20:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:20:06,755][71000] Updated weights for policy 0, policy_version 180274 (0.0029) [2024-06-13 06:20:09,449][71000] Updated weights for policy 0, policy_version 180284 (0.0022) [2024-06-13 06:20:10,940][70768] Fps is (10 sec: 49155.2, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2953822208. Throughput: 0: 49213.5. Samples: 2482637520. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 06:20:10,949][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:20:13,406][71000] Updated weights for policy 0, policy_version 180294 (0.0025) [2024-06-13 06:20:15,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2954084352. Throughput: 0: 49220.0. Samples: 2482941500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 06:20:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:20:16,073][71000] Updated weights for policy 0, policy_version 180304 (0.0028) [2024-06-13 06:20:20,010][71000] Updated weights for policy 0, policy_version 180314 (0.0029) [2024-06-13 06:20:20,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.2, 300 sec: 49040.9). Total num frames: 2954313728. Throughput: 0: 49056.6. Samples: 2483087420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:20:20,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 06:20:22,918][71000] Updated weights for policy 0, policy_version 180324 (0.0033) [2024-06-13 06:20:25,940][70768] Fps is (10 sec: 44236.4, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 2954526720. Throughput: 0: 48862.6. Samples: 2483380840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:20:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:20:26,915][71000] Updated weights for policy 0, policy_version 180334 (0.0029) [2024-06-13 06:20:29,539][71000] Updated weights for policy 0, policy_version 180344 (0.0024) [2024-06-13 06:20:30,944][70768] Fps is (10 sec: 49130.3, 60 sec: 48875.4, 300 sec: 49095.8). Total num frames: 2954805248. Throughput: 0: 49395.4. Samples: 2483678300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:20:30,944][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:20:33,131][70980] Signal inference workers to stop experience collection... 
(37100 times) [2024-06-13 06:20:33,164][71000] InferenceWorker_p0-w0: stopping experience collection (37100 times) [2024-06-13 06:20:33,241][70980] Signal inference workers to resume experience collection... (37100 times) [2024-06-13 06:20:33,241][71000] InferenceWorker_p0-w0: resuming experience collection (37100 times) [2024-06-13 06:20:33,365][71000] Updated weights for policy 0, policy_version 180354 (0.0033) [2024-06-13 06:20:35,939][70768] Fps is (10 sec: 54068.3, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2955067392. Throughput: 0: 49649.0. Samples: 2483839880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:20:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:20:35,980][71000] Updated weights for policy 0, policy_version 180364 (0.0022) [2024-06-13 06:20:39,923][71000] Updated weights for policy 0, policy_version 180374 (0.0034) [2024-06-13 06:20:40,940][70768] Fps is (10 sec: 50811.7, 60 sec: 49424.9, 300 sec: 49096.4). Total num frames: 2955313152. Throughput: 0: 49591.8. Samples: 2484134460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:20:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:20:42,757][71000] Updated weights for policy 0, policy_version 180384 (0.0024) [2024-06-13 06:20:45,940][70768] Fps is (10 sec: 44236.0, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 2955509760. Throughput: 0: 49460.2. Samples: 2484427620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:20:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:20:46,622][71000] Updated weights for policy 0, policy_version 180394 (0.0036) [2024-06-13 06:20:49,258][71000] Updated weights for policy 0, policy_version 180404 (0.0028) [2024-06-13 06:20:50,939][70768] Fps is (10 sec: 47514.6, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 2955788288. Throughput: 0: 49462.3. Samples: 2484565400. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:20:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:20:53,179][71000] Updated weights for policy 0, policy_version 180414 (0.0035) [2024-06-13 06:20:55,939][70768] Fps is (10 sec: 54067.9, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2956050432. Throughput: 0: 49348.1. Samples: 2484858180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:20:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:20:55,945][71000] Updated weights for policy 0, policy_version 180424 (0.0028) [2024-06-13 06:20:59,819][71000] Updated weights for policy 0, policy_version 180434 (0.0026) [2024-06-13 06:21:00,939][70768] Fps is (10 sec: 52428.6, 60 sec: 49698.7, 300 sec: 49096.5). Total num frames: 2956312576. Throughput: 0: 49268.5. Samples: 2485158580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:21:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:21:02,719][71000] Updated weights for policy 0, policy_version 180444 (0.0034) [2024-06-13 06:21:05,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 2956509184. Throughput: 0: 49285.2. Samples: 2485305260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:21:05,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 06:21:06,645][71000] Updated weights for policy 0, policy_version 180454 (0.0034) [2024-06-13 06:21:09,254][71000] Updated weights for policy 0, policy_version 180464 (0.0029) [2024-06-13 06:21:10,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2956754944. Throughput: 0: 49135.6. Samples: 2485591940. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:21:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 06:21:13,227][71000] Updated weights for policy 0, policy_version 180474 (0.0029) [2024-06-13 06:21:15,903][71000] Updated weights for policy 0, policy_version 180484 (0.0032) [2024-06-13 06:21:15,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49425.0, 300 sec: 49096.4). Total num frames: 2957049856. Throughput: 0: 49125.1. Samples: 2485888720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:21:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 06:21:19,822][71000] Updated weights for policy 0, policy_version 180494 (0.0028) [2024-06-13 06:21:20,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 2957279232. Throughput: 0: 49107.4. Samples: 2486049720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:21:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:21:22,787][71000] Updated weights for policy 0, policy_version 180504 (0.0028) [2024-06-13 06:21:25,940][70768] Fps is (10 sec: 44237.0, 60 sec: 49425.1, 300 sec: 48929.9). Total num frames: 2957492224. Throughput: 0: 49042.8. Samples: 2486341380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-13 06:21:25,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 06:21:26,346][71000] Updated weights for policy 0, policy_version 180514 (0.0025) [2024-06-13 06:21:29,161][71000] Updated weights for policy 0, policy_version 180524 (0.0024) [2024-06-13 06:21:30,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48882.5, 300 sec: 48929.8). Total num frames: 2957737984. Throughput: 0: 49060.5. Samples: 2486635340. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-13 06:21:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 06:21:33,003][71000] Updated weights for policy 0, policy_version 180534 (0.0029) [2024-06-13 06:21:35,915][71000] Updated weights for policy 0, policy_version 180544 (0.0032) [2024-06-13 06:21:35,940][70768] Fps is (10 sec: 54067.3, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2958032896. Throughput: 0: 49288.8. Samples: 2486783400. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-13 06:21:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 06:21:38,214][70980] Signal inference workers to stop experience collection... (37150 times) [2024-06-13 06:21:38,214][70980] Signal inference workers to resume experience collection... (37150 times) [2024-06-13 06:21:38,234][71000] InferenceWorker_p0-w0: stopping experience collection (37150 times) [2024-06-13 06:21:38,234][71000] InferenceWorker_p0-w0: resuming experience collection (37150 times) [2024-06-13 06:21:39,610][71000] Updated weights for policy 0, policy_version 180554 (0.0031) [2024-06-13 06:21:40,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 2958278656. Throughput: 0: 49475.5. Samples: 2487084580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-13 06:21:40,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 06:21:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000180559_2958278656.pth... [2024-06-13 06:21:40,997][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000179839_2946482176.pth [2024-06-13 06:21:42,711][71000] Updated weights for policy 0, policy_version 180564 (0.0037) [2024-06-13 06:21:45,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49698.2, 300 sec: 48985.4). Total num frames: 2958491648. Throughput: 0: 49260.0. Samples: 2487375280. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-13 06:21:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 06:21:46,285][71000] Updated weights for policy 0, policy_version 180574 (0.0036) [2024-06-13 06:21:49,075][71000] Updated weights for policy 0, policy_version 180584 (0.0021) [2024-06-13 06:21:50,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49151.8, 300 sec: 49040.9). Total num frames: 2958737408. Throughput: 0: 49023.0. Samples: 2487511300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-13 06:21:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:21:53,066][71000] Updated weights for policy 0, policy_version 180594 (0.0032) [2024-06-13 06:21:55,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2958983168. Throughput: 0: 49108.1. Samples: 2487801800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-13 06:21:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:21:56,427][71000] Updated weights for policy 0, policy_version 180604 (0.0032) [2024-06-13 06:21:59,779][71000] Updated weights for policy 0, policy_version 180614 (0.0024) [2024-06-13 06:22:00,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 2959245312. Throughput: 0: 49044.0. Samples: 2488095700. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-13 06:22:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 06:22:02,931][71000] Updated weights for policy 0, policy_version 180624 (0.0030) [2024-06-13 06:22:05,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2959458304. Throughput: 0: 48804.4. Samples: 2488245920. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-13 06:22:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:22:06,273][71000] Updated weights for policy 0, policy_version 180634 (0.0032) [2024-06-13 06:22:09,425][71000] Updated weights for policy 0, policy_version 180644 (0.0027) [2024-06-13 06:22:10,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2959704064. Throughput: 0: 48696.4. Samples: 2488532720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-13 06:22:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:22:13,062][71000] Updated weights for policy 0, policy_version 180654 (0.0024) [2024-06-13 06:22:15,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2959982592. Throughput: 0: 48770.7. Samples: 2488830020. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-13 06:22:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:22:16,317][71000] Updated weights for policy 0, policy_version 180664 (0.0022) [2024-06-13 06:22:19,810][71000] Updated weights for policy 0, policy_version 180674 (0.0034) [2024-06-13 06:22:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2960195584. Throughput: 0: 48924.0. Samples: 2488984980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-13 06:22:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:22:23,196][71000] Updated weights for policy 0, policy_version 180684 (0.0032) [2024-06-13 06:22:25,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2960441344. Throughput: 0: 48599.6. Samples: 2489271560. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 19.0) [2024-06-13 06:22:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:22:26,244][71000] Updated weights for policy 0, policy_version 180694 (0.0034) [2024-06-13 06:22:29,631][71000] Updated weights for policy 0, policy_version 180704 (0.0032) [2024-06-13 06:22:30,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 2960687104. Throughput: 0: 48697.8. Samples: 2489566680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 06:22:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:22:32,912][71000] Updated weights for policy 0, policy_version 180714 (0.0028) [2024-06-13 06:22:33,589][70980] Signal inference workers to stop experience collection... (37200 times) [2024-06-13 06:22:33,590][70980] Signal inference workers to resume experience collection... (37200 times) [2024-06-13 06:22:33,629][71000] InferenceWorker_p0-w0: stopping experience collection (37200 times) [2024-06-13 06:22:33,629][71000] InferenceWorker_p0-w0: resuming experience collection (37200 times) [2024-06-13 06:22:35,929][71000] Updated weights for policy 0, policy_version 180724 (0.0033) [2024-06-13 06:22:35,940][70768] Fps is (10 sec: 54066.1, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2960982016. Throughput: 0: 49049.7. Samples: 2489718540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 06:22:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 06:22:39,786][71000] Updated weights for policy 0, policy_version 180734 (0.0032) [2024-06-13 06:22:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2961195008. Throughput: 0: 49224.4. Samples: 2490016900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 06:22:40,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 06:22:43,142][71000] Updated weights for policy 0, policy_version 180744 (0.0029) [2024-06-13 06:22:45,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 2961440768. Throughput: 0: 49104.5. Samples: 2490305400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 06:22:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:22:46,403][71000] Updated weights for policy 0, policy_version 180754 (0.0037) [2024-06-13 06:22:49,824][71000] Updated weights for policy 0, policy_version 180764 (0.0036) [2024-06-13 06:22:50,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 2961653760. Throughput: 0: 48847.6. Samples: 2490444060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 06:22:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:22:53,246][71000] Updated weights for policy 0, policy_version 180774 (0.0030) [2024-06-13 06:22:55,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 2961932288. Throughput: 0: 48958.3. Samples: 2490735840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 06:22:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:22:56,180][71000] Updated weights for policy 0, policy_version 180784 (0.0032) [2024-06-13 06:22:59,922][71000] Updated weights for policy 0, policy_version 180794 (0.0028) [2024-06-13 06:23:00,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 2962161664. Throughput: 0: 48947.4. Samples: 2491032660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 06:23:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:23:03,145][71000] Updated weights for policy 0, policy_version 180804 (0.0027) [2024-06-13 06:23:05,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2962407424. Throughput: 0: 48741.4. Samples: 2491178340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 06:23:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:23:06,766][71000] Updated weights for policy 0, policy_version 180814 (0.0038) [2024-06-13 06:23:10,056][71000] Updated weights for policy 0, policy_version 180824 (0.0032) [2024-06-13 06:23:10,939][70768] Fps is (10 sec: 47514.4, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2962636800. Throughput: 0: 48876.5. Samples: 2491471000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 06:23:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:23:13,392][71000] Updated weights for policy 0, policy_version 180834 (0.0024) [2024-06-13 06:23:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 2962898944. Throughput: 0: 48856.0. Samples: 2491765200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 06:23:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:23:16,680][71000] Updated weights for policy 0, policy_version 180844 (0.0023) [2024-06-13 06:23:20,281][71000] Updated weights for policy 0, policy_version 180854 (0.0033) [2024-06-13 06:23:20,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2963144704. Throughput: 0: 48733.0. Samples: 2491911520. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 06:23:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:23:23,433][71000] Updated weights for policy 0, policy_version 180864 (0.0026) [2024-06-13 06:23:25,942][70768] Fps is (10 sec: 47499.9, 60 sec: 48876.6, 300 sec: 48929.4). Total num frames: 2963374080. Throughput: 0: 48709.3. Samples: 2492208960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 06:23:25,943][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:23:26,948][71000] Updated weights for policy 0, policy_version 180874 (0.0027) [2024-06-13 06:23:30,314][71000] Updated weights for policy 0, policy_version 180884 (0.0023) [2024-06-13 06:23:30,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 2963619840. Throughput: 0: 48679.2. Samples: 2492495960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 06:23:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:23:33,783][71000] Updated weights for policy 0, policy_version 180894 (0.0027) [2024-06-13 06:23:35,940][70768] Fps is (10 sec: 49166.4, 60 sec: 48059.9, 300 sec: 49040.9). Total num frames: 2963865600. Throughput: 0: 49088.5. Samples: 2492653040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 06:23:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:23:36,969][71000] Updated weights for policy 0, policy_version 180904 (0.0023) [2024-06-13 06:23:40,240][71000] Updated weights for policy 0, policy_version 180914 (0.0026) [2024-06-13 06:23:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2964111360. Throughput: 0: 48914.6. Samples: 2492937000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 06:23:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:23:41,094][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000180916_2964127744.pth... 
[2024-06-13 06:23:41,137][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000180198_2952364032.pth [2024-06-13 06:23:43,577][71000] Updated weights for policy 0, policy_version 180924 (0.0027) [2024-06-13 06:23:45,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 2964357120. Throughput: 0: 48868.0. Samples: 2493231720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 06:23:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:23:47,066][71000] Updated weights for policy 0, policy_version 180934 (0.0030) [2024-06-13 06:23:50,252][71000] Updated weights for policy 0, policy_version 180944 (0.0022) [2024-06-13 06:23:50,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.9, 300 sec: 48985.3). Total num frames: 2964602880. Throughput: 0: 48782.9. Samples: 2493373580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 06:23:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:23:53,713][71000] Updated weights for policy 0, policy_version 180954 (0.0027) [2024-06-13 06:23:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.7, 300 sec: 48985.4). Total num frames: 2964832256. Throughput: 0: 48860.7. Samples: 2493669740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 06:23:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:23:57,248][71000] Updated weights for policy 0, policy_version 180964 (0.0026) [2024-06-13 06:23:57,834][70980] Signal inference workers to stop experience collection... (37250 times) [2024-06-13 06:23:57,895][71000] InferenceWorker_p0-w0: stopping experience collection (37250 times) [2024-06-13 06:23:57,948][70980] Signal inference workers to resume experience collection... 
(37250 times) [2024-06-13 06:23:57,949][71000] InferenceWorker_p0-w0: resuming experience collection (37250 times) [2024-06-13 06:24:00,607][71000] Updated weights for policy 0, policy_version 180974 (0.0024) [2024-06-13 06:24:00,940][70768] Fps is (10 sec: 49152.9, 60 sec: 48879.0, 300 sec: 49041.0). Total num frames: 2965094400. Throughput: 0: 48976.0. Samples: 2493969120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 06:24:00,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 06:24:03,561][71000] Updated weights for policy 0, policy_version 180984 (0.0032) [2024-06-13 06:24:05,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 2965340160. Throughput: 0: 49068.5. Samples: 2494119600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 06:24:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:24:07,143][71000] Updated weights for policy 0, policy_version 180994 (0.0028) [2024-06-13 06:24:10,671][71000] Updated weights for policy 0, policy_version 181004 (0.0032) [2024-06-13 06:24:10,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2965585920. Throughput: 0: 49051.6. Samples: 2494416140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 06:24:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:24:13,761][71000] Updated weights for policy 0, policy_version 181014 (0.0033) [2024-06-13 06:24:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2965831680. Throughput: 0: 48993.7. Samples: 2494700680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 06:24:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 06:24:17,274][71000] Updated weights for policy 0, policy_version 181024 (0.0036) [2024-06-13 06:24:20,375][71000] Updated weights for policy 0, policy_version 181034 (0.0028) [2024-06-13 06:24:20,939][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2966077440. Throughput: 0: 48860.0. Samples: 2494851740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 06:24:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:24:23,771][71000] Updated weights for policy 0, policy_version 181044 (0.0030) [2024-06-13 06:24:25,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49154.4, 300 sec: 48985.4). Total num frames: 2966323200. Throughput: 0: 49092.1. Samples: 2495146140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 06:24:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 06:24:27,161][71000] Updated weights for policy 0, policy_version 181054 (0.0030) [2024-06-13 06:24:30,767][71000] Updated weights for policy 0, policy_version 181064 (0.0033) [2024-06-13 06:24:30,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 2966552576. Throughput: 0: 49124.8. Samples: 2495442340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 06:24:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:24:33,615][71000] Updated weights for policy 0, policy_version 181074 (0.0038) [2024-06-13 06:24:35,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2966798336. Throughput: 0: 49064.2. Samples: 2495581460. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 06:24:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:24:37,202][71000] Updated weights for policy 0, policy_version 181084 (0.0037) [2024-06-13 06:24:40,370][71000] Updated weights for policy 0, policy_version 181094 (0.0029) [2024-06-13 06:24:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2967060480. Throughput: 0: 49080.9. Samples: 2495878380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:24:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:24:43,963][71000] Updated weights for policy 0, policy_version 181104 (0.0023) [2024-06-13 06:24:45,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2967306240. Throughput: 0: 48890.2. Samples: 2496169180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:24:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:24:46,948][71000] Updated weights for policy 0, policy_version 181114 (0.0032) [2024-06-13 06:24:50,937][71000] Updated weights for policy 0, policy_version 181124 (0.0038) [2024-06-13 06:24:50,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 2967535616. Throughput: 0: 48826.1. Samples: 2496316780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:24:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:24:53,503][71000] Updated weights for policy 0, policy_version 181134 (0.0023) [2024-06-13 06:24:55,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.1, 300 sec: 48985.5). Total num frames: 2967781376. Throughput: 0: 48711.1. Samples: 2496608140. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:24:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:24:57,508][71000] Updated weights for policy 0, policy_version 181144 (0.0030) [2024-06-13 06:25:00,146][71000] Updated weights for policy 0, policy_version 181154 (0.0025) [2024-06-13 06:25:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2968043520. Throughput: 0: 48857.3. Samples: 2496899260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:25:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:25:04,182][71000] Updated weights for policy 0, policy_version 181164 (0.0029) [2024-06-13 06:25:05,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2968305664. Throughput: 0: 49065.8. Samples: 2497059700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:25:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:25:06,819][71000] Updated weights for policy 0, policy_version 181174 (0.0039) [2024-06-13 06:25:10,177][70980] Signal inference workers to stop experience collection... (37300 times) [2024-06-13 06:25:10,178][70980] Signal inference workers to resume experience collection... (37300 times) [2024-06-13 06:25:10,195][71000] InferenceWorker_p0-w0: stopping experience collection (37300 times) [2024-06-13 06:25:10,225][71000] InferenceWorker_p0-w0: resuming experience collection (37300 times) [2024-06-13 06:25:10,871][71000] Updated weights for policy 0, policy_version 181184 (0.0029) [2024-06-13 06:25:10,939][70768] Fps is (10 sec: 47514.3, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 2968518656. Throughput: 0: 49116.4. Samples: 2497356380. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:25:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:25:13,423][71000] Updated weights for policy 0, policy_version 181194 (0.0031) [2024-06-13 06:25:15,939][70768] Fps is (10 sec: 44237.1, 60 sec: 48606.0, 300 sec: 48929.8). Total num frames: 2968748032. Throughput: 0: 49056.2. Samples: 2497649860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:25:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:25:17,463][71000] Updated weights for policy 0, policy_version 181204 (0.0029) [2024-06-13 06:25:19,759][71000] Updated weights for policy 0, policy_version 181214 (0.0028) [2024-06-13 06:25:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 2969026560. Throughput: 0: 49201.3. Samples: 2497795520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:25:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:25:24,114][71000] Updated weights for policy 0, policy_version 181224 (0.0033) [2024-06-13 06:25:25,940][70768] Fps is (10 sec: 55704.4, 60 sec: 49698.0, 300 sec: 49152.7). Total num frames: 2969305088. Throughput: 0: 49279.6. Samples: 2498095960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:25:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:25:26,747][71000] Updated weights for policy 0, policy_version 181234 (0.0026) [2024-06-13 06:25:30,799][71000] Updated weights for policy 0, policy_version 181244 (0.0032) [2024-06-13 06:25:30,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 2969501696. Throughput: 0: 49345.7. Samples: 2498389740. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:25:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:25:33,170][71000] Updated weights for policy 0, policy_version 181254 (0.0026) [2024-06-13 06:25:35,939][70768] Fps is (10 sec: 42599.2, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 2969731072. Throughput: 0: 49009.1. Samples: 2498522180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:25:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:25:37,225][71000] Updated weights for policy 0, policy_version 181264 (0.0028) [2024-06-13 06:25:39,881][71000] Updated weights for policy 0, policy_version 181274 (0.0026) [2024-06-13 06:25:40,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2970009600. Throughput: 0: 49305.6. Samples: 2498826900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:25:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 06:25:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000181275_2970009600.pth... [2024-06-13 06:25:41,004][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000180559_2958278656.pth [2024-06-13 06:25:44,027][71000] Updated weights for policy 0, policy_version 181284 (0.0033) [2024-06-13 06:25:45,939][70768] Fps is (10 sec: 54067.2, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 2970271744. Throughput: 0: 49529.5. Samples: 2499128080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:25:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:25:46,599][71000] Updated weights for policy 0, policy_version 181294 (0.0020) [2024-06-13 06:25:50,485][71000] Updated weights for policy 0, policy_version 181304 (0.0029) [2024-06-13 06:25:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 2970501120. Throughput: 0: 49364.3. Samples: 2499281100. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-13 06:25:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:25:53,189][71000] Updated weights for policy 0, policy_version 181314 (0.0035) [2024-06-13 06:25:55,939][70768] Fps is (10 sec: 44237.0, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 2970714112. Throughput: 0: 49180.9. Samples: 2499569520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-13 06:25:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:25:57,299][71000] Updated weights for policy 0, policy_version 181324 (0.0023) [2024-06-13 06:25:59,675][71000] Updated weights for policy 0, policy_version 181334 (0.0023) [2024-06-13 06:25:59,684][70980] Signal inference workers to stop experience collection... (37350 times) [2024-06-13 06:25:59,685][70980] Signal inference workers to resume experience collection... (37350 times) [2024-06-13 06:25:59,726][71000] InferenceWorker_p0-w0: stopping experience collection (37350 times) [2024-06-13 06:25:59,726][71000] InferenceWorker_p0-w0: resuming experience collection (37350 times) [2024-06-13 06:26:00,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 2970992640. Throughput: 0: 49028.8. Samples: 2499856160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-13 06:26:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:26:03,979][71000] Updated weights for policy 0, policy_version 181344 (0.0033) [2024-06-13 06:26:05,940][70768] Fps is (10 sec: 55704.9, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2971271168. Throughput: 0: 49539.6. Samples: 2500024800. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-13 06:26:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:26:06,680][71000] Updated weights for policy 0, policy_version 181354 (0.0023) [2024-06-13 06:26:10,508][71000] Updated weights for policy 0, policy_version 181364 (0.0027) [2024-06-13 06:26:10,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 48929.9). Total num frames: 2971484160. Throughput: 0: 49438.0. Samples: 2500320660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-13 06:26:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:26:13,317][71000] Updated weights for policy 0, policy_version 181374 (0.0027) [2024-06-13 06:26:15,939][70768] Fps is (10 sec: 44237.1, 60 sec: 49425.0, 300 sec: 48929.9). Total num frames: 2971713536. Throughput: 0: 49497.0. Samples: 2500617100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-13 06:26:15,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 06:26:17,019][71000] Updated weights for policy 0, policy_version 181384 (0.0028) [2024-06-13 06:26:19,883][71000] Updated weights for policy 0, policy_version 181394 (0.0029) [2024-06-13 06:26:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2971975680. Throughput: 0: 49601.7. Samples: 2500754260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-13 06:26:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:26:23,695][71000] Updated weights for policy 0, policy_version 181404 (0.0033) [2024-06-13 06:26:25,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2972254208. Throughput: 0: 49498.4. Samples: 2501054320. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-13 06:26:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:26:26,354][71000] Updated weights for policy 0, policy_version 181414 (0.0021) [2024-06-13 06:26:30,329][71000] Updated weights for policy 0, policy_version 181424 (0.0033) [2024-06-13 06:26:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 48985.4). Total num frames: 2972483584. Throughput: 0: 49576.3. Samples: 2501359020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-13 06:26:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:26:33,045][71000] Updated weights for policy 0, policy_version 181434 (0.0022) [2024-06-13 06:26:35,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49698.1, 300 sec: 48929.8). Total num frames: 2972712960. Throughput: 0: 49264.5. Samples: 2501498000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-13 06:26:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:26:36,718][71000] Updated weights for policy 0, policy_version 181444 (0.0027) [2024-06-13 06:26:39,640][71000] Updated weights for policy 0, policy_version 181454 (0.0034) [2024-06-13 06:26:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2972958720. Throughput: 0: 49353.5. Samples: 2501790440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-13 06:26:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:26:43,405][71000] Updated weights for policy 0, policy_version 181464 (0.0024) [2024-06-13 06:26:45,939][70768] Fps is (10 sec: 54067.6, 60 sec: 49698.1, 300 sec: 49207.6). Total num frames: 2973253632. Throughput: 0: 49632.5. Samples: 2502089620. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-13 06:26:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:26:46,022][71000] Updated weights for policy 0, policy_version 181474 (0.0029) [2024-06-13 06:26:50,142][71000] Updated weights for policy 0, policy_version 181484 (0.0025) [2024-06-13 06:26:50,939][70768] Fps is (10 sec: 52429.8, 60 sec: 49698.3, 300 sec: 49152.0). Total num frames: 2973483008. Throughput: 0: 49325.4. Samples: 2502244440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-13 06:26:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:26:53,184][71000] Updated weights for policy 0, policy_version 181494 (0.0024) [2024-06-13 06:26:55,939][70768] Fps is (10 sec: 42598.6, 60 sec: 49425.0, 300 sec: 48929.9). Total num frames: 2973679616. Throughput: 0: 49065.8. Samples: 2502528620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 06:26:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:26:56,715][71000] Updated weights for policy 0, policy_version 181504 (0.0032) [2024-06-13 06:26:59,685][71000] Updated weights for policy 0, policy_version 181514 (0.0037) [2024-06-13 06:27:00,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 2973941760. Throughput: 0: 49046.2. Samples: 2502824180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 06:27:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:27:03,384][71000] Updated weights for policy 0, policy_version 181524 (0.0027) [2024-06-13 06:27:05,939][70768] Fps is (10 sec: 54067.1, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 2974220288. Throughput: 0: 49441.8. Samples: 2502979140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 06:27:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:27:06,409][71000] Updated weights for policy 0, policy_version 181534 (0.0027) [2024-06-13 06:27:09,957][70980] Signal inference workers to stop experience collection... (37400 times) [2024-06-13 06:27:10,008][71000] InferenceWorker_p0-w0: stopping experience collection (37400 times) [2024-06-13 06:27:10,010][70980] Signal inference workers to resume experience collection... (37400 times) [2024-06-13 06:27:10,017][71000] InferenceWorker_p0-w0: resuming experience collection (37400 times) [2024-06-13 06:27:10,020][71000] Updated weights for policy 0, policy_version 181544 (0.0027) [2024-06-13 06:27:10,940][70768] Fps is (10 sec: 50788.7, 60 sec: 49424.8, 300 sec: 49040.9). Total num frames: 2974449664. Throughput: 0: 49444.9. Samples: 2503279360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 06:27:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:27:12,971][71000] Updated weights for policy 0, policy_version 181554 (0.0025) [2024-06-13 06:27:15,939][70768] Fps is (10 sec: 45875.1, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2974679040. Throughput: 0: 49307.2. Samples: 2503577840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 06:27:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:27:16,718][71000] Updated weights for policy 0, policy_version 181564 (0.0034) [2024-06-13 06:27:19,403][71000] Updated weights for policy 0, policy_version 181574 (0.0019) [2024-06-13 06:27:20,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 2974941184. Throughput: 0: 49309.2. Samples: 2503716920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 06:27:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:27:23,404][71000] Updated weights for policy 0, policy_version 181584 (0.0023) [2024-06-13 06:27:25,939][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 2975203328. Throughput: 0: 49259.7. Samples: 2504007120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 06:27:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:27:26,439][71000] Updated weights for policy 0, policy_version 181594 (0.0029) [2024-06-13 06:27:29,918][71000] Updated weights for policy 0, policy_version 181604 (0.0028) [2024-06-13 06:27:30,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.1, 300 sec: 49041.0). Total num frames: 2975449088. Throughput: 0: 49381.3. Samples: 2504311780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 06:27:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:27:32,877][71000] Updated weights for policy 0, policy_version 181614 (0.0029) [2024-06-13 06:27:35,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 2975678464. Throughput: 0: 49288.9. Samples: 2504462440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 06:27:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:27:36,574][71000] Updated weights for policy 0, policy_version 181624 (0.0027) [2024-06-13 06:27:39,547][71000] Updated weights for policy 0, policy_version 181634 (0.0033) [2024-06-13 06:27:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 2975940608. Throughput: 0: 49540.3. Samples: 2504757940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 06:27:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 06:27:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000181637_2975940608.pth... 
[2024-06-13 06:27:40,993][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000180916_2964127744.pth [2024-06-13 06:27:43,269][71000] Updated weights for policy 0, policy_version 181644 (0.0030) [2024-06-13 06:27:45,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48605.8, 300 sec: 49207.5). Total num frames: 2976169984. Throughput: 0: 49444.8. Samples: 2505049200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 06:27:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:27:46,455][71000] Updated weights for policy 0, policy_version 181654 (0.0031) [2024-06-13 06:27:49,695][71000] Updated weights for policy 0, policy_version 181664 (0.0021) [2024-06-13 06:27:50,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 2976448512. Throughput: 0: 49542.6. Samples: 2505208560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 06:27:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:27:53,116][71000] Updated weights for policy 0, policy_version 181674 (0.0024) [2024-06-13 06:27:55,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 2976661504. Throughput: 0: 49414.7. Samples: 2505503000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 06:27:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:27:56,246][71000] Updated weights for policy 0, policy_version 181684 (0.0031) [2024-06-13 06:27:59,738][71000] Updated weights for policy 0, policy_version 181694 (0.0026) [2024-06-13 06:28:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 2976923648. Throughput: 0: 49133.3. Samples: 2505788840. 
Policy #0 lag: (min: 1.0, avg: 12.7, max: 23.0) [2024-06-13 06:28:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:28:03,107][71000] Updated weights for policy 0, policy_version 181704 (0.0033) [2024-06-13 06:28:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2977153024. Throughput: 0: 49354.0. Samples: 2505937840. Policy #0 lag: (min: 1.0, avg: 12.7, max: 23.0) [2024-06-13 06:28:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:28:06,405][71000] Updated weights for policy 0, policy_version 181714 (0.0039) [2024-06-13 06:28:09,758][71000] Updated weights for policy 0, policy_version 181724 (0.0029) [2024-06-13 06:28:10,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49425.4, 300 sec: 49207.5). Total num frames: 2977415168. Throughput: 0: 49451.1. Samples: 2506232420. Policy #0 lag: (min: 1.0, avg: 12.7, max: 23.0) [2024-06-13 06:28:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:28:13,133][70980] Signal inference workers to stop experience collection... (37450 times) [2024-06-13 06:28:13,133][70980] Signal inference workers to resume experience collection... (37450 times) [2024-06-13 06:28:13,160][71000] InferenceWorker_p0-w0: stopping experience collection (37450 times) [2024-06-13 06:28:13,160][71000] InferenceWorker_p0-w0: resuming experience collection (37450 times) [2024-06-13 06:28:13,265][71000] Updated weights for policy 0, policy_version 181734 (0.0025) [2024-06-13 06:28:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 2977644544. Throughput: 0: 49246.6. Samples: 2506527880. 
Policy #0 lag: (min: 1.0, avg: 12.7, max: 23.0) [2024-06-13 06:28:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:28:16,317][71000] Updated weights for policy 0, policy_version 181744 (0.0027) [2024-06-13 06:28:19,988][71000] Updated weights for policy 0, policy_version 181754 (0.0035) [2024-06-13 06:28:20,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.3, 300 sec: 49263.6). Total num frames: 2977906688. Throughput: 0: 49006.2. Samples: 2506667720. Policy #0 lag: (min: 1.0, avg: 12.7, max: 23.0) [2024-06-13 06:28:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:28:23,163][71000] Updated weights for policy 0, policy_version 181764 (0.0031) [2024-06-13 06:28:25,939][70768] Fps is (10 sec: 47514.5, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 2978119680. Throughput: 0: 48959.7. Samples: 2506961120. Policy #0 lag: (min: 1.0, avg: 12.7, max: 23.0) [2024-06-13 06:28:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 06:28:26,625][71000] Updated weights for policy 0, policy_version 181774 (0.0028) [2024-06-13 06:28:29,542][71000] Updated weights for policy 0, policy_version 181784 (0.0027) [2024-06-13 06:28:30,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 2978398208. Throughput: 0: 49189.0. Samples: 2507262700. Policy #0 lag: (min: 1.0, avg: 12.7, max: 23.0) [2024-06-13 06:28:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:28:33,442][71000] Updated weights for policy 0, policy_version 181794 (0.0028) [2024-06-13 06:28:35,939][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 2978627584. Throughput: 0: 48938.8. Samples: 2507410800. 
Policy #0 lag: (min: 1.0, avg: 12.7, max: 23.0) [2024-06-13 06:28:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:28:36,583][71000] Updated weights for policy 0, policy_version 181804 (0.0028) [2024-06-13 06:28:40,206][71000] Updated weights for policy 0, policy_version 181814 (0.0022) [2024-06-13 06:28:40,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2978889728. Throughput: 0: 48863.9. Samples: 2507701880. Policy #0 lag: (min: 1.0, avg: 12.7, max: 23.0) [2024-06-13 06:28:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:28:43,167][71000] Updated weights for policy 0, policy_version 181824 (0.0027) [2024-06-13 06:28:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 2979102720. Throughput: 0: 48959.2. Samples: 2507992000. Policy #0 lag: (min: 1.0, avg: 12.7, max: 23.0) [2024-06-13 06:28:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:28:46,881][71000] Updated weights for policy 0, policy_version 181834 (0.0033) [2024-06-13 06:28:49,882][71000] Updated weights for policy 0, policy_version 181844 (0.0025) [2024-06-13 06:28:50,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 2979381248. Throughput: 0: 48856.8. Samples: 2508136400. Policy #0 lag: (min: 1.0, avg: 12.7, max: 23.0) [2024-06-13 06:28:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:28:53,575][71000] Updated weights for policy 0, policy_version 181854 (0.0028) [2024-06-13 06:28:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2979610624. Throughput: 0: 48962.7. Samples: 2508435740. 
Policy #0 lag: (min: 1.0, avg: 12.7, max: 23.0) [2024-06-13 06:28:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:28:56,481][71000] Updated weights for policy 0, policy_version 181864 (0.0029) [2024-06-13 06:29:00,358][71000] Updated weights for policy 0, policy_version 181874 (0.0036) [2024-06-13 06:29:00,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 2979856384. Throughput: 0: 48948.5. Samples: 2508730560. Policy #0 lag: (min: 1.0, avg: 12.7, max: 23.0) [2024-06-13 06:29:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:29:03,230][71000] Updated weights for policy 0, policy_version 181884 (0.0032) [2024-06-13 06:29:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2980085760. Throughput: 0: 48919.1. Samples: 2508869080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 06:29:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:29:07,202][71000] Updated weights for policy 0, policy_version 181894 (0.0027) [2024-06-13 06:29:09,756][71000] Updated weights for policy 0, policy_version 181904 (0.0026) [2024-06-13 06:29:10,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 2980347904. Throughput: 0: 48964.8. Samples: 2509164540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 06:29:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:29:13,705][71000] Updated weights for policy 0, policy_version 181914 (0.0031) [2024-06-13 06:29:15,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 2980610048. Throughput: 0: 48853.3. Samples: 2509461100. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 06:29:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:29:16,378][71000] Updated weights for policy 0, policy_version 181924 (0.0035) [2024-06-13 06:29:20,707][71000] Updated weights for policy 0, policy_version 181934 (0.0030) [2024-06-13 06:29:20,792][70980] Signal inference workers to stop experience collection... (37500 times) [2024-06-13 06:29:20,840][71000] InferenceWorker_p0-w0: stopping experience collection (37500 times) [2024-06-13 06:29:20,845][70980] Signal inference workers to resume experience collection... (37500 times) [2024-06-13 06:29:20,855][71000] InferenceWorker_p0-w0: resuming experience collection (37500 times) [2024-06-13 06:29:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 2980823040. Throughput: 0: 49026.6. Samples: 2509617000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 06:29:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:29:23,047][71000] Updated weights for policy 0, policy_version 181944 (0.0033) [2024-06-13 06:29:25,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49151.9, 300 sec: 49207.6). Total num frames: 2981068800. Throughput: 0: 48935.9. Samples: 2509904000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 06:29:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:29:27,154][71000] Updated weights for policy 0, policy_version 181954 (0.0032) [2024-06-13 06:29:29,688][71000] Updated weights for policy 0, policy_version 181964 (0.0027) [2024-06-13 06:29:30,939][70768] Fps is (10 sec: 50791.0, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 2981330944. Throughput: 0: 48916.1. Samples: 2510193220. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 06:29:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:29:33,870][71000] Updated weights for policy 0, policy_version 181974 (0.0030) [2024-06-13 06:29:35,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 2981593088. Throughput: 0: 49305.0. Samples: 2510355120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 06:29:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:29:36,655][71000] Updated weights for policy 0, policy_version 181984 (0.0032) [2024-06-13 06:29:40,317][71000] Updated weights for policy 0, policy_version 181994 (0.0026) [2024-06-13 06:29:40,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 2981822464. Throughput: 0: 49488.5. Samples: 2510662720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 06:29:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:29:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000181997_2981838848.pth... [2024-06-13 06:29:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000181275_2970009600.pth [2024-06-13 06:29:43,056][71000] Updated weights for policy 0, policy_version 182004 (0.0033) [2024-06-13 06:29:45,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2982051840. Throughput: 0: 49328.8. Samples: 2510950360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 06:29:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:29:47,218][71000] Updated weights for policy 0, policy_version 182014 (0.0030) [2024-06-13 06:29:49,414][71000] Updated weights for policy 0, policy_version 182024 (0.0025) [2024-06-13 06:29:50,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 2982330368. Throughput: 0: 49393.7. Samples: 2511091800. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 06:29:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:29:53,726][71000] Updated weights for policy 0, policy_version 182034 (0.0028) [2024-06-13 06:29:55,939][70768] Fps is (10 sec: 54067.8, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 2982592512. Throughput: 0: 49566.3. Samples: 2511395020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 06:29:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:29:56,122][71000] Updated weights for policy 0, policy_version 182044 (0.0025) [2024-06-13 06:30:00,293][71000] Updated weights for policy 0, policy_version 182054 (0.0026) [2024-06-13 06:30:00,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2982805504. Throughput: 0: 49593.3. Samples: 2511692800. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 06:30:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:30:02,739][71000] Updated weights for policy 0, policy_version 182064 (0.0023) [2024-06-13 06:30:05,940][70768] Fps is (10 sec: 44236.4, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2983034880. Throughput: 0: 49173.8. Samples: 2511829820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-13 06:30:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:30:06,935][71000] Updated weights for policy 0, policy_version 182074 (0.0033) [2024-06-13 06:30:09,388][71000] Updated weights for policy 0, policy_version 182084 (0.0040) [2024-06-13 06:30:10,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 2983313408. Throughput: 0: 49174.7. Samples: 2512116860. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-13 06:30:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:30:13,895][71000] Updated weights for policy 0, policy_version 182094 (0.0042) [2024-06-13 06:30:15,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 2983575552. Throughput: 0: 49257.1. Samples: 2512409800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-13 06:30:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:30:16,085][71000] Updated weights for policy 0, policy_version 182104 (0.0025) [2024-06-13 06:30:20,566][71000] Updated weights for policy 0, policy_version 182114 (0.0031) [2024-06-13 06:30:20,585][70980] Signal inference workers to stop experience collection... (37550 times) [2024-06-13 06:30:20,586][70980] Signal inference workers to resume experience collection... (37550 times) [2024-06-13 06:30:20,596][71000] InferenceWorker_p0-w0: stopping experience collection (37550 times) [2024-06-13 06:30:20,596][71000] InferenceWorker_p0-w0: resuming experience collection (37550 times) [2024-06-13 06:30:20,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 2983788544. Throughput: 0: 49179.2. Samples: 2512568180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-13 06:30:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:30:22,391][71000] Updated weights for policy 0, policy_version 182124 (0.0034) [2024-06-13 06:30:25,940][70768] Fps is (10 sec: 44236.9, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 2984017920. Throughput: 0: 48894.4. Samples: 2512862980. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-13 06:30:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:30:27,158][71000] Updated weights for policy 0, policy_version 182134 (0.0029) [2024-06-13 06:30:29,281][71000] Updated weights for policy 0, policy_version 182144 (0.0024) [2024-06-13 06:30:30,944][70768] Fps is (10 sec: 52405.6, 60 sec: 49694.5, 300 sec: 49429.0). Total num frames: 2984312832. Throughput: 0: 48996.7. Samples: 2513155420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-13 06:30:30,944][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 06:30:33,998][71000] Updated weights for policy 0, policy_version 182154 (0.0023) [2024-06-13 06:30:35,754][71000] Updated weights for policy 0, policy_version 182164 (0.0022) [2024-06-13 06:30:35,940][70768] Fps is (10 sec: 55705.1, 60 sec: 49697.9, 300 sec: 49374.1). Total num frames: 2984574976. Throughput: 0: 49436.2. Samples: 2513316440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-13 06:30:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:30:40,447][71000] Updated weights for policy 0, policy_version 182174 (0.0024) [2024-06-13 06:30:40,940][70768] Fps is (10 sec: 44256.1, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2984755200. Throughput: 0: 49243.1. Samples: 2513610960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-13 06:30:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:30:42,435][71000] Updated weights for policy 0, policy_version 182184 (0.0026) [2024-06-13 06:30:45,940][70768] Fps is (10 sec: 42598.9, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 2985000960. Throughput: 0: 49005.7. Samples: 2513898060. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-13 06:30:45,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:30:47,305][71000] Updated weights for policy 0, policy_version 182194 (0.0033) [2024-06-13 06:30:49,251][71000] Updated weights for policy 0, policy_version 182204 (0.0034) [2024-06-13 06:30:50,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49152.0, 300 sec: 49374.1). Total num frames: 2985279488. Throughput: 0: 49168.5. Samples: 2514042400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-13 06:30:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:30:53,912][71000] Updated weights for policy 0, policy_version 182214 (0.0031) [2024-06-13 06:30:55,939][70768] Fps is (10 sec: 54068.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2985541632. Throughput: 0: 49653.0. Samples: 2514351240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-13 06:30:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:30:55,986][71000] Updated weights for policy 0, policy_version 182224 (0.0033) [2024-06-13 06:31:00,462][71000] Updated weights for policy 0, policy_version 182234 (0.0031) [2024-06-13 06:31:00,939][70768] Fps is (10 sec: 44237.6, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 2985721856. Throughput: 0: 49621.2. Samples: 2514642740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-13 06:31:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:31:01,205][70980] Signal inference workers to stop experience collection... (37600 times) [2024-06-13 06:31:01,240][71000] InferenceWorker_p0-w0: stopping experience collection (37600 times) [2024-06-13 06:31:01,262][70980] Signal inference workers to resume experience collection... 
(37600 times) [2024-06-13 06:31:01,263][71000] InferenceWorker_p0-w0: resuming experience collection (37600 times) [2024-06-13 06:31:02,645][71000] Updated weights for policy 0, policy_version 182244 (0.0027) [2024-06-13 06:31:05,939][70768] Fps is (10 sec: 42598.3, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 2985967616. Throughput: 0: 48969.3. Samples: 2514771800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-13 06:31:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:31:07,486][71000] Updated weights for policy 0, policy_version 182254 (0.0032) [2024-06-13 06:31:09,363][71000] Updated weights for policy 0, policy_version 182264 (0.0022) [2024-06-13 06:31:10,940][70768] Fps is (10 sec: 54066.0, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2986262528. Throughput: 0: 48843.2. Samples: 2515060920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-13 06:31:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:31:13,834][71000] Updated weights for policy 0, policy_version 182274 (0.0023) [2024-06-13 06:31:15,939][70768] Fps is (10 sec: 55705.8, 60 sec: 49152.2, 300 sec: 49318.6). Total num frames: 2986524672. Throughput: 0: 49047.5. Samples: 2515362340. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-06-13 06:31:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:31:16,027][71000] Updated weights for policy 0, policy_version 182284 (0.0023) [2024-06-13 06:31:20,663][71000] Updated weights for policy 0, policy_version 182294 (0.0026) [2024-06-13 06:31:20,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 2986721280. Throughput: 0: 48884.5. Samples: 2515516240. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-06-13 06:31:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:31:22,667][71000] Updated weights for policy 0, policy_version 182304 (0.0023) [2024-06-13 06:31:25,940][70768] Fps is (10 sec: 42597.4, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2986950656. Throughput: 0: 48662.0. Samples: 2515800760. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-06-13 06:31:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:31:27,113][71000] Updated weights for policy 0, policy_version 182314 (0.0030) [2024-06-13 06:31:29,358][71000] Updated weights for policy 0, policy_version 182324 (0.0030) [2024-06-13 06:31:30,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48882.4, 300 sec: 49263.1). Total num frames: 2987245568. Throughput: 0: 48676.4. Samples: 2516088500. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-06-13 06:31:30,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:31:33,943][71000] Updated weights for policy 0, policy_version 182334 (0.0029) [2024-06-13 06:31:35,935][71000] Updated weights for policy 0, policy_version 182344 (0.0027) [2024-06-13 06:31:35,939][70768] Fps is (10 sec: 57345.2, 60 sec: 49152.2, 300 sec: 49374.2). Total num frames: 2987524096. Throughput: 0: 49322.7. Samples: 2516261920. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-06-13 06:31:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:31:40,598][71000] Updated weights for policy 0, policy_version 182354 (0.0023) [2024-06-13 06:31:40,939][70768] Fps is (10 sec: 47514.5, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 2987720704. Throughput: 0: 48964.9. Samples: 2516554660. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-06-13 06:31:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:31:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000182356_2987720704.pth... 
[2024-06-13 06:31:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000181637_2975940608.pth [2024-06-13 06:31:42,733][71000] Updated weights for policy 0, policy_version 182364 (0.0034) [2024-06-13 06:31:45,940][70768] Fps is (10 sec: 42598.2, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2987950080. Throughput: 0: 49013.6. Samples: 2516848360. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-06-13 06:31:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:31:46,942][71000] Updated weights for policy 0, policy_version 182374 (0.0026) [2024-06-13 06:31:48,554][70980] Signal inference workers to stop experience collection... (37650 times) [2024-06-13 06:31:48,563][71000] InferenceWorker_p0-w0: stopping experience collection (37650 times) [2024-06-13 06:31:48,620][70980] Signal inference workers to resume experience collection... (37650 times) [2024-06-13 06:31:48,621][71000] InferenceWorker_p0-w0: resuming experience collection (37650 times) [2024-06-13 06:31:49,523][71000] Updated weights for policy 0, policy_version 182384 (0.0033) [2024-06-13 06:31:50,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 2988228608. Throughput: 0: 49323.8. Samples: 2516991380. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-06-13 06:31:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:31:53,938][71000] Updated weights for policy 0, policy_version 182394 (0.0027) [2024-06-13 06:31:55,939][70768] Fps is (10 sec: 54067.7, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 2988490752. Throughput: 0: 49457.5. Samples: 2517286500. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-06-13 06:31:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:31:55,981][71000] Updated weights for policy 0, policy_version 182404 (0.0028) [2024-06-13 06:32:00,267][71000] Updated weights for policy 0, policy_version 182414 (0.0042) [2024-06-13 06:32:00,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49698.0, 300 sec: 49096.4). Total num frames: 2988703744. Throughput: 0: 49363.9. Samples: 2517583720. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-06-13 06:32:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:32:03,046][71000] Updated weights for policy 0, policy_version 182424 (0.0024) [2024-06-13 06:32:05,939][70768] Fps is (10 sec: 42598.2, 60 sec: 49152.0, 300 sec: 49041.0). Total num frames: 2988916736. Throughput: 0: 48888.6. Samples: 2517716220. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-06-13 06:32:05,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 06:32:07,066][71000] Updated weights for policy 0, policy_version 182434 (0.0024) [2024-06-13 06:32:09,533][71000] Updated weights for policy 0, policy_version 182444 (0.0026) [2024-06-13 06:32:10,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 2989211648. Throughput: 0: 49137.1. Samples: 2518011920. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-06-13 06:32:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:32:13,712][71000] Updated weights for policy 0, policy_version 182454 (0.0027) [2024-06-13 06:32:15,940][70768] Fps is (10 sec: 54066.2, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 2989457408. Throughput: 0: 49523.5. Samples: 2518317060. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-06-13 06:32:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:32:16,429][71000] Updated weights for policy 0, policy_version 182464 (0.0023) [2024-06-13 06:32:20,366][71000] Updated weights for policy 0, policy_version 182474 (0.0026) [2024-06-13 06:32:20,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.1, 300 sec: 49096.4). Total num frames: 2989686784. Throughput: 0: 48795.4. Samples: 2518457720. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-13 06:32:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:32:22,885][71000] Updated weights for policy 0, policy_version 182484 (0.0025) [2024-06-13 06:32:25,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49425.2, 300 sec: 49040.9). Total num frames: 2989916160. Throughput: 0: 48785.7. Samples: 2518750020. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-13 06:32:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:32:26,968][71000] Updated weights for policy 0, policy_version 182494 (0.0025) [2024-06-13 06:32:28,710][70980] Signal inference workers to stop experience collection... (37700 times) [2024-06-13 06:32:28,711][70980] Signal inference workers to resume experience collection... (37700 times) [2024-06-13 06:32:28,737][71000] InferenceWorker_p0-w0: stopping experience collection (37700 times) [2024-06-13 06:32:28,737][71000] InferenceWorker_p0-w0: resuming experience collection (37700 times) [2024-06-13 06:32:29,824][71000] Updated weights for policy 0, policy_version 182504 (0.0030) [2024-06-13 06:32:30,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 2990194688. Throughput: 0: 48559.5. Samples: 2519033540. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-13 06:32:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:32:33,852][71000] Updated weights for policy 0, policy_version 182514 (0.0032) [2024-06-13 06:32:35,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48332.8, 300 sec: 49096.5). Total num frames: 2990424064. Throughput: 0: 49033.6. Samples: 2519197880. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-13 06:32:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:32:36,333][71000] Updated weights for policy 0, policy_version 182524 (0.0032) [2024-06-13 06:32:40,296][71000] Updated weights for policy 0, policy_version 182534 (0.0023) [2024-06-13 06:32:40,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 2990653440. Throughput: 0: 48952.3. Samples: 2519489360. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-13 06:32:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:32:43,410][71000] Updated weights for policy 0, policy_version 182544 (0.0025) [2024-06-13 06:32:45,940][70768] Fps is (10 sec: 47512.5, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 2990899200. Throughput: 0: 48806.1. Samples: 2519780000. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-13 06:32:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:32:47,020][71000] Updated weights for policy 0, policy_version 182554 (0.0030) [2024-06-13 06:32:50,145][71000] Updated weights for policy 0, policy_version 182564 (0.0034) [2024-06-13 06:32:50,939][70768] Fps is (10 sec: 50790.8, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 2991161344. Throughput: 0: 49166.2. Samples: 2519928700. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-13 06:32:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:32:53,573][71000] Updated weights for policy 0, policy_version 182574 (0.0027) [2024-06-13 06:32:55,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48605.7, 300 sec: 49096.4). Total num frames: 2991407104. Throughput: 0: 49065.2. Samples: 2520219860. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-13 06:32:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:32:57,116][71000] Updated weights for policy 0, policy_version 182584 (0.0036) [2024-06-13 06:33:00,499][71000] Updated weights for policy 0, policy_version 182594 (0.0030) [2024-06-13 06:33:00,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2991620096. Throughput: 0: 48738.0. Samples: 2520510260. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-13 06:33:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:33:03,652][71000] Updated weights for policy 0, policy_version 182604 (0.0034) [2024-06-13 06:33:05,939][70768] Fps is (10 sec: 45876.3, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 2991865856. Throughput: 0: 48775.4. Samples: 2520652600. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-13 06:33:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:33:07,322][71000] Updated weights for policy 0, policy_version 182614 (0.0030) [2024-06-13 06:33:10,320][71000] Updated weights for policy 0, policy_version 182624 (0.0039) [2024-06-13 06:33:10,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 2992144384. Throughput: 0: 48692.4. Samples: 2520941180. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-13 06:33:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:33:14,108][71000] Updated weights for policy 0, policy_version 182634 (0.0020) [2024-06-13 06:33:15,940][70768] Fps is (10 sec: 50788.4, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 2992373760. Throughput: 0: 48974.4. Samples: 2521237400. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-13 06:33:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:33:17,381][71000] Updated weights for policy 0, policy_version 182644 (0.0032) [2024-06-13 06:33:20,818][71000] Updated weights for policy 0, policy_version 182654 (0.0027) [2024-06-13 06:33:20,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48606.0, 300 sec: 49096.4). Total num frames: 2992603136. Throughput: 0: 48309.7. Samples: 2521371820. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-13 06:33:20,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:33:24,377][71000] Updated weights for policy 0, policy_version 182664 (0.0024) [2024-06-13 06:33:25,940][70768] Fps is (10 sec: 47514.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2992848896. Throughput: 0: 48566.7. Samples: 2521674860. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-13 06:33:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:33:27,579][71000] Updated weights for policy 0, policy_version 182674 (0.0026) [2024-06-13 06:33:28,972][70980] Signal inference workers to stop experience collection... (37750 times) [2024-06-13 06:33:29,008][71000] InferenceWorker_p0-w0: stopping experience collection (37750 times) [2024-06-13 06:33:29,028][70980] Signal inference workers to resume experience collection... 
(37750 times) [2024-06-13 06:33:29,032][71000] InferenceWorker_p0-w0: resuming experience collection (37750 times) [2024-06-13 06:33:30,696][71000] Updated weights for policy 0, policy_version 182684 (0.0026) [2024-06-13 06:33:30,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48332.9, 300 sec: 49040.9). Total num frames: 2993094656. Throughput: 0: 48681.6. Samples: 2521970660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 06:33:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:33:34,133][71000] Updated weights for policy 0, policy_version 182694 (0.0034) [2024-06-13 06:33:35,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2993340416. Throughput: 0: 48751.6. Samples: 2522122520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 06:33:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:33:37,386][71000] Updated weights for policy 0, policy_version 182704 (0.0031) [2024-06-13 06:33:40,602][71000] Updated weights for policy 0, policy_version 182714 (0.0024) [2024-06-13 06:33:40,940][70768] Fps is (10 sec: 49150.8, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 2993586176. Throughput: 0: 48728.8. Samples: 2522412660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 06:33:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:33:41,085][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000182715_2993602560.pth... [2024-06-13 06:33:41,134][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000181997_2981838848.pth [2024-06-13 06:33:44,282][71000] Updated weights for policy 0, policy_version 182724 (0.0026) [2024-06-13 06:33:45,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2993848320. Throughput: 0: 48744.4. Samples: 2522703760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 06:33:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:33:47,088][71000] Updated weights for policy 0, policy_version 182734 (0.0026) [2024-06-13 06:33:50,713][71000] Updated weights for policy 0, policy_version 182744 (0.0024) [2024-06-13 06:33:50,939][70768] Fps is (10 sec: 49153.1, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 2994077696. Throughput: 0: 49147.0. Samples: 2522864220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 06:33:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:33:54,132][71000] Updated weights for policy 0, policy_version 182754 (0.0031) [2024-06-13 06:33:55,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48606.0, 300 sec: 49040.9). Total num frames: 2994323456. Throughput: 0: 49212.4. Samples: 2523155740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 06:33:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:33:57,770][71000] Updated weights for policy 0, policy_version 182764 (0.0029) [2024-06-13 06:34:00,827][71000] Updated weights for policy 0, policy_version 182774 (0.0023) [2024-06-13 06:34:00,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 2994569216. Throughput: 0: 48978.4. Samples: 2523441420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 06:34:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:34:04,410][71000] Updated weights for policy 0, policy_version 182784 (0.0025) [2024-06-13 06:34:05,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49424.9, 300 sec: 49096.4). Total num frames: 2994831360. Throughput: 0: 49268.3. Samples: 2523588900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 06:34:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:34:07,455][71000] Updated weights for policy 0, policy_version 182794 (0.0035) [2024-06-13 06:34:10,910][71000] Updated weights for policy 0, policy_version 182804 (0.0025) [2024-06-13 06:34:10,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 2995060736. Throughput: 0: 49171.1. Samples: 2523887560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 06:34:10,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 06:34:14,354][71000] Updated weights for policy 0, policy_version 182814 (0.0029) [2024-06-13 06:34:15,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 2995306496. Throughput: 0: 49002.0. Samples: 2524175760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 06:34:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:34:17,789][71000] Updated weights for policy 0, policy_version 182824 (0.0027) [2024-06-13 06:34:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 2995535872. Throughput: 0: 48815.0. Samples: 2524319200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 06:34:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:34:21,011][71000] Updated weights for policy 0, policy_version 182834 (0.0028) [2024-06-13 06:34:24,217][71000] Updated weights for policy 0, policy_version 182844 (0.0028) [2024-06-13 06:34:25,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2995798016. Throughput: 0: 48957.5. Samples: 2524615740. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 06:34:25,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:34:27,372][71000] Updated weights for policy 0, policy_version 182854 (0.0026) [2024-06-13 06:34:28,479][70980] Signal inference workers to stop experience collection... (37800 times) [2024-06-13 06:34:28,480][70980] Signal inference workers to resume experience collection... (37800 times) [2024-06-13 06:34:28,522][71000] InferenceWorker_p0-w0: stopping experience collection (37800 times) [2024-06-13 06:34:28,523][71000] InferenceWorker_p0-w0: resuming experience collection (37800 times) [2024-06-13 06:34:30,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 2996027392. Throughput: 0: 49183.2. Samples: 2524917000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 06:34:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 06:34:31,020][71000] Updated weights for policy 0, policy_version 182864 (0.0032) [2024-06-13 06:34:34,245][71000] Updated weights for policy 0, policy_version 182874 (0.0029) [2024-06-13 06:34:35,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2996273152. Throughput: 0: 48689.8. Samples: 2525055260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 06:34:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:34:37,624][71000] Updated weights for policy 0, policy_version 182884 (0.0029) [2024-06-13 06:34:40,939][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 2996518912. Throughput: 0: 48761.9. Samples: 2525350020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 06:34:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:34:41,040][71000] Updated weights for policy 0, policy_version 182894 (0.0031) [2024-06-13 06:34:44,090][71000] Updated weights for policy 0, policy_version 182904 (0.0027) [2024-06-13 06:34:45,940][70768] Fps is (10 sec: 50789.5, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 2996781056. Throughput: 0: 49364.0. Samples: 2525662800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 06:34:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:34:47,311][71000] Updated weights for policy 0, policy_version 182914 (0.0038) [2024-06-13 06:34:50,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 2997010432. Throughput: 0: 49123.3. Samples: 2525799440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 06:34:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:34:50,980][71000] Updated weights for policy 0, policy_version 182924 (0.0019) [2024-06-13 06:34:53,777][71000] Updated weights for policy 0, policy_version 182934 (0.0031) [2024-06-13 06:34:55,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 2997272576. Throughput: 0: 49115.5. Samples: 2526097760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 06:34:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:34:57,342][71000] Updated weights for policy 0, policy_version 182944 (0.0026) [2024-06-13 06:35:00,663][71000] Updated weights for policy 0, policy_version 182954 (0.0033) [2024-06-13 06:35:00,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 2997534720. Throughput: 0: 49363.6. Samples: 2526397120. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 06:35:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:35:04,161][71000] Updated weights for policy 0, policy_version 182964 (0.0029) [2024-06-13 06:35:05,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2997780480. Throughput: 0: 49496.5. Samples: 2526546540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 06:35:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:35:07,171][71000] Updated weights for policy 0, policy_version 182974 (0.0035) [2024-06-13 06:35:10,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 2997993472. Throughput: 0: 49204.9. Samples: 2526829960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 06:35:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:35:11,050][71000] Updated weights for policy 0, policy_version 182984 (0.0034) [2024-06-13 06:35:13,880][71000] Updated weights for policy 0, policy_version 182994 (0.0025) [2024-06-13 06:35:15,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 2998255616. Throughput: 0: 49015.9. Samples: 2527122720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 06:35:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:35:17,695][71000] Updated weights for policy 0, policy_version 183004 (0.0037) [2024-06-13 06:35:20,650][71000] Updated weights for policy 0, policy_version 183014 (0.0021) [2024-06-13 06:35:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2998501376. Throughput: 0: 49332.3. Samples: 2527275220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 06:35:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:35:23,995][70980] Signal inference workers to stop experience collection... 
(37850 times) [2024-06-13 06:35:23,996][70980] Signal inference workers to resume experience collection... (37850 times) [2024-06-13 06:35:24,005][71000] InferenceWorker_p0-w0: stopping experience collection (37850 times) [2024-06-13 06:35:24,011][71000] InferenceWorker_p0-w0: resuming experience collection (37850 times) [2024-06-13 06:35:24,140][71000] Updated weights for policy 0, policy_version 183024 (0.0026) [2024-06-13 06:35:25,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 48930.6). Total num frames: 2998747136. Throughput: 0: 49122.2. Samples: 2527560520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 06:35:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:35:27,068][71000] Updated weights for policy 0, policy_version 183034 (0.0034) [2024-06-13 06:35:30,880][71000] Updated weights for policy 0, policy_version 183044 (0.0033) [2024-06-13 06:35:30,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49425.1, 300 sec: 48874.4). Total num frames: 2998992896. Throughput: 0: 49012.2. Samples: 2527868340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 06:35:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:35:33,832][71000] Updated weights for policy 0, policy_version 183054 (0.0030) [2024-06-13 06:35:35,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2999238656. Throughput: 0: 49062.2. Samples: 2528007240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 06:35:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:35:37,490][71000] Updated weights for policy 0, policy_version 183064 (0.0032) [2024-06-13 06:35:40,541][71000] Updated weights for policy 0, policy_version 183074 (0.0024) [2024-06-13 06:35:40,939][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 2999484416. Throughput: 0: 48782.3. Samples: 2528292960. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-13 06:35:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:35:41,065][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000183075_2999500800.pth... [2024-06-13 06:35:41,110][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000182356_2987720704.pth [2024-06-13 06:35:44,518][71000] Updated weights for policy 0, policy_version 183084 (0.0024) [2024-06-13 06:35:45,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 2999730176. Throughput: 0: 48771.1. Samples: 2528591820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-13 06:35:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 06:35:47,147][71000] Updated weights for policy 0, policy_version 183094 (0.0026) [2024-06-13 06:35:50,919][71000] Updated weights for policy 0, policy_version 183104 (0.0027) [2024-06-13 06:35:50,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49424.9, 300 sec: 48929.8). Total num frames: 2999975936. Throughput: 0: 48954.5. Samples: 2528749500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-13 06:35:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:35:53,719][71000] Updated weights for policy 0, policy_version 183114 (0.0031) [2024-06-13 06:35:55,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 49096.4). Total num frames: 3000205312. Throughput: 0: 49144.0. Samples: 2529041440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-13 06:35:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:35:57,409][71000] Updated weights for policy 0, policy_version 183124 (0.0035) [2024-06-13 06:36:00,159][71000] Updated weights for policy 0, policy_version 183134 (0.0034) [2024-06-13 06:36:00,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 3000483840. Throughput: 0: 49074.3. Samples: 2529331060. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-13 06:36:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:36:03,971][71000] Updated weights for policy 0, policy_version 183144 (0.0034) [2024-06-13 06:36:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 3000713216. Throughput: 0: 49193.3. Samples: 2529488920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-13 06:36:05,944][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:36:06,992][71000] Updated weights for policy 0, policy_version 183154 (0.0025) [2024-06-13 06:36:10,868][71000] Updated weights for policy 0, policy_version 183164 (0.0023) [2024-06-13 06:36:10,940][70768] Fps is (10 sec: 47511.9, 60 sec: 49424.8, 300 sec: 48929.8). Total num frames: 3000958976. Throughput: 0: 49465.0. Samples: 2529786460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-13 06:36:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:36:13,640][71000] Updated weights for policy 0, policy_version 183174 (0.0029) [2024-06-13 06:36:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3001204736. Throughput: 0: 49157.1. Samples: 2530080420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-13 06:36:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:36:17,514][71000] Updated weights for policy 0, policy_version 183184 (0.0034) [2024-06-13 06:36:20,685][71000] Updated weights for policy 0, policy_version 183194 (0.0031) [2024-06-13 06:36:20,940][70768] Fps is (10 sec: 50791.5, 60 sec: 49425.1, 300 sec: 49207.6). Total num frames: 3001466880. Throughput: 0: 49260.7. Samples: 2530223980. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-13 06:36:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:36:24,271][71000] Updated weights for policy 0, policy_version 183204 (0.0021) [2024-06-13 06:36:25,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49425.1, 300 sec: 49041.0). Total num frames: 3001712640. Throughput: 0: 49403.6. Samples: 2530516120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-13 06:36:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:36:27,318][71000] Updated weights for policy 0, policy_version 183214 (0.0029) [2024-06-13 06:36:30,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 3001925632. Throughput: 0: 49436.8. Samples: 2530816480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-13 06:36:30,949][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 06:36:31,049][71000] Updated weights for policy 0, policy_version 183224 (0.0027) [2024-06-13 06:36:33,987][71000] Updated weights for policy 0, policy_version 183234 (0.0031) [2024-06-13 06:36:35,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3002187776. Throughput: 0: 49209.5. Samples: 2530963920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-13 06:36:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:36:37,734][71000] Updated weights for policy 0, policy_version 183244 (0.0026) [2024-06-13 06:36:39,817][70980] Signal inference workers to stop experience collection... (37900 times) [2024-06-13 06:36:39,817][70980] Signal inference workers to resume experience collection... 
(37900 times) [2024-06-13 06:36:39,845][71000] InferenceWorker_p0-w0: stopping experience collection (37900 times) [2024-06-13 06:36:39,845][71000] InferenceWorker_p0-w0: resuming experience collection (37900 times) [2024-06-13 06:36:40,357][71000] Updated weights for policy 0, policy_version 183254 (0.0029) [2024-06-13 06:36:40,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3002433536. Throughput: 0: 49194.3. Samples: 2531255180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-13 06:36:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:36:44,399][71000] Updated weights for policy 0, policy_version 183264 (0.0039) [2024-06-13 06:36:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3002679296. Throughput: 0: 49115.5. Samples: 2531541260. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-13 06:36:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:36:47,256][71000] Updated weights for policy 0, policy_version 183274 (0.0026) [2024-06-13 06:36:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 3002925056. Throughput: 0: 48897.0. Samples: 2531689280. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-13 06:36:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:36:50,940][71000] Updated weights for policy 0, policy_version 183284 (0.0032) [2024-06-13 06:36:54,187][71000] Updated weights for policy 0, policy_version 183294 (0.0028) [2024-06-13 06:36:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3003170816. Throughput: 0: 48979.4. Samples: 2531990520. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-13 06:36:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:36:57,671][71000] Updated weights for policy 0, policy_version 183304 (0.0032) [2024-06-13 06:37:00,722][71000] Updated weights for policy 0, policy_version 183314 (0.0033) [2024-06-13 06:37:00,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3003416576. Throughput: 0: 49105.5. Samples: 2532290160. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-13 06:37:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:37:04,344][71000] Updated weights for policy 0, policy_version 183324 (0.0029) [2024-06-13 06:37:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3003662336. Throughput: 0: 49158.2. Samples: 2532436100. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-13 06:37:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:37:07,405][71000] Updated weights for policy 0, policy_version 183334 (0.0030) [2024-06-13 06:37:10,785][71000] Updated weights for policy 0, policy_version 183344 (0.0029) [2024-06-13 06:37:10,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.3, 300 sec: 49040.9). Total num frames: 3003924480. Throughput: 0: 49136.8. Samples: 2532727280. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-13 06:37:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:37:13,734][71000] Updated weights for policy 0, policy_version 183354 (0.0027) [2024-06-13 06:37:15,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 3004170240. Throughput: 0: 49214.8. Samples: 2533031140. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-13 06:37:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:37:17,396][71000] Updated weights for policy 0, policy_version 183364 (0.0027) [2024-06-13 06:37:20,553][71000] Updated weights for policy 0, policy_version 183374 (0.0029) [2024-06-13 06:37:20,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 3004432384. Throughput: 0: 49366.2. Samples: 2533185400. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-13 06:37:20,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 06:37:24,060][71000] Updated weights for policy 0, policy_version 183384 (0.0023) [2024-06-13 06:37:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3004645376. Throughput: 0: 49493.7. Samples: 2533482400. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-13 06:37:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:37:27,166][71000] Updated weights for policy 0, policy_version 183394 (0.0034) [2024-06-13 06:37:30,499][71000] Updated weights for policy 0, policy_version 183404 (0.0031) [2024-06-13 06:37:30,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49698.1, 300 sec: 49096.4). Total num frames: 3004907520. Throughput: 0: 49844.7. Samples: 2533784280. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-13 06:37:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:37:33,799][71000] Updated weights for policy 0, policy_version 183414 (0.0028) [2024-06-13 06:37:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3005153280. Throughput: 0: 49808.8. Samples: 2533930680. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-13 06:37:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:37:37,090][71000] Updated weights for policy 0, policy_version 183424 (0.0025) [2024-06-13 06:37:40,398][71000] Updated weights for policy 0, policy_version 183434 (0.0033) [2024-06-13 06:37:40,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49698.1, 300 sec: 49207.6). Total num frames: 3005415424. Throughput: 0: 49672.0. Samples: 2534225760. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-13 06:37:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:37:40,970][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000183437_3005431808.pth... [2024-06-13 06:37:41,022][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000182715_2993602560.pth [2024-06-13 06:37:43,708][71000] Updated weights for policy 0, policy_version 183444 (0.0034) [2024-06-13 06:37:45,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3005628416. Throughput: 0: 49599.5. Samples: 2534522140. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-13 06:37:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:37:47,013][71000] Updated weights for policy 0, policy_version 183454 (0.0028) [2024-06-13 06:37:50,079][70980] Signal inference workers to stop experience collection... (37950 times) [2024-06-13 06:37:50,131][71000] InferenceWorker_p0-w0: stopping experience collection (37950 times) [2024-06-13 06:37:50,131][70980] Signal inference workers to resume experience collection... (37950 times) [2024-06-13 06:37:50,147][71000] InferenceWorker_p0-w0: resuming experience collection (37950 times) [2024-06-13 06:37:50,268][71000] Updated weights for policy 0, policy_version 183464 (0.0039) [2024-06-13 06:37:50,944][70768] Fps is (10 sec: 49130.7, 60 sec: 49694.6, 300 sec: 49151.3). Total num frames: 3005906944. Throughput: 0: 49534.5. Samples: 2534665360. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 06:37:50,944][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:37:53,714][71000] Updated weights for policy 0, policy_version 183474 (0.0030) [2024-06-13 06:37:55,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 3006136320. Throughput: 0: 49715.6. Samples: 2534964480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 06:37:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:37:56,994][71000] Updated weights for policy 0, policy_version 183484 (0.0027) [2024-06-13 06:38:00,311][71000] Updated weights for policy 0, policy_version 183494 (0.0034) [2024-06-13 06:38:00,940][70768] Fps is (10 sec: 50812.2, 60 sec: 49971.1, 300 sec: 49318.6). Total num frames: 3006414848. Throughput: 0: 49644.4. Samples: 2535265140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 06:38:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:38:03,675][71000] Updated weights for policy 0, policy_version 183504 (0.0022) [2024-06-13 06:38:05,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 3006627840. Throughput: 0: 49485.9. Samples: 2535412260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 06:38:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:38:06,859][71000] Updated weights for policy 0, policy_version 183514 (0.0035) [2024-06-13 06:38:10,209][71000] Updated weights for policy 0, policy_version 183524 (0.0027) [2024-06-13 06:38:10,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3006873600. Throughput: 0: 49424.4. Samples: 2535706500. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 06:38:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:38:13,462][71000] Updated weights for policy 0, policy_version 183534 (0.0019) [2024-06-13 06:38:15,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 3007119360. Throughput: 0: 49217.4. Samples: 2535999060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 06:38:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:38:16,762][71000] Updated weights for policy 0, policy_version 183544 (0.0027) [2024-06-13 06:38:20,036][71000] Updated weights for policy 0, policy_version 183554 (0.0037) [2024-06-13 06:38:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 3007381504. Throughput: 0: 49266.7. Samples: 2536147680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 06:38:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:38:23,642][71000] Updated weights for policy 0, policy_version 183564 (0.0028) [2024-06-13 06:38:25,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 3007610880. Throughput: 0: 49156.8. Samples: 2536437820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 06:38:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 06:38:26,922][71000] Updated weights for policy 0, policy_version 183574 (0.0037) [2024-06-13 06:38:30,215][71000] Updated weights for policy 0, policy_version 183584 (0.0037) [2024-06-13 06:38:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 3007856640. Throughput: 0: 49047.5. Samples: 2536729280. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 06:38:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:38:33,445][71000] Updated weights for policy 0, policy_version 183594 (0.0022) [2024-06-13 06:38:35,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 3008102400. Throughput: 0: 49246.5. Samples: 2536881240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 06:38:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:38:36,905][71000] Updated weights for policy 0, policy_version 183604 (0.0033) [2024-06-13 06:38:40,090][71000] Updated weights for policy 0, policy_version 183614 (0.0032) [2024-06-13 06:38:40,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3008348160. Throughput: 0: 49132.9. Samples: 2537175460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 06:38:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 06:38:43,651][71000] Updated weights for policy 0, policy_version 183624 (0.0025) [2024-06-13 06:38:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3008577536. Throughput: 0: 49165.4. Samples: 2537477580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 06:38:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:38:46,938][71000] Updated weights for policy 0, policy_version 183634 (0.0028) [2024-06-13 06:38:50,234][71000] Updated weights for policy 0, policy_version 183644 (0.0028) [2024-06-13 06:38:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48882.4, 300 sec: 49207.5). Total num frames: 3008839680. Throughput: 0: 48864.4. Samples: 2537611160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 06:38:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:38:53,387][71000] Updated weights for policy 0, policy_version 183654 (0.0027) [2024-06-13 06:38:54,709][70980] Signal inference workers to stop experience collection... 
(38000 times) [2024-06-13 06:38:54,753][71000] InferenceWorker_p0-w0: stopping experience collection (38000 times) [2024-06-13 06:38:54,757][70980] Signal inference workers to resume experience collection... (38000 times) [2024-06-13 06:38:54,774][71000] InferenceWorker_p0-w0: resuming experience collection (38000 times) [2024-06-13 06:38:55,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 3009085440. Throughput: 0: 49071.0. Samples: 2537914700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 06:38:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:38:56,795][71000] Updated weights for policy 0, policy_version 183664 (0.0037) [2024-06-13 06:39:00,391][71000] Updated weights for policy 0, policy_version 183674 (0.0022) [2024-06-13 06:39:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 3009331200. Throughput: 0: 48943.6. Samples: 2538201520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:39:00,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 06:39:03,548][71000] Updated weights for policy 0, policy_version 183684 (0.0026) [2024-06-13 06:39:05,940][70768] Fps is (10 sec: 47510.9, 60 sec: 48878.3, 300 sec: 49151.9). Total num frames: 3009560576. Throughput: 0: 48871.2. Samples: 2538346920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:39:05,941][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:39:07,127][71000] Updated weights for policy 0, policy_version 183694 (0.0032) [2024-06-13 06:39:10,116][71000] Updated weights for policy 0, policy_version 183704 (0.0035) [2024-06-13 06:39:10,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 3009806336. Throughput: 0: 48960.3. Samples: 2538641040. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:39:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:39:13,650][71000] Updated weights for policy 0, policy_version 183714 (0.0034) [2024-06-13 06:39:15,940][70768] Fps is (10 sec: 52432.0, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 3010084864. Throughput: 0: 49093.7. Samples: 2538938500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:39:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:39:16,736][71000] Updated weights for policy 0, policy_version 183724 (0.0035) [2024-06-13 06:39:20,201][71000] Updated weights for policy 0, policy_version 183734 (0.0027) [2024-06-13 06:39:20,939][70768] Fps is (10 sec: 52429.8, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 3010330624. Throughput: 0: 49181.4. Samples: 2539094400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:39:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:39:23,444][71000] Updated weights for policy 0, policy_version 183744 (0.0025) [2024-06-13 06:39:25,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 3010543616. Throughput: 0: 49216.0. Samples: 2539390180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:39:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:39:27,128][71000] Updated weights for policy 0, policy_version 183754 (0.0038) [2024-06-13 06:39:30,036][71000] Updated weights for policy 0, policy_version 183764 (0.0025) [2024-06-13 06:39:30,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 3010789376. Throughput: 0: 48676.9. Samples: 2539668040. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:39:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:39:33,794][71000] Updated weights for policy 0, policy_version 183774 (0.0034) [2024-06-13 06:39:35,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49424.9, 300 sec: 49318.6). Total num frames: 3011067904. Throughput: 0: 49135.0. Samples: 2539822240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:39:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:39:36,914][71000] Updated weights for policy 0, policy_version 183784 (0.0030) [2024-06-13 06:39:40,206][71000] Updated weights for policy 0, policy_version 183794 (0.0025) [2024-06-13 06:39:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49151.9, 300 sec: 49207.6). Total num frames: 3011297280. Throughput: 0: 49036.5. Samples: 2540121340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:39:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:39:41,055][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000183796_3011313664.pth... [2024-06-13 06:39:41,089][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000183075_2999500800.pth [2024-06-13 06:39:43,795][71000] Updated weights for policy 0, policy_version 183804 (0.0028) [2024-06-13 06:39:45,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 3011526656. Throughput: 0: 49306.3. Samples: 2540420300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:39:45,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:39:47,018][71000] Updated weights for policy 0, policy_version 183814 (0.0023) [2024-06-13 06:39:50,450][71000] Updated weights for policy 0, policy_version 183824 (0.0026) [2024-06-13 06:39:50,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3011772416. Throughput: 0: 49058.0. Samples: 2540554500. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:39:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:39:53,603][71000] Updated weights for policy 0, policy_version 183834 (0.0031) [2024-06-13 06:39:55,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.2, 300 sec: 49207.5). Total num frames: 3012050944. Throughput: 0: 49038.4. Samples: 2540847760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:39:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:39:57,199][71000] Updated weights for policy 0, policy_version 183844 (0.0036) [2024-06-13 06:39:59,988][71000] Updated weights for policy 0, policy_version 183854 (0.0032) [2024-06-13 06:40:00,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 3012296704. Throughput: 0: 49154.3. Samples: 2541150440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-13 06:40:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:40:03,787][71000] Updated weights for policy 0, policy_version 183864 (0.0028) [2024-06-13 06:40:05,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.6, 300 sec: 49207.5). Total num frames: 3012509696. Throughput: 0: 48977.3. Samples: 2541298380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 06:40:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:40:06,961][71000] Updated weights for policy 0, policy_version 183874 (0.0029) [2024-06-13 06:40:10,752][71000] Updated weights for policy 0, policy_version 183884 (0.0031) [2024-06-13 06:40:10,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3012755456. Throughput: 0: 48843.8. Samples: 2541588160. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 06:40:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:40:13,719][71000] Updated weights for policy 0, policy_version 183894 (0.0025) [2024-06-13 06:40:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 3013017600. Throughput: 0: 49266.2. Samples: 2541885020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 06:40:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:40:17,365][71000] Updated weights for policy 0, policy_version 183904 (0.0029) [2024-06-13 06:40:20,417][71000] Updated weights for policy 0, policy_version 183914 (0.0025) [2024-06-13 06:40:20,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 3013263360. Throughput: 0: 49121.4. Samples: 2542032700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 06:40:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:40:22,532][70980] Signal inference workers to stop experience collection... (38050 times) [2024-06-13 06:40:22,533][70980] Signal inference workers to resume experience collection... (38050 times) [2024-06-13 06:40:22,541][71000] InferenceWorker_p0-w0: stopping experience collection (38050 times) [2024-06-13 06:40:22,570][71000] InferenceWorker_p0-w0: resuming experience collection (38050 times) [2024-06-13 06:40:24,202][71000] Updated weights for policy 0, policy_version 183924 (0.0031) [2024-06-13 06:40:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3013492736. Throughput: 0: 48929.3. Samples: 2542323160. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 06:40:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:40:27,411][71000] Updated weights for policy 0, policy_version 183934 (0.0033) [2024-06-13 06:40:30,863][71000] Updated weights for policy 0, policy_version 183944 (0.0023) [2024-06-13 06:40:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3013738496. Throughput: 0: 48575.9. Samples: 2542606220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 06:40:30,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 06:40:34,012][71000] Updated weights for policy 0, policy_version 183954 (0.0036) [2024-06-13 06:40:35,939][70768] Fps is (10 sec: 50790.9, 60 sec: 48879.1, 300 sec: 49207.5). Total num frames: 3014000640. Throughput: 0: 48982.8. Samples: 2542758720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 06:40:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:40:37,586][71000] Updated weights for policy 0, policy_version 183964 (0.0029) [2024-06-13 06:40:40,668][71000] Updated weights for policy 0, policy_version 183974 (0.0029) [2024-06-13 06:40:40,944][70768] Fps is (10 sec: 49131.4, 60 sec: 48875.4, 300 sec: 49151.3). Total num frames: 3014230016. Throughput: 0: 49046.0. Samples: 2543055040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 06:40:40,944][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:40:44,103][71000] Updated weights for policy 0, policy_version 183984 (0.0027) [2024-06-13 06:40:45,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 3014443008. Throughput: 0: 48873.9. Samples: 2543349760. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 06:40:45,940][70768] Avg episode reward: [(0, '0.295')] [2024-06-13 06:40:47,533][71000] Updated weights for policy 0, policy_version 183994 (0.0030) [2024-06-13 06:40:50,847][71000] Updated weights for policy 0, policy_version 184004 (0.0027) [2024-06-13 06:40:50,940][70768] Fps is (10 sec: 49173.0, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 3014721536. Throughput: 0: 48696.8. Samples: 2543489740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 06:40:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:40:53,943][71000] Updated weights for policy 0, policy_version 184014 (0.0026) [2024-06-13 06:40:55,940][70768] Fps is (10 sec: 54067.2, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3014983680. Throughput: 0: 48966.8. Samples: 2543791660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 06:40:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:40:57,621][71000] Updated weights for policy 0, policy_version 184024 (0.0028) [2024-06-13 06:41:00,564][71000] Updated weights for policy 0, policy_version 184034 (0.0032) [2024-06-13 06:41:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 49207.6). Total num frames: 3015229440. Throughput: 0: 48883.1. Samples: 2544084760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 06:41:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 06:41:04,275][71000] Updated weights for policy 0, policy_version 184044 (0.0033) [2024-06-13 06:41:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49152.1). Total num frames: 3015458816. Throughput: 0: 48775.6. Samples: 2544227600. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 06:41:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:41:07,613][71000] Updated weights for policy 0, policy_version 184054 (0.0032) [2024-06-13 06:41:10,940][70768] Fps is (10 sec: 47512.6, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3015704576. Throughput: 0: 48895.3. Samples: 2544523460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:41:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:41:10,942][71000] Updated weights for policy 0, policy_version 184064 (0.0032) [2024-06-13 06:41:14,014][71000] Updated weights for policy 0, policy_version 184074 (0.0026) [2024-06-13 06:41:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3015950336. Throughput: 0: 49186.4. Samples: 2544819600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:41:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:41:17,864][71000] Updated weights for policy 0, policy_version 184084 (0.0035) [2024-06-13 06:41:20,529][71000] Updated weights for policy 0, policy_version 184094 (0.0028) [2024-06-13 06:41:20,940][70768] Fps is (10 sec: 49153.2, 60 sec: 48879.0, 300 sec: 49096.4). Total num frames: 3016196096. Throughput: 0: 49274.1. Samples: 2544976060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:41:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:41:24,371][71000] Updated weights for policy 0, policy_version 184104 (0.0034) [2024-06-13 06:41:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 3016441856. Throughput: 0: 49333.2. Samples: 2545274820. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:41:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:41:27,493][71000] Updated weights for policy 0, policy_version 184114 (0.0037) [2024-06-13 06:41:30,927][71000] Updated weights for policy 0, policy_version 184124 (0.0038) [2024-06-13 06:41:30,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3016687616. Throughput: 0: 49201.3. Samples: 2545563820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:41:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:41:34,164][71000] Updated weights for policy 0, policy_version 184134 (0.0035) [2024-06-13 06:41:35,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3016933376. Throughput: 0: 49219.7. Samples: 2545704620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:41:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:41:37,878][71000] Updated weights for policy 0, policy_version 184144 (0.0023) [2024-06-13 06:41:40,623][70980] Signal inference workers to stop experience collection... (38100 times) [2024-06-13 06:41:40,626][70980] Signal inference workers to resume experience collection... (38100 times) [2024-06-13 06:41:40,651][71000] InferenceWorker_p0-w0: stopping experience collection (38100 times) [2024-06-13 06:41:40,651][71000] InferenceWorker_p0-w0: resuming experience collection (38100 times) [2024-06-13 06:41:40,771][71000] Updated weights for policy 0, policy_version 184154 (0.0026) [2024-06-13 06:41:40,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49428.6, 300 sec: 49207.5). Total num frames: 3017195520. Throughput: 0: 49216.4. Samples: 2546006400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:41:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:41:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000184155_3017195520.pth... 
[2024-06-13 06:41:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000183437_3005431808.pth [2024-06-13 06:41:44,155][71000] Updated weights for policy 0, policy_version 184164 (0.0027) [2024-06-13 06:41:45,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 3017424896. Throughput: 0: 49258.8. Samples: 2546301400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:41:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:41:47,190][71000] Updated weights for policy 0, policy_version 184174 (0.0032) [2024-06-13 06:41:50,653][71000] Updated weights for policy 0, policy_version 184184 (0.0031) [2024-06-13 06:41:50,939][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3017670656. Throughput: 0: 49283.6. Samples: 2546445360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:41:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:41:54,026][71000] Updated weights for policy 0, policy_version 184194 (0.0026) [2024-06-13 06:41:55,943][70768] Fps is (10 sec: 49136.1, 60 sec: 48876.3, 300 sec: 49151.5). Total num frames: 3017916416. Throughput: 0: 49308.3. Samples: 2546742480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:41:55,943][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:41:57,279][71000] Updated weights for policy 0, policy_version 184204 (0.0039) [2024-06-13 06:42:00,643][71000] Updated weights for policy 0, policy_version 184214 (0.0029) [2024-06-13 06:42:00,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 3018178560. Throughput: 0: 49425.8. Samples: 2547043760. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:42:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:42:03,706][71000] Updated weights for policy 0, policy_version 184224 (0.0027) [2024-06-13 06:42:05,940][70768] Fps is (10 sec: 50806.6, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3018424320. Throughput: 0: 49365.8. Samples: 2547197520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:42:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:42:07,200][71000] Updated weights for policy 0, policy_version 184234 (0.0028) [2024-06-13 06:42:10,745][71000] Updated weights for policy 0, policy_version 184244 (0.0028) [2024-06-13 06:42:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.2, 300 sec: 49096.5). Total num frames: 3018653696. Throughput: 0: 49183.1. Samples: 2547488060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:42:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:42:14,102][71000] Updated weights for policy 0, policy_version 184254 (0.0027) [2024-06-13 06:42:15,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3018899456. Throughput: 0: 48968.9. Samples: 2547767420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:42:15,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:42:17,302][71000] Updated weights for policy 0, policy_version 184264 (0.0023) [2024-06-13 06:42:20,766][71000] Updated weights for policy 0, policy_version 184274 (0.0040) [2024-06-13 06:42:20,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 3019161600. Throughput: 0: 49349.7. Samples: 2547925360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:42:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:42:24,076][71000] Updated weights for policy 0, policy_version 184284 (0.0029) [2024-06-13 06:42:25,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3019390976. Throughput: 0: 49162.8. Samples: 2548218720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:42:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:42:27,479][71000] Updated weights for policy 0, policy_version 184294 (0.0028) [2024-06-13 06:42:30,674][71000] Updated weights for policy 0, policy_version 184304 (0.0023) [2024-06-13 06:42:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3019636736. Throughput: 0: 49233.7. Samples: 2548516920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:42:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:42:33,820][71000] Updated weights for policy 0, policy_version 184314 (0.0037) [2024-06-13 06:42:35,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3019882496. Throughput: 0: 49185.3. Samples: 2548658700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:42:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:42:37,403][71000] Updated weights for policy 0, policy_version 184324 (0.0025) [2024-06-13 06:42:40,045][71000] Updated weights for policy 0, policy_version 184334 (0.0025) [2024-06-13 06:42:40,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 3020161024. Throughput: 0: 49272.8. Samples: 2548959600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:42:40,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:42:43,049][70980] Signal inference workers to stop experience collection... 
(38150 times) [2024-06-13 06:42:43,077][71000] InferenceWorker_p0-w0: stopping experience collection (38150 times) [2024-06-13 06:42:43,102][70980] Signal inference workers to resume experience collection... (38150 times) [2024-06-13 06:42:43,104][71000] InferenceWorker_p0-w0: resuming experience collection (38150 times) [2024-06-13 06:42:44,063][71000] Updated weights for policy 0, policy_version 184344 (0.0028) [2024-06-13 06:42:45,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.0, 300 sec: 49097.2). Total num frames: 3020390400. Throughput: 0: 49143.1. Samples: 2549255200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:42:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:42:46,835][71000] Updated weights for policy 0, policy_version 184354 (0.0023) [2024-06-13 06:42:50,818][71000] Updated weights for policy 0, policy_version 184364 (0.0029) [2024-06-13 06:42:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3020636160. Throughput: 0: 49027.1. Samples: 2549403740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:42:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:42:53,676][71000] Updated weights for policy 0, policy_version 184374 (0.0020) [2024-06-13 06:42:55,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49154.6, 300 sec: 48985.4). Total num frames: 3020865536. Throughput: 0: 49117.3. Samples: 2549698340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:42:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:42:57,442][71000] Updated weights for policy 0, policy_version 184384 (0.0034) [2024-06-13 06:43:00,396][71000] Updated weights for policy 0, policy_version 184394 (0.0034) [2024-06-13 06:43:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3021127680. Throughput: 0: 49419.3. Samples: 2549991300. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:43:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:43:04,017][71000] Updated weights for policy 0, policy_version 184404 (0.0029) [2024-06-13 06:43:05,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3021373440. Throughput: 0: 49347.7. Samples: 2550146000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:43:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:43:07,175][71000] Updated weights for policy 0, policy_version 184414 (0.0036) [2024-06-13 06:43:10,817][71000] Updated weights for policy 0, policy_version 184424 (0.0028) [2024-06-13 06:43:10,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3021602816. Throughput: 0: 49273.2. Samples: 2550436020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:43:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:43:13,523][71000] Updated weights for policy 0, policy_version 184434 (0.0027) [2024-06-13 06:43:15,939][70768] Fps is (10 sec: 45875.0, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3021832192. Throughput: 0: 49233.5. Samples: 2550732420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-13 06:43:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:43:17,626][71000] Updated weights for policy 0, policy_version 184444 (0.0030) [2024-06-13 06:43:20,225][71000] Updated weights for policy 0, policy_version 184454 (0.0026) [2024-06-13 06:43:20,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3022110720. Throughput: 0: 49197.8. Samples: 2550872600. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 06:43:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:43:24,195][71000] Updated weights for policy 0, policy_version 184464 (0.0028) [2024-06-13 06:43:25,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3022356480. Throughput: 0: 49098.2. Samples: 2551169020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 06:43:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:43:27,078][71000] Updated weights for policy 0, policy_version 184474 (0.0024) [2024-06-13 06:43:30,911][71000] Updated weights for policy 0, policy_version 184484 (0.0025) [2024-06-13 06:43:30,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3022585856. Throughput: 0: 49112.4. Samples: 2551465260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 06:43:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:43:33,488][71000] Updated weights for policy 0, policy_version 184494 (0.0032) [2024-06-13 06:43:35,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3022815232. Throughput: 0: 48776.9. Samples: 2551598700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 06:43:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:43:37,482][71000] Updated weights for policy 0, policy_version 184504 (0.0027) [2024-06-13 06:43:40,357][71000] Updated weights for policy 0, policy_version 184514 (0.0029) [2024-06-13 06:43:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 3023077376. Throughput: 0: 48832.5. Samples: 2551895800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 06:43:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:43:40,993][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000184515_3023093760.pth... 
[2024-06-13 06:43:41,040][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000183796_3011313664.pth [2024-06-13 06:43:44,283][71000] Updated weights for policy 0, policy_version 184524 (0.0029) [2024-06-13 06:43:45,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49151.8, 300 sec: 49152.0). Total num frames: 3023339520. Throughput: 0: 48927.9. Samples: 2552193060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 06:43:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:43:47,016][71000] Updated weights for policy 0, policy_version 184534 (0.0034) [2024-06-13 06:43:50,043][70980] Signal inference workers to stop experience collection... (38200 times) [2024-06-13 06:43:50,092][71000] InferenceWorker_p0-w0: stopping experience collection (38200 times) [2024-06-13 06:43:50,155][70980] Signal inference workers to resume experience collection... (38200 times) [2024-06-13 06:43:50,155][71000] InferenceWorker_p0-w0: resuming experience collection (38200 times) [2024-06-13 06:43:50,737][71000] Updated weights for policy 0, policy_version 184544 (0.0027) [2024-06-13 06:43:50,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3023585280. Throughput: 0: 49099.0. Samples: 2552355460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 06:43:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:43:53,443][71000] Updated weights for policy 0, policy_version 184554 (0.0027) [2024-06-13 06:43:55,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3023798272. Throughput: 0: 49065.7. Samples: 2552643980. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 06:43:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:43:57,402][71000] Updated weights for policy 0, policy_version 184564 (0.0026) [2024-06-13 06:44:00,339][71000] Updated weights for policy 0, policy_version 184574 (0.0030) [2024-06-13 06:44:00,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48879.0, 300 sec: 49152.1). Total num frames: 3024060416. Throughput: 0: 48721.6. Samples: 2552924900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 06:44:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:44:04,198][71000] Updated weights for policy 0, policy_version 184584 (0.0029) [2024-06-13 06:44:05,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49151.8, 300 sec: 49207.5). Total num frames: 3024322560. Throughput: 0: 49162.5. Samples: 2553084920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 06:44:05,941][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:44:06,853][71000] Updated weights for policy 0, policy_version 184594 (0.0031) [2024-06-13 06:44:10,833][71000] Updated weights for policy 0, policy_version 184604 (0.0032) [2024-06-13 06:44:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3024551936. Throughput: 0: 49232.3. Samples: 2553384480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 06:44:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:44:13,549][71000] Updated weights for policy 0, policy_version 184614 (0.0026) [2024-06-13 06:44:15,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 3024781312. Throughput: 0: 49015.9. Samples: 2553670980. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 06:44:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:44:17,575][71000] Updated weights for policy 0, policy_version 184624 (0.0025) [2024-06-13 06:44:20,176][71000] Updated weights for policy 0, policy_version 184634 (0.0027) [2024-06-13 06:44:20,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3025043456. Throughput: 0: 49272.4. Samples: 2553815960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 06:44:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:44:24,226][71000] Updated weights for policy 0, policy_version 184644 (0.0047) [2024-06-13 06:44:25,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 3025305600. Throughput: 0: 49318.2. Samples: 2554115120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:44:25,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 06:44:26,851][71000] Updated weights for policy 0, policy_version 184654 (0.0030) [2024-06-13 06:44:30,873][71000] Updated weights for policy 0, policy_version 184664 (0.0033) [2024-06-13 06:44:30,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3025534976. Throughput: 0: 49294.5. Samples: 2554411300. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:44:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:44:33,619][71000] Updated weights for policy 0, policy_version 184674 (0.0034) [2024-06-13 06:44:35,939][70768] Fps is (10 sec: 44236.9, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3025747968. Throughput: 0: 48575.6. Samples: 2554541360. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:44:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:44:37,387][71000] Updated weights for policy 0, policy_version 184684 (0.0042) [2024-06-13 06:44:40,221][71000] Updated weights for policy 0, policy_version 184694 (0.0028) [2024-06-13 06:44:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3026026496. Throughput: 0: 48682.8. Samples: 2554834700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:44:40,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:44:44,038][70980] Signal inference workers to stop experience collection... (38250 times) [2024-06-13 06:44:44,085][71000] InferenceWorker_p0-w0: stopping experience collection (38250 times) [2024-06-13 06:44:44,146][70980] Signal inference workers to resume experience collection... (38250 times) [2024-06-13 06:44:44,146][71000] InferenceWorker_p0-w0: resuming experience collection (38250 times) [2024-06-13 06:44:44,423][71000] Updated weights for policy 0, policy_version 184704 (0.0023) [2024-06-13 06:44:45,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 3026272256. Throughput: 0: 49090.8. Samples: 2555133980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:44:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:44:47,203][71000] Updated weights for policy 0, policy_version 184714 (0.0034) [2024-06-13 06:44:50,939][70768] Fps is (10 sec: 45875.3, 60 sec: 48332.8, 300 sec: 48929.8). Total num frames: 3026485248. Throughput: 0: 48783.3. Samples: 2555280160. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:44:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:44:51,167][71000] Updated weights for policy 0, policy_version 184724 (0.0027) [2024-06-13 06:44:53,726][71000] Updated weights for policy 0, policy_version 184734 (0.0035) [2024-06-13 06:44:55,940][70768] Fps is (10 sec: 45874.5, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3026731008. Throughput: 0: 48619.5. Samples: 2555572360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:44:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:44:57,505][71000] Updated weights for policy 0, policy_version 184744 (0.0023) [2024-06-13 06:45:00,485][71000] Updated weights for policy 0, policy_version 184754 (0.0029) [2024-06-13 06:45:00,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3027009536. Throughput: 0: 48913.5. Samples: 2555872080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:45:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:45:04,327][71000] Updated weights for policy 0, policy_version 184764 (0.0026) [2024-06-13 06:45:05,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3027255296. Throughput: 0: 48999.9. Samples: 2556020960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:45:05,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 06:45:07,243][71000] Updated weights for policy 0, policy_version 184774 (0.0032) [2024-06-13 06:45:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3027484672. Throughput: 0: 48992.0. Samples: 2556319760. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:45:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:45:11,075][71000] Updated weights for policy 0, policy_version 184784 (0.0027) [2024-06-13 06:45:13,835][71000] Updated weights for policy 0, policy_version 184794 (0.0024) [2024-06-13 06:45:15,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3027730432. Throughput: 0: 49050.0. Samples: 2556618560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:45:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:45:17,494][71000] Updated weights for policy 0, policy_version 184804 (0.0026) [2024-06-13 06:45:20,495][71000] Updated weights for policy 0, policy_version 184814 (0.0033) [2024-06-13 06:45:20,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 3028008960. Throughput: 0: 49339.8. Samples: 2556761660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:45:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:45:24,470][71000] Updated weights for policy 0, policy_version 184824 (0.0027) [2024-06-13 06:45:25,940][70768] Fps is (10 sec: 50791.1, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3028238336. Throughput: 0: 49411.5. Samples: 2557058220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:45:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:45:26,931][71000] Updated weights for policy 0, policy_version 184834 (0.0024) [2024-06-13 06:45:30,940][70768] Fps is (10 sec: 45875.9, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3028467712. Throughput: 0: 49356.0. Samples: 2557355000. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 06:45:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:45:30,999][71000] Updated weights for policy 0, policy_version 184844 (0.0028) [2024-06-13 06:45:33,894][71000] Updated weights for policy 0, policy_version 184854 (0.0026) [2024-06-13 06:45:35,939][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 49041.6). Total num frames: 3028697088. Throughput: 0: 49205.8. Samples: 2557494420. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-13 06:45:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:45:37,698][71000] Updated weights for policy 0, policy_version 184864 (0.0027) [2024-06-13 06:45:40,517][71000] Updated weights for policy 0, policy_version 184874 (0.0033) [2024-06-13 06:45:40,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.0, 300 sec: 49318.6). Total num frames: 3028992000. Throughput: 0: 49399.3. Samples: 2557795320. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-13 06:45:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:45:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000184875_3028992000.pth... [2024-06-13 06:45:40,992][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000184155_3017195520.pth [2024-06-13 06:45:42,337][70980] Signal inference workers to stop experience collection... (38300 times) [2024-06-13 06:45:42,338][70980] Signal inference workers to resume experience collection... (38300 times) [2024-06-13 06:45:42,375][71000] InferenceWorker_p0-w0: stopping experience collection (38300 times) [2024-06-13 06:45:42,375][71000] InferenceWorker_p0-w0: resuming experience collection (38300 times) [2024-06-13 06:45:44,223][71000] Updated weights for policy 0, policy_version 184884 (0.0032) [2024-06-13 06:45:45,940][70768] Fps is (10 sec: 54066.3, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 3029237760. Throughput: 0: 49400.8. Samples: 2558095120. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-13 06:45:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:45:47,074][71000] Updated weights for policy 0, policy_version 184894 (0.0030) [2024-06-13 06:45:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 3029467136. Throughput: 0: 49476.1. Samples: 2558247380. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-13 06:45:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:45:50,943][71000] Updated weights for policy 0, policy_version 184904 (0.0031) [2024-06-13 06:45:53,765][71000] Updated weights for policy 0, policy_version 184914 (0.0030) [2024-06-13 06:45:55,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3029696512. Throughput: 0: 49228.7. Samples: 2558535060. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-13 06:45:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:45:57,547][71000] Updated weights for policy 0, policy_version 184924 (0.0029) [2024-06-13 06:46:00,439][71000] Updated weights for policy 0, policy_version 184934 (0.0034) [2024-06-13 06:46:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3029958656. Throughput: 0: 49124.9. Samples: 2558829180. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-13 06:46:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:46:04,134][71000] Updated weights for policy 0, policy_version 184944 (0.0030) [2024-06-13 06:46:05,939][70768] Fps is (10 sec: 52429.9, 60 sec: 49425.2, 300 sec: 49207.6). Total num frames: 3030220800. Throughput: 0: 49346.9. Samples: 2558982260. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-13 06:46:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:46:07,112][71000] Updated weights for policy 0, policy_version 184954 (0.0032) [2024-06-13 06:46:10,900][71000] Updated weights for policy 0, policy_version 184964 (0.0027) [2024-06-13 06:46:10,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3030450176. Throughput: 0: 49375.2. Samples: 2559280100. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-13 06:46:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:46:14,198][71000] Updated weights for policy 0, policy_version 184974 (0.0028) [2024-06-13 06:46:15,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3030679552. Throughput: 0: 49148.5. Samples: 2559566680. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-13 06:46:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:46:17,666][71000] Updated weights for policy 0, policy_version 184984 (0.0034) [2024-06-13 06:46:20,907][71000] Updated weights for policy 0, policy_version 184994 (0.0027) [2024-06-13 06:46:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 3030941696. Throughput: 0: 49222.1. Samples: 2559709420. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-13 06:46:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:46:24,154][71000] Updated weights for policy 0, policy_version 185004 (0.0027) [2024-06-13 06:46:25,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 3031203840. Throughput: 0: 49199.2. Samples: 2560009280. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-13 06:46:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:46:27,190][71000] Updated weights for policy 0, policy_version 185014 (0.0031) [2024-06-13 06:46:30,810][71000] Updated weights for policy 0, policy_version 185024 (0.0033) [2024-06-13 06:46:30,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3031433216. Throughput: 0: 49365.9. Samples: 2560316580. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-13 06:46:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:46:34,047][71000] Updated weights for policy 0, policy_version 185034 (0.0026) [2024-06-13 06:46:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 3031678976. Throughput: 0: 48906.7. Samples: 2560448180. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-13 06:46:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:46:37,343][71000] Updated weights for policy 0, policy_version 185044 (0.0030) [2024-06-13 06:46:40,725][71000] Updated weights for policy 0, policy_version 185054 (0.0022) [2024-06-13 06:46:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 3031924736. Throughput: 0: 49155.3. Samples: 2560747040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 06:46:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:46:43,965][71000] Updated weights for policy 0, policy_version 185064 (0.0028) [2024-06-13 06:46:45,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 3032170496. Throughput: 0: 49194.8. Samples: 2561042940. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 06:46:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:46:47,150][71000] Updated weights for policy 0, policy_version 185074 (0.0032) [2024-06-13 06:46:50,883][71000] Updated weights for policy 0, policy_version 185084 (0.0031) [2024-06-13 06:46:50,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49152.5). Total num frames: 3032416256. Throughput: 0: 49099.1. Samples: 2561191720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 06:46:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:46:54,326][71000] Updated weights for policy 0, policy_version 185094 (0.0034) [2024-06-13 06:46:55,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.2, 300 sec: 49040.9). Total num frames: 3032645632. Throughput: 0: 48744.5. Samples: 2561473600. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 06:46:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:46:57,805][71000] Updated weights for policy 0, policy_version 185104 (0.0025) [2024-06-13 06:47:00,939][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3032891392. Throughput: 0: 48930.3. Samples: 2561768540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 06:47:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:47:00,955][71000] Updated weights for policy 0, policy_version 185114 (0.0028) [2024-06-13 06:47:04,185][71000] Updated weights for policy 0, policy_version 185124 (0.0036) [2024-06-13 06:47:05,940][70768] Fps is (10 sec: 50789.4, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 3033153536. Throughput: 0: 49161.2. Samples: 2561921680. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 06:47:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:47:07,867][71000] Updated weights for policy 0, policy_version 185134 (0.0033) [2024-06-13 06:47:10,841][71000] Updated weights for policy 0, policy_version 185144 (0.0026) [2024-06-13 06:47:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3033399296. Throughput: 0: 49001.8. Samples: 2562214360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 06:47:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:47:14,228][70980] Signal inference workers to stop experience collection... (38350 times) [2024-06-13 06:47:14,278][71000] InferenceWorker_p0-w0: stopping experience collection (38350 times) [2024-06-13 06:47:14,285][70980] Signal inference workers to resume experience collection... (38350 times) [2024-06-13 06:47:14,295][71000] InferenceWorker_p0-w0: resuming experience collection (38350 times) [2024-06-13 06:47:14,418][71000] Updated weights for policy 0, policy_version 185154 (0.0035) [2024-06-13 06:47:15,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 3033645056. Throughput: 0: 48664.0. Samples: 2562506460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 06:47:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:47:17,757][71000] Updated weights for policy 0, policy_version 185164 (0.0022) [2024-06-13 06:47:20,856][71000] Updated weights for policy 0, policy_version 185174 (0.0022) [2024-06-13 06:47:20,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3033890816. Throughput: 0: 49003.3. Samples: 2562653340. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 06:47:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:47:24,215][71000] Updated weights for policy 0, policy_version 185184 (0.0029) [2024-06-13 06:47:25,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 3034120192. Throughput: 0: 48903.2. Samples: 2562947680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 06:47:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:47:27,335][71000] Updated weights for policy 0, policy_version 185194 (0.0024) [2024-06-13 06:47:30,925][71000] Updated weights for policy 0, policy_version 185204 (0.0028) [2024-06-13 06:47:30,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3034382336. Throughput: 0: 48903.0. Samples: 2563243580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 06:47:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:47:34,150][71000] Updated weights for policy 0, policy_version 185214 (0.0029) [2024-06-13 06:47:35,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3034628096. Throughput: 0: 48887.4. Samples: 2563391660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 06:47:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:47:37,552][71000] Updated weights for policy 0, policy_version 185224 (0.0036) [2024-06-13 06:47:40,499][71000] Updated weights for policy 0, policy_version 185234 (0.0029) [2024-06-13 06:47:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3034873856. Throughput: 0: 49327.5. Samples: 2563693340. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 06:47:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:47:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000185234_3034873856.pth... 
[2024-06-13 06:47:41,014][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000184515_3023093760.pth [2024-06-13 06:47:44,066][71000] Updated weights for policy 0, policy_version 185244 (0.0024) [2024-06-13 06:47:45,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3035136000. Throughput: 0: 49393.3. Samples: 2563991240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:47:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:47:47,412][71000] Updated weights for policy 0, policy_version 185254 (0.0025) [2024-06-13 06:47:50,935][71000] Updated weights for policy 0, policy_version 185264 (0.0025) [2024-06-13 06:47:50,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3035365376. Throughput: 0: 49228.2. Samples: 2564136940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:47:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:47:54,001][71000] Updated weights for policy 0, policy_version 185274 (0.0037) [2024-06-13 06:47:55,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 49041.0). Total num frames: 3035594752. Throughput: 0: 49047.6. Samples: 2564421500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:47:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:47:57,747][71000] Updated weights for policy 0, policy_version 185284 (0.0038) [2024-06-13 06:48:00,639][71000] Updated weights for policy 0, policy_version 185294 (0.0023) [2024-06-13 06:48:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49096.4). Total num frames: 3035856896. Throughput: 0: 49010.6. Samples: 2564711940. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:48:00,948][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:48:04,230][71000] Updated weights for policy 0, policy_version 185304 (0.0023) [2024-06-13 06:48:05,944][70768] Fps is (10 sec: 49130.7, 60 sec: 48875.5, 300 sec: 49095.7). Total num frames: 3036086272. Throughput: 0: 49119.5. Samples: 2564863920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:48:05,944][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:48:07,546][71000] Updated weights for policy 0, policy_version 185314 (0.0024) [2024-06-13 06:48:10,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 3036332032. Throughput: 0: 49254.8. Samples: 2565164160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:48:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:48:11,186][71000] Updated weights for policy 0, policy_version 185324 (0.0022) [2024-06-13 06:48:14,185][71000] Updated weights for policy 0, policy_version 185334 (0.0028) [2024-06-13 06:48:15,940][70768] Fps is (10 sec: 47534.0, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3036561408. Throughput: 0: 49190.7. Samples: 2565457160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:48:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:48:17,751][71000] Updated weights for policy 0, policy_version 185344 (0.0030) [2024-06-13 06:48:20,825][71000] Updated weights for policy 0, policy_version 185354 (0.0024) [2024-06-13 06:48:20,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.1, 300 sec: 49096.4). Total num frames: 3036839936. Throughput: 0: 49024.4. Samples: 2565597760. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:48:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:48:24,611][71000] Updated weights for policy 0, policy_version 185364 (0.0032) [2024-06-13 06:48:25,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3037052928. Throughput: 0: 48740.9. Samples: 2565886680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:48:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:48:27,530][71000] Updated weights for policy 0, policy_version 185374 (0.0034) [2024-06-13 06:48:29,540][70980] Signal inference workers to stop experience collection... (38400 times) [2024-06-13 06:48:29,592][70980] Signal inference workers to resume experience collection... (38400 times) [2024-06-13 06:48:29,593][71000] InferenceWorker_p0-w0: stopping experience collection (38400 times) [2024-06-13 06:48:29,607][71000] InferenceWorker_p0-w0: resuming experience collection (38400 times) [2024-06-13 06:48:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 3037315072. Throughput: 0: 48601.2. Samples: 2566178300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:48:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:48:31,206][71000] Updated weights for policy 0, policy_version 185384 (0.0021) [2024-06-13 06:48:34,235][71000] Updated weights for policy 0, policy_version 185394 (0.0028) [2024-06-13 06:48:35,939][70768] Fps is (10 sec: 50790.4, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3037560832. Throughput: 0: 48997.3. Samples: 2566341820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:48:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:48:37,979][71000] Updated weights for policy 0, policy_version 185404 (0.0026) [2024-06-13 06:48:40,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48878.9, 300 sec: 49041.0). Total num frames: 3037806592. Throughput: 0: 49146.6. 
Samples: 2566633100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:48:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:48:41,033][71000] Updated weights for policy 0, policy_version 185414 (0.0035) [2024-06-13 06:48:44,663][71000] Updated weights for policy 0, policy_version 185424 (0.0024) [2024-06-13 06:48:45,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48332.8, 300 sec: 48985.4). Total num frames: 3038035968. Throughput: 0: 49214.2. Samples: 2566926580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 06:48:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:48:47,533][71000] Updated weights for policy 0, policy_version 185434 (0.0029) [2024-06-13 06:48:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3038298112. Throughput: 0: 49101.6. Samples: 2567073280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:48:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:48:51,309][71000] Updated weights for policy 0, policy_version 185444 (0.0022) [2024-06-13 06:48:53,965][71000] Updated weights for policy 0, policy_version 185454 (0.0025) [2024-06-13 06:48:55,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3038543872. Throughput: 0: 48924.5. Samples: 2567365760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:48:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:48:57,769][71000] Updated weights for policy 0, policy_version 185464 (0.0024) [2024-06-13 06:49:00,737][71000] Updated weights for policy 0, policy_version 185474 (0.0027) [2024-06-13 06:49:00,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3038806016. Throughput: 0: 49138.5. Samples: 2567668400. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:49:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:49:04,154][71000] Updated weights for policy 0, policy_version 185484 (0.0029) [2024-06-13 06:49:05,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48882.3, 300 sec: 49040.9). Total num frames: 3039019008. Throughput: 0: 49205.7. Samples: 2567812020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:49:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:49:07,821][71000] Updated weights for policy 0, policy_version 185494 (0.0032) [2024-06-13 06:49:10,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3039281152. Throughput: 0: 49249.2. Samples: 2568102900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:49:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 06:49:11,154][71000] Updated weights for policy 0, policy_version 185504 (0.0025) [2024-06-13 06:49:14,344][71000] Updated weights for policy 0, policy_version 185514 (0.0032) [2024-06-13 06:49:15,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 3039543296. Throughput: 0: 49234.4. Samples: 2568393840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:49:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:49:17,557][71000] Updated weights for policy 0, policy_version 185524 (0.0027) [2024-06-13 06:49:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3039772672. Throughput: 0: 49085.6. Samples: 2568550680. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:49:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:49:20,995][71000] Updated weights for policy 0, policy_version 185534 (0.0034) [2024-06-13 06:49:24,155][71000] Updated weights for policy 0, policy_version 185544 (0.0026) [2024-06-13 06:49:25,940][70768] Fps is (10 sec: 45874.8, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3040002048. Throughput: 0: 48964.4. Samples: 2568836500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:49:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:49:27,783][71000] Updated weights for policy 0, policy_version 185554 (0.0044) [2024-06-13 06:49:30,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49207.5). Total num frames: 3040264192. Throughput: 0: 49086.2. Samples: 2569135460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:49:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:49:31,065][71000] Updated weights for policy 0, policy_version 185564 (0.0036) [2024-06-13 06:49:34,242][70980] Signal inference workers to stop experience collection... (38450 times) [2024-06-13 06:49:34,286][71000] InferenceWorker_p0-w0: stopping experience collection (38450 times) [2024-06-13 06:49:34,301][70980] Signal inference workers to resume experience collection... (38450 times) [2024-06-13 06:49:34,303][71000] InferenceWorker_p0-w0: resuming experience collection (38450 times) [2024-06-13 06:49:34,467][71000] Updated weights for policy 0, policy_version 185574 (0.0032) [2024-06-13 06:49:35,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3040526336. Throughput: 0: 49122.2. Samples: 2569283780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:49:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:49:37,727][71000] Updated weights for policy 0, policy_version 185584 (0.0030) [2024-06-13 06:49:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3040755712. Throughput: 0: 49236.5. Samples: 2569581400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:49:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:49:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000185593_3040755712.pth... [2024-06-13 06:49:40,996][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000184875_3028992000.pth [2024-06-13 06:49:41,207][71000] Updated weights for policy 0, policy_version 185594 (0.0032) [2024-06-13 06:49:44,308][71000] Updated weights for policy 0, policy_version 185604 (0.0028) [2024-06-13 06:49:45,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3040985088. Throughput: 0: 49132.2. Samples: 2569879340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:49:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:49:47,608][71000] Updated weights for policy 0, policy_version 185614 (0.0036) [2024-06-13 06:49:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3041230848. Throughput: 0: 49076.6. Samples: 2570020460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 06:49:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:49:51,206][71000] Updated weights for policy 0, policy_version 185624 (0.0042) [2024-06-13 06:49:54,424][71000] Updated weights for policy 0, policy_version 185634 (0.0030) [2024-06-13 06:49:55,939][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 3041509376. Throughput: 0: 49206.0. Samples: 2570317160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 06:49:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:49:57,631][71000] Updated weights for policy 0, policy_version 185644 (0.0034) [2024-06-13 06:50:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48606.0, 300 sec: 49041.0). Total num frames: 3041722368. Throughput: 0: 49291.6. Samples: 2570611960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 06:50:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:50:01,204][71000] Updated weights for policy 0, policy_version 185654 (0.0029) [2024-06-13 06:50:04,093][71000] Updated weights for policy 0, policy_version 185664 (0.0023) [2024-06-13 06:50:05,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3041968128. Throughput: 0: 48998.4. Samples: 2570755600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 06:50:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:50:07,820][71000] Updated weights for policy 0, policy_version 185674 (0.0036) [2024-06-13 06:50:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 3042213888. Throughput: 0: 49051.7. Samples: 2571043820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 06:50:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:50:11,148][71000] Updated weights for policy 0, policy_version 185684 (0.0026) [2024-06-13 06:50:14,699][71000] Updated weights for policy 0, policy_version 185694 (0.0028) [2024-06-13 06:50:15,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3042492416. Throughput: 0: 49057.6. Samples: 2571343060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 06:50:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:50:17,785][71000] Updated weights for policy 0, policy_version 185704 (0.0031) [2024-06-13 06:50:20,939][70768] Fps is (10 sec: 45875.2, 60 sec: 48332.9, 300 sec: 48929.9). Total num frames: 3042672640. Throughput: 0: 48832.5. Samples: 2571481240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 06:50:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:50:21,639][71000] Updated weights for policy 0, policy_version 185714 (0.0037) [2024-06-13 06:50:24,675][71000] Updated weights for policy 0, policy_version 185724 (0.0027) [2024-06-13 06:50:25,939][70768] Fps is (10 sec: 44238.0, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 3042934784. Throughput: 0: 48627.7. Samples: 2571769640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 06:50:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:50:28,244][71000] Updated weights for policy 0, policy_version 185734 (0.0027) [2024-06-13 06:50:30,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48605.8, 300 sec: 49096.4). Total num frames: 3043180544. Throughput: 0: 48435.9. Samples: 2572058960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 06:50:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:50:31,807][71000] Updated weights for policy 0, policy_version 185744 (0.0033) [2024-06-13 06:50:35,000][71000] Updated weights for policy 0, policy_version 185754 (0.0019) [2024-06-13 06:50:35,940][70768] Fps is (10 sec: 52427.7, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 3043459072. Throughput: 0: 48615.5. Samples: 2572208160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 06:50:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:50:38,223][71000] Updated weights for policy 0, policy_version 185764 (0.0027) [2024-06-13 06:50:40,863][70980] Signal inference workers to stop experience collection... (38500 times) [2024-06-13 06:50:40,863][70980] Signal inference workers to resume experience collection... (38500 times) [2024-06-13 06:50:40,883][71000] InferenceWorker_p0-w0: stopping experience collection (38500 times) [2024-06-13 06:50:40,884][71000] InferenceWorker_p0-w0: resuming experience collection (38500 times) [2024-06-13 06:50:40,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 3043672064. Throughput: 0: 48574.2. Samples: 2572503000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 06:50:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 06:50:41,772][71000] Updated weights for policy 0, policy_version 185774 (0.0033) [2024-06-13 06:50:44,809][71000] Updated weights for policy 0, policy_version 185784 (0.0020) [2024-06-13 06:50:45,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3043917824. Throughput: 0: 48652.4. Samples: 2572801320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 06:50:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:50:48,393][71000] Updated weights for policy 0, policy_version 185794 (0.0035) [2024-06-13 06:50:50,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3044179968. Throughput: 0: 48840.3. Samples: 2572953420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 06:50:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:50:51,782][71000] Updated weights for policy 0, policy_version 185804 (0.0024) [2024-06-13 06:50:55,125][71000] Updated weights for policy 0, policy_version 185814 (0.0030) [2024-06-13 06:50:55,939][70768] Fps is (10 sec: 50790.7, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 3044425728. Throughput: 0: 48821.3. Samples: 2573240780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 06:50:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:50:58,237][71000] Updated weights for policy 0, policy_version 185824 (0.0022) [2024-06-13 06:51:00,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3044638720. Throughput: 0: 48611.6. Samples: 2573530580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 06:51:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:51:02,009][71000] Updated weights for policy 0, policy_version 185834 (0.0028) [2024-06-13 06:51:04,896][71000] Updated weights for policy 0, policy_version 185844 (0.0039) [2024-06-13 06:51:05,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3044900864. Throughput: 0: 48630.6. Samples: 2573669620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:51:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:51:08,568][71000] Updated weights for policy 0, policy_version 185854 (0.0023) [2024-06-13 06:51:10,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3045163008. Throughput: 0: 48870.6. Samples: 2573968820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:51:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:51:11,698][71000] Updated weights for policy 0, policy_version 185864 (0.0038) [2024-06-13 06:51:15,246][71000] Updated weights for policy 0, policy_version 185874 (0.0033) [2024-06-13 06:51:15,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48606.0, 300 sec: 49040.9). Total num frames: 3045408768. Throughput: 0: 49049.4. Samples: 2574266180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:51:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:51:18,083][71000] Updated weights for policy 0, policy_version 185884 (0.0031) [2024-06-13 06:51:20,940][70768] Fps is (10 sec: 47512.4, 60 sec: 49424.8, 300 sec: 48929.8). Total num frames: 3045638144. Throughput: 0: 48833.2. Samples: 2574405660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:51:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:51:21,735][71000] Updated weights for policy 0, policy_version 185894 (0.0019) [2024-06-13 06:51:25,115][71000] Updated weights for policy 0, policy_version 185904 (0.0022) [2024-06-13 06:51:25,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 3045883904. Throughput: 0: 49137.2. Samples: 2574714180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:51:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:51:28,439][71000] Updated weights for policy 0, policy_version 185914 (0.0029) [2024-06-13 06:51:30,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.1, 300 sec: 49096.4). Total num frames: 3046162432. Throughput: 0: 48997.7. Samples: 2575006220. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:51:30,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 06:51:31,552][71000] Updated weights for policy 0, policy_version 185924 (0.0026) [2024-06-13 06:51:35,107][71000] Updated weights for policy 0, policy_version 185934 (0.0028) [2024-06-13 06:51:35,939][70768] Fps is (10 sec: 49152.7, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 3046375424. Throughput: 0: 48913.1. Samples: 2575154500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:51:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:51:37,884][71000] Updated weights for policy 0, policy_version 185944 (0.0030) [2024-06-13 06:51:40,939][70768] Fps is (10 sec: 47514.4, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3046637568. Throughput: 0: 49346.7. Samples: 2575461380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:51:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:51:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000185952_3046637568.pth... [2024-06-13 06:51:40,990][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000185234_3034873856.pth [2024-06-13 06:51:41,476][71000] Updated weights for policy 0, policy_version 185954 (0.0026) [2024-06-13 06:51:44,392][71000] Updated weights for policy 0, policy_version 185964 (0.0026) [2024-06-13 06:51:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3046866944. Throughput: 0: 49497.0. Samples: 2575757940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:51:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:51:47,737][71000] Updated weights for policy 0, policy_version 185974 (0.0032) [2024-06-13 06:51:50,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3047145472. Throughput: 0: 49640.8. Samples: 2575903460. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:51:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:51:51,210][71000] Updated weights for policy 0, policy_version 185984 (0.0027) [2024-06-13 06:51:52,760][70980] Signal inference workers to stop experience collection... (38550 times) [2024-06-13 06:51:52,807][71000] InferenceWorker_p0-w0: stopping experience collection (38550 times) [2024-06-13 06:51:52,814][70980] Signal inference workers to resume experience collection... (38550 times) [2024-06-13 06:51:52,815][71000] InferenceWorker_p0-w0: resuming experience collection (38550 times) [2024-06-13 06:51:54,674][71000] Updated weights for policy 0, policy_version 185994 (0.0022) [2024-06-13 06:51:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 3047358464. Throughput: 0: 49415.0. Samples: 2576192500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:51:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 06:51:57,911][71000] Updated weights for policy 0, policy_version 186004 (0.0029) [2024-06-13 06:52:00,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49698.2, 300 sec: 49040.9). Total num frames: 3047620608. Throughput: 0: 49348.4. Samples: 2576486860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:52:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:52:01,485][71000] Updated weights for policy 0, policy_version 186014 (0.0028) [2024-06-13 06:52:04,782][71000] Updated weights for policy 0, policy_version 186024 (0.0027) [2024-06-13 06:52:05,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3047866368. Throughput: 0: 49545.8. Samples: 2576635220. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 06:52:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:52:07,770][71000] Updated weights for policy 0, policy_version 186034 (0.0023) [2024-06-13 06:52:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3048112128. Throughput: 0: 49402.6. Samples: 2576937300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:52:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:52:11,251][71000] Updated weights for policy 0, policy_version 186044 (0.0030) [2024-06-13 06:52:14,956][71000] Updated weights for policy 0, policy_version 186054 (0.0023) [2024-06-13 06:52:15,940][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 49041.0). Total num frames: 3048357888. Throughput: 0: 49447.3. Samples: 2577231340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:52:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:52:18,086][71000] Updated weights for policy 0, policy_version 186064 (0.0030) [2024-06-13 06:52:20,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49698.3, 300 sec: 49152.0). Total num frames: 3048620032. Throughput: 0: 49380.0. Samples: 2577376600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:52:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:52:21,313][71000] Updated weights for policy 0, policy_version 186074 (0.0029) [2024-06-13 06:52:24,650][71000] Updated weights for policy 0, policy_version 186084 (0.0021) [2024-06-13 06:52:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3048849408. Throughput: 0: 49240.8. Samples: 2577677220. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:52:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:52:27,781][71000] Updated weights for policy 0, policy_version 186094 (0.0022) [2024-06-13 06:52:30,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3049111552. Throughput: 0: 49296.8. Samples: 2577976300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:52:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:52:31,102][71000] Updated weights for policy 0, policy_version 186104 (0.0029) [2024-06-13 06:52:34,492][71000] Updated weights for policy 0, policy_version 186114 (0.0026) [2024-06-13 06:52:35,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3049340928. Throughput: 0: 49324.6. Samples: 2578123060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:52:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:52:37,780][71000] Updated weights for policy 0, policy_version 186124 (0.0033) [2024-06-13 06:52:40,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3049603072. Throughput: 0: 49345.9. Samples: 2578413060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:52:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:52:40,943][71000] Updated weights for policy 0, policy_version 186134 (0.0031) [2024-06-13 06:52:44,838][71000] Updated weights for policy 0, policy_version 186144 (0.0026) [2024-06-13 06:52:45,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3049832448. Throughput: 0: 49271.0. Samples: 2578704060. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:52:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:52:47,971][71000] Updated weights for policy 0, policy_version 186154 (0.0032) [2024-06-13 06:52:50,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48879.0, 300 sec: 49096.4). Total num frames: 3050078208. Throughput: 0: 49259.2. Samples: 2578851880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:52:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:52:51,355][71000] Updated weights for policy 0, policy_version 186164 (0.0032) [2024-06-13 06:52:54,486][71000] Updated weights for policy 0, policy_version 186174 (0.0031) [2024-06-13 06:52:55,939][70768] Fps is (10 sec: 45875.9, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3050291200. Throughput: 0: 49002.4. Samples: 2579142400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:52:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:52:58,067][71000] Updated weights for policy 0, policy_version 186184 (0.0030) [2024-06-13 06:53:00,645][70980] Signal inference workers to stop experience collection... (38600 times) [2024-06-13 06:53:00,646][70980] Signal inference workers to resume experience collection... (38600 times) [2024-06-13 06:53:00,656][71000] InferenceWorker_p0-w0: stopping experience collection (38600 times) [2024-06-13 06:53:00,672][71000] InferenceWorker_p0-w0: resuming experience collection (38600 times) [2024-06-13 06:53:00,939][70768] Fps is (10 sec: 50791.1, 60 sec: 49425.1, 300 sec: 49152.7). Total num frames: 3050586112. Throughput: 0: 49127.2. Samples: 2579442060. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:53:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:53:01,076][71000] Updated weights for policy 0, policy_version 186194 (0.0031) [2024-06-13 06:53:04,880][71000] Updated weights for policy 0, policy_version 186204 (0.0032) [2024-06-13 06:53:05,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3050815488. Throughput: 0: 49044.4. Samples: 2579583600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:53:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:53:08,131][71000] Updated weights for policy 0, policy_version 186214 (0.0023) [2024-06-13 06:53:10,939][70768] Fps is (10 sec: 45875.2, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 3051044864. Throughput: 0: 49151.2. Samples: 2579889020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:53:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:53:11,452][71000] Updated weights for policy 0, policy_version 186224 (0.0021) [2024-06-13 06:53:14,663][71000] Updated weights for policy 0, policy_version 186234 (0.0026) [2024-06-13 06:53:15,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3051307008. Throughput: 0: 48873.0. Samples: 2580175580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:53:15,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 06:53:18,016][71000] Updated weights for policy 0, policy_version 186244 (0.0038) [2024-06-13 06:53:20,943][70768] Fps is (10 sec: 50770.1, 60 sec: 48875.7, 300 sec: 49151.3). Total num frames: 3051552768. Throughput: 0: 48939.7. Samples: 2580325540. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:53:20,944][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:53:21,182][71000] Updated weights for policy 0, policy_version 186254 (0.0030) [2024-06-13 06:53:24,624][71000] Updated weights for policy 0, policy_version 186264 (0.0031) [2024-06-13 06:53:25,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3051798528. Throughput: 0: 49152.7. Samples: 2580624940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:53:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:53:27,970][71000] Updated weights for policy 0, policy_version 186274 (0.0024) [2024-06-13 06:53:30,940][70768] Fps is (10 sec: 47532.1, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 3052027904. Throughput: 0: 49223.2. Samples: 2580919100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:53:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:53:31,481][71000] Updated weights for policy 0, policy_version 186284 (0.0030) [2024-06-13 06:53:35,206][71000] Updated weights for policy 0, policy_version 186294 (0.0032) [2024-06-13 06:53:35,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3052273664. Throughput: 0: 48976.5. Samples: 2581055820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:53:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:53:38,365][71000] Updated weights for policy 0, policy_version 186304 (0.0023) [2024-06-13 06:53:40,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.8, 300 sec: 49096.5). Total num frames: 3052519424. Throughput: 0: 48797.3. Samples: 2581338280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:53:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:53:41,072][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000186312_3052535808.pth... 
[2024-06-13 06:53:41,106][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000185593_3040755712.pth [2024-06-13 06:53:41,651][71000] Updated weights for policy 0, policy_version 186314 (0.0026) [2024-06-13 06:53:45,048][71000] Updated weights for policy 0, policy_version 186324 (0.0029) [2024-06-13 06:53:45,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3052781568. Throughput: 0: 48896.4. Samples: 2581642400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:53:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:53:48,264][71000] Updated weights for policy 0, policy_version 186334 (0.0031) [2024-06-13 06:53:50,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3052994560. Throughput: 0: 48933.0. Samples: 2581785580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:53:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:53:51,779][71000] Updated weights for policy 0, policy_version 186344 (0.0032) [2024-06-13 06:53:55,482][71000] Updated weights for policy 0, policy_version 186354 (0.0036) [2024-06-13 06:53:55,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49151.9, 300 sec: 48929.9). Total num frames: 3053240320. Throughput: 0: 48501.2. Samples: 2582071580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:53:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:53:58,715][71000] Updated weights for policy 0, policy_version 186364 (0.0024) [2024-06-13 06:54:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48332.7, 300 sec: 49040.9). Total num frames: 3053486080. Throughput: 0: 48549.7. Samples: 2582360320. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:54:00,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:54:01,971][71000] Updated weights for policy 0, policy_version 186374 (0.0029) [2024-06-13 06:54:03,287][70980] Signal inference workers to stop experience collection... (38650 times) [2024-06-13 06:54:03,287][70980] Signal inference workers to resume experience collection... (38650 times) [2024-06-13 06:54:03,298][71000] InferenceWorker_p0-w0: stopping experience collection (38650 times) [2024-06-13 06:54:03,298][71000] InferenceWorker_p0-w0: resuming experience collection (38650 times) [2024-06-13 06:54:05,264][71000] Updated weights for policy 0, policy_version 186384 (0.0024) [2024-06-13 06:54:05,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3053731840. Throughput: 0: 48616.6. Samples: 2582513100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:54:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:54:08,475][71000] Updated weights for policy 0, policy_version 186394 (0.0027) [2024-06-13 06:54:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.7, 300 sec: 48929.8). Total num frames: 3053977600. Throughput: 0: 48535.0. Samples: 2582809020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:54:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:54:12,172][71000] Updated weights for policy 0, policy_version 186404 (0.0027) [2024-06-13 06:54:15,336][71000] Updated weights for policy 0, policy_version 186414 (0.0044) [2024-06-13 06:54:15,939][70768] Fps is (10 sec: 49152.6, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3054223360. Throughput: 0: 48652.6. Samples: 2583108460. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:54:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:54:18,706][71000] Updated weights for policy 0, policy_version 186424 (0.0029) [2024-06-13 06:54:20,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48882.0, 300 sec: 49096.5). Total num frames: 3054485504. Throughput: 0: 48799.4. Samples: 2583251800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:54:20,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 06:54:21,914][71000] Updated weights for policy 0, policy_version 186434 (0.0026) [2024-06-13 06:54:25,391][71000] Updated weights for policy 0, policy_version 186444 (0.0021) [2024-06-13 06:54:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 3054714880. Throughput: 0: 49097.7. Samples: 2583547680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:54:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:54:28,603][71000] Updated weights for policy 0, policy_version 186454 (0.0028) [2024-06-13 06:54:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3054960640. Throughput: 0: 48728.4. Samples: 2583835180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:54:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 06:54:32,088][71000] Updated weights for policy 0, policy_version 186464 (0.0025) [2024-06-13 06:54:35,221][71000] Updated weights for policy 0, policy_version 186474 (0.0025) [2024-06-13 06:54:35,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 3055190016. Throughput: 0: 48801.8. Samples: 2583981660. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:54:35,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:54:38,817][71000] Updated weights for policy 0, policy_version 186484 (0.0028) [2024-06-13 06:54:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 3055452160. Throughput: 0: 49070.5. Samples: 2584279760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:54:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:54:42,062][71000] Updated weights for policy 0, policy_version 186494 (0.0033) [2024-06-13 06:54:45,462][71000] Updated weights for policy 0, policy_version 186504 (0.0030) [2024-06-13 06:54:45,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 3055697920. Throughput: 0: 49321.9. Samples: 2584579800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:54:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:54:48,608][71000] Updated weights for policy 0, policy_version 186514 (0.0028) [2024-06-13 06:54:50,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3055927296. Throughput: 0: 49047.6. Samples: 2584720240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:54:50,948][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:54:52,151][71000] Updated weights for policy 0, policy_version 186524 (0.0043) [2024-06-13 06:54:55,276][71000] Updated weights for policy 0, policy_version 186534 (0.0031) [2024-06-13 06:54:55,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 3056173056. Throughput: 0: 48957.4. Samples: 2585012100. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:54:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:54:58,679][71000] Updated weights for policy 0, policy_version 186544 (0.0027) [2024-06-13 06:55:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3056435200. Throughput: 0: 48767.4. Samples: 2585303000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:55:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:55:01,869][71000] Updated weights for policy 0, policy_version 186554 (0.0024) [2024-06-13 06:55:05,790][71000] Updated weights for policy 0, policy_version 186564 (0.0025) [2024-06-13 06:55:05,939][70768] Fps is (10 sec: 49153.4, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 3056664576. Throughput: 0: 48827.4. Samples: 2585449020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:55:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:55:08,754][71000] Updated weights for policy 0, policy_version 186574 (0.0020) [2024-06-13 06:55:10,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 3056910336. Throughput: 0: 48851.1. Samples: 2585745980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:55:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:55:12,294][71000] Updated weights for policy 0, policy_version 186584 (0.0024) [2024-06-13 06:55:14,476][70980] Signal inference workers to stop experience collection... (38700 times) [2024-06-13 06:55:14,478][70980] Signal inference workers to resume experience collection... 
(38700 times) [2024-06-13 06:55:14,498][71000] InferenceWorker_p0-w0: stopping experience collection (38700 times) [2024-06-13 06:55:14,498][71000] InferenceWorker_p0-w0: resuming experience collection (38700 times) [2024-06-13 06:55:15,686][71000] Updated weights for policy 0, policy_version 186594 (0.0031) [2024-06-13 06:55:15,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3057172480. Throughput: 0: 49024.1. Samples: 2586041260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:55:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:55:18,823][71000] Updated weights for policy 0, policy_version 186604 (0.0034) [2024-06-13 06:55:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48879.0, 300 sec: 49096.4). Total num frames: 3057418240. Throughput: 0: 49202.2. Samples: 2586195760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:55:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:55:22,092][71000] Updated weights for policy 0, policy_version 186614 (0.0031) [2024-06-13 06:55:25,448][71000] Updated weights for policy 0, policy_version 186624 (0.0036) [2024-06-13 06:55:25,944][70768] Fps is (10 sec: 47493.2, 60 sec: 48875.4, 300 sec: 49040.2). Total num frames: 3057647616. Throughput: 0: 48923.5. Samples: 2586481520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 06:55:25,944][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:55:28,953][71000] Updated weights for policy 0, policy_version 186634 (0.0033) [2024-06-13 06:55:30,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3057893376. Throughput: 0: 48668.0. Samples: 2586769860. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 06:55:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:55:32,339][71000] Updated weights for policy 0, policy_version 186644 (0.0029) [2024-06-13 06:55:35,544][71000] Updated weights for policy 0, policy_version 186654 (0.0027) [2024-06-13 06:55:35,940][70768] Fps is (10 sec: 49171.9, 60 sec: 49151.8, 300 sec: 49040.9). Total num frames: 3058139136. Throughput: 0: 48840.7. Samples: 2586918080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 06:55:35,941][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:55:39,003][71000] Updated weights for policy 0, policy_version 186664 (0.0036) [2024-06-13 06:55:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3058384896. Throughput: 0: 48905.4. Samples: 2587212840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 06:55:40,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:55:41,014][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000186670_3058401280.pth... [2024-06-13 06:55:41,064][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000185952_3046637568.pth [2024-06-13 06:55:42,383][71000] Updated weights for policy 0, policy_version 186674 (0.0020) [2024-06-13 06:55:45,939][70768] Fps is (10 sec: 47515.1, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 3058614272. Throughput: 0: 48777.1. Samples: 2587497960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 06:55:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:55:45,984][71000] Updated weights for policy 0, policy_version 186684 (0.0026) [2024-06-13 06:55:49,295][71000] Updated weights for policy 0, policy_version 186694 (0.0031) [2024-06-13 06:55:50,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49151.8, 300 sec: 48985.3). Total num frames: 3058876416. Throughput: 0: 48825.4. Samples: 2587646180. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 06:55:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:55:52,779][71000] Updated weights for policy 0, policy_version 186704 (0.0033) [2024-06-13 06:55:55,882][71000] Updated weights for policy 0, policy_version 186714 (0.0022) [2024-06-13 06:55:55,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3059122176. Throughput: 0: 48614.7. Samples: 2587933640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 06:55:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:55:59,479][71000] Updated weights for policy 0, policy_version 186724 (0.0027) [2024-06-13 06:56:00,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 3059351552. Throughput: 0: 48567.4. Samples: 2588226800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 06:56:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:56:02,463][71000] Updated weights for policy 0, policy_version 186734 (0.0026) [2024-06-13 06:56:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3059597312. Throughput: 0: 48447.6. Samples: 2588375900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 06:56:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:56:06,206][71000] Updated weights for policy 0, policy_version 186744 (0.0027) [2024-06-13 06:56:09,174][71000] Updated weights for policy 0, policy_version 186754 (0.0034) [2024-06-13 06:56:10,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3059826688. Throughput: 0: 48482.0. Samples: 2588663000. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 06:56:10,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 06:56:12,706][71000] Updated weights for policy 0, policy_version 186764 (0.0030) [2024-06-13 06:56:15,879][71000] Updated weights for policy 0, policy_version 186774 (0.0033) [2024-06-13 06:56:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.9, 300 sec: 49041.0). Total num frames: 3060105216. Throughput: 0: 48718.7. Samples: 2588962200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 06:56:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:56:19,044][71000] Updated weights for policy 0, policy_version 186784 (0.0026) [2024-06-13 06:56:20,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3060334592. Throughput: 0: 48731.3. Samples: 2589110980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 06:56:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:56:22,680][71000] Updated weights for policy 0, policy_version 186794 (0.0031) [2024-06-13 06:56:25,882][71000] Updated weights for policy 0, policy_version 186804 (0.0034) [2024-06-13 06:56:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49155.5, 300 sec: 48929.9). Total num frames: 3060596736. Throughput: 0: 48718.8. Samples: 2589405180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 06:56:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:56:29,209][71000] Updated weights for policy 0, policy_version 186814 (0.0023) [2024-06-13 06:56:30,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.8, 300 sec: 48985.3). Total num frames: 3060826112. Throughput: 0: 48883.3. Samples: 2589697720. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 06:56:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:56:32,957][71000] Updated weights for policy 0, policy_version 186824 (0.0032) [2024-06-13 06:56:35,939][70768] Fps is (10 sec: 45875.4, 60 sec: 48606.1, 300 sec: 48874.3). Total num frames: 3061055488. Throughput: 0: 48755.4. Samples: 2589840160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 06:56:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:56:36,201][71000] Updated weights for policy 0, policy_version 186834 (0.0028) [2024-06-13 06:56:39,287][71000] Updated weights for policy 0, policy_version 186844 (0.0032) [2024-06-13 06:56:40,771][70980] Signal inference workers to stop experience collection... (38750 times) [2024-06-13 06:56:40,809][71000] InferenceWorker_p0-w0: stopping experience collection (38750 times) [2024-06-13 06:56:40,879][70980] Signal inference workers to resume experience collection... (38750 times) [2024-06-13 06:56:40,879][71000] InferenceWorker_p0-w0: resuming experience collection (38750 times) [2024-06-13 06:56:40,939][70768] Fps is (10 sec: 49153.2, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 3061317632. Throughput: 0: 48992.9. Samples: 2590138320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 06:56:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:56:42,792][71000] Updated weights for policy 0, policy_version 186854 (0.0027) [2024-06-13 06:56:45,597][71000] Updated weights for policy 0, policy_version 186864 (0.0032) [2024-06-13 06:56:45,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.0, 300 sec: 48929.9). Total num frames: 3061579776. Throughput: 0: 49037.1. Samples: 2590433460. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 06:56:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:56:49,272][71000] Updated weights for policy 0, policy_version 186874 (0.0023) [2024-06-13 06:56:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.2, 300 sec: 49040.9). Total num frames: 3061825536. Throughput: 0: 49145.3. Samples: 2590587440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 06:56:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:56:52,643][71000] Updated weights for policy 0, policy_version 186884 (0.0028) [2024-06-13 06:56:55,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3062054912. Throughput: 0: 49316.3. Samples: 2590882240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 06:56:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:56:56,199][71000] Updated weights for policy 0, policy_version 186894 (0.0029) [2024-06-13 06:56:59,439][71000] Updated weights for policy 0, policy_version 186904 (0.0026) [2024-06-13 06:57:00,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.2, 300 sec: 48929.9). Total num frames: 3062300672. Throughput: 0: 49377.8. Samples: 2591184200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 06:57:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:57:02,716][71000] Updated weights for policy 0, policy_version 186914 (0.0036) [2024-06-13 06:57:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 3062546432. Throughput: 0: 49098.6. Samples: 2591320420. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 06:57:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:57:06,069][71000] Updated weights for policy 0, policy_version 186924 (0.0031) [2024-06-13 06:57:09,505][71000] Updated weights for policy 0, policy_version 186934 (0.0027) [2024-06-13 06:57:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 48985.4). Total num frames: 3062808576. Throughput: 0: 49194.2. Samples: 2591618920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 06:57:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:57:12,761][71000] Updated weights for policy 0, policy_version 186944 (0.0027) [2024-06-13 06:57:15,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3063037952. Throughput: 0: 49156.7. Samples: 2591909760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 06:57:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:57:16,180][71000] Updated weights for policy 0, policy_version 186954 (0.0025) [2024-06-13 06:57:19,145][71000] Updated weights for policy 0, policy_version 186964 (0.0022) [2024-06-13 06:57:20,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 3063283712. Throughput: 0: 49364.0. Samples: 2592061540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 06:57:20,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:57:22,496][71000] Updated weights for policy 0, policy_version 186974 (0.0028) [2024-06-13 06:57:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3063529472. Throughput: 0: 49376.8. Samples: 2592360280. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 06:57:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:57:26,023][71000] Updated weights for policy 0, policy_version 186984 (0.0037) [2024-06-13 06:57:29,173][71000] Updated weights for policy 0, policy_version 186994 (0.0033) [2024-06-13 06:57:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.2, 300 sec: 48985.4). Total num frames: 3063791616. Throughput: 0: 49342.2. Samples: 2592653860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 06:57:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:57:32,719][71000] Updated weights for policy 0, policy_version 187004 (0.0033) [2024-06-13 06:57:35,816][71000] Updated weights for policy 0, policy_version 187014 (0.0020) [2024-06-13 06:57:35,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49698.0, 300 sec: 48929.8). Total num frames: 3064037376. Throughput: 0: 49410.1. Samples: 2592810900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-13 06:57:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:57:39,113][71000] Updated weights for policy 0, policy_version 187024 (0.0029) [2024-06-13 06:57:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 48929.9). Total num frames: 3064266752. Throughput: 0: 49330.8. Samples: 2593102120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:57:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:57:40,985][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000187029_3064283136.pth... [2024-06-13 06:57:41,019][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000186312_3052535808.pth [2024-06-13 06:57:42,211][71000] Updated weights for policy 0, policy_version 187034 (0.0030) [2024-06-13 06:57:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3064512512. Throughput: 0: 49368.7. Samples: 2593405800. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:57:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:57:46,167][71000] Updated weights for policy 0, policy_version 187044 (0.0035) [2024-06-13 06:57:48,778][71000] Updated weights for policy 0, policy_version 187054 (0.0025) [2024-06-13 06:57:50,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3064791040. Throughput: 0: 49505.4. Samples: 2593548160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:57:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:57:52,734][71000] Updated weights for policy 0, policy_version 187064 (0.0029) [2024-06-13 06:57:55,900][71000] Updated weights for policy 0, policy_version 187074 (0.0029) [2024-06-13 06:57:55,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 3065020416. Throughput: 0: 49260.7. Samples: 2593835660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:57:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:57:59,770][71000] Updated weights for policy 0, policy_version 187084 (0.0026) [2024-06-13 06:58:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49424.9, 300 sec: 48985.4). Total num frames: 3065266176. Throughput: 0: 49394.9. Samples: 2594132540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:58:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:58:02,194][70980] Signal inference workers to stop experience collection... (38800 times) [2024-06-13 06:58:02,231][71000] InferenceWorker_p0-w0: stopping experience collection (38800 times) [2024-06-13 06:58:02,250][70980] Signal inference workers to resume experience collection... 
(38800 times) [2024-06-13 06:58:02,250][71000] InferenceWorker_p0-w0: resuming experience collection (38800 times) [2024-06-13 06:58:02,385][71000] Updated weights for policy 0, policy_version 187094 (0.0034) [2024-06-13 06:58:05,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48879.1, 300 sec: 48929.8). Total num frames: 3065479168. Throughput: 0: 49190.2. Samples: 2594275100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:58:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 06:58:06,410][71000] Updated weights for policy 0, policy_version 187104 (0.0024) [2024-06-13 06:58:08,686][71000] Updated weights for policy 0, policy_version 187114 (0.0028) [2024-06-13 06:58:10,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3065774080. Throughput: 0: 49156.0. Samples: 2594572300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:58:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:58:12,909][71000] Updated weights for policy 0, policy_version 187124 (0.0035) [2024-06-13 06:58:15,699][71000] Updated weights for policy 0, policy_version 187134 (0.0032) [2024-06-13 06:58:15,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49425.1, 300 sec: 48986.0). Total num frames: 3066003456. Throughput: 0: 49251.5. Samples: 2594870180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:58:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 06:58:19,417][71000] Updated weights for policy 0, policy_version 187144 (0.0024) [2024-06-13 06:58:20,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 3066249216. Throughput: 0: 49065.3. Samples: 2595018840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:58:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:58:22,309][71000] Updated weights for policy 0, policy_version 187154 (0.0031) [2024-06-13 06:58:25,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3066462208. Throughput: 0: 49033.8. Samples: 2595308640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:58:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:58:26,251][71000] Updated weights for policy 0, policy_version 187164 (0.0028) [2024-06-13 06:58:28,783][71000] Updated weights for policy 0, policy_version 187174 (0.0030) [2024-06-13 06:58:30,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 3066757120. Throughput: 0: 48803.3. Samples: 2595601940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:58:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:58:33,169][71000] Updated weights for policy 0, policy_version 187184 (0.0031) [2024-06-13 06:58:35,927][71000] Updated weights for policy 0, policy_version 187194 (0.0029) [2024-06-13 06:58:35,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3066986496. Throughput: 0: 49075.7. Samples: 2595756560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:58:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:58:39,637][71000] Updated weights for policy 0, policy_version 187204 (0.0022) [2024-06-13 06:58:40,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 3067215872. Throughput: 0: 49304.6. Samples: 2596054360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 06:58:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:58:42,318][71000] Updated weights for policy 0, policy_version 187214 (0.0024) [2024-06-13 06:58:45,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3067445248. Throughput: 0: 49286.4. Samples: 2596350420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:58:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:58:46,235][71000] Updated weights for policy 0, policy_version 187224 (0.0031) [2024-06-13 06:58:48,968][71000] Updated weights for policy 0, policy_version 187234 (0.0027) [2024-06-13 06:58:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3067723776. Throughput: 0: 49179.1. Samples: 2596488160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:58:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:58:53,066][71000] Updated weights for policy 0, policy_version 187244 (0.0038) [2024-06-13 06:58:55,601][71000] Updated weights for policy 0, policy_version 187254 (0.0030) [2024-06-13 06:58:55,940][70768] Fps is (10 sec: 54066.5, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3067985920. Throughput: 0: 49355.8. Samples: 2596793320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:58:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 06:58:59,787][71000] Updated weights for policy 0, policy_version 187264 (0.0027) [2024-06-13 06:59:00,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 3068182528. Throughput: 0: 49126.6. Samples: 2597080880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:59:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 06:59:02,601][71000] Updated weights for policy 0, policy_version 187274 (0.0025) [2024-06-13 06:59:05,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3068444672. Throughput: 0: 48925.8. Samples: 2597220500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:59:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:59:06,250][71000] Updated weights for policy 0, policy_version 187284 (0.0029) [2024-06-13 06:59:09,076][71000] Updated weights for policy 0, policy_version 187294 (0.0032) [2024-06-13 06:59:10,940][70768] Fps is (10 sec: 52428.3, 60 sec: 48878.9, 300 sec: 49096.4). Total num frames: 3068706816. Throughput: 0: 49120.8. Samples: 2597519080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:59:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:59:13,016][71000] Updated weights for policy 0, policy_version 187304 (0.0024) [2024-06-13 06:59:15,678][71000] Updated weights for policy 0, policy_version 187314 (0.0028) [2024-06-13 06:59:15,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49424.9, 300 sec: 49096.5). Total num frames: 3068968960. Throughput: 0: 49237.2. Samples: 2597817620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:59:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 06:59:17,169][70980] Signal inference workers to stop experience collection... (38850 times) [2024-06-13 06:59:17,169][70980] Signal inference workers to resume experience collection... 
(38850 times) [2024-06-13 06:59:17,212][71000] InferenceWorker_p0-w0: stopping experience collection (38850 times) [2024-06-13 06:59:17,212][71000] InferenceWorker_p0-w0: resuming experience collection (38850 times) [2024-06-13 06:59:19,555][71000] Updated weights for policy 0, policy_version 187324 (0.0030) [2024-06-13 06:59:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 3069198336. Throughput: 0: 49020.7. Samples: 2597962500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:59:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 06:59:22,327][71000] Updated weights for policy 0, policy_version 187334 (0.0031) [2024-06-13 06:59:25,939][70768] Fps is (10 sec: 45876.1, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3069427712. Throughput: 0: 49074.7. Samples: 2598262720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:59:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 06:59:25,946][71000] Updated weights for policy 0, policy_version 187344 (0.0029) [2024-06-13 06:59:28,957][71000] Updated weights for policy 0, policy_version 187354 (0.0027) [2024-06-13 06:59:30,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3069689856. Throughput: 0: 49055.6. Samples: 2598557920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:59:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:59:32,611][71000] Updated weights for policy 0, policy_version 187364 (0.0029) [2024-06-13 06:59:35,583][71000] Updated weights for policy 0, policy_version 187374 (0.0023) [2024-06-13 06:59:35,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 3069952000. Throughput: 0: 49475.0. Samples: 2598714540. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:59:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 06:59:39,350][71000] Updated weights for policy 0, policy_version 187384 (0.0032) [2024-06-13 06:59:40,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49424.9, 300 sec: 49096.4). Total num frames: 3070181376. Throughput: 0: 49263.6. Samples: 2599010180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:59:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 06:59:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000187389_3070181376.pth... [2024-06-13 06:59:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000186670_3058401280.pth [2024-06-13 06:59:42,207][71000] Updated weights for policy 0, policy_version 187394 (0.0031) [2024-06-13 06:59:45,835][71000] Updated weights for policy 0, policy_version 187404 (0.0028) [2024-06-13 06:59:45,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 3070427136. Throughput: 0: 49343.2. Samples: 2599301320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:59:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 06:59:48,763][71000] Updated weights for policy 0, policy_version 187414 (0.0021) [2024-06-13 06:59:50,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3070672896. Throughput: 0: 49602.2. Samples: 2599452600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 06:59:50,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 06:59:52,609][71000] Updated weights for policy 0, policy_version 187424 (0.0029) [2024-06-13 06:59:55,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 3070902272. Throughput: 0: 49375.6. Samples: 2599740980. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 06:59:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 06:59:55,954][71000] Updated weights for policy 0, policy_version 187434 (0.0022) [2024-06-13 06:59:59,433][71000] Updated weights for policy 0, policy_version 187444 (0.0033) [2024-06-13 07:00:00,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 49096.4). Total num frames: 3071148032. Throughput: 0: 49045.8. Samples: 2600024680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 07:00:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:00:02,637][71000] Updated weights for policy 0, policy_version 187454 (0.0038) [2024-06-13 07:00:05,715][71000] Updated weights for policy 0, policy_version 187464 (0.0026) [2024-06-13 07:00:05,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3071410176. Throughput: 0: 49128.1. Samples: 2600173260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 07:00:05,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 07:00:08,937][71000] Updated weights for policy 0, policy_version 187474 (0.0022) [2024-06-13 07:00:10,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3071655936. Throughput: 0: 49159.0. Samples: 2600474880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 07:00:10,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:00:12,471][71000] Updated weights for policy 0, policy_version 187484 (0.0026) [2024-06-13 07:00:15,911][71000] Updated weights for policy 0, policy_version 187494 (0.0032) [2024-06-13 07:00:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48879.0, 300 sec: 49096.4). Total num frames: 3071901696. Throughput: 0: 49338.1. Samples: 2600778140. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 07:00:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:00:19,381][71000] Updated weights for policy 0, policy_version 187504 (0.0030) [2024-06-13 07:00:20,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.9, 300 sec: 49097.2). Total num frames: 3072131072. Throughput: 0: 48893.3. Samples: 2600914740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 07:00:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:00:22,542][71000] Updated weights for policy 0, policy_version 187514 (0.0025) [2024-06-13 07:00:25,876][71000] Updated weights for policy 0, policy_version 187524 (0.0022) [2024-06-13 07:00:25,944][70768] Fps is (10 sec: 49131.2, 60 sec: 49421.4, 300 sec: 49151.3). Total num frames: 3072393216. Throughput: 0: 48755.5. Samples: 2601204380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 07:00:25,945][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:00:29,159][71000] Updated weights for policy 0, policy_version 187534 (0.0022) [2024-06-13 07:00:30,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3072638976. Throughput: 0: 49015.5. Samples: 2601507020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 07:00:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:00:32,601][71000] Updated weights for policy 0, policy_version 187544 (0.0025) [2024-06-13 07:00:32,811][70980] Signal inference workers to stop experience collection... (38900 times) [2024-06-13 07:00:32,821][71000] InferenceWorker_p0-w0: stopping experience collection (38900 times) [2024-06-13 07:00:32,868][70980] Signal inference workers to resume experience collection... (38900 times) [2024-06-13 07:00:32,868][71000] InferenceWorker_p0-w0: resuming experience collection (38900 times) [2024-06-13 07:00:35,939][70768] Fps is (10 sec: 45895.4, 60 sec: 48333.0, 300 sec: 49040.9). Total num frames: 3072851968. Throughput: 0: 48695.3. 
Samples: 2601643880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 07:00:35,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:00:36,212][71000] Updated weights for policy 0, policy_version 187554 (0.0041) [2024-06-13 07:00:39,423][71000] Updated weights for policy 0, policy_version 187564 (0.0040) [2024-06-13 07:00:40,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 3073114112. Throughput: 0: 48815.5. Samples: 2601937680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 07:00:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:00:42,945][71000] Updated weights for policy 0, policy_version 187574 (0.0033) [2024-06-13 07:00:45,939][70768] Fps is (10 sec: 50790.6, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3073359872. Throughput: 0: 49083.7. Samples: 2602233440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 07:00:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:00:46,014][71000] Updated weights for policy 0, policy_version 187584 (0.0027) [2024-06-13 07:00:49,421][71000] Updated weights for policy 0, policy_version 187594 (0.0022) [2024-06-13 07:00:50,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49152.2, 300 sec: 49152.0). Total num frames: 3073622016. Throughput: 0: 49193.5. Samples: 2602386960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 07:00:50,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:00:52,712][71000] Updated weights for policy 0, policy_version 187604 (0.0027) [2024-06-13 07:00:55,939][70768] Fps is (10 sec: 47513.4, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3073835008. Throughput: 0: 48895.6. Samples: 2602675180. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 07:00:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:00:56,478][71000] Updated weights for policy 0, policy_version 187614 (0.0029) [2024-06-13 07:00:59,352][71000] Updated weights for policy 0, policy_version 187624 (0.0025) [2024-06-13 07:01:00,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 3074113536. Throughput: 0: 48524.5. Samples: 2602961740. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 07:01:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:01:02,931][71000] Updated weights for policy 0, policy_version 187634 (0.0027) [2024-06-13 07:01:05,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 3074342912. Throughput: 0: 48968.5. Samples: 2603118320. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 07:01:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:01:06,030][71000] Updated weights for policy 0, policy_version 187644 (0.0031) [2024-06-13 07:01:09,710][71000] Updated weights for policy 0, policy_version 187654 (0.0032) [2024-06-13 07:01:10,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3074588672. Throughput: 0: 49072.4. Samples: 2603412420. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 07:01:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 07:01:12,815][71000] Updated weights for policy 0, policy_version 187664 (0.0032) [2024-06-13 07:01:15,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48332.9, 300 sec: 49040.9). Total num frames: 3074801664. Throughput: 0: 48850.6. Samples: 2603705300. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 07:01:15,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 07:01:16,288][71000] Updated weights for policy 0, policy_version 187674 (0.0029) [2024-06-13 07:01:19,378][71000] Updated weights for policy 0, policy_version 187684 (0.0039) [2024-06-13 07:01:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 3075096576. Throughput: 0: 48992.0. Samples: 2603848520. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 07:01:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:01:22,934][71000] Updated weights for policy 0, policy_version 187694 (0.0029) [2024-06-13 07:01:25,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48609.4, 300 sec: 49096.5). Total num frames: 3075309568. Throughput: 0: 49087.3. Samples: 2604146600. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 07:01:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:01:26,076][71000] Updated weights for policy 0, policy_version 187704 (0.0029) [2024-06-13 07:01:29,711][71000] Updated weights for policy 0, policy_version 187714 (0.0030) [2024-06-13 07:01:30,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 3075571712. Throughput: 0: 49146.6. Samples: 2604445040. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 07:01:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:01:32,708][71000] Updated weights for policy 0, policy_version 187724 (0.0021) [2024-06-13 07:01:35,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3075801088. Throughput: 0: 48869.7. Samples: 2604586100. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 07:01:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 07:01:36,227][71000] Updated weights for policy 0, policy_version 187734 (0.0027) [2024-06-13 07:01:39,398][71000] Updated weights for policy 0, policy_version 187744 (0.0030) [2024-06-13 07:01:40,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3076079616. Throughput: 0: 48968.4. Samples: 2604878760. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 07:01:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:01:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000187749_3076079616.pth... [2024-06-13 07:01:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000187029_3064283136.pth [2024-06-13 07:01:43,274][71000] Updated weights for policy 0, policy_version 187754 (0.0032) [2024-06-13 07:01:44,602][70980] Signal inference workers to stop experience collection... (38950 times) [2024-06-13 07:01:44,603][70980] Signal inference workers to resume experience collection... (38950 times) [2024-06-13 07:01:44,615][71000] InferenceWorker_p0-w0: stopping experience collection (38950 times) [2024-06-13 07:01:44,615][71000] InferenceWorker_p0-w0: resuming experience collection (38950 times) [2024-06-13 07:01:45,923][71000] Updated weights for policy 0, policy_version 187764 (0.0023) [2024-06-13 07:01:45,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3076325376. Throughput: 0: 49095.6. Samples: 2605171040. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 07:01:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:01:49,952][71000] Updated weights for policy 0, policy_version 187774 (0.0023) [2024-06-13 07:01:50,939][70768] Fps is (10 sec: 44237.2, 60 sec: 48332.8, 300 sec: 49041.0). Total num frames: 3076521984. Throughput: 0: 48954.0. Samples: 2605321240. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 07:01:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:01:52,927][71000] Updated weights for policy 0, policy_version 187784 (0.0032) [2024-06-13 07:01:55,940][70768] Fps is (10 sec: 44236.9, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3076767744. Throughput: 0: 48665.3. Samples: 2605602360. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 07:01:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:01:56,423][71000] Updated weights for policy 0, policy_version 187794 (0.0033) [2024-06-13 07:01:59,517][71000] Updated weights for policy 0, policy_version 187804 (0.0027) [2024-06-13 07:02:00,940][70768] Fps is (10 sec: 52427.6, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 3077046272. Throughput: 0: 48678.1. Samples: 2605895820. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-13 07:02:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:02:03,640][71000] Updated weights for policy 0, policy_version 187814 (0.0026) [2024-06-13 07:02:05,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 3077292032. Throughput: 0: 48983.9. Samples: 2606052800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 07:02:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 07:02:06,134][71000] Updated weights for policy 0, policy_version 187824 (0.0020) [2024-06-13 07:02:10,128][71000] Updated weights for policy 0, policy_version 187834 (0.0035) [2024-06-13 07:02:10,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.7, 300 sec: 49040.9). Total num frames: 3077505024. Throughput: 0: 48853.1. Samples: 2606345000. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 07:02:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:02:13,037][71000] Updated weights for policy 0, policy_version 187844 (0.0034) [2024-06-13 07:02:15,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3077750784. Throughput: 0: 48691.4. Samples: 2606636160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 07:02:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:02:16,721][71000] Updated weights for policy 0, policy_version 187854 (0.0023) [2024-06-13 07:02:19,446][71000] Updated weights for policy 0, policy_version 187864 (0.0033) [2024-06-13 07:02:20,939][70768] Fps is (10 sec: 52429.8, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 3078029312. Throughput: 0: 48951.1. Samples: 2606788900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 07:02:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:02:23,353][71000] Updated weights for policy 0, policy_version 187874 (0.0023) [2024-06-13 07:02:25,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49424.9, 300 sec: 49096.4). Total num frames: 3078275072. Throughput: 0: 49233.2. Samples: 2607094260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 07:02:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:02:26,153][71000] Updated weights for policy 0, policy_version 187884 (0.0029) [2024-06-13 07:02:30,210][71000] Updated weights for policy 0, policy_version 187894 (0.0030) [2024-06-13 07:02:30,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 3078488064. Throughput: 0: 49201.4. Samples: 2607385100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 07:02:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:02:32,823][71000] Updated weights for policy 0, policy_version 187904 (0.0028) [2024-06-13 07:02:35,940][70768] Fps is (10 sec: 45875.9, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3078733824. Throughput: 0: 48913.2. Samples: 2607522340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 07:02:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:02:36,977][71000] Updated weights for policy 0, policy_version 187914 (0.0038) [2024-06-13 07:02:38,983][70980] Signal inference workers to stop experience collection... (39000 times) [2024-06-13 07:02:39,034][71000] InferenceWorker_p0-w0: stopping experience collection (39000 times) [2024-06-13 07:02:39,035][70980] Signal inference workers to resume experience collection... (39000 times) [2024-06-13 07:02:39,047][71000] InferenceWorker_p0-w0: resuming experience collection (39000 times) [2024-06-13 07:02:39,314][71000] Updated weights for policy 0, policy_version 187924 (0.0031) [2024-06-13 07:02:40,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3079012352. Throughput: 0: 49116.8. Samples: 2607812620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 07:02:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:02:43,526][71000] Updated weights for policy 0, policy_version 187934 (0.0031) [2024-06-13 07:02:45,939][70768] Fps is (10 sec: 52429.3, 60 sec: 48879.0, 300 sec: 49041.0). Total num frames: 3079258112. Throughput: 0: 49294.5. Samples: 2608114060. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 07:02:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:02:46,050][71000] Updated weights for policy 0, policy_version 187944 (0.0038) [2024-06-13 07:02:50,074][71000] Updated weights for policy 0, policy_version 187954 (0.0034) [2024-06-13 07:02:50,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 3079471104. Throughput: 0: 48929.9. Samples: 2608254640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 07:02:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:02:52,747][71000] Updated weights for policy 0, policy_version 187964 (0.0038) [2024-06-13 07:02:55,939][70768] Fps is (10 sec: 44236.8, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3079700480. Throughput: 0: 49062.0. Samples: 2608552780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 07:02:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:02:56,959][71000] Updated weights for policy 0, policy_version 187974 (0.0026) [2024-06-13 07:02:59,407][71000] Updated weights for policy 0, policy_version 187984 (0.0029) [2024-06-13 07:03:00,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 3079995392. Throughput: 0: 48997.7. Samples: 2608841060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 07:03:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:03:03,444][71000] Updated weights for policy 0, policy_version 187994 (0.0024) [2024-06-13 07:03:05,940][70768] Fps is (10 sec: 52427.8, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3080224768. Throughput: 0: 49057.2. Samples: 2608996480. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 07:03:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:03:06,205][71000] Updated weights for policy 0, policy_version 188004 (0.0031) [2024-06-13 07:03:10,495][71000] Updated weights for policy 0, policy_version 188014 (0.0024) [2024-06-13 07:03:10,939][70768] Fps is (10 sec: 44237.7, 60 sec: 48879.1, 300 sec: 48929.8). Total num frames: 3080437760. Throughput: 0: 48617.1. Samples: 2609282020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-13 07:03:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:03:13,182][71000] Updated weights for policy 0, policy_version 188024 (0.0032) [2024-06-13 07:03:15,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3080699904. Throughput: 0: 48475.0. Samples: 2609566480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-13 07:03:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:03:17,026][71000] Updated weights for policy 0, policy_version 188034 (0.0025) [2024-06-13 07:03:20,018][71000] Updated weights for policy 0, policy_version 188044 (0.0029) [2024-06-13 07:03:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48605.8, 300 sec: 49096.5). Total num frames: 3080945664. Throughput: 0: 48913.7. Samples: 2609723460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-13 07:03:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:03:23,842][71000] Updated weights for policy 0, policy_version 188054 (0.0026) [2024-06-13 07:03:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 3081175040. Throughput: 0: 48835.0. Samples: 2610010200. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-13 07:03:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:03:26,768][71000] Updated weights for policy 0, policy_version 188064 (0.0020) [2024-06-13 07:03:30,606][71000] Updated weights for policy 0, policy_version 188074 (0.0029) [2024-06-13 07:03:30,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3081404416. Throughput: 0: 48694.1. Samples: 2610305300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-13 07:03:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:03:33,353][71000] Updated weights for policy 0, policy_version 188084 (0.0031) [2024-06-13 07:03:35,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3081682944. Throughput: 0: 48576.5. Samples: 2610440580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-13 07:03:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:03:37,194][71000] Updated weights for policy 0, policy_version 188094 (0.0032) [2024-06-13 07:03:40,334][71000] Updated weights for policy 0, policy_version 188104 (0.0027) [2024-06-13 07:03:40,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 3081928704. Throughput: 0: 48663.0. Samples: 2610742620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-13 07:03:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:03:41,044][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000188107_3081945088.pth... [2024-06-13 07:03:41,095][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000187389_3070181376.pth [2024-06-13 07:03:44,164][71000] Updated weights for policy 0, policy_version 188114 (0.0025) [2024-06-13 07:03:45,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48059.7, 300 sec: 48874.3). Total num frames: 3082141696. Throughput: 0: 48751.4. Samples: 2611034860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-13 07:03:45,940][70768] Avg episode reward: [(0, '0.295')] [2024-06-13 07:03:46,869][71000] Updated weights for policy 0, policy_version 188124 (0.0034) [2024-06-13 07:03:47,459][70980] Signal inference workers to stop experience collection... (39050 times) [2024-06-13 07:03:47,488][71000] InferenceWorker_p0-w0: stopping experience collection (39050 times) [2024-06-13 07:03:47,566][70980] Signal inference workers to resume experience collection... (39050 times) [2024-06-13 07:03:47,567][71000] InferenceWorker_p0-w0: resuming experience collection (39050 times) [2024-06-13 07:03:50,678][71000] Updated weights for policy 0, policy_version 188134 (0.0029) [2024-06-13 07:03:50,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 3082387456. Throughput: 0: 48323.1. Samples: 2611171020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-13 07:03:50,949][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:03:53,877][71000] Updated weights for policy 0, policy_version 188144 (0.0021) [2024-06-13 07:03:55,944][70768] Fps is (10 sec: 50767.5, 60 sec: 49148.3, 300 sec: 49040.2). Total num frames: 3082649600. Throughput: 0: 48396.1. Samples: 2611460060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-13 07:03:55,945][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:03:57,388][71000] Updated weights for policy 0, policy_version 188154 (0.0030) [2024-06-13 07:04:00,567][71000] Updated weights for policy 0, policy_version 188164 (0.0026) [2024-06-13 07:04:00,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48332.9, 300 sec: 48985.4). Total num frames: 3082895360. Throughput: 0: 48738.7. Samples: 2611759720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-13 07:04:00,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:04:04,188][71000] Updated weights for policy 0, policy_version 188174 (0.0032) [2024-06-13 07:04:05,940][70768] Fps is (10 sec: 47534.6, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 3083124736. Throughput: 0: 48490.3. Samples: 2611905520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-13 07:04:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:04:07,107][71000] Updated weights for policy 0, policy_version 188184 (0.0035) [2024-06-13 07:04:10,748][71000] Updated weights for policy 0, policy_version 188194 (0.0031) [2024-06-13 07:04:10,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 48818.8). Total num frames: 3083370496. Throughput: 0: 48576.4. Samples: 2612196140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-13 07:04:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:04:13,805][71000] Updated weights for policy 0, policy_version 188204 (0.0022) [2024-06-13 07:04:15,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3083632640. Throughput: 0: 48593.3. Samples: 2612492000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-13 07:04:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:04:17,137][71000] Updated weights for policy 0, policy_version 188214 (0.0033) [2024-06-13 07:04:20,423][71000] Updated weights for policy 0, policy_version 188224 (0.0027) [2024-06-13 07:04:20,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3083878400. Throughput: 0: 49091.0. Samples: 2612649680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:04:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:04:23,855][71000] Updated weights for policy 0, policy_version 188234 (0.0032) [2024-06-13 07:04:25,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 3084107776. Throughput: 0: 49151.6. Samples: 2612954440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:04:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:04:26,905][71000] Updated weights for policy 0, policy_version 188244 (0.0026) [2024-06-13 07:04:30,325][71000] Updated weights for policy 0, policy_version 188254 (0.0035) [2024-06-13 07:04:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49424.9, 300 sec: 48874.3). Total num frames: 3084369920. Throughput: 0: 49117.1. Samples: 2613245140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:04:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:04:33,564][71000] Updated weights for policy 0, policy_version 188264 (0.0026) [2024-06-13 07:04:35,940][70768] Fps is (10 sec: 50789.1, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3084615680. Throughput: 0: 49460.8. Samples: 2613396760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:04:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:04:36,782][71000] Updated weights for policy 0, policy_version 188274 (0.0027) [2024-06-13 07:04:40,267][71000] Updated weights for policy 0, policy_version 188284 (0.0030) [2024-06-13 07:04:40,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3084877824. Throughput: 0: 49573.8. Samples: 2613690660. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:04:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:04:43,511][71000] Updated weights for policy 0, policy_version 188294 (0.0024) [2024-06-13 07:04:45,939][70768] Fps is (10 sec: 49153.4, 60 sec: 49425.1, 300 sec: 48929.9). Total num frames: 3085107200. Throughput: 0: 49568.6. Samples: 2613990300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:04:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:04:46,921][71000] Updated weights for policy 0, policy_version 188304 (0.0024) [2024-06-13 07:04:50,143][71000] Updated weights for policy 0, policy_version 188314 (0.0030) [2024-06-13 07:04:50,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.2, 300 sec: 48985.4). Total num frames: 3085352960. Throughput: 0: 49420.9. Samples: 2614129460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:04:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:04:53,386][71000] Updated weights for policy 0, policy_version 188324 (0.0024) [2024-06-13 07:04:55,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49155.5, 300 sec: 48985.4). Total num frames: 3085598720. Throughput: 0: 49694.7. Samples: 2614432400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:04:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:04:56,781][71000] Updated weights for policy 0, policy_version 188334 (0.0027) [2024-06-13 07:05:00,197][71000] Updated weights for policy 0, policy_version 188344 (0.0032) [2024-06-13 07:05:00,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.2, 300 sec: 49040.9). Total num frames: 3085877248. Throughput: 0: 49670.2. Samples: 2614727160. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:05:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:05:03,422][71000] Updated weights for policy 0, policy_version 188354 (0.0027) [2024-06-13 07:05:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.0, 300 sec: 48985.4). Total num frames: 3086106624. Throughput: 0: 49493.7. Samples: 2614876900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:05:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:05:06,750][70980] Signal inference workers to stop experience collection... (39100 times) [2024-06-13 07:05:06,787][71000] InferenceWorker_p0-w0: stopping experience collection (39100 times) [2024-06-13 07:05:06,866][70980] Signal inference workers to resume experience collection... (39100 times) [2024-06-13 07:05:06,866][71000] InferenceWorker_p0-w0: resuming experience collection (39100 times) [2024-06-13 07:05:06,992][71000] Updated weights for policy 0, policy_version 188364 (0.0026) [2024-06-13 07:05:09,831][71000] Updated weights for policy 0, policy_version 188374 (0.0028) [2024-06-13 07:05:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49698.3, 300 sec: 48985.4). Total num frames: 3086352384. Throughput: 0: 49489.2. Samples: 2615181460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:05:10,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 07:05:13,674][71000] Updated weights for policy 0, policy_version 188384 (0.0031) [2024-06-13 07:05:15,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3086598144. Throughput: 0: 49345.1. Samples: 2615465660. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:05:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:05:16,729][71000] Updated weights for policy 0, policy_version 188394 (0.0037) [2024-06-13 07:05:20,310][71000] Updated weights for policy 0, policy_version 188404 (0.0035) [2024-06-13 07:05:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 48986.1). Total num frames: 3086843904. Throughput: 0: 49391.8. Samples: 2615619380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:05:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:05:23,438][71000] Updated weights for policy 0, policy_version 188414 (0.0029) [2024-06-13 07:05:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 48985.4). Total num frames: 3087089664. Throughput: 0: 49293.8. Samples: 2615908880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 07:05:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:05:26,963][71000] Updated weights for policy 0, policy_version 188424 (0.0028) [2024-06-13 07:05:30,162][71000] Updated weights for policy 0, policy_version 188434 (0.0028) [2024-06-13 07:05:30,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 3087351808. Throughput: 0: 49226.4. Samples: 2616205500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 07:05:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 07:05:33,679][71000] Updated weights for policy 0, policy_version 188444 (0.0031) [2024-06-13 07:05:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.2, 300 sec: 48985.4). Total num frames: 3087564800. Throughput: 0: 49534.2. Samples: 2616358500. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 07:05:35,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:05:36,616][71000] Updated weights for policy 0, policy_version 188454 (0.0025) [2024-06-13 07:05:40,438][71000] Updated weights for policy 0, policy_version 188464 (0.0028) [2024-06-13 07:05:40,939][70768] Fps is (10 sec: 47514.6, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3087826944. Throughput: 0: 49151.3. Samples: 2616644200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 07:05:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:05:41,054][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000188467_3087843328.pth... [2024-06-13 07:05:41,105][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000187749_3076079616.pth [2024-06-13 07:05:43,465][71000] Updated weights for policy 0, policy_version 188474 (0.0027) [2024-06-13 07:05:45,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 3088056320. Throughput: 0: 49097.4. Samples: 2616936540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 07:05:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:05:47,064][71000] Updated weights for policy 0, policy_version 188484 (0.0026) [2024-06-13 07:05:50,020][71000] Updated weights for policy 0, policy_version 188494 (0.0031) [2024-06-13 07:05:50,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 3088334848. Throughput: 0: 49193.5. Samples: 2617090600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 07:05:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:05:53,838][71000] Updated weights for policy 0, policy_version 188504 (0.0032) [2024-06-13 07:05:55,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 3088564224. Throughput: 0: 49048.9. Samples: 2617388660. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 07:05:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:05:56,822][71000] Updated weights for policy 0, policy_version 188514 (0.0033) [2024-06-13 07:06:00,545][71000] Updated weights for policy 0, policy_version 188524 (0.0026) [2024-06-13 07:06:00,939][70768] Fps is (10 sec: 44237.0, 60 sec: 48332.8, 300 sec: 48929.9). Total num frames: 3088777216. Throughput: 0: 49107.2. Samples: 2617675480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 07:06:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:06:01,848][70980] Signal inference workers to stop experience collection... (39150 times) [2024-06-13 07:06:01,848][70980] Signal inference workers to resume experience collection... (39150 times) [2024-06-13 07:06:01,887][71000] InferenceWorker_p0-w0: stopping experience collection (39150 times) [2024-06-13 07:06:01,887][71000] InferenceWorker_p0-w0: resuming experience collection (39150 times) [2024-06-13 07:06:03,662][71000] Updated weights for policy 0, policy_version 188534 (0.0036) [2024-06-13 07:06:05,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3089055744. Throughput: 0: 48828.8. Samples: 2617816680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 07:06:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:06:07,234][71000] Updated weights for policy 0, policy_version 188544 (0.0025) [2024-06-13 07:06:10,115][71000] Updated weights for policy 0, policy_version 188554 (0.0029) [2024-06-13 07:06:10,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 3089285120. Throughput: 0: 49027.1. Samples: 2618115100. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 07:06:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:06:14,382][71000] Updated weights for policy 0, policy_version 188564 (0.0030) [2024-06-13 07:06:15,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3089547264. Throughput: 0: 48921.1. Samples: 2618406940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 07:06:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:06:16,738][71000] Updated weights for policy 0, policy_version 188574 (0.0030) [2024-06-13 07:06:20,716][71000] Updated weights for policy 0, policy_version 188584 (0.0029) [2024-06-13 07:06:20,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3089760256. Throughput: 0: 48864.5. Samples: 2618557400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 07:06:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:06:23,516][71000] Updated weights for policy 0, policy_version 188594 (0.0034) [2024-06-13 07:06:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3090038784. Throughput: 0: 49084.8. Samples: 2618853020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 07:06:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:06:27,577][71000] Updated weights for policy 0, policy_version 188604 (0.0030) [2024-06-13 07:06:30,029][71000] Updated weights for policy 0, policy_version 188614 (0.0020) [2024-06-13 07:06:30,940][70768] Fps is (10 sec: 54066.8, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3090300928. Throughput: 0: 49280.8. Samples: 2619154180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 07:06:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:06:33,969][71000] Updated weights for policy 0, policy_version 188624 (0.0027) [2024-06-13 07:06:35,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 3090513920. Throughput: 0: 49131.9. Samples: 2619301540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 07:06:35,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:06:36,841][71000] Updated weights for policy 0, policy_version 188634 (0.0021) [2024-06-13 07:06:40,931][71000] Updated weights for policy 0, policy_version 188644 (0.0033) [2024-06-13 07:06:40,940][70768] Fps is (10 sec: 44237.0, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3090743296. Throughput: 0: 48732.5. Samples: 2619581620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 07:06:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:06:43,636][71000] Updated weights for policy 0, policy_version 188654 (0.0038) [2024-06-13 07:06:45,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 3091005440. Throughput: 0: 48743.5. Samples: 2619868940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 07:06:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:06:48,042][71000] Updated weights for policy 0, policy_version 188664 (0.0025) [2024-06-13 07:06:50,440][71000] Updated weights for policy 0, policy_version 188674 (0.0032) [2024-06-13 07:06:50,940][70768] Fps is (10 sec: 50789.2, 60 sec: 48605.7, 300 sec: 49096.4). Total num frames: 3091251200. Throughput: 0: 48974.5. Samples: 2620020540. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 07:06:50,941][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:06:54,399][71000] Updated weights for policy 0, policy_version 188684 (0.0032) [2024-06-13 07:06:55,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 3091480576. Throughput: 0: 48932.8. Samples: 2620317080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 07:06:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:06:57,039][71000] Updated weights for policy 0, policy_version 188694 (0.0033) [2024-06-13 07:07:00,940][70768] Fps is (10 sec: 45876.1, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3091709952. Throughput: 0: 48864.8. Samples: 2620605860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 07:07:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 07:07:00,951][71000] Updated weights for policy 0, policy_version 188704 (0.0026) [2024-06-13 07:07:03,826][71000] Updated weights for policy 0, policy_version 188714 (0.0022) [2024-06-13 07:07:05,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3091988480. Throughput: 0: 48568.4. Samples: 2620742980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 07:07:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:07:07,804][71000] Updated weights for policy 0, policy_version 188724 (0.0031) [2024-06-13 07:07:10,449][71000] Updated weights for policy 0, policy_version 188734 (0.0026) [2024-06-13 07:07:10,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3092234240. Throughput: 0: 48588.8. Samples: 2621039520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 07:07:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:07:14,466][71000] Updated weights for policy 0, policy_version 188744 (0.0034) [2024-06-13 07:07:15,940][70768] Fps is (10 sec: 45874.4, 60 sec: 48332.7, 300 sec: 48874.3). Total num frames: 3092447232. Throughput: 0: 48626.1. Samples: 2621342360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 07:07:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:07:17,042][71000] Updated weights for policy 0, policy_version 188754 (0.0025) [2024-06-13 07:07:18,607][70980] Signal inference workers to stop experience collection... (39200 times) [2024-06-13 07:07:18,647][71000] InferenceWorker_p0-w0: stopping experience collection (39200 times) [2024-06-13 07:07:18,721][70980] Signal inference workers to resume experience collection... (39200 times) [2024-06-13 07:07:18,721][71000] InferenceWorker_p0-w0: resuming experience collection (39200 times) [2024-06-13 07:07:20,936][71000] Updated weights for policy 0, policy_version 188764 (0.0037) [2024-06-13 07:07:20,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 3092709376. Throughput: 0: 48456.5. Samples: 2621482080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 07:07:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:07:23,714][71000] Updated weights for policy 0, policy_version 188774 (0.0023) [2024-06-13 07:07:25,940][70768] Fps is (10 sec: 50791.4, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 3092955136. Throughput: 0: 48665.8. Samples: 2621771580. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 07:07:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:07:27,346][71000] Updated weights for policy 0, policy_version 188784 (0.0022) [2024-06-13 07:07:30,553][71000] Updated weights for policy 0, policy_version 188794 (0.0030) [2024-06-13 07:07:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 3093217280. Throughput: 0: 48924.0. Samples: 2622070520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 07:07:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:07:34,203][71000] Updated weights for policy 0, policy_version 188804 (0.0026) [2024-06-13 07:07:35,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48333.0, 300 sec: 48818.8). Total num frames: 3093413888. Throughput: 0: 48829.7. Samples: 2622217860. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 07:07:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:07:36,996][71000] Updated weights for policy 0, policy_version 188814 (0.0026) [2024-06-13 07:07:40,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48605.8, 300 sec: 48818.7). Total num frames: 3093659648. Throughput: 0: 48514.3. Samples: 2622500220. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 07:07:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:07:41,120][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000188824_3093692416.pth... [2024-06-13 07:07:41,124][71000] Updated weights for policy 0, policy_version 188824 (0.0035) [2024-06-13 07:07:41,170][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000188107_3081945088.pth [2024-06-13 07:07:43,866][71000] Updated weights for policy 0, policy_version 188834 (0.0032) [2024-06-13 07:07:45,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3093921792. Throughput: 0: 48572.0. Samples: 2622791600. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 07:07:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:07:47,649][71000] Updated weights for policy 0, policy_version 188844 (0.0030) [2024-06-13 07:07:50,523][71000] Updated weights for policy 0, policy_version 188854 (0.0022) [2024-06-13 07:07:50,939][70768] Fps is (10 sec: 52429.2, 60 sec: 48879.2, 300 sec: 49096.5). Total num frames: 3094183936. Throughput: 0: 48969.4. Samples: 2622946600. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 07:07:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:07:53,958][71000] Updated weights for policy 0, policy_version 188864 (0.0025) [2024-06-13 07:07:55,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 3094396928. Throughput: 0: 48925.5. Samples: 2623241160. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 07:07:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 07:07:57,270][71000] Updated weights for policy 0, policy_version 188874 (0.0027) [2024-06-13 07:08:00,933][71000] Updated weights for policy 0, policy_version 188884 (0.0032) [2024-06-13 07:08:00,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 3094675456. Throughput: 0: 48786.9. Samples: 2623537760. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 07:08:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:08:03,902][71000] Updated weights for policy 0, policy_version 188894 (0.0025) [2024-06-13 07:08:04,903][70980] Signal inference workers to stop experience collection... (39250 times) [2024-06-13 07:08:04,953][71000] InferenceWorker_p0-w0: stopping experience collection (39250 times) [2024-06-13 07:08:04,959][70980] Signal inference workers to resume experience collection... 
(39250 times) [2024-06-13 07:08:04,968][71000] InferenceWorker_p0-w0: resuming experience collection (39250 times) [2024-06-13 07:08:05,939][70768] Fps is (10 sec: 50790.4, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 3094904832. Throughput: 0: 48852.0. Samples: 2623680420. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 07:08:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:08:07,723][71000] Updated weights for policy 0, policy_version 188904 (0.0028) [2024-06-13 07:08:10,646][71000] Updated weights for policy 0, policy_version 188914 (0.0027) [2024-06-13 07:08:10,942][70768] Fps is (10 sec: 50777.0, 60 sec: 49149.9, 300 sec: 49096.0). Total num frames: 3095183360. Throughput: 0: 49198.5. Samples: 2623985640. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 07:08:10,951][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:08:14,162][71000] Updated weights for policy 0, policy_version 188924 (0.0030) [2024-06-13 07:08:15,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3095396352. Throughput: 0: 49190.5. Samples: 2624284100. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 07:08:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:08:17,316][71000] Updated weights for policy 0, policy_version 188934 (0.0027) [2024-06-13 07:08:20,940][70768] Fps is (10 sec: 45887.0, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3095642112. Throughput: 0: 49100.3. Samples: 2624427380. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 07:08:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:08:20,966][71000] Updated weights for policy 0, policy_version 188944 (0.0023) [2024-06-13 07:08:23,851][71000] Updated weights for policy 0, policy_version 188954 (0.0036) [2024-06-13 07:08:25,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 3095887872. Throughput: 0: 49351.9. Samples: 2624721060. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 07:08:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:08:27,459][71000] Updated weights for policy 0, policy_version 188964 (0.0026) [2024-06-13 07:08:30,762][71000] Updated weights for policy 0, policy_version 188974 (0.0023) [2024-06-13 07:08:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3096150016. Throughput: 0: 49565.3. Samples: 2625022040. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 07:08:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:08:34,132][71000] Updated weights for policy 0, policy_version 188984 (0.0031) [2024-06-13 07:08:35,939][70768] Fps is (10 sec: 50791.1, 60 sec: 49698.1, 300 sec: 49040.9). Total num frames: 3096395776. Throughput: 0: 49316.0. Samples: 2625165820. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 07:08:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:08:37,588][71000] Updated weights for policy 0, policy_version 188994 (0.0036) [2024-06-13 07:08:40,581][71000] Updated weights for policy 0, policy_version 189004 (0.0035) [2024-06-13 07:08:40,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 3096641536. Throughput: 0: 49454.7. Samples: 2625466620. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 07:08:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:08:44,132][71000] Updated weights for policy 0, policy_version 189014 (0.0036) [2024-06-13 07:08:45,940][70768] Fps is (10 sec: 49150.2, 60 sec: 49424.8, 300 sec: 49152.0). Total num frames: 3096887296. Throughput: 0: 49335.6. Samples: 2625757880. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 07:08:45,941][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:08:47,398][71000] Updated weights for policy 0, policy_version 189024 (0.0022) [2024-06-13 07:08:50,685][71000] Updated weights for policy 0, policy_version 189034 (0.0026) [2024-06-13 07:08:50,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.9, 300 sec: 49097.2). Total num frames: 3097133056. Throughput: 0: 49603.4. Samples: 2625912580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 07:08:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:08:53,738][71000] Updated weights for policy 0, policy_version 189044 (0.0022) [2024-06-13 07:08:55,940][70768] Fps is (10 sec: 50792.0, 60 sec: 49971.2, 300 sec: 49152.0). Total num frames: 3097395200. Throughput: 0: 49291.3. Samples: 2626203620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 07:08:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 07:08:57,308][71000] Updated weights for policy 0, policy_version 189054 (0.0031) [2024-06-13 07:09:00,405][71000] Updated weights for policy 0, policy_version 189064 (0.0028) [2024-06-13 07:09:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3097624576. Throughput: 0: 49195.7. Samples: 2626497900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 07:09:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:09:03,861][71000] Updated weights for policy 0, policy_version 189074 (0.0022) [2024-06-13 07:09:05,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3097870336. Throughput: 0: 49471.7. Samples: 2626653600. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 07:09:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:09:07,191][71000] Updated weights for policy 0, policy_version 189084 (0.0031) [2024-06-13 07:09:10,625][71000] Updated weights for policy 0, policy_version 189094 (0.0032) [2024-06-13 07:09:10,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49154.0, 300 sec: 49152.0). Total num frames: 3098132480. Throughput: 0: 49334.6. Samples: 2626941120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 07:09:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:09:13,842][71000] Updated weights for policy 0, policy_version 189104 (0.0028) [2024-06-13 07:09:15,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 3098378240. Throughput: 0: 49184.9. Samples: 2627235360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 07:09:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:09:17,077][71000] Updated weights for policy 0, policy_version 189114 (0.0025) [2024-06-13 07:09:19,025][70980] Signal inference workers to stop experience collection... (39300 times) [2024-06-13 07:09:19,026][70980] Signal inference workers to resume experience collection... (39300 times) [2024-06-13 07:09:19,066][71000] InferenceWorker_p0-w0: stopping experience collection (39300 times) [2024-06-13 07:09:19,066][71000] InferenceWorker_p0-w0: resuming experience collection (39300 times) [2024-06-13 07:09:20,573][71000] Updated weights for policy 0, policy_version 189124 (0.0025) [2024-06-13 07:09:20,939][70768] Fps is (10 sec: 47514.5, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3098607616. Throughput: 0: 49479.5. Samples: 2627392400. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 07:09:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:09:23,815][71000] Updated weights for policy 0, policy_version 189134 (0.0023) [2024-06-13 07:09:25,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 3098853376. Throughput: 0: 49437.3. Samples: 2627691300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 07:09:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:09:27,066][71000] Updated weights for policy 0, policy_version 189144 (0.0025) [2024-06-13 07:09:30,482][71000] Updated weights for policy 0, policy_version 189154 (0.0027) [2024-06-13 07:09:30,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3099099136. Throughput: 0: 49410.6. Samples: 2627981340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 07:09:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 07:09:33,771][71000] Updated weights for policy 0, policy_version 189164 (0.0021) [2024-06-13 07:09:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 3099361280. Throughput: 0: 49232.5. Samples: 2628128040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 07:09:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:09:37,257][71000] Updated weights for policy 0, policy_version 189174 (0.0025) [2024-06-13 07:09:40,308][71000] Updated weights for policy 0, policy_version 189184 (0.0026) [2024-06-13 07:09:40,940][70768] Fps is (10 sec: 50789.1, 60 sec: 49424.8, 300 sec: 49152.0). Total num frames: 3099607040. Throughput: 0: 49474.0. Samples: 2628429960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 07:09:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:09:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000189185_3099607040.pth... 
[2024-06-13 07:09:40,991][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000188467_3087843328.pth [2024-06-13 07:09:43,867][71000] Updated weights for policy 0, policy_version 189194 (0.0023) [2024-06-13 07:09:45,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49698.3, 300 sec: 49207.5). Total num frames: 3099869184. Throughput: 0: 49599.9. Samples: 2628729900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 07:09:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 07:09:46,832][71000] Updated weights for policy 0, policy_version 189204 (0.0032) [2024-06-13 07:09:50,354][71000] Updated weights for policy 0, policy_version 189214 (0.0035) [2024-06-13 07:09:50,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3100082176. Throughput: 0: 49270.6. Samples: 2628870780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 07:09:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:09:53,816][71000] Updated weights for policy 0, policy_version 189224 (0.0033) [2024-06-13 07:09:55,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3100344320. Throughput: 0: 49269.4. Samples: 2629158240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 07:09:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:09:57,810][71000] Updated weights for policy 0, policy_version 189234 (0.0023) [2024-06-13 07:10:00,742][71000] Updated weights for policy 0, policy_version 189244 (0.0026) [2024-06-13 07:10:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 3100590080. Throughput: 0: 49244.5. Samples: 2629451360. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 07:10:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:10:04,289][71000] Updated weights for policy 0, policy_version 189254 (0.0031) [2024-06-13 07:10:05,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 3100835840. Throughput: 0: 49056.9. Samples: 2629599960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 07:10:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:10:07,238][71000] Updated weights for policy 0, policy_version 189264 (0.0026) [2024-06-13 07:10:10,723][71000] Updated weights for policy 0, policy_version 189274 (0.0031) [2024-06-13 07:10:10,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3101065216. Throughput: 0: 49182.1. Samples: 2629904500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 07:10:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:10:13,976][71000] Updated weights for policy 0, policy_version 189284 (0.0040) [2024-06-13 07:10:15,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 3101327360. Throughput: 0: 49038.4. Samples: 2630188080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 07:10:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:10:17,545][71000] Updated weights for policy 0, policy_version 189294 (0.0031) [2024-06-13 07:10:20,518][71000] Updated weights for policy 0, policy_version 189304 (0.0036) [2024-06-13 07:10:20,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 3101589504. Throughput: 0: 49191.6. Samples: 2630341660. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 07:10:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:10:24,035][71000] Updated weights for policy 0, policy_version 189314 (0.0030) [2024-06-13 07:10:25,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49425.1, 300 sec: 49041.0). Total num frames: 3101818880. Throughput: 0: 49108.3. Samples: 2630639820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 07:10:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:10:26,918][70980] Signal inference workers to stop experience collection... (39350 times) [2024-06-13 07:10:26,918][70980] Signal inference workers to resume experience collection... (39350 times) [2024-06-13 07:10:26,952][71000] InferenceWorker_p0-w0: stopping experience collection (39350 times) [2024-06-13 07:10:26,952][71000] InferenceWorker_p0-w0: resuming experience collection (39350 times) [2024-06-13 07:10:27,079][71000] Updated weights for policy 0, policy_version 189324 (0.0027) [2024-06-13 07:10:30,571][71000] Updated weights for policy 0, policy_version 189334 (0.0031) [2024-06-13 07:10:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3102048256. Throughput: 0: 49050.8. Samples: 2630937180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 07:10:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:10:33,747][71000] Updated weights for policy 0, policy_version 189344 (0.0029) [2024-06-13 07:10:35,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3102294016. Throughput: 0: 49205.3. Samples: 2631085020. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 07:10:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:10:37,244][71000] Updated weights for policy 0, policy_version 189354 (0.0033) [2024-06-13 07:10:40,341][71000] Updated weights for policy 0, policy_version 189364 (0.0033) [2024-06-13 07:10:40,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.3, 300 sec: 49207.5). Total num frames: 3102572544. Throughput: 0: 49118.4. Samples: 2631368560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 07:10:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:10:43,829][71000] Updated weights for policy 0, policy_version 189374 (0.0027) [2024-06-13 07:10:45,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 3102801920. Throughput: 0: 49282.7. Samples: 2631669080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 07:10:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:10:46,982][71000] Updated weights for policy 0, policy_version 189384 (0.0029) [2024-06-13 07:10:50,710][71000] Updated weights for policy 0, policy_version 189394 (0.0034) [2024-06-13 07:10:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 3103047680. Throughput: 0: 49248.4. Samples: 2631816140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 07:10:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:10:53,671][71000] Updated weights for policy 0, policy_version 189404 (0.0033) [2024-06-13 07:10:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 3103277056. Throughput: 0: 49029.0. Samples: 2632110800. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 07:10:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:10:57,219][71000] Updated weights for policy 0, policy_version 189414 (0.0032) [2024-06-13 07:11:00,171][71000] Updated weights for policy 0, policy_version 189424 (0.0030) [2024-06-13 07:11:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3103555584. Throughput: 0: 49284.6. Samples: 2632405880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 07:11:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:11:03,442][71000] Updated weights for policy 0, policy_version 189434 (0.0030) [2024-06-13 07:11:05,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3103768576. Throughput: 0: 49308.1. Samples: 2632560520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 07:11:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:11:06,919][71000] Updated weights for policy 0, policy_version 189444 (0.0035) [2024-06-13 07:11:10,744][71000] Updated weights for policy 0, policy_version 189454 (0.0035) [2024-06-13 07:11:10,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3104014336. Throughput: 0: 49157.7. Samples: 2632851920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 07:11:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:11:13,488][71000] Updated weights for policy 0, policy_version 189464 (0.0028) [2024-06-13 07:11:15,940][70768] Fps is (10 sec: 49150.9, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3104260096. Throughput: 0: 49034.5. Samples: 2633143740. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 07:11:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:11:17,257][71000] Updated weights for policy 0, policy_version 189474 (0.0022) [2024-06-13 07:11:20,012][71000] Updated weights for policy 0, policy_version 189484 (0.0024) [2024-06-13 07:11:20,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 3104522240. Throughput: 0: 48975.1. Samples: 2633288900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 07:11:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:11:23,436][70980] Signal inference workers to stop experience collection... (39400 times) [2024-06-13 07:11:23,436][70980] Signal inference workers to resume experience collection... (39400 times) [2024-06-13 07:11:23,468][71000] InferenceWorker_p0-w0: stopping experience collection (39400 times) [2024-06-13 07:11:23,469][71000] InferenceWorker_p0-w0: resuming experience collection (39400 times) [2024-06-13 07:11:23,570][71000] Updated weights for policy 0, policy_version 189494 (0.0024) [2024-06-13 07:11:25,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48878.6, 300 sec: 48985.3). Total num frames: 3104751616. Throughput: 0: 49479.6. Samples: 2633595160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 07:11:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:11:26,700][71000] Updated weights for policy 0, policy_version 189504 (0.0034) [2024-06-13 07:11:30,559][71000] Updated weights for policy 0, policy_version 189514 (0.0039) [2024-06-13 07:11:30,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3104997376. Throughput: 0: 49255.8. Samples: 2633885600. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 07:11:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:11:33,536][71000] Updated weights for policy 0, policy_version 189524 (0.0026) [2024-06-13 07:11:35,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3105243136. Throughput: 0: 49020.7. Samples: 2634022080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 07:11:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:11:37,441][71000] Updated weights for policy 0, policy_version 189534 (0.0029) [2024-06-13 07:11:40,325][71000] Updated weights for policy 0, policy_version 189544 (0.0030) [2024-06-13 07:11:40,939][70768] Fps is (10 sec: 49152.8, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 3105488896. Throughput: 0: 49176.1. Samples: 2634323720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 07:11:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:11:40,984][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000189545_3105505280.pth... [2024-06-13 07:11:41,031][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000188824_3093692416.pth [2024-06-13 07:11:44,079][71000] Updated weights for policy 0, policy_version 189554 (0.0020) [2024-06-13 07:11:45,942][70768] Fps is (10 sec: 49141.9, 60 sec: 48877.1, 300 sec: 49096.1). Total num frames: 3105734656. Throughput: 0: 49039.4. Samples: 2634612760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 07:11:45,942][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:11:46,775][71000] Updated weights for policy 0, policy_version 189564 (0.0026) [2024-06-13 07:11:50,893][71000] Updated weights for policy 0, policy_version 189574 (0.0035) [2024-06-13 07:11:50,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 3105980416. Throughput: 0: 48682.0. Samples: 2634751220. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 07:11:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:11:53,882][71000] Updated weights for policy 0, policy_version 189584 (0.0027) [2024-06-13 07:11:55,940][70768] Fps is (10 sec: 49162.8, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 3106226176. Throughput: 0: 48619.6. Samples: 2635039800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 07:11:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:11:57,559][71000] Updated weights for policy 0, policy_version 189594 (0.0028) [2024-06-13 07:12:00,239][71000] Updated weights for policy 0, policy_version 189604 (0.0033) [2024-06-13 07:12:00,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48605.8, 300 sec: 49096.5). Total num frames: 3106471936. Throughput: 0: 48790.4. Samples: 2635339300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-13 07:12:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:12:03,975][71000] Updated weights for policy 0, policy_version 189614 (0.0028) [2024-06-13 07:12:05,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3106717696. Throughput: 0: 49051.6. Samples: 2635496220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 07:12:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:12:07,047][71000] Updated weights for policy 0, policy_version 189624 (0.0026) [2024-06-13 07:12:10,894][71000] Updated weights for policy 0, policy_version 189634 (0.0030) [2024-06-13 07:12:10,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 3106963456. Throughput: 0: 48771.3. Samples: 2635789860. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 07:12:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:12:13,713][71000] Updated weights for policy 0, policy_version 189644 (0.0026) [2024-06-13 07:12:15,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 3107192832. Throughput: 0: 48748.2. Samples: 2636079260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 07:12:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:12:17,689][71000] Updated weights for policy 0, policy_version 189654 (0.0030) [2024-06-13 07:12:20,550][71000] Updated weights for policy 0, policy_version 189664 (0.0029) [2024-06-13 07:12:20,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3107454976. Throughput: 0: 49008.2. Samples: 2636227440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 07:12:20,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:12:23,921][71000] Updated weights for policy 0, policy_version 189674 (0.0027) [2024-06-13 07:12:25,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.4, 300 sec: 49152.0). Total num frames: 3107717120. Throughput: 0: 48854.7. Samples: 2636522180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 07:12:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:12:27,074][71000] Updated weights for policy 0, policy_version 189684 (0.0034) [2024-06-13 07:12:30,742][71000] Updated weights for policy 0, policy_version 189694 (0.0037) [2024-06-13 07:12:30,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 3107946496. Throughput: 0: 49204.7. Samples: 2636826860. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 07:12:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:12:33,649][71000] Updated weights for policy 0, policy_version 189704 (0.0029) [2024-06-13 07:12:35,499][70980] Signal inference workers to stop experience collection... (39450 times) [2024-06-13 07:12:35,499][70980] Signal inference workers to resume experience collection... (39450 times) [2024-06-13 07:12:35,517][71000] InferenceWorker_p0-w0: stopping experience collection (39450 times) [2024-06-13 07:12:35,517][71000] InferenceWorker_p0-w0: resuming experience collection (39450 times) [2024-06-13 07:12:35,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 3108192256. Throughput: 0: 49168.2. Samples: 2636963780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 07:12:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:12:37,213][71000] Updated weights for policy 0, policy_version 189714 (0.0034) [2024-06-13 07:12:40,414][71000] Updated weights for policy 0, policy_version 189724 (0.0027) [2024-06-13 07:12:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 3108438016. Throughput: 0: 49293.8. Samples: 2637258020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 07:12:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:12:44,157][71000] Updated weights for policy 0, policy_version 189734 (0.0024) [2024-06-13 07:12:45,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49426.9, 300 sec: 49207.5). Total num frames: 3108700160. Throughput: 0: 49249.4. Samples: 2637555520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 07:12:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:12:47,051][71000] Updated weights for policy 0, policy_version 189744 (0.0026) [2024-06-13 07:12:50,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.1, 300 sec: 49207.5). Total num frames: 3108913152. 
Throughput: 0: 49088.5. Samples: 2637705200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 07:12:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:12:51,056][71000] Updated weights for policy 0, policy_version 189754 (0.0023) [2024-06-13 07:12:53,989][71000] Updated weights for policy 0, policy_version 189764 (0.0023) [2024-06-13 07:12:55,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 3109158912. Throughput: 0: 48778.2. Samples: 2637984880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 07:12:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:12:57,610][71000] Updated weights for policy 0, policy_version 189774 (0.0030) [2024-06-13 07:13:00,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 3109404672. Throughput: 0: 48827.1. Samples: 2638276480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 07:13:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:13:00,963][71000] Updated weights for policy 0, policy_version 189784 (0.0021) [2024-06-13 07:13:04,230][71000] Updated weights for policy 0, policy_version 189794 (0.0026) [2024-06-13 07:13:05,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49151.9, 300 sec: 49096.9). Total num frames: 3109666816. Throughput: 0: 48979.0. Samples: 2638431500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 07:13:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:13:07,371][71000] Updated weights for policy 0, policy_version 189804 (0.0035) [2024-06-13 07:13:10,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 3109896192. Throughput: 0: 48993.3. Samples: 2638726880. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:13:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:13:11,035][71000] Updated weights for policy 0, policy_version 189814 (0.0037) [2024-06-13 07:13:14,327][71000] Updated weights for policy 0, policy_version 189824 (0.0023) [2024-06-13 07:13:15,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.7, 300 sec: 49096.4). Total num frames: 3110125568. Throughput: 0: 48670.0. Samples: 2639017020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:13:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:13:17,623][71000] Updated weights for policy 0, policy_version 189834 (0.0025) [2024-06-13 07:13:20,684][71000] Updated weights for policy 0, policy_version 189844 (0.0022) [2024-06-13 07:13:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 3110404096. Throughput: 0: 48890.6. Samples: 2639163860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:13:20,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:13:24,286][71000] Updated weights for policy 0, policy_version 189854 (0.0030) [2024-06-13 07:13:25,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 3110649856. Throughput: 0: 48978.1. Samples: 2639462040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:13:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:13:27,910][71000] Updated weights for policy 0, policy_version 189864 (0.0028) [2024-06-13 07:13:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 49096.4). Total num frames: 3110879232. Throughput: 0: 48748.4. Samples: 2639749200. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:13:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:13:31,130][71000] Updated weights for policy 0, policy_version 189874 (0.0026) [2024-06-13 07:13:35,087][71000] Updated weights for policy 0, policy_version 189884 (0.0031) [2024-06-13 07:13:35,940][70768] Fps is (10 sec: 44237.3, 60 sec: 48332.8, 300 sec: 48985.4). Total num frames: 3111092224. Throughput: 0: 48449.3. Samples: 2639885420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:13:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:13:37,971][71000] Updated weights for policy 0, policy_version 189894 (0.0035) [2024-06-13 07:13:38,445][70980] Signal inference workers to stop experience collection... (39500 times) [2024-06-13 07:13:38,446][70980] Signal inference workers to resume experience collection... (39500 times) [2024-06-13 07:13:38,488][71000] InferenceWorker_p0-w0: stopping experience collection (39500 times) [2024-06-13 07:13:38,488][71000] InferenceWorker_p0-w0: resuming experience collection (39500 times) [2024-06-13 07:13:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.8, 300 sec: 49041.0). Total num frames: 3111354368. Throughput: 0: 48716.9. Samples: 2640177140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:13:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:13:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000189902_3111354368.pth... [2024-06-13 07:13:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000189185_3099607040.pth [2024-06-13 07:13:41,451][71000] Updated weights for policy 0, policy_version 189904 (0.0025) [2024-06-13 07:13:44,502][71000] Updated weights for policy 0, policy_version 189914 (0.0036) [2024-06-13 07:13:45,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48605.8, 300 sec: 49096.5). Total num frames: 3111616512. Throughput: 0: 48773.6. Samples: 2640471300. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:13:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:13:48,203][71000] Updated weights for policy 0, policy_version 189924 (0.0032) [2024-06-13 07:13:50,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 3111829504. Throughput: 0: 48699.6. Samples: 2640622980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:13:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:13:51,275][71000] Updated weights for policy 0, policy_version 189934 (0.0037) [2024-06-13 07:13:54,868][71000] Updated weights for policy 0, policy_version 189944 (0.0035) [2024-06-13 07:13:55,939][70768] Fps is (10 sec: 45875.9, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 3112075264. Throughput: 0: 48585.4. Samples: 2640913220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:13:55,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 07:13:57,944][71000] Updated weights for policy 0, policy_version 189954 (0.0030) [2024-06-13 07:14:00,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 3112337408. Throughput: 0: 48667.6. Samples: 2641207060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:14:00,949][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:14:01,566][71000] Updated weights for policy 0, policy_version 189964 (0.0032) [2024-06-13 07:14:04,459][71000] Updated weights for policy 0, policy_version 189974 (0.0026) [2024-06-13 07:14:05,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3112583168. Throughput: 0: 48868.9. Samples: 2641362960. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:14:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:14:08,046][71000] Updated weights for policy 0, policy_version 189984 (0.0036) [2024-06-13 07:14:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.7, 300 sec: 48929.8). Total num frames: 3112812544. Throughput: 0: 48552.9. Samples: 2641646920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:14:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:14:11,430][71000] Updated weights for policy 0, policy_version 189994 (0.0029) [2024-06-13 07:14:15,120][71000] Updated weights for policy 0, policy_version 190004 (0.0029) [2024-06-13 07:14:15,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 3113058304. Throughput: 0: 48755.1. Samples: 2641943180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 07:14:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:14:18,130][71000] Updated weights for policy 0, policy_version 190014 (0.0032) [2024-06-13 07:14:20,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48332.8, 300 sec: 48985.4). Total num frames: 3113304064. Throughput: 0: 48789.4. Samples: 2642080940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 07:14:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:14:21,739][71000] Updated weights for policy 0, policy_version 190024 (0.0029) [2024-06-13 07:14:24,946][71000] Updated weights for policy 0, policy_version 190034 (0.0030) [2024-06-13 07:14:25,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 3113566208. Throughput: 0: 49085.8. Samples: 2642386000. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 07:14:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:14:28,239][71000] Updated weights for policy 0, policy_version 190044 (0.0033) [2024-06-13 07:14:30,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 3113779200. Throughput: 0: 48831.6. Samples: 2642668720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 07:14:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:14:31,778][71000] Updated weights for policy 0, policy_version 190054 (0.0034) [2024-06-13 07:14:35,358][71000] Updated weights for policy 0, policy_version 190064 (0.0033) [2024-06-13 07:14:35,940][70768] Fps is (10 sec: 47510.3, 60 sec: 49151.4, 300 sec: 48929.7). Total num frames: 3114041344. Throughput: 0: 48582.7. Samples: 2642809240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 07:14:35,941][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:14:38,355][71000] Updated weights for policy 0, policy_version 190074 (0.0037) [2024-06-13 07:14:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 3114270720. Throughput: 0: 48624.2. Samples: 2643101320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 07:14:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:14:41,861][71000] Updated weights for policy 0, policy_version 190084 (0.0024) [2024-06-13 07:14:44,701][70980] Signal inference workers to stop experience collection... (39550 times) [2024-06-13 07:14:44,732][71000] InferenceWorker_p0-w0: stopping experience collection (39550 times) [2024-06-13 07:14:44,757][70980] Signal inference workers to resume experience collection... 
(39550 times) [2024-06-13 07:14:44,757][71000] InferenceWorker_p0-w0: resuming experience collection (39550 times) [2024-06-13 07:14:44,899][71000] Updated weights for policy 0, policy_version 190094 (0.0022) [2024-06-13 07:14:45,939][70768] Fps is (10 sec: 50794.8, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 3114549248. Throughput: 0: 48819.7. Samples: 2643403940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 07:14:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:14:48,535][71000] Updated weights for policy 0, policy_version 190104 (0.0029) [2024-06-13 07:14:50,939][70768] Fps is (10 sec: 50791.5, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 3114778624. Throughput: 0: 48600.6. Samples: 2643549980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 07:14:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:14:51,860][71000] Updated weights for policy 0, policy_version 190114 (0.0026) [2024-06-13 07:14:55,402][71000] Updated weights for policy 0, policy_version 190124 (0.0024) [2024-06-13 07:14:55,940][70768] Fps is (10 sec: 45874.5, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 3115008000. Throughput: 0: 48790.7. Samples: 2643842500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 07:14:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:14:58,503][71000] Updated weights for policy 0, policy_version 190134 (0.0026) [2024-06-13 07:15:00,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3115253760. Throughput: 0: 48684.4. Samples: 2644133980. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 07:15:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:15:02,089][71000] Updated weights for policy 0, policy_version 190144 (0.0026) [2024-06-13 07:15:05,330][71000] Updated weights for policy 0, policy_version 190154 (0.0030) [2024-06-13 07:15:05,940][70768] Fps is (10 sec: 50791.2, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3115515904. Throughput: 0: 48840.0. Samples: 2644278740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 07:15:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:15:09,090][71000] Updated weights for policy 0, policy_version 190164 (0.0027) [2024-06-13 07:15:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3115745280. Throughput: 0: 48464.5. Samples: 2644566900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 07:15:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:15:12,030][71000] Updated weights for policy 0, policy_version 190174 (0.0029) [2024-06-13 07:15:15,663][71000] Updated weights for policy 0, policy_version 190184 (0.0031) [2024-06-13 07:15:15,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 3115974656. Throughput: 0: 48661.4. Samples: 2644858480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 07:15:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:15:18,414][71000] Updated weights for policy 0, policy_version 190194 (0.0030) [2024-06-13 07:15:20,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 3116220416. Throughput: 0: 48743.9. Samples: 2645002680. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 07:15:20,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:15:22,258][71000] Updated weights for policy 0, policy_version 190204 (0.0034) [2024-06-13 07:15:25,237][71000] Updated weights for policy 0, policy_version 190214 (0.0029) [2024-06-13 07:15:25,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 3116482560. Throughput: 0: 48958.8. Samples: 2645304460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-13 07:15:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:15:28,865][71000] Updated weights for policy 0, policy_version 190224 (0.0043) [2024-06-13 07:15:30,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 3116728320. Throughput: 0: 48781.3. Samples: 2645599100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-13 07:15:30,944][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:15:31,880][71000] Updated weights for policy 0, policy_version 190234 (0.0024) [2024-06-13 07:15:35,932][71000] Updated weights for policy 0, policy_version 190244 (0.0027) [2024-06-13 07:15:35,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48606.6, 300 sec: 48763.2). Total num frames: 3116957696. Throughput: 0: 48683.6. Samples: 2645740740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-13 07:15:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:15:38,421][71000] Updated weights for policy 0, policy_version 190254 (0.0030) [2024-06-13 07:15:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 3117203456. Throughput: 0: 48710.4. Samples: 2646034460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-13 07:15:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:15:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000190259_3117203456.pth... 
[2024-06-13 07:15:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000189545_3105505280.pth [2024-06-13 07:15:42,424][71000] Updated weights for policy 0, policy_version 190264 (0.0027) [2024-06-13 07:15:45,298][71000] Updated weights for policy 0, policy_version 190274 (0.0035) [2024-06-13 07:15:45,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 3117449216. Throughput: 0: 48570.8. Samples: 2646319660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-13 07:15:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:15:49,357][71000] Updated weights for policy 0, policy_version 190284 (0.0045) [2024-06-13 07:15:50,944][70768] Fps is (10 sec: 50768.2, 60 sec: 48875.4, 300 sec: 48929.1). Total num frames: 3117711360. Throughput: 0: 48748.6. Samples: 2646472640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-13 07:15:50,944][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:15:52,105][71000] Updated weights for policy 0, policy_version 190294 (0.0034) [2024-06-13 07:15:55,912][71000] Updated weights for policy 0, policy_version 190304 (0.0031) [2024-06-13 07:15:55,940][70768] Fps is (10 sec: 49150.9, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 3117940736. Throughput: 0: 48828.8. Samples: 2646764200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-13 07:15:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:15:58,591][71000] Updated weights for policy 0, policy_version 190314 (0.0022) [2024-06-13 07:16:00,940][70768] Fps is (10 sec: 45895.2, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 3118170112. Throughput: 0: 48892.5. Samples: 2647058640. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-13 07:16:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:16:02,572][71000] Updated weights for policy 0, policy_version 190324 (0.0032) [2024-06-13 07:16:04,510][70980] Signal inference workers to stop experience collection... (39600 times) [2024-06-13 07:16:04,560][71000] InferenceWorker_p0-w0: stopping experience collection (39600 times) [2024-06-13 07:16:04,616][70980] Signal inference workers to resume experience collection... (39600 times) [2024-06-13 07:16:04,616][71000] InferenceWorker_p0-w0: resuming experience collection (39600 times) [2024-06-13 07:16:05,456][71000] Updated weights for policy 0, policy_version 190334 (0.0027) [2024-06-13 07:16:05,939][70768] Fps is (10 sec: 49152.9, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3118432256. Throughput: 0: 48970.7. Samples: 2647206360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-13 07:16:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:16:09,431][71000] Updated weights for policy 0, policy_version 190344 (0.0025) [2024-06-13 07:16:10,944][70768] Fps is (10 sec: 52406.3, 60 sec: 49148.5, 300 sec: 48929.2). Total num frames: 3118694400. Throughput: 0: 48806.5. Samples: 2647500960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-13 07:16:10,944][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:16:12,511][71000] Updated weights for policy 0, policy_version 190354 (0.0032) [2024-06-13 07:16:15,940][70768] Fps is (10 sec: 45874.2, 60 sec: 48605.7, 300 sec: 48707.7). Total num frames: 3118891008. Throughput: 0: 48685.1. Samples: 2647789940. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-13 07:16:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:16:16,078][71000] Updated weights for policy 0, policy_version 190364 (0.0023) [2024-06-13 07:16:19,071][71000] Updated weights for policy 0, policy_version 190374 (0.0032) [2024-06-13 07:16:20,940][70768] Fps is (10 sec: 44255.8, 60 sec: 48605.9, 300 sec: 48763.3). Total num frames: 3119136768. Throughput: 0: 48447.9. Samples: 2647920900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-13 07:16:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:16:22,890][71000] Updated weights for policy 0, policy_version 190384 (0.0033) [2024-06-13 07:16:25,660][71000] Updated weights for policy 0, policy_version 190394 (0.0026) [2024-06-13 07:16:25,940][70768] Fps is (10 sec: 52429.0, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 3119415296. Throughput: 0: 48702.9. Samples: 2648226100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-13 07:16:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:16:29,591][71000] Updated weights for policy 0, policy_version 190404 (0.0025) [2024-06-13 07:16:30,940][70768] Fps is (10 sec: 54067.2, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 3119677440. Throughput: 0: 48975.5. Samples: 2648523560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-13 07:16:30,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 07:16:32,807][71000] Updated weights for policy 0, policy_version 190414 (0.0030) [2024-06-13 07:16:35,887][71000] Updated weights for policy 0, policy_version 190424 (0.0029) [2024-06-13 07:16:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.8, 300 sec: 48874.3). Total num frames: 3119906816. Throughput: 0: 48786.3. Samples: 2648667820. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-13 07:16:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:16:39,169][71000] Updated weights for policy 0, policy_version 190434 (0.0023) [2024-06-13 07:16:40,940][70768] Fps is (10 sec: 44236.8, 60 sec: 48605.8, 300 sec: 48763.6). Total num frames: 3120119808. Throughput: 0: 48810.4. Samples: 2648960660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-13 07:16:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:16:42,521][71000] Updated weights for policy 0, policy_version 190444 (0.0028) [2024-06-13 07:16:45,658][71000] Updated weights for policy 0, policy_version 190454 (0.0028) [2024-06-13 07:16:45,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 3120398336. Throughput: 0: 48827.0. Samples: 2649255860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-13 07:16:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:16:49,418][71000] Updated weights for policy 0, policy_version 190464 (0.0026) [2024-06-13 07:16:50,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48882.4, 300 sec: 48874.3). Total num frames: 3120644096. Throughput: 0: 48926.1. Samples: 2649408040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-13 07:16:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:16:52,862][71000] Updated weights for policy 0, policy_version 190474 (0.0025) [2024-06-13 07:16:55,866][71000] Updated weights for policy 0, policy_version 190484 (0.0021) [2024-06-13 07:16:55,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.2, 300 sec: 48874.3). Total num frames: 3120889856. Throughput: 0: 49074.5. Samples: 2649709100. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-13 07:16:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:16:59,423][71000] Updated weights for policy 0, policy_version 190494 (0.0031) [2024-06-13 07:17:00,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 3121102848. Throughput: 0: 49063.8. Samples: 2649997800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-13 07:17:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 07:17:02,998][71000] Updated weights for policy 0, policy_version 190504 (0.0030) [2024-06-13 07:17:05,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48878.8, 300 sec: 48818.8). Total num frames: 3121364992. Throughput: 0: 49243.0. Samples: 2650136840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-13 07:17:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:17:06,202][71000] Updated weights for policy 0, policy_version 190514 (0.0029) [2024-06-13 07:17:09,255][70980] Signal inference workers to stop experience collection... (39650 times) [2024-06-13 07:17:09,298][71000] InferenceWorker_p0-w0: stopping experience collection (39650 times) [2024-06-13 07:17:09,307][70980] Signal inference workers to resume experience collection... (39650 times) [2024-06-13 07:17:09,316][71000] InferenceWorker_p0-w0: resuming experience collection (39650 times) [2024-06-13 07:17:09,434][71000] Updated weights for policy 0, policy_version 190524 (0.0022) [2024-06-13 07:17:10,940][70768] Fps is (10 sec: 52428.3, 60 sec: 48882.4, 300 sec: 48929.8). Total num frames: 3121627136. Throughput: 0: 49160.1. Samples: 2650438300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-13 07:17:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:17:13,079][71000] Updated weights for policy 0, policy_version 190534 (0.0027) [2024-06-13 07:17:15,942][70768] Fps is (10 sec: 47504.0, 60 sec: 49150.4, 300 sec: 48762.9). Total num frames: 3121840128. Throughput: 0: 49087.5. 
Samples: 2650732600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-13 07:17:15,942][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:17:16,220][71000] Updated weights for policy 0, policy_version 190544 (0.0032) [2024-06-13 07:17:19,533][71000] Updated weights for policy 0, policy_version 190554 (0.0026) [2024-06-13 07:17:20,940][70768] Fps is (10 sec: 45875.5, 60 sec: 49152.0, 300 sec: 48707.7). Total num frames: 3122085888. Throughput: 0: 49045.9. Samples: 2650874880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-13 07:17:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:17:22,693][71000] Updated weights for policy 0, policy_version 190564 (0.0030) [2024-06-13 07:17:25,940][70768] Fps is (10 sec: 50800.5, 60 sec: 48878.9, 300 sec: 48818.7). Total num frames: 3122348032. Throughput: 0: 48894.5. Samples: 2651160920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-13 07:17:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:17:26,419][71000] Updated weights for policy 0, policy_version 190574 (0.0031) [2024-06-13 07:17:29,408][71000] Updated weights for policy 0, policy_version 190584 (0.0027) [2024-06-13 07:17:30,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 3122577408. Throughput: 0: 48939.2. Samples: 2651458120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-13 07:17:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:17:33,069][71000] Updated weights for policy 0, policy_version 190594 (0.0025) [2024-06-13 07:17:35,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 3122823168. Throughput: 0: 48795.9. Samples: 2651603860. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 07:17:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:17:36,298][71000] Updated weights for policy 0, policy_version 190604 (0.0036) [2024-06-13 07:17:39,664][71000] Updated weights for policy 0, policy_version 190614 (0.0031) [2024-06-13 07:17:40,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49151.9, 300 sec: 48707.7). Total num frames: 3123068928. Throughput: 0: 48598.8. Samples: 2651896060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 07:17:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:17:40,957][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000190617_3123068928.pth... [2024-06-13 07:17:41,011][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000189902_3111354368.pth [2024-06-13 07:17:42,955][71000] Updated weights for policy 0, policy_version 190624 (0.0022) [2024-06-13 07:17:45,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3123331072. Throughput: 0: 48762.5. Samples: 2652192120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 07:17:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:17:46,388][71000] Updated weights for policy 0, policy_version 190634 (0.0034) [2024-06-13 07:17:49,479][71000] Updated weights for policy 0, policy_version 190644 (0.0030) [2024-06-13 07:17:50,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3123576832. Throughput: 0: 49057.8. Samples: 2652344440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 07:17:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:17:52,723][71000] Updated weights for policy 0, policy_version 190654 (0.0027) [2024-06-13 07:17:55,939][70768] Fps is (10 sec: 49153.0, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3123822592. Throughput: 0: 49014.9. Samples: 2652643960. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 07:17:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:17:56,227][71000] Updated weights for policy 0, policy_version 190664 (0.0021) [2024-06-13 07:17:59,435][71000] Updated weights for policy 0, policy_version 190674 (0.0025) [2024-06-13 07:18:00,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 3124051968. Throughput: 0: 49117.9. Samples: 2652942800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 07:18:00,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:18:02,779][71000] Updated weights for policy 0, policy_version 190684 (0.0026) [2024-06-13 07:18:05,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.3, 300 sec: 48929.9). Total num frames: 3124330496. Throughput: 0: 49062.4. Samples: 2653082680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 07:18:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:18:05,951][71000] Updated weights for policy 0, policy_version 190694 (0.0024) [2024-06-13 07:18:09,442][71000] Updated weights for policy 0, policy_version 190704 (0.0028) [2024-06-13 07:18:10,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3124576256. Throughput: 0: 49333.4. Samples: 2653380920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 07:18:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:18:12,814][71000] Updated weights for policy 0, policy_version 190714 (0.0026) [2024-06-13 07:18:15,939][70768] Fps is (10 sec: 45875.1, 60 sec: 49153.8, 300 sec: 48763.2). Total num frames: 3124789248. Throughput: 0: 49213.8. Samples: 2653672740. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 07:18:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:18:16,097][71000] Updated weights for policy 0, policy_version 190724 (0.0026) [2024-06-13 07:18:19,589][71000] Updated weights for policy 0, policy_version 190734 (0.0030) [2024-06-13 07:18:20,939][70768] Fps is (10 sec: 45875.8, 60 sec: 49152.0, 300 sec: 48763.3). Total num frames: 3125035008. Throughput: 0: 49273.5. Samples: 2653821160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 07:18:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:18:22,841][71000] Updated weights for policy 0, policy_version 190744 (0.0027) [2024-06-13 07:18:23,052][70980] Signal inference workers to stop experience collection... (39700 times) [2024-06-13 07:18:23,053][70980] Signal inference workers to resume experience collection... (39700 times) [2024-06-13 07:18:23,084][71000] InferenceWorker_p0-w0: stopping experience collection (39700 times) [2024-06-13 07:18:23,084][71000] InferenceWorker_p0-w0: resuming experience collection (39700 times) [2024-06-13 07:18:25,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 3125297152. Throughput: 0: 49242.9. Samples: 2654111980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 07:18:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:18:26,434][71000] Updated weights for policy 0, policy_version 190754 (0.0030) [2024-06-13 07:18:29,472][71000] Updated weights for policy 0, policy_version 190764 (0.0039) [2024-06-13 07:18:30,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 3125526528. Throughput: 0: 49072.2. Samples: 2654400360. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 07:18:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:18:32,788][71000] Updated weights for policy 0, policy_version 190774 (0.0032) [2024-06-13 07:18:35,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 3125788672. Throughput: 0: 49187.1. Samples: 2654557860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 07:18:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:18:36,289][71000] Updated weights for policy 0, policy_version 190784 (0.0030) [2024-06-13 07:18:39,446][71000] Updated weights for policy 0, policy_version 190794 (0.0031) [2024-06-13 07:18:40,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48879.1, 300 sec: 48763.2). Total num frames: 3126001664. Throughput: 0: 48925.6. Samples: 2654845620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 07:18:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:18:42,913][71000] Updated weights for policy 0, policy_version 190804 (0.0026) [2024-06-13 07:18:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3126263808. Throughput: 0: 48790.9. Samples: 2655138400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 07:18:45,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:18:46,402][71000] Updated weights for policy 0, policy_version 190814 (0.0030) [2024-06-13 07:18:49,400][71000] Updated weights for policy 0, policy_version 190824 (0.0034) [2024-06-13 07:18:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 3126509568. Throughput: 0: 48981.2. Samples: 2655286840. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 07:18:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:18:52,855][71000] Updated weights for policy 0, policy_version 190834 (0.0022) [2024-06-13 07:18:55,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49151.9, 300 sec: 48929.9). Total num frames: 3126771712. Throughput: 0: 48994.3. Samples: 2655585660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 07:18:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:18:56,052][71000] Updated weights for policy 0, policy_version 190844 (0.0027) [2024-06-13 07:18:59,559][71000] Updated weights for policy 0, policy_version 190854 (0.0026) [2024-06-13 07:19:00,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 3126984704. Throughput: 0: 49072.4. Samples: 2655881000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 07:19:00,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:19:02,785][71000] Updated weights for policy 0, policy_version 190864 (0.0041) [2024-06-13 07:19:05,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.8, 300 sec: 48929.9). Total num frames: 3127246848. Throughput: 0: 48893.8. Samples: 2656021380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 07:19:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:19:06,414][71000] Updated weights for policy 0, policy_version 190874 (0.0034) [2024-06-13 07:19:09,473][71000] Updated weights for policy 0, policy_version 190884 (0.0027) [2024-06-13 07:19:10,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 3127492608. Throughput: 0: 48965.2. Samples: 2656315420. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 07:19:10,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:19:12,986][71000] Updated weights for policy 0, policy_version 190894 (0.0024) [2024-06-13 07:19:15,894][71000] Updated weights for policy 0, policy_version 190904 (0.0026) [2024-06-13 07:19:15,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.1, 300 sec: 49040.9). Total num frames: 3127771136. Throughput: 0: 49312.0. Samples: 2656619400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 07:19:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:19:19,675][71000] Updated weights for policy 0, policy_version 190914 (0.0026) [2024-06-13 07:19:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 3127984128. Throughput: 0: 49141.8. Samples: 2656769240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 07:19:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 07:19:22,751][71000] Updated weights for policy 0, policy_version 190924 (0.0030) [2024-06-13 07:19:25,940][70768] Fps is (10 sec: 44236.2, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 3128213504. Throughput: 0: 49037.3. Samples: 2657052300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 07:19:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:19:26,421][71000] Updated weights for policy 0, policy_version 190934 (0.0026) [2024-06-13 07:19:29,394][71000] Updated weights for policy 0, policy_version 190944 (0.0026) [2024-06-13 07:19:30,939][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.0, 300 sec: 48930.0). Total num frames: 3128475648. Throughput: 0: 49243.3. Samples: 2657354340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 07:19:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:19:32,885][71000] Updated weights for policy 0, policy_version 190954 (0.0026) [2024-06-13 07:19:35,419][70980] Signal inference workers to stop experience collection... 
(39750 times) [2024-06-13 07:19:35,419][70980] Signal inference workers to resume experience collection... (39750 times) [2024-06-13 07:19:35,460][71000] InferenceWorker_p0-w0: stopping experience collection (39750 times) [2024-06-13 07:19:35,460][71000] InferenceWorker_p0-w0: resuming experience collection (39750 times) [2024-06-13 07:19:35,939][70768] Fps is (10 sec: 52429.5, 60 sec: 49152.1, 300 sec: 49041.0). Total num frames: 3128737792. Throughput: 0: 49175.6. Samples: 2657499740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 07:19:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:19:35,997][71000] Updated weights for policy 0, policy_version 190964 (0.0029) [2024-06-13 07:19:39,670][71000] Updated weights for policy 0, policy_version 190974 (0.0026) [2024-06-13 07:19:40,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 3128967168. Throughput: 0: 49290.9. Samples: 2657803760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 07:19:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:19:41,068][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000190978_3128983552.pth... [2024-06-13 07:19:41,118][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000190259_3117203456.pth [2024-06-13 07:19:43,163][71000] Updated weights for policy 0, policy_version 190984 (0.0030) [2024-06-13 07:19:45,939][70768] Fps is (10 sec: 45875.3, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 3129196544. Throughput: 0: 48967.6. Samples: 2658084540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 07:19:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:19:46,399][71000] Updated weights for policy 0, policy_version 190994 (0.0033) [2024-06-13 07:19:49,806][71000] Updated weights for policy 0, policy_version 191004 (0.0028) [2024-06-13 07:19:50,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48878.9, 300 sec: 48929.9). 
Total num frames: 3129442304. Throughput: 0: 49016.9. Samples: 2658227140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 07:19:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:19:53,534][71000] Updated weights for policy 0, policy_version 191014 (0.0021) [2024-06-13 07:19:55,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 3129688064. Throughput: 0: 48745.9. Samples: 2658508980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 07:19:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:19:56,489][71000] Updated weights for policy 0, policy_version 191024 (0.0018) [2024-06-13 07:20:00,231][71000] Updated weights for policy 0, policy_version 191034 (0.0034) [2024-06-13 07:20:00,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3129917440. Throughput: 0: 48537.3. Samples: 2658803580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 07:20:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:20:03,422][71000] Updated weights for policy 0, policy_version 191044 (0.0026) [2024-06-13 07:20:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 3130179584. Throughput: 0: 48401.0. Samples: 2658947280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 07:20:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:20:07,010][71000] Updated weights for policy 0, policy_version 191054 (0.0028) [2024-06-13 07:20:10,242][71000] Updated weights for policy 0, policy_version 191064 (0.0027) [2024-06-13 07:20:10,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 3130408960. Throughput: 0: 48664.8. Samples: 2659242220. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 07:20:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:20:13,706][71000] Updated weights for policy 0, policy_version 191074 (0.0032) [2024-06-13 07:20:15,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48332.6, 300 sec: 48985.4). Total num frames: 3130671104. Throughput: 0: 48508.6. Samples: 2659537240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 07:20:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:20:16,728][71000] Updated weights for policy 0, policy_version 191084 (0.0033) [2024-06-13 07:20:20,468][71000] Updated weights for policy 0, policy_version 191094 (0.0027) [2024-06-13 07:20:20,940][70768] Fps is (10 sec: 50791.3, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3130916864. Throughput: 0: 48553.3. Samples: 2659684640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 07:20:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:20:23,458][71000] Updated weights for policy 0, policy_version 191104 (0.0035) [2024-06-13 07:20:25,939][70768] Fps is (10 sec: 47514.6, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3131146240. Throughput: 0: 48470.9. Samples: 2659984940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 07:20:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:20:27,122][71000] Updated weights for policy 0, policy_version 191114 (0.0024) [2024-06-13 07:20:29,839][71000] Updated weights for policy 0, policy_version 191124 (0.0026) [2024-06-13 07:20:30,939][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 3131392000. Throughput: 0: 48732.4. Samples: 2660277500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 07:20:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:20:33,828][71000] Updated weights for policy 0, policy_version 191134 (0.0023) [2024-06-13 07:20:35,121][70980] Signal inference workers to stop experience collection... (39800 times) [2024-06-13 07:20:35,163][71000] InferenceWorker_p0-w0: stopping experience collection (39800 times) [2024-06-13 07:20:35,169][70980] Signal inference workers to resume experience collection... (39800 times) [2024-06-13 07:20:35,176][71000] InferenceWorker_p0-w0: resuming experience collection (39800 times) [2024-06-13 07:20:35,940][70768] Fps is (10 sec: 52427.8, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 3131670528. Throughput: 0: 48984.3. Samples: 2660431440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 07:20:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:20:36,909][71000] Updated weights for policy 0, policy_version 191144 (0.0024) [2024-06-13 07:20:40,297][71000] Updated weights for policy 0, policy_version 191154 (0.0022) [2024-06-13 07:20:40,940][70768] Fps is (10 sec: 52427.6, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3131916288. Throughput: 0: 49354.1. Samples: 2660729920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 07:20:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:20:43,686][71000] Updated weights for policy 0, policy_version 191164 (0.0036) [2024-06-13 07:20:45,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49152.0, 300 sec: 48930.6). Total num frames: 3132145664. Throughput: 0: 49103.6. Samples: 2661013240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 07:20:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:20:47,002][71000] Updated weights for policy 0, policy_version 191174 (0.0027) [2024-06-13 07:20:50,647][71000] Updated weights for policy 0, policy_version 191184 (0.0030) [2024-06-13 07:20:50,939][70768] Fps is (10 sec: 45876.3, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3132375040. Throughput: 0: 49074.7. Samples: 2661155640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-13 07:20:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 07:20:53,860][71000] Updated weights for policy 0, policy_version 191194 (0.0026) [2024-06-13 07:20:55,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3132637184. Throughput: 0: 48838.4. Samples: 2661439940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 07:20:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 07:20:57,229][71000] Updated weights for policy 0, policy_version 191204 (0.0023) [2024-06-13 07:21:00,461][71000] Updated weights for policy 0, policy_version 191214 (0.0024) [2024-06-13 07:21:00,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 3132866560. Throughput: 0: 49060.6. Samples: 2661744960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 07:21:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 07:21:03,987][71000] Updated weights for policy 0, policy_version 191224 (0.0026) [2024-06-13 07:21:05,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 48875.0). Total num frames: 3133112320. Throughput: 0: 49090.6. Samples: 2661893720. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 07:21:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:21:06,867][71000] Updated weights for policy 0, policy_version 191234 (0.0027) [2024-06-13 07:21:10,588][71000] Updated weights for policy 0, policy_version 191244 (0.0026) [2024-06-13 07:21:10,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.2, 300 sec: 49041.0). Total num frames: 3133358080. Throughput: 0: 48821.8. Samples: 2662181920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 07:21:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 07:21:13,785][71000] Updated weights for policy 0, policy_version 191254 (0.0033) [2024-06-13 07:21:15,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 3133620224. Throughput: 0: 48726.1. Samples: 2662470180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 07:21:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:21:17,414][71000] Updated weights for policy 0, policy_version 191264 (0.0033) [2024-06-13 07:21:20,526][71000] Updated weights for policy 0, policy_version 191274 (0.0021) [2024-06-13 07:21:20,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 3133849600. Throughput: 0: 48683.6. Samples: 2662622200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 07:21:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:21:23,839][71000] Updated weights for policy 0, policy_version 191284 (0.0036) [2024-06-13 07:21:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 3134095360. Throughput: 0: 48571.2. Samples: 2662915620. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 07:21:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:21:26,936][71000] Updated weights for policy 0, policy_version 191294 (0.0020) [2024-06-13 07:21:30,518][71000] Updated weights for policy 0, policy_version 191304 (0.0028) [2024-06-13 07:21:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.8, 300 sec: 48929.8). Total num frames: 3134341120. Throughput: 0: 48990.0. Samples: 2663217800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 07:21:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:21:33,439][71000] Updated weights for policy 0, policy_version 191314 (0.0024) [2024-06-13 07:21:35,940][70768] Fps is (10 sec: 50791.2, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 3134603264. Throughput: 0: 48978.6. Samples: 2663359680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 07:21:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:21:37,027][70980] Signal inference workers to stop experience collection... (39850 times) [2024-06-13 07:21:37,028][70980] Signal inference workers to resume experience collection... (39850 times) [2024-06-13 07:21:37,052][71000] InferenceWorker_p0-w0: stopping experience collection (39850 times) [2024-06-13 07:21:37,052][71000] InferenceWorker_p0-w0: resuming experience collection (39850 times) [2024-06-13 07:21:37,159][71000] Updated weights for policy 0, policy_version 191324 (0.0027) [2024-06-13 07:21:40,213][71000] Updated weights for policy 0, policy_version 191334 (0.0022) [2024-06-13 07:21:40,940][70768] Fps is (10 sec: 49153.0, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 3134832640. Throughput: 0: 49327.1. Samples: 2663659660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 07:21:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:21:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000191336_3134849024.pth... 
[2024-06-13 07:21:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000190617_3123068928.pth [2024-06-13 07:21:43,693][71000] Updated weights for policy 0, policy_version 191344 (0.0030) [2024-06-13 07:21:45,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3135094784. Throughput: 0: 49354.8. Samples: 2663965920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 07:21:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:21:46,500][71000] Updated weights for policy 0, policy_version 191354 (0.0025) [2024-06-13 07:21:50,024][71000] Updated weights for policy 0, policy_version 191364 (0.0034) [2024-06-13 07:21:50,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 3135324160. Throughput: 0: 49348.1. Samples: 2664114380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 07:21:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:21:53,212][71000] Updated weights for policy 0, policy_version 191374 (0.0036) [2024-06-13 07:21:55,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3135553536. Throughput: 0: 49353.8. Samples: 2664402840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 07:21:55,940][70768] Avg episode reward: [(0, '0.270')] [2024-06-13 07:21:57,166][71000] Updated weights for policy 0, policy_version 191384 (0.0029) [2024-06-13 07:22:00,139][71000] Updated weights for policy 0, policy_version 191394 (0.0029) [2024-06-13 07:22:00,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3135815680. Throughput: 0: 49270.3. Samples: 2664687340. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 07:22:00,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 07:22:03,622][71000] Updated weights for policy 0, policy_version 191404 (0.0031) [2024-06-13 07:22:05,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 3136061440. Throughput: 0: 49289.4. Samples: 2664840220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 07:22:05,940][70768] Avg episode reward: [(0, '0.267')] [2024-06-13 07:22:06,623][71000] Updated weights for policy 0, policy_version 191414 (0.0028) [2024-06-13 07:22:10,106][71000] Updated weights for policy 0, policy_version 191424 (0.0030) [2024-06-13 07:22:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49041.3). Total num frames: 3136307200. Throughput: 0: 49357.4. Samples: 2665136700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 07:22:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 07:22:13,162][71000] Updated weights for policy 0, policy_version 191434 (0.0023) [2024-06-13 07:22:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3136552960. Throughput: 0: 49197.0. Samples: 2665431660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 07:22:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 07:22:16,908][71000] Updated weights for policy 0, policy_version 191444 (0.0027) [2024-06-13 07:22:19,967][71000] Updated weights for policy 0, policy_version 191454 (0.0024) [2024-06-13 07:22:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3136798720. Throughput: 0: 49168.4. Samples: 2665572260. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 07:22:20,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 07:22:23,895][71000] Updated weights for policy 0, policy_version 191464 (0.0032) [2024-06-13 07:22:25,944][70768] Fps is (10 sec: 50768.2, 60 sec: 49421.5, 300 sec: 49095.7). Total num frames: 3137060864. Throughput: 0: 49172.0. Samples: 2665872620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 07:22:25,945][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:22:26,600][71000] Updated weights for policy 0, policy_version 191474 (0.0023) [2024-06-13 07:22:30,322][71000] Updated weights for policy 0, policy_version 191484 (0.0025) [2024-06-13 07:22:30,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 3137273856. Throughput: 0: 48946.2. Samples: 2666168500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 07:22:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:22:33,322][71000] Updated weights for policy 0, policy_version 191494 (0.0023) [2024-06-13 07:22:35,940][70768] Fps is (10 sec: 47534.8, 60 sec: 48878.9, 300 sec: 49041.0). Total num frames: 3137536000. Throughput: 0: 48873.6. Samples: 2666313700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 07:22:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:22:37,040][71000] Updated weights for policy 0, policy_version 191504 (0.0032) [2024-06-13 07:22:39,882][71000] Updated weights for policy 0, policy_version 191514 (0.0037) [2024-06-13 07:22:40,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3137781760. Throughput: 0: 48992.9. Samples: 2666607520. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 07:22:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:22:43,701][71000] Updated weights for policy 0, policy_version 191524 (0.0032) [2024-06-13 07:22:45,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3138043904. Throughput: 0: 49236.0. Samples: 2666902960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 07:22:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:22:46,718][71000] Updated weights for policy 0, policy_version 191534 (0.0027) [2024-06-13 07:22:50,438][71000] Updated weights for policy 0, policy_version 191544 (0.0024) [2024-06-13 07:22:50,942][70768] Fps is (10 sec: 47500.5, 60 sec: 48876.6, 300 sec: 48929.4). Total num frames: 3138256896. Throughput: 0: 48967.7. Samples: 2667043900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 07:22:50,943][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:22:51,260][70980] Signal inference workers to stop experience collection... (39900 times) [2024-06-13 07:22:51,260][70980] Signal inference workers to resume experience collection... (39900 times) [2024-06-13 07:22:51,296][71000] InferenceWorker_p0-w0: stopping experience collection (39900 times) [2024-06-13 07:22:51,296][71000] InferenceWorker_p0-w0: resuming experience collection (39900 times) [2024-06-13 07:22:53,448][71000] Updated weights for policy 0, policy_version 191554 (0.0027) [2024-06-13 07:22:55,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3138502656. Throughput: 0: 49107.6. Samples: 2667346540. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 07:22:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:22:57,090][71000] Updated weights for policy 0, policy_version 191564 (0.0030) [2024-06-13 07:23:00,137][71000] Updated weights for policy 0, policy_version 191574 (0.0026) [2024-06-13 07:23:00,940][70768] Fps is (10 sec: 50803.6, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 3138764800. Throughput: 0: 48759.1. Samples: 2667625820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 07:23:00,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:23:03,980][71000] Updated weights for policy 0, policy_version 191584 (0.0030) [2024-06-13 07:23:05,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 3139010560. Throughput: 0: 49173.2. Samples: 2667785060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-13 07:23:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:23:06,867][71000] Updated weights for policy 0, policy_version 191594 (0.0033) [2024-06-13 07:23:10,815][71000] Updated weights for policy 0, policy_version 191604 (0.0034) [2024-06-13 07:23:10,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3139239936. Throughput: 0: 48915.5. Samples: 2668073600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 07:23:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:23:13,542][71000] Updated weights for policy 0, policy_version 191614 (0.0026) [2024-06-13 07:23:15,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3139485696. Throughput: 0: 48950.5. Samples: 2668371280. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 07:23:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:23:17,150][71000] Updated weights for policy 0, policy_version 191624 (0.0022) [2024-06-13 07:23:20,168][71000] Updated weights for policy 0, policy_version 191634 (0.0033) [2024-06-13 07:23:20,939][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3139764224. Throughput: 0: 49069.4. Samples: 2668521820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 07:23:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:23:23,983][71000] Updated weights for policy 0, policy_version 191644 (0.0031) [2024-06-13 07:23:25,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48882.5, 300 sec: 49040.9). Total num frames: 3139993600. Throughput: 0: 48866.2. Samples: 2668806500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 07:23:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:23:27,292][71000] Updated weights for policy 0, policy_version 191654 (0.0037) [2024-06-13 07:23:30,940][70768] Fps is (10 sec: 44236.0, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 3140206592. Throughput: 0: 48921.2. Samples: 2669104420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 07:23:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:23:30,949][71000] Updated weights for policy 0, policy_version 191664 (0.0024) [2024-06-13 07:23:33,822][71000] Updated weights for policy 0, policy_version 191674 (0.0026) [2024-06-13 07:23:35,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3140468736. Throughput: 0: 48865.2. Samples: 2669242700. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 07:23:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:23:37,359][71000] Updated weights for policy 0, policy_version 191684 (0.0028) [2024-06-13 07:23:40,383][71000] Updated weights for policy 0, policy_version 191694 (0.0027) [2024-06-13 07:23:40,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49151.8, 300 sec: 49040.9). Total num frames: 3140730880. Throughput: 0: 48859.7. Samples: 2669545240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 07:23:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:23:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000191695_3140730880.pth... [2024-06-13 07:23:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000190978_3128983552.pth [2024-06-13 07:23:44,124][71000] Updated weights for policy 0, policy_version 191704 (0.0032) [2024-06-13 07:23:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3140960256. Throughput: 0: 49059.2. Samples: 2669833480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 07:23:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:23:47,353][71000] Updated weights for policy 0, policy_version 191714 (0.0029) [2024-06-13 07:23:50,939][70768] Fps is (10 sec: 45876.4, 60 sec: 48881.2, 300 sec: 48874.3). Total num frames: 3141189632. Throughput: 0: 48588.7. Samples: 2669971540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 07:23:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:23:51,020][71000] Updated weights for policy 0, policy_version 191724 (0.0038) [2024-06-13 07:23:54,052][71000] Updated weights for policy 0, policy_version 191734 (0.0029) [2024-06-13 07:23:55,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3141451776. Throughput: 0: 48730.5. Samples: 2670266480. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 07:23:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:23:57,610][71000] Updated weights for policy 0, policy_version 191744 (0.0024) [2024-06-13 07:24:00,808][71000] Updated weights for policy 0, policy_version 191754 (0.0028) [2024-06-13 07:24:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 3141697536. Throughput: 0: 48923.7. Samples: 2670572840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 07:24:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:24:04,426][71000] Updated weights for policy 0, policy_version 191764 (0.0030) [2024-06-13 07:24:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3141943296. Throughput: 0: 48649.6. Samples: 2670711060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 07:24:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:24:07,662][71000] Updated weights for policy 0, policy_version 191774 (0.0025) [2024-06-13 07:24:08,355][70980] Signal inference workers to stop experience collection... (39950 times) [2024-06-13 07:24:08,356][70980] Signal inference workers to resume experience collection... (39950 times) [2024-06-13 07:24:08,391][71000] InferenceWorker_p0-w0: stopping experience collection (39950 times) [2024-06-13 07:24:08,391][71000] InferenceWorker_p0-w0: resuming experience collection (39950 times) [2024-06-13 07:24:10,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.9, 300 sec: 48818.7). Total num frames: 3142172672. Throughput: 0: 48746.6. Samples: 2671000100. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-13 07:24:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:24:11,209][71000] Updated weights for policy 0, policy_version 191784 (0.0030) [2024-06-13 07:24:14,605][71000] Updated weights for policy 0, policy_version 191794 (0.0024) [2024-06-13 07:24:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3142434816. Throughput: 0: 48490.2. Samples: 2671286480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 07:24:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:24:17,965][71000] Updated weights for policy 0, policy_version 191804 (0.0030) [2024-06-13 07:24:20,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48059.7, 300 sec: 48929.9). Total num frames: 3142647808. Throughput: 0: 48750.6. Samples: 2671436480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 07:24:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:24:21,122][71000] Updated weights for policy 0, policy_version 191814 (0.0030) [2024-06-13 07:24:24,456][71000] Updated weights for policy 0, policy_version 191824 (0.0026) [2024-06-13 07:24:25,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3142926336. Throughput: 0: 48740.1. Samples: 2671738540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 07:24:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:24:27,850][71000] Updated weights for policy 0, policy_version 191834 (0.0029) [2024-06-13 07:24:30,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3143155712. Throughput: 0: 48865.6. Samples: 2672032440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 07:24:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:24:31,223][71000] Updated weights for policy 0, policy_version 191844 (0.0035) [2024-06-13 07:24:34,419][71000] Updated weights for policy 0, policy_version 191854 (0.0030) [2024-06-13 07:24:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 3143417856. Throughput: 0: 49095.0. Samples: 2672180820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 07:24:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:24:37,619][71000] Updated weights for policy 0, policy_version 191864 (0.0037) [2024-06-13 07:24:40,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 3143647232. Throughput: 0: 49086.8. Samples: 2672475380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 07:24:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:24:40,986][71000] Updated weights for policy 0, policy_version 191874 (0.0030) [2024-06-13 07:24:44,475][71000] Updated weights for policy 0, policy_version 191884 (0.0024) [2024-06-13 07:24:45,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3143892992. Throughput: 0: 48668.9. Samples: 2672762940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 07:24:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:24:47,594][71000] Updated weights for policy 0, policy_version 191894 (0.0027) [2024-06-13 07:24:50,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 3144122368. Throughput: 0: 48903.2. Samples: 2672911700. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 07:24:50,950][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:24:51,299][71000] Updated weights for policy 0, policy_version 191904 (0.0031) [2024-06-13 07:24:54,332][71000] Updated weights for policy 0, policy_version 191914 (0.0026) [2024-06-13 07:24:55,944][70768] Fps is (10 sec: 50768.2, 60 sec: 49148.6, 300 sec: 49095.7). Total num frames: 3144400896. Throughput: 0: 48958.0. Samples: 2673203420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 07:24:55,945][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:24:57,777][71000] Updated weights for policy 0, policy_version 191924 (0.0028) [2024-06-13 07:25:00,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3144630272. Throughput: 0: 49378.3. Samples: 2673508500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 07:25:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:25:01,171][71000] Updated weights for policy 0, policy_version 191934 (0.0031) [2024-06-13 07:25:04,558][71000] Updated weights for policy 0, policy_version 191944 (0.0035) [2024-06-13 07:25:05,940][70768] Fps is (10 sec: 47533.9, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3144876032. Throughput: 0: 49154.2. Samples: 2673648420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 07:25:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:25:07,866][71000] Updated weights for policy 0, policy_version 191954 (0.0026) [2024-06-13 07:25:10,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3145121792. Throughput: 0: 48987.1. Samples: 2673942960. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 07:25:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:25:11,169][71000] Updated weights for policy 0, policy_version 191964 (0.0034) [2024-06-13 07:25:14,473][71000] Updated weights for policy 0, policy_version 191974 (0.0028) [2024-06-13 07:25:15,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 3145367552. Throughput: 0: 48983.8. Samples: 2674236700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 07:25:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:25:17,757][71000] Updated weights for policy 0, policy_version 191984 (0.0037) [2024-06-13 07:25:20,939][70768] Fps is (10 sec: 47514.4, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3145596928. Throughput: 0: 48933.0. Samples: 2674382800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 07:25:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:25:21,177][71000] Updated weights for policy 0, policy_version 191994 (0.0028) [2024-06-13 07:25:24,446][71000] Updated weights for policy 0, policy_version 192004 (0.0024) [2024-06-13 07:25:25,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 3145859072. Throughput: 0: 49139.7. Samples: 2674686660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 07:25:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:25:27,398][71000] Updated weights for policy 0, policy_version 192014 (0.0021) [2024-06-13 07:25:30,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 3146104832. Throughput: 0: 49407.0. Samples: 2674986260. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 07:25:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:25:30,979][71000] Updated weights for policy 0, policy_version 192024 (0.0027) [2024-06-13 07:25:33,305][70980] Signal inference workers to stop experience collection... (40000 times) [2024-06-13 07:25:33,362][71000] InferenceWorker_p0-w0: stopping experience collection (40000 times) [2024-06-13 07:25:33,414][70980] Signal inference workers to resume experience collection... (40000 times) [2024-06-13 07:25:33,415][71000] InferenceWorker_p0-w0: resuming experience collection (40000 times) [2024-06-13 07:25:34,035][71000] Updated weights for policy 0, policy_version 192034 (0.0025) [2024-06-13 07:25:35,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3146366976. Throughput: 0: 49365.3. Samples: 2675133140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 07:25:35,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 07:25:37,651][71000] Updated weights for policy 0, policy_version 192044 (0.0034) [2024-06-13 07:25:40,836][71000] Updated weights for policy 0, policy_version 192054 (0.0026) [2024-06-13 07:25:40,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3146612736. Throughput: 0: 49460.1. Samples: 2675428920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 07:25:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:25:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000192054_3146612736.pth... [2024-06-13 07:25:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000191336_3134849024.pth [2024-06-13 07:25:44,427][71000] Updated weights for policy 0, policy_version 192064 (0.0028) [2024-06-13 07:25:45,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3146842112. Throughput: 0: 49047.2. Samples: 2675715620. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 07:25:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 07:25:47,738][71000] Updated weights for policy 0, policy_version 192074 (0.0026) [2024-06-13 07:25:50,939][70768] Fps is (10 sec: 47515.0, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 3147087872. Throughput: 0: 49166.9. Samples: 2675860920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 07:25:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:25:51,033][71000] Updated weights for policy 0, policy_version 192084 (0.0030) [2024-06-13 07:25:54,316][71000] Updated weights for policy 0, policy_version 192094 (0.0030) [2024-06-13 07:25:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48882.5, 300 sec: 49040.9). Total num frames: 3147333632. Throughput: 0: 49048.6. Samples: 2676150140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 07:25:55,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:25:57,896][71000] Updated weights for policy 0, policy_version 192104 (0.0026) [2024-06-13 07:26:00,753][71000] Updated weights for policy 0, policy_version 192114 (0.0029) [2024-06-13 07:26:00,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 3147595776. Throughput: 0: 49106.6. Samples: 2676446500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 07:26:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:26:04,297][71000] Updated weights for policy 0, policy_version 192124 (0.0028) [2024-06-13 07:26:05,939][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3147825152. Throughput: 0: 49186.2. Samples: 2676596180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 07:26:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:26:07,761][71000] Updated weights for policy 0, policy_version 192134 (0.0034) [2024-06-13 07:26:10,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3148054528. Throughput: 0: 48951.8. Samples: 2676889500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 07:26:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:26:11,156][71000] Updated weights for policy 0, policy_version 192144 (0.0026) [2024-06-13 07:26:14,354][71000] Updated weights for policy 0, policy_version 192154 (0.0030) [2024-06-13 07:26:15,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 3148300288. Throughput: 0: 48939.9. Samples: 2677188560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 07:26:15,943][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:26:17,810][71000] Updated weights for policy 0, policy_version 192164 (0.0027) [2024-06-13 07:26:20,838][71000] Updated weights for policy 0, policy_version 192174 (0.0025) [2024-06-13 07:26:20,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 3148578816. Throughput: 0: 48830.7. Samples: 2677330520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 07:26:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:26:24,709][71000] Updated weights for policy 0, policy_version 192184 (0.0032) [2024-06-13 07:26:25,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.0, 300 sec: 49041.0). Total num frames: 3148808192. Throughput: 0: 48889.5. Samples: 2677628940. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 07:26:25,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 07:26:28,063][71000] Updated weights for policy 0, policy_version 192194 (0.0028) [2024-06-13 07:26:30,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 3149037568. Throughput: 0: 48867.5. Samples: 2677914660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 07:26:30,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:26:31,220][71000] Updated weights for policy 0, policy_version 192204 (0.0022) [2024-06-13 07:26:34,515][71000] Updated weights for policy 0, policy_version 192214 (0.0033) [2024-06-13 07:26:35,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3149283328. Throughput: 0: 48917.2. Samples: 2678062200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 07:26:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:26:37,971][71000] Updated weights for policy 0, policy_version 192224 (0.0030) [2024-06-13 07:26:40,940][70768] Fps is (10 sec: 50789.2, 60 sec: 48878.9, 300 sec: 48985.3). Total num frames: 3149545472. Throughput: 0: 49011.8. Samples: 2678355680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 07:26:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:26:41,334][71000] Updated weights for policy 0, policy_version 192234 (0.0032) [2024-06-13 07:26:44,860][71000] Updated weights for policy 0, policy_version 192244 (0.0035) [2024-06-13 07:26:45,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48878.8, 300 sec: 48985.3). Total num frames: 3149774848. Throughput: 0: 48789.6. Samples: 2678642040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 07:26:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:26:47,117][70980] Signal inference workers to stop experience collection... 
(40050 times) [2024-06-13 07:26:47,151][71000] InferenceWorker_p0-w0: stopping experience collection (40050 times) [2024-06-13 07:26:47,181][70980] Signal inference workers to resume experience collection... (40050 times) [2024-06-13 07:26:47,184][71000] InferenceWorker_p0-w0: resuming experience collection (40050 times) [2024-06-13 07:26:48,427][71000] Updated weights for policy 0, policy_version 192254 (0.0022) [2024-06-13 07:26:50,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.6, 300 sec: 49040.9). Total num frames: 3150020608. Throughput: 0: 48825.9. Samples: 2678793360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 07:26:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:26:51,330][71000] Updated weights for policy 0, policy_version 192264 (0.0041) [2024-06-13 07:26:54,875][71000] Updated weights for policy 0, policy_version 192274 (0.0028) [2024-06-13 07:26:55,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 3150249984. Throughput: 0: 48716.6. Samples: 2679081740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 07:26:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:26:57,923][71000] Updated weights for policy 0, policy_version 192284 (0.0037) [2024-06-13 07:27:00,939][70768] Fps is (10 sec: 49153.3, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3150512128. Throughput: 0: 48664.6. Samples: 2679378460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 07:27:00,940][70768] Avg episode reward: [(0, '0.272')] [2024-06-13 07:27:01,559][71000] Updated weights for policy 0, policy_version 192294 (0.0028) [2024-06-13 07:27:04,789][71000] Updated weights for policy 0, policy_version 192304 (0.0018) [2024-06-13 07:27:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3150757888. Throughput: 0: 48858.2. Samples: 2679529140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 07:27:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:27:08,446][71000] Updated weights for policy 0, policy_version 192314 (0.0037) [2024-06-13 07:27:10,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3151003648. Throughput: 0: 48731.6. Samples: 2679821860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 07:27:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:27:11,440][71000] Updated weights for policy 0, policy_version 192324 (0.0028) [2024-06-13 07:27:14,926][71000] Updated weights for policy 0, policy_version 192334 (0.0035) [2024-06-13 07:27:15,940][70768] Fps is (10 sec: 45874.4, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3151216640. Throughput: 0: 48917.1. Samples: 2680115940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 07:27:15,941][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:27:18,054][71000] Updated weights for policy 0, policy_version 192344 (0.0024) [2024-06-13 07:27:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.8, 300 sec: 48930.6). Total num frames: 3151495168. Throughput: 0: 48818.2. Samples: 2680259020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 07:27:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:27:21,590][71000] Updated weights for policy 0, policy_version 192354 (0.0030) [2024-06-13 07:27:24,777][71000] Updated weights for policy 0, policy_version 192364 (0.0035) [2024-06-13 07:27:25,940][70768] Fps is (10 sec: 50791.4, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 3151724544. Throughput: 0: 48854.9. Samples: 2680554140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 07:27:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:27:28,528][71000] Updated weights for policy 0, policy_version 192374 (0.0029) [2024-06-13 07:27:30,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3151970304. Throughput: 0: 48969.0. Samples: 2680845640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 07:27:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:27:31,457][71000] Updated weights for policy 0, policy_version 192384 (0.0026) [2024-06-13 07:27:34,987][71000] Updated weights for policy 0, policy_version 192394 (0.0031) [2024-06-13 07:27:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3152199680. Throughput: 0: 48847.8. Samples: 2680991500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 07:27:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:27:38,147][71000] Updated weights for policy 0, policy_version 192404 (0.0025) [2024-06-13 07:27:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48606.1, 300 sec: 48874.3). Total num frames: 3152461824. Throughput: 0: 49067.2. Samples: 2681289760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 07:27:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:27:40,968][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000192412_3152478208.pth... [2024-06-13 07:27:41,018][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000191695_3140730880.pth [2024-06-13 07:27:41,732][71000] Updated weights for policy 0, policy_version 192414 (0.0025) [2024-06-13 07:27:44,659][71000] Updated weights for policy 0, policy_version 192424 (0.0025) [2024-06-13 07:27:45,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.1, 300 sec: 49041.4). Total num frames: 3152723968. Throughput: 0: 49273.7. Samples: 2681595780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 07:27:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:27:48,739][71000] Updated weights for policy 0, policy_version 192434 (0.0033) [2024-06-13 07:27:50,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.2, 300 sec: 49040.9). Total num frames: 3152969728. Throughput: 0: 49152.5. Samples: 2681741000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 07:27:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:27:51,472][71000] Updated weights for policy 0, policy_version 192444 (0.0032) [2024-06-13 07:27:55,218][71000] Updated weights for policy 0, policy_version 192454 (0.0032) [2024-06-13 07:27:55,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3153182720. Throughput: 0: 49020.9. Samples: 2682027800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 07:27:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:27:58,331][71000] Updated weights for policy 0, policy_version 192464 (0.0029) [2024-06-13 07:28:00,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.8, 300 sec: 48929.9). Total num frames: 3153444864. Throughput: 0: 48924.1. Samples: 2682317520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 07:28:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:28:01,935][70980] Signal inference workers to stop experience collection... (40100 times) [2024-06-13 07:28:01,982][71000] InferenceWorker_p0-w0: stopping experience collection (40100 times) [2024-06-13 07:28:01,990][70980] Signal inference workers to resume experience collection... 
(40100 times) [2024-06-13 07:28:02,000][71000] InferenceWorker_p0-w0: resuming experience collection (40100 times) [2024-06-13 07:28:02,003][71000] Updated weights for policy 0, policy_version 192474 (0.0020) [2024-06-13 07:28:04,894][71000] Updated weights for policy 0, policy_version 192484 (0.0027) [2024-06-13 07:28:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3153690624. Throughput: 0: 49196.8. Samples: 2682472880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 07:28:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:28:08,473][71000] Updated weights for policy 0, policy_version 192494 (0.0030) [2024-06-13 07:28:10,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3153952768. Throughput: 0: 49108.5. Samples: 2682764020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 07:28:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:28:11,720][71000] Updated weights for policy 0, policy_version 192504 (0.0035) [2024-06-13 07:28:15,355][71000] Updated weights for policy 0, policy_version 192514 (0.0027) [2024-06-13 07:28:15,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 48818.7). Total num frames: 3154165760. Throughput: 0: 49203.5. Samples: 2683059800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 07:28:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:28:18,059][71000] Updated weights for policy 0, policy_version 192524 (0.0027) [2024-06-13 07:28:20,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3154444288. Throughput: 0: 49178.2. Samples: 2683204520. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 07:28:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:28:22,017][71000] Updated weights for policy 0, policy_version 192534 (0.0027) [2024-06-13 07:28:25,289][71000] Updated weights for policy 0, policy_version 192544 (0.0027) [2024-06-13 07:28:25,940][70768] Fps is (10 sec: 52429.3, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 3154690048. Throughput: 0: 49063.1. Samples: 2683497600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 07:28:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:28:28,505][71000] Updated weights for policy 0, policy_version 192554 (0.0035) [2024-06-13 07:28:30,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 3154919424. Throughput: 0: 48707.5. Samples: 2683787620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 07:28:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:28:32,032][71000] Updated weights for policy 0, policy_version 192564 (0.0035) [2024-06-13 07:28:35,359][71000] Updated weights for policy 0, policy_version 192574 (0.0026) [2024-06-13 07:28:35,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 3155148800. Throughput: 0: 48636.3. Samples: 2683929640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-13 07:28:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:28:38,394][71000] Updated weights for policy 0, policy_version 192584 (0.0024) [2024-06-13 07:28:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3155427328. Throughput: 0: 48987.9. Samples: 2684232260. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 07:28:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:28:41,737][71000] Updated weights for policy 0, policy_version 192594 (0.0025) [2024-06-13 07:28:45,195][71000] Updated weights for policy 0, policy_version 192604 (0.0028) [2024-06-13 07:28:45,939][70768] Fps is (10 sec: 52429.7, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3155673088. Throughput: 0: 49210.9. Samples: 2684532000. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 07:28:45,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 07:28:48,515][71000] Updated weights for policy 0, policy_version 192614 (0.0025) [2024-06-13 07:28:50,939][70768] Fps is (10 sec: 45876.0, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 3155886080. Throughput: 0: 49129.9. Samples: 2684683720. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 07:28:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:28:51,647][71000] Updated weights for policy 0, policy_version 192624 (0.0037) [2024-06-13 07:28:55,301][71000] Updated weights for policy 0, policy_version 192634 (0.0025) [2024-06-13 07:28:55,940][70768] Fps is (10 sec: 45874.4, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 3156131840. Throughput: 0: 49111.0. Samples: 2684974020. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 07:28:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:28:58,447][71000] Updated weights for policy 0, policy_version 192644 (0.0019) [2024-06-13 07:29:00,939][70768] Fps is (10 sec: 49151.8, 60 sec: 48879.1, 300 sec: 48929.9). Total num frames: 3156377600. Throughput: 0: 49076.6. Samples: 2685268240. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 07:29:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:29:01,981][71000] Updated weights for policy 0, policy_version 192654 (0.0029) [2024-06-13 07:29:05,096][71000] Updated weights for policy 0, policy_version 192664 (0.0029) [2024-06-13 07:29:05,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49698.0, 300 sec: 49152.0). Total num frames: 3156672512. Throughput: 0: 48934.0. Samples: 2685406560. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 07:29:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:29:08,622][71000] Updated weights for policy 0, policy_version 192674 (0.0021) [2024-06-13 07:29:09,557][70980] Signal inference workers to stop experience collection... (40150 times) [2024-06-13 07:29:09,574][71000] InferenceWorker_p0-w0: stopping experience collection (40150 times) [2024-06-13 07:29:09,611][70980] Signal inference workers to resume experience collection... (40150 times) [2024-06-13 07:29:09,611][71000] InferenceWorker_p0-w0: resuming experience collection (40150 times) [2024-06-13 07:29:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 3156869120. Throughput: 0: 49093.7. Samples: 2685706820. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 07:29:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:29:12,012][71000] Updated weights for policy 0, policy_version 192684 (0.0033) [2024-06-13 07:29:15,311][71000] Updated weights for policy 0, policy_version 192694 (0.0031) [2024-06-13 07:29:15,940][70768] Fps is (10 sec: 44237.6, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3157114880. Throughput: 0: 49103.2. Samples: 2685997260. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 07:29:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:29:18,483][71000] Updated weights for policy 0, policy_version 192704 (0.0033) [2024-06-13 07:29:20,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.7, 300 sec: 48929.8). Total num frames: 3157360640. Throughput: 0: 49250.5. Samples: 2686145920. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 07:29:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:29:22,139][71000] Updated weights for policy 0, policy_version 192714 (0.0023) [2024-06-13 07:29:25,100][71000] Updated weights for policy 0, policy_version 192724 (0.0030) [2024-06-13 07:29:25,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3157622784. Throughput: 0: 49106.8. Samples: 2686442060. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 07:29:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:29:28,561][71000] Updated weights for policy 0, policy_version 192734 (0.0028) [2024-06-13 07:29:30,940][70768] Fps is (10 sec: 47514.8, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 3157835776. Throughput: 0: 48835.0. Samples: 2686729580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 07:29:30,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 07:29:31,962][71000] Updated weights for policy 0, policy_version 192744 (0.0025) [2024-06-13 07:29:35,415][71000] Updated weights for policy 0, policy_version 192754 (0.0039) [2024-06-13 07:29:35,944][70768] Fps is (10 sec: 47493.2, 60 sec: 49148.6, 300 sec: 48984.7). Total num frames: 3158097920. Throughput: 0: 48474.9. Samples: 2686865300. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 07:29:35,944][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:29:38,605][71000] Updated weights for policy 0, policy_version 192764 (0.0038) [2024-06-13 07:29:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48332.8, 300 sec: 48929.8). Total num frames: 3158327296. Throughput: 0: 48658.7. Samples: 2687163660. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 07:29:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:29:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000192770_3158343680.pth... [2024-06-13 07:29:40,983][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000192054_3146612736.pth [2024-06-13 07:29:42,137][71000] Updated weights for policy 0, policy_version 192774 (0.0028) [2024-06-13 07:29:45,257][71000] Updated weights for policy 0, policy_version 192784 (0.0031) [2024-06-13 07:29:45,940][70768] Fps is (10 sec: 49172.7, 60 sec: 48605.7, 300 sec: 49040.9). Total num frames: 3158589440. Throughput: 0: 48609.2. Samples: 2687455660. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-13 07:29:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:29:48,779][71000] Updated weights for policy 0, policy_version 192794 (0.0030) [2024-06-13 07:29:50,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48605.7, 300 sec: 48819.5). Total num frames: 3158802432. Throughput: 0: 48803.2. Samples: 2687602700. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-13 07:29:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:29:52,179][71000] Updated weights for policy 0, policy_version 192804 (0.0039) [2024-06-13 07:29:55,434][71000] Updated weights for policy 0, policy_version 192814 (0.0029) [2024-06-13 07:29:55,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3159097344. Throughput: 0: 48621.8. Samples: 2687894800. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-13 07:29:55,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 07:29:58,635][71000] Updated weights for policy 0, policy_version 192824 (0.0040) [2024-06-13 07:30:00,940][70768] Fps is (10 sec: 50791.1, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 3159310336. Throughput: 0: 48828.0. Samples: 2688194520. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-13 07:30:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:30:02,154][71000] Updated weights for policy 0, policy_version 192834 (0.0040) [2024-06-13 07:30:05,218][71000] Updated weights for policy 0, policy_version 192844 (0.0029) [2024-06-13 07:30:05,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48332.9, 300 sec: 48985.4). Total num frames: 3159572480. Throughput: 0: 48840.7. Samples: 2688343740. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-13 07:30:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:30:08,500][71000] Updated weights for policy 0, policy_version 192854 (0.0027) [2024-06-13 07:30:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3159818240. Throughput: 0: 48984.0. Samples: 2688646340. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-13 07:30:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:30:11,827][71000] Updated weights for policy 0, policy_version 192864 (0.0021) [2024-06-13 07:30:15,332][71000] Updated weights for policy 0, policy_version 192874 (0.0028) [2024-06-13 07:30:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3160064000. Throughput: 0: 48930.5. Samples: 2688931460. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-13 07:30:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:30:18,754][71000] Updated weights for policy 0, policy_version 192884 (0.0039) [2024-06-13 07:30:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.2, 300 sec: 48985.4). Total num frames: 3160309760. Throughput: 0: 49225.6. Samples: 2689080240. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-13 07:30:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:30:22,054][71000] Updated weights for policy 0, policy_version 192894 (0.0039) [2024-06-13 07:30:25,005][70980] Signal inference workers to stop experience collection... (40200 times) [2024-06-13 07:30:25,005][70980] Signal inference workers to resume experience collection... (40200 times) [2024-06-13 07:30:25,045][71000] InferenceWorker_p0-w0: stopping experience collection (40200 times) [2024-06-13 07:30:25,045][71000] InferenceWorker_p0-w0: resuming experience collection (40200 times) [2024-06-13 07:30:25,355][71000] Updated weights for policy 0, policy_version 192904 (0.0027) [2024-06-13 07:30:25,939][70768] Fps is (10 sec: 49152.7, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3160555520. Throughput: 0: 49260.1. Samples: 2689380360. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-13 07:30:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:30:28,447][71000] Updated weights for policy 0, policy_version 192914 (0.0032) [2024-06-13 07:30:30,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 3160801280. Throughput: 0: 49377.8. Samples: 2689677660. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-13 07:30:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:30:31,804][71000] Updated weights for policy 0, policy_version 192924 (0.0028) [2024-06-13 07:30:35,017][71000] Updated weights for policy 0, policy_version 192934 (0.0038) [2024-06-13 07:30:35,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49155.6, 300 sec: 48929.9). Total num frames: 3161047040. Throughput: 0: 49365.5. Samples: 2689824140. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-13 07:30:35,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 07:30:38,720][71000] Updated weights for policy 0, policy_version 192944 (0.0025) [2024-06-13 07:30:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.1, 300 sec: 49040.9). Total num frames: 3161309184. Throughput: 0: 49392.4. Samples: 2690117460. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-13 07:30:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:30:41,835][71000] Updated weights for policy 0, policy_version 192954 (0.0036) [2024-06-13 07:30:45,657][71000] Updated weights for policy 0, policy_version 192964 (0.0033) [2024-06-13 07:30:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3161538560. Throughput: 0: 49284.9. Samples: 2690412340. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-13 07:30:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:30:48,362][71000] Updated weights for policy 0, policy_version 192974 (0.0041) [2024-06-13 07:30:50,940][70768] Fps is (10 sec: 45876.0, 60 sec: 49425.2, 300 sec: 48929.9). Total num frames: 3161767936. Throughput: 0: 49121.8. Samples: 2690554220. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-13 07:30:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:30:51,926][71000] Updated weights for policy 0, policy_version 192984 (0.0021) [2024-06-13 07:30:54,864][71000] Updated weights for policy 0, policy_version 192994 (0.0027) [2024-06-13 07:30:55,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3162030080. Throughput: 0: 49100.7. Samples: 2690855880. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-13 07:30:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:30:58,746][71000] Updated weights for policy 0, policy_version 193004 (0.0033) [2024-06-13 07:31:00,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49698.2, 300 sec: 49040.9). Total num frames: 3162292224. Throughput: 0: 49493.9. Samples: 2691158680. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-13 07:31:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 07:31:01,596][71000] Updated weights for policy 0, policy_version 193014 (0.0028) [2024-06-13 07:31:05,252][71000] Updated weights for policy 0, policy_version 193024 (0.0030) [2024-06-13 07:31:05,944][70768] Fps is (10 sec: 49131.8, 60 sec: 49148.5, 300 sec: 49040.2). Total num frames: 3162521600. Throughput: 0: 49397.6. Samples: 2691303340. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-13 07:31:05,944][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:31:08,212][71000] Updated weights for policy 0, policy_version 193034 (0.0039) [2024-06-13 07:31:10,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3162767360. Throughput: 0: 49312.7. Samples: 2691599440. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-13 07:31:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:31:11,895][71000] Updated weights for policy 0, policy_version 193044 (0.0027) [2024-06-13 07:31:14,846][71000] Updated weights for policy 0, policy_version 193054 (0.0029) [2024-06-13 07:31:15,939][70768] Fps is (10 sec: 50812.5, 60 sec: 49425.2, 300 sec: 48985.4). Total num frames: 3163029504. Throughput: 0: 49278.4. Samples: 2691895180. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-13 07:31:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:31:18,446][71000] Updated weights for policy 0, policy_version 193064 (0.0025) [2024-06-13 07:31:20,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3163275264. Throughput: 0: 49297.2. Samples: 2692042520. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-13 07:31:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:31:21,447][71000] Updated weights for policy 0, policy_version 193074 (0.0034) [2024-06-13 07:31:25,198][71000] Updated weights for policy 0, policy_version 193084 (0.0042) [2024-06-13 07:31:25,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3163504640. Throughput: 0: 49497.1. Samples: 2692344820. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-13 07:31:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:31:28,312][71000] Updated weights for policy 0, policy_version 193094 (0.0036) [2024-06-13 07:31:30,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3163750400. Throughput: 0: 49266.6. Samples: 2692629340. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-13 07:31:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:31:32,017][71000] Updated weights for policy 0, policy_version 193104 (0.0030) [2024-06-13 07:31:34,904][71000] Updated weights for policy 0, policy_version 193114 (0.0034) [2024-06-13 07:31:35,940][70768] Fps is (10 sec: 52427.6, 60 sec: 49697.9, 300 sec: 49096.5). Total num frames: 3164028928. Throughput: 0: 49534.0. Samples: 2692783260. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-13 07:31:35,941][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 07:31:38,563][71000] Updated weights for policy 0, policy_version 193124 (0.0028) [2024-06-13 07:31:40,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48879.1, 300 sec: 49041.0). Total num frames: 3164241920. Throughput: 0: 49336.3. Samples: 2693076000. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-13 07:31:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:31:41,140][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000193132_3164274688.pth... [2024-06-13 07:31:41,188][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000192412_3152478208.pth [2024-06-13 07:31:41,606][71000] Updated weights for policy 0, policy_version 193134 (0.0021) [2024-06-13 07:31:45,779][71000] Updated weights for policy 0, policy_version 193144 (0.0036) [2024-06-13 07:31:45,940][70768] Fps is (10 sec: 44237.1, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 3164471296. Throughput: 0: 48955.4. Samples: 2693361680. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-13 07:31:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:31:48,806][71000] Updated weights for policy 0, policy_version 193154 (0.0030) [2024-06-13 07:31:50,940][70768] Fps is (10 sec: 49150.6, 60 sec: 49424.9, 300 sec: 49096.4). Total num frames: 3164733440. Throughput: 0: 48879.1. Samples: 2693502700. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-13 07:31:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:31:52,387][71000] Updated weights for policy 0, policy_version 193164 (0.0029) [2024-06-13 07:31:53,250][70980] Signal inference workers to stop experience collection... (40250 times) [2024-06-13 07:31:53,300][71000] InferenceWorker_p0-w0: stopping experience collection (40250 times) [2024-06-13 07:31:53,304][70980] Signal inference workers to resume experience collection... (40250 times) [2024-06-13 07:31:53,318][71000] InferenceWorker_p0-w0: resuming experience collection (40250 times) [2024-06-13 07:31:55,585][71000] Updated weights for policy 0, policy_version 193174 (0.0029) [2024-06-13 07:31:55,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.2, 300 sec: 49040.9). Total num frames: 3164979200. Throughput: 0: 48905.1. Samples: 2693800160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-13 07:31:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:31:59,078][71000] Updated weights for policy 0, policy_version 193184 (0.0031) [2024-06-13 07:32:00,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3165224960. Throughput: 0: 48849.2. Samples: 2694093400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 07:32:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:32:02,146][71000] Updated weights for policy 0, policy_version 193194 (0.0028) [2024-06-13 07:32:05,706][71000] Updated weights for policy 0, policy_version 193204 (0.0035) [2024-06-13 07:32:05,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48882.5, 300 sec: 48985.4). Total num frames: 3165454336. Throughput: 0: 48919.7. Samples: 2694243900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 07:32:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:32:08,998][71000] Updated weights for policy 0, policy_version 193214 (0.0036) [2024-06-13 07:32:10,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3165716480. Throughput: 0: 48643.8. Samples: 2694533800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 07:32:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:32:12,486][71000] Updated weights for policy 0, policy_version 193224 (0.0033) [2024-06-13 07:32:15,516][71000] Updated weights for policy 0, policy_version 193234 (0.0026) [2024-06-13 07:32:15,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3165978624. Throughput: 0: 49027.7. Samples: 2694835580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 07:32:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:32:19,071][71000] Updated weights for policy 0, policy_version 193244 (0.0031) [2024-06-13 07:32:20,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 3166208000. Throughput: 0: 48832.2. Samples: 2694980700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 07:32:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:32:22,206][71000] Updated weights for policy 0, policy_version 193254 (0.0021) [2024-06-13 07:32:25,619][71000] Updated weights for policy 0, policy_version 193264 (0.0022) [2024-06-13 07:32:25,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3166453760. Throughput: 0: 49027.5. Samples: 2695282240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 07:32:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:32:28,566][71000] Updated weights for policy 0, policy_version 193274 (0.0026) [2024-06-13 07:32:30,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3166699520. Throughput: 0: 49130.7. Samples: 2695572560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 07:32:30,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 07:32:32,067][71000] Updated weights for policy 0, policy_version 193284 (0.0026) [2024-06-13 07:32:35,653][71000] Updated weights for policy 0, policy_version 193294 (0.0028) [2024-06-13 07:32:35,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48605.9, 300 sec: 49096.4). Total num frames: 3166945280. Throughput: 0: 49180.9. Samples: 2695715840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 07:32:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:32:38,981][71000] Updated weights for policy 0, policy_version 193304 (0.0040) [2024-06-13 07:32:40,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3167191040. Throughput: 0: 49147.9. Samples: 2696011820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 07:32:40,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:32:42,227][71000] Updated weights for policy 0, policy_version 193314 (0.0027) [2024-06-13 07:32:45,566][71000] Updated weights for policy 0, policy_version 193324 (0.0028) [2024-06-13 07:32:45,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3167436800. Throughput: 0: 49371.5. Samples: 2696315120. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 07:32:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:32:48,761][71000] Updated weights for policy 0, policy_version 193334 (0.0032) [2024-06-13 07:32:50,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49698.2, 300 sec: 49263.0). Total num frames: 3167715328. Throughput: 0: 49378.0. Samples: 2696465920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 07:32:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:32:52,114][71000] Updated weights for policy 0, policy_version 193344 (0.0024) [2024-06-13 07:32:55,158][71000] Updated weights for policy 0, policy_version 193354 (0.0033) [2024-06-13 07:32:55,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3167928320. Throughput: 0: 49507.4. Samples: 2696761620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 07:32:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:32:58,681][71000] Updated weights for policy 0, policy_version 193364 (0.0033) [2024-06-13 07:32:58,901][70980] Signal inference workers to stop experience collection... (40300 times) [2024-06-13 07:32:58,953][70980] Signal inference workers to resume experience collection... (40300 times) [2024-06-13 07:32:58,953][71000] InferenceWorker_p0-w0: stopping experience collection (40300 times) [2024-06-13 07:32:58,966][71000] InferenceWorker_p0-w0: resuming experience collection (40300 times) [2024-06-13 07:33:00,940][70768] Fps is (10 sec: 45876.1, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3168174080. Throughput: 0: 49135.5. Samples: 2697046680. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-13 07:33:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:33:01,895][71000] Updated weights for policy 0, policy_version 193374 (0.0027) [2024-06-13 07:33:05,417][71000] Updated weights for policy 0, policy_version 193384 (0.0023) [2024-06-13 07:33:05,939][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 3168436224. Throughput: 0: 49368.1. Samples: 2697202260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 07:33:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:33:08,620][71000] Updated weights for policy 0, policy_version 193394 (0.0031) [2024-06-13 07:33:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.2, 300 sec: 49152.0). Total num frames: 3168665600. Throughput: 0: 49132.5. Samples: 2697493200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 07:33:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:33:11,902][71000] Updated weights for policy 0, policy_version 193404 (0.0027) [2024-06-13 07:33:15,413][71000] Updated weights for policy 0, policy_version 193414 (0.0024) [2024-06-13 07:33:15,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3168911360. Throughput: 0: 49267.8. Samples: 2697789600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 07:33:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:33:18,819][71000] Updated weights for policy 0, policy_version 193424 (0.0036) [2024-06-13 07:33:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3169157120. Throughput: 0: 49331.3. Samples: 2697935740. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 07:33:20,940][70768] Avg episode reward: [(0, '0.269')] [2024-06-13 07:33:22,103][71000] Updated weights for policy 0, policy_version 193434 (0.0025) [2024-06-13 07:33:25,532][71000] Updated weights for policy 0, policy_version 193444 (0.0029) [2024-06-13 07:33:25,940][70768] Fps is (10 sec: 49150.9, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3169402880. Throughput: 0: 49192.8. Samples: 2698225500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 07:33:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:33:28,554][71000] Updated weights for policy 0, policy_version 193454 (0.0026) [2024-06-13 07:33:30,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3169648640. Throughput: 0: 48939.1. Samples: 2698517380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 07:33:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:33:31,914][71000] Updated weights for policy 0, policy_version 193464 (0.0025) [2024-06-13 07:33:35,357][71000] Updated weights for policy 0, policy_version 193474 (0.0029) [2024-06-13 07:33:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3169894400. Throughput: 0: 49040.5. Samples: 2698672740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 07:33:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 07:33:38,771][71000] Updated weights for policy 0, policy_version 193484 (0.0029) [2024-06-13 07:33:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3170140160. Throughput: 0: 48968.2. Samples: 2698965200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 07:33:40,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:33:40,953][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000193490_3170140160.pth... 
[2024-06-13 07:33:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000192770_3158343680.pth [2024-06-13 07:33:42,086][71000] Updated weights for policy 0, policy_version 193494 (0.0027) [2024-06-13 07:33:45,158][71000] Updated weights for policy 0, policy_version 193504 (0.0023) [2024-06-13 07:33:45,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3170385920. Throughput: 0: 49065.7. Samples: 2699254640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 07:33:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:33:48,605][71000] Updated weights for policy 0, policy_version 193514 (0.0039) [2024-06-13 07:33:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 3170631680. Throughput: 0: 49082.8. Samples: 2699411000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 07:33:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:33:51,825][71000] Updated weights for policy 0, policy_version 193524 (0.0031) [2024-06-13 07:33:55,376][71000] Updated weights for policy 0, policy_version 193534 (0.0024) [2024-06-13 07:33:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3170877440. Throughput: 0: 49192.7. Samples: 2699706880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 07:33:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:33:58,612][71000] Updated weights for policy 0, policy_version 193544 (0.0038) [2024-06-13 07:34:00,940][70768] Fps is (10 sec: 47514.4, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 3171106816. Throughput: 0: 48972.3. Samples: 2699993360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 07:34:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:34:01,332][70980] Signal inference workers to stop experience collection... 
(40350 times) [2024-06-13 07:34:01,336][70980] Signal inference workers to resume experience collection... (40350 times) [2024-06-13 07:34:01,355][71000] InferenceWorker_p0-w0: stopping experience collection (40350 times) [2024-06-13 07:34:01,356][71000] InferenceWorker_p0-w0: resuming experience collection (40350 times) [2024-06-13 07:34:01,930][71000] Updated weights for policy 0, policy_version 193554 (0.0027) [2024-06-13 07:34:05,262][71000] Updated weights for policy 0, policy_version 193564 (0.0036) [2024-06-13 07:34:05,939][70768] Fps is (10 sec: 49152.8, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3171368960. Throughput: 0: 49048.9. Samples: 2700142940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 07:34:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:34:08,630][71000] Updated weights for policy 0, policy_version 193574 (0.0031) [2024-06-13 07:34:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3171614720. Throughput: 0: 49196.6. Samples: 2700439340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-13 07:34:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:34:11,902][71000] Updated weights for policy 0, policy_version 193584 (0.0027) [2024-06-13 07:34:15,216][71000] Updated weights for policy 0, policy_version 193594 (0.0036) [2024-06-13 07:34:15,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 3171844096. Throughput: 0: 49150.2. Samples: 2700729140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 07:34:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:34:18,618][71000] Updated weights for policy 0, policy_version 193604 (0.0031) [2024-06-13 07:34:20,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3172106240. Throughput: 0: 48908.9. Samples: 2700873640. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 07:34:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:34:22,071][71000] Updated weights for policy 0, policy_version 193614 (0.0024) [2024-06-13 07:34:25,289][71000] Updated weights for policy 0, policy_version 193624 (0.0029) [2024-06-13 07:34:25,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.2, 300 sec: 49207.5). Total num frames: 3172352000. Throughput: 0: 49011.7. Samples: 2701170720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 07:34:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:34:28,783][71000] Updated weights for policy 0, policy_version 193634 (0.0022) [2024-06-13 07:34:30,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 49208.2). Total num frames: 3172614144. Throughput: 0: 49242.6. Samples: 2701470560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 07:34:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:34:31,783][71000] Updated weights for policy 0, policy_version 193644 (0.0036) [2024-06-13 07:34:35,636][71000] Updated weights for policy 0, policy_version 193654 (0.0039) [2024-06-13 07:34:35,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 3172827136. Throughput: 0: 48927.7. Samples: 2701612740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 07:34:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:34:38,537][71000] Updated weights for policy 0, policy_version 193664 (0.0029) [2024-06-13 07:34:40,940][70768] Fps is (10 sec: 44237.3, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 3173056512. Throughput: 0: 48570.7. Samples: 2701892560. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 07:34:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:34:42,247][71000] Updated weights for policy 0, policy_version 193674 (0.0025) [2024-06-13 07:34:45,550][71000] Updated weights for policy 0, policy_version 193684 (0.0025) [2024-06-13 07:34:45,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 3173335040. Throughput: 0: 48741.7. Samples: 2702186740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 07:34:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:34:49,140][71000] Updated weights for policy 0, policy_version 193694 (0.0025) [2024-06-13 07:34:50,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3173564416. Throughput: 0: 48878.4. Samples: 2702342480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 07:34:50,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:34:52,175][71000] Updated weights for policy 0, policy_version 193704 (0.0032) [2024-06-13 07:34:55,511][71000] Updated weights for policy 0, policy_version 193714 (0.0027) [2024-06-13 07:34:55,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3173810176. Throughput: 0: 48986.2. Samples: 2702643720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 07:34:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:34:58,666][71000] Updated weights for policy 0, policy_version 193724 (0.0029) [2024-06-13 07:35:00,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3174055936. Throughput: 0: 49129.4. Samples: 2702939960. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 07:35:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:35:02,182][71000] Updated weights for policy 0, policy_version 193734 (0.0038) [2024-06-13 07:35:05,262][71000] Updated weights for policy 0, policy_version 193744 (0.0023) [2024-06-13 07:35:05,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3174318080. Throughput: 0: 49345.8. Samples: 2703094200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 07:35:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:35:08,851][71000] Updated weights for policy 0, policy_version 193754 (0.0027) [2024-06-13 07:35:10,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3174563840. Throughput: 0: 49035.4. Samples: 2703377320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 07:35:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:35:12,038][71000] Updated weights for policy 0, policy_version 193764 (0.0037) [2024-06-13 07:35:14,836][70980] Signal inference workers to stop experience collection... (40400 times) [2024-06-13 07:35:14,888][71000] InferenceWorker_p0-w0: stopping experience collection (40400 times) [2024-06-13 07:35:14,946][70980] Signal inference workers to resume experience collection... (40400 times) [2024-06-13 07:35:14,946][71000] InferenceWorker_p0-w0: resuming experience collection (40400 times) [2024-06-13 07:35:15,890][71000] Updated weights for policy 0, policy_version 193774 (0.0025) [2024-06-13 07:35:15,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3174793216. Throughput: 0: 48814.3. Samples: 2703667200. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 07:35:15,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:35:18,822][71000] Updated weights for policy 0, policy_version 193784 (0.0037) [2024-06-13 07:35:20,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3175038976. Throughput: 0: 48932.1. Samples: 2703814680. Policy #0 lag: (min: 1.0, avg: 12.0, max: 25.0) [2024-06-13 07:35:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:35:22,470][71000] Updated weights for policy 0, policy_version 193794 (0.0035) [2024-06-13 07:35:25,418][71000] Updated weights for policy 0, policy_version 193804 (0.0029) [2024-06-13 07:35:25,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3175301120. Throughput: 0: 49236.8. Samples: 2704108220. Policy #0 lag: (min: 1.0, avg: 12.0, max: 25.0) [2024-06-13 07:35:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:35:29,154][71000] Updated weights for policy 0, policy_version 193814 (0.0029) [2024-06-13 07:35:30,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48606.1, 300 sec: 49096.5). Total num frames: 3175530496. Throughput: 0: 49384.2. Samples: 2704409020. Policy #0 lag: (min: 1.0, avg: 12.0, max: 25.0) [2024-06-13 07:35:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:35:32,259][71000] Updated weights for policy 0, policy_version 193824 (0.0026) [2024-06-13 07:35:35,910][71000] Updated weights for policy 0, policy_version 193834 (0.0024) [2024-06-13 07:35:35,939][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3175776256. Throughput: 0: 49034.4. Samples: 2704549020. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 25.0) [2024-06-13 07:35:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:35:39,074][71000] Updated weights for policy 0, policy_version 193844 (0.0032) [2024-06-13 07:35:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 3176022016. Throughput: 0: 48940.6. Samples: 2704846040. Policy #0 lag: (min: 1.0, avg: 12.0, max: 25.0) [2024-06-13 07:35:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:35:40,989][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000193850_3176038400.pth... [2024-06-13 07:35:41,034][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000193132_3164274688.pth [2024-06-13 07:35:42,527][71000] Updated weights for policy 0, policy_version 193854 (0.0032) [2024-06-13 07:35:45,538][71000] Updated weights for policy 0, policy_version 193864 (0.0019) [2024-06-13 07:35:45,940][70768] Fps is (10 sec: 52427.0, 60 sec: 49424.9, 300 sec: 49263.0). Total num frames: 3176300544. Throughput: 0: 49045.0. Samples: 2705147000. Policy #0 lag: (min: 1.0, avg: 12.0, max: 25.0) [2024-06-13 07:35:45,941][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:35:49,315][71000] Updated weights for policy 0, policy_version 193874 (0.0030) [2024-06-13 07:35:50,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3176513536. Throughput: 0: 48787.4. Samples: 2705289640. Policy #0 lag: (min: 1.0, avg: 12.0, max: 25.0) [2024-06-13 07:35:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:35:52,215][71000] Updated weights for policy 0, policy_version 193884 (0.0025) [2024-06-13 07:35:55,939][70768] Fps is (10 sec: 44238.4, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 3176742912. Throughput: 0: 48995.3. Samples: 2705582100. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 25.0) [2024-06-13 07:35:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:35:56,040][71000] Updated weights for policy 0, policy_version 193894 (0.0033) [2024-06-13 07:35:59,178][71000] Updated weights for policy 0, policy_version 193904 (0.0031) [2024-06-13 07:36:00,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.0, 300 sec: 49097.2). Total num frames: 3177005056. Throughput: 0: 48897.3. Samples: 2705867580. Policy #0 lag: (min: 1.0, avg: 12.0, max: 25.0) [2024-06-13 07:36:00,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 07:36:02,867][71000] Updated weights for policy 0, policy_version 193914 (0.0032) [2024-06-13 07:36:05,657][71000] Updated weights for policy 0, policy_version 193924 (0.0026) [2024-06-13 07:36:05,939][70768] Fps is (10 sec: 50790.4, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3177250816. Throughput: 0: 49107.1. Samples: 2706024500. Policy #0 lag: (min: 1.0, avg: 12.0, max: 25.0) [2024-06-13 07:36:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:36:09,356][71000] Updated weights for policy 0, policy_version 193934 (0.0027) [2024-06-13 07:36:10,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3177480192. Throughput: 0: 49171.6. Samples: 2706320940. Policy #0 lag: (min: 1.0, avg: 12.0, max: 25.0) [2024-06-13 07:36:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:36:12,392][71000] Updated weights for policy 0, policy_version 193944 (0.0024) [2024-06-13 07:36:15,939][70768] Fps is (10 sec: 45875.1, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 3177709568. Throughput: 0: 48773.7. Samples: 2706603840. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 25.0) [2024-06-13 07:36:15,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 07:36:16,228][71000] Updated weights for policy 0, policy_version 193954 (0.0027) [2024-06-13 07:36:19,360][71000] Updated weights for policy 0, policy_version 193964 (0.0027) [2024-06-13 07:36:20,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3178004480. Throughput: 0: 48925.8. Samples: 2706750680. Policy #0 lag: (min: 1.0, avg: 12.0, max: 25.0) [2024-06-13 07:36:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:36:22,979][70980] Signal inference workers to stop experience collection... (40450 times) [2024-06-13 07:36:23,026][71000] InferenceWorker_p0-w0: stopping experience collection (40450 times) [2024-06-13 07:36:23,089][70980] Signal inference workers to resume experience collection... (40450 times) [2024-06-13 07:36:23,089][71000] InferenceWorker_p0-w0: resuming experience collection (40450 times) [2024-06-13 07:36:23,239][71000] Updated weights for policy 0, policy_version 193974 (0.0024) [2024-06-13 07:36:25,939][70768] Fps is (10 sec: 50790.5, 60 sec: 48606.0, 300 sec: 49040.9). Total num frames: 3178217472. Throughput: 0: 48724.0. Samples: 2707038620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 07:36:25,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:36:25,994][71000] Updated weights for policy 0, policy_version 193984 (0.0028) [2024-06-13 07:36:29,723][71000] Updated weights for policy 0, policy_version 193994 (0.0028) [2024-06-13 07:36:30,939][70768] Fps is (10 sec: 44236.7, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3178446848. Throughput: 0: 48555.0. Samples: 2707331960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 07:36:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:36:32,791][71000] Updated weights for policy 0, policy_version 194004 (0.0026) [2024-06-13 07:36:35,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48605.7, 300 sec: 48985.3). Total num frames: 3178692608. Throughput: 0: 48591.6. Samples: 2707476260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 07:36:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:36:36,621][71000] Updated weights for policy 0, policy_version 194014 (0.0030) [2024-06-13 07:36:39,686][71000] Updated weights for policy 0, policy_version 194024 (0.0029) [2024-06-13 07:36:40,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3178971136. Throughput: 0: 48631.5. Samples: 2707770520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 07:36:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:36:43,349][71000] Updated weights for policy 0, policy_version 194034 (0.0023) [2024-06-13 07:36:45,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48059.8, 300 sec: 48985.4). Total num frames: 3179184128. Throughput: 0: 48750.9. Samples: 2708061380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 07:36:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:36:46,268][71000] Updated weights for policy 0, policy_version 194044 (0.0035) [2024-06-13 07:36:49,877][71000] Updated weights for policy 0, policy_version 194054 (0.0033) [2024-06-13 07:36:50,940][70768] Fps is (10 sec: 44236.8, 60 sec: 48332.9, 300 sec: 48929.8). Total num frames: 3179413504. Throughput: 0: 48413.7. Samples: 2708203120. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 07:36:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:36:52,873][71000] Updated weights for policy 0, policy_version 194064 (0.0023) [2024-06-13 07:36:55,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 3179675648. Throughput: 0: 48363.0. Samples: 2708497280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 07:36:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:36:56,742][71000] Updated weights for policy 0, policy_version 194074 (0.0028) [2024-06-13 07:36:59,901][71000] Updated weights for policy 0, policy_version 194084 (0.0028) [2024-06-13 07:37:00,940][70768] Fps is (10 sec: 52429.0, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3179937792. Throughput: 0: 48612.0. Samples: 2708791380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 07:37:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:37:03,291][71000] Updated weights for policy 0, policy_version 194094 (0.0037) [2024-06-13 07:37:05,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48332.7, 300 sec: 48929.9). Total num frames: 3180150784. Throughput: 0: 48719.0. Samples: 2708943040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 07:37:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:37:06,431][71000] Updated weights for policy 0, policy_version 194104 (0.0026) [2024-06-13 07:37:09,688][71000] Updated weights for policy 0, policy_version 194114 (0.0021) [2024-06-13 07:37:10,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3180412928. Throughput: 0: 48811.9. Samples: 2709235160. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 07:37:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:37:13,189][71000] Updated weights for policy 0, policy_version 194124 (0.0035) [2024-06-13 07:37:15,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 3180658688. Throughput: 0: 48655.4. Samples: 2709521460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 07:37:15,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:37:16,648][71000] Updated weights for policy 0, policy_version 194134 (0.0028) [2024-06-13 07:37:19,826][71000] Updated weights for policy 0, policy_version 194144 (0.0027) [2024-06-13 07:37:20,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48332.6, 300 sec: 48985.4). Total num frames: 3180904448. Throughput: 0: 48875.5. Samples: 2709675660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 07:37:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:37:23,570][71000] Updated weights for policy 0, policy_version 194154 (0.0030) [2024-06-13 07:37:25,939][70768] Fps is (10 sec: 47514.3, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 3181133824. Throughput: 0: 48734.3. Samples: 2709963560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 07:37:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:37:26,802][71000] Updated weights for policy 0, policy_version 194164 (0.0028) [2024-06-13 07:37:29,989][71000] Updated weights for policy 0, policy_version 194174 (0.0028) [2024-06-13 07:37:30,939][70768] Fps is (10 sec: 44237.8, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 3181346816. Throughput: 0: 48802.5. Samples: 2710257480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 07:37:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:37:33,441][71000] Updated weights for policy 0, policy_version 194184 (0.0026) [2024-06-13 07:37:35,566][70980] Signal inference workers to stop experience collection... 
(40500 times) [2024-06-13 07:37:35,607][71000] InferenceWorker_p0-w0: stopping experience collection (40500 times) [2024-06-13 07:37:35,618][70980] Signal inference workers to resume experience collection... (40500 times) [2024-06-13 07:37:35,624][71000] InferenceWorker_p0-w0: resuming experience collection (40500 times) [2024-06-13 07:37:35,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3181641728. Throughput: 0: 48881.3. Samples: 2710402780. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-13 07:37:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:37:36,702][71000] Updated weights for policy 0, policy_version 194194 (0.0031) [2024-06-13 07:37:40,258][71000] Updated weights for policy 0, policy_version 194204 (0.0033) [2024-06-13 07:37:40,940][70768] Fps is (10 sec: 52427.9, 60 sec: 48332.7, 300 sec: 48929.8). Total num frames: 3181871104. Throughput: 0: 48821.3. Samples: 2710694240. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-13 07:37:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:37:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000194206_3181871104.pth... [2024-06-13 07:37:41,016][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000193490_3170140160.pth [2024-06-13 07:37:43,577][71000] Updated weights for policy 0, policy_version 194214 (0.0024) [2024-06-13 07:37:45,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 3182100480. Throughput: 0: 48694.0. Samples: 2710982620. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-13 07:37:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:37:47,063][71000] Updated weights for policy 0, policy_version 194224 (0.0039) [2024-06-13 07:37:50,003][71000] Updated weights for policy 0, policy_version 194234 (0.0029) [2024-06-13 07:37:50,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.8, 300 sec: 48818.7). 
Total num frames: 3182329856. Throughput: 0: 48327.9. Samples: 2711117800. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-13 07:37:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:37:53,566][71000] Updated weights for policy 0, policy_version 194244 (0.0028) [2024-06-13 07:37:55,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3182624768. Throughput: 0: 48688.9. Samples: 2711426160. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-13 07:37:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 07:37:56,440][71000] Updated weights for policy 0, policy_version 194254 (0.0030) [2024-06-13 07:38:00,107][71000] Updated weights for policy 0, policy_version 194264 (0.0027) [2024-06-13 07:38:00,940][70768] Fps is (10 sec: 54068.0, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3182870528. Throughput: 0: 48793.4. Samples: 2711717160. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-13 07:38:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 07:38:03,235][71000] Updated weights for policy 0, policy_version 194274 (0.0032) [2024-06-13 07:38:05,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3183083520. Throughput: 0: 48635.3. Samples: 2711864240. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-13 07:38:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:38:06,652][71000] Updated weights for policy 0, policy_version 194284 (0.0025) [2024-06-13 07:38:09,812][71000] Updated weights for policy 0, policy_version 194294 (0.0036) [2024-06-13 07:38:10,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3183329280. Throughput: 0: 48924.3. Samples: 2712165160. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-13 07:38:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:38:13,276][71000] Updated weights for policy 0, policy_version 194304 (0.0030) [2024-06-13 07:38:15,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3183607808. Throughput: 0: 48986.1. Samples: 2712461860. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-13 07:38:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:38:16,367][71000] Updated weights for policy 0, policy_version 194314 (0.0039) [2024-06-13 07:38:20,136][71000] Updated weights for policy 0, policy_version 194324 (0.0037) [2024-06-13 07:38:20,939][70768] Fps is (10 sec: 50790.9, 60 sec: 48879.1, 300 sec: 48929.9). Total num frames: 3183837184. Throughput: 0: 48997.9. Samples: 2712607680. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-13 07:38:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:38:23,287][71000] Updated weights for policy 0, policy_version 194334 (0.0021) [2024-06-13 07:38:25,939][70768] Fps is (10 sec: 45875.9, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3184066560. Throughput: 0: 49035.3. Samples: 2712900820. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-13 07:38:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:38:26,859][71000] Updated weights for policy 0, policy_version 194344 (0.0028) [2024-06-13 07:38:30,074][71000] Updated weights for policy 0, policy_version 194354 (0.0030) [2024-06-13 07:38:30,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 3184312320. Throughput: 0: 49233.9. Samples: 2713198140. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-13 07:38:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:38:33,599][71000] Updated weights for policy 0, policy_version 194364 (0.0031) [2024-06-13 07:38:35,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3184590848. Throughput: 0: 49467.2. Samples: 2713343820. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-13 07:38:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:38:36,530][71000] Updated weights for policy 0, policy_version 194374 (0.0024) [2024-06-13 07:38:39,235][70980] Signal inference workers to stop experience collection... (40550 times) [2024-06-13 07:38:39,235][70980] Signal inference workers to resume experience collection... (40550 times) [2024-06-13 07:38:39,280][71000] InferenceWorker_p0-w0: stopping experience collection (40550 times) [2024-06-13 07:38:39,280][71000] InferenceWorker_p0-w0: resuming experience collection (40550 times) [2024-06-13 07:38:40,309][71000] Updated weights for policy 0, policy_version 194384 (0.0038) [2024-06-13 07:38:40,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 3184836608. Throughput: 0: 49162.6. Samples: 2713638480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:38:40,944][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:38:43,614][71000] Updated weights for policy 0, policy_version 194394 (0.0037) [2024-06-13 07:38:45,939][70768] Fps is (10 sec: 44237.4, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 3185033216. Throughput: 0: 49071.2. Samples: 2713925360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:38:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:38:47,178][71000] Updated weights for policy 0, policy_version 194404 (0.0021) [2024-06-13 07:38:50,198][71000] Updated weights for policy 0, policy_version 194414 (0.0030) [2024-06-13 07:38:50,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49698.2, 300 sec: 48929.8). Total num frames: 3185311744. Throughput: 0: 48993.3. Samples: 2714068940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:38:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:38:53,848][71000] Updated weights for policy 0, policy_version 194424 (0.0024) [2024-06-13 07:38:55,939][70768] Fps is (10 sec: 50790.5, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 3185541120. Throughput: 0: 48765.0. Samples: 2714359580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:38:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:38:56,996][71000] Updated weights for policy 0, policy_version 194434 (0.0027) [2024-06-13 07:39:00,605][71000] Updated weights for policy 0, policy_version 194444 (0.0039) [2024-06-13 07:39:00,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3185786880. Throughput: 0: 48784.6. Samples: 2714657160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:39:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:39:03,552][71000] Updated weights for policy 0, policy_version 194454 (0.0019) [2024-06-13 07:39:05,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 3186016256. Throughput: 0: 48614.6. Samples: 2714795340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:39:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 07:39:07,141][71000] Updated weights for policy 0, policy_version 194464 (0.0035) [2024-06-13 07:39:10,517][71000] Updated weights for policy 0, policy_version 194474 (0.0031) [2024-06-13 07:39:10,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 3186294784. Throughput: 0: 48712.3. Samples: 2715092880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:39:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:39:14,160][71000] Updated weights for policy 0, policy_version 194484 (0.0028) [2024-06-13 07:39:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 3186507776. Throughput: 0: 48588.0. Samples: 2715384600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:39:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:39:17,061][71000] Updated weights for policy 0, policy_version 194494 (0.0040) [2024-06-13 07:39:20,693][71000] Updated weights for policy 0, policy_version 194504 (0.0028) [2024-06-13 07:39:20,939][70768] Fps is (10 sec: 45875.8, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 3186753536. Throughput: 0: 48572.1. Samples: 2715529560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:39:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:39:23,919][71000] Updated weights for policy 0, policy_version 194514 (0.0023) [2024-06-13 07:39:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 3186999296. Throughput: 0: 48449.9. Samples: 2715818720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:39:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:39:27,470][71000] Updated weights for policy 0, policy_version 194524 (0.0029) [2024-06-13 07:39:30,450][71000] Updated weights for policy 0, policy_version 194534 (0.0031) [2024-06-13 07:39:30,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 3187277824. Throughput: 0: 48825.7. Samples: 2716122520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:39:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:39:34,106][71000] Updated weights for policy 0, policy_version 194544 (0.0025) [2024-06-13 07:39:35,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3187507200. Throughput: 0: 48996.5. Samples: 2716273780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:39:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:39:37,060][71000] Updated weights for policy 0, policy_version 194554 (0.0037) [2024-06-13 07:39:40,553][71000] Updated weights for policy 0, policy_version 194564 (0.0037) [2024-06-13 07:39:40,940][70768] Fps is (10 sec: 47512.5, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3187752960. Throughput: 0: 49192.5. Samples: 2716573260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 07:39:40,941][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:39:40,959][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000194565_3187752960.pth... [2024-06-13 07:39:41,015][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000193850_3176038400.pth [2024-06-13 07:39:43,792][71000] Updated weights for policy 0, policy_version 194574 (0.0032) [2024-06-13 07:39:44,351][70980] Signal inference workers to stop experience collection... 
(40600 times) [2024-06-13 07:39:44,380][71000] InferenceWorker_p0-w0: stopping experience collection (40600 times) [2024-06-13 07:39:44,467][70980] Signal inference workers to resume experience collection... (40600 times) [2024-06-13 07:39:44,468][71000] InferenceWorker_p0-w0: resuming experience collection (40600 times) [2024-06-13 07:39:45,939][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3187982336. Throughput: 0: 48975.2. Samples: 2716861040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 07:39:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:39:47,538][71000] Updated weights for policy 0, policy_version 194584 (0.0026) [2024-06-13 07:39:50,623][71000] Updated weights for policy 0, policy_version 194594 (0.0027) [2024-06-13 07:39:50,940][70768] Fps is (10 sec: 49153.3, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3188244480. Throughput: 0: 49137.8. Samples: 2717006540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 07:39:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:39:53,907][71000] Updated weights for policy 0, policy_version 194604 (0.0032) [2024-06-13 07:39:55,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.8, 300 sec: 48929.8). Total num frames: 3188490240. Throughput: 0: 49200.4. Samples: 2717306900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 07:39:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:39:57,097][71000] Updated weights for policy 0, policy_version 194614 (0.0033) [2024-06-13 07:40:00,818][71000] Updated weights for policy 0, policy_version 194624 (0.0023) [2024-06-13 07:40:00,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 3188719616. Throughput: 0: 49323.0. Samples: 2717604140. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 07:40:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:40:03,609][71000] Updated weights for policy 0, policy_version 194634 (0.0031) [2024-06-13 07:40:05,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 3188965376. Throughput: 0: 49349.3. Samples: 2717750280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 07:40:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:40:07,414][71000] Updated weights for policy 0, policy_version 194644 (0.0033) [2024-06-13 07:40:10,351][71000] Updated weights for policy 0, policy_version 194654 (0.0021) [2024-06-13 07:40:10,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 3189227520. Throughput: 0: 49348.4. Samples: 2718039400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 07:40:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:40:13,946][71000] Updated weights for policy 0, policy_version 194664 (0.0042) [2024-06-13 07:40:15,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 3189473280. Throughput: 0: 49241.4. Samples: 2718338380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 07:40:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:40:17,199][71000] Updated weights for policy 0, policy_version 194674 (0.0032) [2024-06-13 07:40:20,786][71000] Updated weights for policy 0, policy_version 194684 (0.0041) [2024-06-13 07:40:20,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 3189702656. Throughput: 0: 49139.6. Samples: 2718485060. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 07:40:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:40:23,673][71000] Updated weights for policy 0, policy_version 194694 (0.0027) [2024-06-13 07:40:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3189948416. Throughput: 0: 49033.2. Samples: 2718779740. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 07:40:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:40:27,255][71000] Updated weights for policy 0, policy_version 194704 (0.0022) [2024-06-13 07:40:30,527][71000] Updated weights for policy 0, policy_version 194714 (0.0028) [2024-06-13 07:40:30,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3190194176. Throughput: 0: 49230.5. Samples: 2719076420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 07:40:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:40:34,192][71000] Updated weights for policy 0, policy_version 194724 (0.0035) [2024-06-13 07:40:35,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 3190456320. Throughput: 0: 49146.8. Samples: 2719218140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 07:40:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:40:37,434][71000] Updated weights for policy 0, policy_version 194734 (0.0024) [2024-06-13 07:40:40,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48606.1, 300 sec: 48707.7). Total num frames: 3190669312. Throughput: 0: 48844.6. Samples: 2719504900. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 07:40:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:40:41,058][71000] Updated weights for policy 0, policy_version 194744 (0.0030) [2024-06-13 07:40:44,175][71000] Updated weights for policy 0, policy_version 194754 (0.0039) [2024-06-13 07:40:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3190931456. Throughput: 0: 48825.5. Samples: 2719801280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 07:40:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:40:47,579][71000] Updated weights for policy 0, policy_version 194764 (0.0039) [2024-06-13 07:40:50,683][71000] Updated weights for policy 0, policy_version 194774 (0.0027) [2024-06-13 07:40:50,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3191177216. Throughput: 0: 48899.4. Samples: 2719950760. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-13 07:40:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:40:51,601][70980] Signal inference workers to stop experience collection... (40650 times) [2024-06-13 07:40:51,601][70980] Signal inference workers to resume experience collection... (40650 times) [2024-06-13 07:40:51,610][71000] InferenceWorker_p0-w0: stopping experience collection (40650 times) [2024-06-13 07:40:51,611][71000] InferenceWorker_p0-w0: resuming experience collection (40650 times) [2024-06-13 07:40:54,269][71000] Updated weights for policy 0, policy_version 194784 (0.0027) [2024-06-13 07:40:55,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.2, 300 sec: 48929.9). Total num frames: 3191439360. Throughput: 0: 49182.3. Samples: 2720252600. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:40:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:40:57,618][71000] Updated weights for policy 0, policy_version 194794 (0.0028) [2024-06-13 07:41:00,862][71000] Updated weights for policy 0, policy_version 194804 (0.0028) [2024-06-13 07:41:00,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 3191668736. Throughput: 0: 49087.9. Samples: 2720547340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:41:00,940][70768] Avg episode reward: [(0, '0.295')] [2024-06-13 07:41:04,181][71000] Updated weights for policy 0, policy_version 194814 (0.0027) [2024-06-13 07:41:05,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 3191914496. Throughput: 0: 48928.4. Samples: 2720686840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:41:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:41:07,538][71000] Updated weights for policy 0, policy_version 194824 (0.0033) [2024-06-13 07:41:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 3192143872. Throughput: 0: 48942.6. Samples: 2720982160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:41:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:41:10,964][71000] Updated weights for policy 0, policy_version 194834 (0.0031) [2024-06-13 07:41:14,326][71000] Updated weights for policy 0, policy_version 194844 (0.0035) [2024-06-13 07:41:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3192422400. Throughput: 0: 48776.6. Samples: 2721271360. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:41:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:41:17,951][71000] Updated weights for policy 0, policy_version 194854 (0.0022) [2024-06-13 07:41:20,836][71000] Updated weights for policy 0, policy_version 194864 (0.0025) [2024-06-13 07:41:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 3192651776. Throughput: 0: 49139.3. Samples: 2721429420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:41:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:41:24,281][71000] Updated weights for policy 0, policy_version 194874 (0.0030) [2024-06-13 07:41:25,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3192881152. Throughput: 0: 49279.1. Samples: 2721722460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:41:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:41:27,326][71000] Updated weights for policy 0, policy_version 194884 (0.0027) [2024-06-13 07:41:30,939][70768] Fps is (10 sec: 47514.5, 60 sec: 48879.1, 300 sec: 48929.9). Total num frames: 3193126912. Throughput: 0: 49201.8. Samples: 2722015360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:41:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:41:30,966][71000] Updated weights for policy 0, policy_version 194894 (0.0023) [2024-06-13 07:41:33,898][71000] Updated weights for policy 0, policy_version 194904 (0.0032) [2024-06-13 07:41:35,939][70768] Fps is (10 sec: 50790.9, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3193389056. Throughput: 0: 49186.1. Samples: 2722164120. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:41:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:41:37,783][71000] Updated weights for policy 0, policy_version 194914 (0.0024) [2024-06-13 07:41:40,391][71000] Updated weights for policy 0, policy_version 194924 (0.0026) [2024-06-13 07:41:40,940][70768] Fps is (10 sec: 52425.7, 60 sec: 49697.7, 300 sec: 49040.9). Total num frames: 3193651200. Throughput: 0: 49016.2. Samples: 2722458360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:41:40,941][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:41:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000194925_3193651200.pth... [2024-06-13 07:41:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000194206_3181871104.pth [2024-06-13 07:41:44,323][71000] Updated weights for policy 0, policy_version 194934 (0.0026) [2024-06-13 07:41:45,940][70768] Fps is (10 sec: 49150.7, 60 sec: 49151.8, 300 sec: 49040.9). Total num frames: 3193880576. Throughput: 0: 49169.2. Samples: 2722759960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:41:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:41:47,074][71000] Updated weights for policy 0, policy_version 194944 (0.0026) [2024-06-13 07:41:50,940][70768] Fps is (10 sec: 45877.6, 60 sec: 48879.1, 300 sec: 48929.9). Total num frames: 3194109952. Throughput: 0: 49136.9. Samples: 2722898000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:41:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:41:51,018][71000] Updated weights for policy 0, policy_version 194954 (0.0029) [2024-06-13 07:41:53,828][71000] Updated weights for policy 0, policy_version 194964 (0.0030) [2024-06-13 07:41:55,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3194372096. Throughput: 0: 49008.5. Samples: 2723187540. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-13 07:41:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:41:57,213][70980] Signal inference workers to stop experience collection... (40700 times) [2024-06-13 07:41:57,247][71000] InferenceWorker_p0-w0: stopping experience collection (40700 times) [2024-06-13 07:41:57,259][70980] Signal inference workers to resume experience collection... (40700 times) [2024-06-13 07:41:57,271][71000] InferenceWorker_p0-w0: resuming experience collection (40700 times) [2024-06-13 07:41:57,389][71000] Updated weights for policy 0, policy_version 194974 (0.0022) [2024-06-13 07:42:00,554][71000] Updated weights for policy 0, policy_version 194984 (0.0029) [2024-06-13 07:42:00,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 3194634240. Throughput: 0: 49205.3. Samples: 2723485600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:42:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:42:04,091][71000] Updated weights for policy 0, policy_version 194994 (0.0021) [2024-06-13 07:42:05,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3194863616. Throughput: 0: 49263.8. Samples: 2723646280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:42:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:42:06,877][71000] Updated weights for policy 0, policy_version 195004 (0.0023) [2024-06-13 07:42:10,639][71000] Updated weights for policy 0, policy_version 195014 (0.0033) [2024-06-13 07:42:10,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49040.9). Total num frames: 3195125760. Throughput: 0: 49336.5. Samples: 2723942600. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:42:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:42:13,743][71000] Updated weights for policy 0, policy_version 195024 (0.0031) [2024-06-13 07:42:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 3195338752. Throughput: 0: 49218.6. Samples: 2724230200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:42:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:42:17,328][71000] Updated weights for policy 0, policy_version 195034 (0.0026) [2024-06-13 07:42:20,192][71000] Updated weights for policy 0, policy_version 195044 (0.0055) [2024-06-13 07:42:20,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 3195633664. Throughput: 0: 49286.9. Samples: 2724382040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:42:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:42:24,114][71000] Updated weights for policy 0, policy_version 195054 (0.0038) [2024-06-13 07:42:25,939][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 3195846656. Throughput: 0: 49197.1. Samples: 2724672200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:42:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:42:26,914][71000] Updated weights for policy 0, policy_version 195064 (0.0022) [2024-06-13 07:42:30,407][71000] Updated weights for policy 0, policy_version 195074 (0.0029) [2024-06-13 07:42:30,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49424.9, 300 sec: 48985.4). Total num frames: 3196092416. Throughput: 0: 49081.9. Samples: 2724968640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:42:30,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:42:33,789][71000] Updated weights for policy 0, policy_version 195084 (0.0029) [2024-06-13 07:42:35,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 3196321792. Throughput: 0: 49155.6. Samples: 2725110000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:42:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:42:37,374][71000] Updated weights for policy 0, policy_version 195094 (0.0024) [2024-06-13 07:42:40,274][71000] Updated weights for policy 0, policy_version 195104 (0.0035) [2024-06-13 07:42:40,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49425.4, 300 sec: 49207.5). Total num frames: 3196616704. Throughput: 0: 49336.8. Samples: 2725407700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:42:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:42:44,255][71000] Updated weights for policy 0, policy_version 195114 (0.0026) [2024-06-13 07:42:45,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3196829696. Throughput: 0: 49445.3. Samples: 2725710640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:42:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:42:46,808][71000] Updated weights for policy 0, policy_version 195124 (0.0023) [2024-06-13 07:42:50,773][71000] Updated weights for policy 0, policy_version 195134 (0.0034) [2024-06-13 07:42:50,940][70768] Fps is (10 sec: 45875.4, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 3197075456. Throughput: 0: 49032.3. Samples: 2725852740. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:42:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:42:53,624][71000] Updated weights for policy 0, policy_version 195144 (0.0035) [2024-06-13 07:42:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3197304832. Throughput: 0: 48715.5. Samples: 2726134800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:42:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:42:57,675][71000] Updated weights for policy 0, policy_version 195154 (0.0029) [2024-06-13 07:43:00,352][71000] Updated weights for policy 0, policy_version 195164 (0.0032) [2024-06-13 07:43:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3197583360. Throughput: 0: 48699.9. Samples: 2726421700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:43:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:43:04,574][71000] Updated weights for policy 0, policy_version 195174 (0.0031) [2024-06-13 07:43:05,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3197812736. Throughput: 0: 48950.4. Samples: 2726584800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 07:43:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:43:07,051][71000] Updated weights for policy 0, policy_version 195184 (0.0029) [2024-06-13 07:43:10,940][70768] Fps is (10 sec: 44236.3, 60 sec: 48332.7, 300 sec: 48874.3). Total num frames: 3198025728. Throughput: 0: 49057.5. Samples: 2726879800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 07:43:10,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 07:43:11,442][71000] Updated weights for policy 0, policy_version 195194 (0.0040) [2024-06-13 07:43:11,930][70980] Signal inference workers to stop experience collection... 
(40750 times) [2024-06-13 07:43:11,931][70980] Signal inference workers to resume experience collection... (40750 times) [2024-06-13 07:43:11,955][71000] InferenceWorker_p0-w0: stopping experience collection (40750 times) [2024-06-13 07:43:11,956][71000] InferenceWorker_p0-w0: resuming experience collection (40750 times) [2024-06-13 07:43:13,780][71000] Updated weights for policy 0, policy_version 195204 (0.0028) [2024-06-13 07:43:15,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 3198271488. Throughput: 0: 48845.0. Samples: 2727166660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 07:43:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:43:18,158][71000] Updated weights for policy 0, policy_version 195214 (0.0030) [2024-06-13 07:43:20,588][71000] Updated weights for policy 0, policy_version 195224 (0.0028) [2024-06-13 07:43:20,940][70768] Fps is (10 sec: 54068.3, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 3198566400. Throughput: 0: 49020.9. Samples: 2727315940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 07:43:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:43:24,681][71000] Updated weights for policy 0, policy_version 195234 (0.0030) [2024-06-13 07:43:25,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3198795776. Throughput: 0: 48940.1. Samples: 2727610000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 07:43:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:43:27,429][71000] Updated weights for policy 0, policy_version 195244 (0.0026) [2024-06-13 07:43:30,940][70768] Fps is (10 sec: 42598.3, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 3198992384. Throughput: 0: 48707.6. Samples: 2727902480. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 07:43:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:43:31,674][71000] Updated weights for policy 0, policy_version 195254 (0.0030) [2024-06-13 07:43:34,232][71000] Updated weights for policy 0, policy_version 195264 (0.0027) [2024-06-13 07:43:35,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 48929.9). Total num frames: 3199270912. Throughput: 0: 48559.1. Samples: 2728037900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 07:43:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:43:38,297][71000] Updated weights for policy 0, policy_version 195274 (0.0035) [2024-06-13 07:43:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48059.8, 300 sec: 49040.9). Total num frames: 3199500288. Throughput: 0: 48859.1. Samples: 2728333460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 07:43:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:43:41,093][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000195284_3199533056.pth... [2024-06-13 07:43:41,096][71000] Updated weights for policy 0, policy_version 195284 (0.0031) [2024-06-13 07:43:41,137][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000194565_3187752960.pth [2024-06-13 07:43:45,171][71000] Updated weights for policy 0, policy_version 195294 (0.0027) [2024-06-13 07:43:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3199762432. Throughput: 0: 49011.5. Samples: 2728627220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 07:43:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:43:47,833][71000] Updated weights for policy 0, policy_version 195304 (0.0025) [2024-06-13 07:43:50,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48332.8, 300 sec: 48929.8). Total num frames: 3199975424. Throughput: 0: 48518.6. Samples: 2728768140. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 07:43:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:43:51,721][71000] Updated weights for policy 0, policy_version 195314 (0.0020) [2024-06-13 07:43:54,375][71000] Updated weights for policy 0, policy_version 195324 (0.0027) [2024-06-13 07:43:55,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3200253952. Throughput: 0: 48586.8. Samples: 2729066200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 07:43:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:43:58,301][71000] Updated weights for policy 0, policy_version 195334 (0.0026) [2024-06-13 07:44:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48332.8, 300 sec: 49040.9). Total num frames: 3200483328. Throughput: 0: 48623.1. Samples: 2729354700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 07:44:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:44:01,218][71000] Updated weights for policy 0, policy_version 195344 (0.0028) [2024-06-13 07:44:05,105][71000] Updated weights for policy 0, policy_version 195354 (0.0032) [2024-06-13 07:44:05,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3200745472. Throughput: 0: 48790.6. Samples: 2729511520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 07:44:05,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:44:07,998][71000] Updated weights for policy 0, policy_version 195364 (0.0030) [2024-06-13 07:44:10,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 3200942080. Throughput: 0: 48673.4. Samples: 2729800300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 07:44:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:44:11,734][71000] Updated weights for policy 0, policy_version 195374 (0.0031) [2024-06-13 07:44:12,240][70980] Signal inference workers to stop experience collection... 
(40800 times) [2024-06-13 07:44:12,273][71000] InferenceWorker_p0-w0: stopping experience collection (40800 times) [2024-06-13 07:44:12,295][70980] Signal inference workers to resume experience collection... (40800 times) [2024-06-13 07:44:12,300][71000] InferenceWorker_p0-w0: resuming experience collection (40800 times) [2024-06-13 07:44:14,555][71000] Updated weights for policy 0, policy_version 195384 (0.0023) [2024-06-13 07:44:15,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3201220608. Throughput: 0: 48680.0. Samples: 2730093080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-13 07:44:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:44:18,450][71000] Updated weights for policy 0, policy_version 195394 (0.0030) [2024-06-13 07:44:20,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48332.8, 300 sec: 49040.9). Total num frames: 3201466368. Throughput: 0: 49104.5. Samples: 2730247600. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-13 07:44:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:44:21,551][71000] Updated weights for policy 0, policy_version 195404 (0.0033) [2024-06-13 07:44:25,091][71000] Updated weights for policy 0, policy_version 195414 (0.0043) [2024-06-13 07:44:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 3201712128. Throughput: 0: 48965.4. Samples: 2730536900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-13 07:44:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:44:28,210][71000] Updated weights for policy 0, policy_version 195424 (0.0027) [2024-06-13 07:44:30,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 3201941504. Throughput: 0: 48878.8. Samples: 2730826760. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-13 07:44:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:44:31,693][71000] Updated weights for policy 0, policy_version 195434 (0.0022) [2024-06-13 07:44:35,062][71000] Updated weights for policy 0, policy_version 195444 (0.0023) [2024-06-13 07:44:35,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3202203648. Throughput: 0: 48785.3. Samples: 2730963480. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-13 07:44:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:44:38,369][71000] Updated weights for policy 0, policy_version 195454 (0.0036) [2024-06-13 07:44:40,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3202449408. Throughput: 0: 48866.7. Samples: 2731265200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-13 07:44:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:44:41,703][71000] Updated weights for policy 0, policy_version 195464 (0.0028) [2024-06-13 07:44:45,169][71000] Updated weights for policy 0, policy_version 195474 (0.0030) [2024-06-13 07:44:45,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48606.0, 300 sec: 48929.8). Total num frames: 3202678784. Throughput: 0: 49009.4. Samples: 2731560120. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-13 07:44:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:44:48,568][71000] Updated weights for policy 0, policy_version 195484 (0.0029) [2024-06-13 07:44:50,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3202908160. Throughput: 0: 48702.7. Samples: 2731703140. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-13 07:44:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:44:51,764][71000] Updated weights for policy 0, policy_version 195494 (0.0024) [2024-06-13 07:44:55,104][71000] Updated weights for policy 0, policy_version 195504 (0.0033) [2024-06-13 07:44:55,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48332.7, 300 sec: 48929.8). Total num frames: 3203153920. Throughput: 0: 48532.3. Samples: 2731984260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-13 07:44:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:44:58,724][71000] Updated weights for policy 0, policy_version 195514 (0.0026) [2024-06-13 07:45:00,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3203432448. Throughput: 0: 48370.6. Samples: 2732269760. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-13 07:45:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:45:02,194][71000] Updated weights for policy 0, policy_version 195524 (0.0033) [2024-06-13 07:45:05,185][71000] Updated weights for policy 0, policy_version 195534 (0.0030) [2024-06-13 07:45:05,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 3203661824. Throughput: 0: 48405.6. Samples: 2732425860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-13 07:45:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:45:08,519][71000] Updated weights for policy 0, policy_version 195544 (0.0025) [2024-06-13 07:45:10,939][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3203891200. Throughput: 0: 48740.5. Samples: 2732730220. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-13 07:45:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:45:11,695][71000] Updated weights for policy 0, policy_version 195554 (0.0024) [2024-06-13 07:45:14,116][70980] Signal inference workers to stop experience collection... 
(40850 times) [2024-06-13 07:45:14,164][70980] Signal inference workers to resume experience collection... (40850 times) [2024-06-13 07:45:14,164][71000] InferenceWorker_p0-w0: stopping experience collection (40850 times) [2024-06-13 07:45:14,175][71000] InferenceWorker_p0-w0: resuming experience collection (40850 times) [2024-06-13 07:45:15,225][71000] Updated weights for policy 0, policy_version 195564 (0.0027) [2024-06-13 07:45:15,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 3204136960. Throughput: 0: 48727.6. Samples: 2733019500. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-13 07:45:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:45:18,288][71000] Updated weights for policy 0, policy_version 195574 (0.0029) [2024-06-13 07:45:20,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 3204382720. Throughput: 0: 48904.5. Samples: 2733164180. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-13 07:45:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:45:22,202][71000] Updated weights for policy 0, policy_version 195584 (0.0027) [2024-06-13 07:45:25,240][71000] Updated weights for policy 0, policy_version 195594 (0.0026) [2024-06-13 07:45:25,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3204661248. Throughput: 0: 48818.1. Samples: 2733462020. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-13 07:45:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:45:28,584][71000] Updated weights for policy 0, policy_version 195604 (0.0029) [2024-06-13 07:45:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 3204890624. Throughput: 0: 48852.5. Samples: 2733758480. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-13 07:45:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:45:31,555][71000] Updated weights for policy 0, policy_version 195614 (0.0026) [2024-06-13 07:45:35,246][71000] Updated weights for policy 0, policy_version 195624 (0.0037) [2024-06-13 07:45:35,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3205120000. Throughput: 0: 48914.2. Samples: 2733904280. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-13 07:45:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:45:38,521][71000] Updated weights for policy 0, policy_version 195634 (0.0027) [2024-06-13 07:45:40,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48605.7, 300 sec: 48929.8). Total num frames: 3205365760. Throughput: 0: 49090.7. Samples: 2734193340. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-13 07:45:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:45:40,973][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000195641_3205382144.pth... [2024-06-13 07:45:41,018][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000194925_3193651200.pth [2024-06-13 07:45:41,959][71000] Updated weights for policy 0, policy_version 195644 (0.0024) [2024-06-13 07:45:44,902][71000] Updated weights for policy 0, policy_version 195654 (0.0028) [2024-06-13 07:45:45,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3205644288. Throughput: 0: 49263.1. Samples: 2734486600. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-13 07:45:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:45:48,681][71000] Updated weights for policy 0, policy_version 195664 (0.0023) [2024-06-13 07:45:50,939][70768] Fps is (10 sec: 49152.9, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3205857280. Throughput: 0: 49260.2. Samples: 2734642560. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-13 07:45:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:45:51,811][71000] Updated weights for policy 0, policy_version 195674 (0.0027) [2024-06-13 07:45:55,458][71000] Updated weights for policy 0, policy_version 195684 (0.0028) [2024-06-13 07:45:55,939][70768] Fps is (10 sec: 44237.3, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 3206086656. Throughput: 0: 48860.9. Samples: 2734928960. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-13 07:45:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:45:58,439][71000] Updated weights for policy 0, policy_version 195694 (0.0023) [2024-06-13 07:46:00,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48332.7, 300 sec: 48874.3). Total num frames: 3206332416. Throughput: 0: 48822.1. Samples: 2735216500. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-13 07:46:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:46:02,443][71000] Updated weights for policy 0, policy_version 195704 (0.0029) [2024-06-13 07:46:05,084][71000] Updated weights for policy 0, policy_version 195714 (0.0030) [2024-06-13 07:46:05,940][70768] Fps is (10 sec: 54065.9, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 3206627328. Throughput: 0: 49090.5. Samples: 2735373260. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-13 07:46:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:46:08,881][71000] Updated weights for policy 0, policy_version 195724 (0.0025) [2024-06-13 07:46:10,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 3206856704. Throughput: 0: 49025.8. Samples: 2735668180. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-13 07:46:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:46:11,603][71000] Updated weights for policy 0, policy_version 195734 (0.0031) [2024-06-13 07:46:15,437][71000] Updated weights for policy 0, policy_version 195744 (0.0029) [2024-06-13 07:46:15,940][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 3207086080. Throughput: 0: 49084.4. Samples: 2735967280. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-13 07:46:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:46:18,170][71000] Updated weights for policy 0, policy_version 195754 (0.0026) [2024-06-13 07:46:20,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3207315456. Throughput: 0: 48918.1. Samples: 2736105600. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-13 07:46:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:46:22,225][71000] Updated weights for policy 0, policy_version 195764 (0.0026) [2024-06-13 07:46:24,866][71000] Updated weights for policy 0, policy_version 195774 (0.0030) [2024-06-13 07:46:25,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3207610368. Throughput: 0: 49123.7. Samples: 2736403900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-13 07:46:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:46:29,002][71000] Updated weights for policy 0, policy_version 195784 (0.0028) [2024-06-13 07:46:30,939][70768] Fps is (10 sec: 52429.7, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3207839744. Throughput: 0: 49138.7. Samples: 2736697840. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-13 07:46:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:46:31,702][71000] Updated weights for policy 0, policy_version 195794 (0.0030) [2024-06-13 07:46:34,314][70980] Signal inference workers to stop experience collection... (40900 times) [2024-06-13 07:46:34,315][70980] Signal inference workers to resume experience collection... (40900 times) [2024-06-13 07:46:34,331][71000] InferenceWorker_p0-w0: stopping experience collection (40900 times) [2024-06-13 07:46:34,331][71000] InferenceWorker_p0-w0: resuming experience collection (40900 times) [2024-06-13 07:46:35,512][71000] Updated weights for policy 0, policy_version 195804 (0.0034) [2024-06-13 07:46:35,939][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.1, 300 sec: 48874.4). Total num frames: 3208069120. Throughput: 0: 48961.4. Samples: 2736845820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-13 07:46:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:46:38,150][71000] Updated weights for policy 0, policy_version 195814 (0.0026) [2024-06-13 07:46:40,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3208298496. Throughput: 0: 49177.6. Samples: 2737141960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-13 07:46:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:46:42,275][71000] Updated weights for policy 0, policy_version 195824 (0.0026) [2024-06-13 07:46:44,771][71000] Updated weights for policy 0, policy_version 195834 (0.0027) [2024-06-13 07:46:45,939][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3208593408. Throughput: 0: 49205.5. Samples: 2737430740. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-13 07:46:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:46:48,977][71000] Updated weights for policy 0, policy_version 195844 (0.0035) [2024-06-13 07:46:50,939][70768] Fps is (10 sec: 52429.6, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 3208822784. Throughput: 0: 49295.8. Samples: 2737591560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-13 07:46:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:46:51,542][71000] Updated weights for policy 0, policy_version 195854 (0.0030) [2024-06-13 07:46:55,436][71000] Updated weights for policy 0, policy_version 195864 (0.0025) [2024-06-13 07:46:55,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49424.9, 300 sec: 48874.3). Total num frames: 3209052160. Throughput: 0: 49275.6. Samples: 2737885580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-13 07:46:55,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 07:46:58,045][71000] Updated weights for policy 0, policy_version 195874 (0.0032) [2024-06-13 07:47:00,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 3209281536. Throughput: 0: 49297.8. Samples: 2738185680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-13 07:47:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:47:02,155][71000] Updated weights for policy 0, policy_version 195884 (0.0029) [2024-06-13 07:47:04,788][71000] Updated weights for policy 0, policy_version 195894 (0.0026) [2024-06-13 07:47:05,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3209576448. Throughput: 0: 49407.1. Samples: 2738328920. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-13 07:47:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:47:08,876][71000] Updated weights for policy 0, policy_version 195904 (0.0024) [2024-06-13 07:47:10,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3209805824. Throughput: 0: 49527.1. Samples: 2738632620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-13 07:47:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:47:11,458][71000] Updated weights for policy 0, policy_version 195914 (0.0039) [2024-06-13 07:47:15,377][71000] Updated weights for policy 0, policy_version 195924 (0.0023) [2024-06-13 07:47:15,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 3210035200. Throughput: 0: 49589.7. Samples: 2738929380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-13 07:47:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:47:18,123][71000] Updated weights for policy 0, policy_version 195934 (0.0024) [2024-06-13 07:47:20,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3210264576. Throughput: 0: 49173.6. Samples: 2739058640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-13 07:47:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 07:47:22,303][71000] Updated weights for policy 0, policy_version 195944 (0.0024) [2024-06-13 07:47:24,776][71000] Updated weights for policy 0, policy_version 195954 (0.0025) [2024-06-13 07:47:25,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3210559488. Throughput: 0: 49244.1. Samples: 2739357940. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-13 07:47:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:47:28,790][71000] Updated weights for policy 0, policy_version 195964 (0.0033) [2024-06-13 07:47:30,939][70768] Fps is (10 sec: 52429.6, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3210788864. Throughput: 0: 49540.4. Samples: 2739660060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-13 07:47:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:47:31,117][70980] Signal inference workers to stop experience collection... (40950 times) [2024-06-13 07:47:31,160][71000] InferenceWorker_p0-w0: stopping experience collection (40950 times) [2024-06-13 07:47:31,172][70980] Signal inference workers to resume experience collection... (40950 times) [2024-06-13 07:47:31,186][71000] InferenceWorker_p0-w0: resuming experience collection (40950 times) [2024-06-13 07:47:31,337][71000] Updated weights for policy 0, policy_version 195974 (0.0028) [2024-06-13 07:47:35,250][71000] Updated weights for policy 0, policy_version 195984 (0.0024) [2024-06-13 07:47:35,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 3211034624. Throughput: 0: 49394.6. Samples: 2739814320. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-13 07:47:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:47:38,078][71000] Updated weights for policy 0, policy_version 195994 (0.0027) [2024-06-13 07:47:40,940][70768] Fps is (10 sec: 47512.6, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 3211264000. Throughput: 0: 49335.0. Samples: 2740105660. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-13 07:47:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:47:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000196000_3211264000.pth... 
[2024-06-13 07:47:41,000][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000195284_3199533056.pth [2024-06-13 07:47:41,889][71000] Updated weights for policy 0, policy_version 196004 (0.0041) [2024-06-13 07:47:44,804][71000] Updated weights for policy 0, policy_version 196014 (0.0029) [2024-06-13 07:47:45,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3211542528. Throughput: 0: 49038.1. Samples: 2740392400. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-13 07:47:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:47:48,574][71000] Updated weights for policy 0, policy_version 196024 (0.0034) [2024-06-13 07:47:50,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3211771904. Throughput: 0: 49409.0. Samples: 2740552320. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-13 07:47:50,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:47:51,457][71000] Updated weights for policy 0, policy_version 196034 (0.0030) [2024-06-13 07:47:55,083][71000] Updated weights for policy 0, policy_version 196044 (0.0026) [2024-06-13 07:47:55,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49698.2, 300 sec: 48985.4). Total num frames: 3212034048. Throughput: 0: 49249.8. Samples: 2740848860. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-13 07:47:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:47:58,059][71000] Updated weights for policy 0, policy_version 196054 (0.0028) [2024-06-13 07:48:00,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 3212247040. Throughput: 0: 49147.1. Samples: 2741141000. 
Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-13 07:48:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:48:01,807][71000] Updated weights for policy 0, policy_version 196064 (0.0029) [2024-06-13 07:48:04,698][71000] Updated weights for policy 0, policy_version 196074 (0.0031) [2024-06-13 07:48:05,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3212525568. Throughput: 0: 49584.4. Samples: 2741289940. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-13 07:48:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:48:08,320][71000] Updated weights for policy 0, policy_version 196084 (0.0035) [2024-06-13 07:48:10,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3212754944. Throughput: 0: 49409.3. Samples: 2741581360. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-13 07:48:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:48:11,519][71000] Updated weights for policy 0, policy_version 196094 (0.0026) [2024-06-13 07:48:15,098][71000] Updated weights for policy 0, policy_version 196104 (0.0023) [2024-06-13 07:48:15,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.0, 300 sec: 48929.8). Total num frames: 3213000704. Throughput: 0: 49182.6. Samples: 2741873280. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-13 07:48:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:48:18,043][71000] Updated weights for policy 0, policy_version 196114 (0.0032) [2024-06-13 07:48:20,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.2, 300 sec: 48985.4). Total num frames: 3213246464. Throughput: 0: 49003.5. Samples: 2742019480. 
Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-13 07:48:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:48:21,627][71000] Updated weights for policy 0, policy_version 196124 (0.0038) [2024-06-13 07:48:24,705][71000] Updated weights for policy 0, policy_version 196134 (0.0021) [2024-06-13 07:48:25,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3213492224. Throughput: 0: 49214.5. Samples: 2742320300. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-13 07:48:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:48:28,348][71000] Updated weights for policy 0, policy_version 196144 (0.0032) [2024-06-13 07:48:30,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3213721600. Throughput: 0: 49448.5. Samples: 2742617580. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-13 07:48:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:48:31,515][71000] Updated weights for policy 0, policy_version 196154 (0.0029) [2024-06-13 07:48:34,084][70980] Signal inference workers to stop experience collection... (41000 times) [2024-06-13 07:48:34,126][71000] InferenceWorker_p0-w0: stopping experience collection (41000 times) [2024-06-13 07:48:34,196][70980] Signal inference workers to resume experience collection... (41000 times) [2024-06-13 07:48:34,196][71000] InferenceWorker_p0-w0: resuming experience collection (41000 times) [2024-06-13 07:48:34,835][71000] Updated weights for policy 0, policy_version 196164 (0.0025) [2024-06-13 07:48:35,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3213983744. Throughput: 0: 49020.1. Samples: 2742758220. 
Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-13 07:48:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:48:38,277][71000] Updated weights for policy 0, policy_version 196174 (0.0033) [2024-06-13 07:48:40,939][70768] Fps is (10 sec: 52429.1, 60 sec: 49698.3, 300 sec: 49096.5). Total num frames: 3214245888. Throughput: 0: 48983.6. Samples: 2743053120. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 07:48:40,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:48:41,554][71000] Updated weights for policy 0, policy_version 196184 (0.0030) [2024-06-13 07:48:44,845][71000] Updated weights for policy 0, policy_version 196194 (0.0038) [2024-06-13 07:48:45,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3214475264. Throughput: 0: 48875.9. Samples: 2743340420. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 07:48:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 07:48:48,111][71000] Updated weights for policy 0, policy_version 196204 (0.0026) [2024-06-13 07:48:50,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3214704640. Throughput: 0: 48762.3. Samples: 2743484240. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 07:48:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:48:51,714][71000] Updated weights for policy 0, policy_version 196214 (0.0030) [2024-06-13 07:48:54,659][71000] Updated weights for policy 0, policy_version 196224 (0.0030) [2024-06-13 07:48:55,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 3214966784. Throughput: 0: 48816.0. Samples: 2743778080. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 07:48:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:48:58,409][71000] Updated weights for policy 0, policy_version 196234 (0.0031) [2024-06-13 07:49:00,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3215212544. Throughput: 0: 49037.3. Samples: 2744079960. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 07:49:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:49:01,481][71000] Updated weights for policy 0, policy_version 196244 (0.0030) [2024-06-13 07:49:04,785][71000] Updated weights for policy 0, policy_version 196254 (0.0029) [2024-06-13 07:49:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 3215458304. Throughput: 0: 49171.1. Samples: 2744232180. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 07:49:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:49:07,910][71000] Updated weights for policy 0, policy_version 196264 (0.0030) [2024-06-13 07:49:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3215687680. Throughput: 0: 49099.9. Samples: 2744529800. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 07:49:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:49:11,717][71000] Updated weights for policy 0, policy_version 196274 (0.0028) [2024-06-13 07:49:14,572][71000] Updated weights for policy 0, policy_version 196284 (0.0030) [2024-06-13 07:49:15,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3215949824. Throughput: 0: 48853.5. Samples: 2744815980. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 07:49:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:49:18,368][71000] Updated weights for policy 0, policy_version 196294 (0.0041) [2024-06-13 07:49:20,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3216211968. Throughput: 0: 49160.9. Samples: 2744970460. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 07:49:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:49:21,175][71000] Updated weights for policy 0, policy_version 196304 (0.0026) [2024-06-13 07:49:24,698][71000] Updated weights for policy 0, policy_version 196314 (0.0032) [2024-06-13 07:49:25,939][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 49207.6). Total num frames: 3216457728. Throughput: 0: 49352.9. Samples: 2745274000. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 07:49:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:49:27,840][71000] Updated weights for policy 0, policy_version 196324 (0.0042) [2024-06-13 07:49:30,940][70768] Fps is (10 sec: 47512.5, 60 sec: 49425.0, 300 sec: 49096.4). Total num frames: 3216687104. Throughput: 0: 49565.7. Samples: 2745570880. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 07:49:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:49:31,445][71000] Updated weights for policy 0, policy_version 196334 (0.0026) [2024-06-13 07:49:32,333][70980] Signal inference workers to stop experience collection... (41050 times) [2024-06-13 07:49:32,337][70980] Signal inference workers to resume experience collection... 
(41050 times) [2024-06-13 07:49:32,350][71000] InferenceWorker_p0-w0: stopping experience collection (41050 times) [2024-06-13 07:49:32,350][71000] InferenceWorker_p0-w0: resuming experience collection (41050 times) [2024-06-13 07:49:34,692][71000] Updated weights for policy 0, policy_version 196344 (0.0032) [2024-06-13 07:49:35,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3216916480. Throughput: 0: 49248.1. Samples: 2745700400. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 07:49:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:49:38,447][71000] Updated weights for policy 0, policy_version 196354 (0.0034) [2024-06-13 07:49:40,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3217178624. Throughput: 0: 49185.7. Samples: 2745991440. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-13 07:49:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:49:41,054][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000196362_3217195008.pth... [2024-06-13 07:49:41,105][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000195641_3205382144.pth [2024-06-13 07:49:41,578][71000] Updated weights for policy 0, policy_version 196364 (0.0028) [2024-06-13 07:49:45,279][71000] Updated weights for policy 0, policy_version 196374 (0.0026) [2024-06-13 07:49:45,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 3217440768. Throughput: 0: 49085.4. Samples: 2746288800. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-13 07:49:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:49:47,987][71000] Updated weights for policy 0, policy_version 196384 (0.0035) [2024-06-13 07:49:50,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3217653760. Throughput: 0: 48867.0. Samples: 2746431200. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-13 07:49:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:49:51,682][71000] Updated weights for policy 0, policy_version 196394 (0.0022) [2024-06-13 07:49:54,569][71000] Updated weights for policy 0, policy_version 196404 (0.0034) [2024-06-13 07:49:55,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3217899520. Throughput: 0: 49025.8. Samples: 2746735960. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-13 07:49:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:49:58,345][71000] Updated weights for policy 0, policy_version 196414 (0.0031) [2024-06-13 07:50:00,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 3218178048. Throughput: 0: 49186.4. Samples: 2747029380. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-13 07:50:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:50:01,581][71000] Updated weights for policy 0, policy_version 196424 (0.0027) [2024-06-13 07:50:05,106][71000] Updated weights for policy 0, policy_version 196434 (0.0030) [2024-06-13 07:50:05,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 3218423808. Throughput: 0: 49322.1. Samples: 2747189960. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-13 07:50:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:50:08,318][71000] Updated weights for policy 0, policy_version 196444 (0.0025) [2024-06-13 07:50:10,940][70768] Fps is (10 sec: 44237.2, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 3218620416. Throughput: 0: 48858.2. Samples: 2747472620. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-13 07:50:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:50:11,863][71000] Updated weights for policy 0, policy_version 196454 (0.0033) [2024-06-13 07:50:14,949][71000] Updated weights for policy 0, policy_version 196464 (0.0026) [2024-06-13 07:50:15,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48878.8, 300 sec: 49152.0). Total num frames: 3218882560. Throughput: 0: 48803.2. Samples: 2747767020. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-13 07:50:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:50:18,478][71000] Updated weights for policy 0, policy_version 196474 (0.0031) [2024-06-13 07:50:20,940][70768] Fps is (10 sec: 54066.4, 60 sec: 49151.8, 300 sec: 49152.0). Total num frames: 3219161088. Throughput: 0: 49401.2. Samples: 2747923460. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-13 07:50:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:50:21,738][71000] Updated weights for policy 0, policy_version 196484 (0.0025) [2024-06-13 07:50:25,172][71000] Updated weights for policy 0, policy_version 196494 (0.0023) [2024-06-13 07:50:25,939][70768] Fps is (10 sec: 50791.0, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3219390464. Throughput: 0: 49317.0. Samples: 2748210700. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-13 07:50:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:50:28,299][71000] Updated weights for policy 0, policy_version 196504 (0.0027) [2024-06-13 07:50:30,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 3219619840. Throughput: 0: 49279.9. Samples: 2748506400. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-13 07:50:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:50:31,214][70980] Signal inference workers to stop experience collection... 
(41100 times) [2024-06-13 07:50:31,215][70980] Signal inference workers to resume experience collection... (41100 times) [2024-06-13 07:50:31,256][71000] InferenceWorker_p0-w0: stopping experience collection (41100 times) [2024-06-13 07:50:31,256][71000] InferenceWorker_p0-w0: resuming experience collection (41100 times) [2024-06-13 07:50:32,042][71000] Updated weights for policy 0, policy_version 196514 (0.0026) [2024-06-13 07:50:35,209][71000] Updated weights for policy 0, policy_version 196524 (0.0019) [2024-06-13 07:50:35,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3219865600. Throughput: 0: 49095.6. Samples: 2748640500. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-13 07:50:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:50:38,466][71000] Updated weights for policy 0, policy_version 196534 (0.0027) [2024-06-13 07:50:40,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3220144128. Throughput: 0: 49140.0. Samples: 2748947260. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-13 07:50:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:50:41,750][71000] Updated weights for policy 0, policy_version 196544 (0.0039) [2024-06-13 07:50:45,503][71000] Updated weights for policy 0, policy_version 196554 (0.0022) [2024-06-13 07:50:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.7, 300 sec: 49152.0). Total num frames: 3220357120. Throughput: 0: 49101.7. Samples: 2749238960. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-13 07:50:45,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 07:50:48,638][71000] Updated weights for policy 0, policy_version 196564 (0.0034) [2024-06-13 07:50:50,940][70768] Fps is (10 sec: 44237.1, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 3220586496. Throughput: 0: 48575.6. Samples: 2749375860. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-13 07:50:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:50:52,405][71000] Updated weights for policy 0, policy_version 196574 (0.0029) [2024-06-13 07:50:55,451][71000] Updated weights for policy 0, policy_version 196584 (0.0023) [2024-06-13 07:50:55,940][70768] Fps is (10 sec: 49152.8, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 3220848640. Throughput: 0: 48672.5. Samples: 2749662880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 07:50:55,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:50:58,953][71000] Updated weights for policy 0, policy_version 196594 (0.0032) [2024-06-13 07:51:00,940][70768] Fps is (10 sec: 52428.2, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 3221110784. Throughput: 0: 48792.0. Samples: 2749962660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 07:51:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:51:01,916][71000] Updated weights for policy 0, policy_version 196604 (0.0025) [2024-06-13 07:51:05,551][71000] Updated weights for policy 0, policy_version 196614 (0.0024) [2024-06-13 07:51:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 3221340160. Throughput: 0: 48782.4. Samples: 2750118660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 07:51:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:51:08,788][71000] Updated weights for policy 0, policy_version 196624 (0.0025) [2024-06-13 07:51:10,940][70768] Fps is (10 sec: 45875.8, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3221569536. Throughput: 0: 48911.1. Samples: 2750411700. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 07:51:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:51:12,121][71000] Updated weights for policy 0, policy_version 196634 (0.0033) [2024-06-13 07:51:15,771][71000] Updated weights for policy 0, policy_version 196644 (0.0027) [2024-06-13 07:51:15,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 3221815296. Throughput: 0: 48910.7. Samples: 2750707380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 07:51:15,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 07:51:18,877][71000] Updated weights for policy 0, policy_version 196654 (0.0025) [2024-06-13 07:51:20,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 3222093824. Throughput: 0: 49247.7. Samples: 2750856640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 07:51:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:51:22,351][71000] Updated weights for policy 0, policy_version 196664 (0.0028) [2024-06-13 07:51:25,443][71000] Updated weights for policy 0, policy_version 196674 (0.0019) [2024-06-13 07:51:25,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 3222323200. Throughput: 0: 48841.8. Samples: 2751145140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 07:51:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:51:26,277][70980] Signal inference workers to stop experience collection... (41150 times) [2024-06-13 07:51:26,317][71000] InferenceWorker_p0-w0: stopping experience collection (41150 times) [2024-06-13 07:51:26,332][70980] Signal inference workers to resume experience collection... 
(41150 times) [2024-06-13 07:51:26,336][71000] InferenceWorker_p0-w0: resuming experience collection (41150 times) [2024-06-13 07:51:28,542][71000] Updated weights for policy 0, policy_version 196684 (0.0027) [2024-06-13 07:51:30,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3222568960. Throughput: 0: 49107.7. Samples: 2751448800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 07:51:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:51:32,322][71000] Updated weights for policy 0, policy_version 196694 (0.0025) [2024-06-13 07:51:35,420][71000] Updated weights for policy 0, policy_version 196704 (0.0027) [2024-06-13 07:51:35,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3222798336. Throughput: 0: 49176.3. Samples: 2751588800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 07:51:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:51:38,791][71000] Updated weights for policy 0, policy_version 196714 (0.0026) [2024-06-13 07:51:40,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3223093248. Throughput: 0: 49423.8. Samples: 2751886960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 07:51:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:51:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000196722_3223093248.pth... [2024-06-13 07:51:40,992][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000196000_3211264000.pth [2024-06-13 07:51:42,241][71000] Updated weights for policy 0, policy_version 196724 (0.0027) [2024-06-13 07:51:45,863][71000] Updated weights for policy 0, policy_version 196734 (0.0027) [2024-06-13 07:51:45,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 3223289856. Throughput: 0: 49212.1. Samples: 2752177200. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 07:51:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:51:49,293][71000] Updated weights for policy 0, policy_version 196744 (0.0035) [2024-06-13 07:51:50,939][70768] Fps is (10 sec: 45876.2, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3223552000. Throughput: 0: 48891.6. Samples: 2752318780. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 07:51:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:51:52,121][71000] Updated weights for policy 0, policy_version 196754 (0.0033) [2024-06-13 07:51:55,806][71000] Updated weights for policy 0, policy_version 196764 (0.0032) [2024-06-13 07:51:55,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 3223797760. Throughput: 0: 48970.3. Samples: 2752615360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-13 07:51:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:51:59,009][71000] Updated weights for policy 0, policy_version 196774 (0.0043) [2024-06-13 07:52:00,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3224059904. Throughput: 0: 48923.9. Samples: 2752908960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 07:52:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:52:02,487][71000] Updated weights for policy 0, policy_version 196784 (0.0029) [2024-06-13 07:52:05,521][71000] Updated weights for policy 0, policy_version 196794 (0.0040) [2024-06-13 07:52:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3224289280. Throughput: 0: 49021.4. Samples: 2753062600. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 07:52:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:52:09,213][71000] Updated weights for policy 0, policy_version 196804 (0.0038) [2024-06-13 07:52:10,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3224518656. Throughput: 0: 49105.4. Samples: 2753354880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 07:52:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:52:12,104][71000] Updated weights for policy 0, policy_version 196814 (0.0023) [2024-06-13 07:52:15,779][71000] Updated weights for policy 0, policy_version 196824 (0.0033) [2024-06-13 07:52:15,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3224764416. Throughput: 0: 48831.0. Samples: 2753646200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 07:52:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:52:18,687][71000] Updated weights for policy 0, policy_version 196834 (0.0032) [2024-06-13 07:52:20,939][70768] Fps is (10 sec: 50791.0, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3225026560. Throughput: 0: 49004.7. Samples: 2753794000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 07:52:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 07:52:22,345][71000] Updated weights for policy 0, policy_version 196844 (0.0033) [2024-06-13 07:52:25,327][71000] Updated weights for policy 0, policy_version 196854 (0.0028) [2024-06-13 07:52:25,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3225255936. Throughput: 0: 48994.9. Samples: 2754091720. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 07:52:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:52:29,154][71000] Updated weights for policy 0, policy_version 196864 (0.0029) [2024-06-13 07:52:30,939][70768] Fps is (10 sec: 47513.4, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3225501696. Throughput: 0: 49093.8. Samples: 2754386420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 07:52:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 07:52:32,043][71000] Updated weights for policy 0, policy_version 196874 (0.0035) [2024-06-13 07:52:35,658][71000] Updated weights for policy 0, policy_version 196884 (0.0022) [2024-06-13 07:52:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3225747456. Throughput: 0: 49044.9. Samples: 2754525800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 07:52:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:52:38,678][71000] Updated weights for policy 0, policy_version 196894 (0.0029) [2024-06-13 07:52:40,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 3226009600. Throughput: 0: 49154.5. Samples: 2754827320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 07:52:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:52:42,176][71000] Updated weights for policy 0, policy_version 196904 (0.0028) [2024-06-13 07:52:44,501][70980] Signal inference workers to stop experience collection... (41200 times) [2024-06-13 07:52:44,549][71000] InferenceWorker_p0-w0: stopping experience collection (41200 times) [2024-06-13 07:52:44,615][70980] Signal inference workers to resume experience collection... 
(41200 times) [2024-06-13 07:52:44,615][71000] InferenceWorker_p0-w0: resuming experience collection (41200 times) [2024-06-13 07:52:45,054][71000] Updated weights for policy 0, policy_version 196914 (0.0022) [2024-06-13 07:52:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 3226255360. Throughput: 0: 49060.1. Samples: 2755116660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 07:52:45,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:52:48,681][71000] Updated weights for policy 0, policy_version 196924 (0.0029) [2024-06-13 07:52:50,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3226501120. Throughput: 0: 49249.7. Samples: 2755278840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 07:52:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:52:51,999][71000] Updated weights for policy 0, policy_version 196934 (0.0027) [2024-06-13 07:52:55,247][71000] Updated weights for policy 0, policy_version 196944 (0.0023) [2024-06-13 07:52:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3226746880. Throughput: 0: 49204.9. Samples: 2755569100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 07:52:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 07:52:58,616][71000] Updated weights for policy 0, policy_version 196954 (0.0026) [2024-06-13 07:53:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3226976256. Throughput: 0: 49203.6. Samples: 2755860360. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 07:53:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:53:01,995][71000] Updated weights for policy 0, policy_version 196964 (0.0035) [2024-06-13 07:53:05,250][71000] Updated weights for policy 0, policy_version 196974 (0.0036) [2024-06-13 07:53:05,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 3227222016. Throughput: 0: 49275.3. Samples: 2756011400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-13 07:53:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:53:08,998][71000] Updated weights for policy 0, policy_version 196984 (0.0030) [2024-06-13 07:53:10,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3227467776. Throughput: 0: 48968.0. Samples: 2756295280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-13 07:53:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:53:12,049][71000] Updated weights for policy 0, policy_version 196994 (0.0035) [2024-06-13 07:53:15,691][71000] Updated weights for policy 0, policy_version 197004 (0.0028) [2024-06-13 07:53:15,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 3227729920. Throughput: 0: 49139.1. Samples: 2756597680. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-13 07:53:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:53:18,711][71000] Updated weights for policy 0, policy_version 197014 (0.0029) [2024-06-13 07:53:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3227959296. Throughput: 0: 49186.2. Samples: 2756739180. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-13 07:53:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:53:22,169][71000] Updated weights for policy 0, policy_version 197024 (0.0036) [2024-06-13 07:53:25,530][71000] Updated weights for policy 0, policy_version 197034 (0.0028) [2024-06-13 07:53:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3228221440. Throughput: 0: 49056.1. Samples: 2757034840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-13 07:53:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:53:29,127][71000] Updated weights for policy 0, policy_version 197044 (0.0031) [2024-06-13 07:53:30,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 3228467200. Throughput: 0: 49242.7. Samples: 2757332580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-13 07:53:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:53:32,185][71000] Updated weights for policy 0, policy_version 197054 (0.0029) [2024-06-13 07:53:35,543][71000] Updated weights for policy 0, policy_version 197064 (0.0029) [2024-06-13 07:53:35,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49424.9, 300 sec: 49040.9). Total num frames: 3228712960. Throughput: 0: 48892.3. Samples: 2757479000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-13 07:53:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:53:38,883][71000] Updated weights for policy 0, policy_version 197074 (0.0029) [2024-06-13 07:53:40,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3228925952. Throughput: 0: 49040.0. Samples: 2757775900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-13 07:53:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:53:40,957][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000197079_3228942336.pth... 
[2024-06-13 07:53:41,006][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000196362_3217195008.pth [2024-06-13 07:53:42,387][71000] Updated weights for policy 0, policy_version 197084 (0.0027) [2024-06-13 07:53:45,534][71000] Updated weights for policy 0, policy_version 197094 (0.0030) [2024-06-13 07:53:45,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 3229188096. Throughput: 0: 48881.2. Samples: 2758060020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-13 07:53:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:53:49,425][71000] Updated weights for policy 0, policy_version 197104 (0.0034) [2024-06-13 07:53:50,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 3229450240. Throughput: 0: 48788.5. Samples: 2758206880. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-13 07:53:50,944][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:53:52,226][71000] Updated weights for policy 0, policy_version 197114 (0.0025) [2024-06-13 07:53:53,402][70980] Signal inference workers to stop experience collection... (41250 times) [2024-06-13 07:53:53,449][71000] InferenceWorker_p0-w0: stopping experience collection (41250 times) [2024-06-13 07:53:53,455][70980] Signal inference workers to resume experience collection... (41250 times) [2024-06-13 07:53:53,467][71000] InferenceWorker_p0-w0: resuming experience collection (41250 times) [2024-06-13 07:53:55,670][71000] Updated weights for policy 0, policy_version 197124 (0.0022) [2024-06-13 07:53:55,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3229679616. Throughput: 0: 49233.2. Samples: 2758510780. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-13 07:53:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:53:58,769][71000] Updated weights for policy 0, policy_version 197134 (0.0024) [2024-06-13 07:54:00,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3229925376. Throughput: 0: 49387.6. Samples: 2758820120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-13 07:54:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 07:54:01,946][71000] Updated weights for policy 0, policy_version 197144 (0.0034) [2024-06-13 07:54:05,188][71000] Updated weights for policy 0, policy_version 197154 (0.0039) [2024-06-13 07:54:05,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.2, 300 sec: 49152.0). Total num frames: 3230187520. Throughput: 0: 49357.8. Samples: 2758960280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-13 07:54:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:54:08,649][71000] Updated weights for policy 0, policy_version 197164 (0.0024) [2024-06-13 07:54:10,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.1, 300 sec: 49152.0). Total num frames: 3230449664. Throughput: 0: 49394.2. Samples: 2759257580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-13 07:54:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:54:12,047][71000] Updated weights for policy 0, policy_version 197174 (0.0030) [2024-06-13 07:54:15,655][71000] Updated weights for policy 0, policy_version 197184 (0.0034) [2024-06-13 07:54:15,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3230662656. Throughput: 0: 49172.3. Samples: 2759545340. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 07:54:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:54:18,769][71000] Updated weights for policy 0, policy_version 197194 (0.0033) [2024-06-13 07:54:20,940][70768] Fps is (10 sec: 44237.0, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3230892032. Throughput: 0: 49093.5. Samples: 2759688200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 07:54:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:54:22,182][71000] Updated weights for policy 0, policy_version 197204 (0.0028) [2024-06-13 07:54:25,269][71000] Updated weights for policy 0, policy_version 197214 (0.0033) [2024-06-13 07:54:25,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3231170560. Throughput: 0: 49120.9. Samples: 2759986340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 07:54:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:54:28,982][71000] Updated weights for policy 0, policy_version 197224 (0.0026) [2024-06-13 07:54:30,940][70768] Fps is (10 sec: 55705.3, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 3231449088. Throughput: 0: 49206.4. Samples: 2760274300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 07:54:30,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:54:32,447][71000] Updated weights for policy 0, policy_version 197234 (0.0032) [2024-06-13 07:54:35,510][71000] Updated weights for policy 0, policy_version 197244 (0.0022) [2024-06-13 07:54:35,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3231662080. Throughput: 0: 49496.2. Samples: 2760434200. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 07:54:35,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:54:39,110][71000] Updated weights for policy 0, policy_version 197254 (0.0028) [2024-06-13 07:54:40,940][70768] Fps is (10 sec: 42598.6, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 3231875072. Throughput: 0: 49172.5. Samples: 2760723540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 07:54:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:54:42,375][71000] Updated weights for policy 0, policy_version 197264 (0.0031) [2024-06-13 07:54:45,791][71000] Updated weights for policy 0, policy_version 197274 (0.0022) [2024-06-13 07:54:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.2, 300 sec: 49096.5). Total num frames: 3232137216. Throughput: 0: 48812.0. Samples: 2761016660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 07:54:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:54:49,252][71000] Updated weights for policy 0, policy_version 197284 (0.0023) [2024-06-13 07:54:50,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3232399360. Throughput: 0: 48946.2. Samples: 2761162860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 07:54:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:54:52,740][71000] Updated weights for policy 0, policy_version 197294 (0.0029) [2024-06-13 07:54:55,611][70980] Signal inference workers to stop experience collection... (41300 times) [2024-06-13 07:54:55,638][71000] InferenceWorker_p0-w0: stopping experience collection (41300 times) [2024-06-13 07:54:55,721][70980] Signal inference workers to resume experience collection... (41300 times) [2024-06-13 07:54:55,721][71000] InferenceWorker_p0-w0: resuming experience collection (41300 times) [2024-06-13 07:54:55,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3232612352. 
Throughput: 0: 48655.0. Samples: 2761447060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 07:54:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:54:56,011][71000] Updated weights for policy 0, policy_version 197304 (0.0024) [2024-06-13 07:54:59,842][71000] Updated weights for policy 0, policy_version 197314 (0.0030) [2024-06-13 07:55:00,939][70768] Fps is (10 sec: 44237.0, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3232841728. Throughput: 0: 48644.6. Samples: 2761734340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 07:55:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:55:02,822][71000] Updated weights for policy 0, policy_version 197324 (0.0031) [2024-06-13 07:55:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.7, 300 sec: 49096.4). Total num frames: 3233103872. Throughput: 0: 48569.1. Samples: 2761873820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 07:55:05,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 07:55:06,422][71000] Updated weights for policy 0, policy_version 197334 (0.0025) [2024-06-13 07:55:09,499][71000] Updated weights for policy 0, policy_version 197344 (0.0036) [2024-06-13 07:55:10,940][70768] Fps is (10 sec: 54066.9, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3233382400. Throughput: 0: 48509.8. Samples: 2762169280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 07:55:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:55:13,150][71000] Updated weights for policy 0, policy_version 197354 (0.0026) [2024-06-13 07:55:15,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3233579008. Throughput: 0: 48573.7. Samples: 2762460120. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-13 07:55:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:55:16,461][71000] Updated weights for policy 0, policy_version 197364 (0.0026) [2024-06-13 07:55:19,945][71000] Updated weights for policy 0, policy_version 197374 (0.0036) [2024-06-13 07:55:20,940][70768] Fps is (10 sec: 42597.7, 60 sec: 48605.7, 300 sec: 48874.3). Total num frames: 3233808384. Throughput: 0: 48045.6. Samples: 2762596260. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 07:55:20,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 07:55:23,136][71000] Updated weights for policy 0, policy_version 197384 (0.0029) [2024-06-13 07:55:25,940][70768] Fps is (10 sec: 49148.0, 60 sec: 48332.1, 300 sec: 48985.2). Total num frames: 3234070528. Throughput: 0: 47915.5. Samples: 2762879780. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 07:55:25,941][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:55:26,726][71000] Updated weights for policy 0, policy_version 197394 (0.0030) [2024-06-13 07:55:29,637][71000] Updated weights for policy 0, policy_version 197404 (0.0019) [2024-06-13 07:55:30,940][70768] Fps is (10 sec: 55705.6, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 3234365440. Throughput: 0: 48138.5. Samples: 2763182900. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 07:55:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:55:33,190][71000] Updated weights for policy 0, policy_version 197414 (0.0027) [2024-06-13 07:55:35,940][70768] Fps is (10 sec: 47517.5, 60 sec: 48059.6, 300 sec: 48818.8). Total num frames: 3234545664. Throughput: 0: 48311.9. Samples: 2763336900. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 07:55:35,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:55:36,392][71000] Updated weights for policy 0, policy_version 197424 (0.0030) [2024-06-13 07:55:40,148][71000] Updated weights for policy 0, policy_version 197434 (0.0043) [2024-06-13 07:55:40,940][70768] Fps is (10 sec: 40959.6, 60 sec: 48332.6, 300 sec: 48874.3). Total num frames: 3234775040. Throughput: 0: 48354.5. Samples: 2763623020. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 07:55:40,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:55:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000197435_3234775040.pth... [2024-06-13 07:55:41,001][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000196722_3223093248.pth [2024-06-13 07:55:43,311][71000] Updated weights for policy 0, policy_version 197444 (0.0029) [2024-06-13 07:55:45,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 3235053568. Throughput: 0: 48489.3. Samples: 2763916360. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 07:55:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:55:46,827][71000] Updated weights for policy 0, policy_version 197454 (0.0029) [2024-06-13 07:55:49,974][71000] Updated weights for policy 0, policy_version 197464 (0.0026) [2024-06-13 07:55:50,940][70768] Fps is (10 sec: 55707.0, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 3235332096. Throughput: 0: 48893.5. Samples: 2764074020. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 07:55:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:55:53,401][71000] Updated weights for policy 0, policy_version 197474 (0.0037) [2024-06-13 07:55:54,559][70980] Signal inference workers to stop experience collection... (41350 times) [2024-06-13 07:55:54,559][70980] Signal inference workers to resume experience collection... 
(41350 times) [2024-06-13 07:55:54,610][71000] InferenceWorker_p0-w0: stopping experience collection (41350 times) [2024-06-13 07:55:54,610][71000] InferenceWorker_p0-w0: resuming experience collection (41350 times) [2024-06-13 07:55:55,939][70768] Fps is (10 sec: 45875.8, 60 sec: 48333.0, 300 sec: 48818.8). Total num frames: 3235512320. Throughput: 0: 48913.9. Samples: 2764370400. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 07:55:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:55:56,542][71000] Updated weights for policy 0, policy_version 197484 (0.0029) [2024-06-13 07:56:00,282][71000] Updated weights for policy 0, policy_version 197494 (0.0035) [2024-06-13 07:56:00,940][70768] Fps is (10 sec: 42598.6, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3235758080. Throughput: 0: 48719.3. Samples: 2764652480. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 07:56:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:56:03,426][71000] Updated weights for policy 0, policy_version 197504 (0.0038) [2024-06-13 07:56:05,940][70768] Fps is (10 sec: 52427.6, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3236036608. Throughput: 0: 48860.0. Samples: 2764794960. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 07:56:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:56:07,084][71000] Updated weights for policy 0, policy_version 197514 (0.0036) [2024-06-13 07:56:10,106][71000] Updated weights for policy 0, policy_version 197524 (0.0039) [2024-06-13 07:56:10,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48332.8, 300 sec: 49040.9). Total num frames: 3236282368. Throughput: 0: 49146.3. Samples: 2765091320. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 07:56:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:56:13,782][71000] Updated weights for policy 0, policy_version 197534 (0.0027) [2024-06-13 07:56:15,939][70768] Fps is (10 sec: 45876.1, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 3236495360. Throughput: 0: 48925.6. Samples: 2765384540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 07:56:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:56:16,556][71000] Updated weights for policy 0, policy_version 197544 (0.0023) [2024-06-13 07:56:20,647][71000] Updated weights for policy 0, policy_version 197554 (0.0024) [2024-06-13 07:56:20,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 3236724736. Throughput: 0: 48359.2. Samples: 2765513060. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-13 07:56:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:56:23,205][71000] Updated weights for policy 0, policy_version 197564 (0.0028) [2024-06-13 07:56:25,941][70768] Fps is (10 sec: 50784.9, 60 sec: 48878.9, 300 sec: 48929.7). Total num frames: 3237003264. Throughput: 0: 48696.5. Samples: 2765814400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-13 07:56:25,941][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:56:26,957][71000] Updated weights for policy 0, policy_version 197574 (0.0032) [2024-06-13 07:56:29,808][71000] Updated weights for policy 0, policy_version 197584 (0.0030) [2024-06-13 07:56:30,940][70768] Fps is (10 sec: 54067.4, 60 sec: 48332.9, 300 sec: 49040.9). Total num frames: 3237265408. Throughput: 0: 48854.2. Samples: 2766114800. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-13 07:56:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:56:33,758][71000] Updated weights for policy 0, policy_version 197594 (0.0037) [2024-06-13 07:56:35,939][70768] Fps is (10 sec: 49157.2, 60 sec: 49152.1, 300 sec: 48818.8). Total num frames: 3237494784. Throughput: 0: 48832.5. Samples: 2766271480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-13 07:56:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 07:56:36,468][71000] Updated weights for policy 0, policy_version 197604 (0.0030) [2024-06-13 07:56:40,632][71000] Updated weights for policy 0, policy_version 197614 (0.0027) [2024-06-13 07:56:40,940][70768] Fps is (10 sec: 45873.4, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 3237724160. Throughput: 0: 48670.1. Samples: 2766560580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-13 07:56:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:56:43,148][71000] Updated weights for policy 0, policy_version 197624 (0.0024) [2024-06-13 07:56:45,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 3237986304. Throughput: 0: 48872.9. Samples: 2766851760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-13 07:56:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:56:47,146][71000] Updated weights for policy 0, policy_version 197634 (0.0036) [2024-06-13 07:56:49,958][71000] Updated weights for policy 0, policy_version 197644 (0.0024) [2024-06-13 07:56:50,940][70768] Fps is (10 sec: 52430.9, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3238248448. Throughput: 0: 49092.1. Samples: 2767004100. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-13 07:56:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:56:53,768][71000] Updated weights for policy 0, policy_version 197654 (0.0029) [2024-06-13 07:56:54,184][70980] Signal inference workers to stop experience collection... (41400 times) [2024-06-13 07:56:54,186][70980] Signal inference workers to resume experience collection... (41400 times) [2024-06-13 07:56:54,198][71000] InferenceWorker_p0-w0: stopping experience collection (41400 times) [2024-06-13 07:56:54,211][71000] InferenceWorker_p0-w0: resuming experience collection (41400 times) [2024-06-13 07:56:55,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49424.9, 300 sec: 48874.3). Total num frames: 3238477824. Throughput: 0: 49221.7. Samples: 2767306300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-13 07:56:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:56:56,358][71000] Updated weights for policy 0, policy_version 197664 (0.0025) [2024-06-13 07:57:00,307][71000] Updated weights for policy 0, policy_version 197674 (0.0026) [2024-06-13 07:57:00,939][70768] Fps is (10 sec: 44237.1, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3238690816. Throughput: 0: 49171.1. Samples: 2767597240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-13 07:57:00,948][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:57:03,043][71000] Updated weights for policy 0, policy_version 197684 (0.0025) [2024-06-13 07:57:05,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49152.2, 300 sec: 49040.9). Total num frames: 3238985728. Throughput: 0: 49409.5. Samples: 2767736480. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-13 07:57:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:57:06,871][71000] Updated weights for policy 0, policy_version 197694 (0.0037) [2024-06-13 07:57:09,538][71000] Updated weights for policy 0, policy_version 197704 (0.0018) [2024-06-13 07:57:10,939][70768] Fps is (10 sec: 54067.2, 60 sec: 49152.0, 300 sec: 49041.0). Total num frames: 3239231488. Throughput: 0: 49448.7. Samples: 2768039540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-13 07:57:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:57:13,422][71000] Updated weights for policy 0, policy_version 197714 (0.0029) [2024-06-13 07:57:15,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 3239460864. Throughput: 0: 49525.5. Samples: 2768343440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-13 07:57:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:57:16,290][71000] Updated weights for policy 0, policy_version 197724 (0.0025) [2024-06-13 07:57:20,073][71000] Updated weights for policy 0, policy_version 197734 (0.0021) [2024-06-13 07:57:20,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 3239690240. Throughput: 0: 49164.0. Samples: 2768483860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-13 07:57:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:57:23,183][71000] Updated weights for policy 0, policy_version 197744 (0.0035) [2024-06-13 07:57:25,939][70768] Fps is (10 sec: 49151.7, 60 sec: 49152.9, 300 sec: 48985.4). Total num frames: 3239952384. Throughput: 0: 49398.3. Samples: 2768783480. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-13 07:57:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:57:26,509][71000] Updated weights for policy 0, policy_version 197754 (0.0035) [2024-06-13 07:57:29,538][71000] Updated weights for policy 0, policy_version 197764 (0.0022) [2024-06-13 07:57:30,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3240214528. Throughput: 0: 49531.0. Samples: 2769080660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-13 07:57:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:57:33,362][71000] Updated weights for policy 0, policy_version 197774 (0.0027) [2024-06-13 07:57:35,895][71000] Updated weights for policy 0, policy_version 197784 (0.0033) [2024-06-13 07:57:35,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49971.1, 300 sec: 49096.5). Total num frames: 3240493056. Throughput: 0: 49582.6. Samples: 2769235320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:57:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 07:57:39,915][71000] Updated weights for policy 0, policy_version 197794 (0.0034) [2024-06-13 07:57:40,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49425.3, 300 sec: 48929.8). Total num frames: 3240689664. Throughput: 0: 49381.7. Samples: 2769528480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:57:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:57:40,958][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000197796_3240689664.pth... [2024-06-13 07:57:41,025][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000197079_3228942336.pth [2024-06-13 07:57:43,191][71000] Updated weights for policy 0, policy_version 197804 (0.0039) [2024-06-13 07:57:45,940][70768] Fps is (10 sec: 44237.0, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 3240935424. Throughput: 0: 49400.4. Samples: 2769820260. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:57:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:57:46,737][71000] Updated weights for policy 0, policy_version 197814 (0.0028) [2024-06-13 07:57:49,614][71000] Updated weights for policy 0, policy_version 197824 (0.0021) [2024-06-13 07:57:50,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3241181184. Throughput: 0: 49543.0. Samples: 2769965920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:57:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 07:57:53,211][71000] Updated weights for policy 0, policy_version 197834 (0.0032) [2024-06-13 07:57:55,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49698.2, 300 sec: 49096.5). Total num frames: 3241459712. Throughput: 0: 49523.1. Samples: 2770268080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:57:55,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:57:55,957][71000] Updated weights for policy 0, policy_version 197844 (0.0024) [2024-06-13 07:57:57,172][70980] Signal inference workers to stop experience collection... (41450 times) [2024-06-13 07:57:57,202][71000] InferenceWorker_p0-w0: stopping experience collection (41450 times) [2024-06-13 07:57:57,276][70980] Signal inference workers to resume experience collection... (41450 times) [2024-06-13 07:57:57,276][71000] InferenceWorker_p0-w0: resuming experience collection (41450 times) [2024-06-13 07:58:00,365][71000] Updated weights for policy 0, policy_version 197854 (0.0038) [2024-06-13 07:58:00,941][70768] Fps is (10 sec: 50785.6, 60 sec: 49970.3, 300 sec: 49040.8). Total num frames: 3241689088. Throughput: 0: 49338.8. Samples: 2770563740. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:58:00,941][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:58:02,870][71000] Updated weights for policy 0, policy_version 197864 (0.0030) [2024-06-13 07:58:05,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3241918464. Throughput: 0: 49094.3. Samples: 2770693100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:58:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:58:06,959][71000] Updated weights for policy 0, policy_version 197874 (0.0025) [2024-06-13 07:58:09,749][71000] Updated weights for policy 0, policy_version 197884 (0.0031) [2024-06-13 07:58:10,940][70768] Fps is (10 sec: 47518.5, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3242164224. Throughput: 0: 48838.7. Samples: 2770981220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:58:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:58:13,401][71000] Updated weights for policy 0, policy_version 197894 (0.0037) [2024-06-13 07:58:15,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49698.0, 300 sec: 49096.5). Total num frames: 3242442752. Throughput: 0: 48996.9. Samples: 2771285520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:58:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:58:16,033][71000] Updated weights for policy 0, policy_version 197904 (0.0034) [2024-06-13 07:58:20,061][71000] Updated weights for policy 0, policy_version 197914 (0.0034) [2024-06-13 07:58:20,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 48985.4). Total num frames: 3242672128. Throughput: 0: 49032.9. Samples: 2771441800. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:58:20,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 07:58:23,300][71000] Updated weights for policy 0, policy_version 197924 (0.0025) [2024-06-13 07:58:25,941][70768] Fps is (10 sec: 45868.4, 60 sec: 49150.7, 300 sec: 48929.6). Total num frames: 3242901504. Throughput: 0: 49055.4. Samples: 2771736040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:58:25,942][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:58:26,838][71000] Updated weights for policy 0, policy_version 197934 (0.0027) [2024-06-13 07:58:29,941][71000] Updated weights for policy 0, policy_version 197944 (0.0035) [2024-06-13 07:58:30,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 3243147264. Throughput: 0: 49147.5. Samples: 2772031900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:58:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:58:33,389][71000] Updated weights for policy 0, policy_version 197954 (0.0030) [2024-06-13 07:58:35,939][70768] Fps is (10 sec: 50798.5, 60 sec: 48606.0, 300 sec: 49096.5). Total num frames: 3243409408. Throughput: 0: 49226.4. Samples: 2772181100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 07:58:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:58:36,949][71000] Updated weights for policy 0, policy_version 197964 (0.0024) [2024-06-13 07:58:39,851][71000] Updated weights for policy 0, policy_version 197974 (0.0023) [2024-06-13 07:58:40,939][70768] Fps is (10 sec: 52429.2, 60 sec: 49698.3, 300 sec: 49096.5). Total num frames: 3243671552. Throughput: 0: 49240.5. Samples: 2772483900. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 07:58:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:58:43,296][71000] Updated weights for policy 0, policy_version 197984 (0.0030) [2024-06-13 07:58:45,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3243868160. Throughput: 0: 49075.7. Samples: 2772772100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 07:58:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:58:46,611][71000] Updated weights for policy 0, policy_version 197994 (0.0029) [2024-06-13 07:58:50,268][71000] Updated weights for policy 0, policy_version 198004 (0.0024) [2024-06-13 07:58:50,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3244113920. Throughput: 0: 49169.7. Samples: 2772905740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 07:58:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:58:53,280][71000] Updated weights for policy 0, policy_version 198014 (0.0033) [2024-06-13 07:58:55,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3244392448. Throughput: 0: 49415.9. Samples: 2773204940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 07:58:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 07:58:57,055][71000] Updated weights for policy 0, policy_version 198024 (0.0034) [2024-06-13 07:58:59,364][70980] Signal inference workers to stop experience collection... (41500 times) [2024-06-13 07:58:59,390][71000] InferenceWorker_p0-w0: stopping experience collection (41500 times) [2024-06-13 07:58:59,471][70980] Signal inference workers to resume experience collection... 
(41500 times) [2024-06-13 07:58:59,472][71000] InferenceWorker_p0-w0: resuming experience collection (41500 times) [2024-06-13 07:59:00,035][71000] Updated weights for policy 0, policy_version 198034 (0.0026) [2024-06-13 07:59:00,939][70768] Fps is (10 sec: 54067.6, 60 sec: 49425.9, 300 sec: 49040.9). Total num frames: 3244654592. Throughput: 0: 49350.7. Samples: 2773506300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 07:59:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 07:59:03,595][71000] Updated weights for policy 0, policy_version 198044 (0.0025) [2024-06-13 07:59:05,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 3244851200. Throughput: 0: 49054.3. Samples: 2773649240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 07:59:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:59:06,755][71000] Updated weights for policy 0, policy_version 198054 (0.0023) [2024-06-13 07:59:10,381][71000] Updated weights for policy 0, policy_version 198064 (0.0021) [2024-06-13 07:59:10,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 3245096960. Throughput: 0: 48887.4. Samples: 2773935900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 07:59:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:59:13,299][71000] Updated weights for policy 0, policy_version 198074 (0.0022) [2024-06-13 07:59:15,940][70768] Fps is (10 sec: 54066.8, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3245391872. Throughput: 0: 48827.6. Samples: 2774229140. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 07:59:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 07:59:16,747][71000] Updated weights for policy 0, policy_version 198084 (0.0025) [2024-06-13 07:59:20,026][71000] Updated weights for policy 0, policy_version 198094 (0.0031) [2024-06-13 07:59:20,939][70768] Fps is (10 sec: 54067.7, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3245637632. Throughput: 0: 49216.4. Samples: 2774395840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 07:59:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 07:59:23,497][71000] Updated weights for policy 0, policy_version 198104 (0.0024) [2024-06-13 07:59:25,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49153.1, 300 sec: 48818.8). Total num frames: 3245850624. Throughput: 0: 49163.0. Samples: 2774696240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 07:59:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:59:26,402][71000] Updated weights for policy 0, policy_version 198114 (0.0025) [2024-06-13 07:59:30,342][71000] Updated weights for policy 0, policy_version 198124 (0.0034) [2024-06-13 07:59:30,939][70768] Fps is (10 sec: 44237.0, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3246080000. Throughput: 0: 49360.1. Samples: 2774993300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 07:59:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:59:33,040][71000] Updated weights for policy 0, policy_version 198134 (0.0031) [2024-06-13 07:59:35,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3246358528. Throughput: 0: 49354.7. Samples: 2775126700. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 07:59:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:59:36,895][71000] Updated weights for policy 0, policy_version 198144 (0.0038) [2024-06-13 07:59:39,559][71000] Updated weights for policy 0, policy_version 198154 (0.0028) [2024-06-13 07:59:40,940][70768] Fps is (10 sec: 54066.4, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 3246620672. Throughput: 0: 49366.6. Samples: 2775426440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 07:59:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:59:40,971][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000198159_3246637056.pth... [2024-06-13 07:59:41,030][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000197435_3234775040.pth [2024-06-13 07:59:43,413][71000] Updated weights for policy 0, policy_version 198164 (0.0023) [2024-06-13 07:59:45,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49971.2, 300 sec: 49040.9). Total num frames: 3246866432. Throughput: 0: 49472.4. Samples: 2775732560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 07:59:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 07:59:46,074][71000] Updated weights for policy 0, policy_version 198174 (0.0034) [2024-06-13 07:59:50,151][71000] Updated weights for policy 0, policy_version 198184 (0.0025) [2024-06-13 07:59:50,940][70768] Fps is (10 sec: 44236.3, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 3247063040. Throughput: 0: 49121.6. Samples: 2775859720. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-13 07:59:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 07:59:51,674][70980] Signal inference workers to stop experience collection... (41550 times) [2024-06-13 07:59:51,677][70980] Signal inference workers to resume experience collection... 
(41550 times) [2024-06-13 07:59:51,711][71000] InferenceWorker_p0-w0: stopping experience collection (41550 times) [2024-06-13 07:59:51,711][71000] InferenceWorker_p0-w0: resuming experience collection (41550 times) [2024-06-13 07:59:52,791][71000] Updated weights for policy 0, policy_version 198194 (0.0025) [2024-06-13 07:59:55,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3247341568. Throughput: 0: 49389.4. Samples: 2776158420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-13 07:59:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 07:59:56,616][71000] Updated weights for policy 0, policy_version 198204 (0.0033) [2024-06-13 07:59:59,564][71000] Updated weights for policy 0, policy_version 198214 (0.0032) [2024-06-13 08:00:00,940][70768] Fps is (10 sec: 55706.3, 60 sec: 49425.0, 300 sec: 49207.6). Total num frames: 3247620096. Throughput: 0: 49337.8. Samples: 2776449340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-13 08:00:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:00:03,460][71000] Updated weights for policy 0, policy_version 198224 (0.0048) [2024-06-13 08:00:05,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49971.2, 300 sec: 49040.9). Total num frames: 3247849472. Throughput: 0: 49258.2. Samples: 2776612460. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-13 08:00:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:00:06,062][71000] Updated weights for policy 0, policy_version 198234 (0.0034) [2024-06-13 08:00:10,067][71000] Updated weights for policy 0, policy_version 198244 (0.0024) [2024-06-13 08:00:10,939][70768] Fps is (10 sec: 42598.7, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3248046080. Throughput: 0: 49141.5. Samples: 2776907600. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-13 08:00:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:00:12,820][71000] Updated weights for policy 0, policy_version 198254 (0.0022) [2024-06-13 08:00:15,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 3248340992. Throughput: 0: 49038.4. Samples: 2777200040. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-13 08:00:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:00:16,411][71000] Updated weights for policy 0, policy_version 198264 (0.0028) [2024-06-13 08:00:19,293][71000] Updated weights for policy 0, policy_version 198274 (0.0024) [2024-06-13 08:00:20,939][70768] Fps is (10 sec: 55705.6, 60 sec: 49425.0, 300 sec: 49263.2). Total num frames: 3248603136. Throughput: 0: 49644.5. Samples: 2777360700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-13 08:00:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:00:23,250][71000] Updated weights for policy 0, policy_version 198284 (0.0029) [2024-06-13 08:00:25,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.1, 300 sec: 49040.9). Total num frames: 3248832512. Throughput: 0: 49688.8. Samples: 2777662440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-13 08:00:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:00:26,068][71000] Updated weights for policy 0, policy_version 198294 (0.0023) [2024-06-13 08:00:29,883][71000] Updated weights for policy 0, policy_version 198304 (0.0024) [2024-06-13 08:00:30,940][70768] Fps is (10 sec: 42597.8, 60 sec: 49151.8, 300 sec: 49096.5). Total num frames: 3249029120. Throughput: 0: 49367.0. Samples: 2777954080. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-13 08:00:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:00:32,773][71000] Updated weights for policy 0, policy_version 198314 (0.0026) [2024-06-13 08:00:35,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 3249307648. Throughput: 0: 49596.2. Samples: 2778091540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-13 08:00:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:00:36,168][71000] Updated weights for policy 0, policy_version 198324 (0.0020) [2024-06-13 08:00:39,186][71000] Updated weights for policy 0, policy_version 198334 (0.0029) [2024-06-13 08:00:40,940][70768] Fps is (10 sec: 57344.4, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 3249602560. Throughput: 0: 49541.3. Samples: 2778387780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-13 08:00:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:00:42,863][71000] Updated weights for policy 0, policy_version 198344 (0.0024) [2024-06-13 08:00:45,146][70980] Signal inference workers to stop experience collection... (41600 times) [2024-06-13 08:00:45,187][71000] InferenceWorker_p0-w0: stopping experience collection (41600 times) [2024-06-13 08:00:45,255][70980] Signal inference workers to resume experience collection... (41600 times) [2024-06-13 08:00:45,255][71000] InferenceWorker_p0-w0: resuming experience collection (41600 times) [2024-06-13 08:00:45,941][70768] Fps is (10 sec: 49142.5, 60 sec: 48877.4, 300 sec: 49040.6). Total num frames: 3249799168. Throughput: 0: 49787.2. Samples: 2778689860. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-13 08:00:45,942][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:00:46,165][71000] Updated weights for policy 0, policy_version 198354 (0.0022) [2024-06-13 08:00:49,635][71000] Updated weights for policy 0, policy_version 198364 (0.0027) [2024-06-13 08:00:50,940][70768] Fps is (10 sec: 40960.0, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3250012160. Throughput: 0: 49211.0. Samples: 2778826960. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-13 08:00:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:00:52,575][71000] Updated weights for policy 0, policy_version 198374 (0.0026) [2024-06-13 08:00:55,939][70768] Fps is (10 sec: 50800.3, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 3250307072. Throughput: 0: 49231.1. Samples: 2779123000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 08:00:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:00:56,038][71000] Updated weights for policy 0, policy_version 198384 (0.0026) [2024-06-13 08:00:59,302][71000] Updated weights for policy 0, policy_version 198394 (0.0035) [2024-06-13 08:01:00,940][70768] Fps is (10 sec: 55705.6, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 3250569216. Throughput: 0: 49404.2. Samples: 2779423220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 08:01:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:01:02,672][71000] Updated weights for policy 0, policy_version 198404 (0.0028) [2024-06-13 08:01:05,781][71000] Updated weights for policy 0, policy_version 198414 (0.0035) [2024-06-13 08:01:05,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49424.9, 300 sec: 49263.1). Total num frames: 3250814976. Throughput: 0: 49239.0. Samples: 2779576460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 08:01:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:01:09,294][71000] Updated weights for policy 0, policy_version 198424 (0.0027) [2024-06-13 08:01:10,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49698.1, 300 sec: 49263.1). Total num frames: 3251027968. Throughput: 0: 49059.7. Samples: 2779870120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 08:01:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:01:12,653][71000] Updated weights for policy 0, policy_version 198434 (0.0033) [2024-06-13 08:01:15,807][71000] Updated weights for policy 0, policy_version 198444 (0.0024) [2024-06-13 08:01:15,941][70768] Fps is (10 sec: 49145.2, 60 sec: 49424.0, 300 sec: 49429.5). Total num frames: 3251306496. Throughput: 0: 49114.5. Samples: 2780164300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 08:01:15,942][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:01:19,055][71000] Updated weights for policy 0, policy_version 198454 (0.0030) [2024-06-13 08:01:20,940][70768] Fps is (10 sec: 52427.7, 60 sec: 49151.8, 300 sec: 49318.8). Total num frames: 3251552256. Throughput: 0: 49565.1. Samples: 2780321980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 08:01:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:01:22,540][71000] Updated weights for policy 0, policy_version 198464 (0.0035) [2024-06-13 08:01:25,912][71000] Updated weights for policy 0, policy_version 198474 (0.0040) [2024-06-13 08:01:25,940][70768] Fps is (10 sec: 49159.1, 60 sec: 49425.2, 300 sec: 49263.1). Total num frames: 3251798016. Throughput: 0: 49479.5. Samples: 2780614360. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 08:01:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:01:28,964][71000] Updated weights for policy 0, policy_version 198484 (0.0022) [2024-06-13 08:01:30,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49698.2, 300 sec: 49207.5). Total num frames: 3252011008. Throughput: 0: 49246.0. Samples: 2780905840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 08:01:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:01:32,579][71000] Updated weights for policy 0, policy_version 198494 (0.0031) [2024-06-13 08:01:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 49318.7). Total num frames: 3252273152. Throughput: 0: 49137.8. Samples: 2781038160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 08:01:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:01:35,963][71000] Updated weights for policy 0, policy_version 198504 (0.0027) [2024-06-13 08:01:39,171][71000] Updated weights for policy 0, policy_version 198514 (0.0030) [2024-06-13 08:01:40,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 3252535296. Throughput: 0: 49221.7. Samples: 2781337980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 08:01:40,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:01:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000198519_3252535296.pth... [2024-06-13 08:01:41,004][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000197796_3240689664.pth [2024-06-13 08:01:42,592][71000] Updated weights for policy 0, policy_version 198524 (0.0028) [2024-06-13 08:01:43,812][70980] Signal inference workers to stop experience collection... (41650 times) [2024-06-13 08:01:43,864][71000] InferenceWorker_p0-w0: stopping experience collection (41650 times) [2024-06-13 08:01:43,864][70980] Signal inference workers to resume experience collection... 
(41650 times) [2024-06-13 08:01:43,882][71000] InferenceWorker_p0-w0: resuming experience collection (41650 times) [2024-06-13 08:01:45,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49426.6, 300 sec: 49207.5). Total num frames: 3252764672. Throughput: 0: 49353.8. Samples: 2781644140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 08:01:45,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 08:01:46,007][71000] Updated weights for policy 0, policy_version 198534 (0.0025) [2024-06-13 08:01:49,125][71000] Updated weights for policy 0, policy_version 198544 (0.0026) [2024-06-13 08:01:50,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49698.1, 300 sec: 49207.6). Total num frames: 3252994048. Throughput: 0: 49183.6. Samples: 2781789720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 08:01:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:01:52,639][71000] Updated weights for policy 0, policy_version 198554 (0.0025) [2024-06-13 08:01:55,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 3253256192. Throughput: 0: 49020.1. Samples: 2782076020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 08:01:55,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:01:56,039][71000] Updated weights for policy 0, policy_version 198564 (0.0025) [2024-06-13 08:01:59,219][71000] Updated weights for policy 0, policy_version 198574 (0.0022) [2024-06-13 08:02:00,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49151.9, 300 sec: 49263.0). Total num frames: 3253518336. Throughput: 0: 49033.0. Samples: 2782370720. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 08:02:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:02:03,083][71000] Updated weights for policy 0, policy_version 198584 (0.0031) [2024-06-13 08:02:05,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48606.0, 300 sec: 49152.0). Total num frames: 3253731328. Throughput: 0: 48880.3. Samples: 2782521580. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 08:02:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:02:06,168][71000] Updated weights for policy 0, policy_version 198594 (0.0031) [2024-06-13 08:02:09,487][71000] Updated weights for policy 0, policy_version 198604 (0.0038) [2024-06-13 08:02:10,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 3253977088. Throughput: 0: 48815.0. Samples: 2782811040. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 08:02:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:02:12,859][71000] Updated weights for policy 0, policy_version 198614 (0.0031) [2024-06-13 08:02:15,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48607.1, 300 sec: 49263.1). Total num frames: 3254222848. Throughput: 0: 48929.8. Samples: 2783107680. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 08:02:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:02:16,207][71000] Updated weights for policy 0, policy_version 198624 (0.0027) [2024-06-13 08:02:19,350][71000] Updated weights for policy 0, policy_version 198634 (0.0030) [2024-06-13 08:02:20,940][70768] Fps is (10 sec: 50791.2, 60 sec: 48879.1, 300 sec: 49263.1). Total num frames: 3254484992. Throughput: 0: 49328.4. Samples: 2783257940. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 08:02:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:02:22,943][71000] Updated weights for policy 0, policy_version 198644 (0.0030) [2024-06-13 08:02:25,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 3254730752. Throughput: 0: 49215.6. Samples: 2783552680. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 08:02:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:02:25,998][71000] Updated weights for policy 0, policy_version 198654 (0.0028) [2024-06-13 08:02:29,393][71000] Updated weights for policy 0, policy_version 198664 (0.0026) [2024-06-13 08:02:30,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 3254976512. Throughput: 0: 49030.7. Samples: 2783850520. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 08:02:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:02:32,501][71000] Updated weights for policy 0, policy_version 198674 (0.0032) [2024-06-13 08:02:35,939][70768] Fps is (10 sec: 47514.1, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 3255205888. Throughput: 0: 49083.3. Samples: 2783998460. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 08:02:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:02:36,168][71000] Updated weights for policy 0, policy_version 198684 (0.0025) [2024-06-13 08:02:39,404][71000] Updated weights for policy 0, policy_version 198694 (0.0027) [2024-06-13 08:02:40,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 3255468032. Throughput: 0: 49214.4. Samples: 2784290680. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 08:02:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:02:42,743][71000] Updated weights for policy 0, policy_version 198704 (0.0031) [2024-06-13 08:02:44,047][70980] Signal inference workers to stop experience collection... (41700 times) [2024-06-13 08:02:44,048][70980] Signal inference workers to resume experience collection... 
(41700 times) [2024-06-13 08:02:44,093][71000] InferenceWorker_p0-w0: stopping experience collection (41700 times) [2024-06-13 08:02:44,093][71000] InferenceWorker_p0-w0: resuming experience collection (41700 times) [2024-06-13 08:02:45,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 3255713792. Throughput: 0: 49006.8. Samples: 2784576020. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 08:02:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:02:46,219][71000] Updated weights for policy 0, policy_version 198714 (0.0031) [2024-06-13 08:02:49,553][71000] Updated weights for policy 0, policy_version 198724 (0.0026) [2024-06-13 08:02:50,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3255959552. Throughput: 0: 49060.4. Samples: 2784729300. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 08:02:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 08:02:53,091][71000] Updated weights for policy 0, policy_version 198734 (0.0026) [2024-06-13 08:02:55,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48878.9, 300 sec: 49152.2). Total num frames: 3256188928. Throughput: 0: 49202.9. Samples: 2785025160. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 08:02:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:02:56,124][71000] Updated weights for policy 0, policy_version 198744 (0.0034) [2024-06-13 08:02:59,692][71000] Updated weights for policy 0, policy_version 198754 (0.0030) [2024-06-13 08:03:00,939][70768] Fps is (10 sec: 45875.7, 60 sec: 48333.0, 300 sec: 49152.0). Total num frames: 3256418304. Throughput: 0: 48941.0. Samples: 2785310020. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 08:03:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:03:02,917][71000] Updated weights for policy 0, policy_version 198764 (0.0034) [2024-06-13 08:03:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 3256680448. Throughput: 0: 48856.9. Samples: 2785456500. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-13 08:03:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:03:06,603][71000] Updated weights for policy 0, policy_version 198774 (0.0032) [2024-06-13 08:03:09,938][71000] Updated weights for policy 0, policy_version 198784 (0.0027) [2024-06-13 08:03:10,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 3256926208. Throughput: 0: 48782.5. Samples: 2785747900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 08:03:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:03:13,405][71000] Updated weights for policy 0, policy_version 198794 (0.0032) [2024-06-13 08:03:15,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 3257139200. Throughput: 0: 48636.4. Samples: 2786039160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 08:03:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:03:16,461][71000] Updated weights for policy 0, policy_version 198804 (0.0022) [2024-06-13 08:03:19,912][71000] Updated weights for policy 0, policy_version 198814 (0.0032) [2024-06-13 08:03:20,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48605.8, 300 sec: 49152.2). Total num frames: 3257401344. Throughput: 0: 48424.3. Samples: 2786177560. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 08:03:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:03:23,352][71000] Updated weights for policy 0, policy_version 198824 (0.0027) [2024-06-13 08:03:25,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 3257663488. Throughput: 0: 48450.8. Samples: 2786470960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 08:03:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:03:26,829][71000] Updated weights for policy 0, policy_version 198834 (0.0025) [2024-06-13 08:03:30,083][71000] Updated weights for policy 0, policy_version 198844 (0.0023) [2024-06-13 08:03:30,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.8, 300 sec: 49096.4). Total num frames: 3257892864. Throughput: 0: 48725.3. Samples: 2786768660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 08:03:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:03:33,551][71000] Updated weights for policy 0, policy_version 198854 (0.0026) [2024-06-13 08:03:35,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3258138624. Throughput: 0: 48669.9. Samples: 2786919440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 08:03:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:03:36,641][71000] Updated weights for policy 0, policy_version 198864 (0.0033) [2024-06-13 08:03:40,173][71000] Updated weights for policy 0, policy_version 198874 (0.0030) [2024-06-13 08:03:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48332.9, 300 sec: 49152.0). Total num frames: 3258368000. Throughput: 0: 48379.9. Samples: 2787202260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 08:03:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:03:40,954][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000198875_3258368000.pth... 
[2024-06-13 08:03:41,010][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000198159_3246637056.pth [2024-06-13 08:03:43,537][71000] Updated weights for policy 0, policy_version 198884 (0.0035) [2024-06-13 08:03:45,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 3258646528. Throughput: 0: 48471.4. Samples: 2787491240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 08:03:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:03:46,722][71000] Updated weights for policy 0, policy_version 198894 (0.0025) [2024-06-13 08:03:50,261][71000] Updated weights for policy 0, policy_version 198904 (0.0028) [2024-06-13 08:03:50,940][70768] Fps is (10 sec: 52428.9, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3258892288. Throughput: 0: 48889.8. Samples: 2787656540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 08:03:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:03:53,469][71000] Updated weights for policy 0, policy_version 198914 (0.0035) [2024-06-13 08:03:55,381][70980] Signal inference workers to stop experience collection... (41750 times) [2024-06-13 08:03:55,381][70980] Signal inference workers to resume experience collection... (41750 times) [2024-06-13 08:03:55,423][71000] InferenceWorker_p0-w0: stopping experience collection (41750 times) [2024-06-13 08:03:55,424][71000] InferenceWorker_p0-w0: resuming experience collection (41750 times) [2024-06-13 08:03:55,940][70768] Fps is (10 sec: 44236.6, 60 sec: 48332.7, 300 sec: 48929.8). Total num frames: 3259088896. Throughput: 0: 48800.5. Samples: 2787943920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 08:03:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:03:57,100][71000] Updated weights for policy 0, policy_version 198924 (0.0027) [2024-06-13 08:04:00,281][71000] Updated weights for policy 0, policy_version 198934 (0.0026) [2024-06-13 08:04:00,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48605.8, 300 sec: 49096.4). Total num frames: 3259334656. Throughput: 0: 48615.1. Samples: 2788226840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 08:04:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:04:03,888][71000] Updated weights for policy 0, policy_version 198944 (0.0028) [2024-06-13 08:04:05,939][70768] Fps is (10 sec: 52429.6, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 3259613184. Throughput: 0: 48693.9. Samples: 2788368780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 08:04:05,940][70768] Avg episode reward: [(0, '0.295')] [2024-06-13 08:04:06,752][71000] Updated weights for policy 0, policy_version 198954 (0.0024) [2024-06-13 08:04:10,439][71000] Updated weights for policy 0, policy_version 198964 (0.0034) [2024-06-13 08:04:10,940][70768] Fps is (10 sec: 50789.5, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 3259842560. Throughput: 0: 48946.9. Samples: 2788673580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 08:04:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:04:13,026][71000] Updated weights for policy 0, policy_version 198974 (0.0030) [2024-06-13 08:04:15,939][70768] Fps is (10 sec: 45875.1, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 3260071936. Throughput: 0: 48960.1. Samples: 2788971860. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:04:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:04:16,886][71000] Updated weights for policy 0, policy_version 198984 (0.0023) [2024-06-13 08:04:20,749][71000] Updated weights for policy 0, policy_version 198994 (0.0029) [2024-06-13 08:04:20,939][70768] Fps is (10 sec: 47514.7, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 3260317696. Throughput: 0: 48660.9. Samples: 2789109180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:04:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:04:23,698][71000] Updated weights for policy 0, policy_version 199004 (0.0024) [2024-06-13 08:04:25,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 3260596224. Throughput: 0: 48912.5. Samples: 2789403320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:04:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:04:27,329][71000] Updated weights for policy 0, policy_version 199014 (0.0036) [2024-06-13 08:04:30,272][71000] Updated weights for policy 0, policy_version 199024 (0.0024) [2024-06-13 08:04:30,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3260825600. Throughput: 0: 48860.0. Samples: 2789689940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:04:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:04:33,731][71000] Updated weights for policy 0, policy_version 199034 (0.0032) [2024-06-13 08:04:35,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 3261054976. Throughput: 0: 48416.4. Samples: 2789835280. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:04:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:04:36,909][71000] Updated weights for policy 0, policy_version 199044 (0.0027) [2024-06-13 08:04:40,634][71000] Updated weights for policy 0, policy_version 199054 (0.0026) [2024-06-13 08:04:40,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3261300736. Throughput: 0: 48536.9. Samples: 2790128080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:04:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:04:43,502][71000] Updated weights for policy 0, policy_version 199064 (0.0028) [2024-06-13 08:04:45,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49151.9, 300 sec: 49263.1). Total num frames: 3261595648. Throughput: 0: 48753.2. Samples: 2790420740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:04:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:04:47,561][71000] Updated weights for policy 0, policy_version 199074 (0.0027) [2024-06-13 08:04:50,490][71000] Updated weights for policy 0, policy_version 199084 (0.0024) [2024-06-13 08:04:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48332.8, 300 sec: 48985.4). Total num frames: 3261792256. Throughput: 0: 49093.7. Samples: 2790578000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:04:50,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 08:04:51,498][70980] Signal inference workers to stop experience collection... (41800 times) [2024-06-13 08:04:51,534][71000] InferenceWorker_p0-w0: stopping experience collection (41800 times) [2024-06-13 08:04:51,551][70980] Signal inference workers to resume experience collection... 
(41800 times) [2024-06-13 08:04:51,552][71000] InferenceWorker_p0-w0: resuming experience collection (41800 times) [2024-06-13 08:04:54,279][71000] Updated weights for policy 0, policy_version 199094 (0.0025) [2024-06-13 08:04:55,940][70768] Fps is (10 sec: 42598.4, 60 sec: 48878.9, 300 sec: 48818.7). Total num frames: 3262021632. Throughput: 0: 48664.5. Samples: 2790863480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:04:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:04:57,361][71000] Updated weights for policy 0, policy_version 199104 (0.0029) [2024-06-13 08:05:00,772][71000] Updated weights for policy 0, policy_version 199114 (0.0031) [2024-06-13 08:05:00,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 3262283776. Throughput: 0: 48425.7. Samples: 2791151020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:05:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:05:03,824][71000] Updated weights for policy 0, policy_version 199124 (0.0029) [2024-06-13 08:05:05,939][70768] Fps is (10 sec: 54068.1, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 3262562304. Throughput: 0: 48908.9. Samples: 2791310080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:05:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:05:07,806][71000] Updated weights for policy 0, policy_version 199134 (0.0029) [2024-06-13 08:05:10,656][71000] Updated weights for policy 0, policy_version 199144 (0.0029) [2024-06-13 08:05:10,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49152.2, 300 sec: 48985.4). Total num frames: 3262791680. Throughput: 0: 48880.0. Samples: 2791602920. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:05:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:05:14,262][71000] Updated weights for policy 0, policy_version 199154 (0.0023) [2024-06-13 08:05:15,939][70768] Fps is (10 sec: 44236.9, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 3263004672. Throughput: 0: 49054.3. Samples: 2791897380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:05:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:05:17,383][71000] Updated weights for policy 0, policy_version 199164 (0.0024) [2024-06-13 08:05:20,897][71000] Updated weights for policy 0, policy_version 199174 (0.0032) [2024-06-13 08:05:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49151.9, 300 sec: 48929.9). Total num frames: 3263266816. Throughput: 0: 48946.7. Samples: 2792037880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:05:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:05:23,997][71000] Updated weights for policy 0, policy_version 199184 (0.0030) [2024-06-13 08:05:25,940][70768] Fps is (10 sec: 52428.2, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3263528960. Throughput: 0: 49045.3. Samples: 2792335120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 08:05:25,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:05:27,747][71000] Updated weights for policy 0, policy_version 199194 (0.0029) [2024-06-13 08:05:30,807][71000] Updated weights for policy 0, policy_version 199204 (0.0029) [2024-06-13 08:05:30,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3263758336. Throughput: 0: 49062.3. Samples: 2792628540. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 08:05:30,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:05:34,461][71000] Updated weights for policy 0, policy_version 199214 (0.0029) [2024-06-13 08:05:35,939][70768] Fps is (10 sec: 45875.8, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 3263987712. Throughput: 0: 48695.7. Samples: 2792769300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 08:05:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:05:37,491][71000] Updated weights for policy 0, policy_version 199224 (0.0034) [2024-06-13 08:05:40,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48930.2). Total num frames: 3264233472. Throughput: 0: 48961.0. Samples: 2793066720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 08:05:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:05:41,059][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000199234_3264249856.pth... [2024-06-13 08:05:41,060][71000] Updated weights for policy 0, policy_version 199234 (0.0023) [2024-06-13 08:05:41,122][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000198519_3252535296.pth [2024-06-13 08:05:44,210][71000] Updated weights for policy 0, policy_version 199244 (0.0024) [2024-06-13 08:05:45,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48332.9, 300 sec: 49096.5). Total num frames: 3264495616. Throughput: 0: 48994.7. Samples: 2793355780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 08:05:45,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:05:47,659][71000] Updated weights for policy 0, policy_version 199254 (0.0024) [2024-06-13 08:05:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3264724992. Throughput: 0: 48857.3. Samples: 2793508660. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 08:05:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:05:51,019][71000] Updated weights for policy 0, policy_version 199264 (0.0026) [2024-06-13 08:05:54,370][71000] Updated weights for policy 0, policy_version 199274 (0.0036) [2024-06-13 08:05:55,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.1, 300 sec: 48874.3). Total num frames: 3264987136. Throughput: 0: 48823.8. Samples: 2793800000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 08:05:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:05:57,747][71000] Updated weights for policy 0, policy_version 199284 (0.0039) [2024-06-13 08:06:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 3265216512. Throughput: 0: 48787.9. Samples: 2794092840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 08:06:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:06:01,068][71000] Updated weights for policy 0, policy_version 199294 (0.0032) [2024-06-13 08:06:04,432][71000] Updated weights for policy 0, policy_version 199304 (0.0033) [2024-06-13 08:06:05,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48332.8, 300 sec: 48929.8). Total num frames: 3265462272. Throughput: 0: 48730.7. Samples: 2794230760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 08:06:05,948][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:06:07,754][71000] Updated weights for policy 0, policy_version 199314 (0.0034) [2024-06-13 08:06:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.7, 300 sec: 48763.5). Total num frames: 3265691648. Throughput: 0: 48795.1. Samples: 2794530900. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 08:06:10,949][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:06:11,178][71000] Updated weights for policy 0, policy_version 199324 (0.0026) [2024-06-13 08:06:14,190][71000] Updated weights for policy 0, policy_version 199334 (0.0023) [2024-06-13 08:06:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 3265953792. Throughput: 0: 48954.3. Samples: 2794831480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 08:06:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:06:17,669][71000] Updated weights for policy 0, policy_version 199344 (0.0025) [2024-06-13 08:06:20,939][70768] Fps is (10 sec: 50791.0, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3266199552. Throughput: 0: 48866.7. Samples: 2794968300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 08:06:20,949][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:06:21,070][71000] Updated weights for policy 0, policy_version 199354 (0.0032) [2024-06-13 08:06:22,247][70980] Signal inference workers to stop experience collection... (41850 times) [2024-06-13 08:06:22,290][71000] InferenceWorker_p0-w0: stopping experience collection (41850 times) [2024-06-13 08:06:22,297][70980] Signal inference workers to resume experience collection... (41850 times) [2024-06-13 08:06:22,307][71000] InferenceWorker_p0-w0: resuming experience collection (41850 times) [2024-06-13 08:06:24,491][71000] Updated weights for policy 0, policy_version 199364 (0.0039) [2024-06-13 08:06:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.9, 300 sec: 48929.9). Total num frames: 3266445312. Throughput: 0: 48905.8. Samples: 2795267480. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-13 08:06:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:06:27,794][71000] Updated weights for policy 0, policy_version 199374 (0.0021) [2024-06-13 08:06:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3266691072. Throughput: 0: 48980.4. Samples: 2795559900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 08:06:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:06:31,400][71000] Updated weights for policy 0, policy_version 199384 (0.0030) [2024-06-13 08:06:34,398][71000] Updated weights for policy 0, policy_version 199394 (0.0022) [2024-06-13 08:06:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 48818.8). Total num frames: 3266936832. Throughput: 0: 48864.9. Samples: 2795707580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 08:06:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:06:37,667][71000] Updated weights for policy 0, policy_version 199404 (0.0029) [2024-06-13 08:06:40,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 3267166208. Throughput: 0: 48795.2. Samples: 2795995780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 08:06:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:06:41,281][71000] Updated weights for policy 0, policy_version 199414 (0.0040) [2024-06-13 08:06:44,790][71000] Updated weights for policy 0, policy_version 199424 (0.0026) [2024-06-13 08:06:45,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3267428352. Throughput: 0: 48784.1. Samples: 2796288120. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 08:06:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:06:47,955][71000] Updated weights for policy 0, policy_version 199434 (0.0018) [2024-06-13 08:06:50,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3267657728. Throughput: 0: 49168.1. Samples: 2796443320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 08:06:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:06:51,230][71000] Updated weights for policy 0, policy_version 199444 (0.0041) [2024-06-13 08:06:54,373][71000] Updated weights for policy 0, policy_version 199454 (0.0026) [2024-06-13 08:06:55,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3267919872. Throughput: 0: 49037.3. Samples: 2796737580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 08:06:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:06:57,476][71000] Updated weights for policy 0, policy_version 199464 (0.0023) [2024-06-13 08:07:00,905][71000] Updated weights for policy 0, policy_version 199474 (0.0031) [2024-06-13 08:07:00,940][70768] Fps is (10 sec: 52427.9, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 3268182016. Throughput: 0: 49075.4. Samples: 2797039880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 08:07:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:07:04,558][71000] Updated weights for policy 0, policy_version 199484 (0.0029) [2024-06-13 08:07:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3268395008. Throughput: 0: 49133.6. Samples: 2797179320. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 08:07:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:07:07,693][71000] Updated weights for policy 0, policy_version 199494 (0.0024) [2024-06-13 08:07:10,940][70768] Fps is (10 sec: 47514.4, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 3268657152. Throughput: 0: 49036.0. Samples: 2797474100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 08:07:10,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:07:11,007][71000] Updated weights for policy 0, policy_version 199504 (0.0023) [2024-06-13 08:07:14,439][71000] Updated weights for policy 0, policy_version 199514 (0.0017) [2024-06-13 08:07:15,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49424.9, 300 sec: 48929.8). Total num frames: 3268919296. Throughput: 0: 49155.0. Samples: 2797771880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 08:07:15,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 08:07:17,517][71000] Updated weights for policy 0, policy_version 199524 (0.0023) [2024-06-13 08:07:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 3269148672. Throughput: 0: 49120.0. Samples: 2797917980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 08:07:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:07:21,123][71000] Updated weights for policy 0, policy_version 199534 (0.0030) [2024-06-13 08:07:24,210][71000] Updated weights for policy 0, policy_version 199544 (0.0033) [2024-06-13 08:07:25,940][70768] Fps is (10 sec: 44237.5, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 3269361664. Throughput: 0: 49268.1. Samples: 2798212840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 08:07:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:07:27,730][71000] Updated weights for policy 0, policy_version 199554 (0.0024) [2024-06-13 08:07:30,228][70980] Signal inference workers to stop experience collection... 
(41900 times) [2024-06-13 08:07:30,228][70980] Signal inference workers to resume experience collection... (41900 times) [2024-06-13 08:07:30,243][71000] InferenceWorker_p0-w0: stopping experience collection (41900 times) [2024-06-13 08:07:30,243][71000] InferenceWorker_p0-w0: resuming experience collection (41900 times) [2024-06-13 08:07:30,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 3269640192. Throughput: 0: 49398.9. Samples: 2798511080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-13 08:07:30,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 08:07:30,979][71000] Updated weights for policy 0, policy_version 199564 (0.0034) [2024-06-13 08:07:34,467][71000] Updated weights for policy 0, policy_version 199574 (0.0028) [2024-06-13 08:07:35,940][70768] Fps is (10 sec: 54067.3, 60 sec: 49425.1, 300 sec: 48929.9). Total num frames: 3269902336. Throughput: 0: 49248.9. Samples: 2798659520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 08:07:35,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:07:37,725][71000] Updated weights for policy 0, policy_version 199584 (0.0023) [2024-06-13 08:07:40,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 3270115328. Throughput: 0: 49299.1. Samples: 2798956040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 08:07:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:07:40,998][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000199593_3270131712.pth... [2024-06-13 08:07:41,056][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000198875_3258368000.pth [2024-06-13 08:07:41,202][71000] Updated weights for policy 0, policy_version 199594 (0.0035) [2024-06-13 08:07:44,271][71000] Updated weights for policy 0, policy_version 199604 (0.0030) [2024-06-13 08:07:45,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 48874.3). 
Total num frames: 3270377472. Throughput: 0: 49167.4. Samples: 2799252400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 08:07:45,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 08:07:47,717][71000] Updated weights for policy 0, policy_version 199614 (0.0036) [2024-06-13 08:07:50,939][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 3270623232. Throughput: 0: 49132.1. Samples: 2799390260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 08:07:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:07:50,958][71000] Updated weights for policy 0, policy_version 199624 (0.0033) [2024-06-13 08:07:54,102][71000] Updated weights for policy 0, policy_version 199634 (0.0032) [2024-06-13 08:07:55,939][70768] Fps is (10 sec: 50790.3, 60 sec: 49425.2, 300 sec: 49040.9). Total num frames: 3270885376. Throughput: 0: 49168.9. Samples: 2799686700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 08:07:55,940][70768] Avg episode reward: [(0, '0.295')] [2024-06-13 08:07:57,691][71000] Updated weights for policy 0, policy_version 199644 (0.0026) [2024-06-13 08:08:00,939][70768] Fps is (10 sec: 47513.4, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 3271098368. Throughput: 0: 49174.4. Samples: 2799984720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 08:08:00,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:08:01,187][71000] Updated weights for policy 0, policy_version 199654 (0.0033) [2024-06-13 08:08:04,044][71000] Updated weights for policy 0, policy_version 199664 (0.0032) [2024-06-13 08:08:05,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 3271344128. Throughput: 0: 49115.2. Samples: 2800128160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 08:08:05,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:08:07,602][71000] Updated weights for policy 0, policy_version 199674 (0.0035) [2024-06-13 08:08:10,804][71000] Updated weights for policy 0, policy_version 199684 (0.0027) [2024-06-13 08:08:10,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 3271622656. Throughput: 0: 49202.6. Samples: 2800426960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 08:08:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:08:14,338][71000] Updated weights for policy 0, policy_version 199694 (0.0036) [2024-06-13 08:08:15,940][70768] Fps is (10 sec: 52428.8, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3271868416. Throughput: 0: 49162.0. Samples: 2800723360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 08:08:15,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:08:17,586][71000] Updated weights for policy 0, policy_version 199704 (0.0032) [2024-06-13 08:08:20,778][71000] Updated weights for policy 0, policy_version 199714 (0.0025) [2024-06-13 08:08:20,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 3272114176. Throughput: 0: 49272.4. Samples: 2800876780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 08:08:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:08:23,974][71000] Updated weights for policy 0, policy_version 199724 (0.0025) [2024-06-13 08:08:25,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49425.1, 300 sec: 48929.9). Total num frames: 3272327168. Throughput: 0: 49099.2. Samples: 2801165500. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 08:08:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:08:27,620][71000] Updated weights for policy 0, policy_version 199734 (0.0023) [2024-06-13 08:08:30,330][71000] Updated weights for policy 0, policy_version 199744 (0.0031) [2024-06-13 08:08:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3272605696. Throughput: 0: 49028.7. Samples: 2801458700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 08:08:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:08:34,130][71000] Updated weights for policy 0, policy_version 199754 (0.0029) [2024-06-13 08:08:35,939][70768] Fps is (10 sec: 50790.8, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3272835072. Throughput: 0: 49517.8. Samples: 2801618560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 08:08:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:08:37,674][71000] Updated weights for policy 0, policy_version 199764 (0.0033) [2024-06-13 08:08:40,427][70980] Signal inference workers to stop experience collection... (41950 times) [2024-06-13 08:08:40,428][70980] Signal inference workers to resume experience collection... (41950 times) [2024-06-13 08:08:40,447][71000] InferenceWorker_p0-w0: stopping experience collection (41950 times) [2024-06-13 08:08:40,447][71000] InferenceWorker_p0-w0: resuming experience collection (41950 times) [2024-06-13 08:08:40,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 3273080832. Throughput: 0: 49372.8. Samples: 2801908480. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-13 08:08:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:08:40,969][71000] Updated weights for policy 0, policy_version 199774 (0.0038) [2024-06-13 08:08:44,188][71000] Updated weights for policy 0, policy_version 199784 (0.0033) [2024-06-13 08:08:45,940][70768] Fps is (10 sec: 47512.6, 60 sec: 48878.7, 300 sec: 48874.3). Total num frames: 3273310208. Throughput: 0: 49261.1. Samples: 2802201480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 08:08:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:08:47,791][71000] Updated weights for policy 0, policy_version 199794 (0.0024) [2024-06-13 08:08:50,420][71000] Updated weights for policy 0, policy_version 199804 (0.0027) [2024-06-13 08:08:50,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 3273588736. Throughput: 0: 49470.5. Samples: 2802354340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 08:08:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:08:54,385][71000] Updated weights for policy 0, policy_version 199814 (0.0033) [2024-06-13 08:08:55,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3273834496. Throughput: 0: 49366.7. Samples: 2802648460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 08:08:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:08:57,451][71000] Updated weights for policy 0, policy_version 199824 (0.0029) [2024-06-13 08:09:00,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49151.8, 300 sec: 48929.8). Total num frames: 3274047488. Throughput: 0: 49032.7. Samples: 2802929840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 08:09:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:09:01,531][71000] Updated weights for policy 0, policy_version 199834 (0.0033) [2024-06-13 08:09:04,521][71000] Updated weights for policy 0, policy_version 199844 (0.0032) [2024-06-13 08:09:05,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3274293248. Throughput: 0: 48703.6. Samples: 2803068440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 08:09:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:09:08,237][71000] Updated weights for policy 0, policy_version 199854 (0.0023) [2024-06-13 08:09:10,900][71000] Updated weights for policy 0, policy_version 199864 (0.0021) [2024-06-13 08:09:10,940][70768] Fps is (10 sec: 52429.5, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3274571776. Throughput: 0: 49169.3. Samples: 2803378120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 08:09:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:09:14,695][71000] Updated weights for policy 0, policy_version 199874 (0.0027) [2024-06-13 08:09:15,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3274817536. Throughput: 0: 49082.7. Samples: 2803667420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 08:09:15,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:09:17,852][71000] Updated weights for policy 0, policy_version 199884 (0.0032) [2024-06-13 08:09:20,940][70768] Fps is (10 sec: 45874.5, 60 sec: 48605.7, 300 sec: 48929.8). Total num frames: 3275030528. Throughput: 0: 48904.6. Samples: 2803819280. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 08:09:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:09:21,438][71000] Updated weights for policy 0, policy_version 199894 (0.0026) [2024-06-13 08:09:24,762][71000] Updated weights for policy 0, policy_version 199904 (0.0036) [2024-06-13 08:09:25,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 3275276288. Throughput: 0: 48869.7. Samples: 2804107620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 08:09:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:09:28,009][71000] Updated weights for policy 0, policy_version 199914 (0.0029) [2024-06-13 08:09:30,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 3275522048. Throughput: 0: 48852.4. Samples: 2804399840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 08:09:30,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:09:31,163][71000] Updated weights for policy 0, policy_version 199924 (0.0030) [2024-06-13 08:09:35,112][71000] Updated weights for policy 0, policy_version 199934 (0.0033) [2024-06-13 08:09:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 3275767808. Throughput: 0: 48643.6. Samples: 2804543300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 08:09:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:09:37,839][71000] Updated weights for policy 0, policy_version 199944 (0.0031) [2024-06-13 08:09:40,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 3275997184. Throughput: 0: 48595.6. Samples: 2804835260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 08:09:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:09:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000199951_3275997184.pth... 
[2024-06-13 08:09:40,993][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000199234_3264249856.pth [2024-06-13 08:09:41,843][71000] Updated weights for policy 0, policy_version 199954 (0.0026) [2024-06-13 08:09:44,614][71000] Updated weights for policy 0, policy_version 199964 (0.0030) [2024-06-13 08:09:45,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3276259328. Throughput: 0: 48750.4. Samples: 2805123600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 08:09:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:09:48,714][71000] Updated weights for policy 0, policy_version 199974 (0.0022) [2024-06-13 08:09:50,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48605.8, 300 sec: 49096.5). Total num frames: 3276505088. Throughput: 0: 49186.5. Samples: 2805281840. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-13 08:09:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:09:51,303][71000] Updated weights for policy 0, policy_version 199984 (0.0023) [2024-06-13 08:09:55,088][71000] Updated weights for policy 0, policy_version 199994 (0.0031) [2024-06-13 08:09:55,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 3276750848. Throughput: 0: 48824.8. Samples: 2805575240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-13 08:09:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:09:58,068][71000] Updated weights for policy 0, policy_version 200004 (0.0031) [2024-06-13 08:10:00,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 3276980224. Throughput: 0: 48821.8. Samples: 2805864400. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-13 08:10:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:10:01,883][71000] Updated weights for policy 0, policy_version 200014 (0.0036) [2024-06-13 08:10:04,762][71000] Updated weights for policy 0, policy_version 200024 (0.0023) [2024-06-13 08:10:05,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3277242368. Throughput: 0: 48530.9. Samples: 2806003160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-13 08:10:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:10:08,492][71000] Updated weights for policy 0, policy_version 200034 (0.0023) [2024-06-13 08:10:09,107][70980] Signal inference workers to stop experience collection... (42000 times) [2024-06-13 08:10:09,108][70980] Signal inference workers to resume experience collection... (42000 times) [2024-06-13 08:10:09,118][71000] InferenceWorker_p0-w0: stopping experience collection (42000 times) [2024-06-13 08:10:09,119][71000] InferenceWorker_p0-w0: resuming experience collection (42000 times) [2024-06-13 08:10:10,940][70768] Fps is (10 sec: 50789.8, 60 sec: 48605.8, 300 sec: 49096.4). Total num frames: 3277488128. Throughput: 0: 48820.8. Samples: 2806304560. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-13 08:10:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:10:11,354][71000] Updated weights for policy 0, policy_version 200044 (0.0028) [2024-06-13 08:10:15,178][71000] Updated weights for policy 0, policy_version 200054 (0.0030) [2024-06-13 08:10:15,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48059.8, 300 sec: 48929.9). Total num frames: 3277701120. Throughput: 0: 49080.2. Samples: 2806608440. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-13 08:10:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:10:17,922][71000] Updated weights for policy 0, policy_version 200064 (0.0024) [2024-06-13 08:10:20,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 3277963264. Throughput: 0: 48896.8. Samples: 2806743660. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-13 08:10:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:10:21,639][71000] Updated weights for policy 0, policy_version 200074 (0.0025) [2024-06-13 08:10:24,459][71000] Updated weights for policy 0, policy_version 200084 (0.0025) [2024-06-13 08:10:25,940][70768] Fps is (10 sec: 54066.6, 60 sec: 49425.0, 300 sec: 49096.4). Total num frames: 3278241792. Throughput: 0: 49011.9. Samples: 2807040800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-13 08:10:25,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:10:28,294][71000] Updated weights for policy 0, policy_version 200094 (0.0033) [2024-06-13 08:10:30,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.1, 300 sec: 49096.4). Total num frames: 3278471168. Throughput: 0: 49024.4. Samples: 2807329700. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-13 08:10:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:10:31,501][71000] Updated weights for policy 0, policy_version 200104 (0.0036) [2024-06-13 08:10:35,167][71000] Updated weights for policy 0, policy_version 200114 (0.0024) [2024-06-13 08:10:35,939][70768] Fps is (10 sec: 44237.4, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3278684160. Throughput: 0: 48839.8. Samples: 2807479620. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-13 08:10:35,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:10:38,131][71000] Updated weights for policy 0, policy_version 200124 (0.0022) [2024-06-13 08:10:40,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3278929920. Throughput: 0: 48869.0. Samples: 2807774340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-13 08:10:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:10:41,658][71000] Updated weights for policy 0, policy_version 200134 (0.0025) [2024-06-13 08:10:44,704][71000] Updated weights for policy 0, policy_version 200144 (0.0037) [2024-06-13 08:10:45,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3279208448. Throughput: 0: 48830.2. Samples: 2808061760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-13 08:10:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:10:48,303][71000] Updated weights for policy 0, policy_version 200154 (0.0040) [2024-06-13 08:10:50,940][70768] Fps is (10 sec: 52428.1, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3279454208. Throughput: 0: 48981.2. Samples: 2808207320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-13 08:10:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:10:51,351][71000] Updated weights for policy 0, policy_version 200164 (0.0036) [2024-06-13 08:10:55,307][71000] Updated weights for policy 0, policy_version 200174 (0.0031) [2024-06-13 08:10:55,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 3279667200. Throughput: 0: 48900.0. Samples: 2808505060. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-13 08:10:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:10:57,897][71000] Updated weights for policy 0, policy_version 200184 (0.0031) [2024-06-13 08:11:00,939][70768] Fps is (10 sec: 45876.0, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3279912960. Throughput: 0: 48814.3. Samples: 2808805080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 08:11:00,942][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:11:01,896][71000] Updated weights for policy 0, policy_version 200194 (0.0021) [2024-06-13 08:11:04,785][71000] Updated weights for policy 0, policy_version 200204 (0.0028) [2024-06-13 08:11:05,940][70768] Fps is (10 sec: 52429.4, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3280191488. Throughput: 0: 49158.4. Samples: 2808955780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 08:11:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:11:08,594][71000] Updated weights for policy 0, policy_version 200214 (0.0031) [2024-06-13 08:11:10,940][70768] Fps is (10 sec: 50789.1, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3280420864. Throughput: 0: 49035.5. Samples: 2809247400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 08:11:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:11:10,975][70980] Signal inference workers to stop experience collection... (42050 times) [2024-06-13 08:11:10,976][70980] Signal inference workers to resume experience collection... 
(42050 times) [2024-06-13 08:11:11,004][71000] InferenceWorker_p0-w0: stopping experience collection (42050 times) [2024-06-13 08:11:11,004][71000] InferenceWorker_p0-w0: resuming experience collection (42050 times) [2024-06-13 08:11:11,452][71000] Updated weights for policy 0, policy_version 200224 (0.0035) [2024-06-13 08:11:15,204][71000] Updated weights for policy 0, policy_version 200234 (0.0025) [2024-06-13 08:11:15,940][70768] Fps is (10 sec: 45874.5, 60 sec: 49151.9, 300 sec: 48985.3). Total num frames: 3280650240. Throughput: 0: 48898.1. Samples: 2809530120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 08:11:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:11:18,082][71000] Updated weights for policy 0, policy_version 200244 (0.0037) [2024-06-13 08:11:20,940][70768] Fps is (10 sec: 47514.4, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3280896000. Throughput: 0: 48726.2. Samples: 2809672300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 08:11:20,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 08:11:21,998][71000] Updated weights for policy 0, policy_version 200254 (0.0029) [2024-06-13 08:11:24,713][71000] Updated weights for policy 0, policy_version 200264 (0.0036) [2024-06-13 08:11:25,940][70768] Fps is (10 sec: 52429.7, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3281174528. Throughput: 0: 48795.1. Samples: 2809970120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 08:11:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:11:28,495][71000] Updated weights for policy 0, policy_version 200274 (0.0030) [2024-06-13 08:11:30,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 3281387520. Throughput: 0: 49028.8. Samples: 2810268060. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 08:11:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:11:31,703][71000] Updated weights for policy 0, policy_version 200284 (0.0032) [2024-06-13 08:11:35,391][71000] Updated weights for policy 0, policy_version 200294 (0.0030) [2024-06-13 08:11:35,940][70768] Fps is (10 sec: 45874.5, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3281633280. Throughput: 0: 48992.8. Samples: 2810412000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 08:11:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:11:38,075][71000] Updated weights for policy 0, policy_version 200304 (0.0031) [2024-06-13 08:11:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49424.9, 300 sec: 49040.9). Total num frames: 3281895424. Throughput: 0: 49084.9. Samples: 2810713880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 08:11:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:11:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000200311_3281895424.pth... [2024-06-13 08:11:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000199593_3270131712.pth [2024-06-13 08:11:41,897][71000] Updated weights for policy 0, policy_version 200314 (0.0027) [2024-06-13 08:11:44,610][71000] Updated weights for policy 0, policy_version 200324 (0.0028) [2024-06-13 08:11:45,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3282157568. Throughput: 0: 48897.1. Samples: 2811005460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 08:11:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:11:48,453][71000] Updated weights for policy 0, policy_version 200334 (0.0038) [2024-06-13 08:11:50,940][70768] Fps is (10 sec: 50791.1, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3282403328. Throughput: 0: 49113.8. Samples: 2811165900. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 08:11:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:11:51,264][71000] Updated weights for policy 0, policy_version 200344 (0.0033) [2024-06-13 08:11:55,124][71000] Updated weights for policy 0, policy_version 200354 (0.0031) [2024-06-13 08:11:55,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 3282616320. Throughput: 0: 49244.5. Samples: 2811463400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 08:11:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:11:58,106][71000] Updated weights for policy 0, policy_version 200364 (0.0021) [2024-06-13 08:12:00,940][70768] Fps is (10 sec: 45874.7, 60 sec: 49151.8, 300 sec: 49040.9). Total num frames: 3282862080. Throughput: 0: 49460.9. Samples: 2811755860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-13 08:12:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:12:01,693][71000] Updated weights for policy 0, policy_version 200374 (0.0028) [2024-06-13 08:12:04,634][71000] Updated weights for policy 0, policy_version 200384 (0.0035) [2024-06-13 08:12:05,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 3283140608. Throughput: 0: 49598.1. Samples: 2811904220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-13 08:12:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:12:08,681][71000] Updated weights for policy 0, policy_version 200394 (0.0029) [2024-06-13 08:12:10,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.2, 300 sec: 48985.4). Total num frames: 3283369984. Throughput: 0: 49455.1. Samples: 2812195600. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-13 08:12:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:12:11,358][71000] Updated weights for policy 0, policy_version 200404 (0.0030) [2024-06-13 08:12:15,431][71000] Updated weights for policy 0, policy_version 200414 (0.0031) [2024-06-13 08:12:15,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3283599360. Throughput: 0: 49248.6. Samples: 2812484240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-13 08:12:15,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:12:18,151][71000] Updated weights for policy 0, policy_version 200424 (0.0029) [2024-06-13 08:12:18,543][70980] Signal inference workers to stop experience collection... (42100 times) [2024-06-13 08:12:18,543][70980] Signal inference workers to resume experience collection... (42100 times) [2024-06-13 08:12:18,559][71000] InferenceWorker_p0-w0: stopping experience collection (42100 times) [2024-06-13 08:12:18,559][71000] InferenceWorker_p0-w0: resuming experience collection (42100 times) [2024-06-13 08:12:20,940][70768] Fps is (10 sec: 49151.0, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 3283861504. Throughput: 0: 49291.9. Samples: 2812630140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-13 08:12:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:12:21,724][71000] Updated weights for policy 0, policy_version 200434 (0.0033) [2024-06-13 08:12:24,745][71000] Updated weights for policy 0, policy_version 200444 (0.0035) [2024-06-13 08:12:25,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3284123648. Throughput: 0: 49375.1. Samples: 2812935760. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-13 08:12:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:12:28,378][71000] Updated weights for policy 0, policy_version 200454 (0.0023) [2024-06-13 08:12:30,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49698.1, 300 sec: 49040.9). Total num frames: 3284369408. Throughput: 0: 49516.3. Samples: 2813233700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-13 08:12:30,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:12:31,266][71000] Updated weights for policy 0, policy_version 200464 (0.0030) [2024-06-13 08:12:35,289][71000] Updated weights for policy 0, policy_version 200474 (0.0032) [2024-06-13 08:12:35,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 3284598784. Throughput: 0: 49113.3. Samples: 2813376000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-13 08:12:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:12:37,831][71000] Updated weights for policy 0, policy_version 200484 (0.0024) [2024-06-13 08:12:40,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3284844544. Throughput: 0: 49114.1. Samples: 2813673540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-13 08:12:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:12:41,767][71000] Updated weights for policy 0, policy_version 200494 (0.0025) [2024-06-13 08:12:44,525][71000] Updated weights for policy 0, policy_version 200504 (0.0029) [2024-06-13 08:12:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 3285106688. Throughput: 0: 49176.0. Samples: 2813968780. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-13 08:12:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:12:48,728][71000] Updated weights for policy 0, policy_version 200514 (0.0028) [2024-06-13 08:12:50,940][70768] Fps is (10 sec: 49152.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3285336064. Throughput: 0: 49297.9. Samples: 2814122620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-13 08:12:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:12:51,322][71000] Updated weights for policy 0, policy_version 200524 (0.0024) [2024-06-13 08:12:55,220][71000] Updated weights for policy 0, policy_version 200534 (0.0022) [2024-06-13 08:12:55,939][70768] Fps is (10 sec: 45876.0, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3285565440. Throughput: 0: 49167.6. Samples: 2814408140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-13 08:12:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:12:58,359][71000] Updated weights for policy 0, policy_version 200544 (0.0028) [2024-06-13 08:13:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 3285827584. Throughput: 0: 49185.7. Samples: 2814697600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-13 08:13:00,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 08:13:01,790][71000] Updated weights for policy 0, policy_version 200554 (0.0023) [2024-06-13 08:13:04,953][71000] Updated weights for policy 0, policy_version 200564 (0.0027) [2024-06-13 08:13:05,939][70768] Fps is (10 sec: 50790.3, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 3286073344. Throughput: 0: 49391.4. Samples: 2814852740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-13 08:13:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:13:08,451][71000] Updated weights for policy 0, policy_version 200574 (0.0039) [2024-06-13 08:13:09,229][70980] Signal inference workers to stop experience collection... 
(42150 times) [2024-06-13 08:13:09,231][70980] Signal inference workers to resume experience collection... (42150 times) [2024-06-13 08:13:09,237][71000] InferenceWorker_p0-w0: stopping experience collection (42150 times) [2024-06-13 08:13:09,258][71000] InferenceWorker_p0-w0: resuming experience collection (42150 times) [2024-06-13 08:13:10,943][70768] Fps is (10 sec: 49134.4, 60 sec: 49149.0, 300 sec: 48984.8). Total num frames: 3286319104. Throughput: 0: 49015.3. Samples: 2815141620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:13:10,944][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:13:11,838][71000] Updated weights for policy 0, policy_version 200584 (0.0030) [2024-06-13 08:13:15,127][71000] Updated weights for policy 0, policy_version 200594 (0.0033) [2024-06-13 08:13:15,939][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 3286548480. Throughput: 0: 48945.2. Samples: 2815436220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:13:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:13:18,378][71000] Updated weights for policy 0, policy_version 200604 (0.0032) [2024-06-13 08:13:20,939][70768] Fps is (10 sec: 49170.1, 60 sec: 49152.2, 300 sec: 49096.5). Total num frames: 3286810624. Throughput: 0: 49029.9. Samples: 2815582340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:13:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:13:21,657][71000] Updated weights for policy 0, policy_version 200614 (0.0022) [2024-06-13 08:13:24,918][71000] Updated weights for policy 0, policy_version 200624 (0.0032) [2024-06-13 08:13:25,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3287056384. Throughput: 0: 49178.4. Samples: 2815886560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:13:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:13:27,968][71000] Updated weights for policy 0, policy_version 200634 (0.0033) [2024-06-13 08:13:30,940][70768] Fps is (10 sec: 50789.2, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 3287318528. Throughput: 0: 49191.0. Samples: 2816182380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:13:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:13:31,779][71000] Updated weights for policy 0, policy_version 200644 (0.0038) [2024-06-13 08:13:34,783][71000] Updated weights for policy 0, policy_version 200654 (0.0031) [2024-06-13 08:13:35,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 3287564288. Throughput: 0: 49120.8. Samples: 2816333060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:13:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:13:38,304][71000] Updated weights for policy 0, policy_version 200664 (0.0036) [2024-06-13 08:13:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3287810048. Throughput: 0: 49409.0. Samples: 2816631560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:13:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:13:40,959][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000200672_3287810048.pth... [2024-06-13 08:13:41,008][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000199951_3275997184.pth [2024-06-13 08:13:41,695][71000] Updated weights for policy 0, policy_version 200674 (0.0031) [2024-06-13 08:13:44,871][71000] Updated weights for policy 0, policy_version 200684 (0.0035) [2024-06-13 08:13:45,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 3288072192. Throughput: 0: 49680.0. Samples: 2816933200. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:13:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:13:47,946][71000] Updated weights for policy 0, policy_version 200694 (0.0038) [2024-06-13 08:13:50,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3288301568. Throughput: 0: 49634.9. Samples: 2817086320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:13:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 08:13:51,619][71000] Updated weights for policy 0, policy_version 200704 (0.0024) [2024-06-13 08:13:54,557][71000] Updated weights for policy 0, policy_version 200714 (0.0026) [2024-06-13 08:13:55,940][70768] Fps is (10 sec: 50790.9, 60 sec: 50244.2, 300 sec: 49263.1). Total num frames: 3288580096. Throughput: 0: 49801.8. Samples: 2817382520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:13:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:13:58,334][71000] Updated weights for policy 0, policy_version 200724 (0.0030) [2024-06-13 08:14:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.1, 300 sec: 49207.5). Total num frames: 3288809472. Throughput: 0: 49745.6. Samples: 2817674780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:14:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:14:01,377][71000] Updated weights for policy 0, policy_version 200734 (0.0027) [2024-06-13 08:14:04,618][71000] Updated weights for policy 0, policy_version 200744 (0.0030) [2024-06-13 08:14:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 3289055232. Throughput: 0: 49836.4. Samples: 2817824980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:14:05,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 08:14:07,674][71000] Updated weights for policy 0, policy_version 200754 (0.0029) [2024-06-13 08:14:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49428.0, 300 sec: 49040.9). Total num frames: 3289284608. Throughput: 0: 49696.0. Samples: 2818122880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:14:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:14:11,238][71000] Updated weights for policy 0, policy_version 200764 (0.0027) [2024-06-13 08:14:14,255][71000] Updated weights for policy 0, policy_version 200774 (0.0030) [2024-06-13 08:14:15,940][70768] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 49318.6). Total num frames: 3289579520. Throughput: 0: 49902.0. Samples: 2818427960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:14:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:14:17,980][71000] Updated weights for policy 0, policy_version 200784 (0.0023) [2024-06-13 08:14:19,859][70980] Signal inference workers to stop experience collection... (42200 times) [2024-06-13 08:14:19,860][70980] Signal inference workers to resume experience collection... (42200 times) [2024-06-13 08:14:19,881][71000] InferenceWorker_p0-w0: stopping experience collection (42200 times) [2024-06-13 08:14:19,881][71000] InferenceWorker_p0-w0: resuming experience collection (42200 times) [2024-06-13 08:14:20,885][71000] Updated weights for policy 0, policy_version 200794 (0.0026) [2024-06-13 08:14:20,940][70768] Fps is (10 sec: 52429.1, 60 sec: 49971.2, 300 sec: 49263.1). Total num frames: 3289808896. Throughput: 0: 49968.5. Samples: 2818581640. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 08:14:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:14:24,603][71000] Updated weights for policy 0, policy_version 200804 (0.0030) [2024-06-13 08:14:25,940][70768] Fps is (10 sec: 44236.7, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3290021888. Throughput: 0: 49958.0. Samples: 2818879660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 08:14:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:14:27,441][71000] Updated weights for policy 0, policy_version 200814 (0.0027) [2024-06-13 08:14:30,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3290267648. Throughput: 0: 49807.1. Samples: 2819174520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 08:14:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:14:31,118][71000] Updated weights for policy 0, policy_version 200824 (0.0032) [2024-06-13 08:14:33,952][71000] Updated weights for policy 0, policy_version 200834 (0.0023) [2024-06-13 08:14:35,940][70768] Fps is (10 sec: 54066.4, 60 sec: 49971.1, 300 sec: 49374.1). Total num frames: 3290562560. Throughput: 0: 49621.7. Samples: 2819319300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 08:14:35,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 08:14:37,729][71000] Updated weights for policy 0, policy_version 200844 (0.0030) [2024-06-13 08:14:40,739][71000] Updated weights for policy 0, policy_version 200854 (0.0025) [2024-06-13 08:14:40,940][70768] Fps is (10 sec: 54066.9, 60 sec: 49971.3, 300 sec: 49318.6). Total num frames: 3290808320. Throughput: 0: 49742.0. Samples: 2819620920. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 08:14:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:14:44,977][71000] Updated weights for policy 0, policy_version 200864 (0.0035) [2024-06-13 08:14:45,940][70768] Fps is (10 sec: 45876.1, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 3291021312. Throughput: 0: 49642.8. Samples: 2819908700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 08:14:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:14:47,478][71000] Updated weights for policy 0, policy_version 200874 (0.0026) [2024-06-13 08:14:50,940][70768] Fps is (10 sec: 45875.6, 60 sec: 49425.1, 300 sec: 49207.5). Total num frames: 3291267072. Throughput: 0: 49230.6. Samples: 2820040360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 08:14:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:14:51,509][71000] Updated weights for policy 0, policy_version 200884 (0.0024) [2024-06-13 08:14:54,197][71000] Updated weights for policy 0, policy_version 200894 (0.0028) [2024-06-13 08:14:55,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49425.0, 300 sec: 49374.1). Total num frames: 3291545600. Throughput: 0: 49473.8. Samples: 2820349200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 08:14:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:14:57,926][71000] Updated weights for policy 0, policy_version 200904 (0.0030) [2024-06-13 08:15:00,637][71000] Updated weights for policy 0, policy_version 200914 (0.0025) [2024-06-13 08:15:00,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 3291791360. Throughput: 0: 49305.3. Samples: 2820646700. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 08:15:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:15:04,461][71000] Updated weights for policy 0, policy_version 200924 (0.0023) [2024-06-13 08:15:05,939][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 3292020736. Throughput: 0: 49239.6. Samples: 2820797420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 08:15:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:15:07,279][71000] Updated weights for policy 0, policy_version 200934 (0.0033) [2024-06-13 08:15:10,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 3292250112. Throughput: 0: 49076.5. Samples: 2821088100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 08:15:10,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 08:15:11,375][71000] Updated weights for policy 0, policy_version 200944 (0.0033) [2024-06-13 08:15:14,116][71000] Updated weights for policy 0, policy_version 200954 (0.0034) [2024-06-13 08:15:15,940][70768] Fps is (10 sec: 50789.7, 60 sec: 49152.0, 300 sec: 49374.2). Total num frames: 3292528640. Throughput: 0: 48986.3. Samples: 2821378900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 08:15:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:15:17,822][71000] Updated weights for policy 0, policy_version 200964 (0.0030) [2024-06-13 08:15:20,451][71000] Updated weights for policy 0, policy_version 200974 (0.0029) [2024-06-13 08:15:20,940][70768] Fps is (10 sec: 54066.7, 60 sec: 49698.0, 300 sec: 49318.6). Total num frames: 3292790784. Throughput: 0: 49413.8. Samples: 2821542920. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 08:15:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:15:24,354][71000] Updated weights for policy 0, policy_version 200984 (0.0039) [2024-06-13 08:15:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49698.2, 300 sec: 49263.1). Total num frames: 3293003776. Throughput: 0: 49377.1. Samples: 2821842880. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 08:15:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:15:27,311][71000] Updated weights for policy 0, policy_version 200994 (0.0031) [2024-06-13 08:15:30,940][70768] Fps is (10 sec: 44237.3, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 3293233152. Throughput: 0: 49196.4. Samples: 2822122540. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 08:15:30,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 08:15:31,413][71000] Updated weights for policy 0, policy_version 201004 (0.0034) [2024-06-13 08:15:33,706][70980] Signal inference workers to stop experience collection... (42250 times) [2024-06-13 08:15:33,706][70980] Signal inference workers to resume experience collection... (42250 times) [2024-06-13 08:15:33,723][71000] InferenceWorker_p0-w0: stopping experience collection (42250 times) [2024-06-13 08:15:33,723][71000] InferenceWorker_p0-w0: resuming experience collection (42250 times) [2024-06-13 08:15:34,001][71000] Updated weights for policy 0, policy_version 201014 (0.0037) [2024-06-13 08:15:35,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 3293511680. Throughput: 0: 49549.7. Samples: 2822270100. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 08:15:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:15:37,983][71000] Updated weights for policy 0, policy_version 201024 (0.0028) [2024-06-13 08:15:40,687][71000] Updated weights for policy 0, policy_version 201034 (0.0029) [2024-06-13 08:15:40,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 3293757440. Throughput: 0: 49286.3. Samples: 2822567080. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 08:15:40,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 08:15:40,963][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000201036_3293773824.pth... [2024-06-13 08:15:41,007][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000200311_3281895424.pth [2024-06-13 08:15:44,350][71000] Updated weights for policy 0, policy_version 201044 (0.0033) [2024-06-13 08:15:45,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49152.0, 300 sec: 49207.6). Total num frames: 3293970432. Throughput: 0: 49177.4. Samples: 2822859680. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 08:15:45,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:15:47,229][71000] Updated weights for policy 0, policy_version 201054 (0.0028) [2024-06-13 08:15:50,939][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.1, 300 sec: 49318.7). Total num frames: 3294216192. Throughput: 0: 48763.1. Samples: 2822991760. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 08:15:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:15:51,395][71000] Updated weights for policy 0, policy_version 201064 (0.0031) [2024-06-13 08:15:54,195][71000] Updated weights for policy 0, policy_version 201074 (0.0032) [2024-06-13 08:15:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.0, 300 sec: 49374.1). Total num frames: 3294478336. Throughput: 0: 48811.2. Samples: 2823284600. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 08:15:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:15:57,833][71000] Updated weights for policy 0, policy_version 201084 (0.0032) [2024-06-13 08:16:00,874][71000] Updated weights for policy 0, policy_version 201094 (0.0033) [2024-06-13 08:16:00,939][70768] Fps is (10 sec: 50790.1, 60 sec: 48879.0, 300 sec: 49263.1). Total num frames: 3294724096. Throughput: 0: 48997.0. Samples: 2823583760. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 08:16:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:16:04,335][71000] Updated weights for policy 0, policy_version 201104 (0.0028) [2024-06-13 08:16:05,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 3294953472. Throughput: 0: 48750.0. Samples: 2823736660. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 08:16:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 08:16:07,150][71000] Updated weights for policy 0, policy_version 201114 (0.0027) [2024-06-13 08:16:10,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.1, 300 sec: 49318.6). Total num frames: 3295199232. Throughput: 0: 48914.3. Samples: 2824044020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 08:16:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:16:11,065][71000] Updated weights for policy 0, policy_version 201124 (0.0035) [2024-06-13 08:16:13,880][71000] Updated weights for policy 0, policy_version 201134 (0.0022) [2024-06-13 08:16:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48879.0, 300 sec: 49374.2). Total num frames: 3295461376. Throughput: 0: 48989.8. Samples: 2824327080. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 08:16:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:16:17,587][71000] Updated weights for policy 0, policy_version 201144 (0.0024) [2024-06-13 08:16:20,719][71000] Updated weights for policy 0, policy_version 201154 (0.0035) [2024-06-13 08:16:20,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48605.9, 300 sec: 49263.1). Total num frames: 3295707136. Throughput: 0: 49076.0. Samples: 2824478520. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 08:16:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:16:24,468][71000] Updated weights for policy 0, policy_version 201164 (0.0044) [2024-06-13 08:16:25,939][70768] Fps is (10 sec: 45875.8, 60 sec: 48606.0, 300 sec: 49263.1). Total num frames: 3295920128. Throughput: 0: 48989.5. Samples: 2824771600. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 08:16:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:16:27,563][71000] Updated weights for policy 0, policy_version 201174 (0.0033) [2024-06-13 08:16:30,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 3296182272. Throughput: 0: 48928.8. Samples: 2825061480. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 08:16:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:16:31,052][71000] Updated weights for policy 0, policy_version 201184 (0.0036) [2024-06-13 08:16:34,195][71000] Updated weights for policy 0, policy_version 201194 (0.0023) [2024-06-13 08:16:35,940][70768] Fps is (10 sec: 52427.8, 60 sec: 48879.0, 300 sec: 49318.6). Total num frames: 3296444416. Throughput: 0: 49365.1. Samples: 2825213200. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:16:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:16:37,776][71000] Updated weights for policy 0, policy_version 201204 (0.0028) [2024-06-13 08:16:40,768][71000] Updated weights for policy 0, policy_version 201214 (0.0029) [2024-06-13 08:16:40,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.8, 300 sec: 49263.1). Total num frames: 3296690176. Throughput: 0: 49402.9. Samples: 2825507740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:16:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:16:44,629][71000] Updated weights for policy 0, policy_version 201224 (0.0029) [2024-06-13 08:16:45,939][70768] Fps is (10 sec: 45876.1, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 3296903168. Throughput: 0: 49094.7. Samples: 2825793020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:16:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:16:47,919][71000] Updated weights for policy 0, policy_version 201234 (0.0028) [2024-06-13 08:16:50,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.7, 300 sec: 49263.1). Total num frames: 3297148928. Throughput: 0: 48854.5. Samples: 2825935120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:16:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:16:51,377][71000] Updated weights for policy 0, policy_version 201244 (0.0027) [2024-06-13 08:16:54,362][71000] Updated weights for policy 0, policy_version 201254 (0.0025) [2024-06-13 08:16:55,939][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.9, 300 sec: 49318.6). Total num frames: 3297411072. Throughput: 0: 48493.8. Samples: 2826226240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:16:55,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 08:16:58,095][71000] Updated weights for policy 0, policy_version 201264 (0.0026) [2024-06-13 08:17:00,829][71000] Updated weights for policy 0, policy_version 201274 (0.0028) [2024-06-13 08:17:00,940][70768] Fps is (10 sec: 52429.7, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 3297673216. Throughput: 0: 48920.5. Samples: 2826528500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:17:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:17:04,672][71000] Updated weights for policy 0, policy_version 201284 (0.0026) [2024-06-13 08:17:05,483][70980] Signal inference workers to stop experience collection... (42300 times) [2024-06-13 08:17:05,526][71000] InferenceWorker_p0-w0: stopping experience collection (42300 times) [2024-06-13 08:17:05,594][70980] Signal inference workers to resume experience collection... (42300 times) [2024-06-13 08:17:05,594][71000] InferenceWorker_p0-w0: resuming experience collection (42300 times) [2024-06-13 08:17:05,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 3297886208. Throughput: 0: 48713.8. Samples: 2826670640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:17:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:17:07,782][71000] Updated weights for policy 0, policy_version 201294 (0.0021) [2024-06-13 08:17:10,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48878.9, 300 sec: 49263.1). Total num frames: 3298131968. Throughput: 0: 48687.4. Samples: 2826962540. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:17:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:17:11,334][71000] Updated weights for policy 0, policy_version 201304 (0.0028) [2024-06-13 08:17:14,269][71000] Updated weights for policy 0, policy_version 201314 (0.0025) [2024-06-13 08:17:15,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 3298410496. Throughput: 0: 48929.4. Samples: 2827263300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:17:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:17:17,933][71000] Updated weights for policy 0, policy_version 201324 (0.0039) [2024-06-13 08:17:20,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49152.1, 300 sec: 49263.1). Total num frames: 3298656256. Throughput: 0: 48914.3. Samples: 2827414340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:17:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:17:20,942][71000] Updated weights for policy 0, policy_version 201334 (0.0026) [2024-06-13 08:17:24,512][71000] Updated weights for policy 0, policy_version 201344 (0.0028) [2024-06-13 08:17:25,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.0, 300 sec: 49207.6). Total num frames: 3298885632. Throughput: 0: 48949.0. Samples: 2827710440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:17:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:17:27,856][71000] Updated weights for policy 0, policy_version 201354 (0.0037) [2024-06-13 08:17:30,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 3299115008. Throughput: 0: 49083.4. Samples: 2828001780. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:17:30,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 08:17:31,431][71000] Updated weights for policy 0, policy_version 201364 (0.0030) [2024-06-13 08:17:34,639][71000] Updated weights for policy 0, policy_version 201374 (0.0034) [2024-06-13 08:17:35,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.1, 300 sec: 49318.7). Total num frames: 3299393536. Throughput: 0: 49135.3. Samples: 2828146200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:17:35,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 08:17:37,965][71000] Updated weights for policy 0, policy_version 201384 (0.0045) [2024-06-13 08:17:40,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.0, 300 sec: 49207.6). Total num frames: 3299622912. Throughput: 0: 49164.8. Samples: 2828438660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:17:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:17:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000201393_3299622912.pth... [2024-06-13 08:17:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000200672_3287810048.pth [2024-06-13 08:17:41,380][71000] Updated weights for policy 0, policy_version 201394 (0.0028) [2024-06-13 08:17:44,694][71000] Updated weights for policy 0, policy_version 201404 (0.0023) [2024-06-13 08:17:45,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 3299852288. Throughput: 0: 48904.4. Samples: 2828729200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:17:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:17:48,019][71000] Updated weights for policy 0, policy_version 201414 (0.0034) [2024-06-13 08:17:50,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49152.0, 300 sec: 49263.0). Total num frames: 3300098048. Throughput: 0: 48974.2. Samples: 2828874480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:17:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:17:51,467][71000] Updated weights for policy 0, policy_version 201424 (0.0025) [2024-06-13 08:17:54,636][71000] Updated weights for policy 0, policy_version 201434 (0.0034) [2024-06-13 08:17:55,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 3300343808. Throughput: 0: 49027.1. Samples: 2829168760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:17:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:17:57,889][71000] Updated weights for policy 0, policy_version 201444 (0.0024) [2024-06-13 08:18:00,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48605.8, 300 sec: 49207.5). Total num frames: 3300589568. Throughput: 0: 48912.1. Samples: 2829464340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:18:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:18:01,341][71000] Updated weights for policy 0, policy_version 201454 (0.0037) [2024-06-13 08:18:04,635][71000] Updated weights for policy 0, policy_version 201464 (0.0024) [2024-06-13 08:18:05,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49698.2, 300 sec: 49319.2). Total num frames: 3300868096. Throughput: 0: 49011.1. Samples: 2829619840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:18:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:18:06,551][70980] Signal inference workers to stop experience collection... (42350 times) [2024-06-13 08:18:06,593][71000] InferenceWorker_p0-w0: stopping experience collection (42350 times) [2024-06-13 08:18:06,608][70980] Signal inference workers to resume experience collection... 
(42350 times) [2024-06-13 08:18:06,611][71000] InferenceWorker_p0-w0: resuming experience collection (42350 times) [2024-06-13 08:18:08,047][71000] Updated weights for policy 0, policy_version 201474 (0.0036) [2024-06-13 08:18:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 49263.1). Total num frames: 3301081088. Throughput: 0: 49068.5. Samples: 2829918520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:18:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:18:11,438][71000] Updated weights for policy 0, policy_version 201484 (0.0030) [2024-06-13 08:18:14,826][71000] Updated weights for policy 0, policy_version 201494 (0.0029) [2024-06-13 08:18:15,939][70768] Fps is (10 sec: 45875.5, 60 sec: 48605.9, 300 sec: 49207.5). Total num frames: 3301326848. Throughput: 0: 48865.0. Samples: 2830200700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:18:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:18:17,897][71000] Updated weights for policy 0, policy_version 201504 (0.0039) [2024-06-13 08:18:20,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48605.8, 300 sec: 49207.5). Total num frames: 3301572608. Throughput: 0: 48943.3. Samples: 2830348660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:18:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:18:21,620][71000] Updated weights for policy 0, policy_version 201514 (0.0022) [2024-06-13 08:18:24,604][71000] Updated weights for policy 0, policy_version 201524 (0.0027) [2024-06-13 08:18:25,939][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 3301834752. Throughput: 0: 48978.8. Samples: 2830642700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:18:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:18:28,075][71000] Updated weights for policy 0, policy_version 201534 (0.0029) [2024-06-13 08:18:30,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3302064128. Throughput: 0: 49191.4. Samples: 2830942820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:18:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:18:31,235][71000] Updated weights for policy 0, policy_version 201544 (0.0024) [2024-06-13 08:18:34,586][71000] Updated weights for policy 0, policy_version 201554 (0.0033) [2024-06-13 08:18:35,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48605.8, 300 sec: 49152.0). Total num frames: 3302309888. Throughput: 0: 49242.3. Samples: 2831090380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:18:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:18:37,784][71000] Updated weights for policy 0, policy_version 201564 (0.0028) [2024-06-13 08:18:40,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3302572032. Throughput: 0: 49330.2. Samples: 2831388620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:18:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 08:18:41,416][71000] Updated weights for policy 0, policy_version 201574 (0.0029) [2024-06-13 08:18:44,338][71000] Updated weights for policy 0, policy_version 201584 (0.0031) [2024-06-13 08:18:45,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49698.0, 300 sec: 49263.1). Total num frames: 3302834176. Throughput: 0: 49573.2. Samples: 2831695140. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-13 08:18:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:18:47,974][71000] Updated weights for policy 0, policy_version 201594 (0.0023) [2024-06-13 08:18:50,931][71000] Updated weights for policy 0, policy_version 201604 (0.0033) [2024-06-13 08:18:50,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.2, 300 sec: 49152.0). Total num frames: 3303079936. Throughput: 0: 49412.9. Samples: 2831843420. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-13 08:18:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:18:54,834][71000] Updated weights for policy 0, policy_version 201614 (0.0025) [2024-06-13 08:18:55,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3303309312. Throughput: 0: 49247.0. Samples: 2832134640. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-13 08:18:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:18:57,743][71000] Updated weights for policy 0, policy_version 201624 (0.0030) [2024-06-13 08:19:00,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3303538688. Throughput: 0: 49419.9. Samples: 2832424600. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-13 08:19:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:19:01,269][71000] Updated weights for policy 0, policy_version 201634 (0.0029) [2024-06-13 08:19:04,360][71000] Updated weights for policy 0, policy_version 201644 (0.0031) [2024-06-13 08:19:05,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.8, 300 sec: 49207.5). Total num frames: 3303800832. Throughput: 0: 49368.0. Samples: 2832570220. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-13 08:19:05,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:19:08,182][71000] Updated weights for policy 0, policy_version 201654 (0.0020) [2024-06-13 08:19:10,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3304046592. Throughput: 0: 49524.4. Samples: 2832871300. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-13 08:19:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:19:10,973][71000] Updated weights for policy 0, policy_version 201664 (0.0025) [2024-06-13 08:19:11,667][70980] Signal inference workers to stop experience collection... (42400 times) [2024-06-13 08:19:11,695][71000] InferenceWorker_p0-w0: stopping experience collection (42400 times) [2024-06-13 08:19:11,714][70980] Signal inference workers to resume experience collection... (42400 times) [2024-06-13 08:19:11,715][71000] InferenceWorker_p0-w0: resuming experience collection (42400 times) [2024-06-13 08:19:14,881][71000] Updated weights for policy 0, policy_version 201674 (0.0028) [2024-06-13 08:19:15,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3304259584. Throughput: 0: 49296.1. Samples: 2833161140. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-13 08:19:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:19:17,976][71000] Updated weights for policy 0, policy_version 201684 (0.0027) [2024-06-13 08:19:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3304521728. Throughput: 0: 49269.3. Samples: 2833307500. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-13 08:19:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:19:21,379][71000] Updated weights for policy 0, policy_version 201694 (0.0026) [2024-06-13 08:19:24,576][71000] Updated weights for policy 0, policy_version 201704 (0.0034) [2024-06-13 08:19:25,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 3304800256. Throughput: 0: 49156.9. Samples: 2833600680. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-13 08:19:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:19:28,086][71000] Updated weights for policy 0, policy_version 201714 (0.0025) [2024-06-13 08:19:30,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3305013248. Throughput: 0: 48890.3. Samples: 2833895200. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-13 08:19:30,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:19:31,340][71000] Updated weights for policy 0, policy_version 201724 (0.0023) [2024-06-13 08:19:34,917][71000] Updated weights for policy 0, policy_version 201734 (0.0029) [2024-06-13 08:19:35,939][70768] Fps is (10 sec: 45875.5, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3305259008. Throughput: 0: 48834.7. Samples: 2834040980. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-13 08:19:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:19:37,977][71000] Updated weights for policy 0, policy_version 201744 (0.0029) [2024-06-13 08:19:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 3305504768. Throughput: 0: 48714.3. Samples: 2834326780. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-13 08:19:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:19:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000201752_3305504768.pth... 
[2024-06-13 08:19:41,007][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000201036_3293773824.pth [2024-06-13 08:19:41,492][71000] Updated weights for policy 0, policy_version 201754 (0.0025) [2024-06-13 08:19:44,714][71000] Updated weights for policy 0, policy_version 201764 (0.0023) [2024-06-13 08:19:45,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 3305783296. Throughput: 0: 48882.9. Samples: 2834624340. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-13 08:19:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:19:48,216][71000] Updated weights for policy 0, policy_version 201774 (0.0030) [2024-06-13 08:19:50,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3305996288. Throughput: 0: 48909.9. Samples: 2834771160. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-13 08:19:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:19:51,347][71000] Updated weights for policy 0, policy_version 201784 (0.0026) [2024-06-13 08:19:54,961][71000] Updated weights for policy 0, policy_version 201794 (0.0021) [2024-06-13 08:19:55,940][70768] Fps is (10 sec: 44237.5, 60 sec: 48606.0, 300 sec: 48929.8). Total num frames: 3306225664. Throughput: 0: 48679.5. Samples: 2835061880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 08:19:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:19:58,258][71000] Updated weights for policy 0, policy_version 201804 (0.0022) [2024-06-13 08:20:00,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3306487808. Throughput: 0: 48654.5. Samples: 2835350600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 08:20:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:20:01,631][71000] Updated weights for policy 0, policy_version 201814 (0.0031) [2024-06-13 08:20:04,783][71000] Updated weights for policy 0, policy_version 201824 (0.0030) [2024-06-13 08:20:05,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49425.2, 300 sec: 49207.5). Total num frames: 3306766336. Throughput: 0: 48847.6. Samples: 2835505640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 08:20:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:20:08,003][71000] Updated weights for policy 0, policy_version 201834 (0.0030) [2024-06-13 08:20:10,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 3306962944. Throughput: 0: 48974.1. Samples: 2835804520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 08:20:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:20:11,351][71000] Updated weights for policy 0, policy_version 201844 (0.0031) [2024-06-13 08:20:14,024][70980] Signal inference workers to stop experience collection... (42450 times) [2024-06-13 08:20:14,025][70980] Signal inference workers to resume experience collection... (42450 times) [2024-06-13 08:20:14,037][71000] InferenceWorker_p0-w0: stopping experience collection (42450 times) [2024-06-13 08:20:14,037][71000] InferenceWorker_p0-w0: resuming experience collection (42450 times) [2024-06-13 08:20:14,763][71000] Updated weights for policy 0, policy_version 201854 (0.0026) [2024-06-13 08:20:15,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49424.9, 300 sec: 48929.8). Total num frames: 3307225088. Throughput: 0: 49000.3. Samples: 2836100220. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 08:20:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:20:18,117][71000] Updated weights for policy 0, policy_version 201864 (0.0027) [2024-06-13 08:20:20,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3307470848. Throughput: 0: 48939.5. Samples: 2836243260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 08:20:20,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 08:20:21,194][71000] Updated weights for policy 0, policy_version 201874 (0.0032) [2024-06-13 08:20:24,913][71000] Updated weights for policy 0, policy_version 201884 (0.0031) [2024-06-13 08:20:25,940][70768] Fps is (10 sec: 52429.2, 60 sec: 49151.9, 300 sec: 49207.5). Total num frames: 3307749376. Throughput: 0: 49059.5. Samples: 2836534460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 08:20:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:20:28,210][71000] Updated weights for policy 0, policy_version 201894 (0.0032) [2024-06-13 08:20:30,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 3307945984. Throughput: 0: 49094.3. Samples: 2836833580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 08:20:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:20:31,553][71000] Updated weights for policy 0, policy_version 201904 (0.0030) [2024-06-13 08:20:34,774][71000] Updated weights for policy 0, policy_version 201914 (0.0030) [2024-06-13 08:20:35,940][70768] Fps is (10 sec: 44236.9, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3308191744. Throughput: 0: 48965.7. Samples: 2836974620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 08:20:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:20:37,992][71000] Updated weights for policy 0, policy_version 201924 (0.0028) [2024-06-13 08:20:40,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3308453888. Throughput: 0: 49071.1. Samples: 2837270080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 08:20:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:20:41,186][71000] Updated weights for policy 0, policy_version 201934 (0.0026) [2024-06-13 08:20:44,748][71000] Updated weights for policy 0, policy_version 201944 (0.0039) [2024-06-13 08:20:45,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 3308716032. Throughput: 0: 49102.7. Samples: 2837560220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 08:20:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:20:48,427][71000] Updated weights for policy 0, policy_version 201954 (0.0035) [2024-06-13 08:20:50,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3308929024. Throughput: 0: 48952.4. Samples: 2837708500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 08:20:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:20:51,623][71000] Updated weights for policy 0, policy_version 201964 (0.0030) [2024-06-13 08:20:54,983][71000] Updated weights for policy 0, policy_version 201974 (0.0033) [2024-06-13 08:20:55,940][70768] Fps is (10 sec: 44237.3, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3309158400. Throughput: 0: 48769.0. Samples: 2837999120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-13 08:20:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:20:58,142][71000] Updated weights for policy 0, policy_version 201984 (0.0030) [2024-06-13 08:21:00,939][70768] Fps is (10 sec: 49152.7, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 3309420544. Throughput: 0: 48787.4. Samples: 2838295640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 08:21:00,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 08:21:01,870][71000] Updated weights for policy 0, policy_version 201994 (0.0030) [2024-06-13 08:21:04,805][71000] Updated weights for policy 0, policy_version 202004 (0.0028) [2024-06-13 08:21:05,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48332.8, 300 sec: 49040.9). Total num frames: 3309666304. Throughput: 0: 48906.2. Samples: 2838444040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 08:21:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:21:08,318][71000] Updated weights for policy 0, policy_version 202014 (0.0035) [2024-06-13 08:21:10,939][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3309912064. Throughput: 0: 49021.9. Samples: 2838740440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 08:21:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:21:11,541][71000] Updated weights for policy 0, policy_version 202024 (0.0031) [2024-06-13 08:21:15,271][71000] Updated weights for policy 0, policy_version 202034 (0.0035) [2024-06-13 08:21:15,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 3310141440. Throughput: 0: 48990.8. Samples: 2839038160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 08:21:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:21:17,713][70980] Signal inference workers to stop experience collection... 
(42500 times) [2024-06-13 08:21:17,738][71000] InferenceWorker_p0-w0: stopping experience collection (42500 times) [2024-06-13 08:21:17,775][70980] Signal inference workers to resume experience collection... (42500 times) [2024-06-13 08:21:17,775][71000] InferenceWorker_p0-w0: resuming experience collection (42500 times) [2024-06-13 08:21:17,923][71000] Updated weights for policy 0, policy_version 202044 (0.0021) [2024-06-13 08:21:20,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3310419968. Throughput: 0: 49086.3. Samples: 2839183500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 08:21:20,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:21:21,681][71000] Updated weights for policy 0, policy_version 202054 (0.0033) [2024-06-13 08:21:24,505][71000] Updated weights for policy 0, policy_version 202064 (0.0027) [2024-06-13 08:21:25,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 3310665728. Throughput: 0: 49204.4. Samples: 2839484280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 08:21:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:21:28,151][71000] Updated weights for policy 0, policy_version 202074 (0.0029) [2024-06-13 08:21:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3310911488. Throughput: 0: 49332.5. Samples: 2839780180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 08:21:30,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:21:31,294][71000] Updated weights for policy 0, policy_version 202084 (0.0024) [2024-06-13 08:21:35,178][71000] Updated weights for policy 0, policy_version 202094 (0.0028) [2024-06-13 08:21:35,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3311140864. Throughput: 0: 49189.9. Samples: 2839922040. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 08:21:35,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:21:37,951][71000] Updated weights for policy 0, policy_version 202104 (0.0037) [2024-06-13 08:21:40,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 3311419392. Throughput: 0: 49571.9. Samples: 2840229860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 08:21:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:21:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000202113_3311419392.pth... [2024-06-13 08:21:40,990][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000201393_3299622912.pth [2024-06-13 08:21:41,718][71000] Updated weights for policy 0, policy_version 202114 (0.0021) [2024-06-13 08:21:44,627][71000] Updated weights for policy 0, policy_version 202124 (0.0025) [2024-06-13 08:21:45,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 3311632384. Throughput: 0: 49327.5. Samples: 2840515380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 08:21:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 08:21:48,363][71000] Updated weights for policy 0, policy_version 202134 (0.0021) [2024-06-13 08:21:50,940][70768] Fps is (10 sec: 47513.7, 60 sec: 49425.1, 300 sec: 49096.4). Total num frames: 3311894528. Throughput: 0: 49346.6. Samples: 2840664640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 08:21:50,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 08:21:51,632][71000] Updated weights for policy 0, policy_version 202144 (0.0018) [2024-06-13 08:21:55,482][71000] Updated weights for policy 0, policy_version 202154 (0.0022) [2024-06-13 08:21:55,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 3312123904. Throughput: 0: 49210.0. Samples: 2840954900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 08:21:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:21:58,111][71000] Updated weights for policy 0, policy_version 202164 (0.0036) [2024-06-13 08:22:00,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3312386048. Throughput: 0: 49086.6. Samples: 2841247060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 08:22:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:22:02,031][71000] Updated weights for policy 0, policy_version 202174 (0.0034) [2024-06-13 08:22:04,901][71000] Updated weights for policy 0, policy_version 202184 (0.0026) [2024-06-13 08:22:05,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.0, 300 sec: 49152.0). Total num frames: 3312631808. Throughput: 0: 49171.9. Samples: 2841396240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-13 08:22:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:22:08,908][71000] Updated weights for policy 0, policy_version 202194 (0.0035) [2024-06-13 08:22:10,939][70768] Fps is (10 sec: 50790.6, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 3312893952. Throughput: 0: 49008.5. Samples: 2841689660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 08:22:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:22:11,724][71000] Updated weights for policy 0, policy_version 202204 (0.0033) [2024-06-13 08:22:15,374][71000] Updated weights for policy 0, policy_version 202214 (0.0036) [2024-06-13 08:22:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49424.9, 300 sec: 48985.4). Total num frames: 3313106944. Throughput: 0: 48980.9. Samples: 2841984320. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 08:22:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:22:18,572][71000] Updated weights for policy 0, policy_version 202224 (0.0024) [2024-06-13 08:22:20,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3313336320. Throughput: 0: 49160.4. Samples: 2842134260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 08:22:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:22:22,496][71000] Updated weights for policy 0, policy_version 202234 (0.0030) [2024-06-13 08:22:25,215][71000] Updated weights for policy 0, policy_version 202244 (0.0023) [2024-06-13 08:22:25,449][70980] Signal inference workers to stop experience collection... (42550 times) [2024-06-13 08:22:25,449][70980] Signal inference workers to resume experience collection... (42550 times) [2024-06-13 08:22:25,499][71000] InferenceWorker_p0-w0: stopping experience collection (42550 times) [2024-06-13 08:22:25,499][71000] InferenceWorker_p0-w0: resuming experience collection (42550 times) [2024-06-13 08:22:25,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49151.8, 300 sec: 49152.0). Total num frames: 3313614848. Throughput: 0: 48565.2. Samples: 2842415300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 08:22:25,941][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:22:28,914][71000] Updated weights for policy 0, policy_version 202254 (0.0025) [2024-06-13 08:22:30,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3313860608. Throughput: 0: 48644.4. Samples: 2842704380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 08:22:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:22:31,890][71000] Updated weights for policy 0, policy_version 202264 (0.0029) [2024-06-13 08:22:35,247][71000] Updated weights for policy 0, policy_version 202274 (0.0030) [2024-06-13 08:22:35,940][70768] Fps is (10 sec: 45876.2, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3314073600. Throughput: 0: 48713.4. Samples: 2842856740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 08:22:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:22:38,641][71000] Updated weights for policy 0, policy_version 202284 (0.0034) [2024-06-13 08:22:40,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48332.8, 300 sec: 49040.9). Total num frames: 3314319360. Throughput: 0: 48836.5. Samples: 2843152540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 08:22:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:22:42,140][71000] Updated weights for policy 0, policy_version 202294 (0.0030) [2024-06-13 08:22:44,958][71000] Updated weights for policy 0, policy_version 202304 (0.0022) [2024-06-13 08:22:45,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3314581504. Throughput: 0: 48967.7. Samples: 2843450600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 08:22:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:22:48,921][71000] Updated weights for policy 0, policy_version 202314 (0.0029) [2024-06-13 08:22:50,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3314843648. Throughput: 0: 49027.9. Samples: 2843602500. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 08:22:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:22:51,365][71000] Updated weights for policy 0, policy_version 202324 (0.0025) [2024-06-13 08:22:55,342][71000] Updated weights for policy 0, policy_version 202334 (0.0040) [2024-06-13 08:22:55,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 3315073024. Throughput: 0: 49057.6. Samples: 2843897260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 08:22:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:22:58,333][71000] Updated weights for policy 0, policy_version 202344 (0.0029) [2024-06-13 08:23:00,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48605.7, 300 sec: 48929.8). Total num frames: 3315302400. Throughput: 0: 48831.4. Samples: 2844181740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 08:23:00,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:23:01,988][71000] Updated weights for policy 0, policy_version 202354 (0.0026) [2024-06-13 08:23:04,874][71000] Updated weights for policy 0, policy_version 202364 (0.0038) [2024-06-13 08:23:05,940][70768] Fps is (10 sec: 47513.9, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 3315548160. Throughput: 0: 48810.2. Samples: 2844330720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 08:23:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:23:08,956][71000] Updated weights for policy 0, policy_version 202374 (0.0025) [2024-06-13 08:23:10,939][70768] Fps is (10 sec: 52430.4, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 3315826688. Throughput: 0: 49062.1. Samples: 2844623080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-13 08:23:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:23:11,559][71000] Updated weights for policy 0, policy_version 202384 (0.0028) [2024-06-13 08:23:15,442][71000] Updated weights for policy 0, policy_version 202394 (0.0027) [2024-06-13 08:23:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3316039680. Throughput: 0: 49231.5. Samples: 2844919800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-13 08:23:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:23:18,398][71000] Updated weights for policy 0, policy_version 202404 (0.0027) [2024-06-13 08:23:20,940][70768] Fps is (10 sec: 44235.8, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3316269056. Throughput: 0: 48929.1. Samples: 2845058560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-13 08:23:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:23:21,533][70980] Signal inference workers to stop experience collection... (42600 times) [2024-06-13 08:23:21,568][71000] InferenceWorker_p0-w0: stopping experience collection (42600 times) [2024-06-13 08:23:21,637][70980] Signal inference workers to resume experience collection... (42600 times) [2024-06-13 08:23:21,637][71000] InferenceWorker_p0-w0: resuming experience collection (42600 times) [2024-06-13 08:23:22,113][71000] Updated weights for policy 0, policy_version 202414 (0.0027) [2024-06-13 08:23:24,933][71000] Updated weights for policy 0, policy_version 202424 (0.0030) [2024-06-13 08:23:25,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48606.0, 300 sec: 49040.9). Total num frames: 3316531200. Throughput: 0: 48824.0. Samples: 2845349620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-13 08:23:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:23:28,858][71000] Updated weights for policy 0, policy_version 202434 (0.0024) [2024-06-13 08:23:30,940][70768] Fps is (10 sec: 52429.5, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3316793344. Throughput: 0: 48731.4. Samples: 2845643520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-13 08:23:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:23:32,012][71000] Updated weights for policy 0, policy_version 202444 (0.0026) [2024-06-13 08:23:35,625][71000] Updated weights for policy 0, policy_version 202454 (0.0026) [2024-06-13 08:23:35,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 3317022720. Throughput: 0: 48662.8. Samples: 2845792320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-13 08:23:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:23:38,708][71000] Updated weights for policy 0, policy_version 202464 (0.0027) [2024-06-13 08:23:40,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3317252096. Throughput: 0: 48555.6. Samples: 2846082260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-13 08:23:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:23:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000202469_3317252096.pth... [2024-06-13 08:23:40,997][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000201752_3305504768.pth [2024-06-13 08:23:42,502][71000] Updated weights for policy 0, policy_version 202474 (0.0037) [2024-06-13 08:23:45,284][71000] Updated weights for policy 0, policy_version 202484 (0.0029) [2024-06-13 08:23:45,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48605.7, 300 sec: 48874.3). Total num frames: 3317497856. Throughput: 0: 48587.6. Samples: 2846368180. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-13 08:23:45,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:23:48,978][71000] Updated weights for policy 0, policy_version 202494 (0.0034) [2024-06-13 08:23:50,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3317760000. Throughput: 0: 48783.9. Samples: 2846526000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-13 08:23:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:23:52,191][71000] Updated weights for policy 0, policy_version 202504 (0.0029) [2024-06-13 08:23:55,858][71000] Updated weights for policy 0, policy_version 202514 (0.0025) [2024-06-13 08:23:55,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3317989376. Throughput: 0: 48779.5. Samples: 2846818160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-13 08:23:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:23:59,283][71000] Updated weights for policy 0, policy_version 202524 (0.0036) [2024-06-13 08:24:00,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 3318218752. Throughput: 0: 48623.1. Samples: 2847107840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-13 08:24:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:24:02,867][71000] Updated weights for policy 0, policy_version 202534 (0.0028) [2024-06-13 08:24:05,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3318464512. Throughput: 0: 48634.8. Samples: 2847247120. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-13 08:24:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 08:24:06,027][71000] Updated weights for policy 0, policy_version 202544 (0.0032) [2024-06-13 08:24:09,511][71000] Updated weights for policy 0, policy_version 202554 (0.0029) [2024-06-13 08:24:10,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48332.7, 300 sec: 49040.9). Total num frames: 3318726656. Throughput: 0: 48450.2. Samples: 2847529880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-13 08:24:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:24:12,852][71000] Updated weights for policy 0, policy_version 202564 (0.0030) [2024-06-13 08:24:15,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 3318939648. Throughput: 0: 48530.3. Samples: 2847827380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-13 08:24:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:24:16,243][71000] Updated weights for policy 0, policy_version 202574 (0.0028) [2024-06-13 08:24:19,473][71000] Updated weights for policy 0, policy_version 202584 (0.0029) [2024-06-13 08:24:20,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48606.0, 300 sec: 48763.2). Total num frames: 3319185408. Throughput: 0: 48482.7. Samples: 2847974040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-13 08:24:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:24:22,932][71000] Updated weights for policy 0, policy_version 202594 (0.0038) [2024-06-13 08:24:25,887][71000] Updated weights for policy 0, policy_version 202604 (0.0029) [2024-06-13 08:24:25,940][70768] Fps is (10 sec: 52427.8, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3319463936. Throughput: 0: 48470.6. Samples: 2848263440. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 08:24:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:24:29,437][71000] Updated weights for policy 0, policy_version 202614 (0.0029) [2024-06-13 08:24:30,940][70768] Fps is (10 sec: 52428.2, 60 sec: 48605.8, 300 sec: 48985.3). Total num frames: 3319709696. Throughput: 0: 48694.2. Samples: 2848559420. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 08:24:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:24:33,130][71000] Updated weights for policy 0, policy_version 202624 (0.0036) [2024-06-13 08:24:35,940][70768] Fps is (10 sec: 45875.9, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 3319922688. Throughput: 0: 48423.3. Samples: 2848705040. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 08:24:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:24:36,358][71000] Updated weights for policy 0, policy_version 202634 (0.0029) [2024-06-13 08:24:36,754][70980] Signal inference workers to stop experience collection... (42650 times) [2024-06-13 08:24:36,802][71000] InferenceWorker_p0-w0: stopping experience collection (42650 times) [2024-06-13 08:24:36,810][70980] Signal inference workers to resume experience collection... (42650 times) [2024-06-13 08:24:36,821][71000] InferenceWorker_p0-w0: resuming experience collection (42650 times) [2024-06-13 08:24:39,678][71000] Updated weights for policy 0, policy_version 202644 (0.0022) [2024-06-13 08:24:40,939][70768] Fps is (10 sec: 44237.8, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 3320152064. Throughput: 0: 48575.6. Samples: 2849004060. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 08:24:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:24:43,038][71000] Updated weights for policy 0, policy_version 202654 (0.0025) [2024-06-13 08:24:45,940][70768] Fps is (10 sec: 49151.0, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3320414208. Throughput: 0: 48647.0. 
Samples: 2849296960. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 08:24:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:24:46,295][71000] Updated weights for policy 0, policy_version 202664 (0.0029) [2024-06-13 08:24:49,762][71000] Updated weights for policy 0, policy_version 202674 (0.0027) [2024-06-13 08:24:50,940][70768] Fps is (10 sec: 54066.0, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3320692736. Throughput: 0: 48770.1. Samples: 2849441780. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 08:24:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:24:53,184][71000] Updated weights for policy 0, policy_version 202684 (0.0035) [2024-06-13 08:24:55,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3320905728. Throughput: 0: 49083.5. Samples: 2849738640. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 08:24:55,944][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:24:56,639][71000] Updated weights for policy 0, policy_version 202694 (0.0029) [2024-06-13 08:24:59,965][71000] Updated weights for policy 0, policy_version 202704 (0.0033) [2024-06-13 08:25:00,940][70768] Fps is (10 sec: 44237.0, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 3321135104. Throughput: 0: 48949.2. Samples: 2850030100. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 08:25:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:25:03,021][71000] Updated weights for policy 0, policy_version 202714 (0.0024) [2024-06-13 08:25:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3321397248. Throughput: 0: 48813.7. Samples: 2850170660. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 08:25:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:25:06,489][71000] Updated weights for policy 0, policy_version 202724 (0.0028) [2024-06-13 08:25:09,709][71000] Updated weights for policy 0, policy_version 202734 (0.0023) [2024-06-13 08:25:10,940][70768] Fps is (10 sec: 52428.4, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3321659392. Throughput: 0: 49029.7. Samples: 2850469780. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 08:25:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:25:13,282][71000] Updated weights for policy 0, policy_version 202744 (0.0027) [2024-06-13 08:25:15,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 3321872384. Throughput: 0: 48990.9. Samples: 2850764000. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 08:25:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:25:16,552][71000] Updated weights for policy 0, policy_version 202754 (0.0027) [2024-06-13 08:25:20,201][71000] Updated weights for policy 0, policy_version 202764 (0.0030) [2024-06-13 08:25:20,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 3322134528. Throughput: 0: 48771.0. Samples: 2850899740. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 08:25:20,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 08:25:23,104][71000] Updated weights for policy 0, policy_version 202774 (0.0029) [2024-06-13 08:25:25,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 3322380288. Throughput: 0: 48738.1. Samples: 2851197280. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-13 08:25:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:25:26,746][71000] Updated weights for policy 0, policy_version 202784 (0.0032) [2024-06-13 08:25:29,770][71000] Updated weights for policy 0, policy_version 202794 (0.0028) [2024-06-13 08:25:30,941][70768] Fps is (10 sec: 50783.8, 60 sec: 48877.9, 300 sec: 48985.2). Total num frames: 3322642432. Throughput: 0: 48796.5. Samples: 2851492860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 08:25:30,941][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 08:25:33,598][71000] Updated weights for policy 0, policy_version 202804 (0.0026) [2024-06-13 08:25:35,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 3322839040. Throughput: 0: 49096.1. Samples: 2851651100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 08:25:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:25:36,544][71000] Updated weights for policy 0, policy_version 202814 (0.0031) [2024-06-13 08:25:40,354][71000] Updated weights for policy 0, policy_version 202824 (0.0035) [2024-06-13 08:25:40,940][70768] Fps is (10 sec: 45881.5, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 3323101184. Throughput: 0: 48804.5. Samples: 2851934840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 08:25:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:25:40,946][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000202827_3323117568.pth... [2024-06-13 08:25:40,995][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000202113_3311419392.pth [2024-06-13 08:25:43,142][71000] Updated weights for policy 0, policy_version 202834 (0.0023) [2024-06-13 08:25:44,505][70980] Signal inference workers to stop experience collection... (42700 times) [2024-06-13 08:25:44,506][70980] Signal inference workers to resume experience collection... 
(42700 times) [2024-06-13 08:25:44,524][71000] InferenceWorker_p0-w0: stopping experience collection (42700 times) [2024-06-13 08:25:44,524][71000] InferenceWorker_p0-w0: resuming experience collection (42700 times) [2024-06-13 08:25:45,940][70768] Fps is (10 sec: 52428.3, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 3323363328. Throughput: 0: 48776.4. Samples: 2852225040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 08:25:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:25:46,953][71000] Updated weights for policy 0, policy_version 202844 (0.0030) [2024-06-13 08:25:49,726][71000] Updated weights for policy 0, policy_version 202854 (0.0024) [2024-06-13 08:25:50,939][70768] Fps is (10 sec: 52429.2, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 3323625472. Throughput: 0: 49047.3. Samples: 2852377780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 08:25:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:25:53,443][71000] Updated weights for policy 0, policy_version 202864 (0.0031) [2024-06-13 08:25:55,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48605.9, 300 sec: 48818.7). Total num frames: 3323822080. Throughput: 0: 48974.0. Samples: 2852673600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 08:25:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:25:56,711][71000] Updated weights for policy 0, policy_version 202874 (0.0027) [2024-06-13 08:26:00,413][71000] Updated weights for policy 0, policy_version 202884 (0.0030) [2024-06-13 08:26:00,939][70768] Fps is (10 sec: 44236.8, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 3324067840. Throughput: 0: 48976.0. Samples: 2852967920. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 08:26:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:26:03,197][71000] Updated weights for policy 0, policy_version 202894 (0.0027) [2024-06-13 08:26:05,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3324329984. Throughput: 0: 49116.1. Samples: 2853109960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 08:26:05,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:26:06,767][71000] Updated weights for policy 0, policy_version 202904 (0.0030) [2024-06-13 08:26:09,767][71000] Updated weights for policy 0, policy_version 202914 (0.0031) [2024-06-13 08:26:10,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 3324592128. Throughput: 0: 49263.2. Samples: 2853414120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 08:26:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:26:13,050][71000] Updated weights for policy 0, policy_version 202924 (0.0031) [2024-06-13 08:26:15,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49424.9, 300 sec: 48874.3). Total num frames: 3324837888. Throughput: 0: 49289.9. Samples: 2853710840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 08:26:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:26:16,352][71000] Updated weights for policy 0, policy_version 202934 (0.0029) [2024-06-13 08:26:19,881][71000] Updated weights for policy 0, policy_version 202944 (0.0032) [2024-06-13 08:26:20,939][70768] Fps is (10 sec: 45875.4, 60 sec: 48606.0, 300 sec: 48763.2). Total num frames: 3325050880. Throughput: 0: 48966.8. Samples: 2853854600. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 08:26:20,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:26:23,160][71000] Updated weights for policy 0, policy_version 202954 (0.0020) [2024-06-13 08:26:25,940][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3325329408. Throughput: 0: 49076.0. Samples: 2854143260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 08:26:25,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 08:26:26,585][71000] Updated weights for policy 0, policy_version 202964 (0.0030) [2024-06-13 08:26:29,697][71000] Updated weights for policy 0, policy_version 202974 (0.0030) [2024-06-13 08:26:30,940][70768] Fps is (10 sec: 54066.2, 60 sec: 49153.0, 300 sec: 48985.4). Total num frames: 3325591552. Throughput: 0: 49141.3. Samples: 2854436400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 08:26:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:26:33,177][71000] Updated weights for policy 0, policy_version 202984 (0.0033) [2024-06-13 08:26:35,939][70768] Fps is (10 sec: 45875.7, 60 sec: 49152.1, 300 sec: 48707.7). Total num frames: 3325788160. Throughput: 0: 49082.3. Samples: 2854586480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 08:26:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:26:36,449][71000] Updated weights for policy 0, policy_version 202994 (0.0027) [2024-06-13 08:26:39,983][71000] Updated weights for policy 0, policy_version 203004 (0.0027) [2024-06-13 08:26:40,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3326050304. Throughput: 0: 48958.7. Samples: 2854876740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 08:26:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:26:43,150][71000] Updated weights for policy 0, policy_version 203014 (0.0032) [2024-06-13 08:26:44,008][70980] Signal inference workers to stop experience collection... 
(42750 times) [2024-06-13 08:26:44,032][71000] InferenceWorker_p0-w0: stopping experience collection (42750 times) [2024-06-13 08:26:44,060][70980] Signal inference workers to resume experience collection... (42750 times) [2024-06-13 08:26:44,066][71000] InferenceWorker_p0-w0: resuming experience collection (42750 times) [2024-06-13 08:26:45,940][70768] Fps is (10 sec: 52427.8, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 3326312448. Throughput: 0: 48932.7. Samples: 2855169900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 08:26:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:26:46,570][71000] Updated weights for policy 0, policy_version 203024 (0.0029) [2024-06-13 08:26:49,994][71000] Updated weights for policy 0, policy_version 203034 (0.0025) [2024-06-13 08:26:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 3326558208. Throughput: 0: 49123.0. Samples: 2855320500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 08:26:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:26:53,405][71000] Updated weights for policy 0, policy_version 203044 (0.0030) [2024-06-13 08:26:55,940][70768] Fps is (10 sec: 45875.0, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 3326771200. Throughput: 0: 48786.5. Samples: 2855609520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 08:26:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:26:56,632][71000] Updated weights for policy 0, policy_version 203054 (0.0031) [2024-06-13 08:27:00,313][71000] Updated weights for policy 0, policy_version 203064 (0.0029) [2024-06-13 08:27:00,940][70768] Fps is (10 sec: 45875.1, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 3327016960. Throughput: 0: 48647.2. Samples: 2855899960. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 08:27:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:27:03,401][71000] Updated weights for policy 0, policy_version 203074 (0.0039) [2024-06-13 08:27:05,939][70768] Fps is (10 sec: 49153.1, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 3327262720. Throughput: 0: 48545.4. Samples: 2856039140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 08:27:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:27:07,030][71000] Updated weights for policy 0, policy_version 203084 (0.0031) [2024-06-13 08:27:09,909][71000] Updated weights for policy 0, policy_version 203094 (0.0031) [2024-06-13 08:27:10,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3327524864. Throughput: 0: 48850.2. Samples: 2856341520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 08:27:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:27:13,172][71000] Updated weights for policy 0, policy_version 203104 (0.0019) [2024-06-13 08:27:15,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 3327770624. Throughput: 0: 49112.2. Samples: 2856646440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 08:27:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:27:16,350][71000] Updated weights for policy 0, policy_version 203114 (0.0034) [2024-06-13 08:27:20,205][71000] Updated weights for policy 0, policy_version 203124 (0.0036) [2024-06-13 08:27:20,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49151.8, 300 sec: 48763.2). Total num frames: 3328000000. Throughput: 0: 48750.8. Samples: 2856780280. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 08:27:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:27:23,329][71000] Updated weights for policy 0, policy_version 203134 (0.0030) [2024-06-13 08:27:25,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 3328245760. Throughput: 0: 48813.6. Samples: 2857073360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 08:27:25,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:27:26,946][71000] Updated weights for policy 0, policy_version 203144 (0.0033) [2024-06-13 08:27:30,292][71000] Updated weights for policy 0, policy_version 203154 (0.0028) [2024-06-13 08:27:30,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 3328491520. Throughput: 0: 48973.9. Samples: 2857373720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 08:27:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:27:33,478][71000] Updated weights for policy 0, policy_version 203164 (0.0031) [2024-06-13 08:27:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49424.9, 300 sec: 48929.9). Total num frames: 3328753664. Throughput: 0: 48908.0. Samples: 2857521360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 08:27:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:27:36,701][71000] Updated weights for policy 0, policy_version 203174 (0.0025) [2024-06-13 08:27:39,987][71000] Updated weights for policy 0, policy_version 203184 (0.0026) [2024-06-13 08:27:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3328999424. Throughput: 0: 49068.1. Samples: 2857817580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 08:27:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:27:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000203186_3328999424.pth... 
[2024-06-13 08:27:41,009][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000202469_3317252096.pth [2024-06-13 08:27:43,308][71000] Updated weights for policy 0, policy_version 203194 (0.0033) [2024-06-13 08:27:45,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3329245184. Throughput: 0: 48991.6. Samples: 2858104580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-13 08:27:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:27:46,965][71000] Updated weights for policy 0, policy_version 203204 (0.0031) [2024-06-13 08:27:50,203][71000] Updated weights for policy 0, policy_version 203214 (0.0032) [2024-06-13 08:27:50,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 3329474560. Throughput: 0: 49319.0. Samples: 2858258500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-13 08:27:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:27:52,673][70980] Signal inference workers to stop experience collection... (42800 times) [2024-06-13 08:27:52,721][71000] InferenceWorker_p0-w0: stopping experience collection (42800 times) [2024-06-13 08:27:52,723][70980] Signal inference workers to resume experience collection... (42800 times) [2024-06-13 08:27:52,731][71000] InferenceWorker_p0-w0: resuming experience collection (42800 times) [2024-06-13 08:27:53,523][71000] Updated weights for policy 0, policy_version 203224 (0.0035) [2024-06-13 08:27:55,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3329720320. Throughput: 0: 49000.8. Samples: 2858546560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-13 08:27:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:27:57,013][71000] Updated weights for policy 0, policy_version 203234 (0.0029) [2024-06-13 08:28:00,325][71000] Updated weights for policy 0, policy_version 203244 (0.0031) [2024-06-13 08:28:00,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 3329966080. Throughput: 0: 48621.4. Samples: 2858834400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-13 08:28:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:28:03,675][71000] Updated weights for policy 0, policy_version 203254 (0.0023) [2024-06-13 08:28:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 3330211840. Throughput: 0: 48889.9. Samples: 2858980320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-13 08:28:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:28:07,147][71000] Updated weights for policy 0, policy_version 203264 (0.0027) [2024-06-13 08:28:10,384][71000] Updated weights for policy 0, policy_version 203274 (0.0035) [2024-06-13 08:28:10,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3330457600. Throughput: 0: 48921.0. Samples: 2859274800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-13 08:28:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:28:13,776][71000] Updated weights for policy 0, policy_version 203284 (0.0032) [2024-06-13 08:28:15,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3330703360. Throughput: 0: 48695.6. Samples: 2859565020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-13 08:28:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:28:17,383][71000] Updated weights for policy 0, policy_version 203294 (0.0027) [2024-06-13 08:28:20,846][71000] Updated weights for policy 0, policy_version 203304 (0.0032) [2024-06-13 08:28:20,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3330932736. Throughput: 0: 48678.3. Samples: 2859711880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-13 08:28:20,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:28:24,154][71000] Updated weights for policy 0, policy_version 203314 (0.0030) [2024-06-13 08:28:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 48818.8). Total num frames: 3331194880. Throughput: 0: 48668.9. Samples: 2860007680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-13 08:28:25,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:28:27,487][71000] Updated weights for policy 0, policy_version 203324 (0.0031) [2024-06-13 08:28:30,750][71000] Updated weights for policy 0, policy_version 203334 (0.0029) [2024-06-13 08:28:30,939][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3331424256. Throughput: 0: 48797.0. Samples: 2860300440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-13 08:28:30,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 08:28:33,832][71000] Updated weights for policy 0, policy_version 203344 (0.0025) [2024-06-13 08:28:35,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3331686400. Throughput: 0: 48630.3. Samples: 2860446860. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-13 08:28:35,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:28:37,101][71000] Updated weights for policy 0, policy_version 203354 (0.0021) [2024-06-13 08:28:40,606][71000] Updated weights for policy 0, policy_version 203364 (0.0026) [2024-06-13 08:28:40,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 3331932160. Throughput: 0: 49069.0. Samples: 2860754660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-13 08:28:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:28:44,047][71000] Updated weights for policy 0, policy_version 203374 (0.0042) [2024-06-13 08:28:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 3332161536. Throughput: 0: 49033.7. Samples: 2861040920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-13 08:28:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:28:47,559][71000] Updated weights for policy 0, policy_version 203384 (0.0031) [2024-06-13 08:28:50,692][71000] Updated weights for policy 0, policy_version 203394 (0.0029) [2024-06-13 08:28:50,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 3332407296. Throughput: 0: 49012.7. Samples: 2861185900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:28:50,941][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:28:53,843][71000] Updated weights for policy 0, policy_version 203404 (0.0024) [2024-06-13 08:28:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3332669440. Throughput: 0: 49127.6. Samples: 2861485540. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:28:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:28:57,340][71000] Updated weights for policy 0, policy_version 203414 (0.0026) [2024-06-13 08:29:00,430][71000] Updated weights for policy 0, policy_version 203424 (0.0033) [2024-06-13 08:29:00,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 3332915200. Throughput: 0: 49310.9. Samples: 2861784020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:29:00,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:29:03,791][71000] Updated weights for policy 0, policy_version 203434 (0.0030) [2024-06-13 08:29:05,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 3333160960. Throughput: 0: 49364.9. Samples: 2861933300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:29:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:29:07,415][71000] Updated weights for policy 0, policy_version 203444 (0.0028) [2024-06-13 08:29:10,634][71000] Updated weights for policy 0, policy_version 203454 (0.0025) [2024-06-13 08:29:10,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3333390336. Throughput: 0: 49199.1. Samples: 2862221640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:29:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:29:13,685][71000] Updated weights for policy 0, policy_version 203464 (0.0023) [2024-06-13 08:29:15,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3333652480. Throughput: 0: 49192.7. Samples: 2862514120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:29:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:29:16,370][70980] Signal inference workers to stop experience collection... 
(42850 times) [2024-06-13 08:29:16,371][70980] Signal inference workers to resume experience collection... (42850 times) [2024-06-13 08:29:16,385][71000] InferenceWorker_p0-w0: stopping experience collection (42850 times) [2024-06-13 08:29:16,385][71000] InferenceWorker_p0-w0: resuming experience collection (42850 times) [2024-06-13 08:29:17,276][71000] Updated weights for policy 0, policy_version 203474 (0.0029) [2024-06-13 08:29:20,466][71000] Updated weights for policy 0, policy_version 203484 (0.0025) [2024-06-13 08:29:20,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49425.1, 300 sec: 48929.9). Total num frames: 3333898240. Throughput: 0: 49406.7. Samples: 2862670160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:29:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:29:24,353][71000] Updated weights for policy 0, policy_version 203494 (0.0031) [2024-06-13 08:29:25,941][70768] Fps is (10 sec: 47507.6, 60 sec: 48877.8, 300 sec: 48874.1). Total num frames: 3334127616. Throughput: 0: 49121.2. Samples: 2862965180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:29:25,942][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:29:27,069][71000] Updated weights for policy 0, policy_version 203504 (0.0034) [2024-06-13 08:29:30,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3334356992. Throughput: 0: 49087.2. Samples: 2863249840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:29:30,940][70768] Avg episode reward: [(0, '0.295')] [2024-06-13 08:29:31,027][71000] Updated weights for policy 0, policy_version 203514 (0.0033) [2024-06-13 08:29:33,916][71000] Updated weights for policy 0, policy_version 203524 (0.0026) [2024-06-13 08:29:35,940][70768] Fps is (10 sec: 50797.2, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 3334635520. Throughput: 0: 49143.2. Samples: 2863397340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:29:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:29:37,509][71000] Updated weights for policy 0, policy_version 203534 (0.0025) [2024-06-13 08:29:40,848][71000] Updated weights for policy 0, policy_version 203544 (0.0039) [2024-06-13 08:29:40,939][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3334864896. Throughput: 0: 48863.2. Samples: 2863684380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:29:40,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:29:40,967][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000203545_3334881280.pth... [2024-06-13 08:29:41,014][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000202827_3323117568.pth [2024-06-13 08:29:44,598][71000] Updated weights for policy 0, policy_version 203554 (0.0026) [2024-06-13 08:29:45,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 3335094272. Throughput: 0: 48800.1. Samples: 2863980020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:29:45,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:29:47,364][71000] Updated weights for policy 0, policy_version 203564 (0.0036) [2024-06-13 08:29:50,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3335340032. Throughput: 0: 48598.6. Samples: 2864120240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:29:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:29:51,101][71000] Updated weights for policy 0, policy_version 203574 (0.0031) [2024-06-13 08:29:54,188][71000] Updated weights for policy 0, policy_version 203584 (0.0025) [2024-06-13 08:29:55,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3335585792. Throughput: 0: 48868.0. Samples: 2864420700. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:29:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:29:57,415][71000] Updated weights for policy 0, policy_version 203594 (0.0033) [2024-06-13 08:30:00,683][71000] Updated weights for policy 0, policy_version 203604 (0.0027) [2024-06-13 08:30:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3335847936. Throughput: 0: 48986.2. Samples: 2864718500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:30:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:30:04,450][71000] Updated weights for policy 0, policy_version 203614 (0.0030) [2024-06-13 08:30:05,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 3336093696. Throughput: 0: 48635.9. Samples: 2864858780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:30:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:30:07,628][71000] Updated weights for policy 0, policy_version 203624 (0.0034) [2024-06-13 08:30:10,939][70768] Fps is (10 sec: 47514.5, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3336323072. Throughput: 0: 48725.1. Samples: 2865157740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:30:10,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 08:30:11,047][71000] Updated weights for policy 0, policy_version 203634 (0.0029) [2024-06-13 08:30:13,965][71000] Updated weights for policy 0, policy_version 203644 (0.0026) [2024-06-13 08:30:15,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 3336568832. Throughput: 0: 49136.4. Samples: 2865460980. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:30:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:30:17,290][71000] Updated weights for policy 0, policy_version 203654 (0.0030) [2024-06-13 08:30:20,688][71000] Updated weights for policy 0, policy_version 203664 (0.0029) [2024-06-13 08:30:20,940][70768] Fps is (10 sec: 52427.7, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3336847360. Throughput: 0: 49128.3. Samples: 2865608120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:30:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:30:24,082][71000] Updated weights for policy 0, policy_version 203674 (0.0028) [2024-06-13 08:30:25,939][70768] Fps is (10 sec: 50790.4, 60 sec: 49153.2, 300 sec: 48930.1). Total num frames: 3337076736. Throughput: 0: 49302.7. Samples: 2865903000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:30:25,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:30:27,292][71000] Updated weights for policy 0, policy_version 203684 (0.0036) [2024-06-13 08:30:29,043][70980] Signal inference workers to stop experience collection... (42900 times) [2024-06-13 08:30:29,095][71000] InferenceWorker_p0-w0: stopping experience collection (42900 times) [2024-06-13 08:30:29,095][70980] Signal inference workers to resume experience collection... (42900 times) [2024-06-13 08:30:29,110][71000] InferenceWorker_p0-w0: resuming experience collection (42900 times) [2024-06-13 08:30:30,762][71000] Updated weights for policy 0, policy_version 203694 (0.0033) [2024-06-13 08:30:30,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 3337322496. Throughput: 0: 49407.5. Samples: 2866203360. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:30:30,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:30:33,955][71000] Updated weights for policy 0, policy_version 203704 (0.0025) [2024-06-13 08:30:35,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3337551872. Throughput: 0: 49407.2. Samples: 2866343560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:30:35,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 08:30:37,465][71000] Updated weights for policy 0, policy_version 203714 (0.0036) [2024-06-13 08:30:40,625][71000] Updated weights for policy 0, policy_version 203724 (0.0027) [2024-06-13 08:30:40,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3337830400. Throughput: 0: 49241.3. Samples: 2866636560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:30:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:30:44,305][71000] Updated weights for policy 0, policy_version 203734 (0.0035) [2024-06-13 08:30:45,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.8, 300 sec: 48818.7). Total num frames: 3338027008. Throughput: 0: 49087.6. Samples: 2866927440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:30:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:30:47,402][71000] Updated weights for policy 0, policy_version 203744 (0.0029) [2024-06-13 08:30:50,940][70768] Fps is (10 sec: 45874.5, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3338289152. Throughput: 0: 49035.9. Samples: 2867065400. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:30:50,940][70768] Avg episode reward: [(0, '0.274')] [2024-06-13 08:30:50,987][71000] Updated weights for policy 0, policy_version 203754 (0.0033) [2024-06-13 08:30:54,178][71000] Updated weights for policy 0, policy_version 203764 (0.0028) [2024-06-13 08:30:55,941][70768] Fps is (10 sec: 50786.0, 60 sec: 49151.2, 300 sec: 49040.8). Total num frames: 3338534912. Throughput: 0: 48786.9. Samples: 2867353200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:30:55,941][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:30:57,755][71000] Updated weights for policy 0, policy_version 203774 (0.0024) [2024-06-13 08:31:00,845][71000] Updated weights for policy 0, policy_version 203784 (0.0029) [2024-06-13 08:31:00,939][70768] Fps is (10 sec: 50791.4, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3338797056. Throughput: 0: 48750.7. Samples: 2867654760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 08:31:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:31:04,286][71000] Updated weights for policy 0, policy_version 203794 (0.0032) [2024-06-13 08:31:05,939][70768] Fps is (10 sec: 47518.6, 60 sec: 48606.0, 300 sec: 48874.3). Total num frames: 3339010048. Throughput: 0: 48733.1. Samples: 2867801100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 08:31:05,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:31:07,608][71000] Updated weights for policy 0, policy_version 203804 (0.0025) [2024-06-13 08:31:10,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3339255808. Throughput: 0: 48430.2. Samples: 2868082360. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 08:31:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:31:11,530][71000] Updated weights for policy 0, policy_version 203814 (0.0033) [2024-06-13 08:31:14,592][71000] Updated weights for policy 0, policy_version 203824 (0.0033) [2024-06-13 08:31:15,939][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3339501568. Throughput: 0: 48236.1. Samples: 2868373980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 08:31:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:31:18,110][71000] Updated weights for policy 0, policy_version 203834 (0.0032) [2024-06-13 08:31:20,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 3339747328. Throughput: 0: 48335.4. Samples: 2868518660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 08:31:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:31:21,206][71000] Updated weights for policy 0, policy_version 203844 (0.0031) [2024-06-13 08:31:24,710][71000] Updated weights for policy 0, policy_version 203854 (0.0037) [2024-06-13 08:31:25,939][70768] Fps is (10 sec: 47513.5, 60 sec: 48332.8, 300 sec: 48763.3). Total num frames: 3339976704. Throughput: 0: 48405.8. Samples: 2868814820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 08:31:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:31:27,746][71000] Updated weights for policy 0, policy_version 203864 (0.0022) [2024-06-13 08:31:30,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 3340238848. Throughput: 0: 48477.0. Samples: 2869108900. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 08:31:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:31:31,190][71000] Updated weights for policy 0, policy_version 203874 (0.0036) [2024-06-13 08:31:34,687][71000] Updated weights for policy 0, policy_version 203884 (0.0031) [2024-06-13 08:31:35,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3340468224. Throughput: 0: 48717.4. Samples: 2869257680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 08:31:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:31:38,186][71000] Updated weights for policy 0, policy_version 203894 (0.0031) [2024-06-13 08:31:40,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 3340730368. Throughput: 0: 48543.7. Samples: 2869537620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 08:31:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:31:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000203902_3340730368.pth... [2024-06-13 08:31:40,992][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000203186_3328999424.pth [2024-06-13 08:31:41,556][71000] Updated weights for policy 0, policy_version 203904 (0.0026) [2024-06-13 08:31:45,057][71000] Updated weights for policy 0, policy_version 203914 (0.0028) [2024-06-13 08:31:45,939][70768] Fps is (10 sec: 49152.6, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 3340959744. Throughput: 0: 48269.8. Samples: 2869826900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 08:31:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:31:48,109][71000] Updated weights for policy 0, policy_version 203924 (0.0035) [2024-06-13 08:31:50,909][70980] Signal inference workers to stop experience collection... (42950 times) [2024-06-13 08:31:50,910][70980] Signal inference workers to resume experience collection... 
(42950 times) [2024-06-13 08:31:50,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 3341205504. Throughput: 0: 48139.0. Samples: 2869967360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 08:31:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:31:50,956][71000] InferenceWorker_p0-w0: stopping experience collection (42950 times) [2024-06-13 08:31:50,956][71000] InferenceWorker_p0-w0: resuming experience collection (42950 times) [2024-06-13 08:31:51,458][71000] Updated weights for policy 0, policy_version 203934 (0.0023) [2024-06-13 08:31:54,717][71000] Updated weights for policy 0, policy_version 203944 (0.0027) [2024-06-13 08:31:55,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48606.8, 300 sec: 48929.9). Total num frames: 3341451264. Throughput: 0: 48844.1. Samples: 2870280340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 08:31:55,940][70768] Avg episode reward: [(0, '0.268')] [2024-06-13 08:31:58,267][71000] Updated weights for policy 0, policy_version 203954 (0.0027) [2024-06-13 08:32:00,943][70768] Fps is (10 sec: 50770.3, 60 sec: 48602.6, 300 sec: 48984.7). Total num frames: 3341713408. Throughput: 0: 48721.9. Samples: 2870566660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 08:32:00,944][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:32:02,066][71000] Updated weights for policy 0, policy_version 203964 (0.0027) [2024-06-13 08:32:05,338][71000] Updated weights for policy 0, policy_version 203974 (0.0024) [2024-06-13 08:32:05,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 3341926400. Throughput: 0: 48738.8. Samples: 2870711900. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 08:32:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:32:08,547][71000] Updated weights for policy 0, policy_version 203984 (0.0027) [2024-06-13 08:32:10,940][70768] Fps is (10 sec: 47531.7, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 3342188544. Throughput: 0: 48602.0. Samples: 2871001920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-13 08:32:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:32:11,769][71000] Updated weights for policy 0, policy_version 203994 (0.0028) [2024-06-13 08:32:15,461][71000] Updated weights for policy 0, policy_version 204004 (0.0028) [2024-06-13 08:32:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3342417920. Throughput: 0: 48839.2. Samples: 2871306660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 08:32:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:32:18,127][71000] Updated weights for policy 0, policy_version 204014 (0.0038) [2024-06-13 08:32:20,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3342696448. Throughput: 0: 48641.4. Samples: 2871446540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 08:32:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:32:22,181][71000] Updated weights for policy 0, policy_version 204024 (0.0034) [2024-06-13 08:32:24,836][71000] Updated weights for policy 0, policy_version 204034 (0.0038) [2024-06-13 08:32:25,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 3342909440. Throughput: 0: 49028.8. Samples: 2871743920. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 08:32:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:32:28,620][71000] Updated weights for policy 0, policy_version 204044 (0.0036) [2024-06-13 08:32:30,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3343171584. Throughput: 0: 49104.0. Samples: 2872036580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 08:32:30,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 08:32:31,715][71000] Updated weights for policy 0, policy_version 204054 (0.0025) [2024-06-13 08:32:34,982][71000] Updated weights for policy 0, policy_version 204064 (0.0025) [2024-06-13 08:32:35,939][70768] Fps is (10 sec: 50791.2, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 3343417344. Throughput: 0: 49434.7. Samples: 2872191920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 08:32:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 08:32:38,211][71000] Updated weights for policy 0, policy_version 204074 (0.0028) [2024-06-13 08:32:40,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3343663104. Throughput: 0: 49147.7. Samples: 2872492000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 08:32:40,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 08:32:41,998][71000] Updated weights for policy 0, policy_version 204084 (0.0027) [2024-06-13 08:32:44,783][71000] Updated weights for policy 0, policy_version 204094 (0.0020) [2024-06-13 08:32:45,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 3343908864. Throughput: 0: 49151.4. Samples: 2872778280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 08:32:45,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 08:32:48,865][71000] Updated weights for policy 0, policy_version 204104 (0.0035) [2024-06-13 08:32:50,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49424.9, 300 sec: 48985.4). Total num frames: 3344171008. Throughput: 0: 49256.3. Samples: 2872928440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 08:32:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:32:51,562][71000] Updated weights for policy 0, policy_version 204114 (0.0027) [2024-06-13 08:32:55,197][71000] Updated weights for policy 0, policy_version 204124 (0.0031) [2024-06-13 08:32:55,939][70768] Fps is (10 sec: 49152.5, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 3344400384. Throughput: 0: 49436.7. Samples: 2873226560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 08:32:55,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 08:32:58,358][71000] Updated weights for policy 0, policy_version 204134 (0.0027) [2024-06-13 08:33:00,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48882.1, 300 sec: 48929.8). Total num frames: 3344646144. Throughput: 0: 49103.5. Samples: 2873516320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 08:33:00,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:33:02,023][71000] Updated weights for policy 0, policy_version 204144 (0.0022) [2024-06-13 08:33:04,778][71000] Updated weights for policy 0, policy_version 204154 (0.0032) [2024-06-13 08:33:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.1, 300 sec: 48929.8). Total num frames: 3344891904. Throughput: 0: 49376.5. Samples: 2873668480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 08:33:05,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 08:33:08,489][71000] Updated weights for policy 0, policy_version 204164 (0.0032) [2024-06-13 08:33:10,626][70980] Signal inference workers to stop experience collection... (43000 times) [2024-06-13 08:33:10,626][70980] Signal inference workers to resume experience collection... (43000 times) [2024-06-13 08:33:10,639][71000] InferenceWorker_p0-w0: stopping experience collection (43000 times) [2024-06-13 08:33:10,639][71000] InferenceWorker_p0-w0: resuming experience collection (43000 times) [2024-06-13 08:33:10,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 3345154048. Throughput: 0: 49429.4. Samples: 2873968240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 08:33:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:33:11,437][71000] Updated weights for policy 0, policy_version 204174 (0.0040) [2024-06-13 08:33:15,029][71000] Updated weights for policy 0, policy_version 204184 (0.0026) [2024-06-13 08:33:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 3345383424. Throughput: 0: 49525.3. Samples: 2874265220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-13 08:33:15,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:33:18,171][71000] Updated weights for policy 0, policy_version 204194 (0.0024) [2024-06-13 08:33:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3345629184. Throughput: 0: 49444.7. Samples: 2874416940. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-13 08:33:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:33:22,023][71000] Updated weights for policy 0, policy_version 204204 (0.0027) [2024-06-13 08:33:24,747][71000] Updated weights for policy 0, policy_version 204214 (0.0023) [2024-06-13 08:33:25,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49425.0, 300 sec: 48985.3). Total num frames: 3345874944. Throughput: 0: 49190.2. Samples: 2874705560. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-13 08:33:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:33:28,548][71000] Updated weights for policy 0, policy_version 204224 (0.0031) [2024-06-13 08:33:30,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 3346137088. Throughput: 0: 49444.5. Samples: 2875003280. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-13 08:33:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:33:31,588][71000] Updated weights for policy 0, policy_version 204234 (0.0031) [2024-06-13 08:33:34,980][71000] Updated weights for policy 0, policy_version 204244 (0.0032) [2024-06-13 08:33:35,940][70768] Fps is (10 sec: 50791.2, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 3346382848. Throughput: 0: 49454.8. Samples: 2875153900. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-13 08:33:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:33:38,118][71000] Updated weights for policy 0, policy_version 204254 (0.0025) [2024-06-13 08:33:40,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.2, 300 sec: 49040.9). Total num frames: 3346628608. Throughput: 0: 49393.7. Samples: 2875449280. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-13 08:33:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:33:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000204262_3346628608.pth... 
[2024-06-13 08:33:41,000][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000203545_3334881280.pth [2024-06-13 08:33:41,818][71000] Updated weights for policy 0, policy_version 204264 (0.0029) [2024-06-13 08:33:44,703][71000] Updated weights for policy 0, policy_version 204274 (0.0027) [2024-06-13 08:33:45,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.2, 300 sec: 49096.5). Total num frames: 3346890752. Throughput: 0: 49457.3. Samples: 2875741900. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-13 08:33:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:33:48,237][71000] Updated weights for policy 0, policy_version 204284 (0.0027) [2024-06-13 08:33:50,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3347120128. Throughput: 0: 49546.1. Samples: 2875898060. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-13 08:33:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:33:51,695][71000] Updated weights for policy 0, policy_version 204294 (0.0035) [2024-06-13 08:33:54,792][71000] Updated weights for policy 0, policy_version 204304 (0.0033) [2024-06-13 08:33:55,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49698.2, 300 sec: 49041.0). Total num frames: 3347382272. Throughput: 0: 49534.0. Samples: 2876197260. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-13 08:33:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:33:58,108][71000] Updated weights for policy 0, policy_version 204314 (0.0027) [2024-06-13 08:34:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49424.9, 300 sec: 48985.4). Total num frames: 3347611648. Throughput: 0: 49368.2. Samples: 2876486800. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-13 08:34:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:34:01,463][71000] Updated weights for policy 0, policy_version 204324 (0.0031) [2024-06-13 08:34:04,567][71000] Updated weights for policy 0, policy_version 204334 (0.0026) [2024-06-13 08:34:05,940][70768] Fps is (10 sec: 50788.9, 60 sec: 49971.1, 300 sec: 49152.0). Total num frames: 3347890176. Throughput: 0: 49282.6. Samples: 2876634660. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-13 08:34:05,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:34:07,817][71000] Updated weights for policy 0, policy_version 204344 (0.0028) [2024-06-13 08:34:10,940][70768] Fps is (10 sec: 50791.3, 60 sec: 49425.1, 300 sec: 49040.9). Total num frames: 3348119552. Throughput: 0: 49596.2. Samples: 2876937380. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-13 08:34:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:34:11,370][71000] Updated weights for policy 0, policy_version 204354 (0.0037) [2024-06-13 08:34:14,542][71000] Updated weights for policy 0, policy_version 204364 (0.0030) [2024-06-13 08:34:15,940][70768] Fps is (10 sec: 45875.9, 60 sec: 49425.0, 300 sec: 48985.4). Total num frames: 3348348928. Throughput: 0: 49422.2. Samples: 2877227280. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-13 08:34:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:34:18,247][71000] Updated weights for policy 0, policy_version 204374 (0.0029) [2024-06-13 08:34:20,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49698.2, 300 sec: 49096.7). Total num frames: 3348611072. Throughput: 0: 49417.7. Samples: 2877377700. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-13 08:34:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:34:21,539][71000] Updated weights for policy 0, policy_version 204384 (0.0033) [2024-06-13 08:34:24,834][71000] Updated weights for policy 0, policy_version 204394 (0.0030) [2024-06-13 08:34:25,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49698.3, 300 sec: 49152.0). Total num frames: 3348856832. Throughput: 0: 49248.4. Samples: 2877665460. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-13 08:34:25,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:34:28,121][71000] Updated weights for policy 0, policy_version 204404 (0.0025) [2024-06-13 08:34:30,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3349086208. Throughput: 0: 49232.0. Samples: 2877957340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 08:34:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:34:31,512][71000] Updated weights for policy 0, policy_version 204414 (0.0033) [2024-06-13 08:34:34,044][70980] Signal inference workers to stop experience collection... (43050 times) [2024-06-13 08:34:34,045][70980] Signal inference workers to resume experience collection... (43050 times) [2024-06-13 08:34:34,078][71000] InferenceWorker_p0-w0: stopping experience collection (43050 times) [2024-06-13 08:34:34,078][71000] InferenceWorker_p0-w0: resuming experience collection (43050 times) [2024-06-13 08:34:34,687][71000] Updated weights for policy 0, policy_version 204424 (0.0036) [2024-06-13 08:34:35,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49096.4). Total num frames: 3349348352. Throughput: 0: 49099.7. Samples: 2878107540. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 08:34:35,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:34:38,175][71000] Updated weights for policy 0, policy_version 204434 (0.0030) [2024-06-13 08:34:40,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49151.8, 300 sec: 49096.4). Total num frames: 3349577728. Throughput: 0: 49012.5. Samples: 2878402840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 08:34:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:34:41,453][71000] Updated weights for policy 0, policy_version 204444 (0.0027) [2024-06-13 08:34:45,078][71000] Updated weights for policy 0, policy_version 204454 (0.0028) [2024-06-13 08:34:45,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 3349823488. Throughput: 0: 48948.5. Samples: 2878689480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 08:34:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:34:48,140][71000] Updated weights for policy 0, policy_version 204464 (0.0035) [2024-06-13 08:34:50,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 49096.5). Total num frames: 3350069248. Throughput: 0: 48901.0. Samples: 2878835200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 08:34:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:34:51,563][71000] Updated weights for policy 0, policy_version 204474 (0.0032) [2024-06-13 08:34:54,954][71000] Updated weights for policy 0, policy_version 204484 (0.0033) [2024-06-13 08:34:55,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.7, 300 sec: 49040.9). Total num frames: 3350315008. Throughput: 0: 48821.2. Samples: 2879134340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 08:34:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:34:58,210][71000] Updated weights for policy 0, policy_version 204494 (0.0033) [2024-06-13 08:35:00,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3350544384. Throughput: 0: 48911.5. Samples: 2879428300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 08:35:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:35:01,459][71000] Updated weights for policy 0, policy_version 204504 (0.0036) [2024-06-13 08:35:04,933][71000] Updated weights for policy 0, policy_version 204514 (0.0030) [2024-06-13 08:35:05,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48332.8, 300 sec: 49040.9). Total num frames: 3350790144. Throughput: 0: 48672.7. Samples: 2879567980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 08:35:05,941][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 08:35:08,590][71000] Updated weights for policy 0, policy_version 204524 (0.0028) [2024-06-13 08:35:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48332.7, 300 sec: 48985.4). Total num frames: 3351019520. Throughput: 0: 48777.3. Samples: 2879860440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 08:35:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:35:11,545][71000] Updated weights for policy 0, policy_version 204534 (0.0040) [2024-06-13 08:35:15,182][71000] Updated weights for policy 0, policy_version 204544 (0.0026) [2024-06-13 08:35:15,940][70768] Fps is (10 sec: 49153.1, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 3351281664. Throughput: 0: 48623.6. Samples: 2880145400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 08:35:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:35:18,327][71000] Updated weights for policy 0, policy_version 204554 (0.0025) [2024-06-13 08:35:20,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48332.9, 300 sec: 48929.8). Total num frames: 3351511040. Throughput: 0: 48462.8. Samples: 2880288360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 08:35:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:35:21,843][71000] Updated weights for policy 0, policy_version 204564 (0.0024) [2024-06-13 08:35:25,076][71000] Updated weights for policy 0, policy_version 204574 (0.0035) [2024-06-13 08:35:25,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 3351773184. Throughput: 0: 48515.2. Samples: 2880586020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 08:35:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:35:28,651][71000] Updated weights for policy 0, policy_version 204584 (0.0031) [2024-06-13 08:35:30,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 3352002560. Throughput: 0: 48790.7. Samples: 2880885060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 08:35:30,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:35:31,778][71000] Updated weights for policy 0, policy_version 204594 (0.0034) [2024-06-13 08:35:35,314][71000] Updated weights for policy 0, policy_version 204604 (0.0027) [2024-06-13 08:35:35,940][70768] Fps is (10 sec: 47513.8, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 3352248320. Throughput: 0: 48801.3. Samples: 2881031260. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 08:35:35,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:35:38,290][71000] Updated weights for policy 0, policy_version 204614 (0.0021) [2024-06-13 08:35:40,939][70768] Fps is (10 sec: 47514.3, 60 sec: 48333.0, 300 sec: 48985.4). Total num frames: 3352477696. Throughput: 0: 48577.1. Samples: 2881320300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 08:35:40,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:35:41,123][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000204621_3352510464.pth... [2024-06-13 08:35:41,171][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000203902_3340730368.pth [2024-06-13 08:35:41,995][71000] Updated weights for policy 0, policy_version 204624 (0.0027) [2024-06-13 08:35:45,218][71000] Updated weights for policy 0, policy_version 204634 (0.0026) [2024-06-13 08:35:45,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 3352739840. Throughput: 0: 48597.8. Samples: 2881615200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 08:35:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:35:48,724][71000] Updated weights for policy 0, policy_version 204644 (0.0027) [2024-06-13 08:35:50,940][70768] Fps is (10 sec: 50789.4, 60 sec: 48605.8, 300 sec: 48985.5). Total num frames: 3352985600. Throughput: 0: 48869.4. Samples: 2881767100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 08:35:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:35:51,730][71000] Updated weights for policy 0, policy_version 204654 (0.0025) [2024-06-13 08:35:55,438][71000] Updated weights for policy 0, policy_version 204664 (0.0034) [2024-06-13 08:35:55,939][70768] Fps is (10 sec: 50791.2, 60 sec: 48879.2, 300 sec: 48985.4). Total num frames: 3353247744. Throughput: 0: 48925.1. Samples: 2882062060. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 08:35:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:35:58,438][71000] Updated weights for policy 0, policy_version 204674 (0.0027) [2024-06-13 08:36:00,327][70980] Signal inference workers to stop experience collection... (43100 times) [2024-06-13 08:36:00,329][70980] Signal inference workers to resume experience collection... (43100 times) [2024-06-13 08:36:00,338][71000] InferenceWorker_p0-w0: stopping experience collection (43100 times) [2024-06-13 08:36:00,338][71000] InferenceWorker_p0-w0: resuming experience collection (43100 times) [2024-06-13 08:36:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3353477120. Throughput: 0: 49078.6. Samples: 2882353940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 08:36:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:36:02,083][71000] Updated weights for policy 0, policy_version 204684 (0.0027) [2024-06-13 08:36:05,646][71000] Updated weights for policy 0, policy_version 204694 (0.0030) [2024-06-13 08:36:05,940][70768] Fps is (10 sec: 47512.3, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3353722880. Throughput: 0: 49086.9. Samples: 2882497280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 08:36:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:36:08,711][71000] Updated weights for policy 0, policy_version 204704 (0.0033) [2024-06-13 08:36:10,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 48985.3). Total num frames: 3353952256. Throughput: 0: 48889.3. Samples: 2882786040. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 08:36:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:36:12,140][71000] Updated weights for policy 0, policy_version 204714 (0.0034) [2024-06-13 08:36:15,778][71000] Updated weights for policy 0, policy_version 204724 (0.0032) [2024-06-13 08:36:15,940][70768] Fps is (10 sec: 47514.3, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3354198016. Throughput: 0: 48766.8. Samples: 2883079560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 08:36:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:36:18,774][71000] Updated weights for policy 0, policy_version 204734 (0.0024) [2024-06-13 08:36:20,939][70768] Fps is (10 sec: 49153.2, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3354443776. Throughput: 0: 48771.3. Samples: 2883225960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 08:36:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:36:22,125][71000] Updated weights for policy 0, policy_version 204744 (0.0033) [2024-06-13 08:36:25,684][71000] Updated weights for policy 0, policy_version 204754 (0.0023) [2024-06-13 08:36:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3354689536. Throughput: 0: 48753.7. Samples: 2883514220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 08:36:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:36:28,754][71000] Updated weights for policy 0, policy_version 204764 (0.0032) [2024-06-13 08:36:30,940][70768] Fps is (10 sec: 50788.8, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 3354951680. Throughput: 0: 48883.8. Samples: 2883814980. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 08:36:30,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:36:32,265][71000] Updated weights for policy 0, policy_version 204774 (0.0034) [2024-06-13 08:36:35,671][71000] Updated weights for policy 0, policy_version 204784 (0.0036) [2024-06-13 08:36:35,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3355181056. Throughput: 0: 48783.3. Samples: 2883962340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 08:36:35,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:36:38,922][71000] Updated weights for policy 0, policy_version 204794 (0.0028) [2024-06-13 08:36:40,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49151.8, 300 sec: 49040.9). Total num frames: 3355426816. Throughput: 0: 48629.4. Samples: 2884250400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-13 08:36:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:36:42,378][71000] Updated weights for policy 0, policy_version 204804 (0.0036) [2024-06-13 08:36:45,909][71000] Updated weights for policy 0, policy_version 204814 (0.0033) [2024-06-13 08:36:45,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3355672576. Throughput: 0: 48675.7. Samples: 2884544340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 08:36:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:36:49,178][71000] Updated weights for policy 0, policy_version 204824 (0.0029) [2024-06-13 08:36:50,939][70768] Fps is (10 sec: 50791.3, 60 sec: 49152.1, 300 sec: 49096.4). Total num frames: 3355934720. Throughput: 0: 48829.5. Samples: 2884694600. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 08:36:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:36:52,357][71000] Updated weights for policy 0, policy_version 204834 (0.0027) [2024-06-13 08:36:55,651][71000] Updated weights for policy 0, policy_version 204844 (0.0026) [2024-06-13 08:36:55,941][70768] Fps is (10 sec: 49144.9, 60 sec: 48604.6, 300 sec: 48985.8). Total num frames: 3356164096. Throughput: 0: 48828.4. Samples: 2884983380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 08:36:55,942][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:36:59,142][71000] Updated weights for policy 0, policy_version 204854 (0.0031) [2024-06-13 08:37:00,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.1, 300 sec: 49096.5). Total num frames: 3356409856. Throughput: 0: 48938.3. Samples: 2885281780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 08:37:00,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 08:37:02,465][71000] Updated weights for policy 0, policy_version 204864 (0.0031) [2024-06-13 08:37:05,940][70768] Fps is (10 sec: 47520.3, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3356639232. Throughput: 0: 48868.3. Samples: 2885425040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 08:37:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:37:06,143][71000] Updated weights for policy 0, policy_version 204874 (0.0033) [2024-06-13 08:37:09,470][71000] Updated weights for policy 0, policy_version 204884 (0.0026) [2024-06-13 08:37:10,940][70768] Fps is (10 sec: 50789.4, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3356917760. Throughput: 0: 49085.2. Samples: 2885723060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 08:37:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:37:12,635][71000] Updated weights for policy 0, policy_version 204894 (0.0030) [2024-06-13 08:37:13,321][70980] Signal inference workers to stop experience collection... (43150 times) [2024-06-13 08:37:13,321][70980] Signal inference workers to resume experience collection... (43150 times) [2024-06-13 08:37:13,341][71000] InferenceWorker_p0-w0: stopping experience collection (43150 times) [2024-06-13 08:37:13,341][71000] InferenceWorker_p0-w0: resuming experience collection (43150 times) [2024-06-13 08:37:15,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3357130752. Throughput: 0: 48822.7. Samples: 2886012000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 08:37:15,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:37:16,087][71000] Updated weights for policy 0, policy_version 204904 (0.0021) [2024-06-13 08:37:19,443][71000] Updated weights for policy 0, policy_version 204914 (0.0025) [2024-06-13 08:37:20,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 3357376512. Throughput: 0: 48871.9. Samples: 2886161580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 08:37:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:37:22,546][71000] Updated weights for policy 0, policy_version 204924 (0.0026) [2024-06-13 08:37:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.8, 300 sec: 48985.3). Total num frames: 3357622272. Throughput: 0: 48833.3. Samples: 2886447900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 08:37:25,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:37:26,193][71000] Updated weights for policy 0, policy_version 204934 (0.0026) [2024-06-13 08:37:29,267][71000] Updated weights for policy 0, policy_version 204944 (0.0030) [2024-06-13 08:37:30,939][70768] Fps is (10 sec: 50790.6, 60 sec: 48879.1, 300 sec: 49040.9). Total num frames: 3357884416. Throughput: 0: 48967.6. Samples: 2886747880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 08:37:30,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:37:32,715][71000] Updated weights for policy 0, policy_version 204954 (0.0023) [2024-06-13 08:37:35,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 3358113792. Throughput: 0: 49040.4. Samples: 2886901420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 08:37:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:37:35,996][71000] Updated weights for policy 0, policy_version 204964 (0.0023) [2024-06-13 08:37:39,503][71000] Updated weights for policy 0, policy_version 204974 (0.0029) [2024-06-13 08:37:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3358375936. Throughput: 0: 49067.8. Samples: 2887191360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 08:37:40,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:37:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000204979_3358375936.pth... [2024-06-13 08:37:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000204262_3346628608.pth [2024-06-13 08:37:42,562][71000] Updated weights for policy 0, policy_version 204984 (0.0025) [2024-06-13 08:37:45,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3358605312. Throughput: 0: 49013.8. Samples: 2887487400. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-13 08:37:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:37:46,045][71000] Updated weights for policy 0, policy_version 204994 (0.0036) [2024-06-13 08:37:49,202][71000] Updated weights for policy 0, policy_version 205004 (0.0025) [2024-06-13 08:37:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3358867456. Throughput: 0: 48968.5. Samples: 2887628620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 08:37:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:37:52,750][71000] Updated weights for policy 0, policy_version 205014 (0.0030) [2024-06-13 08:37:55,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48607.0, 300 sec: 48929.8). Total num frames: 3359080448. Throughput: 0: 48801.9. Samples: 2887919140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 08:37:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:37:56,256][71000] Updated weights for policy 0, policy_version 205024 (0.0029) [2024-06-13 08:37:59,526][71000] Updated weights for policy 0, policy_version 205034 (0.0033) [2024-06-13 08:38:00,940][70768] Fps is (10 sec: 47512.5, 60 sec: 48878.7, 300 sec: 48985.3). Total num frames: 3359342592. Throughput: 0: 49042.2. Samples: 2888218900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 08:38:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:38:03,021][71000] Updated weights for policy 0, policy_version 205044 (0.0037) [2024-06-13 08:38:05,940][70768] Fps is (10 sec: 50790.5, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 3359588352. Throughput: 0: 48826.2. Samples: 2888358760. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 08:38:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:38:06,062][71000] Updated weights for policy 0, policy_version 205054 (0.0029) [2024-06-13 08:38:09,573][71000] Updated weights for policy 0, policy_version 205064 (0.0026) [2024-06-13 08:38:10,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3359850496. Throughput: 0: 49046.7. Samples: 2888655000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 08:38:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:38:12,411][71000] Updated weights for policy 0, policy_version 205074 (0.0023) [2024-06-13 08:38:15,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3360079872. Throughput: 0: 48838.2. Samples: 2888945600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 08:38:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:38:16,358][71000] Updated weights for policy 0, policy_version 205084 (0.0034) [2024-06-13 08:38:19,351][71000] Updated weights for policy 0, policy_version 205094 (0.0028) [2024-06-13 08:38:20,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3360309248. Throughput: 0: 48852.3. Samples: 2889099780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 08:38:20,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:38:23,036][71000] Updated weights for policy 0, policy_version 205104 (0.0033) [2024-06-13 08:38:25,848][70980] Signal inference workers to stop experience collection... (43200 times) [2024-06-13 08:38:25,848][70980] Signal inference workers to resume experience collection... 
(43200 times) [2024-06-13 08:38:25,863][71000] InferenceWorker_p0-w0: stopping experience collection (43200 times) [2024-06-13 08:38:25,863][71000] InferenceWorker_p0-w0: resuming experience collection (43200 times) [2024-06-13 08:38:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 3360571392. Throughput: 0: 48647.1. Samples: 2889380480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 08:38:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:38:25,989][71000] Updated weights for policy 0, policy_version 205114 (0.0026) [2024-06-13 08:38:29,591][71000] Updated weights for policy 0, policy_version 205124 (0.0032) [2024-06-13 08:38:30,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.7, 300 sec: 48874.3). Total num frames: 3360800768. Throughput: 0: 48671.4. Samples: 2889677620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 08:38:30,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 08:38:32,714][71000] Updated weights for policy 0, policy_version 205134 (0.0024) [2024-06-13 08:38:35,942][70768] Fps is (10 sec: 47503.5, 60 sec: 48877.2, 300 sec: 48873.9). Total num frames: 3361046528. Throughput: 0: 48646.1. Samples: 2889817800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 08:38:35,943][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:38:36,353][71000] Updated weights for policy 0, policy_version 205144 (0.0022) [2024-06-13 08:38:39,565][71000] Updated weights for policy 0, policy_version 205154 (0.0034) [2024-06-13 08:38:40,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 3361292288. Throughput: 0: 48823.5. Samples: 2890116200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 08:38:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:38:43,185][71000] Updated weights for policy 0, policy_version 205164 (0.0042) [2024-06-13 08:38:45,940][70768] Fps is (10 sec: 49162.5, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3361538048. Throughput: 0: 48649.5. Samples: 2890408120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 08:38:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:38:46,230][71000] Updated weights for policy 0, policy_version 205174 (0.0022) [2024-06-13 08:38:49,595][71000] Updated weights for policy 0, policy_version 205184 (0.0029) [2024-06-13 08:38:50,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48059.7, 300 sec: 48707.7). Total num frames: 3361751040. Throughput: 0: 48744.9. Samples: 2890552280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 08:38:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:38:52,985][71000] Updated weights for policy 0, policy_version 205194 (0.0029) [2024-06-13 08:38:55,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 3362029568. Throughput: 0: 48697.6. Samples: 2890846380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-13 08:38:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:38:56,368][71000] Updated weights for policy 0, policy_version 205204 (0.0022) [2024-06-13 08:38:59,763][71000] Updated weights for policy 0, policy_version 205214 (0.0027) [2024-06-13 08:39:00,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 3362258944. Throughput: 0: 48867.5. Samples: 2891144640. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 08:39:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:39:03,028][71000] Updated weights for policy 0, policy_version 205224 (0.0033) [2024-06-13 08:39:05,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48878.9, 300 sec: 48818.7). Total num frames: 3362521088. Throughput: 0: 48596.5. Samples: 2891286620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 08:39:05,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:39:06,257][71000] Updated weights for policy 0, policy_version 205234 (0.0028) [2024-06-13 08:39:09,582][71000] Updated weights for policy 0, policy_version 205244 (0.0028) [2024-06-13 08:39:10,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3362766848. Throughput: 0: 48977.2. Samples: 2891584460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 08:39:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:39:13,014][71000] Updated weights for policy 0, policy_version 205254 (0.0027) [2024-06-13 08:39:15,940][70768] Fps is (10 sec: 47514.2, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 3362996224. Throughput: 0: 48701.9. Samples: 2891869200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 08:39:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:39:16,366][71000] Updated weights for policy 0, policy_version 205264 (0.0026) [2024-06-13 08:39:20,210][71000] Updated weights for policy 0, policy_version 205274 (0.0030) [2024-06-13 08:39:20,939][70768] Fps is (10 sec: 45876.3, 60 sec: 48606.1, 300 sec: 48707.7). Total num frames: 3363225600. Throughput: 0: 48938.0. Samples: 2892019900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 08:39:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:39:23,286][71000] Updated weights for policy 0, policy_version 205284 (0.0027) [2024-06-13 08:39:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 3363471360. Throughput: 0: 48713.8. Samples: 2892308320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 08:39:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:39:26,775][71000] Updated weights for policy 0, policy_version 205294 (0.0026) [2024-06-13 08:39:30,078][71000] Updated weights for policy 0, policy_version 205304 (0.0039) [2024-06-13 08:39:30,940][70768] Fps is (10 sec: 49150.9, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 3363717120. Throughput: 0: 48513.7. Samples: 2892591240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 08:39:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:39:33,369][71000] Updated weights for policy 0, policy_version 205314 (0.0024) [2024-06-13 08:39:35,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48607.5, 300 sec: 48763.2). Total num frames: 3363962880. Throughput: 0: 48572.8. Samples: 2892738060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 08:39:35,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:39:36,821][71000] Updated weights for policy 0, policy_version 205324 (0.0023) [2024-06-13 08:39:39,118][70980] Signal inference workers to stop experience collection... (43250 times) [2024-06-13 08:39:39,166][71000] InferenceWorker_p0-w0: stopping experience collection (43250 times) [2024-06-13 08:39:39,173][70980] Signal inference workers to resume experience collection... 
(43250 times) [2024-06-13 08:39:39,180][71000] InferenceWorker_p0-w0: resuming experience collection (43250 times) [2024-06-13 08:39:40,001][71000] Updated weights for policy 0, policy_version 205334 (0.0030) [2024-06-13 08:39:40,942][70768] Fps is (10 sec: 49139.2, 60 sec: 48603.7, 300 sec: 48762.8). Total num frames: 3364208640. Throughput: 0: 48656.9. Samples: 2893036080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 08:39:40,943][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:39:40,949][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000205335_3364208640.pth... [2024-06-13 08:39:41,031][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000204621_3352510464.pth [2024-06-13 08:39:43,567][71000] Updated weights for policy 0, policy_version 205344 (0.0026) [2024-06-13 08:39:45,939][70768] Fps is (10 sec: 47514.5, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 3364438016. Throughput: 0: 48512.2. Samples: 2893327680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 08:39:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:39:47,016][71000] Updated weights for policy 0, policy_version 205354 (0.0027) [2024-06-13 08:39:50,630][71000] Updated weights for policy 0, policy_version 205364 (0.0026) [2024-06-13 08:39:50,940][70768] Fps is (10 sec: 49165.3, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 3364700160. Throughput: 0: 48497.9. Samples: 2893469020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 08:39:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:39:53,818][71000] Updated weights for policy 0, policy_version 205374 (0.0031) [2024-06-13 08:39:55,940][70768] Fps is (10 sec: 50789.6, 60 sec: 48605.7, 300 sec: 48818.8). Total num frames: 3364945920. Throughput: 0: 48269.4. Samples: 2893756580. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 08:39:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:39:57,378][71000] Updated weights for policy 0, policy_version 205384 (0.0025) [2024-06-13 08:40:00,754][71000] Updated weights for policy 0, policy_version 205394 (0.0030) [2024-06-13 08:40:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48763.3). Total num frames: 3365175296. Throughput: 0: 48478.2. Samples: 2894050720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 08:40:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:40:04,155][71000] Updated weights for policy 0, policy_version 205404 (0.0028) [2024-06-13 08:40:05,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48332.9, 300 sec: 48818.8). Total num frames: 3365421056. Throughput: 0: 48305.2. Samples: 2894193640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 08:40:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:40:07,373][71000] Updated weights for policy 0, policy_version 205414 (0.0025) [2024-06-13 08:40:10,939][70768] Fps is (10 sec: 47514.3, 60 sec: 48059.9, 300 sec: 48707.7). Total num frames: 3365650432. Throughput: 0: 48289.4. Samples: 2894481340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 08:40:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:40:10,987][71000] Updated weights for policy 0, policy_version 205424 (0.0027) [2024-06-13 08:40:14,263][71000] Updated weights for policy 0, policy_version 205434 (0.0029) [2024-06-13 08:40:15,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.8, 300 sec: 48818.7). Total num frames: 3365912576. Throughput: 0: 48272.9. Samples: 2894763520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 08:40:15,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:40:17,770][71000] Updated weights for policy 0, policy_version 205444 (0.0032) [2024-06-13 08:40:20,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48332.7, 300 sec: 48652.2). Total num frames: 3366125568. Throughput: 0: 48289.9. Samples: 2894911100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 08:40:20,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:40:21,208][71000] Updated weights for policy 0, policy_version 205454 (0.0026) [2024-06-13 08:40:24,619][71000] Updated weights for policy 0, policy_version 205464 (0.0027) [2024-06-13 08:40:25,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 3366387712. Throughput: 0: 48250.0. Samples: 2895207200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 08:40:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:40:27,984][71000] Updated weights for policy 0, policy_version 205474 (0.0034) [2024-06-13 08:40:30,940][70768] Fps is (10 sec: 49150.2, 60 sec: 48332.6, 300 sec: 48707.6). Total num frames: 3366617088. Throughput: 0: 48264.4. Samples: 2895499600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 08:40:30,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 08:40:31,273][71000] Updated weights for policy 0, policy_version 205484 (0.0024) [2024-06-13 08:40:34,696][71000] Updated weights for policy 0, policy_version 205494 (0.0030) [2024-06-13 08:40:35,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3366895616. Throughput: 0: 48440.4. Samples: 2895648840. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 08:40:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:40:37,667][71000] Updated weights for policy 0, policy_version 205504 (0.0026) [2024-06-13 08:40:40,939][70768] Fps is (10 sec: 49154.2, 60 sec: 48335.0, 300 sec: 48707.7). Total num frames: 3367108608. Throughput: 0: 48473.9. Samples: 2895937900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 08:40:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:40:41,210][71000] Updated weights for policy 0, policy_version 205514 (0.0027) [2024-06-13 08:40:44,505][71000] Updated weights for policy 0, policy_version 205524 (0.0031) [2024-06-13 08:40:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 3367370752. Throughput: 0: 48373.6. Samples: 2896227540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 08:40:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:40:48,203][71000] Updated weights for policy 0, policy_version 205534 (0.0028) [2024-06-13 08:40:50,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48332.8, 300 sec: 48652.1). Total num frames: 3367600128. Throughput: 0: 48632.0. Samples: 2896382080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 08:40:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:40:51,178][71000] Updated weights for policy 0, policy_version 205544 (0.0027) [2024-06-13 08:40:54,768][71000] Updated weights for policy 0, policy_version 205554 (0.0025) [2024-06-13 08:40:55,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 3367862272. Throughput: 0: 48884.7. Samples: 2896681160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 08:40:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:40:57,079][70980] Signal inference workers to stop experience collection... 
(43300 times) [2024-06-13 08:40:57,130][71000] InferenceWorker_p0-w0: stopping experience collection (43300 times) [2024-06-13 08:40:57,133][70980] Signal inference workers to resume experience collection... (43300 times) [2024-06-13 08:40:57,151][71000] InferenceWorker_p0-w0: resuming experience collection (43300 times) [2024-06-13 08:40:57,683][71000] Updated weights for policy 0, policy_version 205564 (0.0027) [2024-06-13 08:41:00,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 3368091648. Throughput: 0: 49299.7. Samples: 2896982000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 08:41:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:41:01,173][71000] Updated weights for policy 0, policy_version 205574 (0.0028) [2024-06-13 08:41:04,343][71000] Updated weights for policy 0, policy_version 205584 (0.0030) [2024-06-13 08:41:05,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 3368370176. Throughput: 0: 49291.0. Samples: 2897129200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 08:41:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:41:08,090][71000] Updated weights for policy 0, policy_version 205594 (0.0029) [2024-06-13 08:41:10,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49151.9, 300 sec: 48818.8). Total num frames: 3368599552. Throughput: 0: 49027.2. Samples: 2897413420. Policy #0 lag: (min: 2.0, avg: 8.8, max: 19.0) [2024-06-13 08:41:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:41:11,061][71000] Updated weights for policy 0, policy_version 205604 (0.0033) [2024-06-13 08:41:14,813][71000] Updated weights for policy 0, policy_version 205614 (0.0035) [2024-06-13 08:41:15,940][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 3368861696. Throughput: 0: 49160.5. Samples: 2897711800. 
Policy #0 lag: (min: 2.0, avg: 8.8, max: 19.0) [2024-06-13 08:41:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:41:17,776][71000] Updated weights for policy 0, policy_version 205624 (0.0027) [2024-06-13 08:41:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49425.1, 300 sec: 48818.8). Total num frames: 3369091072. Throughput: 0: 49194.8. Samples: 2897862600. Policy #0 lag: (min: 2.0, avg: 8.8, max: 19.0) [2024-06-13 08:41:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:41:20,978][71000] Updated weights for policy 0, policy_version 205634 (0.0032) [2024-06-13 08:41:24,574][71000] Updated weights for policy 0, policy_version 205644 (0.0022) [2024-06-13 08:41:25,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.1, 300 sec: 48763.3). Total num frames: 3369336832. Throughput: 0: 49236.5. Samples: 2898153540. Policy #0 lag: (min: 2.0, avg: 8.8, max: 19.0) [2024-06-13 08:41:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:41:28,087][71000] Updated weights for policy 0, policy_version 205654 (0.0031) [2024-06-13 08:41:30,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.4, 300 sec: 48763.2). Total num frames: 3369566208. Throughput: 0: 49321.1. Samples: 2898446980. Policy #0 lag: (min: 2.0, avg: 8.8, max: 19.0) [2024-06-13 08:41:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:41:31,426][71000] Updated weights for policy 0, policy_version 205664 (0.0032) [2024-06-13 08:41:34,775][71000] Updated weights for policy 0, policy_version 205674 (0.0029) [2024-06-13 08:41:35,939][70768] Fps is (10 sec: 47513.6, 60 sec: 48606.0, 300 sec: 48763.3). Total num frames: 3369811968. Throughput: 0: 48924.5. Samples: 2898583680. 
Policy #0 lag: (min: 2.0, avg: 8.8, max: 19.0) [2024-06-13 08:41:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:41:38,073][71000] Updated weights for policy 0, policy_version 205684 (0.0034) [2024-06-13 08:41:40,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 3370057728. Throughput: 0: 48907.5. Samples: 2898882000. Policy #0 lag: (min: 2.0, avg: 8.8, max: 19.0) [2024-06-13 08:41:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:41:40,946][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000205692_3370057728.pth... [2024-06-13 08:41:40,994][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000204979_3358375936.pth [2024-06-13 08:41:41,405][71000] Updated weights for policy 0, policy_version 205694 (0.0022) [2024-06-13 08:41:44,582][71000] Updated weights for policy 0, policy_version 205704 (0.0031) [2024-06-13 08:41:45,939][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.3, 300 sec: 48818.8). Total num frames: 3370336256. Throughput: 0: 48784.9. Samples: 2899177320. Policy #0 lag: (min: 2.0, avg: 8.8, max: 19.0) [2024-06-13 08:41:45,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:41:48,219][71000] Updated weights for policy 0, policy_version 205714 (0.0026) [2024-06-13 08:41:50,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48878.9, 300 sec: 48707.9). Total num frames: 3370532864. Throughput: 0: 48713.9. Samples: 2899321320. Policy #0 lag: (min: 2.0, avg: 8.8, max: 19.0) [2024-06-13 08:41:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:41:51,658][71000] Updated weights for policy 0, policy_version 205724 (0.0034) [2024-06-13 08:41:54,599][71000] Updated weights for policy 0, policy_version 205734 (0.0028) [2024-06-13 08:41:55,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 3370795008. Throughput: 0: 48908.1. Samples: 2899614280. 
Policy #0 lag: (min: 2.0, avg: 8.8, max: 19.0) [2024-06-13 08:41:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:41:58,315][71000] Updated weights for policy 0, policy_version 205744 (0.0024) [2024-06-13 08:42:00,940][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 3371040768. Throughput: 0: 48759.5. Samples: 2899905980. Policy #0 lag: (min: 2.0, avg: 8.8, max: 19.0) [2024-06-13 08:42:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 08:42:01,437][71000] Updated weights for policy 0, policy_version 205754 (0.0024) [2024-06-13 08:42:02,319][70980] Signal inference workers to stop experience collection... (43350 times) [2024-06-13 08:42:02,320][70980] Signal inference workers to resume experience collection... (43350 times) [2024-06-13 08:42:02,360][71000] InferenceWorker_p0-w0: stopping experience collection (43350 times) [2024-06-13 08:42:02,360][71000] InferenceWorker_p0-w0: resuming experience collection (43350 times) [2024-06-13 08:42:04,940][71000] Updated weights for policy 0, policy_version 205764 (0.0042) [2024-06-13 08:42:05,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 3371286528. Throughput: 0: 48707.4. Samples: 2900054440. Policy #0 lag: (min: 2.0, avg: 8.8, max: 19.0) [2024-06-13 08:42:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:42:08,335][71000] Updated weights for policy 0, policy_version 205774 (0.0031) [2024-06-13 08:42:10,940][70768] Fps is (10 sec: 45875.4, 60 sec: 48332.8, 300 sec: 48707.7). Total num frames: 3371499520. Throughput: 0: 48607.1. Samples: 2900340860. 
Policy #0 lag: (min: 2.0, avg: 8.8, max: 19.0) [2024-06-13 08:42:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:42:11,575][71000] Updated weights for policy 0, policy_version 205784 (0.0033) [2024-06-13 08:42:15,211][71000] Updated weights for policy 0, policy_version 205794 (0.0023) [2024-06-13 08:42:15,939][70768] Fps is (10 sec: 45876.0, 60 sec: 48059.8, 300 sec: 48707.7). Total num frames: 3371745280. Throughput: 0: 48635.6. Samples: 2900635580. Policy #0 lag: (min: 2.0, avg: 8.8, max: 19.0) [2024-06-13 08:42:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:42:18,269][71000] Updated weights for policy 0, policy_version 205804 (0.0025) [2024-06-13 08:42:20,940][70768] Fps is (10 sec: 52428.4, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 3372023808. Throughput: 0: 48934.1. Samples: 2900785720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 08:42:20,949][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:42:21,909][71000] Updated weights for policy 0, policy_version 205814 (0.0036) [2024-06-13 08:42:25,039][71000] Updated weights for policy 0, policy_version 205824 (0.0020) [2024-06-13 08:42:25,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48332.8, 300 sec: 48652.1). Total num frames: 3372236800. Throughput: 0: 48845.0. Samples: 2901080020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 08:42:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:42:28,814][71000] Updated weights for policy 0, policy_version 205834 (0.0030) [2024-06-13 08:42:30,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 3372498944. Throughput: 0: 48728.8. Samples: 2901370120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 08:42:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:42:31,689][71000] Updated weights for policy 0, policy_version 205844 (0.0027) [2024-06-13 08:42:35,380][71000] Updated weights for policy 0, policy_version 205854 (0.0022) [2024-06-13 08:42:35,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48605.7, 300 sec: 48652.1). Total num frames: 3372728320. Throughput: 0: 48718.1. Samples: 2901513640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 08:42:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:42:38,359][71000] Updated weights for policy 0, policy_version 205864 (0.0042) [2024-06-13 08:42:40,939][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 3372990464. Throughput: 0: 48772.9. Samples: 2901809060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 08:42:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:42:42,257][71000] Updated weights for policy 0, policy_version 205874 (0.0028) [2024-06-13 08:42:45,122][71000] Updated weights for policy 0, policy_version 205884 (0.0029) [2024-06-13 08:42:45,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48059.6, 300 sec: 48652.1). Total num frames: 3373219840. Throughput: 0: 48667.1. Samples: 2902096000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 08:42:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:42:49,022][71000] Updated weights for policy 0, policy_version 205894 (0.0029) [2024-06-13 08:42:50,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 3373481984. Throughput: 0: 48862.8. Samples: 2902253260. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 08:42:50,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:42:52,212][71000] Updated weights for policy 0, policy_version 205904 (0.0029) [2024-06-13 08:42:55,767][71000] Updated weights for policy 0, policy_version 205914 (0.0026) [2024-06-13 08:42:55,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48332.7, 300 sec: 48652.2). Total num frames: 3373694976. Throughput: 0: 48722.5. Samples: 2902533380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 08:42:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:42:58,673][71000] Updated weights for policy 0, policy_version 205924 (0.0039) [2024-06-13 08:43:00,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 3373973504. Throughput: 0: 48561.4. Samples: 2902820840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 08:43:00,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:43:02,543][71000] Updated weights for policy 0, policy_version 205934 (0.0031) [2024-06-13 08:43:05,273][71000] Updated weights for policy 0, policy_version 205944 (0.0032) [2024-06-13 08:43:05,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48332.9, 300 sec: 48596.6). Total num frames: 3374186496. Throughput: 0: 48768.9. Samples: 2902980320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 08:43:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:43:08,991][71000] Updated weights for policy 0, policy_version 205954 (0.0031) [2024-06-13 08:43:10,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 48763.2). Total num frames: 3374465024. Throughput: 0: 48765.3. Samples: 2903274460. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 08:43:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:43:11,708][71000] Updated weights for policy 0, policy_version 205964 (0.0032) [2024-06-13 08:43:15,481][71000] Updated weights for policy 0, policy_version 205974 (0.0022) [2024-06-13 08:43:15,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.8, 300 sec: 48707.7). Total num frames: 3374678016. Throughput: 0: 48992.8. Samples: 2903574800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 08:43:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:43:18,505][71000] Updated weights for policy 0, policy_version 205984 (0.0026) [2024-06-13 08:43:20,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 3374956544. Throughput: 0: 48857.0. Samples: 2903712200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 08:43:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:43:22,307][71000] Updated weights for policy 0, policy_version 205994 (0.0034) [2024-06-13 08:43:24,175][70980] Signal inference workers to stop experience collection... (43400 times) [2024-06-13 08:43:24,176][70980] Signal inference workers to resume experience collection... (43400 times) [2024-06-13 08:43:24,197][71000] InferenceWorker_p0-w0: stopping experience collection (43400 times) [2024-06-13 08:43:24,197][71000] InferenceWorker_p0-w0: resuming experience collection (43400 times) [2024-06-13 08:43:25,062][71000] Updated weights for policy 0, policy_version 206004 (0.0029) [2024-06-13 08:43:25,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 3375169536. Throughput: 0: 48757.2. Samples: 2904003140. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-13 08:43:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:43:28,874][71000] Updated weights for policy 0, policy_version 206014 (0.0024) [2024-06-13 08:43:30,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 48819.1). Total num frames: 3375448064. Throughput: 0: 49149.8. Samples: 2904307740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 08:43:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:43:31,724][71000] Updated weights for policy 0, policy_version 206024 (0.0027) [2024-06-13 08:43:35,422][71000] Updated weights for policy 0, policy_version 206034 (0.0022) [2024-06-13 08:43:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 3375661056. Throughput: 0: 48823.1. Samples: 2904450300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 08:43:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:43:38,267][71000] Updated weights for policy 0, policy_version 206044 (0.0020) [2024-06-13 08:43:40,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48878.7, 300 sec: 48763.2). Total num frames: 3375923200. Throughput: 0: 49029.2. Samples: 2904739700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 08:43:40,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:43:40,947][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000206050_3375923200.pth... [2024-06-13 08:43:40,990][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000205335_3364208640.pth [2024-06-13 08:43:42,323][71000] Updated weights for policy 0, policy_version 206054 (0.0033) [2024-06-13 08:43:45,094][71000] Updated weights for policy 0, policy_version 206064 (0.0026) [2024-06-13 08:43:45,939][70768] Fps is (10 sec: 49152.4, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3376152576. Throughput: 0: 48895.5. Samples: 2905021140. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 08:43:45,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:43:48,839][71000] Updated weights for policy 0, policy_version 206074 (0.0029) [2024-06-13 08:43:50,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 3376414720. Throughput: 0: 48709.7. Samples: 2905172260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 08:43:50,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 08:43:51,737][71000] Updated weights for policy 0, policy_version 206084 (0.0026) [2024-06-13 08:43:54,995][71000] Updated weights for policy 0, policy_version 206094 (0.0022) [2024-06-13 08:43:55,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 3376644096. Throughput: 0: 49184.7. Samples: 2905487780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 08:43:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:43:58,294][71000] Updated weights for policy 0, policy_version 206104 (0.0023) [2024-06-13 08:44:00,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 3376922624. Throughput: 0: 49182.7. Samples: 2905788020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 08:44:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:44:02,078][71000] Updated weights for policy 0, policy_version 206114 (0.0023) [2024-06-13 08:44:04,892][71000] Updated weights for policy 0, policy_version 206124 (0.0030) [2024-06-13 08:44:05,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49425.0, 300 sec: 48763.2). Total num frames: 3377152000. Throughput: 0: 49378.6. Samples: 2905934240. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 08:44:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:44:08,740][71000] Updated weights for policy 0, policy_version 206134 (0.0032) [2024-06-13 08:44:10,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3377414144. Throughput: 0: 49507.1. Samples: 2906230960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 08:44:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:44:11,559][71000] Updated weights for policy 0, policy_version 206144 (0.0030) [2024-06-13 08:44:15,191][71000] Updated weights for policy 0, policy_version 206154 (0.0020) [2024-06-13 08:44:15,939][70768] Fps is (10 sec: 49153.0, 60 sec: 49425.2, 300 sec: 48874.3). Total num frames: 3377643520. Throughput: 0: 49265.9. Samples: 2906524700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 08:44:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:44:18,333][71000] Updated weights for policy 0, policy_version 206164 (0.0027) [2024-06-13 08:44:20,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3377889280. Throughput: 0: 49523.4. Samples: 2906678860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 08:44:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:44:21,697][70980] Signal inference workers to stop experience collection... (43450 times) [2024-06-13 08:44:21,752][71000] InferenceWorker_p0-w0: stopping experience collection (43450 times) [2024-06-13 08:44:21,754][70980] Signal inference workers to resume experience collection... 
(43450 times) [2024-06-13 08:44:21,762][71000] InferenceWorker_p0-w0: resuming experience collection (43450 times) [2024-06-13 08:44:21,887][71000] Updated weights for policy 0, policy_version 206174 (0.0035) [2024-06-13 08:44:24,858][71000] Updated weights for policy 0, policy_version 206184 (0.0033) [2024-06-13 08:44:25,940][70768] Fps is (10 sec: 52427.7, 60 sec: 49971.1, 300 sec: 48985.4). Total num frames: 3378167808. Throughput: 0: 49669.4. Samples: 2906974820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 08:44:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:44:28,307][71000] Updated weights for policy 0, policy_version 206194 (0.0032) [2024-06-13 08:44:30,940][70768] Fps is (10 sec: 50791.0, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 3378397184. Throughput: 0: 49913.7. Samples: 2907267260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-13 08:44:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:44:31,483][71000] Updated weights for policy 0, policy_version 206204 (0.0029) [2024-06-13 08:44:35,187][71000] Updated weights for policy 0, policy_version 206214 (0.0029) [2024-06-13 08:44:35,939][70768] Fps is (10 sec: 47514.7, 60 sec: 49698.2, 300 sec: 48930.3). Total num frames: 3378642944. Throughput: 0: 49962.5. Samples: 2907420560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 08:44:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:44:38,083][71000] Updated weights for policy 0, policy_version 206224 (0.0021) [2024-06-13 08:44:40,940][70768] Fps is (10 sec: 47513.6, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 3378872320. Throughput: 0: 49422.0. Samples: 2907711760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 08:44:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:44:41,595][71000] Updated weights for policy 0, policy_version 206234 (0.0029) [2024-06-13 08:44:44,842][71000] Updated weights for policy 0, policy_version 206244 (0.0025) [2024-06-13 08:44:45,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49971.2, 300 sec: 48985.4). Total num frames: 3379150848. Throughput: 0: 49256.0. Samples: 2908004540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 08:44:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:44:48,276][71000] Updated weights for policy 0, policy_version 206254 (0.0033) [2024-06-13 08:44:50,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 48929.9). Total num frames: 3379380224. Throughput: 0: 49584.1. Samples: 2908165520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 08:44:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:44:51,293][71000] Updated weights for policy 0, policy_version 206264 (0.0032) [2024-06-13 08:44:55,103][71000] Updated weights for policy 0, policy_version 206274 (0.0030) [2024-06-13 08:44:55,940][70768] Fps is (10 sec: 49151.3, 60 sec: 49971.2, 300 sec: 49040.9). Total num frames: 3379642368. Throughput: 0: 49574.1. Samples: 2908461800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 08:44:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:44:58,110][71000] Updated weights for policy 0, policy_version 206284 (0.0028) [2024-06-13 08:45:00,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3379855360. Throughput: 0: 49502.9. Samples: 2908752340. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 08:45:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:45:01,655][71000] Updated weights for policy 0, policy_version 206294 (0.0034) [2024-06-13 08:45:04,811][71000] Updated weights for policy 0, policy_version 206304 (0.0021) [2024-06-13 08:45:05,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49698.2, 300 sec: 49096.4). Total num frames: 3380133888. Throughput: 0: 49261.9. Samples: 2908895640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 08:45:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:45:08,262][71000] Updated weights for policy 0, policy_version 206314 (0.0025) [2024-06-13 08:45:10,940][70768] Fps is (10 sec: 52428.9, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3380379648. Throughput: 0: 49439.1. Samples: 2909199580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 08:45:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:45:11,517][71000] Updated weights for policy 0, policy_version 206324 (0.0031) [2024-06-13 08:45:14,946][71000] Updated weights for policy 0, policy_version 206334 (0.0024) [2024-06-13 08:45:15,940][70768] Fps is (10 sec: 50790.6, 60 sec: 49971.1, 300 sec: 49207.5). Total num frames: 3380641792. Throughput: 0: 49311.2. Samples: 2909486260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 08:45:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:45:18,121][71000] Updated weights for policy 0, policy_version 206344 (0.0030) [2024-06-13 08:45:20,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49698.2, 300 sec: 49096.5). Total num frames: 3380871168. Throughput: 0: 49258.9. Samples: 2909637220. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 08:45:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:45:21,430][71000] Updated weights for policy 0, policy_version 206354 (0.0021) [2024-06-13 08:45:24,825][71000] Updated weights for policy 0, policy_version 206364 (0.0022) [2024-06-13 08:45:25,939][70768] Fps is (10 sec: 47513.8, 60 sec: 49152.1, 300 sec: 49152.1). Total num frames: 3381116928. Throughput: 0: 49321.0. Samples: 2909931200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 08:45:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:45:27,702][70980] Signal inference workers to stop experience collection... (43500 times) [2024-06-13 08:45:27,753][71000] InferenceWorker_p0-w0: stopping experience collection (43500 times) [2024-06-13 08:45:27,762][70980] Signal inference workers to resume experience collection... (43500 times) [2024-06-13 08:45:27,763][71000] InferenceWorker_p0-w0: resuming experience collection (43500 times) [2024-06-13 08:45:28,236][71000] Updated weights for policy 0, policy_version 206374 (0.0034) [2024-06-13 08:45:30,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3381362688. Throughput: 0: 49359.9. Samples: 2910225740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 08:45:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:45:31,639][71000] Updated weights for policy 0, policy_version 206384 (0.0027) [2024-06-13 08:45:34,772][71000] Updated weights for policy 0, policy_version 206394 (0.0025) [2024-06-13 08:45:35,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49424.9, 300 sec: 49152.0). Total num frames: 3381608448. Throughput: 0: 49171.9. Samples: 2910378260. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 08:45:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:45:38,018][71000] Updated weights for policy 0, policy_version 206404 (0.0028) [2024-06-13 08:45:40,940][70768] Fps is (10 sec: 47514.3, 60 sec: 49425.1, 300 sec: 49041.0). Total num frames: 3381837824. Throughput: 0: 49059.3. Samples: 2910669460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-13 08:45:40,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 08:45:40,946][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000206411_3381837824.pth... [2024-06-13 08:45:41,002][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000205692_3370057728.pth [2024-06-13 08:45:41,564][71000] Updated weights for policy 0, policy_version 206414 (0.0025) [2024-06-13 08:45:44,887][71000] Updated weights for policy 0, policy_version 206424 (0.0029) [2024-06-13 08:45:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.8, 300 sec: 49096.4). Total num frames: 3382083584. Throughput: 0: 48992.0. Samples: 2910956980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:45:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:45:48,575][71000] Updated weights for policy 0, policy_version 206434 (0.0033) [2024-06-13 08:45:50,940][70768] Fps is (10 sec: 50789.6, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 3382345728. Throughput: 0: 49204.3. Samples: 2911109840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:45:50,952][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 08:45:51,511][71000] Updated weights for policy 0, policy_version 206444 (0.0030) [2024-06-13 08:45:55,074][71000] Updated weights for policy 0, policy_version 206454 (0.0021) [2024-06-13 08:45:55,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3382591488. Throughput: 0: 49042.3. Samples: 2911406480. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:45:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:45:58,567][71000] Updated weights for policy 0, policy_version 206464 (0.0021) [2024-06-13 08:46:00,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 3382804480. Throughput: 0: 49153.2. Samples: 2911698160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:46:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:46:01,702][71000] Updated weights for policy 0, policy_version 206474 (0.0028) [2024-06-13 08:46:05,177][71000] Updated weights for policy 0, policy_version 206484 (0.0031) [2024-06-13 08:46:05,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 3383050240. Throughput: 0: 48855.5. Samples: 2911835720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:46:05,940][70768] Avg episode reward: [(0, '0.277')] [2024-06-13 08:46:08,531][71000] Updated weights for policy 0, policy_version 206494 (0.0032) [2024-06-13 08:46:10,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3383328768. Throughput: 0: 48745.1. Samples: 2912124740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:46:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:46:12,003][71000] Updated weights for policy 0, policy_version 206504 (0.0027) [2024-06-13 08:46:15,186][71000] Updated weights for policy 0, policy_version 206514 (0.0031) [2024-06-13 08:46:15,940][70768] Fps is (10 sec: 52429.4, 60 sec: 48878.9, 300 sec: 49096.5). Total num frames: 3383574528. Throughput: 0: 48934.7. Samples: 2912427800. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:46:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:46:18,349][71000] Updated weights for policy 0, policy_version 206524 (0.0031) [2024-06-13 08:46:20,940][70768] Fps is (10 sec: 44237.6, 60 sec: 48332.9, 300 sec: 48929.8). Total num frames: 3383771136. Throughput: 0: 48760.6. Samples: 2912572480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:46:20,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 08:46:21,699][71000] Updated weights for policy 0, policy_version 206534 (0.0039) [2024-06-13 08:46:25,461][71000] Updated weights for policy 0, policy_version 206544 (0.0027) [2024-06-13 08:46:25,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48605.7, 300 sec: 49040.9). Total num frames: 3384033280. Throughput: 0: 48762.5. Samples: 2912863780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:46:25,940][70768] Avg episode reward: [(0, '0.279')] [2024-06-13 08:46:28,646][71000] Updated weights for policy 0, policy_version 206554 (0.0025) [2024-06-13 08:46:29,293][70980] Signal inference workers to stop experience collection... (43550 times) [2024-06-13 08:46:29,297][70980] Signal inference workers to resume experience collection... (43550 times) [2024-06-13 08:46:29,322][71000] InferenceWorker_p0-w0: stopping experience collection (43550 times) [2024-06-13 08:46:29,322][71000] InferenceWorker_p0-w0: resuming experience collection (43550 times) [2024-06-13 08:46:30,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48879.0, 300 sec: 49096.4). Total num frames: 3384295424. Throughput: 0: 48784.1. Samples: 2913152260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:46:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:46:32,374][71000] Updated weights for policy 0, policy_version 206564 (0.0031) [2024-06-13 08:46:35,140][71000] Updated weights for policy 0, policy_version 206574 (0.0028) [2024-06-13 08:46:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 3384524800. Throughput: 0: 48800.4. Samples: 2913305860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:46:35,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 08:46:38,649][71000] Updated weights for policy 0, policy_version 206584 (0.0031) [2024-06-13 08:46:40,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3384770560. Throughput: 0: 48747.2. Samples: 2913600100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:46:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:46:42,056][71000] Updated weights for policy 0, policy_version 206594 (0.0032) [2024-06-13 08:46:45,714][71000] Updated weights for policy 0, policy_version 206604 (0.0034) [2024-06-13 08:46:45,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.8, 300 sec: 49040.9). Total num frames: 3384999936. Throughput: 0: 48697.3. Samples: 2913889540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 08:46:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:46:48,584][71000] Updated weights for policy 0, policy_version 206614 (0.0034) [2024-06-13 08:46:50,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3385278464. Throughput: 0: 48853.0. Samples: 2914034100. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 08:46:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:46:52,299][71000] Updated weights for policy 0, policy_version 206624 (0.0026) [2024-06-13 08:46:55,201][71000] Updated weights for policy 0, policy_version 206634 (0.0032) [2024-06-13 08:46:55,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48878.9, 300 sec: 49096.4). Total num frames: 3385524224. Throughput: 0: 49069.3. Samples: 2914332860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 08:46:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:46:59,275][71000] Updated weights for policy 0, policy_version 206644 (0.0025) [2024-06-13 08:47:00,940][70768] Fps is (10 sec: 47513.0, 60 sec: 49152.0, 300 sec: 49040.9). Total num frames: 3385753600. Throughput: 0: 49032.3. Samples: 2914634260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 08:47:00,940][70768] Avg episode reward: [(0, '0.271')] [2024-06-13 08:47:01,811][71000] Updated weights for policy 0, policy_version 206654 (0.0029) [2024-06-13 08:47:05,940][70768] Fps is (10 sec: 44237.4, 60 sec: 48606.0, 300 sec: 49040.9). Total num frames: 3385966592. Throughput: 0: 48752.4. Samples: 2914766340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 08:47:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:47:06,073][71000] Updated weights for policy 0, policy_version 206664 (0.0041) [2024-06-13 08:47:08,928][71000] Updated weights for policy 0, policy_version 206674 (0.0029) [2024-06-13 08:47:10,940][70768] Fps is (10 sec: 50791.2, 60 sec: 48879.1, 300 sec: 49207.5). Total num frames: 3386261504. Throughput: 0: 48855.7. Samples: 2915062280. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 08:47:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:47:12,502][71000] Updated weights for policy 0, policy_version 206684 (0.0031) [2024-06-13 08:47:15,670][71000] Updated weights for policy 0, policy_version 206694 (0.0026) [2024-06-13 08:47:15,939][70768] Fps is (10 sec: 52429.1, 60 sec: 48605.9, 300 sec: 49040.9). Total num frames: 3386490880. Throughput: 0: 49045.8. Samples: 2915359320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 08:47:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:47:19,087][71000] Updated weights for policy 0, policy_version 206704 (0.0023) [2024-06-13 08:47:20,939][70768] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3386720256. Throughput: 0: 48888.2. Samples: 2915505820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 08:47:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:47:22,110][71000] Updated weights for policy 0, policy_version 206714 (0.0026) [2024-06-13 08:47:25,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 3386949632. Throughput: 0: 48665.3. Samples: 2915790040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 08:47:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:47:25,949][71000] Updated weights for policy 0, policy_version 206724 (0.0025) [2024-06-13 08:47:28,934][71000] Updated weights for policy 0, policy_version 206734 (0.0022) [2024-06-13 08:47:30,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49425.1, 300 sec: 49263.1). Total num frames: 3387260928. Throughput: 0: 48869.1. Samples: 2916088640. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 08:47:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:47:32,483][71000] Updated weights for policy 0, policy_version 206744 (0.0032) [2024-06-13 08:47:33,501][70980] Signal inference workers to stop experience collection... (43600 times) [2024-06-13 08:47:33,520][71000] InferenceWorker_p0-w0: stopping experience collection (43600 times) [2024-06-13 08:47:33,556][70980] Signal inference workers to resume experience collection... (43600 times) [2024-06-13 08:47:33,557][71000] InferenceWorker_p0-w0: resuming experience collection (43600 times) [2024-06-13 08:47:35,503][71000] Updated weights for policy 0, policy_version 206754 (0.0021) [2024-06-13 08:47:35,940][70768] Fps is (10 sec: 52428.5, 60 sec: 49152.1, 300 sec: 49096.4). Total num frames: 3387473920. Throughput: 0: 49439.5. Samples: 2916258880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 08:47:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:47:38,921][71000] Updated weights for policy 0, policy_version 206764 (0.0026) [2024-06-13 08:47:40,940][70768] Fps is (10 sec: 45874.9, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3387719680. Throughput: 0: 49451.7. Samples: 2916558180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 08:47:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:47:40,950][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000206770_3387719680.pth... [2024-06-13 08:47:40,990][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000206050_3375923200.pth [2024-06-13 08:47:41,845][71000] Updated weights for policy 0, policy_version 206774 (0.0022) [2024-06-13 08:47:45,692][71000] Updated weights for policy 0, policy_version 206784 (0.0031) [2024-06-13 08:47:45,940][70768] Fps is (10 sec: 47513.5, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3387949056. Throughput: 0: 49361.4. Samples: 2916855520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 08:47:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:47:48,350][71000] Updated weights for policy 0, policy_version 206794 (0.0032) [2024-06-13 08:47:50,939][70768] Fps is (10 sec: 52429.4, 60 sec: 49425.1, 300 sec: 49318.6). Total num frames: 3388243968. Throughput: 0: 49354.8. Samples: 2916987300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 08:47:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:47:53,050][71000] Updated weights for policy 0, policy_version 206804 (0.0044) [2024-06-13 08:47:55,130][71000] Updated weights for policy 0, policy_version 206814 (0.0030) [2024-06-13 08:47:55,944][70768] Fps is (10 sec: 50769.0, 60 sec: 48875.5, 300 sec: 49095.7). Total num frames: 3388456960. Throughput: 0: 49304.6. Samples: 2917281200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 08:47:55,944][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:47:59,634][71000] Updated weights for policy 0, policy_version 206824 (0.0025) [2024-06-13 08:48:00,940][70768] Fps is (10 sec: 45874.3, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 3388702720. Throughput: 0: 49332.2. Samples: 2917579280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:48:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:48:02,064][71000] Updated weights for policy 0, policy_version 206834 (0.0028) [2024-06-13 08:48:05,939][70768] Fps is (10 sec: 45895.2, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3388915712. Throughput: 0: 49018.2. Samples: 2917711640. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:48:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:48:06,138][71000] Updated weights for policy 0, policy_version 206844 (0.0030) [2024-06-13 08:48:08,522][71000] Updated weights for policy 0, policy_version 206854 (0.0029) [2024-06-13 08:48:10,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 3389194240. Throughput: 0: 49320.4. Samples: 2918009460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:48:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:48:12,836][71000] Updated weights for policy 0, policy_version 206864 (0.0036) [2024-06-13 08:48:15,267][71000] Updated weights for policy 0, policy_version 206874 (0.0031) [2024-06-13 08:48:15,940][70768] Fps is (10 sec: 52428.2, 60 sec: 49151.9, 300 sec: 49096.5). Total num frames: 3389440000. Throughput: 0: 49153.7. Samples: 2918300560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:48:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:48:19,788][71000] Updated weights for policy 0, policy_version 206884 (0.0021) [2024-06-13 08:48:20,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49424.9, 300 sec: 49207.5). Total num frames: 3389685760. Throughput: 0: 48637.2. Samples: 2918447560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:48:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:48:22,004][71000] Updated weights for policy 0, policy_version 206894 (0.0021) [2024-06-13 08:48:25,940][70768] Fps is (10 sec: 44237.3, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3389882368. Throughput: 0: 48619.2. Samples: 2918746040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:48:25,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:48:26,036][70980] Signal inference workers to stop experience collection... 
(43650 times) [2024-06-13 08:48:26,037][70980] Signal inference workers to resume experience collection... (43650 times) [2024-06-13 08:48:26,081][71000] InferenceWorker_p0-w0: stopping experience collection (43650 times) [2024-06-13 08:48:26,081][71000] InferenceWorker_p0-w0: resuming experience collection (43650 times) [2024-06-13 08:48:26,166][71000] Updated weights for policy 0, policy_version 206904 (0.0038) [2024-06-13 08:48:28,710][71000] Updated weights for policy 0, policy_version 206914 (0.0029) [2024-06-13 08:48:30,939][70768] Fps is (10 sec: 49153.2, 60 sec: 48605.9, 300 sec: 49207.6). Total num frames: 3390177280. Throughput: 0: 48342.8. Samples: 2919030940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:48:30,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:48:33,154][71000] Updated weights for policy 0, policy_version 206924 (0.0033) [2024-06-13 08:48:35,257][71000] Updated weights for policy 0, policy_version 206934 (0.0023) [2024-06-13 08:48:35,940][70768] Fps is (10 sec: 55704.5, 60 sec: 49425.0, 300 sec: 49207.5). Total num frames: 3390439424. Throughput: 0: 49083.3. Samples: 2919196060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:48:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:48:39,875][71000] Updated weights for policy 0, policy_version 206944 (0.0031) [2024-06-13 08:48:40,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48879.0, 300 sec: 49152.0). Total num frames: 3390652416. Throughput: 0: 49075.3. Samples: 2919489380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:48:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:48:42,102][71000] Updated weights for policy 0, policy_version 206954 (0.0029) [2024-06-13 08:48:45,940][70768] Fps is (10 sec: 42599.1, 60 sec: 48605.9, 300 sec: 48985.4). Total num frames: 3390865408. Throughput: 0: 48833.5. Samples: 2919776780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:48:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:48:46,590][71000] Updated weights for policy 0, policy_version 206964 (0.0030) [2024-06-13 08:48:48,831][71000] Updated weights for policy 0, policy_version 206974 (0.0031) [2024-06-13 08:48:50,940][70768] Fps is (10 sec: 50789.5, 60 sec: 48605.6, 300 sec: 49207.5). Total num frames: 3391160320. Throughput: 0: 48971.7. Samples: 2919915380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:48:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:48:53,478][71000] Updated weights for policy 0, policy_version 206984 (0.0030) [2024-06-13 08:48:55,690][71000] Updated weights for policy 0, policy_version 206994 (0.0033) [2024-06-13 08:48:55,939][70768] Fps is (10 sec: 54067.5, 60 sec: 49155.6, 300 sec: 49096.5). Total num frames: 3391406080. Throughput: 0: 48871.2. Samples: 2920208660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:48:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:48:59,980][71000] Updated weights for policy 0, policy_version 207004 (0.0029) [2024-06-13 08:49:00,940][70768] Fps is (10 sec: 44237.4, 60 sec: 48332.9, 300 sec: 48985.4). Total num frames: 3391602688. Throughput: 0: 49016.0. Samples: 2920506280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 08:49:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:49:02,284][71000] Updated weights for policy 0, policy_version 207014 (0.0025) [2024-06-13 08:49:05,940][70768] Fps is (10 sec: 44236.0, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3391848448. Throughput: 0: 48696.1. Samples: 2920638880. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-13 08:49:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:49:06,528][71000] Updated weights for policy 0, policy_version 207024 (0.0028) [2024-06-13 08:49:08,973][71000] Updated weights for policy 0, policy_version 207034 (0.0035) [2024-06-13 08:49:10,940][70768] Fps is (10 sec: 54066.6, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3392143360. Throughput: 0: 48459.3. Samples: 2920926720. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-13 08:49:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:49:13,395][71000] Updated weights for policy 0, policy_version 207044 (0.0037) [2024-06-13 08:49:15,770][71000] Updated weights for policy 0, policy_version 207054 (0.0031) [2024-06-13 08:49:15,940][70768] Fps is (10 sec: 54067.8, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3392389120. Throughput: 0: 48782.6. Samples: 2921226160. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-13 08:49:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:49:20,300][71000] Updated weights for policy 0, policy_version 207064 (0.0025) [2024-06-13 08:49:20,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 3392585728. Throughput: 0: 48284.4. Samples: 2921368860. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-13 08:49:20,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:49:22,306][70980] Signal inference workers to stop experience collection... (43700 times) [2024-06-13 08:49:22,364][71000] InferenceWorker_p0-w0: stopping experience collection (43700 times) [2024-06-13 08:49:22,414][70980] Signal inference workers to resume experience collection... 
(43700 times) [2024-06-13 08:49:22,414][71000] InferenceWorker_p0-w0: resuming experience collection (43700 times) [2024-06-13 08:49:22,416][71000] Updated weights for policy 0, policy_version 207074 (0.0024) [2024-06-13 08:49:25,940][70768] Fps is (10 sec: 42598.4, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3392815104. Throughput: 0: 48260.4. Samples: 2921661100. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-13 08:49:25,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:49:26,871][71000] Updated weights for policy 0, policy_version 207084 (0.0028) [2024-06-13 08:49:29,146][71000] Updated weights for policy 0, policy_version 207094 (0.0026) [2024-06-13 08:49:30,944][70768] Fps is (10 sec: 52407.2, 60 sec: 48875.4, 300 sec: 49040.2). Total num frames: 3393110016. Throughput: 0: 48255.8. Samples: 2921948500. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-13 08:49:30,944][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:49:33,307][71000] Updated weights for policy 0, policy_version 207104 (0.0037) [2024-06-13 08:49:35,734][71000] Updated weights for policy 0, policy_version 207114 (0.0027) [2024-06-13 08:49:35,940][70768] Fps is (10 sec: 54067.3, 60 sec: 48606.0, 300 sec: 49096.5). Total num frames: 3393355776. Throughput: 0: 48711.3. Samples: 2922107380. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-13 08:49:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:49:40,575][71000] Updated weights for policy 0, policy_version 207124 (0.0029) [2024-06-13 08:49:40,940][70768] Fps is (10 sec: 44256.0, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 3393552384. Throughput: 0: 48756.8. Samples: 2922402720. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-13 08:49:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:49:41,046][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000207127_3393568768.pth... 
[2024-06-13 08:49:41,090][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000206411_3381837824.pth [2024-06-13 08:49:42,515][71000] Updated weights for policy 0, policy_version 207134 (0.0023) [2024-06-13 08:49:45,940][70768] Fps is (10 sec: 42598.3, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 3393781760. Throughput: 0: 48335.6. Samples: 2922681380. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-13 08:49:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:49:47,119][71000] Updated weights for policy 0, policy_version 207144 (0.0029) [2024-06-13 08:49:49,277][71000] Updated weights for policy 0, policy_version 207154 (0.0040) [2024-06-13 08:49:50,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 3394060288. Throughput: 0: 48733.0. Samples: 2922831860. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-13 08:49:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:49:53,706][71000] Updated weights for policy 0, policy_version 207164 (0.0030) [2024-06-13 08:49:55,940][70768] Fps is (10 sec: 52427.9, 60 sec: 48332.6, 300 sec: 48985.4). Total num frames: 3394306048. Throughput: 0: 48808.9. Samples: 2923123120. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-13 08:49:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:49:56,163][71000] Updated weights for policy 0, policy_version 207174 (0.0031) [2024-06-13 08:50:00,575][71000] Updated weights for policy 0, policy_version 207184 (0.0033) [2024-06-13 08:50:00,940][70768] Fps is (10 sec: 45874.0, 60 sec: 48605.7, 300 sec: 48763.2). Total num frames: 3394519040. Throughput: 0: 48639.3. Samples: 2923414940. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-13 08:50:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:50:03,012][71000] Updated weights for policy 0, policy_version 207194 (0.0030) [2024-06-13 08:50:05,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48606.0, 300 sec: 48763.2). Total num frames: 3394764800. Throughput: 0: 48354.9. Samples: 2923544820. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-13 08:50:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:50:07,406][71000] Updated weights for policy 0, policy_version 207204 (0.0032) [2024-06-13 08:50:09,653][71000] Updated weights for policy 0, policy_version 207214 (0.0030) [2024-06-13 08:50:10,940][70768] Fps is (10 sec: 50791.8, 60 sec: 48059.9, 300 sec: 48763.2). Total num frames: 3395026944. Throughput: 0: 48416.5. Samples: 2923839840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 08:50:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:50:14,061][71000] Updated weights for policy 0, policy_version 207224 (0.0027) [2024-06-13 08:50:15,326][70980] Signal inference workers to stop experience collection... (43750 times) [2024-06-13 08:50:15,327][70980] Signal inference workers to resume experience collection... (43750 times) [2024-06-13 08:50:15,340][71000] InferenceWorker_p0-w0: stopping experience collection (43750 times) [2024-06-13 08:50:15,340][71000] InferenceWorker_p0-w0: resuming experience collection (43750 times) [2024-06-13 08:50:15,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48059.7, 300 sec: 48818.8). Total num frames: 3395272704. Throughput: 0: 48780.7. Samples: 2924143420. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 08:50:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:50:16,285][71000] Updated weights for policy 0, policy_version 207234 (0.0020) [2024-06-13 08:50:20,518][71000] Updated weights for policy 0, policy_version 207244 (0.0021) [2024-06-13 08:50:20,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48606.0, 300 sec: 48763.2). Total num frames: 3395502080. Throughput: 0: 48388.0. Samples: 2924284840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 08:50:20,940][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 08:50:22,952][71000] Updated weights for policy 0, policy_version 207254 (0.0022) [2024-06-13 08:50:25,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 3395747840. Throughput: 0: 48358.1. Samples: 2924578840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 08:50:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:50:27,115][71000] Updated weights for policy 0, policy_version 207264 (0.0027) [2024-06-13 08:50:29,729][71000] Updated weights for policy 0, policy_version 207274 (0.0031) [2024-06-13 08:50:30,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48063.2, 300 sec: 48763.2). Total num frames: 3395993600. Throughput: 0: 48468.5. Samples: 2924862460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 08:50:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:50:33,858][71000] Updated weights for policy 0, policy_version 207284 (0.0030) [2024-06-13 08:50:35,939][70768] Fps is (10 sec: 50791.8, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 3396255744. Throughput: 0: 48440.6. Samples: 2925011680. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 08:50:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:50:36,437][71000] Updated weights for policy 0, policy_version 207294 (0.0023) [2024-06-13 08:50:40,539][71000] Updated weights for policy 0, policy_version 207304 (0.0032) [2024-06-13 08:50:40,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48605.7, 300 sec: 48763.2). Total num frames: 3396468736. Throughput: 0: 48348.9. Samples: 2925298820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 08:50:40,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:50:43,467][71000] Updated weights for policy 0, policy_version 207314 (0.0035) [2024-06-13 08:50:45,940][70768] Fps is (10 sec: 44235.5, 60 sec: 48605.7, 300 sec: 48652.1). Total num frames: 3396698112. Throughput: 0: 48349.0. Samples: 2925590640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 08:50:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 08:50:47,511][71000] Updated weights for policy 0, policy_version 207324 (0.0028) [2024-06-13 08:50:50,545][71000] Updated weights for policy 0, policy_version 207334 (0.0024) [2024-06-13 08:50:50,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48332.8, 300 sec: 48707.7). Total num frames: 3396960256. Throughput: 0: 48643.5. Samples: 2925733780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 08:50:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:50:54,012][71000] Updated weights for policy 0, policy_version 207344 (0.0031) [2024-06-13 08:50:55,939][70768] Fps is (10 sec: 52429.9, 60 sec: 48606.1, 300 sec: 48874.3). Total num frames: 3397222400. Throughput: 0: 48576.5. Samples: 2926025780. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 08:50:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:50:57,055][71000] Updated weights for policy 0, policy_version 207354 (0.0035) [2024-06-13 08:51:00,491][71000] Updated weights for policy 0, policy_version 207364 (0.0025) [2024-06-13 08:51:00,944][70768] Fps is (10 sec: 50768.7, 60 sec: 49148.7, 300 sec: 48873.6). Total num frames: 3397468160. Throughput: 0: 48477.2. Samples: 2926325100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 08:51:00,944][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:51:03,331][71000] Updated weights for policy 0, policy_version 207374 (0.0029) [2024-06-13 08:51:05,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 3397697536. Throughput: 0: 48568.8. Samples: 2926470440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 08:51:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 08:51:07,360][71000] Updated weights for policy 0, policy_version 207384 (0.0029) [2024-06-13 08:51:10,280][71000] Updated weights for policy 0, policy_version 207394 (0.0034) [2024-06-13 08:51:10,940][70768] Fps is (10 sec: 47533.5, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 3397943296. Throughput: 0: 48581.4. Samples: 2926765000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 08:51:10,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:51:12,578][70980] Signal inference workers to stop experience collection... (43800 times) [2024-06-13 08:51:12,628][71000] InferenceWorker_p0-w0: stopping experience collection (43800 times) [2024-06-13 08:51:12,635][70980] Signal inference workers to resume experience collection... 
(43800 times) [2024-06-13 08:51:12,640][71000] InferenceWorker_p0-w0: resuming experience collection (43800 times) [2024-06-13 08:51:14,151][71000] Updated weights for policy 0, policy_version 207404 (0.0027) [2024-06-13 08:51:15,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3398205440. Throughput: 0: 48796.8. Samples: 2927058320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 08:51:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:51:17,438][71000] Updated weights for policy 0, policy_version 207414 (0.0039) [2024-06-13 08:51:20,678][71000] Updated weights for policy 0, policy_version 207424 (0.0034) [2024-06-13 08:51:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 3398434816. Throughput: 0: 48758.0. Samples: 2927205800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 08:51:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:51:23,848][71000] Updated weights for policy 0, policy_version 207434 (0.0036) [2024-06-13 08:51:25,940][70768] Fps is (10 sec: 44236.8, 60 sec: 48332.9, 300 sec: 48652.1). Total num frames: 3398647808. Throughput: 0: 49017.9. Samples: 2927504620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 08:51:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:51:27,452][71000] Updated weights for policy 0, policy_version 207444 (0.0029) [2024-06-13 08:51:30,870][71000] Updated weights for policy 0, policy_version 207454 (0.0044) [2024-06-13 08:51:30,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48878.8, 300 sec: 48818.8). Total num frames: 3398926336. Throughput: 0: 48802.3. Samples: 2927786740. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 08:51:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:51:34,371][71000] Updated weights for policy 0, policy_version 207464 (0.0027) [2024-06-13 08:51:35,939][70768] Fps is (10 sec: 55706.1, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 3399204864. Throughput: 0: 49045.9. Samples: 2927940840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 08:51:35,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:51:38,099][71000] Updated weights for policy 0, policy_version 207474 (0.0023) [2024-06-13 08:51:40,894][71000] Updated weights for policy 0, policy_version 207484 (0.0024) [2024-06-13 08:51:40,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3399417856. Throughput: 0: 49136.2. Samples: 2928236920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 08:51:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:51:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000207484_3399417856.pth... [2024-06-13 08:51:40,991][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000206770_3387719680.pth [2024-06-13 08:51:44,586][71000] Updated weights for policy 0, policy_version 207494 (0.0026) [2024-06-13 08:51:45,939][70768] Fps is (10 sec: 40960.0, 60 sec: 48606.0, 300 sec: 48596.6). Total num frames: 3399614464. Throughput: 0: 48948.3. Samples: 2928527560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 08:51:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:51:47,728][71000] Updated weights for policy 0, policy_version 207504 (0.0024) [2024-06-13 08:51:50,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48605.8, 300 sec: 48652.2). Total num frames: 3399876608. Throughput: 0: 48703.6. Samples: 2928662100. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 08:51:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:51:51,522][71000] Updated weights for policy 0, policy_version 207514 (0.0031) [2024-06-13 08:51:54,661][71000] Updated weights for policy 0, policy_version 207524 (0.0031) [2024-06-13 08:51:55,940][70768] Fps is (10 sec: 55704.6, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 3400171520. Throughput: 0: 48710.7. Samples: 2928956980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 08:51:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:51:58,388][71000] Updated weights for policy 0, policy_version 207534 (0.0031) [2024-06-13 08:52:00,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48609.3, 300 sec: 48874.3). Total num frames: 3400384512. Throughput: 0: 48918.2. Samples: 2929259640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 08:52:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:52:01,046][71000] Updated weights for policy 0, policy_version 207544 (0.0032) [2024-06-13 08:52:04,839][71000] Updated weights for policy 0, policy_version 207554 (0.0025) [2024-06-13 08:52:05,940][70768] Fps is (10 sec: 42598.9, 60 sec: 48332.9, 300 sec: 48596.6). Total num frames: 3400597504. Throughput: 0: 48619.7. Samples: 2929393680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 08:52:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:52:07,771][71000] Updated weights for policy 0, policy_version 207564 (0.0024) [2024-06-13 08:52:10,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 3400859648. Throughput: 0: 48418.1. Samples: 2929683440. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 08:52:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:52:11,820][71000] Updated weights for policy 0, policy_version 207574 (0.0022) [2024-06-13 08:52:14,568][71000] Updated weights for policy 0, policy_version 207584 (0.0023) [2024-06-13 08:52:15,544][70980] Signal inference workers to stop experience collection... (43850 times) [2024-06-13 08:52:15,582][71000] InferenceWorker_p0-w0: stopping experience collection (43850 times) [2024-06-13 08:52:15,592][70980] Signal inference workers to resume experience collection... (43850 times) [2024-06-13 08:52:15,605][71000] InferenceWorker_p0-w0: resuming experience collection (43850 times) [2024-06-13 08:52:15,939][70768] Fps is (10 sec: 54067.5, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3401138176. Throughput: 0: 48661.9. Samples: 2929976520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 08:52:15,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:52:18,570][71000] Updated weights for policy 0, policy_version 207594 (0.0037) [2024-06-13 08:52:20,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48059.8, 300 sec: 48707.7). Total num frames: 3401318400. Throughput: 0: 48479.5. Samples: 2930122420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 08:52:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:52:21,813][71000] Updated weights for policy 0, policy_version 207604 (0.0023) [2024-06-13 08:52:25,679][71000] Updated weights for policy 0, policy_version 207614 (0.0033) [2024-06-13 08:52:25,940][70768] Fps is (10 sec: 42596.8, 60 sec: 48605.6, 300 sec: 48485.5). Total num frames: 3401564160. Throughput: 0: 48279.4. Samples: 2930409500. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 08:52:25,941][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:52:28,153][71000] Updated weights for policy 0, policy_version 207624 (0.0022) [2024-06-13 08:52:30,940][70768] Fps is (10 sec: 50789.2, 60 sec: 48332.7, 300 sec: 48652.1). Total num frames: 3401826304. Throughput: 0: 48206.4. Samples: 2930696860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 08:52:30,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 08:52:31,963][71000] Updated weights for policy 0, policy_version 207634 (0.0030) [2024-06-13 08:52:35,039][71000] Updated weights for policy 0, policy_version 207644 (0.0029) [2024-06-13 08:52:35,940][70768] Fps is (10 sec: 52429.8, 60 sec: 48059.6, 300 sec: 48707.7). Total num frames: 3402088448. Throughput: 0: 48611.0. Samples: 2930849600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 08:52:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:52:38,811][71000] Updated weights for policy 0, policy_version 207654 (0.0029) [2024-06-13 08:52:40,940][70768] Fps is (10 sec: 45876.1, 60 sec: 47786.8, 300 sec: 48596.6). Total num frames: 3402285056. Throughput: 0: 48512.5. Samples: 2931140040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 08:52:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:52:41,910][71000] Updated weights for policy 0, policy_version 207664 (0.0030) [2024-06-13 08:52:45,771][71000] Updated weights for policy 0, policy_version 207674 (0.0034) [2024-06-13 08:52:45,940][70768] Fps is (10 sec: 44237.3, 60 sec: 48605.8, 300 sec: 48430.0). Total num frames: 3402530816. Throughput: 0: 48146.2. Samples: 2931426220. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 08:52:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:52:48,441][71000] Updated weights for policy 0, policy_version 207684 (0.0020) [2024-06-13 08:52:50,939][70768] Fps is (10 sec: 52429.6, 60 sec: 48879.1, 300 sec: 48652.9). Total num frames: 3402809344. Throughput: 0: 48456.1. Samples: 2931574200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 08:52:50,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:52:52,064][71000] Updated weights for policy 0, policy_version 207694 (0.0037) [2024-06-13 08:52:55,141][71000] Updated weights for policy 0, policy_version 207704 (0.0028) [2024-06-13 08:52:55,940][70768] Fps is (10 sec: 54066.8, 60 sec: 48332.8, 300 sec: 48707.7). Total num frames: 3403071488. Throughput: 0: 48765.0. Samples: 2931877860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 08:52:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:52:59,154][71000] Updated weights for policy 0, policy_version 207714 (0.0043) [2024-06-13 08:53:00,940][70768] Fps is (10 sec: 47512.5, 60 sec: 48332.7, 300 sec: 48707.7). Total num frames: 3403284480. Throughput: 0: 48673.6. Samples: 2932166840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 08:53:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:53:01,824][71000] Updated weights for policy 0, policy_version 207724 (0.0033) [2024-06-13 08:53:05,711][71000] Updated weights for policy 0, policy_version 207734 (0.0024) [2024-06-13 08:53:05,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48878.8, 300 sec: 48596.6). Total num frames: 3403530240. Throughput: 0: 48226.0. Samples: 2932292600. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 08:53:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:53:08,467][71000] Updated weights for policy 0, policy_version 207744 (0.0021) [2024-06-13 08:53:09,676][70980] Signal inference workers to stop experience collection... (43900 times) [2024-06-13 08:53:09,677][70980] Signal inference workers to resume experience collection... (43900 times) [2024-06-13 08:53:09,723][71000] InferenceWorker_p0-w0: stopping experience collection (43900 times) [2024-06-13 08:53:09,724][71000] InferenceWorker_p0-w0: resuming experience collection (43900 times) [2024-06-13 08:53:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.9, 300 sec: 48652.1). Total num frames: 3403792384. Throughput: 0: 48798.0. Samples: 2932605400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 08:53:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:53:12,159][71000] Updated weights for policy 0, policy_version 207754 (0.0029) [2024-06-13 08:53:15,086][71000] Updated weights for policy 0, policy_version 207764 (0.0033) [2024-06-13 08:53:15,939][70768] Fps is (10 sec: 52430.0, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 3404054528. Throughput: 0: 48971.0. Samples: 2932900540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 08:53:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:53:19,084][71000] Updated weights for policy 0, policy_version 207774 (0.0030) [2024-06-13 08:53:20,940][70768] Fps is (10 sec: 47514.0, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 3404267520. Throughput: 0: 48871.6. Samples: 2933048820. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 08:53:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:53:21,763][71000] Updated weights for policy 0, policy_version 207784 (0.0033) [2024-06-13 08:53:25,519][71000] Updated weights for policy 0, policy_version 207794 (0.0038) [2024-06-13 08:53:25,940][70768] Fps is (10 sec: 45874.6, 60 sec: 49152.2, 300 sec: 48596.6). Total num frames: 3404513280. Throughput: 0: 48977.3. Samples: 2933344020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 08:53:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:53:28,470][71000] Updated weights for policy 0, policy_version 207804 (0.0028) [2024-06-13 08:53:30,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48879.0, 300 sec: 48541.1). Total num frames: 3404759040. Throughput: 0: 48843.0. Samples: 2933624160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 08:53:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:53:32,245][71000] Updated weights for policy 0, policy_version 207814 (0.0025) [2024-06-13 08:53:35,319][71000] Updated weights for policy 0, policy_version 207824 (0.0034) [2024-06-13 08:53:35,939][70768] Fps is (10 sec: 50791.1, 60 sec: 48879.1, 300 sec: 48707.7). Total num frames: 3405021184. Throughput: 0: 49275.5. Samples: 2933791600. Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-13 08:53:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:53:38,731][71000] Updated weights for policy 0, policy_version 207834 (0.0028) [2024-06-13 08:53:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 48707.7). Total num frames: 3405234176. Throughput: 0: 48906.1. Samples: 2934078640. Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-13 08:53:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:53:40,986][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000207840_3405250560.pth... 
[2024-06-13 08:53:41,033][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000207127_3393568768.pth [2024-06-13 08:53:42,124][71000] Updated weights for policy 0, policy_version 207844 (0.0029) [2024-06-13 08:53:45,427][71000] Updated weights for policy 0, policy_version 207854 (0.0035) [2024-06-13 08:53:45,940][70768] Fps is (10 sec: 45874.3, 60 sec: 49151.9, 300 sec: 48541.1). Total num frames: 3405479936. Throughput: 0: 48959.5. Samples: 2934370020. Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-13 08:53:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:53:48,786][71000] Updated weights for policy 0, policy_version 207864 (0.0023) [2024-06-13 08:53:50,940][70768] Fps is (10 sec: 52429.9, 60 sec: 49151.9, 300 sec: 48652.1). Total num frames: 3405758464. Throughput: 0: 49354.9. Samples: 2934513560. Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-13 08:53:50,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:53:52,143][71000] Updated weights for policy 0, policy_version 207874 (0.0024) [2024-06-13 08:53:55,386][71000] Updated weights for policy 0, policy_version 207884 (0.0023) [2024-06-13 08:53:55,940][70768] Fps is (10 sec: 52429.4, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3406004224. Throughput: 0: 49131.7. Samples: 2934816320. Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-13 08:53:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:53:58,793][71000] Updated weights for policy 0, policy_version 207894 (0.0036) [2024-06-13 08:54:00,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 3406217216. Throughput: 0: 49020.8. Samples: 2935106480. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-13 08:54:00,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:54:02,187][71000] Updated weights for policy 0, policy_version 207904 (0.0027) [2024-06-13 08:54:05,372][71000] Updated weights for policy 0, policy_version 207914 (0.0022) [2024-06-13 08:54:05,941][70768] Fps is (10 sec: 45871.1, 60 sec: 48878.3, 300 sec: 48540.9). Total num frames: 3406462976. Throughput: 0: 48726.6. Samples: 2935241560. Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-13 08:54:05,941][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:54:08,899][71000] Updated weights for policy 0, policy_version 207924 (0.0027) [2024-06-13 08:54:10,940][70768] Fps is (10 sec: 50790.6, 60 sec: 48879.0, 300 sec: 48596.6). Total num frames: 3406725120. Throughput: 0: 48898.7. Samples: 2935544460. Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-13 08:54:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:54:11,962][71000] Updated weights for policy 0, policy_version 207934 (0.0030) [2024-06-13 08:54:15,669][71000] Updated weights for policy 0, policy_version 207944 (0.0027) [2024-06-13 08:54:15,940][70768] Fps is (10 sec: 49156.8, 60 sec: 48332.8, 300 sec: 48707.7). Total num frames: 3406954496. Throughput: 0: 49076.1. Samples: 2935832580. Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-13 08:54:15,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:54:18,847][71000] Updated weights for policy 0, policy_version 207954 (0.0026) [2024-06-13 08:54:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 3407200256. Throughput: 0: 48499.4. Samples: 2935974080. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-13 08:54:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:54:22,715][71000] Updated weights for policy 0, policy_version 207964 (0.0031) [2024-06-13 08:54:25,819][71000] Updated weights for policy 0, policy_version 207974 (0.0031) [2024-06-13 08:54:25,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48879.0, 300 sec: 48597.3). Total num frames: 3407446016. Throughput: 0: 48664.2. Samples: 2936268520. Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-13 08:54:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:54:28,407][70980] Signal inference workers to stop experience collection... (43950 times) [2024-06-13 08:54:28,441][71000] InferenceWorker_p0-w0: stopping experience collection (43950 times) [2024-06-13 08:54:28,518][70980] Signal inference workers to resume experience collection... (43950 times) [2024-06-13 08:54:28,519][71000] InferenceWorker_p0-w0: resuming experience collection (43950 times) [2024-06-13 08:54:29,135][71000] Updated weights for policy 0, policy_version 207984 (0.0039) [2024-06-13 08:54:30,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49152.0, 300 sec: 48652.1). Total num frames: 3407708160. Throughput: 0: 48810.7. Samples: 2936566500. Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-13 08:54:30,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:54:32,161][71000] Updated weights for policy 0, policy_version 207994 (0.0027) [2024-06-13 08:54:35,668][71000] Updated weights for policy 0, policy_version 208004 (0.0028) [2024-06-13 08:54:35,940][70768] Fps is (10 sec: 49151.2, 60 sec: 48605.7, 300 sec: 48763.2). Total num frames: 3407937536. Throughput: 0: 49162.0. Samples: 2936725860. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-13 08:54:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:54:39,298][71000] Updated weights for policy 0, policy_version 208014 (0.0032) [2024-06-13 08:54:40,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.1, 300 sec: 48818.8). Total num frames: 3408183296. Throughput: 0: 48778.2. Samples: 2937011340. Policy #0 lag: (min: 1.0, avg: 8.5, max: 22.0) [2024-06-13 08:54:40,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:54:42,663][71000] Updated weights for policy 0, policy_version 208024 (0.0021) [2024-06-13 08:54:45,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48879.0, 300 sec: 48652.1). Total num frames: 3408412672. Throughput: 0: 48692.4. Samples: 2937297640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 08:54:45,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:54:46,119][71000] Updated weights for policy 0, policy_version 208034 (0.0034) [2024-06-13 08:54:49,258][71000] Updated weights for policy 0, policy_version 208044 (0.0025) [2024-06-13 08:54:50,940][70768] Fps is (10 sec: 50789.4, 60 sec: 48878.7, 300 sec: 48763.2). Total num frames: 3408691200. Throughput: 0: 49047.5. Samples: 2937448660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 08:54:50,941][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:54:52,365][71000] Updated weights for policy 0, policy_version 208054 (0.0022) [2024-06-13 08:54:55,626][71000] Updated weights for policy 0, policy_version 208064 (0.0033) [2024-06-13 08:54:55,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 3408920576. Throughput: 0: 49000.9. Samples: 2937749500. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 08:54:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:54:58,985][71000] Updated weights for policy 0, policy_version 208074 (0.0024) [2024-06-13 08:55:00,940][70768] Fps is (10 sec: 49153.0, 60 sec: 49425.1, 300 sec: 48874.3). Total num frames: 3409182720. Throughput: 0: 49272.8. Samples: 2938049860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 08:55:00,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:55:02,482][71000] Updated weights for policy 0, policy_version 208084 (0.0027) [2024-06-13 08:55:05,571][71000] Updated weights for policy 0, policy_version 208094 (0.0029) [2024-06-13 08:55:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.8, 300 sec: 48763.2). Total num frames: 3409412096. Throughput: 0: 49488.5. Samples: 2938201060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 08:55:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:55:09,363][71000] Updated weights for policy 0, policy_version 208104 (0.0026) [2024-06-13 08:55:10,940][70768] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 48818.7). Total num frames: 3409674240. Throughput: 0: 49447.0. Samples: 2938493640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 08:55:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:55:12,467][71000] Updated weights for policy 0, policy_version 208114 (0.0027) [2024-06-13 08:55:15,911][71000] Updated weights for policy 0, policy_version 208124 (0.0035) [2024-06-13 08:55:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 3409903616. Throughput: 0: 49212.6. Samples: 2938781060. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 08:55:15,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:55:19,225][71000] Updated weights for policy 0, policy_version 208134 (0.0027) [2024-06-13 08:55:20,939][70768] Fps is (10 sec: 47514.4, 60 sec: 49152.1, 300 sec: 48818.8). Total num frames: 3410149376. Throughput: 0: 48872.2. Samples: 2938925100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 08:55:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:55:22,848][71000] Updated weights for policy 0, policy_version 208144 (0.0025) [2024-06-13 08:55:25,896][71000] Updated weights for policy 0, policy_version 208154 (0.0038) [2024-06-13 08:55:25,940][70768] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 48818.7). Total num frames: 3410395136. Throughput: 0: 48966.1. Samples: 2939214820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 08:55:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:55:29,508][71000] Updated weights for policy 0, policy_version 208164 (0.0024) [2024-06-13 08:55:30,939][70768] Fps is (10 sec: 47513.6, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 3410624512. Throughput: 0: 49125.4. Samples: 2939508280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 08:55:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:55:32,432][71000] Updated weights for policy 0, policy_version 208174 (0.0032) [2024-06-13 08:55:35,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3410870272. Throughput: 0: 48934.3. Samples: 2939650700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 08:55:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:55:36,343][71000] Updated weights for policy 0, policy_version 208184 (0.0030) [2024-06-13 08:55:39,174][71000] Updated weights for policy 0, policy_version 208194 (0.0026) [2024-06-13 08:55:40,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3411116032. Throughput: 0: 48865.4. Samples: 2939948440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 08:55:40,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:55:40,952][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000208199_3411132416.pth... [2024-06-13 08:55:41,003][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000207484_3399417856.pth [2024-06-13 08:55:42,725][71000] Updated weights for policy 0, policy_version 208204 (0.0031) [2024-06-13 08:55:43,989][70980] Signal inference workers to stop experience collection... (44000 times) [2024-06-13 08:55:44,007][71000] InferenceWorker_p0-w0: stopping experience collection (44000 times) [2024-06-13 08:55:44,044][70980] Signal inference workers to resume experience collection... (44000 times) [2024-06-13 08:55:44,045][71000] InferenceWorker_p0-w0: resuming experience collection (44000 times) [2024-06-13 08:55:45,828][71000] Updated weights for policy 0, policy_version 208214 (0.0024) [2024-06-13 08:55:45,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 3411378176. Throughput: 0: 48659.8. Samples: 2940239560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 08:55:45,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:55:49,745][71000] Updated weights for policy 0, policy_version 208224 (0.0039) [2024-06-13 08:55:50,940][70768] Fps is (10 sec: 49152.2, 60 sec: 48606.1, 300 sec: 48763.2). Total num frames: 3411607552. Throughput: 0: 48517.3. Samples: 2940384340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:55:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:55:52,574][71000] Updated weights for policy 0, policy_version 208234 (0.0025) [2024-06-13 08:55:55,940][70768] Fps is (10 sec: 47514.4, 60 sec: 48878.9, 300 sec: 48763.9). Total num frames: 3411853312. Throughput: 0: 48487.2. Samples: 2940675560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:55:55,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:55:56,385][71000] Updated weights for policy 0, policy_version 208244 (0.0031) [2024-06-13 08:55:59,108][71000] Updated weights for policy 0, policy_version 208254 (0.0026) [2024-06-13 08:56:00,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 3412099072. Throughput: 0: 48765.7. Samples: 2940975520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:56:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:56:02,941][71000] Updated weights for policy 0, policy_version 208264 (0.0031) [2024-06-13 08:56:05,818][71000] Updated weights for policy 0, policy_version 208274 (0.0035) [2024-06-13 08:56:05,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3412361216. Throughput: 0: 48837.7. Samples: 2941122800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:56:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:56:09,596][71000] Updated weights for policy 0, policy_version 208284 (0.0034) [2024-06-13 08:56:10,939][70768] Fps is (10 sec: 47514.2, 60 sec: 48333.0, 300 sec: 48707.7). Total num frames: 3412574208. Throughput: 0: 48804.7. Samples: 2941411020. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:56:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:56:12,389][71000] Updated weights for policy 0, policy_version 208294 (0.0027) [2024-06-13 08:56:15,940][70768] Fps is (10 sec: 45874.6, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 3412819968. Throughput: 0: 48841.6. Samples: 2941706160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:56:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:56:16,439][71000] Updated weights for policy 0, policy_version 208304 (0.0036) [2024-06-13 08:56:19,097][71000] Updated weights for policy 0, policy_version 208314 (0.0027) [2024-06-13 08:56:20,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3413098496. Throughput: 0: 49049.5. Samples: 2941857920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:56:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:56:22,583][71000] Updated weights for policy 0, policy_version 208324 (0.0029) [2024-06-13 08:56:25,630][71000] Updated weights for policy 0, policy_version 208334 (0.0030) [2024-06-13 08:56:25,940][70768] Fps is (10 sec: 54068.1, 60 sec: 49425.2, 300 sec: 48929.9). Total num frames: 3413360640. Throughput: 0: 49245.8. Samples: 2942164500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:56:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:56:29,607][71000] Updated weights for policy 0, policy_version 208344 (0.0034) [2024-06-13 08:56:30,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48605.9, 300 sec: 48596.6). Total num frames: 3413540864. Throughput: 0: 49033.6. Samples: 2942446060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:56:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:56:32,499][71000] Updated weights for policy 0, policy_version 208354 (0.0026) [2024-06-13 08:56:35,940][70768] Fps is (10 sec: 45874.5, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 3413819392. Throughput: 0: 48857.6. Samples: 2942582940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:56:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:56:36,322][71000] Updated weights for policy 0, policy_version 208364 (0.0034) [2024-06-13 08:56:39,315][71000] Updated weights for policy 0, policy_version 208374 (0.0022) [2024-06-13 08:56:40,940][70768] Fps is (10 sec: 52428.4, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3414065152. Throughput: 0: 48956.0. Samples: 2942878580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:56:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:56:42,507][71000] Updated weights for policy 0, policy_version 208384 (0.0026) [2024-06-13 08:56:45,789][71000] Updated weights for policy 0, policy_version 208394 (0.0022) [2024-06-13 08:56:45,940][70768] Fps is (10 sec: 50790.8, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3414327296. Throughput: 0: 49091.6. Samples: 2943184640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:56:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:56:49,291][71000] Updated weights for policy 0, policy_version 208404 (0.0038) [2024-06-13 08:56:50,939][70768] Fps is (10 sec: 44237.2, 60 sec: 48332.8, 300 sec: 48596.6). Total num frames: 3414507520. Throughput: 0: 48857.4. Samples: 2943321380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:56:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:56:52,533][71000] Updated weights for policy 0, policy_version 208414 (0.0023) [2024-06-13 08:56:55,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3414786048. Throughput: 0: 48911.5. Samples: 2943612040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 08:56:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:56:56,238][71000] Updated weights for policy 0, policy_version 208424 (0.0030) [2024-06-13 08:56:58,129][70980] Signal inference workers to stop experience collection... (44050 times) [2024-06-13 08:56:58,175][71000] InferenceWorker_p0-w0: stopping experience collection (44050 times) [2024-06-13 08:56:58,180][70980] Signal inference workers to resume experience collection... (44050 times) [2024-06-13 08:56:58,187][71000] InferenceWorker_p0-w0: resuming experience collection (44050 times) [2024-06-13 08:56:59,373][71000] Updated weights for policy 0, policy_version 208434 (0.0038) [2024-06-13 08:57:00,939][70768] Fps is (10 sec: 54067.4, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3415048192. Throughput: 0: 48883.4. Samples: 2943905900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:57:00,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:57:02,482][71000] Updated weights for policy 0, policy_version 208444 (0.0028) [2024-06-13 08:57:05,939][70768] Fps is (10 sec: 50790.6, 60 sec: 48879.0, 300 sec: 48929.9). Total num frames: 3415293952. Throughput: 0: 48885.8. Samples: 2944057780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:57:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:57:06,050][71000] Updated weights for policy 0, policy_version 208454 (0.0032) [2024-06-13 08:57:09,574][71000] Updated weights for policy 0, policy_version 208464 (0.0031) [2024-06-13 08:57:10,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 3415506944. Throughput: 0: 48467.5. Samples: 2944345540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:57:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:57:12,776][71000] Updated weights for policy 0, policy_version 208474 (0.0034) [2024-06-13 08:57:15,940][70768] Fps is (10 sec: 47513.2, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3415769088. Throughput: 0: 48642.2. Samples: 2944634960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:57:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:57:16,230][71000] Updated weights for policy 0, policy_version 208484 (0.0028) [2024-06-13 08:57:19,779][71000] Updated weights for policy 0, policy_version 208494 (0.0022) [2024-06-13 08:57:20,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 3416014848. Throughput: 0: 49135.2. Samples: 2944794020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:57:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:57:23,247][71000] Updated weights for policy 0, policy_version 208504 (0.0022) [2024-06-13 08:57:25,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48059.6, 300 sec: 48874.3). Total num frames: 3416244224. Throughput: 0: 48961.3. Samples: 2945081840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:57:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:57:26,385][71000] Updated weights for policy 0, policy_version 208514 (0.0038) [2024-06-13 08:57:30,055][71000] Updated weights for policy 0, policy_version 208524 (0.0027) [2024-06-13 08:57:30,940][70768] Fps is (10 sec: 45875.8, 60 sec: 48878.9, 300 sec: 48763.3). Total num frames: 3416473600. Throughput: 0: 48542.7. Samples: 2945369060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:57:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:57:33,007][71000] Updated weights for policy 0, policy_version 208534 (0.0037) [2024-06-13 08:57:35,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3416752128. Throughput: 0: 48630.9. Samples: 2945509780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:57:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:57:36,538][71000] Updated weights for policy 0, policy_version 208544 (0.0036) [2024-06-13 08:57:39,814][71000] Updated weights for policy 0, policy_version 208554 (0.0025) [2024-06-13 08:57:40,940][70768] Fps is (10 sec: 52428.6, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3416997888. Throughput: 0: 48821.7. Samples: 2945809020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:57:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:57:40,951][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000208557_3416997888.pth... [2024-06-13 08:57:41,007][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000207840_3405250560.pth [2024-06-13 08:57:43,098][71000] Updated weights for policy 0, policy_version 208564 (0.0028) [2024-06-13 08:57:45,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48332.7, 300 sec: 48874.3). Total num frames: 3417227264. Throughput: 0: 49074.8. Samples: 2946114280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:57:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:57:46,552][71000] Updated weights for policy 0, policy_version 208574 (0.0026) [2024-06-13 08:57:49,654][71000] Updated weights for policy 0, policy_version 208584 (0.0029) [2024-06-13 08:57:50,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 3417456640. Throughput: 0: 48785.7. Samples: 2946253140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:57:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:57:53,154][71000] Updated weights for policy 0, policy_version 208594 (0.0031) [2024-06-13 08:57:55,939][70768] Fps is (10 sec: 50791.6, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3417735168. Throughput: 0: 49032.5. Samples: 2946552000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:57:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 08:57:56,207][71000] Updated weights for policy 0, policy_version 208604 (0.0026) [2024-06-13 08:57:59,766][71000] Updated weights for policy 0, policy_version 208614 (0.0030) [2024-06-13 08:58:00,940][70768] Fps is (10 sec: 54067.1, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3417997312. Throughput: 0: 49186.2. Samples: 2946848340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 08:58:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:58:02,790][71000] Updated weights for policy 0, policy_version 208624 (0.0022) [2024-06-13 08:58:04,132][70980] Signal inference workers to stop experience collection... (44100 times) [2024-06-13 08:58:04,134][70980] Signal inference workers to resume experience collection... 
(44100 times) [2024-06-13 08:58:04,173][71000] InferenceWorker_p0-w0: stopping experience collection (44100 times) [2024-06-13 08:58:04,174][71000] InferenceWorker_p0-w0: resuming experience collection (44100 times) [2024-06-13 08:58:05,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3418210304. Throughput: 0: 49126.8. Samples: 2947004720. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 08:58:05,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 08:58:06,385][71000] Updated weights for policy 0, policy_version 208634 (0.0027) [2024-06-13 08:58:09,504][71000] Updated weights for policy 0, policy_version 208644 (0.0030) [2024-06-13 08:58:10,940][70768] Fps is (10 sec: 47512.8, 60 sec: 49424.9, 300 sec: 48874.3). Total num frames: 3418472448. Throughput: 0: 49220.3. Samples: 2947296760. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 08:58:10,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:58:13,391][71000] Updated weights for policy 0, policy_version 208654 (0.0024) [2024-06-13 08:58:15,940][70768] Fps is (10 sec: 52427.6, 60 sec: 49424.9, 300 sec: 49040.9). Total num frames: 3418734592. Throughput: 0: 49161.1. Samples: 2947581320. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 08:58:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:58:16,402][71000] Updated weights for policy 0, policy_version 208664 (0.0028) [2024-06-13 08:58:19,903][71000] Updated weights for policy 0, policy_version 208674 (0.0023) [2024-06-13 08:58:20,940][70768] Fps is (10 sec: 52429.6, 60 sec: 49698.2, 300 sec: 49096.5). Total num frames: 3418996736. Throughput: 0: 49599.6. Samples: 2947741760. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 08:58:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:58:22,792][71000] Updated weights for policy 0, policy_version 208684 (0.0026) [2024-06-13 08:58:25,940][70768] Fps is (10 sec: 44237.7, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3419176960. Throughput: 0: 49601.4. Samples: 2948041080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 08:58:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:58:26,710][71000] Updated weights for policy 0, policy_version 208694 (0.0025) [2024-06-13 08:58:29,295][71000] Updated weights for policy 0, policy_version 208704 (0.0030) [2024-06-13 08:58:30,940][70768] Fps is (10 sec: 44236.9, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 3419439104. Throughput: 0: 49139.3. Samples: 2948325540. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 08:58:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:58:33,423][71000] Updated weights for policy 0, policy_version 208714 (0.0033) [2024-06-13 08:58:35,940][70768] Fps is (10 sec: 54067.0, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 3419717632. Throughput: 0: 49272.5. Samples: 2948470400. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 08:58:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:58:36,710][71000] Updated weights for policy 0, policy_version 208724 (0.0029) [2024-06-13 08:58:40,357][71000] Updated weights for policy 0, policy_version 208734 (0.0023) [2024-06-13 08:58:40,940][70768] Fps is (10 sec: 50789.8, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3419947008. Throughput: 0: 49047.8. Samples: 2948759160. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 08:58:40,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:58:43,163][71000] Updated weights for policy 0, policy_version 208744 (0.0024) [2024-06-13 08:58:45,942][70768] Fps is (10 sec: 44227.9, 60 sec: 48877.5, 300 sec: 48818.4). Total num frames: 3420160000. Throughput: 0: 48975.6. Samples: 2949052340. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 08:58:45,942][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 08:58:46,882][71000] Updated weights for policy 0, policy_version 208754 (0.0026) [2024-06-13 08:58:49,523][71000] Updated weights for policy 0, policy_version 208764 (0.0029) [2024-06-13 08:58:50,940][70768] Fps is (10 sec: 47513.8, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 3420422144. Throughput: 0: 48700.3. Samples: 2949196240. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 08:58:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:58:53,613][71000] Updated weights for policy 0, policy_version 208774 (0.0024) [2024-06-13 08:58:55,940][70768] Fps is (10 sec: 54078.1, 60 sec: 49425.0, 300 sec: 49096.5). Total num frames: 3420700672. Throughput: 0: 48819.8. Samples: 2949493640. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 08:58:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:58:56,593][71000] Updated weights for policy 0, policy_version 208784 (0.0023) [2024-06-13 08:58:59,798][70980] Signal inference workers to stop experience collection... (44150 times) [2024-06-13 08:58:59,798][70980] Signal inference workers to resume experience collection... 
(44150 times) [2024-06-13 08:58:59,811][71000] InferenceWorker_p0-w0: stopping experience collection (44150 times) [2024-06-13 08:58:59,811][71000] InferenceWorker_p0-w0: resuming experience collection (44150 times) [2024-06-13 08:59:00,414][71000] Updated weights for policy 0, policy_version 208794 (0.0027) [2024-06-13 08:59:00,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48879.0, 300 sec: 49041.1). Total num frames: 3420930048. Throughput: 0: 49162.0. Samples: 2949793600. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 08:59:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 08:59:03,435][71000] Updated weights for policy 0, policy_version 208804 (0.0029) [2024-06-13 08:59:05,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3421143040. Throughput: 0: 48883.1. Samples: 2949941500. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 08:59:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 08:59:07,064][71000] Updated weights for policy 0, policy_version 208814 (0.0027) [2024-06-13 08:59:09,939][71000] Updated weights for policy 0, policy_version 208824 (0.0024) [2024-06-13 08:59:10,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 3421405184. Throughput: 0: 48551.6. Samples: 2950225900. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 08:59:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 08:59:13,768][71000] Updated weights for policy 0, policy_version 208834 (0.0035) [2024-06-13 08:59:15,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3421667328. Throughput: 0: 48700.9. Samples: 2950517080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 08:59:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:59:16,429][71000] Updated weights for policy 0, policy_version 208844 (0.0024) [2024-06-13 08:59:20,517][71000] Updated weights for policy 0, policy_version 208854 (0.0027) [2024-06-13 08:59:20,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48059.7, 300 sec: 48929.8). Total num frames: 3421880320. Throughput: 0: 48757.3. Samples: 2950664480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 08:59:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:59:23,607][71000] Updated weights for policy 0, policy_version 208864 (0.0033) [2024-06-13 08:59:25,940][70768] Fps is (10 sec: 44236.9, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 3422109696. Throughput: 0: 48743.2. Samples: 2950952600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 08:59:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:59:27,339][71000] Updated weights for policy 0, policy_version 208874 (0.0033) [2024-06-13 08:59:30,310][71000] Updated weights for policy 0, policy_version 208884 (0.0032) [2024-06-13 08:59:30,940][70768] Fps is (10 sec: 50789.5, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 3422388224. Throughput: 0: 48803.3. Samples: 2951248400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 08:59:30,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 08:59:34,064][71000] Updated weights for policy 0, policy_version 208894 (0.0026) [2024-06-13 08:59:35,940][70768] Fps is (10 sec: 52428.2, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 3422633984. Throughput: 0: 48938.6. Samples: 2951398480. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 08:59:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 08:59:36,703][71000] Updated weights for policy 0, policy_version 208904 (0.0029) [2024-06-13 08:59:40,779][71000] Updated weights for policy 0, policy_version 208914 (0.0029) [2024-06-13 08:59:40,940][70768] Fps is (10 sec: 45875.6, 60 sec: 48332.8, 300 sec: 48929.8). Total num frames: 3422846976. Throughput: 0: 48862.1. Samples: 2951692440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 08:59:40,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 08:59:40,976][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000208915_3422863360.pth... [2024-06-13 08:59:41,024][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000208199_3411132416.pth [2024-06-13 08:59:43,549][71000] Updated weights for policy 0, policy_version 208924 (0.0022) [2024-06-13 08:59:45,940][70768] Fps is (10 sec: 45875.2, 60 sec: 48880.5, 300 sec: 48818.8). Total num frames: 3423092736. Throughput: 0: 48358.5. Samples: 2951969740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 08:59:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 08:59:47,579][71000] Updated weights for policy 0, policy_version 208934 (0.0028) [2024-06-13 08:59:50,436][71000] Updated weights for policy 0, policy_version 208944 (0.0037) [2024-06-13 08:59:50,943][70768] Fps is (10 sec: 49137.2, 60 sec: 48603.4, 300 sec: 48873.8). Total num frames: 3423338496. Throughput: 0: 48492.7. Samples: 2952123820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 08:59:50,943][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:59:54,206][71000] Updated weights for policy 0, policy_version 208954 (0.0027) [2024-06-13 08:59:55,940][70768] Fps is (10 sec: 50791.0, 60 sec: 48332.8, 300 sec: 48874.3). Total num frames: 3423600640. Throughput: 0: 48816.4. Samples: 2952422640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 08:59:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 08:59:56,766][71000] Updated weights for policy 0, policy_version 208964 (0.0029) [2024-06-13 09:00:00,695][70980] Signal inference workers to stop experience collection... (44200 times) [2024-06-13 09:00:00,695][70980] Signal inference workers to resume experience collection... (44200 times) [2024-06-13 09:00:00,710][71000] InferenceWorker_p0-w0: stopping experience collection (44200 times) [2024-06-13 09:00:00,710][71000] InferenceWorker_p0-w0: resuming experience collection (44200 times) [2024-06-13 09:00:00,852][71000] Updated weights for policy 0, policy_version 208974 (0.0033) [2024-06-13 09:00:00,940][70768] Fps is (10 sec: 49166.6, 60 sec: 48332.7, 300 sec: 48874.3). Total num frames: 3423830016. Throughput: 0: 48987.4. Samples: 2952721520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 09:00:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:00:03,498][71000] Updated weights for policy 0, policy_version 208984 (0.0019) [2024-06-13 09:00:05,939][70768] Fps is (10 sec: 47513.9, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3424075776. Throughput: 0: 48749.9. Samples: 2952858220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 09:00:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:00:07,413][71000] Updated weights for policy 0, policy_version 208994 (0.0024) [2024-06-13 09:00:10,249][71000] Updated weights for policy 0, policy_version 209004 (0.0028) [2024-06-13 09:00:10,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3424337920. Throughput: 0: 48994.6. Samples: 2953157360. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 09:00:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:00:14,195][71000] Updated weights for policy 0, policy_version 209014 (0.0023) [2024-06-13 09:00:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 3424583680. Throughput: 0: 48930.4. Samples: 2953450260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 09:00:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:00:16,723][71000] Updated weights for policy 0, policy_version 209024 (0.0035) [2024-06-13 09:00:20,804][71000] Updated weights for policy 0, policy_version 209034 (0.0023) [2024-06-13 09:00:20,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 3424813056. Throughput: 0: 48899.9. Samples: 2953598980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-13 09:00:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:00:23,275][71000] Updated weights for policy 0, policy_version 209044 (0.0026) [2024-06-13 09:00:25,944][70768] Fps is (10 sec: 49130.5, 60 sec: 49421.5, 300 sec: 48984.6). Total num frames: 3425075200. Throughput: 0: 48910.9. Samples: 2953893640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-13 09:00:25,944][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:00:27,245][71000] Updated weights for policy 0, policy_version 209054 (0.0022) [2024-06-13 09:00:30,310][71000] Updated weights for policy 0, policy_version 209064 (0.0023) [2024-06-13 09:00:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3425320960. Throughput: 0: 49310.6. Samples: 2954188720. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-13 09:00:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:00:34,030][71000] Updated weights for policy 0, policy_version 209074 (0.0031) [2024-06-13 09:00:35,940][70768] Fps is (10 sec: 47533.8, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 3425550336. Throughput: 0: 49252.6. Samples: 2954340040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-13 09:00:35,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:00:36,843][71000] Updated weights for policy 0, policy_version 209084 (0.0030) [2024-06-13 09:00:40,672][71000] Updated weights for policy 0, policy_version 209094 (0.0031) [2024-06-13 09:00:40,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49425.1, 300 sec: 48929.9). Total num frames: 3425812480. Throughput: 0: 49058.6. Samples: 2954630280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-13 09:00:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:00:43,645][71000] Updated weights for policy 0, policy_version 209104 (0.0031) [2024-06-13 09:00:45,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 3426058240. Throughput: 0: 48915.1. Samples: 2954922700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-13 09:00:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:00:47,383][71000] Updated weights for policy 0, policy_version 209114 (0.0030) [2024-06-13 09:00:50,505][71000] Updated weights for policy 0, policy_version 209124 (0.0025) [2024-06-13 09:00:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49427.6, 300 sec: 48985.4). Total num frames: 3426304000. Throughput: 0: 49248.3. Samples: 2955074400. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-13 09:00:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:00:54,255][71000] Updated weights for policy 0, policy_version 209134 (0.0028) [2024-06-13 09:00:55,941][70768] Fps is (10 sec: 44231.1, 60 sec: 48331.7, 300 sec: 48818.5). Total num frames: 3426500608. Throughput: 0: 48966.5. Samples: 2955360920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-13 09:00:55,942][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:00:57,186][71000] Updated weights for policy 0, policy_version 209144 (0.0026) [2024-06-13 09:01:00,908][71000] Updated weights for policy 0, policy_version 209154 (0.0032) [2024-06-13 09:01:00,940][70768] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3426779136. Throughput: 0: 48820.8. Samples: 2955647200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-13 09:01:00,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 09:01:03,844][71000] Updated weights for policy 0, policy_version 209164 (0.0032) [2024-06-13 09:01:05,940][70768] Fps is (10 sec: 54074.8, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3427041280. Throughput: 0: 49020.7. Samples: 2955804900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-13 09:01:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:01:07,342][71000] Updated weights for policy 0, policy_version 209174 (0.0034) [2024-06-13 09:01:09,573][70980] Signal inference workers to stop experience collection... (44250 times) [2024-06-13 09:01:09,574][70980] Signal inference workers to resume experience collection... 
(44250 times) [2024-06-13 09:01:09,605][71000] InferenceWorker_p0-w0: stopping experience collection (44250 times) [2024-06-13 09:01:09,606][71000] InferenceWorker_p0-w0: resuming experience collection (44250 times) [2024-06-13 09:01:10,730][71000] Updated weights for policy 0, policy_version 209184 (0.0026) [2024-06-13 09:01:10,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48879.0, 300 sec: 48985.4). Total num frames: 3427270656. Throughput: 0: 49019.5. Samples: 2956099300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-13 09:01:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:01:14,418][71000] Updated weights for policy 0, policy_version 209194 (0.0033) [2024-06-13 09:01:15,940][70768] Fps is (10 sec: 44236.8, 60 sec: 48332.8, 300 sec: 48763.2). Total num frames: 3427483648. Throughput: 0: 48845.1. Samples: 2956386740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-13 09:01:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:01:17,234][71000] Updated weights for policy 0, policy_version 209204 (0.0032) [2024-06-13 09:01:20,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.3, 300 sec: 48818.8). Total num frames: 3427762176. Throughput: 0: 48567.4. Samples: 2956525560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-13 09:01:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:01:20,946][71000] Updated weights for policy 0, policy_version 209214 (0.0026) [2024-06-13 09:01:24,083][71000] Updated weights for policy 0, policy_version 209224 (0.0029) [2024-06-13 09:01:25,940][70768] Fps is (10 sec: 54067.4, 60 sec: 49155.6, 300 sec: 49096.5). Total num frames: 3428024320. Throughput: 0: 48622.3. Samples: 2956818280. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-13 09:01:25,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 09:01:27,289][71000] Updated weights for policy 0, policy_version 209234 (0.0025) [2024-06-13 09:01:30,744][71000] Updated weights for policy 0, policy_version 209244 (0.0025) [2024-06-13 09:01:30,939][70768] Fps is (10 sec: 49151.7, 60 sec: 48879.1, 300 sec: 48929.9). Total num frames: 3428253696. Throughput: 0: 48871.3. Samples: 2957121900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 09:01:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:01:34,014][71000] Updated weights for policy 0, policy_version 209254 (0.0032) [2024-06-13 09:01:35,940][70768] Fps is (10 sec: 44236.7, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 3428466688. Throughput: 0: 48844.1. Samples: 2957272380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 09:01:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:01:37,363][71000] Updated weights for policy 0, policy_version 209264 (0.0030) [2024-06-13 09:01:40,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 3428728832. Throughput: 0: 48787.7. Samples: 2957556300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 09:01:40,949][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:01:41,030][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000209274_3428745216.pth... [2024-06-13 09:01:41,038][71000] Updated weights for policy 0, policy_version 209274 (0.0041) [2024-06-13 09:01:41,084][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000208557_3416997888.pth [2024-06-13 09:01:44,335][71000] Updated weights for policy 0, policy_version 209284 (0.0029) [2024-06-13 09:01:45,940][70768] Fps is (10 sec: 52428.8, 60 sec: 48879.0, 300 sec: 49096.5). Total num frames: 3428990976. Throughput: 0: 48763.7. Samples: 2957841560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 09:01:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:01:47,713][71000] Updated weights for policy 0, policy_version 209294 (0.0031) [2024-06-13 09:01:50,847][71000] Updated weights for policy 0, policy_version 209304 (0.0024) [2024-06-13 09:01:50,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3429236736. Throughput: 0: 48771.0. Samples: 2957999600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 09:01:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:01:54,356][71000] Updated weights for policy 0, policy_version 209314 (0.0025) [2024-06-13 09:01:55,940][70768] Fps is (10 sec: 45874.3, 60 sec: 49153.0, 300 sec: 48818.7). Total num frames: 3429449728. Throughput: 0: 48899.3. Samples: 2958299780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 09:01:55,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 09:01:57,525][71000] Updated weights for policy 0, policy_version 209324 (0.0034) [2024-06-13 09:02:00,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.1, 300 sec: 48929.8). Total num frames: 3429728256. Throughput: 0: 49185.8. Samples: 2958600100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 09:02:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:02:00,945][71000] Updated weights for policy 0, policy_version 209334 (0.0030) [2024-06-13 09:02:04,058][71000] Updated weights for policy 0, policy_version 209344 (0.0027) [2024-06-13 09:02:05,940][70768] Fps is (10 sec: 54068.4, 60 sec: 49152.0, 300 sec: 49096.5). Total num frames: 3429990400. Throughput: 0: 49286.6. Samples: 2958743460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 09:02:05,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:02:07,811][71000] Updated weights for policy 0, policy_version 209354 (0.0030) [2024-06-13 09:02:10,837][71000] Updated weights for policy 0, policy_version 209364 (0.0029) [2024-06-13 09:02:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3430219776. Throughput: 0: 49455.5. Samples: 2959043780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 09:02:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:02:13,189][70980] Signal inference workers to stop experience collection... (44300 times) [2024-06-13 09:02:13,232][71000] InferenceWorker_p0-w0: stopping experience collection (44300 times) [2024-06-13 09:02:13,247][70980] Signal inference workers to resume experience collection... (44300 times) [2024-06-13 09:02:13,251][71000] InferenceWorker_p0-w0: resuming experience collection (44300 times) [2024-06-13 09:02:14,434][71000] Updated weights for policy 0, policy_version 209374 (0.0033) [2024-06-13 09:02:15,940][70768] Fps is (10 sec: 44236.6, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3430432768. Throughput: 0: 49172.8. Samples: 2959334680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 09:02:15,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 09:02:17,548][71000] Updated weights for policy 0, policy_version 209384 (0.0018) [2024-06-13 09:02:20,906][71000] Updated weights for policy 0, policy_version 209394 (0.0026) [2024-06-13 09:02:20,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3430711296. Throughput: 0: 48832.5. Samples: 2959469840. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 09:02:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:02:23,935][71000] Updated weights for policy 0, policy_version 209404 (0.0029) [2024-06-13 09:02:25,939][70768] Fps is (10 sec: 54067.9, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3430973440. Throughput: 0: 49273.0. Samples: 2959773580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 09:02:25,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 09:02:27,491][71000] Updated weights for policy 0, policy_version 209414 (0.0032) [2024-06-13 09:02:30,387][71000] Updated weights for policy 0, policy_version 209424 (0.0023) [2024-06-13 09:02:30,940][70768] Fps is (10 sec: 52428.6, 60 sec: 49698.1, 300 sec: 49096.5). Total num frames: 3431235584. Throughput: 0: 49716.0. Samples: 2960078780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 09:02:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 09:02:34,103][71000] Updated weights for policy 0, policy_version 209434 (0.0036) [2024-06-13 09:02:35,940][70768] Fps is (10 sec: 44236.4, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3431415808. Throughput: 0: 49541.4. Samples: 2960228960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:02:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:02:37,106][71000] Updated weights for policy 0, policy_version 209444 (0.0039) [2024-06-13 09:02:40,799][71000] Updated weights for policy 0, policy_version 209454 (0.0027) [2024-06-13 09:02:40,940][70768] Fps is (10 sec: 45875.2, 60 sec: 49425.1, 300 sec: 49041.0). Total num frames: 3431694336. Throughput: 0: 49372.2. Samples: 2960521520. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:02:40,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:02:43,818][71000] Updated weights for policy 0, policy_version 209464 (0.0038) [2024-06-13 09:02:45,940][70768] Fps is (10 sec: 54065.3, 60 sec: 49424.8, 300 sec: 49151.9). Total num frames: 3431956480. Throughput: 0: 49060.5. Samples: 2960807840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:02:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:02:47,645][71000] Updated weights for policy 0, policy_version 209474 (0.0030) [2024-06-13 09:02:50,616][71000] Updated weights for policy 0, policy_version 209484 (0.0032) [2024-06-13 09:02:50,943][70768] Fps is (10 sec: 50773.0, 60 sec: 49422.3, 300 sec: 49040.3). Total num frames: 3432202240. Throughput: 0: 49240.2. Samples: 2960959440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:02:50,944][70768] Avg episode reward: [(0, '0.294')] [2024-06-13 09:02:54,013][71000] Updated weights for policy 0, policy_version 209494 (0.0029) [2024-06-13 09:02:55,940][70768] Fps is (10 sec: 45876.4, 60 sec: 49425.1, 300 sec: 48874.3). Total num frames: 3432415232. Throughput: 0: 49222.6. Samples: 2961258800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:02:55,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:02:57,424][71000] Updated weights for policy 0, policy_version 209504 (0.0030) [2024-06-13 09:03:00,587][71000] Updated weights for policy 0, policy_version 209514 (0.0032) [2024-06-13 09:03:00,940][70768] Fps is (10 sec: 49168.3, 60 sec: 49425.0, 300 sec: 49096.4). Total num frames: 3432693760. Throughput: 0: 49239.0. Samples: 2961550440. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:03:00,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:03:04,102][71000] Updated weights for policy 0, policy_version 209524 (0.0026) [2024-06-13 09:03:05,940][70768] Fps is (10 sec: 50790.9, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3432923136. Throughput: 0: 49533.7. Samples: 2961698860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:03:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:03:07,433][71000] Updated weights for policy 0, policy_version 209534 (0.0029) [2024-06-13 09:03:10,637][71000] Updated weights for policy 0, policy_version 209544 (0.0042) [2024-06-13 09:03:10,940][70768] Fps is (10 sec: 47514.1, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 3433168896. Throughput: 0: 49246.1. Samples: 2961989660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:03:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:03:14,368][71000] Updated weights for policy 0, policy_version 209554 (0.0031) [2024-06-13 09:03:15,940][70768] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 3433381888. Throughput: 0: 48968.9. Samples: 2962282380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:03:15,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:03:17,313][71000] Updated weights for policy 0, policy_version 209564 (0.0024) [2024-06-13 09:03:20,488][70980] Signal inference workers to stop experience collection... (44350 times) [2024-06-13 09:03:20,489][70980] Signal inference workers to resume experience collection... 
(44350 times) [2024-06-13 09:03:20,530][71000] InferenceWorker_p0-w0: stopping experience collection (44350 times) [2024-06-13 09:03:20,530][71000] InferenceWorker_p0-w0: resuming experience collection (44350 times) [2024-06-13 09:03:20,622][71000] Updated weights for policy 0, policy_version 209574 (0.0023) [2024-06-13 09:03:20,940][70768] Fps is (10 sec: 49151.1, 60 sec: 49151.8, 300 sec: 49096.4). Total num frames: 3433660416. Throughput: 0: 48741.1. Samples: 2962422320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:03:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:03:24,221][71000] Updated weights for policy 0, policy_version 209584 (0.0033) [2024-06-13 09:03:25,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 3433889792. Throughput: 0: 48867.6. Samples: 2962720560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:03:25,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 09:03:27,370][71000] Updated weights for policy 0, policy_version 209594 (0.0024) [2024-06-13 09:03:30,819][71000] Updated weights for policy 0, policy_version 209604 (0.0027) [2024-06-13 09:03:30,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 3434151936. Throughput: 0: 49092.3. Samples: 2963016980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:03:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 09:03:34,204][71000] Updated weights for policy 0, policy_version 209614 (0.0033) [2024-06-13 09:03:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 48985.4). Total num frames: 3434397696. Throughput: 0: 49003.7. Samples: 2963164440. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:03:35,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 09:03:37,118][71000] Updated weights for policy 0, policy_version 209624 (0.0026) [2024-06-13 09:03:40,441][71000] Updated weights for policy 0, policy_version 209634 (0.0027) [2024-06-13 09:03:40,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.0, 300 sec: 49152.3). Total num frames: 3434659840. Throughput: 0: 49214.7. Samples: 2963473460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:03:40,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 09:03:40,948][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000209635_3434659840.pth... [2024-06-13 09:03:41,004][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000208915_3422863360.pth [2024-06-13 09:03:43,739][71000] Updated weights for policy 0, policy_version 209644 (0.0020) [2024-06-13 09:03:45,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48879.3, 300 sec: 49041.0). Total num frames: 3434889216. Throughput: 0: 49565.5. Samples: 2963780880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-13 09:03:45,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 09:03:47,034][71000] Updated weights for policy 0, policy_version 209654 (0.0027) [2024-06-13 09:03:50,154][71000] Updated weights for policy 0, policy_version 209664 (0.0030) [2024-06-13 09:03:50,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49154.8, 300 sec: 48985.4). Total num frames: 3435151360. Throughput: 0: 49499.5. Samples: 2963926340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-13 09:03:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 09:03:53,474][71000] Updated weights for policy 0, policy_version 209674 (0.0031) [2024-06-13 09:03:55,939][70768] Fps is (10 sec: 50790.3, 60 sec: 49698.2, 300 sec: 49040.9). Total num frames: 3435397120. Throughput: 0: 49709.8. Samples: 2964226600. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-13 09:03:55,940][70768] Avg episode reward: [(0, '0.278')] [2024-06-13 09:03:56,819][71000] Updated weights for policy 0, policy_version 209684 (0.0021) [2024-06-13 09:04:00,126][71000] Updated weights for policy 0, policy_version 209694 (0.0026) [2024-06-13 09:04:00,939][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.2, 300 sec: 49152.0). Total num frames: 3435642880. Throughput: 0: 49769.0. Samples: 2964521980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-13 09:04:00,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 09:04:03,280][71000] Updated weights for policy 0, policy_version 209704 (0.0026) [2024-06-13 09:04:05,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49096.4). Total num frames: 3435888640. Throughput: 0: 50118.0. Samples: 2964677620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-13 09:04:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:04:06,539][71000] Updated weights for policy 0, policy_version 209714 (0.0029) [2024-06-13 09:04:10,056][71000] Updated weights for policy 0, policy_version 209724 (0.0029) [2024-06-13 09:04:10,940][70768] Fps is (10 sec: 50790.0, 60 sec: 49698.2, 300 sec: 49096.5). Total num frames: 3436150784. Throughput: 0: 50224.0. Samples: 2964980640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-13 09:04:10,940][70768] Avg episode reward: [(0, '0.275')] [2024-06-13 09:04:13,317][71000] Updated weights for policy 0, policy_version 209734 (0.0027) [2024-06-13 09:04:15,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49971.2, 300 sec: 49152.0). Total num frames: 3436380160. Throughput: 0: 49911.2. Samples: 2965262980. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-13 09:04:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:04:16,692][71000] Updated weights for policy 0, policy_version 209744 (0.0027) [2024-06-13 09:04:19,854][71000] Updated weights for policy 0, policy_version 209754 (0.0024) [2024-06-13 09:04:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49698.3, 300 sec: 49263.1). Total num frames: 3436642304. Throughput: 0: 50067.6. Samples: 2965417480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-13 09:04:20,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:04:23,649][71000] Updated weights for policy 0, policy_version 209764 (0.0037) [2024-06-13 09:04:25,939][70768] Fps is (10 sec: 50790.8, 60 sec: 49971.3, 300 sec: 49152.1). Total num frames: 3436888064. Throughput: 0: 49657.5. Samples: 2965708040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-13 09:04:25,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:04:26,815][71000] Updated weights for policy 0, policy_version 209774 (0.0026) [2024-06-13 09:04:30,410][71000] Updated weights for policy 0, policy_version 209784 (0.0026) [2024-06-13 09:04:30,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49425.2, 300 sec: 49096.5). Total num frames: 3437117440. Throughput: 0: 49472.0. Samples: 2966007120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-13 09:04:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:04:33,463][71000] Updated weights for policy 0, policy_version 209794 (0.0025) [2024-06-13 09:04:35,939][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.1, 300 sec: 49207.6). Total num frames: 3437363200. Throughput: 0: 49474.7. Samples: 2966152700. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-13 09:04:35,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:04:36,845][71000] Updated weights for policy 0, policy_version 209804 (0.0039) [2024-06-13 09:04:40,334][71000] Updated weights for policy 0, policy_version 209814 (0.0028) [2024-06-13 09:04:40,940][70768] Fps is (10 sec: 50789.1, 60 sec: 49425.0, 300 sec: 49263.1). Total num frames: 3437625344. Throughput: 0: 49326.9. Samples: 2966446320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-13 09:04:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 09:04:43,850][71000] Updated weights for policy 0, policy_version 209824 (0.0033) [2024-06-13 09:04:44,865][70980] Signal inference workers to stop experience collection... (44400 times) [2024-06-13 09:04:44,903][71000] InferenceWorker_p0-w0: stopping experience collection (44400 times) [2024-06-13 09:04:44,924][70980] Signal inference workers to resume experience collection... (44400 times) [2024-06-13 09:04:44,925][71000] InferenceWorker_p0-w0: resuming experience collection (44400 times) [2024-06-13 09:04:45,939][70768] Fps is (10 sec: 49152.1, 60 sec: 49425.0, 300 sec: 49208.1). Total num frames: 3437854720. Throughput: 0: 49032.0. Samples: 2966728420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-13 09:04:45,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 09:04:46,966][71000] Updated weights for policy 0, policy_version 209834 (0.0029) [2024-06-13 09:04:50,629][71000] Updated weights for policy 0, policy_version 209844 (0.0028) [2024-06-13 09:04:50,939][70768] Fps is (10 sec: 47514.6, 60 sec: 49152.1, 300 sec: 49152.0). Total num frames: 3438100480. Throughput: 0: 48737.4. Samples: 2966870800. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 09:04:50,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:04:53,931][71000] Updated weights for policy 0, policy_version 209854 (0.0034) [2024-06-13 09:04:55,940][70768] Fps is (10 sec: 49150.8, 60 sec: 49151.8, 300 sec: 49207.5). Total num frames: 3438346240. Throughput: 0: 48549.6. Samples: 2967165380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 09:04:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:04:57,429][71000] Updated weights for policy 0, policy_version 209864 (0.0033) [2024-06-13 09:05:00,464][71000] Updated weights for policy 0, policy_version 209874 (0.0023) [2024-06-13 09:05:00,943][70768] Fps is (10 sec: 49134.5, 60 sec: 49149.0, 300 sec: 49206.9). Total num frames: 3438592000. Throughput: 0: 48813.0. Samples: 2967459740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 09:05:00,944][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 09:05:04,227][71000] Updated weights for policy 0, policy_version 209884 (0.0032) [2024-06-13 09:05:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3438837760. Throughput: 0: 48630.9. Samples: 2967605880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 09:05:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:05:07,245][71000] Updated weights for policy 0, policy_version 209894 (0.0024) [2024-06-13 09:05:10,654][71000] Updated weights for policy 0, policy_version 209904 (0.0023) [2024-06-13 09:05:10,939][70768] Fps is (10 sec: 47530.7, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 3439067136. Throughput: 0: 48779.5. Samples: 2967903120. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 09:05:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:05:14,183][71000] Updated weights for policy 0, policy_version 209914 (0.0024) [2024-06-13 09:05:15,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49151.8, 300 sec: 49207.5). Total num frames: 3439329280. Throughput: 0: 48342.0. Samples: 2968182520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 09:05:15,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 09:05:17,826][71000] Updated weights for policy 0, policy_version 209924 (0.0033) [2024-06-13 09:05:20,682][71000] Updated weights for policy 0, policy_version 209934 (0.0034) [2024-06-13 09:05:20,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.8, 300 sec: 49097.2). Total num frames: 3439558656. Throughput: 0: 48596.4. Samples: 2968339540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 09:05:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:05:24,266][71000] Updated weights for policy 0, policy_version 209944 (0.0032) [2024-06-13 09:05:25,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48878.7, 300 sec: 49152.0). Total num frames: 3439820800. Throughput: 0: 48803.2. Samples: 2968642460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 09:05:25,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 09:05:27,478][71000] Updated weights for policy 0, policy_version 209954 (0.0030) [2024-06-13 09:05:30,940][70768] Fps is (10 sec: 45875.3, 60 sec: 48332.7, 300 sec: 49040.9). Total num frames: 3440017408. Throughput: 0: 48901.7. Samples: 2968929000. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 09:05:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:05:31,206][71000] Updated weights for policy 0, policy_version 209964 (0.0025) [2024-06-13 09:05:33,884][71000] Updated weights for policy 0, policy_version 209974 (0.0029) [2024-06-13 09:05:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3440312320. Throughput: 0: 49078.5. Samples: 2969079340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 09:05:35,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:05:37,603][71000] Updated weights for policy 0, policy_version 209984 (0.0038) [2024-06-13 09:05:40,405][71000] Updated weights for policy 0, policy_version 209994 (0.0024) [2024-06-13 09:05:40,940][70768] Fps is (10 sec: 54067.2, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 3440558080. Throughput: 0: 49033.1. Samples: 2969371860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 09:05:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:05:40,976][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000209996_3440574464.pth... [2024-06-13 09:05:41,011][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000209274_3428745216.pth [2024-06-13 09:05:44,377][71000] Updated weights for policy 0, policy_version 210004 (0.0027) [2024-06-13 09:05:45,940][70768] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 49152.0). Total num frames: 3440803840. Throughput: 0: 49154.5. Samples: 2969671520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 09:05:45,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 09:05:47,073][71000] Updated weights for policy 0, policy_version 210014 (0.0033) [2024-06-13 09:05:50,823][71000] Updated weights for policy 0, policy_version 210024 (0.0026) [2024-06-13 09:05:50,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.8, 300 sec: 49263.3). 
Total num frames: 3441033216. Throughput: 0: 49052.5. Samples: 2969813240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 09:05:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:05:53,857][71000] Updated weights for policy 0, policy_version 210034 (0.0029) [2024-06-13 09:05:55,590][70980] Signal inference workers to stop experience collection... (44450 times) [2024-06-13 09:05:55,592][70980] Signal inference workers to resume experience collection... (44450 times) [2024-06-13 09:05:55,640][71000] InferenceWorker_p0-w0: stopping experience collection (44450 times) [2024-06-13 09:05:55,640][71000] InferenceWorker_p0-w0: resuming experience collection (44450 times) [2024-06-13 09:05:55,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 49207.6). Total num frames: 3441295360. Throughput: 0: 48983.9. Samples: 2970107400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 09:05:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:05:57,891][71000] Updated weights for policy 0, policy_version 210044 (0.0029) [2024-06-13 09:06:00,405][71000] Updated weights for policy 0, policy_version 210054 (0.0034) [2024-06-13 09:06:00,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49154.8, 300 sec: 49152.0). Total num frames: 3441541120. Throughput: 0: 49283.6. Samples: 2970400280. Policy #0 lag: (min: 1.0, avg: 7.9, max: 18.0) [2024-06-13 09:06:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:06:04,316][71000] Updated weights for policy 0, policy_version 210064 (0.0028) [2024-06-13 09:06:05,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.2, 300 sec: 49207.5). Total num frames: 3441786880. Throughput: 0: 49071.6. Samples: 2970547760. 
Policy #0 lag: (min: 1.0, avg: 7.9, max: 18.0) [2024-06-13 09:06:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:06:07,370][71000] Updated weights for policy 0, policy_version 210074 (0.0024) [2024-06-13 09:06:10,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48878.9, 300 sec: 49207.5). Total num frames: 3441999872. Throughput: 0: 48844.1. Samples: 2970840440. Policy #0 lag: (min: 1.0, avg: 7.9, max: 18.0) [2024-06-13 09:06:10,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:06:11,014][71000] Updated weights for policy 0, policy_version 210084 (0.0026) [2024-06-13 09:06:13,851][71000] Updated weights for policy 0, policy_version 210094 (0.0027) [2024-06-13 09:06:15,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48879.1, 300 sec: 49152.0). Total num frames: 3442262016. Throughput: 0: 49104.4. Samples: 2971138700. Policy #0 lag: (min: 1.0, avg: 7.9, max: 18.0) [2024-06-13 09:06:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:06:17,893][71000] Updated weights for policy 0, policy_version 210104 (0.0037) [2024-06-13 09:06:20,583][71000] Updated weights for policy 0, policy_version 210114 (0.0027) [2024-06-13 09:06:20,940][70768] Fps is (10 sec: 52429.0, 60 sec: 49425.1, 300 sec: 49152.0). Total num frames: 3442524160. Throughput: 0: 48930.4. Samples: 2971281200. Policy #0 lag: (min: 1.0, avg: 7.9, max: 18.0) [2024-06-13 09:06:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:06:24,689][71000] Updated weights for policy 0, policy_version 210124 (0.0028) [2024-06-13 09:06:25,939][70768] Fps is (10 sec: 45875.4, 60 sec: 48332.9, 300 sec: 49040.9). Total num frames: 3442720768. Throughput: 0: 48954.7. Samples: 2971574820. 
Policy #0 lag: (min: 1.0, avg: 7.9, max: 18.0) [2024-06-13 09:06:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:06:27,294][71000] Updated weights for policy 0, policy_version 210134 (0.0025) [2024-06-13 09:06:30,940][70768] Fps is (10 sec: 44236.4, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3442966528. Throughput: 0: 48654.1. Samples: 2971860960. Policy #0 lag: (min: 1.0, avg: 7.9, max: 18.0) [2024-06-13 09:06:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:06:31,321][71000] Updated weights for policy 0, policy_version 210144 (0.0032) [2024-06-13 09:06:33,809][71000] Updated weights for policy 0, policy_version 210154 (0.0032) [2024-06-13 09:06:35,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48606.0, 300 sec: 49152.0). Total num frames: 3443228672. Throughput: 0: 48740.2. Samples: 2972006540. Policy #0 lag: (min: 1.0, avg: 7.9, max: 18.0) [2024-06-13 09:06:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:06:37,903][71000] Updated weights for policy 0, policy_version 210164 (0.0030) [2024-06-13 09:06:40,555][71000] Updated weights for policy 0, policy_version 210174 (0.0028) [2024-06-13 09:06:40,940][70768] Fps is (10 sec: 54067.4, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 3443507200. Throughput: 0: 48866.2. Samples: 2972306380. Policy #0 lag: (min: 1.0, avg: 7.9, max: 18.0) [2024-06-13 09:06:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:06:44,803][71000] Updated weights for policy 0, policy_version 210184 (0.0029) [2024-06-13 09:06:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 49096.5). Total num frames: 3443720192. Throughput: 0: 48933.9. Samples: 2972602300. 
Policy #0 lag: (min: 1.0, avg: 7.9, max: 18.0) [2024-06-13 09:06:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 09:06:47,205][71000] Updated weights for policy 0, policy_version 210194 (0.0024) [2024-06-13 09:06:50,940][70768] Fps is (10 sec: 44236.5, 60 sec: 48605.9, 300 sec: 49152.0). Total num frames: 3443949568. Throughput: 0: 48730.5. Samples: 2972740640. Policy #0 lag: (min: 1.0, avg: 7.9, max: 18.0) [2024-06-13 09:06:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:06:51,419][71000] Updated weights for policy 0, policy_version 210204 (0.0027) [2024-06-13 09:06:54,210][71000] Updated weights for policy 0, policy_version 210214 (0.0028) [2024-06-13 09:06:55,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48332.7, 300 sec: 49040.9). Total num frames: 3444195328. Throughput: 0: 48411.9. Samples: 2973018980. Policy #0 lag: (min: 1.0, avg: 7.9, max: 18.0) [2024-06-13 09:06:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:06:58,050][71000] Updated weights for policy 0, policy_version 210224 (0.0028) [2024-06-13 09:07:00,133][70980] Signal inference workers to stop experience collection... (44500 times) [2024-06-13 09:07:00,180][71000] InferenceWorker_p0-w0: stopping experience collection (44500 times) [2024-06-13 09:07:00,181][70980] Signal inference workers to resume experience collection... (44500 times) [2024-06-13 09:07:00,199][71000] InferenceWorker_p0-w0: resuming experience collection (44500 times) [2024-06-13 09:07:00,939][70768] Fps is (10 sec: 50791.4, 60 sec: 48606.0, 300 sec: 49040.9). Total num frames: 3444457472. Throughput: 0: 48452.1. Samples: 2973319040. 
Policy #0 lag: (min: 1.0, avg: 7.9, max: 18.0) [2024-06-13 09:07:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:07:00,970][71000] Updated weights for policy 0, policy_version 210234 (0.0030) [2024-06-13 09:07:04,863][71000] Updated weights for policy 0, policy_version 210244 (0.0033) [2024-06-13 09:07:05,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48332.7, 300 sec: 49040.9). Total num frames: 3444686848. Throughput: 0: 48420.7. Samples: 2973460140. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 09:07:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:07:07,747][71000] Updated weights for policy 0, policy_version 210254 (0.0036) [2024-06-13 09:07:10,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48878.9, 300 sec: 49152.0). Total num frames: 3444932608. Throughput: 0: 48414.6. Samples: 2973753480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 09:07:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:07:11,840][71000] Updated weights for policy 0, policy_version 210264 (0.0033) [2024-06-13 09:07:14,560][71000] Updated weights for policy 0, policy_version 210274 (0.0023) [2024-06-13 09:07:15,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.7, 300 sec: 49040.9). Total num frames: 3445178368. Throughput: 0: 48278.6. Samples: 2974033500. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 09:07:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:07:18,259][71000] Updated weights for policy 0, policy_version 210284 (0.0026) [2024-06-13 09:07:20,940][70768] Fps is (10 sec: 49151.6, 60 sec: 48332.7, 300 sec: 48985.4). Total num frames: 3445424128. Throughput: 0: 48470.6. Samples: 2974187720. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 09:07:20,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:07:21,289][71000] Updated weights for policy 0, policy_version 210294 (0.0029) [2024-06-13 09:07:24,986][71000] Updated weights for policy 0, policy_version 210304 (0.0023) [2024-06-13 09:07:25,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 3445669888. Throughput: 0: 48534.1. Samples: 2974490420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 09:07:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:07:27,677][71000] Updated weights for policy 0, policy_version 210314 (0.0025) [2024-06-13 09:07:30,940][70768] Fps is (10 sec: 49149.5, 60 sec: 49151.6, 300 sec: 49151.9). Total num frames: 3445915648. Throughput: 0: 48676.2. Samples: 2974792760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 09:07:30,941][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:07:31,575][71000] Updated weights for policy 0, policy_version 210324 (0.0022) [2024-06-13 09:07:34,251][71000] Updated weights for policy 0, policy_version 210334 (0.0028) [2024-06-13 09:07:35,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 3446177792. Throughput: 0: 48776.4. Samples: 2974935580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 09:07:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:07:38,047][71000] Updated weights for policy 0, policy_version 210344 (0.0031) [2024-06-13 09:07:40,940][70768] Fps is (10 sec: 50793.4, 60 sec: 48605.9, 300 sec: 49041.0). Total num frames: 3446423552. Throughput: 0: 49239.3. Samples: 2975234740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 09:07:40,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 09:07:40,967][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000210354_3446439936.pth... 
[2024-06-13 09:07:40,971][71000] Updated weights for policy 0, policy_version 210354 (0.0023) [2024-06-13 09:07:41,011][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000209635_3434659840.pth [2024-06-13 09:07:44,831][71000] Updated weights for policy 0, policy_version 210364 (0.0032) [2024-06-13 09:07:45,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48878.8, 300 sec: 48985.9). Total num frames: 3446652928. Throughput: 0: 49028.2. Samples: 2975525320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 09:07:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:07:47,763][71000] Updated weights for policy 0, policy_version 210374 (0.0035) [2024-06-13 09:07:50,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48879.0, 300 sec: 49040.9). Total num frames: 3446882304. Throughput: 0: 49012.1. Samples: 2975665680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 09:07:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:07:51,451][71000] Updated weights for policy 0, policy_version 210384 (0.0030) [2024-06-13 09:07:54,544][71000] Updated weights for policy 0, policy_version 210394 (0.0037) [2024-06-13 09:07:55,940][70768] Fps is (10 sec: 50790.4, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 3447160832. Throughput: 0: 49123.4. Samples: 2975964040. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 09:07:55,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 09:07:58,022][71000] Updated weights for policy 0, policy_version 210404 (0.0030) [2024-06-13 09:08:00,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.8, 300 sec: 49040.9). Total num frames: 3447390208. Throughput: 0: 49530.3. Samples: 2976262360. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 09:08:00,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:08:01,363][71000] Updated weights for policy 0, policy_version 210414 (0.0031) [2024-06-13 09:08:04,719][71000] Updated weights for policy 0, policy_version 210424 (0.0034) [2024-06-13 09:08:05,940][70768] Fps is (10 sec: 47514.2, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3447635968. Throughput: 0: 49452.1. Samples: 2976413060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 09:08:05,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:08:07,914][71000] Updated weights for policy 0, policy_version 210434 (0.0024) [2024-06-13 09:08:10,871][70980] Signal inference workers to stop experience collection... (44550 times) [2024-06-13 09:08:10,919][71000] InferenceWorker_p0-w0: stopping experience collection (44550 times) [2024-06-13 09:08:10,924][70980] Signal inference workers to resume experience collection... (44550 times) [2024-06-13 09:08:10,934][71000] InferenceWorker_p0-w0: resuming experience collection (44550 times) [2024-06-13 09:08:10,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49152.0). Total num frames: 3447881728. Throughput: 0: 49231.6. Samples: 2976705840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-13 09:08:10,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:08:11,209][71000] Updated weights for policy 0, policy_version 210444 (0.0033) [2024-06-13 09:08:14,550][71000] Updated weights for policy 0, policy_version 210454 (0.0029) [2024-06-13 09:08:15,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49425.1, 300 sec: 49096.5). Total num frames: 3448143872. Throughput: 0: 48877.4. Samples: 2976992220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 09:08:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:08:18,081][71000] Updated weights for policy 0, policy_version 210464 (0.0036) [2024-06-13 09:08:20,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 49096.4). Total num frames: 3448373248. Throughput: 0: 48961.4. Samples: 2977138840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 09:08:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:08:21,291][71000] Updated weights for policy 0, policy_version 210474 (0.0032) [2024-06-13 09:08:24,780][71000] Updated weights for policy 0, policy_version 210484 (0.0034) [2024-06-13 09:08:25,940][70768] Fps is (10 sec: 47514.5, 60 sec: 49152.1, 300 sec: 49040.9). Total num frames: 3448619008. Throughput: 0: 48910.7. Samples: 2977435720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 09:08:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:08:28,226][71000] Updated weights for policy 0, policy_version 210494 (0.0035) [2024-06-13 09:08:30,939][70768] Fps is (10 sec: 47514.3, 60 sec: 48879.5, 300 sec: 48985.4). Total num frames: 3448848384. Throughput: 0: 49130.0. Samples: 2977736160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 09:08:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:08:31,270][71000] Updated weights for policy 0, policy_version 210504 (0.0023) [2024-06-13 09:08:34,768][71000] Updated weights for policy 0, policy_version 210514 (0.0035) [2024-06-13 09:08:35,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48606.0, 300 sec: 48929.9). Total num frames: 3449094144. Throughput: 0: 49171.1. Samples: 2977878380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 09:08:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:08:38,252][71000] Updated weights for policy 0, policy_version 210524 (0.0027) [2024-06-13 09:08:40,940][70768] Fps is (10 sec: 49151.5, 60 sec: 48605.8, 300 sec: 48985.4). Total num frames: 3449339904. Throughput: 0: 48973.9. Samples: 2978167860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 09:08:40,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:08:41,558][71000] Updated weights for policy 0, policy_version 210534 (0.0029) [2024-06-13 09:08:44,697][71000] Updated weights for policy 0, policy_version 210544 (0.0029) [2024-06-13 09:08:45,939][70768] Fps is (10 sec: 50790.7, 60 sec: 49152.2, 300 sec: 48985.4). Total num frames: 3449602048. Throughput: 0: 48833.5. Samples: 2978459860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 09:08:45,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 09:08:48,431][71000] Updated weights for policy 0, policy_version 210554 (0.0031) [2024-06-13 09:08:50,940][70768] Fps is (10 sec: 49151.6, 60 sec: 49151.9, 300 sec: 48929.8). Total num frames: 3449831424. Throughput: 0: 48907.5. Samples: 2978613900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 09:08:50,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 09:08:51,574][71000] Updated weights for policy 0, policy_version 210564 (0.0029) [2024-06-13 09:08:54,897][71000] Updated weights for policy 0, policy_version 210574 (0.0031) [2024-06-13 09:08:55,940][70768] Fps is (10 sec: 45874.9, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 3450060800. Throughput: 0: 48926.3. Samples: 2978907520. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 09:08:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:08:58,402][71000] Updated weights for policy 0, policy_version 210584 (0.0028) [2024-06-13 09:09:00,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3450306560. Throughput: 0: 48877.4. Samples: 2979191700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 09:09:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:09:01,956][71000] Updated weights for policy 0, policy_version 210594 (0.0038) [2024-06-13 09:09:05,169][71000] Updated weights for policy 0, policy_version 210604 (0.0028) [2024-06-13 09:09:05,940][70768] Fps is (10 sec: 54067.3, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 3450601472. Throughput: 0: 48906.8. Samples: 2979339640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 09:09:05,940][70768] Avg episode reward: [(0, '0.295')] [2024-06-13 09:09:08,475][71000] Updated weights for policy 0, policy_version 210614 (0.0034) [2024-06-13 09:09:10,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3450798080. Throughput: 0: 48893.7. Samples: 2979635940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 09:09:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:09:11,685][71000] Updated weights for policy 0, policy_version 210624 (0.0038) [2024-06-13 09:09:15,257][71000] Updated weights for policy 0, policy_version 210634 (0.0030) [2024-06-13 09:09:15,940][70768] Fps is (10 sec: 44236.3, 60 sec: 48332.8, 300 sec: 48818.7). Total num frames: 3451043840. Throughput: 0: 48595.8. Samples: 2979922980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 09:09:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:09:18,739][71000] Updated weights for policy 0, policy_version 210644 (0.0029) [2024-06-13 09:09:18,943][70980] Signal inference workers to stop experience collection... 
(44600 times) [2024-06-13 09:09:18,943][70980] Signal inference workers to resume experience collection... (44600 times) [2024-06-13 09:09:18,960][71000] InferenceWorker_p0-w0: stopping experience collection (44600 times) [2024-06-13 09:09:18,961][71000] InferenceWorker_p0-w0: resuming experience collection (44600 times) [2024-06-13 09:09:20,940][70768] Fps is (10 sec: 47513.7, 60 sec: 48332.9, 300 sec: 48763.2). Total num frames: 3451273216. Throughput: 0: 48797.8. Samples: 2980074280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-13 09:09:20,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 09:09:22,210][71000] Updated weights for policy 0, policy_version 210654 (0.0026) [2024-06-13 09:09:25,304][71000] Updated weights for policy 0, policy_version 210664 (0.0026) [2024-06-13 09:09:25,939][70768] Fps is (10 sec: 50791.3, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 3451551744. Throughput: 0: 48716.6. Samples: 2980360100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 09:09:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:09:29,039][71000] Updated weights for policy 0, policy_version 210674 (0.0030) [2024-06-13 09:09:30,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3451781120. Throughput: 0: 48700.4. Samples: 2980651380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 09:09:30,940][70768] Avg episode reward: [(0, '0.295')] [2024-06-13 09:09:32,133][71000] Updated weights for policy 0, policy_version 210684 (0.0032) [2024-06-13 09:09:35,520][71000] Updated weights for policy 0, policy_version 210694 (0.0025) [2024-06-13 09:09:35,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 3452026880. Throughput: 0: 48564.5. Samples: 2980799300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 09:09:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:09:38,775][71000] Updated weights for policy 0, policy_version 210704 (0.0025) [2024-06-13 09:09:40,939][70768] Fps is (10 sec: 47513.8, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 3452256256. Throughput: 0: 48649.9. Samples: 2981096760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 09:09:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:09:40,970][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000210710_3452272640.pth... [2024-06-13 09:09:41,014][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000209996_3440574464.pth [2024-06-13 09:09:42,458][71000] Updated weights for policy 0, policy_version 210714 (0.0034) [2024-06-13 09:09:45,284][71000] Updated weights for policy 0, policy_version 210724 (0.0022) [2024-06-13 09:09:45,940][70768] Fps is (10 sec: 49152.5, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3452518400. Throughput: 0: 48735.7. Samples: 2981384800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 09:09:45,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 09:09:49,164][71000] Updated weights for policy 0, policy_version 210734 (0.0034) [2024-06-13 09:09:50,939][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.1, 300 sec: 48929.9). Total num frames: 3452780544. Throughput: 0: 48794.7. Samples: 2981535400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 09:09:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:09:51,881][71000] Updated weights for policy 0, policy_version 210744 (0.0029) [2024-06-13 09:09:55,860][71000] Updated weights for policy 0, policy_version 210754 (0.0024) [2024-06-13 09:09:55,940][70768] Fps is (10 sec: 47511.8, 60 sec: 48878.6, 300 sec: 48819.3). Total num frames: 3452993536. Throughput: 0: 48845.4. Samples: 2981834000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 09:09:55,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 09:09:58,554][71000] Updated weights for policy 0, policy_version 210764 (0.0028) [2024-06-13 09:10:00,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3453255680. Throughput: 0: 48727.6. Samples: 2982115720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 09:10:00,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 09:10:02,416][71000] Updated weights for policy 0, policy_version 210774 (0.0024) [2024-06-13 09:10:05,500][71000] Updated weights for policy 0, policy_version 210784 (0.0033) [2024-06-13 09:10:05,940][70768] Fps is (10 sec: 50792.0, 60 sec: 48332.7, 300 sec: 48929.8). Total num frames: 3453501440. Throughput: 0: 48745.3. Samples: 2982267820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 09:10:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:10:09,211][71000] Updated weights for policy 0, policy_version 210794 (0.0028) [2024-06-13 09:10:10,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3453747200. Throughput: 0: 49021.3. Samples: 2982566060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 09:10:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:10:11,995][71000] Updated weights for policy 0, policy_version 210804 (0.0028) [2024-06-13 09:10:15,843][71000] Updated weights for policy 0, policy_version 210814 (0.0026) [2024-06-13 09:10:15,939][70768] Fps is (10 sec: 47514.0, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 3453976576. Throughput: 0: 49168.0. Samples: 2982863940. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 09:10:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:10:18,946][71000] Updated weights for policy 0, policy_version 210824 (0.0029) [2024-06-13 09:10:20,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49151.9, 300 sec: 48818.8). Total num frames: 3454222336. Throughput: 0: 48900.4. Samples: 2982999820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 09:10:20,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:10:22,559][71000] Updated weights for policy 0, policy_version 210834 (0.0031) [2024-06-13 09:10:25,608][71000] Updated weights for policy 0, policy_version 210844 (0.0037) [2024-06-13 09:10:25,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3454484480. Throughput: 0: 48725.3. Samples: 2983289400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 09:10:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:10:29,090][71000] Updated weights for policy 0, policy_version 210854 (0.0030) [2024-06-13 09:10:30,940][70768] Fps is (10 sec: 50790.7, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 3454730240. Throughput: 0: 49059.9. Samples: 2983592500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 09:10:30,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:10:32,061][71000] Updated weights for policy 0, policy_version 210864 (0.0025) [2024-06-13 09:10:35,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 3454943232. Throughput: 0: 48934.1. Samples: 2983737440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 09:10:35,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 09:10:36,086][71000] Updated weights for policy 0, policy_version 210874 (0.0041) [2024-06-13 09:10:37,599][70980] Signal inference workers to stop experience collection... 
(44650 times) [2024-06-13 09:10:37,599][70980] Signal inference workers to resume experience collection... (44650 times) [2024-06-13 09:10:37,641][71000] InferenceWorker_p0-w0: stopping experience collection (44650 times) [2024-06-13 09:10:37,641][71000] InferenceWorker_p0-w0: resuming experience collection (44650 times) [2024-06-13 09:10:39,132][71000] Updated weights for policy 0, policy_version 210884 (0.0023) [2024-06-13 09:10:40,940][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 3455205376. Throughput: 0: 48560.4. Samples: 2984019200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 09:10:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:10:42,768][71000] Updated weights for policy 0, policy_version 210894 (0.0036) [2024-06-13 09:10:45,511][71000] Updated weights for policy 0, policy_version 210904 (0.0026) [2024-06-13 09:10:45,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3455451136. Throughput: 0: 48695.5. Samples: 2984307020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 09:10:45,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 09:10:49,938][71000] Updated weights for policy 0, policy_version 210914 (0.0030) [2024-06-13 09:10:50,939][70768] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 3455696896. Throughput: 0: 48609.4. Samples: 2984455240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 09:10:50,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:10:52,694][71000] Updated weights for policy 0, policy_version 210924 (0.0032) [2024-06-13 09:10:55,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48606.1, 300 sec: 48707.7). Total num frames: 3455909888. Throughput: 0: 48744.4. Samples: 2984759560. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 09:10:55,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:10:56,371][71000] Updated weights for policy 0, policy_version 210934 (0.0027) [2024-06-13 09:10:59,278][71000] Updated weights for policy 0, policy_version 210944 (0.0022) [2024-06-13 09:11:00,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 3456155648. Throughput: 0: 48593.3. Samples: 2985050640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 09:11:00,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:11:03,215][71000] Updated weights for policy 0, policy_version 210954 (0.0027) [2024-06-13 09:11:05,940][70768] Fps is (10 sec: 50790.5, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3456417792. Throughput: 0: 48855.7. Samples: 2985198320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 09:11:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:11:05,989][71000] Updated weights for policy 0, policy_version 210964 (0.0031) [2024-06-13 09:11:09,719][71000] Updated weights for policy 0, policy_version 210974 (0.0043) [2024-06-13 09:11:10,940][70768] Fps is (10 sec: 50789.7, 60 sec: 48605.8, 300 sec: 48818.7). Total num frames: 3456663552. Throughput: 0: 48846.5. Samples: 2985487500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 09:11:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:11:12,600][71000] Updated weights for policy 0, policy_version 210984 (0.0030) [2024-06-13 09:11:15,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 3456892928. Throughput: 0: 48568.9. Samples: 2985778100. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 09:11:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:11:16,245][71000] Updated weights for policy 0, policy_version 210994 (0.0030) [2024-06-13 09:11:19,181][71000] Updated weights for policy 0, policy_version 211004 (0.0032) [2024-06-13 09:11:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3457138688. Throughput: 0: 48615.0. Samples: 2985925120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 09:11:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:11:23,146][71000] Updated weights for policy 0, policy_version 211014 (0.0027) [2024-06-13 09:11:25,829][71000] Updated weights for policy 0, policy_version 211024 (0.0028) [2024-06-13 09:11:25,940][70768] Fps is (10 sec: 52428.3, 60 sec: 48878.8, 300 sec: 48985.4). Total num frames: 3457417216. Throughput: 0: 49002.1. Samples: 2986224300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 09:11:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:11:29,830][71000] Updated weights for policy 0, policy_version 211034 (0.0030) [2024-06-13 09:11:30,940][70768] Fps is (10 sec: 52429.7, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 3457662976. Throughput: 0: 49237.4. Samples: 2986522700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 09:11:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:11:32,279][71000] Updated weights for policy 0, policy_version 211044 (0.0028) [2024-06-13 09:11:33,772][70980] Signal inference workers to stop experience collection... (44700 times) [2024-06-13 09:11:33,816][71000] InferenceWorker_p0-w0: stopping experience collection (44700 times) [2024-06-13 09:11:33,879][70980] Signal inference workers to resume experience collection... 
(44700 times) [2024-06-13 09:11:33,880][71000] InferenceWorker_p0-w0: resuming experience collection (44700 times) [2024-06-13 09:11:35,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 3457875968. Throughput: 0: 49184.0. Samples: 2986668520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-13 09:11:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:11:36,243][71000] Updated weights for policy 0, policy_version 211054 (0.0026) [2024-06-13 09:11:39,118][71000] Updated weights for policy 0, policy_version 211064 (0.0026) [2024-06-13 09:11:40,940][70768] Fps is (10 sec: 45875.1, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 3458121728. Throughput: 0: 48808.0. Samples: 2986955920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:11:40,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:11:41,118][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000211069_3458154496.pth... [2024-06-13 09:11:41,157][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000210354_3446439936.pth [2024-06-13 09:11:43,011][71000] Updated weights for policy 0, policy_version 211074 (0.0030) [2024-06-13 09:11:45,899][71000] Updated weights for policy 0, policy_version 211084 (0.0025) [2024-06-13 09:11:45,940][70768] Fps is (10 sec: 52428.7, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 3458400256. Throughput: 0: 48902.7. Samples: 2987251260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:11:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:11:49,742][71000] Updated weights for policy 0, policy_version 211094 (0.0036) [2024-06-13 09:11:50,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.9, 300 sec: 48929.9). Total num frames: 3458629632. Throughput: 0: 48968.4. Samples: 2987401900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:11:50,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:11:52,624][71000] Updated weights for policy 0, policy_version 211104 (0.0028) [2024-06-13 09:11:55,940][70768] Fps is (10 sec: 47513.1, 60 sec: 49425.0, 300 sec: 48874.3). Total num frames: 3458875392. Throughput: 0: 49149.8. Samples: 2987699240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:11:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:11:56,122][71000] Updated weights for policy 0, policy_version 211114 (0.0037) [2024-06-13 09:11:59,191][71000] Updated weights for policy 0, policy_version 211124 (0.0029) [2024-06-13 09:12:00,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 48929.9). Total num frames: 3459121152. Throughput: 0: 49425.4. Samples: 2988002240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:12:00,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:12:02,795][71000] Updated weights for policy 0, policy_version 211134 (0.0032) [2024-06-13 09:12:05,691][71000] Updated weights for policy 0, policy_version 211144 (0.0027) [2024-06-13 09:12:05,940][70768] Fps is (10 sec: 50790.9, 60 sec: 49425.1, 300 sec: 48985.4). Total num frames: 3459383296. Throughput: 0: 49499.7. Samples: 2988152600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:12:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:12:09,391][71000] Updated weights for policy 0, policy_version 211154 (0.0040) [2024-06-13 09:12:10,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49152.0, 300 sec: 48929.8). Total num frames: 3459612672. Throughput: 0: 49148.9. Samples: 2988436000. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:12:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:12:12,742][71000] Updated weights for policy 0, policy_version 211164 (0.0028) [2024-06-13 09:12:15,887][71000] Updated weights for policy 0, policy_version 211174 (0.0025) [2024-06-13 09:12:15,939][70768] Fps is (10 sec: 49152.3, 60 sec: 49698.2, 300 sec: 48985.4). Total num frames: 3459874816. Throughput: 0: 49024.5. Samples: 2988728800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:12:15,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 09:12:19,011][71000] Updated weights for policy 0, policy_version 211184 (0.0023) [2024-06-13 09:12:20,939][70768] Fps is (10 sec: 49153.2, 60 sec: 49425.3, 300 sec: 48929.9). Total num frames: 3460104192. Throughput: 0: 49268.1. Samples: 2988885580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:12:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:12:22,566][71000] Updated weights for policy 0, policy_version 211194 (0.0027) [2024-06-13 09:12:25,884][71000] Updated weights for policy 0, policy_version 211204 (0.0033) [2024-06-13 09:12:25,940][70768] Fps is (10 sec: 49151.2, 60 sec: 49152.0, 300 sec: 48985.5). Total num frames: 3460366336. Throughput: 0: 49425.2. Samples: 2989180060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:12:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:12:28,902][71000] Updated weights for policy 0, policy_version 211214 (0.0031) [2024-06-13 09:12:30,939][70768] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 48929.9). Total num frames: 3460612096. Throughput: 0: 49433.4. Samples: 2989475760. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:12:30,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:12:32,667][71000] Updated weights for policy 0, policy_version 211224 (0.0030) [2024-06-13 09:12:33,618][70980] Signal inference workers to stop experience collection... (44750 times) [2024-06-13 09:12:33,659][71000] InferenceWorker_p0-w0: stopping experience collection (44750 times) [2024-06-13 09:12:33,672][70980] Signal inference workers to resume experience collection... (44750 times) [2024-06-13 09:12:33,673][71000] InferenceWorker_p0-w0: resuming experience collection (44750 times) [2024-06-13 09:12:35,814][71000] Updated weights for policy 0, policy_version 211234 (0.0031) [2024-06-13 09:12:35,940][70768] Fps is (10 sec: 49151.8, 60 sec: 49698.0, 300 sec: 48929.8). Total num frames: 3460857856. Throughput: 0: 49316.7. Samples: 2989621160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:12:35,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:12:39,303][71000] Updated weights for policy 0, policy_version 211244 (0.0023) [2024-06-13 09:12:40,940][70768] Fps is (10 sec: 45874.4, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 3461070848. Throughput: 0: 49124.8. Samples: 2989909860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:12:40,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 09:12:42,570][71000] Updated weights for policy 0, policy_version 211254 (0.0028) [2024-06-13 09:12:45,871][71000] Updated weights for policy 0, policy_version 211264 (0.0023) [2024-06-13 09:12:45,940][70768] Fps is (10 sec: 49152.0, 60 sec: 49151.9, 300 sec: 49040.9). Total num frames: 3461349376. Throughput: 0: 48987.0. Samples: 2990206660. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:12:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:12:49,353][71000] Updated weights for policy 0, policy_version 211274 (0.0042) [2024-06-13 09:12:50,939][70768] Fps is (10 sec: 49153.1, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3461562368. Throughput: 0: 48983.6. Samples: 2990356860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:12:50,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:12:52,777][71000] Updated weights for policy 0, policy_version 211284 (0.0027) [2024-06-13 09:12:55,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3461808128. Throughput: 0: 48947.3. Samples: 2990638620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:12:55,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:12:56,139][71000] Updated weights for policy 0, policy_version 211294 (0.0030) [2024-06-13 09:12:59,643][71000] Updated weights for policy 0, policy_version 211304 (0.0023) [2024-06-13 09:13:00,940][70768] Fps is (10 sec: 49150.8, 60 sec: 48878.8, 300 sec: 48874.3). Total num frames: 3462053888. Throughput: 0: 48890.9. Samples: 2990928900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:13:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:13:02,852][71000] Updated weights for policy 0, policy_version 211314 (0.0022) [2024-06-13 09:13:05,940][70768] Fps is (10 sec: 49151.7, 60 sec: 48605.8, 300 sec: 48874.3). Total num frames: 3462299648. Throughput: 0: 48583.4. Samples: 2991071840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:13:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:13:06,219][71000] Updated weights for policy 0, policy_version 211324 (0.0041) [2024-06-13 09:13:09,608][71000] Updated weights for policy 0, policy_version 211334 (0.0041) [2024-06-13 09:13:10,940][70768] Fps is (10 sec: 49152.3, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3462545408. Throughput: 0: 48604.0. Samples: 2991367240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:13:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:13:13,155][71000] Updated weights for policy 0, policy_version 211344 (0.0030) [2024-06-13 09:13:15,939][70768] Fps is (10 sec: 47514.3, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 3462774784. Throughput: 0: 48461.8. Samples: 2991656540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:13:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 09:13:16,281][71000] Updated weights for policy 0, policy_version 211354 (0.0029) [2024-06-13 09:13:20,035][71000] Updated weights for policy 0, policy_version 211364 (0.0029) [2024-06-13 09:13:20,939][70768] Fps is (10 sec: 49153.1, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3463036928. Throughput: 0: 48308.3. Samples: 2991795020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:13:20,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:13:22,934][71000] Updated weights for policy 0, policy_version 211374 (0.0024) [2024-06-13 09:13:25,940][70768] Fps is (10 sec: 47513.3, 60 sec: 48059.8, 300 sec: 48818.8). Total num frames: 3463249920. Throughput: 0: 48654.4. Samples: 2992099300. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:13:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:13:26,746][71000] Updated weights for policy 0, policy_version 211384 (0.0035) [2024-06-13 09:13:29,727][71000] Updated weights for policy 0, policy_version 211394 (0.0033) [2024-06-13 09:13:30,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48605.8, 300 sec: 48929.8). Total num frames: 3463528448. Throughput: 0: 48402.3. Samples: 2992384760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:13:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:13:33,488][71000] Updated weights for policy 0, policy_version 211404 (0.0034) [2024-06-13 09:13:35,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48332.9, 300 sec: 48874.3). Total num frames: 3463757824. Throughput: 0: 48463.4. Samples: 2992537720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:13:35,940][70768] Avg episode reward: [(0, '0.282')] [2024-06-13 09:13:36,429][71000] Updated weights for policy 0, policy_version 211414 (0.0027) [2024-06-13 09:13:40,059][71000] Updated weights for policy 0, policy_version 211424 (0.0027) [2024-06-13 09:13:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3464019968. Throughput: 0: 48738.6. Samples: 2992831860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:13:40,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:13:40,955][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000211427_3464019968.pth... [2024-06-13 09:13:40,998][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000210710_3452272640.pth [2024-06-13 09:13:42,560][70980] Signal inference workers to stop experience collection... (44800 times) [2024-06-13 09:13:42,581][71000] InferenceWorker_p0-w0: stopping experience collection (44800 times) [2024-06-13 09:13:42,664][70980] Signal inference workers to resume experience collection... 
(44800 times) [2024-06-13 09:13:42,664][71000] InferenceWorker_p0-w0: resuming experience collection (44800 times) [2024-06-13 09:13:42,952][71000] Updated weights for policy 0, policy_version 211434 (0.0023) [2024-06-13 09:13:45,940][70768] Fps is (10 sec: 47514.0, 60 sec: 48059.8, 300 sec: 48818.8). Total num frames: 3464232960. Throughput: 0: 48788.6. Samples: 2993124380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:13:45,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 09:13:46,764][71000] Updated weights for policy 0, policy_version 211444 (0.0027) [2024-06-13 09:13:49,976][71000] Updated weights for policy 0, policy_version 211454 (0.0027) [2024-06-13 09:13:50,939][70768] Fps is (10 sec: 49152.7, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3464511488. Throughput: 0: 48904.6. Samples: 2993272540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:13:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:13:53,339][71000] Updated weights for policy 0, policy_version 211464 (0.0030) [2024-06-13 09:13:55,940][70768] Fps is (10 sec: 50789.9, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3464740864. Throughput: 0: 48903.6. Samples: 2993567900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:13:55,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:13:56,471][71000] Updated weights for policy 0, policy_version 211474 (0.0029) [2024-06-13 09:14:00,037][71000] Updated weights for policy 0, policy_version 211484 (0.0030) [2024-06-13 09:14:00,940][70768] Fps is (10 sec: 47512.7, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 3464986624. Throughput: 0: 49071.8. Samples: 2993864780. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:14:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:14:02,983][71000] Updated weights for policy 0, policy_version 211494 (0.0019) [2024-06-13 09:14:05,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48605.7, 300 sec: 48874.3). Total num frames: 3465216000. Throughput: 0: 49112.5. Samples: 2994005100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:14:05,940][70768] Avg episode reward: [(0, '0.281')] [2024-06-13 09:14:06,866][71000] Updated weights for policy 0, policy_version 211504 (0.0026) [2024-06-13 09:14:09,907][71000] Updated weights for policy 0, policy_version 211514 (0.0033) [2024-06-13 09:14:10,940][70768] Fps is (10 sec: 50790.3, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 3465494528. Throughput: 0: 48989.6. Samples: 2994303840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:14:10,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 09:14:13,675][71000] Updated weights for policy 0, policy_version 211524 (0.0030) [2024-06-13 09:14:15,940][70768] Fps is (10 sec: 49152.8, 60 sec: 48878.8, 300 sec: 48929.8). Total num frames: 3465707520. Throughput: 0: 48934.2. Samples: 2994586800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:14:15,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:14:16,690][71000] Updated weights for policy 0, policy_version 211534 (0.0028) [2024-06-13 09:14:20,598][71000] Updated weights for policy 0, policy_version 211544 (0.0025) [2024-06-13 09:14:20,940][70768] Fps is (10 sec: 45876.0, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 3465953280. Throughput: 0: 48723.2. Samples: 2994730260. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:14:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:14:23,399][71000] Updated weights for policy 0, policy_version 211554 (0.0023) [2024-06-13 09:14:25,940][70768] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 48818.7). Total num frames: 3466182656. Throughput: 0: 48750.2. Samples: 2995025620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:14:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:14:27,085][71000] Updated weights for policy 0, policy_version 211564 (0.0023) [2024-06-13 09:14:30,027][71000] Updated weights for policy 0, policy_version 211574 (0.0028) [2024-06-13 09:14:30,940][70768] Fps is (10 sec: 52428.0, 60 sec: 49151.9, 300 sec: 48985.4). Total num frames: 3466477568. Throughput: 0: 48775.0. Samples: 2995319260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:14:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:14:33,924][71000] Updated weights for policy 0, policy_version 211584 (0.0033) [2024-06-13 09:14:35,940][70768] Fps is (10 sec: 50790.7, 60 sec: 48879.0, 300 sec: 48929.8). Total num frames: 3466690560. Throughput: 0: 48806.2. Samples: 2995468820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:14:35,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:14:36,984][71000] Updated weights for policy 0, policy_version 211594 (0.0033) [2024-06-13 09:14:40,614][71000] Updated weights for policy 0, policy_version 211604 (0.0035) [2024-06-13 09:14:40,940][70768] Fps is (10 sec: 44236.9, 60 sec: 48332.8, 300 sec: 48818.7). Total num frames: 3466919936. Throughput: 0: 48778.2. Samples: 2995762920. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:14:40,952][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:14:43,639][71000] Updated weights for policy 0, policy_version 211614 (0.0033) [2024-06-13 09:14:45,940][70768] Fps is (10 sec: 47512.9, 60 sec: 48878.8, 300 sec: 48763.2). Total num frames: 3467165696. Throughput: 0: 48653.3. Samples: 2996054180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:14:45,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:14:47,075][70980] Signal inference workers to stop experience collection... (44850 times) [2024-06-13 09:14:47,075][70980] Signal inference workers to resume experience collection... (44850 times) [2024-06-13 09:14:47,120][71000] InferenceWorker_p0-w0: stopping experience collection (44850 times) [2024-06-13 09:14:47,120][71000] InferenceWorker_p0-w0: resuming experience collection (44850 times) [2024-06-13 09:14:47,206][71000] Updated weights for policy 0, policy_version 211624 (0.0028) [2024-06-13 09:14:50,193][71000] Updated weights for policy 0, policy_version 211634 (0.0030) [2024-06-13 09:14:50,940][70768] Fps is (10 sec: 52429.4, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3467444224. Throughput: 0: 48899.8. Samples: 2996205580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:14:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:14:53,927][71000] Updated weights for policy 0, policy_version 211644 (0.0026) [2024-06-13 09:14:55,939][70768] Fps is (10 sec: 50791.6, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 3467673600. Throughput: 0: 48812.7. Samples: 2996500400. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:14:55,940][70768] Avg episode reward: [(0, '0.276')] [2024-06-13 09:14:56,843][71000] Updated weights for policy 0, policy_version 211654 (0.0038) [2024-06-13 09:15:00,804][71000] Updated weights for policy 0, policy_version 211664 (0.0034) [2024-06-13 09:15:00,940][70768] Fps is (10 sec: 45874.8, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 3467902976. Throughput: 0: 48919.1. Samples: 2996788160. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 09:15:00,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:15:03,823][71000] Updated weights for policy 0, policy_version 211674 (0.0026) [2024-06-13 09:15:05,939][70768] Fps is (10 sec: 45875.2, 60 sec: 48606.1, 300 sec: 48763.2). Total num frames: 3468132352. Throughput: 0: 48990.3. Samples: 2996934820. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 09:15:05,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:15:07,582][71000] Updated weights for policy 0, policy_version 211684 (0.0035) [2024-06-13 09:15:10,413][71000] Updated weights for policy 0, policy_version 211694 (0.0029) [2024-06-13 09:15:10,940][70768] Fps is (10 sec: 50790.0, 60 sec: 48605.9, 300 sec: 48929.8). Total num frames: 3468410880. Throughput: 0: 48663.0. Samples: 2997215460. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 09:15:10,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:15:14,245][71000] Updated weights for policy 0, policy_version 211704 (0.0033) [2024-06-13 09:15:15,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3468640256. Throughput: 0: 48719.3. Samples: 2997511620. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 09:15:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:15:16,983][71000] Updated weights for policy 0, policy_version 211714 (0.0032) [2024-06-13 09:15:20,940][70768] Fps is (10 sec: 45872.5, 60 sec: 48605.2, 300 sec: 48763.1). Total num frames: 3468869632. Throughput: 0: 48592.5. Samples: 2997655520. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 09:15:20,941][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:15:21,070][71000] Updated weights for policy 0, policy_version 211724 (0.0028) [2024-06-13 09:15:23,766][71000] Updated weights for policy 0, policy_version 211734 (0.0030) [2024-06-13 09:15:25,940][70768] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 48818.8). Total num frames: 3469131776. Throughput: 0: 48771.3. Samples: 2997957620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 09:15:25,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:15:27,433][71000] Updated weights for policy 0, policy_version 211744 (0.0028) [2024-06-13 09:15:30,258][71000] Updated weights for policy 0, policy_version 211754 (0.0028) [2024-06-13 09:15:30,940][70768] Fps is (10 sec: 52432.7, 60 sec: 48606.0, 300 sec: 48985.4). Total num frames: 3469393920. Throughput: 0: 48815.7. Samples: 2998250880. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 09:15:30,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:15:34,343][71000] Updated weights for policy 0, policy_version 211764 (0.0028) [2024-06-13 09:15:35,940][70768] Fps is (10 sec: 49152.0, 60 sec: 48879.0, 300 sec: 48874.3). Total num frames: 3469623296. Throughput: 0: 48921.8. Samples: 2998407060. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 09:15:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:15:36,762][71000] Updated weights for policy 0, policy_version 211774 (0.0031) [2024-06-13 09:15:40,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 3469852672. Throughput: 0: 48809.8. Samples: 2998696840. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 09:15:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:15:40,972][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000211784_3469869056.pth... [2024-06-13 09:15:40,985][71000] Updated weights for policy 0, policy_version 211784 (0.0037) [2024-06-13 09:15:41,026][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000211069_3458154496.pth [2024-06-13 09:15:43,775][71000] Updated weights for policy 0, policy_version 211794 (0.0024) [2024-06-13 09:15:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48879.1, 300 sec: 48818.8). Total num frames: 3470098432. Throughput: 0: 48894.3. Samples: 2998988400. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 09:15:45,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:15:47,859][71000] Updated weights for policy 0, policy_version 211804 (0.0023) [2024-06-13 09:15:50,477][71000] Updated weights for policy 0, policy_version 211814 (0.0028) [2024-06-13 09:15:50,940][70768] Fps is (10 sec: 52428.5, 60 sec: 48878.9, 300 sec: 49040.9). Total num frames: 3470376960. Throughput: 0: 48844.4. Samples: 2999132820. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 09:15:50,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:15:54,415][70980] Signal inference workers to stop experience collection... (44900 times) [2024-06-13 09:15:54,416][70980] Signal inference workers to resume experience collection... 
(44900 times) [2024-06-13 09:15:54,432][71000] InferenceWorker_p0-w0: stopping experience collection (44900 times) [2024-06-13 09:15:54,433][71000] InferenceWorker_p0-w0: resuming experience collection (44900 times) [2024-06-13 09:15:54,562][71000] Updated weights for policy 0, policy_version 211824 (0.0029) [2024-06-13 09:15:55,939][70768] Fps is (10 sec: 50790.6, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3470606336. Throughput: 0: 49238.0. Samples: 2999431160. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 09:15:55,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 09:15:56,994][71000] Updated weights for policy 0, policy_version 211834 (0.0025) [2024-06-13 09:16:00,939][70768] Fps is (10 sec: 45875.6, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 3470835712. Throughput: 0: 49123.7. Samples: 2999722180. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 09:16:00,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 09:16:01,049][71000] Updated weights for policy 0, policy_version 211844 (0.0037) [2024-06-13 09:16:03,780][71000] Updated weights for policy 0, policy_version 211854 (0.0032) [2024-06-13 09:16:05,939][70768] Fps is (10 sec: 47513.7, 60 sec: 49152.0, 300 sec: 48874.3). Total num frames: 3471081472. Throughput: 0: 48918.2. Samples: 2999856800. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-13 09:16:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:16:08,048][71000] Updated weights for policy 0, policy_version 211864 (0.0036) [2024-06-13 09:16:10,636][71000] Updated weights for policy 0, policy_version 211874 (0.0023) [2024-06-13 09:16:10,940][70768] Fps is (10 sec: 50790.1, 60 sec: 48879.1, 300 sec: 48985.4). Total num frames: 3471343616. Throughput: 0: 48855.6. Samples: 3000156120. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 09:16:10,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 09:16:14,544][71000] Updated weights for policy 0, policy_version 211884 (0.0027) [2024-06-13 09:16:15,939][70768] Fps is (10 sec: 47513.5, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 3471556608. Throughput: 0: 48934.3. Samples: 3000452920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 09:16:15,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:16:17,130][71000] Updated weights for policy 0, policy_version 211894 (0.0025) [2024-06-13 09:16:20,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48879.5, 300 sec: 48763.2). Total num frames: 3471802368. Throughput: 0: 48695.5. Samples: 3000598360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 09:16:20,940][70768] Avg episode reward: [(0, '0.283')] [2024-06-13 09:16:21,331][71000] Updated weights for policy 0, policy_version 211904 (0.0033) [2024-06-13 09:16:23,958][71000] Updated weights for policy 0, policy_version 211914 (0.0033) [2024-06-13 09:16:25,940][70768] Fps is (10 sec: 49150.9, 60 sec: 48605.7, 300 sec: 48763.2). Total num frames: 3472048128. Throughput: 0: 48495.3. Samples: 3000879140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 09:16:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:16:27,756][71000] Updated weights for policy 0, policy_version 211924 (0.0026) [2024-06-13 09:16:30,584][71000] Updated weights for policy 0, policy_version 211934 (0.0030) [2024-06-13 09:16:30,940][70768] Fps is (10 sec: 52428.7, 60 sec: 48878.9, 300 sec: 48985.4). Total num frames: 3472326656. Throughput: 0: 48596.0. Samples: 3001175220. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 09:16:30,940][70768] Avg episode reward: [(0, '0.285')] [2024-06-13 09:16:34,367][71000] Updated weights for policy 0, policy_version 211944 (0.0029) [2024-06-13 09:16:35,940][70768] Fps is (10 sec: 50791.3, 60 sec: 48878.9, 300 sec: 48929.8). Total num frames: 3472556032. Throughput: 0: 48822.7. Samples: 3001329840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 09:16:35,940][70768] Avg episode reward: [(0, '0.273')] [2024-06-13 09:16:37,209][71000] Updated weights for policy 0, policy_version 211954 (0.0025) [2024-06-13 09:16:40,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 3472785408. Throughput: 0: 48768.9. Samples: 3001625760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 09:16:40,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 09:16:41,398][71000] Updated weights for policy 0, policy_version 211964 (0.0030) [2024-06-13 09:16:44,241][71000] Updated weights for policy 0, policy_version 211974 (0.0026) [2024-06-13 09:16:45,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 3473031168. Throughput: 0: 48680.7. Samples: 3001912820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 09:16:45,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:16:48,135][71000] Updated weights for policy 0, policy_version 211984 (0.0023) [2024-06-13 09:16:50,940][70768] Fps is (10 sec: 50788.6, 60 sec: 48605.6, 300 sec: 48874.3). Total num frames: 3473293312. Throughput: 0: 48828.5. Samples: 3002054100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 09:16:50,941][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:16:51,159][71000] Updated weights for policy 0, policy_version 211994 (0.0027) [2024-06-13 09:16:54,830][71000] Updated weights for policy 0, policy_version 212004 (0.0023) [2024-06-13 09:16:55,940][70768] Fps is (10 sec: 49152.4, 60 sec: 48605.8, 300 sec: 48818.8). Total num frames: 3473522688. Throughput: 0: 48684.0. Samples: 3002346900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 09:16:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:16:57,859][71000] Updated weights for policy 0, policy_version 212014 (0.0024) [2024-06-13 09:17:00,940][70768] Fps is (10 sec: 45876.8, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 3473752064. Throughput: 0: 48614.2. Samples: 3002640560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 09:17:00,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:17:01,501][71000] Updated weights for policy 0, policy_version 212024 (0.0033) [2024-06-13 09:17:02,868][70980] Signal inference workers to stop experience collection... (44950 times) [2024-06-13 09:17:02,868][70980] Signal inference workers to resume experience collection... (44950 times) [2024-06-13 09:17:02,906][71000] InferenceWorker_p0-w0: stopping experience collection (44950 times) [2024-06-13 09:17:02,906][71000] InferenceWorker_p0-w0: resuming experience collection (44950 times) [2024-06-13 09:17:04,537][71000] Updated weights for policy 0, policy_version 212034 (0.0022) [2024-06-13 09:17:05,940][70768] Fps is (10 sec: 47513.1, 60 sec: 48605.7, 300 sec: 48763.2). Total num frames: 3473997824. Throughput: 0: 48625.2. Samples: 3002786500. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 09:17:05,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:17:08,163][71000] Updated weights for policy 0, policy_version 212044 (0.0031) [2024-06-13 09:17:10,940][70768] Fps is (10 sec: 50790.2, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 3474259968. Throughput: 0: 48872.6. Samples: 3003078400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-13 09:17:10,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:17:11,298][71000] Updated weights for policy 0, policy_version 212054 (0.0035) [2024-06-13 09:17:15,029][71000] Updated weights for policy 0, policy_version 212064 (0.0030) [2024-06-13 09:17:15,940][70768] Fps is (10 sec: 49152.6, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 3474489344. Throughput: 0: 48764.1. Samples: 3003369600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 09:17:15,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 09:17:18,202][71000] Updated weights for policy 0, policy_version 212074 (0.0030) [2024-06-13 09:17:20,940][70768] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 3474751488. Throughput: 0: 48501.3. Samples: 3003512400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 09:17:20,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:17:21,563][71000] Updated weights for policy 0, policy_version 212084 (0.0030) [2024-06-13 09:17:24,924][71000] Updated weights for policy 0, policy_version 212094 (0.0031) [2024-06-13 09:17:25,939][70768] Fps is (10 sec: 49152.1, 60 sec: 48879.1, 300 sec: 48707.7). Total num frames: 3474980864. Throughput: 0: 48444.0. Samples: 3003805740. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 09:17:25,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:17:28,438][71000] Updated weights for policy 0, policy_version 212104 (0.0044) [2024-06-13 09:17:30,940][70768] Fps is (10 sec: 47512.8, 60 sec: 48332.7, 300 sec: 48707.7). Total num frames: 3475226624. Throughput: 0: 48635.0. Samples: 3004101400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 09:17:30,940][70768] Avg episode reward: [(0, '0.286')] [2024-06-13 09:17:31,562][71000] Updated weights for policy 0, policy_version 212114 (0.0029) [2024-06-13 09:17:35,247][71000] Updated weights for policy 0, policy_version 212124 (0.0030) [2024-06-13 09:17:35,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 3475472384. Throughput: 0: 48705.2. Samples: 3004245820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 09:17:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:17:38,312][71000] Updated weights for policy 0, policy_version 212134 (0.0028) [2024-06-13 09:17:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.8, 300 sec: 48707.7). Total num frames: 3475718144. Throughput: 0: 48602.9. Samples: 3004534040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 09:17:40,942][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:17:41,004][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000212142_3475734528.pth... [2024-06-13 09:17:41,045][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000211427_3464019968.pth [2024-06-13 09:17:42,113][71000] Updated weights for policy 0, policy_version 212144 (0.0042) [2024-06-13 09:17:45,306][71000] Updated weights for policy 0, policy_version 212154 (0.0034) [2024-06-13 09:17:45,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48763.2). Total num frames: 3475947520. Throughput: 0: 48392.9. Samples: 3004818240. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 09:17:45,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:17:48,716][71000] Updated weights for policy 0, policy_version 212164 (0.0041) [2024-06-13 09:17:50,940][70768] Fps is (10 sec: 45875.5, 60 sec: 48059.9, 300 sec: 48707.7). Total num frames: 3476176896. Throughput: 0: 48315.6. Samples: 3004960700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 09:17:50,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:17:52,338][71000] Updated weights for policy 0, policy_version 212174 (0.0027) [2024-06-13 09:17:55,602][71000] Updated weights for policy 0, policy_version 212184 (0.0022) [2024-06-13 09:17:55,940][70768] Fps is (10 sec: 49151.4, 60 sec: 48605.8, 300 sec: 48763.2). Total num frames: 3476439040. Throughput: 0: 48393.7. Samples: 3005256120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 09:17:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:17:58,755][71000] Updated weights for policy 0, policy_version 212194 (0.0031) [2024-06-13 09:18:00,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 3476684800. Throughput: 0: 48525.7. Samples: 3005553260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 09:18:00,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:18:02,349][71000] Updated weights for policy 0, policy_version 212204 (0.0029) [2024-06-13 09:18:05,766][71000] Updated weights for policy 0, policy_version 212214 (0.0032) [2024-06-13 09:18:05,940][70768] Fps is (10 sec: 47514.1, 60 sec: 48606.0, 300 sec: 48707.7). Total num frames: 3476914176. Throughput: 0: 48534.2. Samples: 3005696440. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 09:18:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:18:08,903][71000] Updated weights for policy 0, policy_version 212224 (0.0029) [2024-06-13 09:18:10,939][70768] Fps is (10 sec: 49152.5, 60 sec: 48606.0, 300 sec: 48818.8). Total num frames: 3477176320. Throughput: 0: 48412.5. Samples: 3005984300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 09:18:10,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:18:12,310][71000] Updated weights for policy 0, policy_version 212234 (0.0030) [2024-06-13 09:18:15,691][71000] Updated weights for policy 0, policy_version 212244 (0.0026) [2024-06-13 09:18:15,940][70768] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 48707.7). Total num frames: 3477405696. Throughput: 0: 48466.9. Samples: 3006282400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 09:18:15,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:18:19,191][71000] Updated weights for policy 0, policy_version 212254 (0.0027) [2024-06-13 09:18:20,940][70768] Fps is (10 sec: 47513.4, 60 sec: 48332.8, 300 sec: 48818.8). Total num frames: 3477651456. Throughput: 0: 48402.7. Samples: 3006423940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-13 09:18:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:18:22,430][71000] Updated weights for policy 0, policy_version 212264 (0.0029) [2024-06-13 09:18:25,558][70980] Signal inference workers to stop experience collection... (45000 times) [2024-06-13 09:18:25,560][70980] Signal inference workers to resume experience collection... 
(45000 times) [2024-06-13 09:18:25,602][71000] InferenceWorker_p0-w0: stopping experience collection (45000 times) [2024-06-13 09:18:25,603][71000] InferenceWorker_p0-w0: resuming experience collection (45000 times) [2024-06-13 09:18:25,694][71000] Updated weights for policy 0, policy_version 212274 (0.0030) [2024-06-13 09:18:25,940][70768] Fps is (10 sec: 49151.8, 60 sec: 48605.8, 300 sec: 48707.7). Total num frames: 3477897216. Throughput: 0: 48604.2. Samples: 3006721220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:18:25,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:18:29,076][71000] Updated weights for policy 0, policy_version 212284 (0.0024) [2024-06-13 09:18:30,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48332.9, 300 sec: 48707.7). Total num frames: 3478126592. Throughput: 0: 48693.7. Samples: 3007009460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:18:30,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:18:32,553][71000] Updated weights for policy 0, policy_version 212294 (0.0027) [2024-06-13 09:18:35,709][71000] Updated weights for policy 0, policy_version 212304 (0.0032) [2024-06-13 09:18:35,939][70768] Fps is (10 sec: 50791.0, 60 sec: 48879.0, 300 sec: 48763.3). Total num frames: 3478405120. Throughput: 0: 48864.7. Samples: 3007159600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:18:35,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:18:38,929][71000] Updated weights for policy 0, policy_version 212314 (0.0029) [2024-06-13 09:18:40,940][70768] Fps is (10 sec: 52429.0, 60 sec: 48879.1, 300 sec: 48874.3). Total num frames: 3478650880. Throughput: 0: 48920.5. Samples: 3007457540. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:18:40,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:18:42,172][71000] Updated weights for policy 0, policy_version 212324 (0.0026) [2024-06-13 09:18:45,845][71000] Updated weights for policy 0, policy_version 212334 (0.0029) [2024-06-13 09:18:45,940][70768] Fps is (10 sec: 47513.0, 60 sec: 48878.9, 300 sec: 48707.7). Total num frames: 3478880256. Throughput: 0: 48765.8. Samples: 3007747720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:18:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:18:48,996][71000] Updated weights for policy 0, policy_version 212344 (0.0037) [2024-06-13 09:18:50,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 3479109632. Throughput: 0: 48821.7. Samples: 3007893420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:18:50,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 09:18:52,906][71000] Updated weights for policy 0, policy_version 212354 (0.0032) [2024-06-13 09:18:55,842][71000] Updated weights for policy 0, policy_version 212364 (0.0026) [2024-06-13 09:18:55,940][70768] Fps is (10 sec: 49151.1, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 3479371776. Throughput: 0: 48829.5. Samples: 3008181640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:18:55,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 09:18:59,566][71000] Updated weights for policy 0, policy_version 212374 (0.0032) [2024-06-13 09:19:00,940][70768] Fps is (10 sec: 50790.3, 60 sec: 48878.9, 300 sec: 48818.8). Total num frames: 3479617536. Throughput: 0: 48719.0. Samples: 3008474760. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:19:00,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 09:19:02,541][71000] Updated weights for policy 0, policy_version 212384 (0.0027) [2024-06-13 09:19:05,940][70768] Fps is (10 sec: 45875.7, 60 sec: 48605.8, 300 sec: 48596.6). Total num frames: 3479830528. Throughput: 0: 48853.6. Samples: 3008622360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:19:05,940][70768] Avg episode reward: [(0, '0.292')] [2024-06-13 09:19:06,251][71000] Updated weights for policy 0, policy_version 212394 (0.0038) [2024-06-13 09:19:09,299][71000] Updated weights for policy 0, policy_version 212404 (0.0039) [2024-06-13 09:19:10,940][70768] Fps is (10 sec: 45874.7, 60 sec: 48332.6, 300 sec: 48707.7). Total num frames: 3480076288. Throughput: 0: 48575.3. Samples: 3008907120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:19:10,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:19:13,234][71000] Updated weights for policy 0, policy_version 212414 (0.0028) [2024-06-13 09:19:15,940][70768] Fps is (10 sec: 50790.8, 60 sec: 48878.9, 300 sec: 48763.2). Total num frames: 3480338432. Throughput: 0: 48780.0. Samples: 3009204560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:19:15,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:19:16,032][71000] Updated weights for policy 0, policy_version 212424 (0.0026) [2024-06-13 09:19:19,763][71000] Updated weights for policy 0, policy_version 212434 (0.0038) [2024-06-13 09:19:20,940][70768] Fps is (10 sec: 50790.4, 60 sec: 48878.7, 300 sec: 48818.7). Total num frames: 3480584192. Throughput: 0: 48812.5. Samples: 3009356180. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:19:20,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:19:22,886][71000] Updated weights for policy 0, policy_version 212444 (0.0027) [2024-06-13 09:19:25,940][70768] Fps is (10 sec: 47513.6, 60 sec: 48605.9, 300 sec: 48596.6). Total num frames: 3480813568. Throughput: 0: 48669.8. Samples: 3009647680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:19:25,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:19:26,362][71000] Updated weights for policy 0, policy_version 212454 (0.0028) [2024-06-13 09:19:29,354][71000] Updated weights for policy 0, policy_version 212464 (0.0028) [2024-06-13 09:19:30,939][70768] Fps is (10 sec: 47515.1, 60 sec: 48879.1, 300 sec: 48707.7). Total num frames: 3481059328. Throughput: 0: 48637.5. Samples: 3009936400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 09:19:30,940][70768] Avg episode reward: [(0, '0.288')] [2024-06-13 09:19:33,085][71000] Updated weights for policy 0, policy_version 212474 (0.0031) [2024-06-13 09:19:35,940][70768] Fps is (10 sec: 52429.1, 60 sec: 48878.9, 300 sec: 48874.3). Total num frames: 3481337856. Throughput: 0: 48756.1. Samples: 3010087440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 09:19:35,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:19:35,950][71000] Updated weights for policy 0, policy_version 212484 (0.0034) [2024-06-13 09:19:36,921][70980] Signal inference workers to stop experience collection... (45050 times) [2024-06-13 09:19:36,923][70980] Signal inference workers to resume experience collection... 
(45050 times) [2024-06-13 09:19:36,967][71000] InferenceWorker_p0-w0: stopping experience collection (45050 times) [2024-06-13 09:19:36,967][71000] InferenceWorker_p0-w0: resuming experience collection (45050 times) [2024-06-13 09:19:39,691][71000] Updated weights for policy 0, policy_version 212494 (0.0039) [2024-06-13 09:19:40,939][70768] Fps is (10 sec: 50790.1, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 3481567232. Throughput: 0: 48940.7. Samples: 3010383960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 09:19:40,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:19:40,973][70980] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000212499_3481583616.pth... [2024-06-13 09:19:41,021][70980] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000211784_3469869056.pth [2024-06-13 09:19:42,736][71000] Updated weights for policy 0, policy_version 212504 (0.0038) [2024-06-13 09:19:45,940][70768] Fps is (10 sec: 44235.3, 60 sec: 48332.6, 300 sec: 48596.6). Total num frames: 3481780224. Throughput: 0: 48896.7. Samples: 3010675120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 09:19:45,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:19:46,451][71000] Updated weights for policy 0, policy_version 212514 (0.0024) [2024-06-13 09:19:49,666][71000] Updated weights for policy 0, policy_version 212524 (0.0039) [2024-06-13 09:19:50,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48879.0, 300 sec: 48707.7). Total num frames: 3482042368. Throughput: 0: 48672.1. Samples: 3010812600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 09:19:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:19:52,967][71000] Updated weights for policy 0, policy_version 212534 (0.0027) [2024-06-13 09:19:55,940][70768] Fps is (10 sec: 50792.0, 60 sec: 48606.0, 300 sec: 48763.2). Total num frames: 3482288128. Throughput: 0: 49022.5. Samples: 3011113120. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 09:19:55,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:19:56,245][71000] Updated weights for policy 0, policy_version 212544 (0.0036) [2024-06-13 09:19:59,502][71000] Updated weights for policy 0, policy_version 212554 (0.0023) [2024-06-13 09:20:00,941][70768] Fps is (10 sec: 49145.7, 60 sec: 48604.9, 300 sec: 48818.5). Total num frames: 3482533888. Throughput: 0: 48970.6. Samples: 3011408300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 09:20:00,941][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 09:20:02,934][71000] Updated weights for policy 0, policy_version 212564 (0.0024) [2024-06-13 09:20:05,939][70768] Fps is (10 sec: 47513.7, 60 sec: 48879.0, 300 sec: 48652.2). Total num frames: 3482763264. Throughput: 0: 48835.8. Samples: 3011553780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 09:20:05,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:20:06,314][71000] Updated weights for policy 0, policy_version 212574 (0.0027) [2024-06-13 09:20:09,411][71000] Updated weights for policy 0, policy_version 212584 (0.0034) [2024-06-13 09:20:10,940][70768] Fps is (10 sec: 49158.4, 60 sec: 49152.2, 300 sec: 48763.2). Total num frames: 3483025408. Throughput: 0: 48933.3. Samples: 3011849680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 09:20:10,940][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:20:12,909][71000] Updated weights for policy 0, policy_version 212594 (0.0026) [2024-06-13 09:20:15,940][70768] Fps is (10 sec: 52425.8, 60 sec: 49151.6, 300 sec: 48874.3). Total num frames: 3483287552. Throughput: 0: 49096.6. Samples: 3012145780. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 09:20:15,941][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:20:16,278][71000] Updated weights for policy 0, policy_version 212604 (0.0024) [2024-06-13 09:20:19,562][71000] Updated weights for policy 0, policy_version 212614 (0.0027) [2024-06-13 09:20:20,940][70768] Fps is (10 sec: 49151.3, 60 sec: 48879.0, 300 sec: 48763.2). Total num frames: 3483516928. Throughput: 0: 49077.6. Samples: 3012295940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 09:20:20,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:20:22,774][71000] Updated weights for policy 0, policy_version 212624 (0.0029) [2024-06-13 09:20:25,940][70768] Fps is (10 sec: 47515.6, 60 sec: 49151.9, 300 sec: 48707.7). Total num frames: 3483762688. Throughput: 0: 49007.4. Samples: 3012589300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 09:20:25,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:20:26,273][71000] Updated weights for policy 0, policy_version 212634 (0.0028) [2024-06-13 09:20:29,633][71000] Updated weights for policy 0, policy_version 212644 (0.0035) [2024-06-13 09:20:30,940][70768] Fps is (10 sec: 49152.5, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 3484008448. Throughput: 0: 49005.2. Samples: 3012880340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 09:20:30,943][70768] Avg episode reward: [(0, '0.289')] [2024-06-13 09:20:32,596][70980] Signal inference workers to stop experience collection... (45100 times) [2024-06-13 09:20:32,619][71000] InferenceWorker_p0-w0: stopping experience collection (45100 times) [2024-06-13 09:20:32,651][70980] Signal inference workers to resume experience collection... 
(45100 times) [2024-06-13 09:20:32,652][71000] InferenceWorker_p0-w0: resuming experience collection (45100 times) [2024-06-13 09:20:32,806][71000] Updated weights for policy 0, policy_version 212654 (0.0028) [2024-06-13 09:20:35,940][70768] Fps is (10 sec: 49152.7, 60 sec: 48605.9, 300 sec: 48818.8). Total num frames: 3484254208. Throughput: 0: 49262.3. Samples: 3013029400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-13 09:20:35,940][70768] Avg episode reward: [(0, '0.291')] [2024-06-13 09:20:36,386][71000] Updated weights for policy 0, policy_version 212664 (0.0031) [2024-06-13 09:20:39,798][71000] Updated weights for policy 0, policy_version 212674 (0.0037) [2024-06-13 09:20:40,940][70768] Fps is (10 sec: 49151.9, 60 sec: 48878.8, 300 sec: 48818.8). Total num frames: 3484499968. Throughput: 0: 49107.5. Samples: 3013322960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-13 09:20:40,940][70768] Avg episode reward: [(0, '0.280')] [2024-06-13 09:20:43,213][71000] Updated weights for policy 0, policy_version 212684 (0.0033) [2024-06-13 09:20:45,939][70768] Fps is (10 sec: 47513.9, 60 sec: 49152.3, 300 sec: 48652.2). Total num frames: 3484729344. Throughput: 0: 49083.8. Samples: 3013617000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-13 09:20:45,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:20:46,234][71000] Updated weights for policy 0, policy_version 212694 (0.0031) [2024-06-13 09:20:49,605][71000] Updated weights for policy 0, policy_version 212704 (0.0029) [2024-06-13 09:20:50,940][70768] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 48763.2). Total num frames: 3484991488. Throughput: 0: 49249.7. Samples: 3013770020. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-13 09:20:50,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:20:52,947][71000] Updated weights for policy 0, policy_version 212714 (0.0022) [2024-06-13 09:20:55,940][70768] Fps is (10 sec: 50789.9, 60 sec: 49152.0, 300 sec: 48818.8). Total num frames: 3485237248. Throughput: 0: 49068.0. Samples: 3014057740. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-13 09:20:55,940][70768] Avg episode reward: [(0, '0.290')] [2024-06-13 09:20:56,505][71000] Updated weights for policy 0, policy_version 212724 (0.0026) [2024-06-13 09:20:59,949][71000] Updated weights for policy 0, policy_version 212734 (0.0021) [2024-06-13 09:21:00,940][70768] Fps is (10 sec: 49151.9, 60 sec: 49153.1, 300 sec: 48818.7). Total num frames: 3485483008. Throughput: 0: 49040.1. Samples: 3014352560. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-13 09:21:00,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 09:21:03,379][71000] Updated weights for policy 0, policy_version 212744 (0.0027) [2024-06-13 09:21:05,940][70768] Fps is (10 sec: 45875.0, 60 sec: 48878.9, 300 sec: 48652.1). Total num frames: 3485696000. Throughput: 0: 48956.1. Samples: 3014498960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-13 09:21:05,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 09:21:06,540][71000] Updated weights for policy 0, policy_version 212754 (0.0036) [2024-06-13 09:21:09,935][71000] Updated weights for policy 0, policy_version 212764 (0.0027) [2024-06-13 09:21:10,939][70768] Fps is (10 sec: 49152.4, 60 sec: 49152.1, 300 sec: 48874.3). Total num frames: 3485974528. Throughput: 0: 48758.0. Samples: 3014783400. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-13 09:21:10,940][70768] Avg episode reward: [(0, '0.293')] [2024-06-13 09:21:13,162][71000] Updated weights for policy 0, policy_version 212774 (0.0027) [2024-06-13 09:21:15,939][70768] Fps is (10 sec: 52429.6, 60 sec: 48879.5, 300 sec: 48874.3). Total num frames: 3486220288. Throughput: 0: 48903.7. Samples: 3015081000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-13 09:21:15,940][70768] Avg episode reward: [(0, '0.287')] [2024-06-13 09:21:16,610][71000] Updated weights for policy 0, policy_version 212784 (0.0026) [2024-06-13 09:21:20,004][71000] Updated weights for policy 0, policy_version 212794 (0.0036) [2024-06-13 09:21:20,940][70768] Fps is (10 sec: 47513.2, 60 sec: 48879.0, 300 sec: 48818.8). Total num frames: 3486449664. Throughput: 0: 48759.9. Samples: 3015223600. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-13 09:21:20,940][70768] Avg episode reward: [(0, '0.284')] [2024-06-13 09:21:23,465][71000] Updated weights for policy 0, policy_version 212804 (0.0026) [2024-06-13 09:22:27,761][73265] Saving configuration to /workspace/metta/train_dir/p2.death/config.json... 
[2024-06-13 09:22:27,777][73265] Rollout worker 0 uses device cpu [2024-06-13 09:22:27,778][73265] Rollout worker 1 uses device cpu [2024-06-13 09:22:27,778][73265] Rollout worker 2 uses device cpu [2024-06-13 09:22:27,778][73265] Rollout worker 3 uses device cpu [2024-06-13 09:22:27,778][73265] Rollout worker 4 uses device cpu [2024-06-13 09:22:27,778][73265] Rollout worker 5 uses device cpu [2024-06-13 09:22:27,778][73265] Rollout worker 6 uses device cpu [2024-06-13 09:22:27,778][73265] Rollout worker 7 uses device cpu [2024-06-13 09:22:27,779][73265] Rollout worker 8 uses device cpu [2024-06-13 09:22:27,779][73265] Rollout worker 9 uses device cpu [2024-06-13 09:22:27,779][73265] Rollout worker 10 uses device cpu [2024-06-13 09:22:27,779][73265] Rollout worker 11 uses device cpu [2024-06-13 09:22:27,779][73265] Rollout worker 12 uses device cpu [2024-06-13 09:22:27,779][73265] Rollout worker 13 uses device cpu [2024-06-13 09:22:27,779][73265] Rollout worker 14 uses device cpu [2024-06-13 09:22:27,780][73265] Rollout worker 15 uses device cpu [2024-06-13 09:22:27,780][73265] Rollout worker 16 uses device cpu [2024-06-13 09:22:27,780][73265] Rollout worker 17 uses device cpu [2024-06-13 09:22:27,780][73265] Rollout worker 18 uses device cpu [2024-06-13 09:22:27,780][73265] Rollout worker 19 uses device cpu [2024-06-13 09:22:27,780][73265] Rollout worker 20 uses device cpu [2024-06-13 09:22:27,780][73265] Rollout worker 21 uses device cpu [2024-06-13 09:22:27,780][73265] Rollout worker 22 uses device cpu [2024-06-13 09:22:27,781][73265] Rollout worker 23 uses device cpu [2024-06-13 09:22:27,781][73265] Rollout worker 24 uses device cpu [2024-06-13 09:22:27,781][73265] Rollout worker 25 uses device cpu [2024-06-13 09:22:27,781][73265] Rollout worker 26 uses device cpu [2024-06-13 09:22:27,781][73265] Rollout worker 27 uses device cpu [2024-06-13 09:22:27,781][73265] Rollout worker 28 uses device cpu [2024-06-13 09:22:27,781][73265] Rollout worker 29 uses device cpu 
[2024-06-13 09:22:27,782][73265] Rollout worker 30 uses device cpu [2024-06-13 09:22:27,782][73265] Rollout worker 31 uses device cpu [2024-06-13 09:22:28,359][73265] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-13 09:22:28,359][73265] InferenceWorker_p0-w0: min num requests: 10 [2024-06-13 09:22:28,418][73265] Starting all processes... [2024-06-13 09:22:28,419][73265] Starting process learner_proc0 [2024-06-13 09:22:28,683][73265] Starting all processes... [2024-06-13 09:22:28,686][73265] Starting process inference_proc0-0 [2024-06-13 09:22:28,686][73265] Starting process rollout_proc0 [2024-06-13 09:22:28,686][73265] Starting process rollout_proc1 [2024-06-13 09:22:28,686][73265] Starting process rollout_proc2 [2024-06-13 09:22:28,692][73265] Starting process rollout_proc13 [2024-06-13 09:22:28,687][73265] Starting process rollout_proc4 [2024-06-13 09:22:28,687][73265] Starting process rollout_proc5 [2024-06-13 09:22:28,687][73265] Starting process rollout_proc6 [2024-06-13 09:22:28,689][73265] Starting process rollout_proc7 [2024-06-13 09:22:28,690][73265] Starting process rollout_proc8 [2024-06-13 09:22:28,690][73265] Starting process rollout_proc9 [2024-06-13 09:22:28,690][73265] Starting process rollout_proc10 [2024-06-13 09:22:28,691][73265] Starting process rollout_proc11 [2024-06-13 09:22:28,692][73265] Starting process rollout_proc12 [2024-06-13 09:22:28,687][73265] Starting process rollout_proc3 [2024-06-13 09:22:28,692][73265] Starting process rollout_proc14 [2024-06-13 09:22:28,692][73265] Starting process rollout_proc15 [2024-06-13 09:22:28,692][73265] Starting process rollout_proc16 [2024-06-13 09:22:28,694][73265] Starting process rollout_proc17 [2024-06-13 09:22:28,697][73265] Starting process rollout_proc18 [2024-06-13 09:22:28,698][73265] Starting process rollout_proc19 [2024-06-13 09:22:28,698][73265] Starting process rollout_proc20 [2024-06-13 09:22:28,699][73265] Starting process rollout_proc21 [2024-06-13 
09:22:28,703][73265] Starting process rollout_proc22 [2024-06-13 09:22:28,706][73265] Starting process rollout_proc23 [2024-06-13 09:22:28,708][73265] Starting process rollout_proc24 [2024-06-13 09:22:28,710][73265] Starting process rollout_proc25 [2024-06-13 09:22:28,710][73265] Starting process rollout_proc26 [2024-06-13 09:22:28,710][73265] Starting process rollout_proc27 [2024-06-13 09:22:28,711][73265] Starting process rollout_proc28 [2024-06-13 09:22:28,713][73265] Starting process rollout_proc29 [2024-06-13 09:22:28,714][73265] Starting process rollout_proc30 [2024-06-13 09:22:28,717][73265] Starting process rollout_proc31 [2024-06-13 09:22:30,740][73503] Worker 5 uses CPU cores [5] [2024-06-13 09:22:30,760][73500] Worker 2 uses CPU cores [2] [2024-06-13 09:22:30,815][73506] Worker 9 uses CPU cores [9] [2024-06-13 09:22:30,856][73527] Worker 28 uses CPU cores [28] [2024-06-13 09:22:30,859][73516] Worker 19 uses CPU cores [19] [2024-06-13 09:22:30,865][73477] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-13 09:22:30,865][73477] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-06-13 09:22:30,868][73522] Worker 24 uses CPU cores [24] [2024-06-13 09:22:30,874][73477] Num visible devices: 1 [2024-06-13 09:22:30,892][73477] Setting fixed seed 0 [2024-06-13 09:22:30,893][73477] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-13 09:22:30,893][73477] Initializing actor-critic model on device cuda:0 [2024-06-13 09:22:30,904][73498] Worker 0 uses CPU cores [0] [2024-06-13 09:22:30,908][73508] Worker 7 uses CPU cores [7] [2024-06-13 09:22:30,912][73526] Worker 29 uses CPU cores [29] [2024-06-13 09:22:30,924][73502] Worker 13 uses CPU cores [13] [2024-06-13 09:22:30,951][73523] Worker 25 uses CPU cores [25] [2024-06-13 09:22:30,955][73513] Worker 15 uses CPU cores [15] [2024-06-13 09:22:30,960][73518] Worker 18 uses CPU cores [18] [2024-06-13 09:22:30,960][73499] Worker 1 uses CPU cores 
[1] [2024-06-13 09:22:30,968][73509] Worker 12 uses CPU cores [12] [2024-06-13 09:22:30,978][73504] Worker 6 uses CPU cores [6] [2024-06-13 09:22:31,062][73510] Worker 11 uses CPU cores [11] [2024-06-13 09:22:31,072][73528] Worker 30 uses CPU cores [30] [2024-06-13 09:22:31,098][73515] Worker 17 uses CPU cores [17] [2024-06-13 09:22:31,108][73497] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-13 09:22:31,108][73497] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-06-13 09:22:31,115][73497] Num visible devices: 1 [2024-06-13 09:22:31,128][73521] Worker 23 uses CPU cores [23] [2024-06-13 09:22:31,140][73519] Worker 21 uses CPU cores [21] [2024-06-13 09:22:31,149][73507] Worker 10 uses CPU cores [10] [2024-06-13 09:22:31,152][73520] Worker 22 uses CPU cores [22] [2024-06-13 09:22:31,162][73517] Worker 20 uses CPU cores [20] [2024-06-13 09:22:31,172][73514] Worker 16 uses CPU cores [16] [2024-06-13 09:22:31,176][73505] Worker 8 uses CPU cores [8] [2024-06-13 09:22:31,180][73511] Worker 3 uses CPU cores [3] [2024-06-13 09:22:31,184][73524] Worker 26 uses CPU cores [26] [2024-06-13 09:22:31,188][73501] Worker 4 uses CPU cores [4] [2024-06-13 09:22:31,230][73512] Worker 14 uses CPU cores [14] [2024-06-13 09:22:31,236][73525] Worker 27 uses CPU cores [27] [2024-06-13 09:22:31,268][73529] Worker 31 uses CPU cores [31] [2024-06-13 09:22:31,711][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,711][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,711][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,711][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,711][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,711][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,711][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] 
RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,712][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,715][73477] RunningMeanStd input shape: (1,) [2024-06-13 09:22:31,715][73477] RunningMeanStd input shape: (1,) [2024-06-13 09:22:31,715][73477] RunningMeanStd input shape: (1,) [2024-06-13 09:22:31,716][73477] RunningMeanStd input shape: (1,) [2024-06-13 09:22:31,716][73477] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:31,755][73477] RunningMeanStd input shape: (1,) [2024-06-13 09:22:31,759][73477] Created Actor Critic model with architecture: [2024-06-13 09:22:31,759][73477] SampleFactoryAgentWrapper( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (agent): MettaAgent( (_encoder): MultiFeatureSetEncoder( (feature_set_encoders): ModuleDict( (grid_obs): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (converter): RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): 
RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:hp): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv_r1): RunningMeanStdInPlace() (agent:inv_r2): RunningMeanStdInPlace() (agent:inv_r3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (altar:hp): RunningMeanStdInPlace() (altar:state): RunningMeanStdInPlace() (converter:hp): RunningMeanStdInPlace() (converter:state): RunningMeanStdInPlace() (generator:amount): RunningMeanStdInPlace() (generator:hp): RunningMeanStdInPlace() (generator:state): RunningMeanStdInPlace() (wall:hp): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (global_vars): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (_steps): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_action): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_reward): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_reward): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): 
ELU(alpha=1.0) ) ) (kinship): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (kinship): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) ) (merged_encoder): Sequential( (0): Linear(in_features=544, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (_core): ModelCoreRNN( (core): GRU(512, 512) ) (_decoder): Decoder( (mlp): Identity() ) (_critic_linear): Linear(in_features=512, out_features=1, bias=True) (_action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) ) [2024-06-13 09:22:31,831][73477] Using optimizer [2024-06-13 09:22:32,014][73477] Loading state from checkpoint /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000212499_3481583616.pth... [2024-06-13 09:22:32,030][73477] Loading model from checkpoint [2024-06-13 09:22:32,032][73477] Loaded experiment state at self.train_step=212499, self.env_steps=3481583616 [2024-06-13 09:22:32,032][73477] Initialized policy 0 weights for model version 212499 [2024-06-13 09:22:32,034][73477] LearnerWorker_p0 finished initialization! 
[2024-06-13 09:22:32,034][73477] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-13 09:22:32,742][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,742][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,743][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,746][73497] RunningMeanStd input shape: (1,) [2024-06-13 09:22:32,747][73497] RunningMeanStd input shape: (1,) [2024-06-13 09:22:32,747][73497] RunningMeanStd input shape: (1,) [2024-06-13 09:22:32,747][73497] RunningMeanStd input shape: (1,) [2024-06-13 09:22:32,747][73497] RunningMeanStd input shape: (11, 11) [2024-06-13 09:22:32,786][73497] 
RunningMeanStd input shape: (1,) [2024-06-13 09:22:32,809][73265] Inference worker 0-0 is ready! [2024-06-13 09:22:32,809][73265] All inference workers are ready! Signal rollout workers to start! [2024-06-13 09:22:35,233][73514] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,239][73520] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,241][73524] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,241][73522] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,241][73515] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,246][73516] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,254][73529] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,255][73517] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,257][73518] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,258][73523] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,260][73521] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,262][73527] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,265][73528] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,269][73526] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,294][73519] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,305][73511] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,307][73513] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,313][73506] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,322][73503] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,325][73504] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,325][73505] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,327][73498] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,328][73499] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,336][73510] Decorrelating experience for 0 frames... 
[2024-06-13 09:22:35,336][73500] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,336][73502] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,338][73512] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,338][73501] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,342][73509] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,344][73508] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,345][73507] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,368][73525] Decorrelating experience for 0 frames... [2024-06-13 09:22:35,502][73265] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 3481583616. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-13 09:22:36,603][73514] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,619][73520] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,635][73522] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,639][73515] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,641][73524] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,653][73516] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,682][73529] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,683][73517] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,685][73523] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,687][73521] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,688][73518] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,694][73513] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,698][73528] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,703][73511] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,708][73527] Decorrelating experience for 256 frames... 
[2024-06-13 09:22:36,713][73526] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,717][73506] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,742][73504] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,751][73503] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,755][73498] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,755][73505] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,756][73500] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,761][73512] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,768][73499] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,774][73501] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,776][73502] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,779][73510] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,788][73507] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,790][73509] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,792][73519] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,793][73508] Decorrelating experience for 256 frames... [2024-06-13 09:22:36,840][73525] Decorrelating experience for 256 frames... [2024-06-13 09:22:40,501][73265] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 3481583616. Throughput: 0: 0.0. Samples: 0. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-13 09:22:44,534][73512] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-06-13 09:22:44,551][73503] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-06-13 09:22:44,558][73513] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-06-13 09:22:44,559][73509] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-06-13 09:22:44,560][73511] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-06-13 09:22:44,570][73499] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-06-13 09:22:44,582][73500] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-06-13 09:22:44,600][73501] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-06-13 09:22:44,608][73510] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-06-13 09:22:44,611][73507] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-06-13 09:22:44,623][73505] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-06-13 09:22:44,635][73502] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-06-13 09:22:44,650][73508] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-06-13 09:22:44,650][73523] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-06-13 09:22:44,685][73477] Signal inference workers to stop experience collection... [2024-06-13 09:22:44,690][73497] InferenceWorker_p0-w0: stopping experience collection [2024-06-13 09:22:45,309][73477] Signal inference workers to resume experience collection... 
[2024-06-13 09:22:45,309][73497] InferenceWorker_p0-w0: resuming experience collection [2024-06-13 09:22:45,325][73526] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-06-13 09:22:45,342][73520] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-06-13 09:22:45,501][73265] Fps is (10 sec: 1638.5, 60 sec: 1638.5, 300 sec: 1638.5). Total num frames: 3481600000. Throughput: 0: 31559.0. Samples: 315580. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-13 09:22:45,502][73265] Avg episode reward: [(0, '0.000')] [2024-06-13 09:22:45,611][73529] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-06-13 09:22:45,671][73516] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-06-13 09:22:45,671][73515] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-06-13 09:22:45,786][73521] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-06-13 09:22:45,808][73527] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-06-13 09:22:45,906][73524] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-06-13 09:22:45,962][73519] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-06-13 09:22:46,079][73528] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-06-13 09:22:46,104][73517] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-06-13 09:22:46,161][73525] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-06-13 09:22:46,210][73522] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-06-13 09:22:46,400][73506] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-06-13 09:22:46,571][73504] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-06-13 09:22:47,551][73514] Worker 16, sleep for 75.000 sec to decorrelate experience 
collection [2024-06-13 09:22:47,698][73497] Updated weights for policy 0, policy_version 212509 (0.0017) [2024-06-13 09:22:47,764][73518] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-06-13 09:22:48,355][73265] Heartbeat connected on Batcher_0 [2024-06-13 09:22:48,357][73265] Heartbeat connected on LearnerWorker_p0 [2024-06-13 09:22:48,374][73265] Heartbeat connected on RolloutWorker_w0 [2024-06-13 09:22:48,397][73265] Heartbeat connected on InferenceWorker_p0-w0 [2024-06-13 09:22:49,280][73499] Worker 1 awakens! [2024-06-13 09:22:49,285][73265] Heartbeat connected on RolloutWorker_w1 [2024-06-13 09:22:50,501][73265] Fps is (10 sec: 16383.9, 60 sec: 10922.8, 300 sec: 10922.8). Total num frames: 3481747456. Throughput: 0: 22080.2. Samples: 331200. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-13 09:22:50,502][73265] Avg episode reward: [(0, '0.000')] [2024-06-13 09:22:54,004][73500] Worker 2 awakens! [2024-06-13 09:22:54,008][73265] Heartbeat connected on RolloutWorker_w2 [2024-06-13 09:22:55,502][73265] Fps is (10 sec: 16383.5, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 3481763840. Throughput: 0: 17128.0. Samples: 342560. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-13 09:22:55,510][73265] Avg episode reward: [(0, '0.000')] [2024-06-13 09:22:58,692][73511] Worker 3 awakens! [2024-06-13 09:22:58,701][73265] Heartbeat connected on RolloutWorker_w3 [2024-06-13 09:23:00,501][73265] Fps is (10 sec: 3276.8, 60 sec: 7864.4, 300 sec: 7864.4). Total num frames: 3481780224. Throughput: 0: 14510.5. Samples: 362760. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-13 09:23:00,502][73265] Avg episode reward: [(0, '0.000')] [2024-06-13 09:23:03,444][73501] Worker 4 awakens! [2024-06-13 09:23:03,449][73265] Heartbeat connected on RolloutWorker_w4 [2024-06-13 09:23:05,501][73265] Fps is (10 sec: 6553.8, 60 sec: 8192.1, 300 sec: 8192.1). Total num frames: 3481829376. Throughput: 0: 12513.5. Samples: 375400. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-13 09:23:05,502][73265] Avg episode reward: [(0, '0.000')] [2024-06-13 09:23:08,063][73503] Worker 5 awakens! [2024-06-13 09:23:08,067][73265] Heartbeat connected on RolloutWorker_w5 [2024-06-13 09:23:10,501][73265] Fps is (10 sec: 11468.9, 60 sec: 8894.3, 300 sec: 8894.3). Total num frames: 3481894912. Throughput: 0: 13236.1. Samples: 463260. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-13 09:23:10,509][73265] Avg episode reward: [(0, '0.229')] [2024-06-13 09:23:10,910][73497] Updated weights for policy 0, policy_version 212519 (0.0015) [2024-06-13 09:23:14,796][73504] Worker 6 awakens! [2024-06-13 09:23:14,802][73265] Heartbeat connected on RolloutWorker_w6 [2024-06-13 09:23:15,501][73265] Fps is (10 sec: 16383.9, 60 sec: 10240.1, 300 sec: 10240.1). Total num frames: 3481993216. Throughput: 0: 14051.1. Samples: 562040. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-13 09:23:15,502][73265] Avg episode reward: [(0, '0.161')] [2024-06-13 09:23:17,560][73508] Worker 7 awakens! [2024-06-13 09:23:17,564][73265] Heartbeat connected on RolloutWorker_w7 [2024-06-13 09:23:18,150][73497] Updated weights for policy 0, policy_version 212529 (0.0012) [2024-06-13 09:23:20,501][73265] Fps is (10 sec: 21298.9, 60 sec: 11650.9, 300 sec: 11650.9). Total num frames: 3482107904. Throughput: 0: 14063.2. Samples: 632840. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-13 09:23:20,502][73265] Avg episode reward: [(0, '0.213')] [2024-06-13 09:23:22,220][73505] Worker 8 awakens! [2024-06-13 09:23:22,223][73265] Heartbeat connected on RolloutWorker_w8 [2024-06-13 09:23:24,906][73497] Updated weights for policy 0, policy_version 212539 (0.0011) [2024-06-13 09:23:25,501][73265] Fps is (10 sec: 24575.9, 60 sec: 13107.3, 300 sec: 13107.3). Total num frames: 3482238976. Throughput: 0: 17536.0. Samples: 789120. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-13 09:23:25,509][73265] Avg episode reward: [(0, '0.346')] [2024-06-13 09:23:25,514][73477] Saving new best policy, reward=0.346! [2024-06-13 09:23:28,687][73506] Worker 9 awakens! [2024-06-13 09:23:28,695][73265] Heartbeat connected on RolloutWorker_w9 [2024-06-13 09:23:30,501][73265] Fps is (10 sec: 24576.2, 60 sec: 14000.9, 300 sec: 14000.9). Total num frames: 3482353664. Throughput: 0: 14018.2. Samples: 946400. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-13 09:23:30,502][73265] Avg episode reward: [(0, '0.346')] [2024-06-13 09:23:31,367][73497] Updated weights for policy 0, policy_version 212549 (0.0012) [2024-06-13 09:23:31,585][73507] Worker 10 awakens! [2024-06-13 09:23:31,589][73265] Heartbeat connected on RolloutWorker_w10 [2024-06-13 09:23:35,501][73265] Fps is (10 sec: 31129.5, 60 sec: 16111.0, 300 sec: 16111.0). Total num frames: 3482550272. Throughput: 0: 15773.4. Samples: 1041000. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-13 09:23:35,502][73265] Avg episode reward: [(0, '0.409')] [2024-06-13 09:23:35,520][73477] Saving new best policy, reward=0.409! [2024-06-13 09:23:35,525][73497] Updated weights for policy 0, policy_version 212559 (0.0013) [2024-06-13 09:23:36,271][73510] Worker 11 awakens! [2024-06-13 09:23:36,276][73265] Heartbeat connected on RolloutWorker_w11 [2024-06-13 09:23:40,501][73265] Fps is (10 sec: 36044.4, 60 sec: 18841.6, 300 sec: 17392.3). Total num frames: 3482714112. Throughput: 0: 20309.4. Samples: 1256480. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-13 09:23:40,509][73265] Avg episode reward: [(0, '0.291')] [2024-06-13 09:23:40,760][73497] Updated weights for policy 0, policy_version 212569 (0.0013) [2024-06-13 09:23:40,909][73509] Worker 12 awakens! 
[2024-06-13 09:23:40,913][73265] Heartbeat connected on RolloutWorker_w12 [2024-06-13 09:23:44,823][73497] Updated weights for policy 0, policy_version 212579 (0.0014) [2024-06-13 09:23:45,501][73265] Fps is (10 sec: 37683.0, 60 sec: 22118.3, 300 sec: 19192.7). Total num frames: 3482927104. Throughput: 0: 24950.2. Samples: 1485520. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-13 09:23:45,502][73265] Avg episode reward: [(0, '0.328')] [2024-06-13 09:23:45,672][73502] Worker 13 awakens! [2024-06-13 09:23:45,678][73265] Heartbeat connected on RolloutWorker_w13 [2024-06-13 09:23:48,536][73497] Updated weights for policy 0, policy_version 212589 (0.0017) [2024-06-13 09:23:50,260][73512] Worker 14 awakens! [2024-06-13 09:23:50,265][73265] Heartbeat connected on RolloutWorker_w14 [2024-06-13 09:23:50,502][73265] Fps is (10 sec: 40959.6, 60 sec: 22937.6, 300 sec: 20534.6). Total num frames: 3483123712. Throughput: 0: 27550.5. Samples: 1615180. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-13 09:23:50,502][73265] Avg episode reward: [(0, '0.293')] [2024-06-13 09:23:53,245][73497] Updated weights for policy 0, policy_version 212599 (0.0022) [2024-06-13 09:23:54,970][73513] Worker 15 awakens! [2024-06-13 09:23:54,976][73265] Heartbeat connected on RolloutWorker_w15 [2024-06-13 09:23:55,501][73265] Fps is (10 sec: 37683.2, 60 sec: 25668.3, 300 sec: 21504.0). Total num frames: 3483303936. Throughput: 0: 30518.1. Samples: 1836580. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-13 09:23:55,502][73265] Avg episode reward: [(0, '0.327')] [2024-06-13 09:23:57,499][73497] Updated weights for policy 0, policy_version 212609 (0.0027) [2024-06-13 09:24:00,502][73265] Fps is (10 sec: 36044.8, 60 sec: 28398.9, 300 sec: 22359.4). Total num frames: 3483484160. Throughput: 0: 33210.1. Samples: 2056500. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-13 09:24:00,502][73265] Avg episode reward: [(0, '0.364')] [2024-06-13 09:24:02,282][73497] Updated weights for policy 0, policy_version 212619 (0.0024) [2024-06-13 09:24:02,652][73514] Worker 16 awakens! [2024-06-13 09:24:02,662][73265] Heartbeat connected on RolloutWorker_w16 [2024-06-13 09:24:05,442][73515] Worker 17 awakens! [2024-06-13 09:24:05,448][73265] Heartbeat connected on RolloutWorker_w17 [2024-06-13 09:24:05,501][73265] Fps is (10 sec: 36045.1, 60 sec: 30583.4, 300 sec: 23119.7). Total num frames: 3483664384. Throughput: 0: 34027.1. Samples: 2164060. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-13 09:24:05,502][73265] Avg episode reward: [(0, '0.360')] [2024-06-13 09:24:06,074][73497] Updated weights for policy 0, policy_version 212629 (0.0023) [2024-06-13 09:24:10,387][73497] Updated weights for policy 0, policy_version 212639 (0.0023) [2024-06-13 09:24:10,501][73265] Fps is (10 sec: 39321.8, 60 sec: 33041.0, 300 sec: 24144.9). Total num frames: 3483877376. Throughput: 0: 35764.8. Samples: 2398540. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-13 09:24:10,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 09:24:10,502][73477] Saving new best policy, reward=0.431! [2024-06-13 09:24:12,236][73518] Worker 18 awakens! [2024-06-13 09:24:12,245][73265] Heartbeat connected on RolloutWorker_w18 [2024-06-13 09:24:14,741][73497] Updated weights for policy 0, policy_version 212649 (0.0029) [2024-06-13 09:24:14,833][73516] Worker 19 awakens! [2024-06-13 09:24:14,841][73265] Heartbeat connected on RolloutWorker_w19 [2024-06-13 09:24:15,501][73265] Fps is (10 sec: 39321.3, 60 sec: 34406.3, 300 sec: 24739.9). Total num frames: 3484057600. Throughput: 0: 37342.1. Samples: 2626800. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-13 09:24:15,502][73265] Avg episode reward: [(0, '0.350')] [2024-06-13 09:24:18,481][73497] Updated weights for policy 0, policy_version 212659 (0.0027) [2024-06-13 09:24:19,868][73517] Worker 20 awakens! [2024-06-13 09:24:19,876][73265] Heartbeat connected on RolloutWorker_w20 [2024-06-13 09:24:20,501][73265] Fps is (10 sec: 39321.9, 60 sec: 36044.8, 300 sec: 25590.3). Total num frames: 3484270592. Throughput: 0: 37968.4. Samples: 2749580. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-13 09:24:20,502][73265] Avg episode reward: [(0, '0.353')] [2024-06-13 09:24:22,760][73497] Updated weights for policy 0, policy_version 212669 (0.0027) [2024-06-13 09:24:24,416][73519] Worker 21 awakens! [2024-06-13 09:24:24,423][73265] Heartbeat connected on RolloutWorker_w21 [2024-06-13 09:24:25,502][73265] Fps is (10 sec: 44236.7, 60 sec: 37683.1, 300 sec: 26512.3). Total num frames: 3484499968. Throughput: 0: 38655.5. Samples: 2995980. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-13 09:24:25,502][73265] Avg episode reward: [(0, '0.384')] [2024-06-13 09:24:25,510][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000212677_3484499968.pth... [2024-06-13 09:24:25,550][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000212142_3475734528.pth [2024-06-13 09:24:26,653][73497] Updated weights for policy 0, policy_version 212679 (0.0021) [2024-06-13 09:24:28,542][73520] Worker 22 awakens! [2024-06-13 09:24:28,551][73265] Heartbeat connected on RolloutWorker_w22 [2024-06-13 09:24:30,429][73497] Updated weights for policy 0, policy_version 212689 (0.0030) [2024-06-13 09:24:30,501][73265] Fps is (10 sec: 42598.4, 60 sec: 39048.5, 300 sec: 27069.3). Total num frames: 3484696576. Throughput: 0: 39208.0. Samples: 3249880. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-13 09:24:30,502][73265] Avg episode reward: [(0, '0.378')] [2024-06-13 09:24:33,700][73521] Worker 23 awakens! [2024-06-13 09:24:33,708][73265] Heartbeat connected on RolloutWorker_w23 [2024-06-13 09:24:34,404][73497] Updated weights for policy 0, policy_version 212699 (0.0026) [2024-06-13 09:24:35,501][73265] Fps is (10 sec: 40960.3, 60 sec: 39321.6, 300 sec: 27716.3). Total num frames: 3484909568. Throughput: 0: 39237.4. Samples: 3380860. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-13 09:24:35,502][73265] Avg episode reward: [(0, '0.315')] [2024-06-13 09:24:38,161][73497] Updated weights for policy 0, policy_version 212709 (0.0026) [2024-06-13 09:24:38,804][73522] Worker 24 awakens! [2024-06-13 09:24:38,813][73265] Heartbeat connected on RolloutWorker_w24 [2024-06-13 09:24:40,501][73265] Fps is (10 sec: 42598.5, 60 sec: 40140.8, 300 sec: 28311.6). Total num frames: 3485122560. Throughput: 0: 40079.2. Samples: 3640140. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-13 09:24:40,502][73265] Avg episode reward: [(0, '0.342')] [2024-06-13 09:24:41,908][73523] Worker 25 awakens! [2024-06-13 09:24:41,919][73265] Heartbeat connected on RolloutWorker_w25 [2024-06-13 09:24:42,047][73497] Updated weights for policy 0, policy_version 212719 (0.0025) [2024-06-13 09:24:45,424][73497] Updated weights for policy 0, policy_version 212729 (0.0034) [2024-06-13 09:24:45,501][73265] Fps is (10 sec: 44236.6, 60 sec: 40413.9, 300 sec: 28987.1). Total num frames: 3485351936. Throughput: 0: 40929.9. Samples: 3898340. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-13 09:24:45,502][73265] Avg episode reward: [(0, '0.375')] [2024-06-13 09:24:47,880][73524] Worker 26 awakens! 
[2024-06-13 09:24:47,890][73265] Heartbeat connected on RolloutWorker_w26 [2024-06-13 09:24:49,310][73497] Updated weights for policy 0, policy_version 212739 (0.0027) [2024-06-13 09:24:50,501][73265] Fps is (10 sec: 42598.4, 60 sec: 40414.0, 300 sec: 29369.9). Total num frames: 3485548544. Throughput: 0: 41482.7. Samples: 4030780. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-13 09:24:50,502][73265] Avg episode reward: [(0, '0.285')] [2024-06-13 09:24:52,824][73525] Worker 27 awakens! [2024-06-13 09:24:52,834][73265] Heartbeat connected on RolloutWorker_w27 [2024-06-13 09:24:53,082][73497] Updated weights for policy 0, policy_version 212749 (0.0025) [2024-06-13 09:24:55,502][73265] Fps is (10 sec: 42597.7, 60 sec: 41233.0, 300 sec: 29959.3). Total num frames: 3485777920. Throughput: 0: 42143.4. Samples: 4295000. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-13 09:24:55,502][73265] Avg episode reward: [(0, '0.317')] [2024-06-13 09:24:56,972][73497] Updated weights for policy 0, policy_version 212759 (0.0031) [2024-06-13 09:24:57,136][73527] Worker 28 awakens! [2024-06-13 09:24:57,147][73265] Heartbeat connected on RolloutWorker_w28 [2024-06-13 09:25:00,177][73497] Updated weights for policy 0, policy_version 212769 (0.0040) [2024-06-13 09:25:00,501][73265] Fps is (10 sec: 45874.9, 60 sec: 42052.3, 300 sec: 30508.2). Total num frames: 3486007296. Throughput: 0: 43044.9. Samples: 4563820. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-13 09:25:00,502][73265] Avg episode reward: [(0, '0.369')] [2024-06-13 09:25:01,360][73526] Worker 29 awakens! [2024-06-13 09:25:01,369][73265] Heartbeat connected on RolloutWorker_w29 [2024-06-13 09:25:04,143][73497] Updated weights for policy 0, policy_version 212779 (0.0036) [2024-06-13 09:25:05,501][73265] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 30911.2). Total num frames: 3486220288. Throughput: 0: 43353.8. Samples: 4700500. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:25:05,502][73265] Avg episode reward: [(0, '0.321')] [2024-06-13 09:25:06,792][73528] Worker 30 awakens! [2024-06-13 09:25:06,806][73265] Heartbeat connected on RolloutWorker_w30 [2024-06-13 09:25:07,830][73497] Updated weights for policy 0, policy_version 212789 (0.0027) [2024-06-13 09:25:10,501][73265] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 31393.9). Total num frames: 3486449664. Throughput: 0: 43954.8. Samples: 4973940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:25:10,502][73265] Avg episode reward: [(0, '0.367')] [2024-06-13 09:25:10,932][73529] Worker 31 awakens! [2024-06-13 09:25:10,944][73265] Heartbeat connected on RolloutWorker_w31 [2024-06-13 09:25:11,002][73497] Updated weights for policy 0, policy_version 212799 (0.0033) [2024-06-13 09:25:14,534][73497] Updated weights for policy 0, policy_version 212809 (0.0035) [2024-06-13 09:25:15,501][73265] Fps is (10 sec: 47513.7, 60 sec: 43963.8, 300 sec: 31948.8). Total num frames: 3486695424. Throughput: 0: 44362.7. Samples: 5246200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:25:15,502][73265] Avg episode reward: [(0, '0.372')] [2024-06-13 09:25:18,218][73497] Updated weights for policy 0, policy_version 212819 (0.0029) [2024-06-13 09:25:20,501][73265] Fps is (10 sec: 45875.4, 60 sec: 43963.8, 300 sec: 32271.6). Total num frames: 3486908416. Throughput: 0: 44663.2. Samples: 5390700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:25:20,502][73265] Avg episode reward: [(0, '0.291')] [2024-06-13 09:25:21,461][73497] Updated weights for policy 0, policy_version 212829 (0.0040) [2024-06-13 09:25:25,502][73265] Fps is (10 sec: 44236.1, 60 sec: 43963.7, 300 sec: 32671.6). Total num frames: 3487137792. Throughput: 0: 45057.6. Samples: 5667740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:25:25,502][73265] Avg episode reward: [(0, '0.369')] [2024-06-13 09:25:25,523][73497] Updated weights for policy 0, policy_version 212839 (0.0034) [2024-06-13 09:25:27,531][73477] Signal inference workers to stop experience collection... (50 times) [2024-06-13 09:25:27,533][73477] Signal inference workers to resume experience collection... (50 times) [2024-06-13 09:25:27,550][73497] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-06-13 09:25:27,584][73497] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-06-13 09:25:28,928][73497] Updated weights for policy 0, policy_version 212849 (0.0031) [2024-06-13 09:25:30,501][73265] Fps is (10 sec: 45874.8, 60 sec: 44509.9, 300 sec: 33048.9). Total num frames: 3487367168. Throughput: 0: 45245.4. Samples: 5934380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:25:30,502][73265] Avg episode reward: [(0, '0.344')] [2024-06-13 09:25:32,619][73497] Updated weights for policy 0, policy_version 212859 (0.0031) [2024-06-13 09:25:35,504][73265] Fps is (10 sec: 44226.6, 60 sec: 44508.1, 300 sec: 33313.7). Total num frames: 3487580160. Throughput: 0: 45290.4. Samples: 6068960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:25:35,504][73265] Avg episode reward: [(0, '0.385')] [2024-06-13 09:25:36,267][73497] Updated weights for policy 0, policy_version 212869 (0.0028) [2024-06-13 09:25:39,498][73497] Updated weights for policy 0, policy_version 212879 (0.0037) [2024-06-13 09:25:40,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45056.0, 300 sec: 33742.2). Total num frames: 3487825920. Throughput: 0: 45633.9. Samples: 6348520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:25:40,502][73265] Avg episode reward: [(0, '0.406')] [2024-06-13 09:25:43,210][73497] Updated weights for policy 0, policy_version 212889 (0.0030) [2024-06-13 09:25:45,501][73265] Fps is (10 sec: 47525.3, 60 sec: 45056.0, 300 sec: 34061.5). Total num frames: 3488055296. Throughput: 0: 45841.4. Samples: 6626680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:25:45,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 09:25:45,512][73477] Saving new best policy, reward=0.461! [2024-06-13 09:25:46,853][73497] Updated weights for policy 0, policy_version 212899 (0.0034) [2024-06-13 09:25:50,360][73497] Updated weights for policy 0, policy_version 212909 (0.0045) [2024-06-13 09:25:50,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45875.2, 300 sec: 34448.5). Total num frames: 3488301056. Throughput: 0: 45744.0. Samples: 6758980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:25:50,502][73265] Avg episode reward: [(0, '0.394')] [2024-06-13 09:25:54,268][73497] Updated weights for policy 0, policy_version 212919 (0.0033) [2024-06-13 09:25:55,502][73265] Fps is (10 sec: 44236.3, 60 sec: 45329.2, 300 sec: 34570.3). Total num frames: 3488497664. Throughput: 0: 45635.0. Samples: 7027520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:25:55,502][73265] Avg episode reward: [(0, '0.380')] [2024-06-13 09:25:57,706][73497] Updated weights for policy 0, policy_version 212929 (0.0031) [2024-06-13 09:26:00,504][73265] Fps is (10 sec: 44225.6, 60 sec: 45600.2, 300 sec: 34925.5). Total num frames: 3488743424. Throughput: 0: 45588.6. Samples: 7297800. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:26:00,505][73265] Avg episode reward: [(0, '0.408')] [2024-06-13 09:26:01,268][73497] Updated weights for policy 0, policy_version 212939 (0.0039) [2024-06-13 09:26:05,054][73497] Updated weights for policy 0, policy_version 212949 (0.0045) [2024-06-13 09:26:05,502][73265] Fps is (10 sec: 45875.2, 60 sec: 45602.1, 300 sec: 35108.6). Total num frames: 3488956416. Throughput: 0: 45385.1. Samples: 7433040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 09:26:05,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 09:26:08,453][73497] Updated weights for policy 0, policy_version 212959 (0.0036) [2024-06-13 09:26:10,501][73265] Fps is (10 sec: 44248.0, 60 sec: 45602.1, 300 sec: 35359.0). Total num frames: 3489185792. Throughput: 0: 45365.0. Samples: 7709160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 09:26:10,502][73265] Avg episode reward: [(0, '0.384')] [2024-06-13 09:26:12,043][73497] Updated weights for policy 0, policy_version 212969 (0.0047) [2024-06-13 09:26:15,502][73265] Fps is (10 sec: 47513.4, 60 sec: 45602.0, 300 sec: 35672.4). Total num frames: 3489431552. Throughput: 0: 45848.3. Samples: 7997560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 09:26:15,502][73265] Avg episode reward: [(0, '0.409')] [2024-06-13 09:26:15,619][73497] Updated weights for policy 0, policy_version 212979 (0.0045) [2024-06-13 09:26:19,159][73497] Updated weights for policy 0, policy_version 212989 (0.0037) [2024-06-13 09:26:20,502][73265] Fps is (10 sec: 49151.5, 60 sec: 46148.1, 300 sec: 35972.0). Total num frames: 3489677312. Throughput: 0: 45861.5. Samples: 8132620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 09:26:20,502][73265] Avg episode reward: [(0, '0.348')] [2024-06-13 09:26:22,513][73497] Updated weights for policy 0, policy_version 212999 (0.0029) [2024-06-13 09:26:25,501][73265] Fps is (10 sec: 44237.4, 60 sec: 45602.2, 300 sec: 36044.8). 
Total num frames: 3489873920. Throughput: 0: 45684.9. Samples: 8404340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 09:26:25,502][73265] Avg episode reward: [(0, '0.379')] [2024-06-13 09:26:25,516][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000213005_3489873920.pth... [2024-06-13 09:26:25,577][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000212499_3481583616.pth [2024-06-13 09:26:26,519][73497] Updated weights for policy 0, policy_version 213009 (0.0043) [2024-06-13 09:26:29,879][73497] Updated weights for policy 0, policy_version 213019 (0.0035) [2024-06-13 09:26:30,502][73265] Fps is (10 sec: 44236.7, 60 sec: 45875.1, 300 sec: 36323.7). Total num frames: 3490119680. Throughput: 0: 45325.2. Samples: 8666320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 09:26:30,502][73265] Avg episode reward: [(0, '0.395')] [2024-06-13 09:26:33,957][73497] Updated weights for policy 0, policy_version 213029 (0.0034) [2024-06-13 09:26:35,504][73265] Fps is (10 sec: 47502.0, 60 sec: 46148.3, 300 sec: 36522.3). Total num frames: 3490349056. Throughput: 0: 45609.9. Samples: 8811540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 09:26:35,504][73265] Avg episode reward: [(0, '0.353')] [2024-06-13 09:26:37,093][73497] Updated weights for policy 0, policy_version 213039 (0.0044) [2024-06-13 09:26:40,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45602.1, 300 sec: 36646.7). Total num frames: 3490562048. Throughput: 0: 45769.4. Samples: 9087140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 09:26:40,502][73265] Avg episode reward: [(0, '0.381')] [2024-06-13 09:26:40,816][73497] Updated weights for policy 0, policy_version 213049 (0.0036) [2024-06-13 09:26:44,041][73497] Updated weights for policy 0, policy_version 213059 (0.0043) [2024-06-13 09:26:45,501][73265] Fps is (10 sec: 44247.8, 60 sec: 45602.1, 300 sec: 36831.3). Total num frames: 3490791424. 
Throughput: 0: 45947.9. Samples: 9365340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 09:26:45,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 09:26:48,067][73497] Updated weights for policy 0, policy_version 213069 (0.0037) [2024-06-13 09:26:50,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45329.1, 300 sec: 37008.6). Total num frames: 3491020800. Throughput: 0: 45985.5. Samples: 9502380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 09:26:50,502][73265] Avg episode reward: [(0, '0.386')] [2024-06-13 09:26:51,383][73497] Updated weights for policy 0, policy_version 213079 (0.0041) [2024-06-13 09:26:55,258][73497] Updated weights for policy 0, policy_version 213089 (0.0031) [2024-06-13 09:26:55,267][73477] Signal inference workers to stop experience collection... (100 times) [2024-06-13 09:26:55,268][73477] Signal inference workers to resume experience collection... (100 times) [2024-06-13 09:26:55,295][73497] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-06-13 09:26:55,295][73497] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-06-13 09:26:55,501][73265] Fps is (10 sec: 47513.4, 60 sec: 46148.3, 300 sec: 37242.1). Total num frames: 3491266560. Throughput: 0: 45959.1. Samples: 9777320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 09:26:55,502][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 09:26:58,685][73497] Updated weights for policy 0, policy_version 213099 (0.0027) [2024-06-13 09:27:00,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45604.1, 300 sec: 37343.2). Total num frames: 3491479552. Throughput: 0: 45599.8. Samples: 10049540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 09:27:00,502][73265] Avg episode reward: [(0, '0.388')] [2024-06-13 09:27:02,591][73497] Updated weights for policy 0, policy_version 213109 (0.0032) [2024-06-13 09:27:05,502][73265] Fps is (10 sec: 45874.7, 60 sec: 46148.2, 300 sec: 37561.9). 
Total num frames: 3491725312. Throughput: 0: 45690.2. Samples: 10188680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 09:27:05,502][73265] Avg episode reward: [(0, '0.384')] [2024-06-13 09:27:05,511][73497] Updated weights for policy 0, policy_version 213119 (0.0028) [2024-06-13 09:27:09,568][73497] Updated weights for policy 0, policy_version 213129 (0.0036) [2024-06-13 09:27:10,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 37653.4). Total num frames: 3491938304. Throughput: 0: 45923.6. Samples: 10470900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 09:27:10,502][73265] Avg episode reward: [(0, '0.407')] [2024-06-13 09:27:12,646][73497] Updated weights for policy 0, policy_version 213139 (0.0041) [2024-06-13 09:27:15,502][73265] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 37741.7). Total num frames: 3492151296. Throughput: 0: 46060.4. Samples: 10739040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-13 09:27:15,502][73265] Avg episode reward: [(0, '0.395')] [2024-06-13 09:27:16,902][73497] Updated weights for policy 0, policy_version 213149 (0.0032) [2024-06-13 09:27:20,084][73497] Updated weights for policy 0, policy_version 213159 (0.0031) [2024-06-13 09:27:20,502][73265] Fps is (10 sec: 47512.8, 60 sec: 45602.1, 300 sec: 37999.4). Total num frames: 3492413440. Throughput: 0: 45808.1. Samples: 10872800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 09:27:20,502][73265] Avg episode reward: [(0, '0.395')] [2024-06-13 09:27:24,090][73497] Updated weights for policy 0, policy_version 213169 (0.0038) [2024-06-13 09:27:25,504][73265] Fps is (10 sec: 49140.4, 60 sec: 46146.4, 300 sec: 38134.9). Total num frames: 3492642816. Throughput: 0: 45963.3. Samples: 11155600. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 09:27:25,504][73265] Avg episode reward: [(0, '0.403')] [2024-06-13 09:27:26,969][73497] Updated weights for policy 0, policy_version 213179 (0.0046) [2024-06-13 09:27:30,501][73265] Fps is (10 sec: 44237.6, 60 sec: 45602.2, 300 sec: 38210.9). Total num frames: 3492855808. Throughput: 0: 45787.1. Samples: 11425760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 09:27:30,502][73265] Avg episode reward: [(0, '0.365')] [2024-06-13 09:27:31,189][73497] Updated weights for policy 0, policy_version 213189 (0.0029) [2024-06-13 09:27:33,690][73497] Updated weights for policy 0, policy_version 213199 (0.0033) [2024-06-13 09:27:35,503][73265] Fps is (10 sec: 47516.8, 60 sec: 46148.7, 300 sec: 39099.2). Total num frames: 3493117952. Throughput: 0: 45846.1. Samples: 11565540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 09:27:35,504][73265] Avg episode reward: [(0, '0.390')] [2024-06-13 09:27:38,124][73497] Updated weights for policy 0, policy_version 213209 (0.0039) [2024-06-13 09:27:40,504][73265] Fps is (10 sec: 50777.6, 60 sec: 46692.5, 300 sec: 39876.6). Total num frames: 3493363712. Throughput: 0: 46158.4. Samples: 11854560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 09:27:40,504][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 09:27:41,216][73497] Updated weights for policy 0, policy_version 213219 (0.0044) [2024-06-13 09:27:45,231][73497] Updated weights for policy 0, policy_version 213229 (0.0026) [2024-06-13 09:27:45,502][73265] Fps is (10 sec: 42605.5, 60 sec: 45875.1, 300 sec: 39988.1). Total num frames: 3493543936. Throughput: 0: 46186.0. Samples: 12127920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 09:27:45,502][73265] Avg episode reward: [(0, '0.401')] [2024-06-13 09:27:48,273][73497] Updated weights for policy 0, policy_version 213239 (0.0029) [2024-06-13 09:27:50,501][73265] Fps is (10 sec: 42609.0, 60 sec: 46148.2, 300 sec: 40765.6). 
Total num frames: 3493789696. Throughput: 0: 45986.8. Samples: 12258080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 09:27:50,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 09:27:52,587][73497] Updated weights for policy 0, policy_version 213249 (0.0039) [2024-06-13 09:27:55,379][73497] Updated weights for policy 0, policy_version 213259 (0.0038) [2024-06-13 09:27:55,501][73265] Fps is (10 sec: 49153.3, 60 sec: 46148.4, 300 sec: 41543.2). Total num frames: 3494035456. Throughput: 0: 45717.9. Samples: 12528200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 09:27:55,502][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 09:27:59,579][73497] Updated weights for policy 0, policy_version 213269 (0.0029) [2024-06-13 09:28:00,501][73265] Fps is (10 sec: 45875.7, 60 sec: 46148.3, 300 sec: 42098.5). Total num frames: 3494248448. Throughput: 0: 46116.2. Samples: 12814260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 09:28:00,502][73265] Avg episode reward: [(0, '0.399')] [2024-06-13 09:28:02,167][73497] Updated weights for policy 0, policy_version 213279 (0.0039) [2024-06-13 09:28:05,501][73265] Fps is (10 sec: 42598.1, 60 sec: 45602.2, 300 sec: 42598.4). Total num frames: 3494461440. Throughput: 0: 46184.6. Samples: 12951100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 09:28:05,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 09:28:06,750][73497] Updated weights for policy 0, policy_version 213289 (0.0030) [2024-06-13 09:28:09,645][73497] Updated weights for policy 0, policy_version 213299 (0.0029) [2024-06-13 09:28:10,501][73265] Fps is (10 sec: 49151.5, 60 sec: 46694.4, 300 sec: 43209.3). Total num frames: 3494739968. Throughput: 0: 45977.2. Samples: 13224460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 09:28:10,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 09:28:13,556][73477] Signal inference workers to stop experience collection... 
(150 times) [2024-06-13 09:28:13,609][73497] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-06-13 09:28:13,609][73477] Signal inference workers to resume experience collection... (150 times) [2024-06-13 09:28:13,631][73497] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-06-13 09:28:13,744][73497] Updated weights for policy 0, policy_version 213309 (0.0031) [2024-06-13 09:28:15,501][73265] Fps is (10 sec: 50790.4, 60 sec: 46967.6, 300 sec: 43598.1). Total num frames: 3494969344. Throughput: 0: 46104.4. Samples: 13500460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 09:28:15,502][73265] Avg episode reward: [(0, '0.409')] [2024-06-13 09:28:16,514][73497] Updated weights for policy 0, policy_version 213319 (0.0029) [2024-06-13 09:28:20,501][73265] Fps is (10 sec: 40960.2, 60 sec: 45602.3, 300 sec: 43764.7). Total num frames: 3495149568. Throughput: 0: 46324.6. Samples: 13650060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 09:28:20,502][73265] Avg episode reward: [(0, '0.373')] [2024-06-13 09:28:20,933][73497] Updated weights for policy 0, policy_version 213329 (0.0034) [2024-06-13 09:28:23,763][73497] Updated weights for policy 0, policy_version 213339 (0.0031) [2024-06-13 09:28:25,501][73265] Fps is (10 sec: 44236.8, 60 sec: 46150.2, 300 sec: 44264.6). Total num frames: 3495411712. Throughput: 0: 45784.8. Samples: 13914760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 09:28:25,502][73265] Avg episode reward: [(0, '0.358')] [2024-06-13 09:28:25,517][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000213343_3495411712.pth... [2024-06-13 09:28:25,556][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000212677_3484499968.pth [2024-06-13 09:28:27,926][73497] Updated weights for policy 0, policy_version 213349 (0.0038) [2024-06-13 09:28:30,501][73265] Fps is (10 sec: 50790.3, 60 sec: 46694.4, 300 sec: 44431.2). 
Total num frames: 3495657472. Throughput: 0: 45936.6. Samples: 14195060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 09:28:30,502][73265] Avg episode reward: [(0, '0.396')] [2024-06-13 09:28:30,534][73497] Updated weights for policy 0, policy_version 213359 (0.0035) [2024-06-13 09:28:34,854][73497] Updated weights for policy 0, policy_version 213369 (0.0034) [2024-06-13 09:28:35,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45603.5, 300 sec: 44542.3). Total num frames: 3495854080. Throughput: 0: 46254.6. Samples: 14339540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 09:28:35,502][73265] Avg episode reward: [(0, '0.366')] [2024-06-13 09:28:37,650][73497] Updated weights for policy 0, policy_version 213379 (0.0035) [2024-06-13 09:28:40,501][73265] Fps is (10 sec: 42598.3, 60 sec: 45331.0, 300 sec: 44597.8). Total num frames: 3496083456. Throughput: 0: 46368.3. Samples: 14614780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 09:28:40,502][73265] Avg episode reward: [(0, '0.408')] [2024-06-13 09:28:41,986][73497] Updated weights for policy 0, policy_version 213389 (0.0039) [2024-06-13 09:28:44,645][73497] Updated weights for policy 0, policy_version 213399 (0.0038) [2024-06-13 09:28:45,501][73265] Fps is (10 sec: 47513.6, 60 sec: 46421.4, 300 sec: 44764.4). Total num frames: 3496329216. Throughput: 0: 45949.2. Samples: 14881980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 09:28:45,510][73265] Avg episode reward: [(0, '0.377')] [2024-06-13 09:28:49,459][73497] Updated weights for policy 0, policy_version 213409 (0.0025) [2024-06-13 09:28:50,501][73265] Fps is (10 sec: 49152.3, 60 sec: 46421.4, 300 sec: 44986.6). Total num frames: 3496574976. Throughput: 0: 46186.3. Samples: 15029480. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 09:28:50,502][73265] Avg episode reward: [(0, '0.406')] [2024-06-13 09:28:52,012][73497] Updated weights for policy 0, policy_version 213419 (0.0035) [2024-06-13 09:28:55,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45602.0, 300 sec: 45042.1). Total num frames: 3496771584. Throughput: 0: 46210.2. Samples: 15303920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 09:28:55,502][73265] Avg episode reward: [(0, '0.406')] [2024-06-13 09:28:56,534][73497] Updated weights for policy 0, policy_version 213429 (0.0028) [2024-06-13 09:28:58,883][73497] Updated weights for policy 0, policy_version 213439 (0.0033) [2024-06-13 09:29:00,501][73265] Fps is (10 sec: 44236.7, 60 sec: 46148.2, 300 sec: 45264.3). Total num frames: 3497017344. Throughput: 0: 46214.2. Samples: 15580100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 09:29:00,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 09:29:03,430][73497] Updated weights for policy 0, policy_version 213449 (0.0039) [2024-06-13 09:29:05,501][73265] Fps is (10 sec: 50790.8, 60 sec: 46967.5, 300 sec: 45430.9). Total num frames: 3497279488. Throughput: 0: 46177.3. Samples: 15728040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 09:29:05,502][73265] Avg episode reward: [(0, '0.405')] [2024-06-13 09:29:06,098][73497] Updated weights for policy 0, policy_version 213459 (0.0045) [2024-06-13 09:29:10,504][73265] Fps is (10 sec: 42587.6, 60 sec: 45054.1, 300 sec: 45375.0). Total num frames: 3497443328. Throughput: 0: 46217.4. Samples: 15994660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 09:29:10,505][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 09:29:10,820][73497] Updated weights for policy 0, policy_version 213469 (0.0031) [2024-06-13 09:29:11,194][73477] Signal inference workers to stop experience collection... 
(200 times) [2024-06-13 09:29:11,252][73497] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-06-13 09:29:11,308][73477] Signal inference workers to resume experience collection... (200 times) [2024-06-13 09:29:11,309][73497] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-06-13 09:29:13,277][73497] Updated weights for policy 0, policy_version 213479 (0.0033) [2024-06-13 09:29:15,501][73265] Fps is (10 sec: 40959.9, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3497689088. Throughput: 0: 45966.2. Samples: 16263540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 09:29:15,502][73265] Avg episode reward: [(0, '0.405')] [2024-06-13 09:29:17,826][73497] Updated weights for policy 0, policy_version 213489 (0.0029) [2024-06-13 09:29:20,501][73265] Fps is (10 sec: 50802.8, 60 sec: 46694.3, 300 sec: 45597.5). Total num frames: 3497951232. Throughput: 0: 45768.9. Samples: 16399140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 09:29:20,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 09:29:20,640][73497] Updated weights for policy 0, policy_version 213499 (0.0036) [2024-06-13 09:29:24,828][73497] Updated weights for policy 0, policy_version 213509 (0.0028) [2024-06-13 09:29:25,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45875.2, 300 sec: 45653.1). Total num frames: 3498164224. Throughput: 0: 46037.8. Samples: 16686480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 09:29:25,502][73265] Avg episode reward: [(0, '0.387')] [2024-06-13 09:29:27,545][73497] Updated weights for policy 0, policy_version 213519 (0.0027) [2024-06-13 09:29:30,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 3498377216. Throughput: 0: 46260.9. Samples: 16963720. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 09:29:30,503][73265] Avg episode reward: [(0, '0.400')] [2024-06-13 09:29:32,061][73497] Updated weights for policy 0, policy_version 213529 (0.0031) [2024-06-13 09:29:34,460][73497] Updated weights for policy 0, policy_version 213539 (0.0036) [2024-06-13 09:29:35,504][73265] Fps is (10 sec: 47501.5, 60 sec: 46419.4, 300 sec: 45819.3). Total num frames: 3498639360. Throughput: 0: 45941.4. Samples: 17096960. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-13 09:29:35,504][73265] Avg episode reward: [(0, '0.389')] [2024-06-13 09:29:38,896][73497] Updated weights for policy 0, policy_version 213549 (0.0032) [2024-06-13 09:29:40,501][73265] Fps is (10 sec: 49152.5, 60 sec: 46421.4, 300 sec: 45819.7). Total num frames: 3498868736. Throughput: 0: 46063.7. Samples: 17376780. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-13 09:29:40,502][73265] Avg episode reward: [(0, '0.371')] [2024-06-13 09:29:41,807][73497] Updated weights for policy 0, policy_version 213559 (0.0034) [2024-06-13 09:29:45,501][73265] Fps is (10 sec: 40970.4, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 3499048960. Throughput: 0: 46154.7. Samples: 17657060. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-13 09:29:45,502][73265] Avg episode reward: [(0, '0.408')] [2024-06-13 09:29:45,971][73497] Updated weights for policy 0, policy_version 213569 (0.0033) [2024-06-13 09:29:49,162][73497] Updated weights for policy 0, policy_version 213579 (0.0024) [2024-06-13 09:29:50,501][73265] Fps is (10 sec: 44236.3, 60 sec: 45602.1, 300 sec: 45875.2). Total num frames: 3499311104. Throughput: 0: 45605.7. Samples: 17780300. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-13 09:29:50,502][73265] Avg episode reward: [(0, '0.409')] [2024-06-13 09:29:53,297][73497] Updated weights for policy 0, policy_version 213589 (0.0030) [2024-06-13 09:29:55,501][73265] Fps is (10 sec: 52428.5, 60 sec: 46694.4, 300 sec: 45986.3). 
Total num frames: 3499573248. Throughput: 0: 46095.9. Samples: 18068860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-13 09:29:55,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 09:29:56,079][73497] Updated weights for policy 0, policy_version 213599 (0.0032) [2024-06-13 09:30:00,422][73497] Updated weights for policy 0, policy_version 213609 (0.0038) [2024-06-13 09:30:00,502][73265] Fps is (10 sec: 45872.2, 60 sec: 45874.6, 300 sec: 45930.6). Total num frames: 3499769856. Throughput: 0: 46342.8. Samples: 18349000. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-13 09:30:00,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 09:30:03,122][73497] Updated weights for policy 0, policy_version 213619 (0.0027) [2024-06-13 09:30:05,501][73265] Fps is (10 sec: 40960.2, 60 sec: 45056.0, 300 sec: 45875.2). Total num frames: 3499982848. Throughput: 0: 45947.2. Samples: 18466760. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-13 09:30:05,502][73265] Avg episode reward: [(0, '0.416')] [2024-06-13 09:30:07,573][73497] Updated weights for policy 0, policy_version 213629 (0.0032) [2024-06-13 09:30:10,327][73497] Updated weights for policy 0, policy_version 213639 (0.0040) [2024-06-13 09:30:10,502][73265] Fps is (10 sec: 49155.0, 60 sec: 46969.4, 300 sec: 45986.3). Total num frames: 3500261376. Throughput: 0: 45911.0. Samples: 18752480. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-13 09:30:10,502][73265] Avg episode reward: [(0, '0.392')] [2024-06-13 09:30:14,555][73497] Updated weights for policy 0, policy_version 213649 (0.0034) [2024-06-13 09:30:15,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 3500441600. Throughput: 0: 45866.3. Samples: 19027700. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-13 09:30:15,502][73265] Avg episode reward: [(0, '0.376')] [2024-06-13 09:30:17,533][73497] Updated weights for policy 0, policy_version 213659 (0.0030) [2024-06-13 09:30:20,501][73265] Fps is (10 sec: 40960.6, 60 sec: 45329.2, 300 sec: 45875.2). Total num frames: 3500670976. Throughput: 0: 45897.3. Samples: 19162220. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-13 09:30:20,502][73265] Avg episode reward: [(0, '0.406')] [2024-06-13 09:30:21,652][73497] Updated weights for policy 0, policy_version 213669 (0.0034) [2024-06-13 09:30:24,820][73497] Updated weights for policy 0, policy_version 213679 (0.0045) [2024-06-13 09:30:25,502][73265] Fps is (10 sec: 50789.5, 60 sec: 46421.2, 300 sec: 46041.8). Total num frames: 3500949504. Throughput: 0: 45926.0. Samples: 19443460. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-13 09:30:25,502][73265] Avg episode reward: [(0, '0.383')] [2024-06-13 09:30:25,520][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000213681_3500949504.pth... [2024-06-13 09:30:25,567][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000213005_3489873920.pth [2024-06-13 09:30:29,051][73497] Updated weights for policy 0, policy_version 213689 (0.0033) [2024-06-13 09:30:30,501][73265] Fps is (10 sec: 47513.4, 60 sec: 46148.3, 300 sec: 45986.7). Total num frames: 3501146112. Throughput: 0: 45864.8. Samples: 19720980. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-13 09:30:30,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 09:30:30,932][73477] Signal inference workers to stop experience collection... (250 times) [2024-06-13 09:30:30,981][73497] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-06-13 09:30:30,990][73477] Signal inference workers to resume experience collection... 
(250 times) [2024-06-13 09:30:30,997][73497] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-06-13 09:30:31,857][73497] Updated weights for policy 0, policy_version 213699 (0.0039) [2024-06-13 09:30:35,501][73265] Fps is (10 sec: 42599.1, 60 sec: 45604.0, 300 sec: 45930.7). Total num frames: 3501375488. Throughput: 0: 46018.7. Samples: 19851140. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-06-13 09:30:35,502][73265] Avg episode reward: [(0, '0.399')] [2024-06-13 09:30:36,327][73497] Updated weights for policy 0, policy_version 213709 (0.0032) [2024-06-13 09:30:38,997][73497] Updated weights for policy 0, policy_version 213719 (0.0030) [2024-06-13 09:30:40,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.1, 300 sec: 45930.7). Total num frames: 3501604864. Throughput: 0: 45463.1. Samples: 20114700. Policy #0 lag: (min: 0.0, avg: 13.1, max: 26.0) [2024-06-13 09:30:40,502][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 09:30:43,425][73497] Updated weights for policy 0, policy_version 213729 (0.0041) [2024-06-13 09:30:45,501][73265] Fps is (10 sec: 45875.4, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 3501834240. Throughput: 0: 45355.9. Samples: 20389980. Policy #0 lag: (min: 0.0, avg: 13.1, max: 26.0) [2024-06-13 09:30:45,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 09:30:46,488][73497] Updated weights for policy 0, policy_version 213739 (0.0040) [2024-06-13 09:30:50,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45602.2, 300 sec: 45930.8). Total num frames: 3502047232. Throughput: 0: 45783.1. Samples: 20527000. Policy #0 lag: (min: 0.0, avg: 13.1, max: 26.0) [2024-06-13 09:30:50,502][73265] Avg episode reward: [(0, '0.403')] [2024-06-13 09:30:50,677][73497] Updated weights for policy 0, policy_version 213749 (0.0037) [2024-06-13 09:30:53,791][73497] Updated weights for policy 0, policy_version 213759 (0.0032) [2024-06-13 09:30:55,502][73265] Fps is (10 sec: 44236.1, 60 sec: 45055.9, 300 sec: 45875.6). 
Total num frames: 3502276608. Throughput: 0: 45520.0. Samples: 20800880. Policy #0 lag: (min: 0.0, avg: 13.1, max: 26.0) [2024-06-13 09:30:55,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 09:30:57,619][73497] Updated weights for policy 0, policy_version 213769 (0.0029) [2024-06-13 09:31:00,504][73265] Fps is (10 sec: 47501.5, 60 sec: 45873.8, 300 sec: 45985.9). Total num frames: 3502522368. Throughput: 0: 45438.8. Samples: 21072560. Policy #0 lag: (min: 0.0, avg: 13.1, max: 26.0) [2024-06-13 09:31:00,504][73265] Avg episode reward: [(0, '0.407')] [2024-06-13 09:31:00,877][73497] Updated weights for policy 0, policy_version 213779 (0.0041) [2024-06-13 09:31:05,106][73497] Updated weights for policy 0, policy_version 213789 (0.0026) [2024-06-13 09:31:05,501][73265] Fps is (10 sec: 44237.3, 60 sec: 45602.1, 300 sec: 45875.2). Total num frames: 3502718976. Throughput: 0: 45611.0. Samples: 21214720. Policy #0 lag: (min: 0.0, avg: 13.1, max: 26.0) [2024-06-13 09:31:05,502][73265] Avg episode reward: [(0, '0.387')] [2024-06-13 09:31:07,865][73497] Updated weights for policy 0, policy_version 213799 (0.0036) [2024-06-13 09:31:10,501][73265] Fps is (10 sec: 40970.4, 60 sec: 44510.0, 300 sec: 45764.2). Total num frames: 3502931968. Throughput: 0: 45204.7. Samples: 21477660. Policy #0 lag: (min: 0.0, avg: 13.1, max: 26.0) [2024-06-13 09:31:10,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 09:31:12,323][73497] Updated weights for policy 0, policy_version 213809 (0.0029) [2024-06-13 09:31:15,488][73497] Updated weights for policy 0, policy_version 213819 (0.0029) [2024-06-13 09:31:15,502][73265] Fps is (10 sec: 49151.5, 60 sec: 46148.2, 300 sec: 45875.2). Total num frames: 3503210496. Throughput: 0: 45084.3. Samples: 21749780. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 26.0) [2024-06-13 09:31:15,502][73265] Avg episode reward: [(0, '0.396')] [2024-06-13 09:31:19,107][73497] Updated weights for policy 0, policy_version 213829 (0.0037) [2024-06-13 09:31:20,502][73265] Fps is (10 sec: 49151.2, 60 sec: 45875.1, 300 sec: 45930.7). Total num frames: 3503423488. Throughput: 0: 45596.3. Samples: 21902980. Policy #0 lag: (min: 0.0, avg: 13.1, max: 26.0) [2024-06-13 09:31:20,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 09:31:20,571][73477] Saving new best policy, reward=0.471! [2024-06-13 09:31:22,655][73497] Updated weights for policy 0, policy_version 213839 (0.0036) [2024-06-13 09:31:25,501][73265] Fps is (10 sec: 45875.9, 60 sec: 45329.2, 300 sec: 45930.8). Total num frames: 3503669248. Throughput: 0: 45708.5. Samples: 22171580. Policy #0 lag: (min: 0.0, avg: 13.1, max: 26.0) [2024-06-13 09:31:25,502][73265] Avg episode reward: [(0, '0.398')] [2024-06-13 09:31:26,310][73497] Updated weights for policy 0, policy_version 213849 (0.0030) [2024-06-13 09:31:29,738][73497] Updated weights for policy 0, policy_version 213859 (0.0029) [2024-06-13 09:31:29,868][73477] Signal inference workers to stop experience collection... (300 times) [2024-06-13 09:31:29,893][73497] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-06-13 09:31:29,924][73477] Signal inference workers to resume experience collection... (300 times) [2024-06-13 09:31:29,925][73497] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-06-13 09:31:30,501][73265] Fps is (10 sec: 49152.4, 60 sec: 46148.2, 300 sec: 45986.7). Total num frames: 3503915008. Throughput: 0: 45507.0. Samples: 22437800. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 26.0) [2024-06-13 09:31:30,502][73265] Avg episode reward: [(0, '0.363')] [2024-06-13 09:31:34,154][73497] Updated weights for policy 0, policy_version 213869 (0.0029) [2024-06-13 09:31:35,501][73265] Fps is (10 sec: 42598.0, 60 sec: 45329.0, 300 sec: 45875.2). Total num frames: 3504095232. Throughput: 0: 45624.3. Samples: 22580100. Policy #0 lag: (min: 0.0, avg: 13.1, max: 26.0) [2024-06-13 09:31:35,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 09:31:36,892][73497] Updated weights for policy 0, policy_version 213879 (0.0032) [2024-06-13 09:31:40,502][73265] Fps is (10 sec: 40959.7, 60 sec: 45329.0, 300 sec: 45875.2). Total num frames: 3504324608. Throughput: 0: 45484.9. Samples: 22847700. Policy #0 lag: (min: 0.0, avg: 13.1, max: 26.0) [2024-06-13 09:31:40,502][73265] Avg episode reward: [(0, '0.382')] [2024-06-13 09:31:41,129][73497] Updated weights for policy 0, policy_version 213889 (0.0038) [2024-06-13 09:31:44,543][73497] Updated weights for policy 0, policy_version 213899 (0.0031) [2024-06-13 09:31:45,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45329.0, 300 sec: 45875.2). Total num frames: 3504553984. Throughput: 0: 45514.1. Samples: 23120580. Policy #0 lag: (min: 0.0, avg: 13.1, max: 26.0) [2024-06-13 09:31:45,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 09:31:47,982][73497] Updated weights for policy 0, policy_version 213909 (0.0033) [2024-06-13 09:31:50,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45602.1, 300 sec: 45819.7). Total num frames: 3504783360. Throughput: 0: 45416.5. Samples: 23258460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 09:31:50,502][73265] Avg episode reward: [(0, '0.390')] [2024-06-13 09:31:51,676][73497] Updated weights for policy 0, policy_version 213919 (0.0033) [2024-06-13 09:31:55,131][73497] Updated weights for policy 0, policy_version 213929 (0.0033) [2024-06-13 09:31:55,502][73265] Fps is (10 sec: 45874.7, 60 sec: 45602.1, 300 sec: 45875.2). 
Total num frames: 3505012736. Throughput: 0: 45595.9. Samples: 23529480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 09:31:55,502][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 09:31:58,845][73497] Updated weights for policy 0, policy_version 213939 (0.0033) [2024-06-13 09:32:00,502][73265] Fps is (10 sec: 45874.2, 60 sec: 45330.8, 300 sec: 45819.7). Total num frames: 3505242112. Throughput: 0: 45568.4. Samples: 23800360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 09:32:00,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 09:32:02,553][73497] Updated weights for policy 0, policy_version 213949 (0.0040) [2024-06-13 09:32:05,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 3505471488. Throughput: 0: 45279.2. Samples: 23940540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 09:32:05,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 09:32:05,850][73497] Updated weights for policy 0, policy_version 213959 (0.0042) [2024-06-13 09:32:09,723][73497] Updated weights for policy 0, policy_version 213969 (0.0040) [2024-06-13 09:32:10,501][73265] Fps is (10 sec: 45876.2, 60 sec: 46148.3, 300 sec: 45930.8). Total num frames: 3505700864. Throughput: 0: 45376.5. Samples: 24213520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 09:32:10,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 09:32:13,063][73497] Updated weights for policy 0, policy_version 213979 (0.0027) [2024-06-13 09:32:15,502][73265] Fps is (10 sec: 45874.9, 60 sec: 45329.1, 300 sec: 45819.7). Total num frames: 3505930240. Throughput: 0: 45617.2. Samples: 24490580. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 09:32:15,502][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 09:32:16,571][73497] Updated weights for policy 0, policy_version 213989 (0.0035) [2024-06-13 09:32:20,475][73497] Updated weights for policy 0, policy_version 213999 (0.0028) [2024-06-13 09:32:20,502][73265] Fps is (10 sec: 45874.4, 60 sec: 45602.1, 300 sec: 45820.0). Total num frames: 3506159616. Throughput: 0: 45503.0. Samples: 24627740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 09:32:20,502][73265] Avg episode reward: [(0, '0.396')] [2024-06-13 09:32:24,057][73497] Updated weights for policy 0, policy_version 214009 (0.0042) [2024-06-13 09:32:25,502][73265] Fps is (10 sec: 44236.8, 60 sec: 45055.9, 300 sec: 45819.6). Total num frames: 3506372608. Throughput: 0: 45628.4. Samples: 24900980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 09:32:25,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 09:32:25,518][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000214012_3506372608.pth... [2024-06-13 09:32:25,553][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000213343_3495411712.pth [2024-06-13 09:32:27,450][73497] Updated weights for policy 0, policy_version 214019 (0.0037) [2024-06-13 09:32:30,501][73265] Fps is (10 sec: 45875.9, 60 sec: 45056.0, 300 sec: 45764.4). Total num frames: 3506618368. Throughput: 0: 45571.6. Samples: 25171300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 09:32:30,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 09:32:31,382][73497] Updated weights for policy 0, policy_version 214029 (0.0045) [2024-06-13 09:32:34,825][73497] Updated weights for policy 0, policy_version 214039 (0.0036) [2024-06-13 09:32:35,502][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.1, 300 sec: 45653.4). Total num frames: 3506831360. Throughput: 0: 45560.7. Samples: 25308700. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 09:32:35,502][73265] Avg episode reward: [(0, '0.413')] [2024-06-13 09:32:38,320][73497] Updated weights for policy 0, policy_version 214049 (0.0026) [2024-06-13 09:32:40,501][73265] Fps is (10 sec: 45874.8, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 3507077120. Throughput: 0: 45628.0. Samples: 25582740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 09:32:40,502][73265] Avg episode reward: [(0, '0.365')] [2024-06-13 09:32:42,149][73497] Updated weights for policy 0, policy_version 214059 (0.0042) [2024-06-13 09:32:45,227][73497] Updated weights for policy 0, policy_version 214069 (0.0030) [2024-06-13 09:32:45,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45875.1, 300 sec: 45819.7). Total num frames: 3507306496. Throughput: 0: 45872.1. Samples: 25864600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 09:32:45,502][73265] Avg episode reward: [(0, '0.380')] [2024-06-13 09:32:49,367][73497] Updated weights for policy 0, policy_version 214079 (0.0026) [2024-06-13 09:32:50,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 3507535872. Throughput: 0: 45886.3. Samples: 26005420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 09:32:50,502][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 09:32:52,569][73497] Updated weights for policy 0, policy_version 214089 (0.0032) [2024-06-13 09:32:55,502][73265] Fps is (10 sec: 44236.5, 60 sec: 45602.1, 300 sec: 45764.1). Total num frames: 3507748864. Throughput: 0: 45842.0. Samples: 26276420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 09:32:55,502][73265] Avg episode reward: [(0, '0.399')] [2024-06-13 09:32:56,010][73477] Signal inference workers to stop experience collection... (350 times) [2024-06-13 09:32:56,016][73477] Signal inference workers to resume experience collection... 
(350 times) [2024-06-13 09:32:56,059][73497] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-06-13 09:32:56,059][73497] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-06-13 09:32:56,400][73497] Updated weights for policy 0, policy_version 214099 (0.0036) [2024-06-13 09:32:59,715][73497] Updated weights for policy 0, policy_version 214109 (0.0053) [2024-06-13 09:33:00,507][73265] Fps is (10 sec: 44213.8, 60 sec: 45598.3, 300 sec: 45818.9). Total num frames: 3507978240. Throughput: 0: 45537.5. Samples: 26540000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 09:33:00,507][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 09:33:03,598][73497] Updated weights for policy 0, policy_version 214119 (0.0032) [2024-06-13 09:33:05,501][73265] Fps is (10 sec: 47514.5, 60 sec: 45875.3, 300 sec: 45708.6). Total num frames: 3508224000. Throughput: 0: 45739.3. Samples: 26686000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 09:33:05,502][73265] Avg episode reward: [(0, '0.377')] [2024-06-13 09:33:07,012][73497] Updated weights for policy 0, policy_version 214129 (0.0042) [2024-06-13 09:33:10,501][73265] Fps is (10 sec: 45899.4, 60 sec: 45602.1, 300 sec: 45653.1). Total num frames: 3508436992. Throughput: 0: 45701.1. Samples: 26957520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 09:33:10,502][73265] Avg episode reward: [(0, '0.395')] [2024-06-13 09:33:10,781][73497] Updated weights for policy 0, policy_version 214139 (0.0031) [2024-06-13 09:33:14,283][73497] Updated weights for policy 0, policy_version 214149 (0.0040) [2024-06-13 09:33:15,501][73265] Fps is (10 sec: 44236.2, 60 sec: 45602.2, 300 sec: 45819.6). Total num frames: 3508666368. Throughput: 0: 45837.2. Samples: 27233980. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 09:33:15,502][73265] Avg episode reward: [(0, '0.384')] [2024-06-13 09:33:17,764][73497] Updated weights for policy 0, policy_version 214159 (0.0032) [2024-06-13 09:33:20,502][73265] Fps is (10 sec: 47512.7, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 3508912128. Throughput: 0: 45825.4. Samples: 27370840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 09:33:20,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 09:33:21,233][73497] Updated weights for policy 0, policy_version 214169 (0.0038) [2024-06-13 09:33:25,227][73497] Updated weights for policy 0, policy_version 214179 (0.0032) [2024-06-13 09:33:25,502][73265] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 3509125120. Throughput: 0: 45953.3. Samples: 27650640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 09:33:25,502][73265] Avg episode reward: [(0, '0.409')] [2024-06-13 09:33:28,245][73497] Updated weights for policy 0, policy_version 214189 (0.0035) [2024-06-13 09:33:30,509][73265] Fps is (10 sec: 42565.1, 60 sec: 45323.1, 300 sec: 45707.4). Total num frames: 3509338112. Throughput: 0: 45553.4. Samples: 27914860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 09:33:30,510][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 09:33:32,526][73497] Updated weights for policy 0, policy_version 214199 (0.0027) [2024-06-13 09:33:35,426][73497] Updated weights for policy 0, policy_version 214209 (0.0027) [2024-06-13 09:33:35,504][73265] Fps is (10 sec: 47502.5, 60 sec: 46146.5, 300 sec: 45819.3). Total num frames: 3509600256. Throughput: 0: 45537.1. Samples: 28054700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 09:33:35,504][73265] Avg episode reward: [(0, '0.413')] [2024-06-13 09:33:39,705][73497] Updated weights for policy 0, policy_version 214219 (0.0027) [2024-06-13 09:33:40,501][73265] Fps is (10 sec: 47551.5, 60 sec: 45602.2, 300 sec: 45708.6). 
Total num frames: 3509813248. Throughput: 0: 45754.0. Samples: 28335340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 09:33:40,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 09:33:42,838][73497] Updated weights for policy 0, policy_version 214229 (0.0025) [2024-06-13 09:33:45,501][73265] Fps is (10 sec: 42608.9, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3510026240. Throughput: 0: 45877.3. Samples: 28604240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 09:33:45,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 09:33:46,698][73497] Updated weights for policy 0, policy_version 214239 (0.0036) [2024-06-13 09:33:50,154][73497] Updated weights for policy 0, policy_version 214249 (0.0029) [2024-06-13 09:33:50,501][73265] Fps is (10 sec: 45874.5, 60 sec: 45602.0, 300 sec: 45764.1). Total num frames: 3510272000. Throughput: 0: 45633.2. Samples: 28739500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 09:33:50,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 09:33:53,775][73497] Updated weights for policy 0, policy_version 214259 (0.0036) [2024-06-13 09:33:55,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45875.3, 300 sec: 45708.6). Total num frames: 3510501376. Throughput: 0: 45540.3. Samples: 29006840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 09:33:55,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 09:33:57,169][73497] Updated weights for policy 0, policy_version 214269 (0.0040) [2024-06-13 09:34:00,502][73265] Fps is (10 sec: 42598.3, 60 sec: 45332.9, 300 sec: 45486.4). Total num frames: 3510697984. Throughput: 0: 45517.7. Samples: 29282280. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 09:34:00,502][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 09:34:01,202][73497] Updated weights for policy 0, policy_version 214279 (0.0037) [2024-06-13 09:34:04,184][73497] Updated weights for policy 0, policy_version 214289 (0.0041) [2024-06-13 09:34:05,501][73265] Fps is (10 sec: 44237.2, 60 sec: 45329.1, 300 sec: 45764.5). Total num frames: 3510943744. Throughput: 0: 45402.4. Samples: 29413940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 09:34:05,502][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 09:34:08,175][73497] Updated weights for policy 0, policy_version 214299 (0.0041) [2024-06-13 09:34:10,501][73265] Fps is (10 sec: 50790.8, 60 sec: 46148.2, 300 sec: 45819.7). Total num frames: 3511205888. Throughput: 0: 45477.4. Samples: 29697120. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 09:34:10,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 09:34:11,503][73497] Updated weights for policy 0, policy_version 214309 (0.0030) [2024-06-13 09:34:15,058][73477] Signal inference workers to stop experience collection... (400 times) [2024-06-13 09:34:15,063][73477] Signal inference workers to resume experience collection... (400 times) [2024-06-13 09:34:15,102][73497] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-06-13 09:34:15,102][73497] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-06-13 09:34:15,190][73497] Updated weights for policy 0, policy_version 214319 (0.0045) [2024-06-13 09:34:15,502][73265] Fps is (10 sec: 47513.0, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 3511418880. Throughput: 0: 45719.1. Samples: 29971860. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 09:34:15,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 09:34:18,861][73497] Updated weights for policy 0, policy_version 214329 (0.0040) [2024-06-13 09:34:20,501][73265] Fps is (10 sec: 40960.6, 60 sec: 45056.2, 300 sec: 45597.5). Total num frames: 3511615488. Throughput: 0: 45451.9. Samples: 30099920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 09:34:20,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 09:34:22,480][73497] Updated weights for policy 0, policy_version 214339 (0.0028) [2024-06-13 09:34:25,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 3511877632. Throughput: 0: 45303.9. Samples: 30374020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 09:34:25,502][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 09:34:25,509][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000214348_3511877632.pth... [2024-06-13 09:34:25,555][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000213681_3500949504.pth [2024-06-13 09:34:25,891][73497] Updated weights for policy 0, policy_version 214349 (0.0037) [2024-06-13 09:34:30,080][73497] Updated weights for policy 0, policy_version 214359 (0.0037) [2024-06-13 09:34:30,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45608.2, 300 sec: 45542.4). Total num frames: 3512074240. Throughput: 0: 45424.9. Samples: 30648360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 09:34:30,502][73265] Avg episode reward: [(0, '0.381')] [2024-06-13 09:34:32,855][73497] Updated weights for policy 0, policy_version 214369 (0.0036) [2024-06-13 09:34:35,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45057.8, 300 sec: 45542.0). Total num frames: 3512303616. Throughput: 0: 45208.0. Samples: 30773860. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 09:34:35,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 09:34:37,017][73497] Updated weights for policy 0, policy_version 214379 (0.0041) [2024-06-13 09:34:40,319][73497] Updated weights for policy 0, policy_version 214389 (0.0034) [2024-06-13 09:34:40,502][73265] Fps is (10 sec: 47512.8, 60 sec: 45602.0, 300 sec: 45764.1). Total num frames: 3512549376. Throughput: 0: 45562.1. Samples: 31057140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 09:34:40,502][73265] Avg episode reward: [(0, '0.347')] [2024-06-13 09:34:44,329][73497] Updated weights for policy 0, policy_version 214399 (0.0035) [2024-06-13 09:34:45,502][73265] Fps is (10 sec: 44236.3, 60 sec: 45328.9, 300 sec: 45542.0). Total num frames: 3512745984. Throughput: 0: 45537.7. Samples: 31331480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 09:34:45,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 09:34:47,509][73497] Updated weights for policy 0, policy_version 214409 (0.0026) [2024-06-13 09:34:50,501][73265] Fps is (10 sec: 42598.9, 60 sec: 45056.1, 300 sec: 45430.9). Total num frames: 3512975360. Throughput: 0: 45565.3. Samples: 31464380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 09:34:50,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 09:34:51,450][73497] Updated weights for policy 0, policy_version 214419 (0.0036) [2024-06-13 09:34:54,422][73497] Updated weights for policy 0, policy_version 214429 (0.0033) [2024-06-13 09:34:55,501][73265] Fps is (10 sec: 49152.7, 60 sec: 45602.1, 300 sec: 45653.2). Total num frames: 3513237504. Throughput: 0: 45273.8. Samples: 31734440. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 09:34:55,502][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 09:34:58,743][73497] Updated weights for policy 0, policy_version 214439 (0.0032) [2024-06-13 09:35:00,501][73265] Fps is (10 sec: 49152.7, 60 sec: 46148.5, 300 sec: 45708.6). 
Total num frames: 3513466880. Throughput: 0: 45541.5. Samples: 32021220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 09:35:00,501][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 09:35:01,423][73497] Updated weights for policy 0, policy_version 214449 (0.0029) [2024-06-13 09:35:05,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3513679872. Throughput: 0: 45628.4. Samples: 32153200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 09:35:05,502][73265] Avg episode reward: [(0, '0.397')] [2024-06-13 09:35:05,617][73497] Updated weights for policy 0, policy_version 214459 (0.0032) [2024-06-13 09:35:08,785][73497] Updated weights for policy 0, policy_version 214469 (0.0037) [2024-06-13 09:35:10,501][73265] Fps is (10 sec: 44236.1, 60 sec: 45056.0, 300 sec: 45653.0). Total num frames: 3513909248. Throughput: 0: 45532.5. Samples: 32422980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-13 09:35:10,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 09:35:12,981][73497] Updated weights for policy 0, policy_version 214479 (0.0038) [2024-06-13 09:35:15,501][73265] Fps is (10 sec: 47513.2, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 3514155008. Throughput: 0: 45753.2. Samples: 32707260. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-13 09:35:15,502][73265] Avg episode reward: [(0, '0.395')] [2024-06-13 09:35:15,993][73497] Updated weights for policy 0, policy_version 214489 (0.0026) [2024-06-13 09:35:20,129][73497] Updated weights for policy 0, policy_version 214499 (0.0033) [2024-06-13 09:35:20,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 3514351616. Throughput: 0: 46033.4. Samples: 32845360. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-13 09:35:20,502][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 09:35:22,869][73497] Updated weights for policy 0, policy_version 214509 (0.0033) [2024-06-13 09:35:25,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3514580992. Throughput: 0: 45561.0. Samples: 33107380. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-13 09:35:25,502][73265] Avg episode reward: [(0, '0.346')] [2024-06-13 09:35:27,039][73497] Updated weights for policy 0, policy_version 214519 (0.0027) [2024-06-13 09:35:29,173][73477] Signal inference workers to stop experience collection... (450 times) [2024-06-13 09:35:29,223][73497] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-06-13 09:35:29,280][73477] Signal inference workers to resume experience collection... (450 times) [2024-06-13 09:35:29,280][73497] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-06-13 09:35:29,912][73497] Updated weights for policy 0, policy_version 214529 (0.0032) [2024-06-13 09:35:30,501][73265] Fps is (10 sec: 50790.4, 60 sec: 46421.3, 300 sec: 45708.6). Total num frames: 3514859520. Throughput: 0: 45718.8. Samples: 33388820. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-13 09:35:30,502][73265] Avg episode reward: [(0, '0.391')] [2024-06-13 09:35:34,124][73497] Updated weights for policy 0, policy_version 214539 (0.0031) [2024-06-13 09:35:35,502][73265] Fps is (10 sec: 45874.5, 60 sec: 45602.0, 300 sec: 45541.9). Total num frames: 3515039744. Throughput: 0: 45995.3. Samples: 33534180. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-13 09:35:35,502][73265] Avg episode reward: [(0, '0.356')] [2024-06-13 09:35:37,325][73497] Updated weights for policy 0, policy_version 214549 (0.0026) [2024-06-13 09:35:40,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45875.3, 300 sec: 45653.0). Total num frames: 3515301888. Throughput: 0: 46166.6. 
Samples: 33811940. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-13 09:35:40,502][73265] Avg episode reward: [(0, '0.370')] [2024-06-13 09:35:41,603][73497] Updated weights for policy 0, policy_version 214559 (0.0039) [2024-06-13 09:35:44,478][73497] Updated weights for policy 0, policy_version 214569 (0.0027) [2024-06-13 09:35:45,501][73265] Fps is (10 sec: 50791.2, 60 sec: 46694.5, 300 sec: 45764.1). Total num frames: 3515547648. Throughput: 0: 45658.0. Samples: 34075840. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-13 09:35:45,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 09:35:49,111][73497] Updated weights for policy 0, policy_version 214579 (0.0040) [2024-06-13 09:35:50,501][73265] Fps is (10 sec: 44237.2, 60 sec: 46148.3, 300 sec: 45653.1). Total num frames: 3515744256. Throughput: 0: 46113.8. Samples: 34228320. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-13 09:35:50,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 09:35:51,562][73497] Updated weights for policy 0, policy_version 214589 (0.0035) [2024-06-13 09:35:55,501][73265] Fps is (10 sec: 40960.2, 60 sec: 45329.1, 300 sec: 45542.4). Total num frames: 3515957248. Throughput: 0: 46050.7. Samples: 34495260. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-13 09:35:55,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 09:35:56,150][73497] Updated weights for policy 0, policy_version 214599 (0.0025) [2024-06-13 09:35:58,782][73497] Updated weights for policy 0, policy_version 214609 (0.0036) [2024-06-13 09:36:00,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45602.0, 300 sec: 45708.6). Total num frames: 3516203008. Throughput: 0: 45693.4. Samples: 34763460. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-13 09:36:00,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 09:36:03,139][73497] Updated weights for policy 0, policy_version 214619 (0.0030) [2024-06-13 09:36:05,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 3516432384. Throughput: 0: 45850.7. Samples: 34908640. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-13 09:36:05,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 09:36:06,148][73497] Updated weights for policy 0, policy_version 214629 (0.0040) [2024-06-13 09:36:10,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3516628992. Throughput: 0: 46083.1. Samples: 35181120. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-13 09:36:10,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 09:36:10,526][73497] Updated weights for policy 0, policy_version 214639 (0.0024) [2024-06-13 09:36:13,293][73497] Updated weights for policy 0, policy_version 214649 (0.0038) [2024-06-13 09:36:15,501][73265] Fps is (10 sec: 47513.1, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 3516907520. Throughput: 0: 45889.7. Samples: 35453860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-13 09:36:15,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 09:36:17,851][73497] Updated weights for policy 0, policy_version 214659 (0.0028) [2024-06-13 09:36:20,226][73497] Updated weights for policy 0, policy_version 214669 (0.0037) [2024-06-13 09:36:20,501][73265] Fps is (10 sec: 52429.0, 60 sec: 46694.4, 300 sec: 45708.6). Total num frames: 3517153280. Throughput: 0: 45788.7. Samples: 35594660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 09:36:20,502][73265] Avg episode reward: [(0, '0.427')] [2024-06-13 09:36:24,673][73497] Updated weights for policy 0, policy_version 214679 (0.0038) [2024-06-13 09:36:25,504][73265] Fps is (10 sec: 44225.9, 60 sec: 46146.4, 300 sec: 45541.6). 
Total num frames: 3517349888. Throughput: 0: 45804.6. Samples: 35873260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 09:36:25,505][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 09:36:25,521][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000214682_3517349888.pth... [2024-06-13 09:36:25,585][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000214012_3506372608.pth [2024-06-13 09:36:27,355][73497] Updated weights for policy 0, policy_version 214689 (0.0031) [2024-06-13 09:36:30,503][73265] Fps is (10 sec: 42592.9, 60 sec: 45328.1, 300 sec: 45708.4). Total num frames: 3517579264. Throughput: 0: 45898.3. Samples: 36141320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 09:36:30,503][73265] Avg episode reward: [(0, '0.390')] [2024-06-13 09:36:31,727][73497] Updated weights for policy 0, policy_version 214699 (0.0035) [2024-06-13 09:36:34,777][73497] Updated weights for policy 0, policy_version 214709 (0.0032) [2024-06-13 09:36:35,501][73265] Fps is (10 sec: 49164.2, 60 sec: 46694.5, 300 sec: 45819.7). Total num frames: 3517841408. Throughput: 0: 45672.3. Samples: 36283580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 09:36:35,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 09:36:39,072][73497] Updated weights for policy 0, policy_version 214719 (0.0035) [2024-06-13 09:36:40,501][73265] Fps is (10 sec: 42604.0, 60 sec: 45056.1, 300 sec: 45597.5). Total num frames: 3518005248. Throughput: 0: 45790.7. Samples: 36555840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 09:36:40,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 09:36:40,576][73477] Signal inference workers to stop experience collection... (500 times) [2024-06-13 09:36:40,621][73497] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-06-13 09:36:40,627][73477] Signal inference workers to resume experience collection... 
(500 times) [2024-06-13 09:36:40,638][73497] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-06-13 09:36:41,861][73497] Updated weights for policy 0, policy_version 214729 (0.0034) [2024-06-13 09:36:45,504][73265] Fps is (10 sec: 37673.9, 60 sec: 44508.0, 300 sec: 45541.6). Total num frames: 3518218240. Throughput: 0: 45747.2. Samples: 36822200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 09:36:45,505][73265] Avg episode reward: [(0, '0.412')] [2024-06-13 09:36:46,443][73497] Updated weights for policy 0, policy_version 214739 (0.0038) [2024-06-13 09:36:48,835][73497] Updated weights for policy 0, policy_version 214749 (0.0034) [2024-06-13 09:36:50,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45602.1, 300 sec: 45653.1). Total num frames: 3518480384. Throughput: 0: 45558.1. Samples: 36958760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 09:36:50,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 09:36:53,561][73497] Updated weights for policy 0, policy_version 214759 (0.0039) [2024-06-13 09:36:55,502][73265] Fps is (10 sec: 49163.8, 60 sec: 45875.1, 300 sec: 45653.0). Total num frames: 3518709760. Throughput: 0: 45626.1. Samples: 37234300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 09:36:55,508][73265] Avg episode reward: [(0, '0.402')] [2024-06-13 09:36:56,041][73497] Updated weights for policy 0, policy_version 214769 (0.0043) [2024-06-13 09:37:00,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3518922752. Throughput: 0: 45788.5. Samples: 37514340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 09:37:00,502][73265] Avg episode reward: [(0, '0.394')] [2024-06-13 09:37:00,638][73497] Updated weights for policy 0, policy_version 214779 (0.0038) [2024-06-13 09:37:03,658][73497] Updated weights for policy 0, policy_version 214789 (0.0037) [2024-06-13 09:37:05,501][73265] Fps is (10 sec: 49152.4, 60 sec: 46148.2, 300 sec: 45764.1). 
Total num frames: 3519201280. Throughput: 0: 45527.9. Samples: 37643420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 09:37:05,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 09:37:07,844][73497] Updated weights for policy 0, policy_version 214799 (0.0038) [2024-06-13 09:37:10,501][73265] Fps is (10 sec: 47513.2, 60 sec: 46148.2, 300 sec: 45653.0). Total num frames: 3519397888. Throughput: 0: 45551.4. Samples: 37922960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 09:37:10,502][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 09:37:10,767][73497] Updated weights for policy 0, policy_version 214809 (0.0034) [2024-06-13 09:37:15,041][73497] Updated weights for policy 0, policy_version 214819 (0.0037) [2024-06-13 09:37:15,501][73265] Fps is (10 sec: 39321.7, 60 sec: 44783.0, 300 sec: 45542.0). Total num frames: 3519594496. Throughput: 0: 45537.3. Samples: 38190440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 09:37:15,502][73265] Avg episode reward: [(0, '0.393')] [2024-06-13 09:37:17,803][73497] Updated weights for policy 0, policy_version 214829 (0.0037) [2024-06-13 09:37:20,504][73265] Fps is (10 sec: 45864.3, 60 sec: 45054.1, 300 sec: 45708.2). Total num frames: 3519856640. Throughput: 0: 45198.9. Samples: 38317640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 09:37:20,513][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 09:37:22,365][73497] Updated weights for policy 0, policy_version 214839 (0.0028) [2024-06-13 09:37:25,375][73497] Updated weights for policy 0, policy_version 214849 (0.0030) [2024-06-13 09:37:25,501][73265] Fps is (10 sec: 49152.0, 60 sec: 45604.0, 300 sec: 45653.0). Total num frames: 3520086016. Throughput: 0: 45215.5. Samples: 38590540. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 09:37:25,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 09:37:29,719][73497] Updated weights for policy 0, policy_version 214859 (0.0031) [2024-06-13 09:37:30,502][73265] Fps is (10 sec: 44247.2, 60 sec: 45329.9, 300 sec: 45653.0). Total num frames: 3520299008. Throughput: 0: 45542.0. Samples: 38871480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 09:37:30,502][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 09:37:32,709][73497] Updated weights for policy 0, policy_version 214869 (0.0031) [2024-06-13 09:37:35,501][73265] Fps is (10 sec: 42599.0, 60 sec: 44510.0, 300 sec: 45542.0). Total num frames: 3520512000. Throughput: 0: 45416.1. Samples: 39002480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 09:37:35,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 09:37:36,705][73497] Updated weights for policy 0, policy_version 214879 (0.0036) [2024-06-13 09:37:37,799][73477] Signal inference workers to stop experience collection... (550 times) [2024-06-13 09:37:37,802][73477] Signal inference workers to resume experience collection... (550 times) [2024-06-13 09:37:37,812][73497] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-06-13 09:37:37,826][73497] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-06-13 09:37:39,821][73497] Updated weights for policy 0, policy_version 214889 (0.0030) [2024-06-13 09:37:40,502][73265] Fps is (10 sec: 47513.3, 60 sec: 46148.1, 300 sec: 45653.0). Total num frames: 3520774144. Throughput: 0: 45425.7. Samples: 39278460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 09:37:40,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 09:37:43,798][73497] Updated weights for policy 0, policy_version 214899 (0.0026) [2024-06-13 09:37:45,502][73265] Fps is (10 sec: 47512.6, 60 sec: 46150.1, 300 sec: 45597.5). Total num frames: 3520987136. Throughput: 0: 45355.0. 
Samples: 39555320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 09:37:45,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 09:37:46,777][73497] Updated weights for policy 0, policy_version 214909 (0.0040) [2024-06-13 09:37:50,504][73265] Fps is (10 sec: 42588.3, 60 sec: 45327.2, 300 sec: 45597.1). Total num frames: 3521200128. Throughput: 0: 45357.0. Samples: 39684600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 09:37:50,505][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 09:37:50,954][73497] Updated weights for policy 0, policy_version 214919 (0.0035) [2024-06-13 09:37:53,800][73497] Updated weights for policy 0, policy_version 214929 (0.0031) [2024-06-13 09:37:55,502][73265] Fps is (10 sec: 49151.9, 60 sec: 46148.3, 300 sec: 45764.9). Total num frames: 3521478656. Throughput: 0: 45342.2. Samples: 39963360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 09:37:55,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 09:37:58,108][73497] Updated weights for policy 0, policy_version 214939 (0.0039) [2024-06-13 09:38:00,501][73265] Fps is (10 sec: 47525.5, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 3521675264. Throughput: 0: 45618.2. Samples: 40243260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 09:38:00,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 09:38:01,283][73497] Updated weights for policy 0, policy_version 214949 (0.0041) [2024-06-13 09:38:05,091][73497] Updated weights for policy 0, policy_version 214959 (0.0028) [2024-06-13 09:38:05,501][73265] Fps is (10 sec: 42599.2, 60 sec: 45056.1, 300 sec: 45653.0). Total num frames: 3521904640. Throughput: 0: 45671.0. Samples: 40372720. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 09:38:05,502][73265] Avg episode reward: [(0, '0.365')] [2024-06-13 09:38:08,208][73497] Updated weights for policy 0, policy_version 214969 (0.0030) [2024-06-13 09:38:10,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 3522150400. Throughput: 0: 45878.6. Samples: 40655080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 09:38:10,502][73265] Avg episode reward: [(0, '0.412')] [2024-06-13 09:38:12,087][73497] Updated weights for policy 0, policy_version 214979 (0.0025) [2024-06-13 09:38:15,387][73497] Updated weights for policy 0, policy_version 214989 (0.0031) [2024-06-13 09:38:15,501][73265] Fps is (10 sec: 47513.2, 60 sec: 46421.3, 300 sec: 45653.1). Total num frames: 3522379776. Throughput: 0: 45788.9. Samples: 40931980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 09:38:15,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 09:38:19,395][73497] Updated weights for policy 0, policy_version 214999 (0.0039) [2024-06-13 09:38:20,504][73265] Fps is (10 sec: 44226.2, 60 sec: 45602.1, 300 sec: 45652.7). Total num frames: 3522592768. Throughput: 0: 45956.1. Samples: 41070620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 09:38:20,504][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 09:38:22,560][73497] Updated weights for policy 0, policy_version 215009 (0.0029) [2024-06-13 09:38:25,504][73265] Fps is (10 sec: 42588.0, 60 sec: 45327.2, 300 sec: 45653.9). Total num frames: 3522805760. Throughput: 0: 45664.3. Samples: 41333460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 09:38:25,505][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 09:38:25,514][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000215015_3522805760.pth... 
[2024-06-13 09:38:25,560][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000214348_3511877632.pth [2024-06-13 09:38:26,784][73497] Updated weights for policy 0, policy_version 215019 (0.0047) [2024-06-13 09:38:30,102][73497] Updated weights for policy 0, policy_version 215029 (0.0041) [2024-06-13 09:38:30,501][73265] Fps is (10 sec: 47525.0, 60 sec: 46148.3, 300 sec: 45653.4). Total num frames: 3523067904. Throughput: 0: 45551.2. Samples: 41605120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 09:38:30,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 09:38:34,114][73497] Updated weights for policy 0, policy_version 215039 (0.0032) [2024-06-13 09:38:35,502][73265] Fps is (10 sec: 45886.0, 60 sec: 45875.0, 300 sec: 45597.5). Total num frames: 3523264512. Throughput: 0: 45739.8. Samples: 41742780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 09:38:35,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 09:38:36,998][73477] Signal inference workers to stop experience collection... (600 times) [2024-06-13 09:38:37,003][73477] Signal inference workers to resume experience collection... (600 times) [2024-06-13 09:38:37,038][73497] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-06-13 09:38:37,038][73497] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-06-13 09:38:37,314][73497] Updated weights for policy 0, policy_version 215049 (0.0033) [2024-06-13 09:38:40,501][73265] Fps is (10 sec: 42599.0, 60 sec: 45329.3, 300 sec: 45653.1). Total num frames: 3523493888. Throughput: 0: 45571.3. Samples: 42014060. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:38:40,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 09:38:40,955][73497] Updated weights for policy 0, policy_version 215059 (0.0045) [2024-06-13 09:38:44,432][73497] Updated weights for policy 0, policy_version 215069 (0.0037) [2024-06-13 09:38:45,504][73265] Fps is (10 sec: 49140.3, 60 sec: 46146.4, 300 sec: 45708.2). Total num frames: 3523756032. Throughput: 0: 45389.1. Samples: 42285880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:38:45,505][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 09:38:48,357][73497] Updated weights for policy 0, policy_version 215079 (0.0048) [2024-06-13 09:38:50,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45604.1, 300 sec: 45542.0). Total num frames: 3523936256. Throughput: 0: 45468.9. Samples: 42418820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:38:50,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 09:38:51,577][73497] Updated weights for policy 0, policy_version 215089 (0.0026) [2024-06-13 09:38:55,501][73265] Fps is (10 sec: 39331.6, 60 sec: 44510.0, 300 sec: 45597.5). Total num frames: 3524149248. Throughput: 0: 45353.9. Samples: 42696000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:38:55,502][73265] Avg episode reward: [(0, '0.416')] [2024-06-13 09:38:55,742][73497] Updated weights for policy 0, policy_version 215099 (0.0022) [2024-06-13 09:38:59,035][73497] Updated weights for policy 0, policy_version 215109 (0.0040) [2024-06-13 09:39:00,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3524395008. Throughput: 0: 44996.5. Samples: 42956820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:39:00,502][73265] Avg episode reward: [(0, '0.385')] [2024-06-13 09:39:02,716][73497] Updated weights for policy 0, policy_version 215119 (0.0026) [2024-06-13 09:39:05,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45056.0, 300 sec: 45430.9). 
Total num frames: 3524608000. Throughput: 0: 45134.5. Samples: 43101560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:39:05,502][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 09:39:06,167][73497] Updated weights for policy 0, policy_version 215129 (0.0039) [2024-06-13 09:39:09,545][73497] Updated weights for policy 0, policy_version 215139 (0.0034) [2024-06-13 09:39:10,502][73265] Fps is (10 sec: 45874.7, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3524853760. Throughput: 0: 45445.5. Samples: 43378400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:39:10,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 09:39:13,287][73497] Updated weights for policy 0, policy_version 215149 (0.0029) [2024-06-13 09:39:15,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45056.1, 300 sec: 45653.0). Total num frames: 3525083136. Throughput: 0: 45364.6. Samples: 43646520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:39:15,502][73265] Avg episode reward: [(0, '0.398')] [2024-06-13 09:39:17,276][73497] Updated weights for policy 0, policy_version 215159 (0.0047) [2024-06-13 09:39:20,217][73497] Updated weights for policy 0, policy_version 215169 (0.0046) [2024-06-13 09:39:20,502][73265] Fps is (10 sec: 47513.5, 60 sec: 45603.9, 300 sec: 45597.5). Total num frames: 3525328896. Throughput: 0: 45520.5. Samples: 43791200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:39:20,502][73265] Avg episode reward: [(0, '0.427')] [2024-06-13 09:39:24,255][73497] Updated weights for policy 0, policy_version 215179 (0.0028) [2024-06-13 09:39:25,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45604.1, 300 sec: 45653.0). Total num frames: 3525541888. Throughput: 0: 45471.1. Samples: 44060260. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:39:25,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 09:39:27,667][73497] Updated weights for policy 0, policy_version 215189 (0.0037) [2024-06-13 09:39:30,502][73265] Fps is (10 sec: 44236.8, 60 sec: 45056.0, 300 sec: 45653.0). Total num frames: 3525771264. Throughput: 0: 45474.0. Samples: 44332100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:39:30,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 09:39:31,255][73497] Updated weights for policy 0, policy_version 215199 (0.0045) [2024-06-13 09:39:35,207][73497] Updated weights for policy 0, policy_version 215209 (0.0029) [2024-06-13 09:39:35,502][73265] Fps is (10 sec: 45874.4, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3526000640. Throughput: 0: 45482.1. Samples: 44465520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:39:35,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 09:39:38,299][73497] Updated weights for policy 0, policy_version 215219 (0.0037) [2024-06-13 09:39:40,501][73265] Fps is (10 sec: 42598.8, 60 sec: 45055.9, 300 sec: 45597.5). Total num frames: 3526197248. Throughput: 0: 45345.3. Samples: 44736540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:39:40,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 09:39:41,138][73477] Signal inference workers to stop experience collection... (650 times) [2024-06-13 09:39:41,138][73477] Signal inference workers to resume experience collection... (650 times) [2024-06-13 09:39:41,175][73497] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-06-13 09:39:41,175][73497] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-06-13 09:39:42,386][73497] Updated weights for policy 0, policy_version 215229 (0.0028) [2024-06-13 09:39:45,501][73265] Fps is (10 sec: 44237.3, 60 sec: 44784.8, 300 sec: 45653.0). Total num frames: 3526443008. Throughput: 0: 45440.4. 
Samples: 45001640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 09:39:45,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 09:39:45,971][73497] Updated weights for policy 0, policy_version 215239 (0.0040) [2024-06-13 09:39:49,430][73497] Updated weights for policy 0, policy_version 215249 (0.0032) [2024-06-13 09:39:50,502][73265] Fps is (10 sec: 45874.8, 60 sec: 45329.0, 300 sec: 45486.4). Total num frames: 3526656000. Throughput: 0: 45499.4. Samples: 45149040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 09:39:50,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 09:39:52,778][73497] Updated weights for policy 0, policy_version 215259 (0.0027) [2024-06-13 09:39:55,501][73265] Fps is (10 sec: 47513.5, 60 sec: 46148.2, 300 sec: 45597.5). Total num frames: 3526918144. Throughput: 0: 45626.3. Samples: 45431580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 09:39:55,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 09:39:56,326][73497] Updated weights for policy 0, policy_version 215269 (0.0038) [2024-06-13 09:39:59,790][73497] Updated weights for policy 0, policy_version 215279 (0.0038) [2024-06-13 09:40:00,501][73265] Fps is (10 sec: 49152.3, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 3527147520. Throughput: 0: 45458.1. Samples: 45692140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 09:40:00,502][73265] Avg episode reward: [(0, '0.369')] [2024-06-13 09:40:04,117][73497] Updated weights for policy 0, policy_version 215289 (0.0035) [2024-06-13 09:40:05,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 3527360512. Throughput: 0: 45498.8. Samples: 45838640. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 09:40:05,502][73265] Avg episode reward: [(0, '0.360')] [2024-06-13 09:40:06,628][73497] Updated weights for policy 0, policy_version 215299 (0.0036) [2024-06-13 09:40:10,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 3527589888. Throughput: 0: 45648.4. Samples: 46114440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 09:40:10,502][73265] Avg episode reward: [(0, '0.405')] [2024-06-13 09:40:11,299][73497] Updated weights for policy 0, policy_version 215309 (0.0032) [2024-06-13 09:40:14,357][73497] Updated weights for policy 0, policy_version 215319 (0.0037) [2024-06-13 09:40:15,504][73265] Fps is (10 sec: 45865.2, 60 sec: 45600.4, 300 sec: 45652.7). Total num frames: 3527819264. Throughput: 0: 45689.0. Samples: 46388200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 09:40:15,504][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 09:40:18,107][73497] Updated weights for policy 0, policy_version 215329 (0.0027) [2024-06-13 09:40:20,501][73265] Fps is (10 sec: 47514.0, 60 sec: 45602.3, 300 sec: 45708.6). Total num frames: 3528065024. Throughput: 0: 45902.0. Samples: 46531100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 09:40:20,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 09:40:21,241][73497] Updated weights for policy 0, policy_version 215339 (0.0025) [2024-06-13 09:40:25,415][73497] Updated weights for policy 0, policy_version 215349 (0.0029) [2024-06-13 09:40:25,501][73265] Fps is (10 sec: 45885.3, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3528278016. Throughput: 0: 45905.8. Samples: 46802300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 09:40:25,502][73265] Avg episode reward: [(0, '0.398')] [2024-06-13 09:40:25,509][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000215349_3528278016.pth... 
[2024-06-13 09:40:25,554][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000214682_3517349888.pth [2024-06-13 09:40:28,234][73497] Updated weights for policy 0, policy_version 215359 (0.0021) [2024-06-13 09:40:30,501][73265] Fps is (10 sec: 45874.6, 60 sec: 45875.3, 300 sec: 45708.6). Total num frames: 3528523776. Throughput: 0: 46259.5. Samples: 47083320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 09:40:30,502][73265] Avg episode reward: [(0, '0.405')] [2024-06-13 09:40:32,727][73497] Updated weights for policy 0, policy_version 215369 (0.0039) [2024-06-13 09:40:35,249][73497] Updated weights for policy 0, policy_version 215379 (0.0037) [2024-06-13 09:40:35,508][73265] Fps is (10 sec: 49120.5, 60 sec: 46143.5, 300 sec: 45652.1). Total num frames: 3528769536. Throughput: 0: 46150.9. Samples: 47226120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 09:40:35,508][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 09:40:39,976][73497] Updated weights for policy 0, policy_version 215389 (0.0028) [2024-06-13 09:40:40,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 3528949760. Throughput: 0: 45688.4. Samples: 47487560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 09:40:40,502][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 09:40:42,773][73497] Updated weights for policy 0, policy_version 215399 (0.0034) [2024-06-13 09:40:45,501][73265] Fps is (10 sec: 44265.2, 60 sec: 46148.3, 300 sec: 45653.0). Total num frames: 3529211904. Throughput: 0: 46016.1. Samples: 47762860. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 09:40:45,502][73265] Avg episode reward: [(0, '0.385')] [2024-06-13 09:40:47,055][73497] Updated weights for policy 0, policy_version 215409 (0.0032) [2024-06-13 09:40:50,061][73497] Updated weights for policy 0, policy_version 215419 (0.0036) [2024-06-13 09:40:50,501][73265] Fps is (10 sec: 47513.8, 60 sec: 46148.3, 300 sec: 45653.0). Total num frames: 3529424896. Throughput: 0: 45958.2. Samples: 47906760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-13 09:40:50,502][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 09:40:54,407][73497] Updated weights for policy 0, policy_version 215429 (0.0030) [2024-06-13 09:40:55,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3529654272. Throughput: 0: 45952.4. Samples: 48182300. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-13 09:40:55,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 09:40:57,057][73497] Updated weights for policy 0, policy_version 215439 (0.0033) [2024-06-13 09:41:00,504][73265] Fps is (10 sec: 44225.7, 60 sec: 45327.2, 300 sec: 45541.6). Total num frames: 3529867264. Throughput: 0: 45725.0. Samples: 48445840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-13 09:41:00,504][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 09:41:01,721][73477] Signal inference workers to stop experience collection... (700 times) [2024-06-13 09:41:01,770][73497] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-06-13 09:41:01,777][73477] Signal inference workers to resume experience collection... 
(700 times) [2024-06-13 09:41:01,784][73497] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-06-13 09:41:01,787][73497] Updated weights for policy 0, policy_version 215449 (0.0033) [2024-06-13 09:41:04,305][73497] Updated weights for policy 0, policy_version 215459 (0.0038) [2024-06-13 09:41:05,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 3530113024. Throughput: 0: 45506.1. Samples: 48578880. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-13 09:41:05,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 09:41:09,031][73497] Updated weights for policy 0, policy_version 215469 (0.0039) [2024-06-13 09:41:10,502][73265] Fps is (10 sec: 47524.8, 60 sec: 45875.1, 300 sec: 45542.0). Total num frames: 3530342400. Throughput: 0: 45734.0. Samples: 48860340. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-13 09:41:10,502][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 09:41:11,526][73497] Updated weights for policy 0, policy_version 215479 (0.0038) [2024-06-13 09:41:15,501][73265] Fps is (10 sec: 42598.3, 60 sec: 45330.7, 300 sec: 45375.3). Total num frames: 3530539008. Throughput: 0: 45410.2. Samples: 49126780. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-13 09:41:15,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 09:41:16,255][73497] Updated weights for policy 0, policy_version 215489 (0.0032) [2024-06-13 09:41:18,901][73497] Updated weights for policy 0, policy_version 215499 (0.0042) [2024-06-13 09:41:20,502][73265] Fps is (10 sec: 47513.6, 60 sec: 45875.0, 300 sec: 45653.4). Total num frames: 3530817536. Throughput: 0: 45282.3. Samples: 49263540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-13 09:41:20,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 09:41:23,135][73497] Updated weights for policy 0, policy_version 215509 (0.0035) [2024-06-13 09:41:25,502][73265] Fps is (10 sec: 49151.7, 60 sec: 45875.1, 300 sec: 45597.7). 
Total num frames: 3531030528. Throughput: 0: 45718.6. Samples: 49544900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-13 09:41:25,503][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 09:41:25,859][73497] Updated weights for policy 0, policy_version 215519 (0.0027) [2024-06-13 09:41:30,286][73497] Updated weights for policy 0, policy_version 215529 (0.0030) [2024-06-13 09:41:30,502][73265] Fps is (10 sec: 40960.1, 60 sec: 45055.9, 300 sec: 45375.3). Total num frames: 3531227136. Throughput: 0: 45692.7. Samples: 49819040. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-13 09:41:30,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 09:41:33,477][73497] Updated weights for policy 0, policy_version 215539 (0.0035) [2024-06-13 09:41:35,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45333.9, 300 sec: 45708.6). Total num frames: 3531489280. Throughput: 0: 45274.6. Samples: 49944120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-13 09:41:35,502][73265] Avg episode reward: [(0, '0.376')] [2024-06-13 09:41:37,778][73497] Updated weights for policy 0, policy_version 215549 (0.0038) [2024-06-13 09:41:40,501][73265] Fps is (10 sec: 45876.0, 60 sec: 45602.2, 300 sec: 45653.4). Total num frames: 3531685888. Throughput: 0: 45273.9. Samples: 50219620. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-13 09:41:40,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 09:41:40,643][73497] Updated weights for policy 0, policy_version 215559 (0.0041) [2024-06-13 09:41:44,884][73497] Updated weights for policy 0, policy_version 215569 (0.0033) [2024-06-13 09:41:45,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3531915264. Throughput: 0: 45576.3. Samples: 50496660. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-13 09:41:45,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 09:41:47,860][73497] Updated weights for policy 0, policy_version 215579 (0.0031) [2024-06-13 09:41:50,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3532144640. Throughput: 0: 45481.8. Samples: 50625560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-13 09:41:50,502][73265] Avg episode reward: [(0, '0.427')] [2024-06-13 09:41:51,783][73497] Updated weights for policy 0, policy_version 215589 (0.0034) [2024-06-13 09:41:54,906][73497] Updated weights for policy 0, policy_version 215599 (0.0032) [2024-06-13 09:41:55,501][73265] Fps is (10 sec: 47513.4, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3532390400. Throughput: 0: 45351.7. Samples: 50901160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-13 09:41:55,502][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 09:41:58,803][73497] Updated weights for policy 0, policy_version 215609 (0.0024) [2024-06-13 09:42:00,502][73265] Fps is (10 sec: 47513.0, 60 sec: 45877.0, 300 sec: 45486.4). Total num frames: 3532619776. Throughput: 0: 45559.5. Samples: 51176960. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-13 09:42:00,502][73265] Avg episode reward: [(0, '0.381')] [2024-06-13 09:42:02,191][73497] Updated weights for policy 0, policy_version 215619 (0.0030) [2024-06-13 09:42:05,502][73265] Fps is (10 sec: 44236.5, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3532832768. Throughput: 0: 45366.3. Samples: 51305020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-13 09:42:05,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 09:42:06,500][73497] Updated weights for policy 0, policy_version 215629 (0.0039) [2024-06-13 09:42:07,211][73477] Signal inference workers to stop experience collection... 
(750 times) [2024-06-13 09:42:07,258][73497] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-06-13 09:42:07,269][73477] Signal inference workers to resume experience collection... (750 times) [2024-06-13 09:42:07,275][73497] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-06-13 09:42:09,565][73497] Updated weights for policy 0, policy_version 215639 (0.0031) [2024-06-13 09:42:10,501][73265] Fps is (10 sec: 47514.2, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 3533094912. Throughput: 0: 45245.0. Samples: 51580920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-13 09:42:10,502][73265] Avg episode reward: [(0, '0.392')] [2024-06-13 09:42:13,543][73497] Updated weights for policy 0, policy_version 215649 (0.0031) [2024-06-13 09:42:15,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 45542.3). Total num frames: 3533291520. Throughput: 0: 45135.2. Samples: 51850120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-13 09:42:15,502][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 09:42:15,507][73477] Saving new best policy, reward=0.475! [2024-06-13 09:42:17,012][73497] Updated weights for policy 0, policy_version 215659 (0.0034) [2024-06-13 09:42:20,501][73265] Fps is (10 sec: 40960.2, 60 sec: 44783.1, 300 sec: 45486.4). Total num frames: 3533504512. Throughput: 0: 45189.0. Samples: 51977620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-13 09:42:20,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 09:42:20,608][73497] Updated weights for policy 0, policy_version 215669 (0.0033) [2024-06-13 09:42:24,025][73497] Updated weights for policy 0, policy_version 215679 (0.0032) [2024-06-13 09:42:25,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3533750272. Throughput: 0: 45259.5. Samples: 52256300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-13 09:42:25,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 09:42:25,553][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000215684_3533766656.pth... [2024-06-13 09:42:25,606][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000215015_3522805760.pth [2024-06-13 09:42:27,971][73497] Updated weights for policy 0, policy_version 215689 (0.0032) [2024-06-13 09:42:30,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.3, 300 sec: 45597.5). Total num frames: 3533963264. Throughput: 0: 45144.0. Samples: 52528140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-13 09:42:30,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 09:42:31,032][73497] Updated weights for policy 0, policy_version 215699 (0.0026) [2024-06-13 09:42:35,265][73497] Updated weights for policy 0, policy_version 215709 (0.0026) [2024-06-13 09:42:35,502][73265] Fps is (10 sec: 42597.9, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 3534176256. Throughput: 0: 45155.4. Samples: 52657560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-13 09:42:35,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 09:42:38,541][73497] Updated weights for policy 0, policy_version 215719 (0.0030) [2024-06-13 09:42:40,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3534422016. Throughput: 0: 45108.0. Samples: 52931020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-13 09:42:40,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 09:42:42,289][73497] Updated weights for policy 0, policy_version 215729 (0.0027) [2024-06-13 09:42:45,502][73265] Fps is (10 sec: 45875.5, 60 sec: 45329.0, 300 sec: 45542.3). Total num frames: 3534635008. Throughput: 0: 45030.2. Samples: 53203320. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-13 09:42:45,510][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 09:42:45,971][73497] Updated weights for policy 0, policy_version 215739 (0.0041) [2024-06-13 09:42:49,626][73497] Updated weights for policy 0, policy_version 215749 (0.0030) [2024-06-13 09:42:50,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45056.0, 300 sec: 45319.8). Total num frames: 3534848000. Throughput: 0: 45226.8. Samples: 53340220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-13 09:42:50,502][73265] Avg episode reward: [(0, '0.399')] [2024-06-13 09:42:52,935][73497] Updated weights for policy 0, policy_version 215759 (0.0028) [2024-06-13 09:42:55,502][73265] Fps is (10 sec: 45874.9, 60 sec: 45055.9, 300 sec: 45486.4). Total num frames: 3535093760. Throughput: 0: 45071.4. Samples: 53609140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-13 09:42:55,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 09:42:56,780][73497] Updated weights for policy 0, policy_version 215769 (0.0045) [2024-06-13 09:43:00,118][73497] Updated weights for policy 0, policy_version 215779 (0.0021) [2024-06-13 09:43:00,501][73265] Fps is (10 sec: 49152.1, 60 sec: 45329.2, 300 sec: 45542.0). Total num frames: 3535339520. Throughput: 0: 45198.3. Samples: 53884040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-13 09:43:00,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 09:43:04,016][73497] Updated weights for policy 0, policy_version 215789 (0.0025) [2024-06-13 09:43:05,501][73265] Fps is (10 sec: 42598.9, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 3535519744. Throughput: 0: 45415.5. Samples: 54021320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-13 09:43:05,502][73265] Avg episode reward: [(0, '0.490')] [2024-06-13 09:43:05,536][73477] Saving new best policy, reward=0.490! 
[2024-06-13 09:43:07,304][73497] Updated weights for policy 0, policy_version 215799 (0.0025) [2024-06-13 09:43:07,392][73477] Signal inference workers to stop experience collection... (800 times) [2024-06-13 09:43:07,420][73497] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-06-13 09:43:07,447][73477] Signal inference workers to resume experience collection... (800 times) [2024-06-13 09:43:07,467][73497] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-06-13 09:43:10,501][73265] Fps is (10 sec: 42598.5, 60 sec: 44509.9, 300 sec: 45375.4). Total num frames: 3535765504. Throughput: 0: 45107.6. Samples: 54286140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-13 09:43:10,502][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 09:43:11,018][73497] Updated weights for policy 0, policy_version 215809 (0.0031) [2024-06-13 09:43:14,475][73497] Updated weights for policy 0, policy_version 215819 (0.0042) [2024-06-13 09:43:15,501][73265] Fps is (10 sec: 49152.1, 60 sec: 45329.1, 300 sec: 45486.8). Total num frames: 3536011264. Throughput: 0: 45186.6. Samples: 54561540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-13 09:43:15,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 09:43:18,506][73497] Updated weights for policy 0, policy_version 215829 (0.0029) [2024-06-13 09:43:20,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45329.1, 300 sec: 45486.8). Total num frames: 3536224256. Throughput: 0: 45452.2. Samples: 54702900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-13 09:43:20,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 09:43:21,513][73497] Updated weights for policy 0, policy_version 215839 (0.0034) [2024-06-13 09:43:25,501][73497] Updated weights for policy 0, policy_version 215849 (0.0051) [2024-06-13 09:43:25,502][73265] Fps is (10 sec: 45874.7, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3536470016. Throughput: 0: 45495.0. Samples: 54978300. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-13 09:43:25,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 09:43:28,834][73497] Updated weights for policy 0, policy_version 215859 (0.0040) [2024-06-13 09:43:30,501][73265] Fps is (10 sec: 45874.7, 60 sec: 45329.0, 300 sec: 45486.4). Total num frames: 3536683008. Throughput: 0: 45342.3. Samples: 55243720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-13 09:43:30,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 09:43:32,535][73497] Updated weights for policy 0, policy_version 215869 (0.0031) [2024-06-13 09:43:35,501][73265] Fps is (10 sec: 42599.0, 60 sec: 45329.2, 300 sec: 45430.9). Total num frames: 3536896000. Throughput: 0: 45343.1. Samples: 55380660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-13 09:43:35,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 09:43:36,075][73497] Updated weights for policy 0, policy_version 215879 (0.0032) [2024-06-13 09:43:39,534][73497] Updated weights for policy 0, policy_version 215889 (0.0032) [2024-06-13 09:43:40,501][73265] Fps is (10 sec: 44237.3, 60 sec: 45056.1, 300 sec: 45320.2). Total num frames: 3537125376. Throughput: 0: 45484.7. Samples: 55655940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-13 09:43:40,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 09:43:43,270][73497] Updated weights for policy 0, policy_version 215899 (0.0039) [2024-06-13 09:43:45,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3537354752. Throughput: 0: 45384.0. Samples: 55926320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-13 09:43:45,502][73265] Avg episode reward: [(0, '0.482')] [2024-06-13 09:43:47,750][73497] Updated weights for policy 0, policy_version 215909 (0.0030) [2024-06-13 09:43:50,501][73265] Fps is (10 sec: 47513.4, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 3537600512. Throughput: 0: 45422.7. Samples: 56065340. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-13 09:43:50,502][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 09:43:50,636][73497] Updated weights for policy 0, policy_version 215919 (0.0044) [2024-06-13 09:43:54,764][73497] Updated weights for policy 0, policy_version 215929 (0.0032) [2024-06-13 09:43:55,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45056.1, 300 sec: 45430.9). Total num frames: 3537797120. Throughput: 0: 45548.8. Samples: 56335840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-13 09:43:55,502][73265] Avg episode reward: [(0, '0.379')] [2024-06-13 09:43:57,604][73497] Updated weights for policy 0, policy_version 215939 (0.0037) [2024-06-13 09:44:00,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3538042880. Throughput: 0: 45340.1. Samples: 56601840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-13 09:44:00,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 09:44:01,927][73497] Updated weights for policy 0, policy_version 215949 (0.0032) [2024-06-13 09:44:04,967][73477] Signal inference workers to stop experience collection... (850 times) [2024-06-13 09:44:04,968][73477] Signal inference workers to resume experience collection... (850 times) [2024-06-13 09:44:05,016][73497] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-06-13 09:44:05,016][73497] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-06-13 09:44:05,104][73497] Updated weights for policy 0, policy_version 215959 (0.0034) [2024-06-13 09:44:05,501][73265] Fps is (10 sec: 47514.0, 60 sec: 45875.3, 300 sec: 45486.5). Total num frames: 3538272256. Throughput: 0: 45285.8. Samples: 56740760. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-13 09:44:05,510][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 09:44:08,931][73497] Updated weights for policy 0, policy_version 215969 (0.0033) [2024-06-13 09:44:10,507][73265] Fps is (10 sec: 40935.2, 60 sec: 44778.4, 300 sec: 45318.9). Total num frames: 3538452480. Throughput: 0: 45108.3. Samples: 57008440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-13 09:44:10,508][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 09:44:12,378][73497] Updated weights for policy 0, policy_version 215979 (0.0033) [2024-06-13 09:44:15,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3538731008. Throughput: 0: 45154.3. Samples: 57275660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-13 09:44:15,502][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 09:44:16,565][73497] Updated weights for policy 0, policy_version 215989 (0.0041) [2024-06-13 09:44:19,770][73497] Updated weights for policy 0, policy_version 215999 (0.0030) [2024-06-13 09:44:20,502][73265] Fps is (10 sec: 50820.3, 60 sec: 45602.0, 300 sec: 45486.4). Total num frames: 3538960384. Throughput: 0: 45378.5. Samples: 57422700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 09:44:20,502][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 09:44:23,883][73497] Updated weights for policy 0, policy_version 216009 (0.0035) [2024-06-13 09:44:25,502][73265] Fps is (10 sec: 42597.5, 60 sec: 44782.9, 300 sec: 45375.4). Total num frames: 3539156992. Throughput: 0: 45263.8. Samples: 57692820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 09:44:25,502][73265] Avg episode reward: [(0, '0.385')] [2024-06-13 09:44:25,592][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000216014_3539173376.pth... 
[2024-06-13 09:44:25,636][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000215349_3528278016.pth [2024-06-13 09:44:26,759][73497] Updated weights for policy 0, policy_version 216019 (0.0035) [2024-06-13 09:44:30,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45056.0, 300 sec: 45375.4). Total num frames: 3539386368. Throughput: 0: 45000.0. Samples: 57951320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 09:44:30,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 09:44:30,983][73497] Updated weights for policy 0, policy_version 216029 (0.0033) [2024-06-13 09:44:34,338][73497] Updated weights for policy 0, policy_version 216039 (0.0040) [2024-06-13 09:44:35,501][73265] Fps is (10 sec: 45875.9, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3539615744. Throughput: 0: 45185.8. Samples: 58098700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 09:44:35,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 09:44:37,983][73497] Updated weights for policy 0, policy_version 216049 (0.0027) [2024-06-13 09:44:40,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3539845120. Throughput: 0: 45164.5. Samples: 58368240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 09:44:40,502][73265] Avg episode reward: [(0, '0.394')] [2024-06-13 09:44:41,499][73497] Updated weights for policy 0, policy_version 216059 (0.0043) [2024-06-13 09:44:45,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 3540058112. Throughput: 0: 45319.1. Samples: 58641200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 09:44:45,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 09:44:45,631][73497] Updated weights for policy 0, policy_version 216069 (0.0029) [2024-06-13 09:44:48,452][73497] Updated weights for policy 0, policy_version 216079 (0.0026) [2024-06-13 09:44:50,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45056.0, 300 sec: 45375.4). Total num frames: 3540303872. Throughput: 0: 45287.1. Samples: 58778680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 09:44:50,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 09:44:52,398][73497] Updated weights for policy 0, policy_version 216089 (0.0033) [2024-06-13 09:44:55,501][73265] Fps is (10 sec: 49151.7, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 3540549632. Throughput: 0: 45470.4. Samples: 59054340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 09:44:55,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 09:44:55,708][73497] Updated weights for policy 0, policy_version 216099 (0.0038) [2024-06-13 09:44:59,836][73497] Updated weights for policy 0, policy_version 216109 (0.0032) [2024-06-13 09:45:00,502][73265] Fps is (10 sec: 44236.2, 60 sec: 45055.9, 300 sec: 45375.3). Total num frames: 3540746240. Throughput: 0: 45470.0. Samples: 59321820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 09:45:00,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 09:45:03,172][73497] Updated weights for policy 0, policy_version 216119 (0.0036) [2024-06-13 09:45:05,502][73265] Fps is (10 sec: 44236.5, 60 sec: 45328.9, 300 sec: 45430.9). Total num frames: 3540992000. Throughput: 0: 45394.6. Samples: 59465460. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 09:45:05,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 09:45:06,641][73497] Updated weights for policy 0, policy_version 216129 (0.0036) [2024-06-13 09:45:10,098][73497] Updated weights for policy 0, policy_version 216139 (0.0046) [2024-06-13 09:45:10,501][73265] Fps is (10 sec: 49152.4, 60 sec: 46426.0, 300 sec: 45486.8). Total num frames: 3541237760. Throughput: 0: 45447.2. Samples: 59737940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 09:45:10,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 09:45:14,422][73497] Updated weights for policy 0, policy_version 216149 (0.0028) [2024-06-13 09:45:15,361][73477] Signal inference workers to stop experience collection... (900 times) [2024-06-13 09:45:15,361][73477] Signal inference workers to resume experience collection... (900 times) [2024-06-13 09:45:15,379][73497] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-06-13 09:45:15,380][73497] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-06-13 09:45:15,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45328.9, 300 sec: 45375.3). Total num frames: 3541450752. Throughput: 0: 45852.4. Samples: 60014680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 09:45:15,502][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 09:45:17,433][73497] Updated weights for policy 0, policy_version 216159 (0.0037) [2024-06-13 09:45:20,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3541680128. Throughput: 0: 45459.9. Samples: 60144400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 09:45:20,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 09:45:21,361][73497] Updated weights for policy 0, policy_version 216169 (0.0029) [2024-06-13 09:45:24,713][73497] Updated weights for policy 0, policy_version 216179 (0.0036) [2024-06-13 09:45:25,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45602.2, 300 sec: 45319.8). Total num frames: 3541893120. Throughput: 0: 45577.7. Samples: 60419240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 09:45:25,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 09:45:28,393][73497] Updated weights for policy 0, policy_version 216189 (0.0033) [2024-06-13 09:45:30,501][73265] Fps is (10 sec: 44237.4, 60 sec: 45602.2, 300 sec: 45265.3). Total num frames: 3542122496. Throughput: 0: 45457.8. Samples: 60686800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 09:45:30,502][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 09:45:32,145][73497] Updated weights for policy 0, policy_version 216199 (0.0036) [2024-06-13 09:45:35,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45329.0, 300 sec: 45375.3). Total num frames: 3542335488. Throughput: 0: 45429.2. Samples: 60823000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 09:45:35,502][73265] Avg episode reward: [(0, '0.374')] [2024-06-13 09:45:36,037][73497] Updated weights for policy 0, policy_version 216209 (0.0029) [2024-06-13 09:45:39,110][73497] Updated weights for policy 0, policy_version 216219 (0.0040) [2024-06-13 09:45:40,502][73265] Fps is (10 sec: 49151.2, 60 sec: 46148.2, 300 sec: 45430.9). Total num frames: 3542614016. Throughput: 0: 45389.3. Samples: 61096860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 09:45:40,502][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 09:45:43,024][73497] Updated weights for policy 0, policy_version 216229 (0.0032) [2024-06-13 09:45:45,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45602.1, 300 sec: 45319.8). 
Total num frames: 3542794240. Throughput: 0: 45501.0. Samples: 61369360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 09:45:45,502][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 09:45:46,434][73497] Updated weights for policy 0, policy_version 216239 (0.0037) [2024-06-13 09:45:50,167][73497] Updated weights for policy 0, policy_version 216249 (0.0031) [2024-06-13 09:45:50,501][73265] Fps is (10 sec: 40960.2, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 3543023616. Throughput: 0: 45326.8. Samples: 61505160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 09:45:50,502][73265] Avg episode reward: [(0, '0.347')] [2024-06-13 09:45:53,842][73497] Updated weights for policy 0, policy_version 216259 (0.0046) [2024-06-13 09:45:55,504][73265] Fps is (10 sec: 47501.7, 60 sec: 45327.2, 300 sec: 45430.9). Total num frames: 3543269376. Throughput: 0: 45302.0. Samples: 61776640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 09:45:55,504][73265] Avg episode reward: [(0, '0.370')] [2024-06-13 09:45:57,504][73497] Updated weights for policy 0, policy_version 216269 (0.0034) [2024-06-13 09:46:00,501][73265] Fps is (10 sec: 42598.8, 60 sec: 45056.1, 300 sec: 45208.7). Total num frames: 3543449600. Throughput: 0: 45280.1. Samples: 62052280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 09:46:00,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 09:46:00,944][73497] Updated weights for policy 0, policy_version 216279 (0.0045) [2024-06-13 09:46:04,971][73497] Updated weights for policy 0, policy_version 216289 (0.0030) [2024-06-13 09:46:05,501][73265] Fps is (10 sec: 42608.9, 60 sec: 45056.1, 300 sec: 45264.3). Total num frames: 3543695360. Throughput: 0: 45235.2. Samples: 62179980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 09:46:05,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 09:46:08,199][73497] Updated weights for policy 0, policy_version 216299 (0.0033) [2024-06-13 09:46:10,502][73265] Fps is (10 sec: 47509.6, 60 sec: 44782.3, 300 sec: 45375.2). Total num frames: 3543924736. Throughput: 0: 45119.7. Samples: 62449660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 09:46:10,503][73265] Avg episode reward: [(0, '0.474')] [2024-06-13 09:46:12,005][73497] Updated weights for policy 0, policy_version 216309 (0.0037) [2024-06-13 09:46:15,502][73265] Fps is (10 sec: 45874.7, 60 sec: 45056.0, 300 sec: 45208.7). Total num frames: 3544154112. Throughput: 0: 45410.0. Samples: 62730260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 09:46:15,502][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 09:46:15,660][73497] Updated weights for policy 0, policy_version 216319 (0.0040) [2024-06-13 09:46:18,978][73497] Updated weights for policy 0, policy_version 216329 (0.0045) [2024-06-13 09:46:20,501][73265] Fps is (10 sec: 45878.6, 60 sec: 45056.0, 300 sec: 45264.3). Total num frames: 3544383488. Throughput: 0: 45407.1. Samples: 62866320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 09:46:20,502][73265] Avg episode reward: [(0, '0.393')] [2024-06-13 09:46:22,962][73497] Updated weights for policy 0, policy_version 216339 (0.0037) [2024-06-13 09:46:25,501][73265] Fps is (10 sec: 45875.9, 60 sec: 45329.1, 300 sec: 45375.4). Total num frames: 3544612864. Throughput: 0: 45279.2. Samples: 63134420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 09:46:25,502][73265] Avg episode reward: [(0, '0.398')] [2024-06-13 09:46:25,613][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000216347_3544629248.pth... 
[2024-06-13 09:46:25,663][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000215684_3533766656.pth [2024-06-13 09:46:26,003][73497] Updated weights for policy 0, policy_version 216349 (0.0040) [2024-06-13 09:46:29,783][73497] Updated weights for policy 0, policy_version 216359 (0.0047) [2024-06-13 09:46:30,501][73265] Fps is (10 sec: 47514.3, 60 sec: 45602.2, 300 sec: 45319.8). Total num frames: 3544858624. Throughput: 0: 45459.6. Samples: 63415040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 09:46:30,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 09:46:33,768][73497] Updated weights for policy 0, policy_version 216369 (0.0030) [2024-06-13 09:46:34,781][73477] Signal inference workers to stop experience collection... (950 times) [2024-06-13 09:46:34,782][73477] Signal inference workers to resume experience collection... (950 times) [2024-06-13 09:46:34,827][73497] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-06-13 09:46:34,827][73497] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-06-13 09:46:35,502][73265] Fps is (10 sec: 45873.9, 60 sec: 45602.0, 300 sec: 45375.3). Total num frames: 3545071616. Throughput: 0: 45515.8. Samples: 63553380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 09:46:35,502][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 09:46:36,906][73497] Updated weights for policy 0, policy_version 216379 (0.0023) [2024-06-13 09:46:40,502][73265] Fps is (10 sec: 44235.8, 60 sec: 44782.9, 300 sec: 45375.3). Total num frames: 3545300992. Throughput: 0: 45427.2. Samples: 63820760. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 09:46:40,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 09:46:40,791][73497] Updated weights for policy 0, policy_version 216389 (0.0043) [2024-06-13 09:46:44,566][73497] Updated weights for policy 0, policy_version 216399 (0.0029) [2024-06-13 09:46:45,502][73265] Fps is (10 sec: 45876.0, 60 sec: 45602.0, 300 sec: 45375.3). Total num frames: 3545530368. Throughput: 0: 45345.6. Samples: 64092840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 09:46:45,502][73265] Avg episode reward: [(0, '0.483')] [2024-06-13 09:46:47,649][73497] Updated weights for policy 0, policy_version 216409 (0.0032) [2024-06-13 09:46:50,504][73265] Fps is (10 sec: 44226.4, 60 sec: 45327.2, 300 sec: 45263.9). Total num frames: 3545743360. Throughput: 0: 45437.9. Samples: 64224800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 09:46:50,504][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 09:46:51,847][73497] Updated weights for policy 0, policy_version 216419 (0.0037) [2024-06-13 09:46:54,806][73497] Updated weights for policy 0, policy_version 216429 (0.0034) [2024-06-13 09:46:55,504][73265] Fps is (10 sec: 45864.2, 60 sec: 45329.0, 300 sec: 45319.4). Total num frames: 3545989120. Throughput: 0: 45592.9. Samples: 64501420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 09:46:55,504][73265] Avg episode reward: [(0, '0.406')] [2024-06-13 09:46:59,041][73497] Updated weights for policy 0, policy_version 216439 (0.0033) [2024-06-13 09:47:00,501][73265] Fps is (10 sec: 45886.6, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 3546202112. Throughput: 0: 45270.4. Samples: 64767420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 09:47:00,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 09:47:02,581][73497] Updated weights for policy 0, policy_version 216449 (0.0034) [2024-06-13 09:47:05,502][73265] Fps is (10 sec: 44247.1, 60 sec: 45602.0, 300 sec: 45208.7). 
Total num frames: 3546431488. Throughput: 0: 45307.9. Samples: 64905180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 09:47:05,502][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 09:47:06,318][73497] Updated weights for policy 0, policy_version 216459 (0.0040) [2024-06-13 09:47:09,691][73497] Updated weights for policy 0, policy_version 216469 (0.0030) [2024-06-13 09:47:10,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45875.8, 300 sec: 45375.4). Total num frames: 3546677248. Throughput: 0: 45289.4. Samples: 65172440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 09:47:10,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 09:47:13,309][73497] Updated weights for policy 0, policy_version 216479 (0.0034) [2024-06-13 09:47:15,502][73265] Fps is (10 sec: 45875.2, 60 sec: 45602.1, 300 sec: 45375.3). Total num frames: 3546890240. Throughput: 0: 45302.0. Samples: 65453640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 09:47:15,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 09:47:16,543][73497] Updated weights for policy 0, policy_version 216489 (0.0037) [2024-06-13 09:47:20,504][73265] Fps is (10 sec: 42587.6, 60 sec: 45327.2, 300 sec: 45263.9). Total num frames: 3547103232. Throughput: 0: 45288.4. Samples: 65591460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 09:47:20,505][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 09:47:20,697][73497] Updated weights for policy 0, policy_version 216499 (0.0044) [2024-06-13 09:47:23,826][73497] Updated weights for policy 0, policy_version 216509 (0.0027) [2024-06-13 09:47:25,501][73265] Fps is (10 sec: 47514.6, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 3547365376. Throughput: 0: 45297.9. Samples: 65859160. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 09:47:25,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 09:47:27,637][73497] Updated weights for policy 0, policy_version 216519 (0.0032) [2024-06-13 09:47:30,501][73265] Fps is (10 sec: 47525.6, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3547578368. Throughput: 0: 45402.8. Samples: 66135960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 09:47:30,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 09:47:31,201][73497] Updated weights for policy 0, policy_version 216529 (0.0033) [2024-06-13 09:47:34,937][73497] Updated weights for policy 0, policy_version 216539 (0.0039) [2024-06-13 09:47:35,501][73265] Fps is (10 sec: 40959.5, 60 sec: 45056.1, 300 sec: 45264.3). Total num frames: 3547774976. Throughput: 0: 45418.0. Samples: 66268500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 09:47:35,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 09:47:38,122][73497] Updated weights for policy 0, policy_version 216549 (0.0033) [2024-06-13 09:47:40,501][73265] Fps is (10 sec: 44236.4, 60 sec: 45329.1, 300 sec: 45375.4). Total num frames: 3548020736. Throughput: 0: 45451.8. Samples: 66546640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 09:47:40,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 09:47:42,188][73497] Updated weights for policy 0, policy_version 216559 (0.0046) [2024-06-13 09:47:45,153][73497] Updated weights for policy 0, policy_version 216569 (0.0042) [2024-06-13 09:47:45,503][73265] Fps is (10 sec: 49144.6, 60 sec: 45601.0, 300 sec: 45486.2). Total num frames: 3548266496. Throughput: 0: 45430.0. Samples: 66811840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 09:47:45,512][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 09:47:49,511][73497] Updated weights for policy 0, policy_version 216579 (0.0025) [2024-06-13 09:47:50,504][73265] Fps is (10 sec: 45864.5, 60 sec: 45602.2, 300 sec: 45375.0). 
Total num frames: 3548479488. Throughput: 0: 45612.4. Samples: 66957840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 09:47:50,504][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 09:47:52,924][73497] Updated weights for policy 0, policy_version 216589 (0.0035) [2024-06-13 09:47:55,502][73265] Fps is (10 sec: 45882.0, 60 sec: 45604.0, 300 sec: 45375.3). Total num frames: 3548725248. Throughput: 0: 45704.3. Samples: 67229140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 09:47:55,502][73265] Avg episode reward: [(0, '0.390')] [2024-06-13 09:47:56,348][73497] Updated weights for policy 0, policy_version 216599 (0.0036) [2024-06-13 09:47:59,820][73497] Updated weights for policy 0, policy_version 216609 (0.0029) [2024-06-13 09:48:00,501][73265] Fps is (10 sec: 45886.1, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3548938240. Throughput: 0: 45440.6. Samples: 67498460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 09:48:00,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 09:48:03,782][73497] Updated weights for policy 0, policy_version 216619 (0.0030) [2024-06-13 09:48:05,502][73265] Fps is (10 sec: 45875.2, 60 sec: 45875.3, 300 sec: 45486.4). Total num frames: 3549184000. Throughput: 0: 45495.3. Samples: 67638640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 09:48:05,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 09:48:06,749][73497] Updated weights for policy 0, policy_version 216629 (0.0043) [2024-06-13 09:48:10,502][73265] Fps is (10 sec: 45874.7, 60 sec: 45328.9, 300 sec: 45375.3). Total num frames: 3549396992. Throughput: 0: 45585.2. Samples: 67910500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 09:48:10,502][73265] Avg episode reward: [(0, '0.391')] [2024-06-13 09:48:10,795][73497] Updated weights for policy 0, policy_version 216639 (0.0035) [2024-06-13 09:48:11,664][73477] Signal inference workers to stop experience collection... 
(1000 times) [2024-06-13 09:48:11,712][73497] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-06-13 09:48:11,717][73477] Signal inference workers to resume experience collection... (1000 times) [2024-06-13 09:48:11,723][73497] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-06-13 09:48:14,156][73497] Updated weights for policy 0, policy_version 216649 (0.0043) [2024-06-13 09:48:15,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45602.2, 300 sec: 45430.9). Total num frames: 3549626368. Throughput: 0: 45397.3. Samples: 68178840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 09:48:15,502][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 09:48:18,290][73497] Updated weights for policy 0, policy_version 216659 (0.0041) [2024-06-13 09:48:20,501][73265] Fps is (10 sec: 45876.0, 60 sec: 45877.1, 300 sec: 45375.4). Total num frames: 3549855744. Throughput: 0: 45442.3. Samples: 68313400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 09:48:20,502][73265] Avg episode reward: [(0, '0.393')] [2024-06-13 09:48:21,930][73497] Updated weights for policy 0, policy_version 216669 (0.0032) [2024-06-13 09:48:25,270][73497] Updated weights for policy 0, policy_version 216679 (0.0031) [2024-06-13 09:48:25,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3550085120. Throughput: 0: 45341.3. Samples: 68587000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 09:48:25,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 09:48:25,509][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000216680_3550085120.pth... [2024-06-13 09:48:25,555][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000216014_3539173376.pth [2024-06-13 09:48:28,921][73497] Updated weights for policy 0, policy_version 216689 (0.0032) [2024-06-13 09:48:30,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45329.1, 300 sec: 45430.9). 
Total num frames: 3550298112. Throughput: 0: 45615.4. Samples: 68864460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 09:48:30,502][73265] Avg episode reward: [(0, '0.402')] [2024-06-13 09:48:32,803][73497] Updated weights for policy 0, policy_version 216699 (0.0041) [2024-06-13 09:48:35,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 3550527488. Throughput: 0: 45411.7. Samples: 69001260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 09:48:35,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 09:48:35,819][73497] Updated weights for policy 0, policy_version 216709 (0.0042) [2024-06-13 09:48:39,543][73497] Updated weights for policy 0, policy_version 216719 (0.0037) [2024-06-13 09:48:40,508][73265] Fps is (10 sec: 47482.3, 60 sec: 45870.2, 300 sec: 45485.4). Total num frames: 3550773248. Throughput: 0: 45567.3. Samples: 69279960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 09:48:40,509][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 09:48:42,826][73497] Updated weights for policy 0, policy_version 216729 (0.0036) [2024-06-13 09:48:45,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45057.2, 300 sec: 45319.8). Total num frames: 3550969856. Throughput: 0: 45525.3. Samples: 69547100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-13 09:48:45,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 09:48:46,778][73497] Updated weights for policy 0, policy_version 216739 (0.0028) [2024-06-13 09:48:50,501][73265] Fps is (10 sec: 42626.7, 60 sec: 45330.9, 300 sec: 45430.9). Total num frames: 3551199232. Throughput: 0: 45475.3. Samples: 69685020. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 09:48:50,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 09:48:50,627][73497] Updated weights for policy 0, policy_version 216749 (0.0038) [2024-06-13 09:48:53,733][73497] Updated weights for policy 0, policy_version 216759 (0.0036) [2024-06-13 09:48:55,502][73265] Fps is (10 sec: 47513.2, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3551444992. Throughput: 0: 45499.1. Samples: 69957960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 09:48:55,502][73265] Avg episode reward: [(0, '0.382')] [2024-06-13 09:48:57,629][73497] Updated weights for policy 0, policy_version 216769 (0.0033) [2024-06-13 09:49:00,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45602.2, 300 sec: 45430.9). Total num frames: 3551674368. Throughput: 0: 45842.8. Samples: 70241760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 09:49:00,502][73265] Avg episode reward: [(0, '0.378')] [2024-06-13 09:49:00,943][73497] Updated weights for policy 0, policy_version 216779 (0.0035) [2024-06-13 09:49:04,498][73497] Updated weights for policy 0, policy_version 216789 (0.0026) [2024-06-13 09:49:05,505][73265] Fps is (10 sec: 44220.4, 60 sec: 45053.2, 300 sec: 45542.3). Total num frames: 3551887360. Throughput: 0: 45806.8. Samples: 70374880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 09:49:05,506][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 09:49:08,285][73497] Updated weights for policy 0, policy_version 216799 (0.0039) [2024-06-13 09:49:10,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.3, 300 sec: 45430.9). Total num frames: 3552133120. Throughput: 0: 45853.5. Samples: 70650400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 09:49:10,502][73265] Avg episode reward: [(0, '0.406')] [2024-06-13 09:49:11,459][73497] Updated weights for policy 0, policy_version 216809 (0.0034) [2024-06-13 09:49:13,191][73477] Signal inference workers to stop experience collection... 
(1050 times) [2024-06-13 09:49:13,193][73477] Signal inference workers to resume experience collection... (1050 times) [2024-06-13 09:49:13,244][73497] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-06-13 09:49:13,244][73497] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-06-13 09:49:15,219][73497] Updated weights for policy 0, policy_version 216819 (0.0054) [2024-06-13 09:49:15,508][73265] Fps is (10 sec: 47500.9, 60 sec: 45597.2, 300 sec: 45429.9). Total num frames: 3552362496. Throughput: 0: 45668.9. Samples: 70919860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 09:49:15,508][73265] Avg episode reward: [(0, '0.476')] [2024-06-13 09:49:18,922][73497] Updated weights for policy 0, policy_version 216829 (0.0028) [2024-06-13 09:49:20,501][73265] Fps is (10 sec: 45874.7, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3552591872. Throughput: 0: 45708.0. Samples: 71058120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 09:49:20,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 09:49:22,288][73497] Updated weights for policy 0, policy_version 216839 (0.0047) [2024-06-13 09:49:25,501][73265] Fps is (10 sec: 45904.6, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3552821248. Throughput: 0: 45680.4. Samples: 71335280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 09:49:25,502][73265] Avg episode reward: [(0, '0.389')] [2024-06-13 09:49:25,978][73497] Updated weights for policy 0, policy_version 216849 (0.0038) [2024-06-13 09:49:29,573][73497] Updated weights for policy 0, policy_version 216859 (0.0041) [2024-06-13 09:49:30,505][73265] Fps is (10 sec: 45857.7, 60 sec: 45872.2, 300 sec: 45541.4). Total num frames: 3553050624. Throughput: 0: 45733.4. Samples: 71605280. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 09:49:30,506][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 09:49:33,119][73497] Updated weights for policy 0, policy_version 216869 (0.0032) [2024-06-13 09:49:35,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45602.2, 300 sec: 45486.4). Total num frames: 3553263616. Throughput: 0: 45767.9. Samples: 71744580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 09:49:35,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 09:49:36,915][73497] Updated weights for policy 0, policy_version 216879 (0.0029) [2024-06-13 09:49:40,323][73497] Updated weights for policy 0, policy_version 216889 (0.0039) [2024-06-13 09:49:40,501][73265] Fps is (10 sec: 45892.8, 60 sec: 45607.1, 300 sec: 45597.5). Total num frames: 3553509376. Throughput: 0: 45688.5. Samples: 72013940. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 09:49:40,502][73265] Avg episode reward: [(0, '0.347')] [2024-06-13 09:49:43,960][73497] Updated weights for policy 0, policy_version 216899 (0.0033) [2024-06-13 09:49:45,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 45486.4). Total num frames: 3553722368. Throughput: 0: 45337.2. Samples: 72281940. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 09:49:45,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 09:49:47,953][73497] Updated weights for policy 0, policy_version 216909 (0.0051) [2024-06-13 09:49:50,502][73265] Fps is (10 sec: 42597.9, 60 sec: 45602.0, 300 sec: 45375.3). Total num frames: 3553935360. Throughput: 0: 45441.9. Samples: 72419600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 09:49:50,502][73265] Avg episode reward: [(0, '0.412')] [2024-06-13 09:49:51,167][73497] Updated weights for policy 0, policy_version 216919 (0.0031) [2024-06-13 09:49:54,887][73497] Updated weights for policy 0, policy_version 216929 (0.0028) [2024-06-13 09:49:55,502][73265] Fps is (10 sec: 44236.3, 60 sec: 45329.0, 300 sec: 45486.4). 
Total num frames: 3554164736. Throughput: 0: 45469.1. Samples: 72696520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 09:49:55,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 09:49:58,320][73497] Updated weights for policy 0, policy_version 216939 (0.0047) [2024-06-13 09:50:00,501][73265] Fps is (10 sec: 45876.0, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3554394112. Throughput: 0: 45443.9. Samples: 72964540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:50:00,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 09:50:02,520][73497] Updated weights for policy 0, policy_version 216949 (0.0046) [2024-06-13 09:50:05,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45605.0, 300 sec: 45375.3). Total num frames: 3554623488. Throughput: 0: 45429.3. Samples: 73102440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:50:05,502][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 09:50:05,809][73497] Updated weights for policy 0, policy_version 216959 (0.0029) [2024-06-13 09:50:09,638][73497] Updated weights for policy 0, policy_version 216969 (0.0029) [2024-06-13 09:50:10,504][73265] Fps is (10 sec: 45863.7, 60 sec: 45327.1, 300 sec: 45430.5). Total num frames: 3554852864. Throughput: 0: 45486.0. Samples: 73382260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:50:10,504][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 09:50:12,771][73497] Updated weights for policy 0, policy_version 216979 (0.0036) [2024-06-13 09:50:15,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45060.8, 300 sec: 45375.3). Total num frames: 3555065856. Throughput: 0: 45317.1. Samples: 73644380. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:50:15,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 09:50:17,086][73497] Updated weights for policy 0, policy_version 216989 (0.0036) [2024-06-13 09:50:19,899][73497] Updated weights for policy 0, policy_version 216999 (0.0033) [2024-06-13 09:50:20,501][73265] Fps is (10 sec: 47525.1, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3555328000. Throughput: 0: 45388.8. Samples: 73787080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:50:20,502][73265] Avg episode reward: [(0, '0.401')] [2024-06-13 09:50:23,948][73497] Updated weights for policy 0, policy_version 217009 (0.0038) [2024-06-13 09:50:25,508][73265] Fps is (10 sec: 47483.4, 60 sec: 45324.2, 300 sec: 45485.4). Total num frames: 3555540992. Throughput: 0: 45469.5. Samples: 74060360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:50:25,508][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 09:50:25,522][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000217013_3555540992.pth... [2024-06-13 09:50:25,588][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000216347_3544629248.pth [2024-06-13 09:50:26,128][73477] Signal inference workers to stop experience collection... (1100 times) [2024-06-13 09:50:26,171][73497] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-06-13 09:50:26,182][73477] Signal inference workers to resume experience collection... (1100 times) [2024-06-13 09:50:26,189][73497] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-06-13 09:50:26,882][73497] Updated weights for policy 0, policy_version 217019 (0.0036) [2024-06-13 09:50:30,501][73265] Fps is (10 sec: 42598.9, 60 sec: 45058.9, 300 sec: 45486.4). Total num frames: 3555753984. Throughput: 0: 45747.7. Samples: 74340580. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:50:30,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 09:50:31,089][73497] Updated weights for policy 0, policy_version 217029 (0.0034) [2024-06-13 09:50:34,223][73497] Updated weights for policy 0, policy_version 217039 (0.0033) [2024-06-13 09:50:35,502][73265] Fps is (10 sec: 47543.5, 60 sec: 45875.1, 300 sec: 45430.9). Total num frames: 3556016128. Throughput: 0: 45594.2. Samples: 74471340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:50:35,502][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 09:50:38,395][73497] Updated weights for policy 0, policy_version 217049 (0.0047) [2024-06-13 09:50:40,501][73265] Fps is (10 sec: 47513.4, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3556229120. Throughput: 0: 45573.5. Samples: 74747320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:50:40,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 09:50:41,414][73497] Updated weights for policy 0, policy_version 217059 (0.0035) [2024-06-13 09:50:45,501][73265] Fps is (10 sec: 42599.3, 60 sec: 45329.2, 300 sec: 45486.4). Total num frames: 3556442112. Throughput: 0: 45674.7. Samples: 75019900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:50:45,510][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 09:50:45,854][73497] Updated weights for policy 0, policy_version 217069 (0.0034) [2024-06-13 09:50:48,767][73497] Updated weights for policy 0, policy_version 217079 (0.0033) [2024-06-13 09:50:50,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45875.3, 300 sec: 45486.8). Total num frames: 3556687872. Throughput: 0: 45589.8. Samples: 75153980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:50:50,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 09:50:52,777][73497] Updated weights for policy 0, policy_version 217089 (0.0039) [2024-06-13 09:50:55,502][73265] Fps is (10 sec: 49151.1, 60 sec: 46148.3, 300 sec: 45708.6). 
Total num frames: 3556933632. Throughput: 0: 45503.7. Samples: 75429820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:50:55,507][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 09:50:55,664][73497] Updated weights for policy 0, policy_version 217099 (0.0037) [2024-06-13 09:51:00,347][73497] Updated weights for policy 0, policy_version 217109 (0.0028) [2024-06-13 09:51:00,501][73265] Fps is (10 sec: 42598.2, 60 sec: 45329.0, 300 sec: 45486.4). Total num frames: 3557113856. Throughput: 0: 45604.9. Samples: 75696600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:51:00,502][73265] Avg episode reward: [(0, '0.323')] [2024-06-13 09:51:02,941][73497] Updated weights for policy 0, policy_version 217119 (0.0027) [2024-06-13 09:51:05,502][73265] Fps is (10 sec: 42598.3, 60 sec: 45602.1, 300 sec: 45542.1). Total num frames: 3557359616. Throughput: 0: 45292.8. Samples: 75825260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-06-13 09:51:05,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 09:51:07,630][73497] Updated weights for policy 0, policy_version 217129 (0.0039) [2024-06-13 09:51:10,330][73497] Updated weights for policy 0, policy_version 217139 (0.0034) [2024-06-13 09:51:10,501][73265] Fps is (10 sec: 49152.1, 60 sec: 45877.1, 300 sec: 45597.5). Total num frames: 3557605376. Throughput: 0: 45332.7. Samples: 76100040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:51:10,502][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 09:51:14,812][73497] Updated weights for policy 0, policy_version 217149 (0.0030) [2024-06-13 09:51:15,501][73265] Fps is (10 sec: 42598.7, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3557785600. Throughput: 0: 45259.4. Samples: 76377260. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:51:15,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 09:51:17,774][73497] Updated weights for policy 0, policy_version 217159 (0.0029) [2024-06-13 09:51:20,501][73265] Fps is (10 sec: 40960.4, 60 sec: 44783.0, 300 sec: 45430.9). Total num frames: 3558014976. Throughput: 0: 45183.8. Samples: 76504600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:51:20,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 09:51:21,907][73497] Updated weights for policy 0, policy_version 217169 (0.0036) [2024-06-13 09:51:24,429][73497] Updated weights for policy 0, policy_version 217179 (0.0033) [2024-06-13 09:51:25,501][73265] Fps is (10 sec: 49152.3, 60 sec: 45607.0, 300 sec: 45486.4). Total num frames: 3558277120. Throughput: 0: 45237.3. Samples: 76783000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:51:25,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 09:51:29,246][73497] Updated weights for policy 0, policy_version 217189 (0.0025) [2024-06-13 09:51:30,501][73265] Fps is (10 sec: 49151.7, 60 sec: 45875.1, 300 sec: 45542.0). Total num frames: 3558506496. Throughput: 0: 45399.5. Samples: 77062880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:51:30,502][73265] Avg episode reward: [(0, '0.454')] [2024-06-13 09:51:31,464][73497] Updated weights for policy 0, policy_version 217199 (0.0028) [2024-06-13 09:51:35,504][73265] Fps is (10 sec: 40949.8, 60 sec: 44508.1, 300 sec: 45375.0). Total num frames: 3558686720. Throughput: 0: 45457.1. Samples: 77199660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:51:35,505][73265] Avg episode reward: [(0, '0.397')] [2024-06-13 09:51:36,559][73497] Updated weights for policy 0, policy_version 217209 (0.0029) [2024-06-13 09:51:38,891][73497] Updated weights for policy 0, policy_version 217219 (0.0036) [2024-06-13 09:51:40,504][73265] Fps is (10 sec: 45863.9, 60 sec: 45600.2, 300 sec: 45541.6). 
Total num frames: 3558965248. Throughput: 0: 45098.1. Samples: 77459340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:51:40,504][73265] Avg episode reward: [(0, '0.391')] [2024-06-13 09:51:43,591][73497] Updated weights for policy 0, policy_version 217229 (0.0030) [2024-06-13 09:51:45,504][73265] Fps is (10 sec: 50790.2, 60 sec: 45873.2, 300 sec: 45597.5). Total num frames: 3559194624. Throughput: 0: 45556.2. Samples: 77746740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:51:45,505][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 09:51:45,971][73477] Signal inference workers to stop experience collection... (1150 times) [2024-06-13 09:51:45,972][73477] Signal inference workers to resume experience collection... (1150 times) [2024-06-13 09:51:45,982][73497] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-06-13 09:51:46,015][73497] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-06-13 09:51:46,110][73497] Updated weights for policy 0, policy_version 217239 (0.0039) [2024-06-13 09:51:50,501][73265] Fps is (10 sec: 40970.1, 60 sec: 44782.9, 300 sec: 45375.7). Total num frames: 3559374848. Throughput: 0: 45595.7. Samples: 77877060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:51:50,502][73265] Avg episode reward: [(0, '0.409')] [2024-06-13 09:51:50,766][73497] Updated weights for policy 0, policy_version 217249 (0.0034) [2024-06-13 09:51:53,009][73497] Updated weights for policy 0, policy_version 217259 (0.0040) [2024-06-13 09:51:55,502][73265] Fps is (10 sec: 42608.8, 60 sec: 44783.0, 300 sec: 45486.4). Total num frames: 3559620608. Throughput: 0: 45491.1. Samples: 78147140. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:51:55,502][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 09:51:58,142][73497] Updated weights for policy 0, policy_version 217269 (0.0043) [2024-06-13 09:52:00,279][73497] Updated weights for policy 0, policy_version 217279 (0.0034) [2024-06-13 09:52:00,502][73265] Fps is (10 sec: 52428.2, 60 sec: 46421.3, 300 sec: 45653.1). Total num frames: 3559899136. Throughput: 0: 45403.1. Samples: 78420400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:52:00,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 09:52:05,172][73497] Updated weights for policy 0, policy_version 217289 (0.0054) [2024-06-13 09:52:05,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45056.1, 300 sec: 45375.3). Total num frames: 3560062976. Throughput: 0: 45542.1. Samples: 78554000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:52:05,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 09:52:07,766][73497] Updated weights for policy 0, policy_version 217299 (0.0038) [2024-06-13 09:52:10,501][73265] Fps is (10 sec: 40960.8, 60 sec: 45056.1, 300 sec: 45486.5). Total num frames: 3560308736. Throughput: 0: 45383.2. Samples: 78825240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-13 09:52:10,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 09:52:12,571][73497] Updated weights for policy 0, policy_version 217309 (0.0035) [2024-06-13 09:52:15,024][73497] Updated weights for policy 0, policy_version 217319 (0.0037) [2024-06-13 09:52:15,501][73265] Fps is (10 sec: 49151.9, 60 sec: 46148.3, 300 sec: 45597.9). Total num frames: 3560554496. Throughput: 0: 45279.5. Samples: 79100460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 09:52:15,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 09:52:19,714][73497] Updated weights for policy 0, policy_version 217329 (0.0034) [2024-06-13 09:52:20,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 45430.9). 
Total num frames: 3560767488. Throughput: 0: 45467.0. Samples: 79245560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 09:52:20,502][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 09:52:21,853][73497] Updated weights for policy 0, policy_version 217339 (0.0030) [2024-06-13 09:52:25,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 3560980480. Throughput: 0: 45711.8. Samples: 79516260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 09:52:25,502][73265] Avg episode reward: [(0, '0.377')] [2024-06-13 09:52:25,518][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000217345_3560980480.pth... [2024-06-13 09:52:25,564][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000216680_3550085120.pth [2024-06-13 09:52:26,784][73497] Updated weights for policy 0, policy_version 217349 (0.0032) [2024-06-13 09:52:29,214][73497] Updated weights for policy 0, policy_version 217359 (0.0032) [2024-06-13 09:52:30,501][73265] Fps is (10 sec: 45874.6, 60 sec: 45329.0, 300 sec: 45597.5). Total num frames: 3561226240. Throughput: 0: 45313.1. Samples: 79785720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 09:52:30,502][73265] Avg episode reward: [(0, '0.364')] [2024-06-13 09:52:33,827][73497] Updated weights for policy 0, policy_version 217369 (0.0029) [2024-06-13 09:52:35,501][73265] Fps is (10 sec: 49152.2, 60 sec: 46423.3, 300 sec: 45597.5). Total num frames: 3561472000. Throughput: 0: 45636.0. Samples: 79930680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 09:52:35,502][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 09:52:36,461][73497] Updated weights for policy 0, policy_version 217379 (0.0035) [2024-06-13 09:52:40,502][73265] Fps is (10 sec: 44236.3, 60 sec: 45057.7, 300 sec: 45431.1). Total num frames: 3561668608. Throughput: 0: 45522.1. Samples: 80195640. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 09:52:40,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 09:52:41,109][73497] Updated weights for policy 0, policy_version 217389 (0.0030) [2024-06-13 09:52:42,998][73477] Signal inference workers to stop experience collection... (1200 times) [2024-06-13 09:52:43,049][73497] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-06-13 09:52:43,054][73477] Signal inference workers to resume experience collection... (1200 times) [2024-06-13 09:52:43,060][73497] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-06-13 09:52:44,049][73497] Updated weights for policy 0, policy_version 217399 (0.0033) [2024-06-13 09:52:45,501][73265] Fps is (10 sec: 44236.4, 60 sec: 45330.9, 300 sec: 45542.3). Total num frames: 3561914368. Throughput: 0: 45450.3. Samples: 80465660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 09:52:45,502][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 09:52:48,471][73497] Updated weights for policy 0, policy_version 217409 (0.0032) [2024-06-13 09:52:50,501][73265] Fps is (10 sec: 49153.2, 60 sec: 46421.4, 300 sec: 45542.0). Total num frames: 3562160128. Throughput: 0: 45758.3. Samples: 80613120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 09:52:50,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 09:52:51,059][73497] Updated weights for policy 0, policy_version 217419 (0.0031) [2024-06-13 09:52:55,501][73265] Fps is (10 sec: 42598.7, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3562340352. Throughput: 0: 45686.2. Samples: 80881120. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 09:52:55,502][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 09:52:55,866][73497] Updated weights for policy 0, policy_version 217429 (0.0048) [2024-06-13 09:52:58,670][73497] Updated weights for policy 0, policy_version 217439 (0.0026) [2024-06-13 09:53:00,501][73265] Fps is (10 sec: 40959.7, 60 sec: 44509.9, 300 sec: 45375.4). Total num frames: 3562569728. Throughput: 0: 45244.9. Samples: 81136480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 09:53:00,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 09:53:02,928][73497] Updated weights for policy 0, policy_version 217449 (0.0030) [2024-06-13 09:53:05,501][73265] Fps is (10 sec: 47513.2, 60 sec: 45875.2, 300 sec: 45486.4). Total num frames: 3562815488. Throughput: 0: 45234.5. Samples: 81281120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 09:53:05,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 09:53:06,027][73497] Updated weights for policy 0, policy_version 217459 (0.0034) [2024-06-13 09:53:10,159][73497] Updated weights for policy 0, policy_version 217469 (0.0028) [2024-06-13 09:53:10,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45055.9, 300 sec: 45375.4). Total num frames: 3563012096. Throughput: 0: 45153.8. Samples: 81548180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 09:53:10,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 09:53:13,411][73497] Updated weights for policy 0, policy_version 217479 (0.0036) [2024-06-13 09:53:15,501][73265] Fps is (10 sec: 40960.1, 60 sec: 44509.9, 300 sec: 45319.8). Total num frames: 3563225088. Throughput: 0: 45018.7. Samples: 81811560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 09:53:15,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 09:53:17,529][73497] Updated weights for policy 0, policy_version 217489 (0.0029) [2024-06-13 09:53:20,502][73265] Fps is (10 sec: 47513.1, 60 sec: 45328.9, 300 sec: 45430.9). 
Total num frames: 3563487232. Throughput: 0: 44824.3. Samples: 81947780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-13 09:53:20,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 09:53:20,785][73497] Updated weights for policy 0, policy_version 217499 (0.0047) [2024-06-13 09:53:25,388][73497] Updated weights for policy 0, policy_version 217509 (0.0041) [2024-06-13 09:53:25,501][73265] Fps is (10 sec: 44237.3, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 3563667456. Throughput: 0: 44861.1. Samples: 82214380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 09:53:25,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 09:53:28,283][73497] Updated weights for policy 0, policy_version 217519 (0.0030) [2024-06-13 09:53:30,504][73265] Fps is (10 sec: 42588.3, 60 sec: 44781.1, 300 sec: 45375.0). Total num frames: 3563913216. Throughput: 0: 44772.2. Samples: 82480520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 09:53:30,505][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 09:53:32,347][73497] Updated weights for policy 0, policy_version 217529 (0.0028) [2024-06-13 09:53:35,502][73265] Fps is (10 sec: 47512.7, 60 sec: 44509.7, 300 sec: 45320.8). Total num frames: 3564142592. Throughput: 0: 44548.2. Samples: 82617800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 09:53:35,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 09:53:35,603][73497] Updated weights for policy 0, policy_version 217539 (0.0031) [2024-06-13 09:53:39,379][73497] Updated weights for policy 0, policy_version 217549 (0.0032) [2024-06-13 09:53:40,502][73265] Fps is (10 sec: 47524.8, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3564388352. Throughput: 0: 44596.3. Samples: 82887960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 09:53:40,502][73265] Avg episode reward: [(0, '0.487')] [2024-06-13 09:53:42,804][73497] Updated weights for policy 0, policy_version 217559 (0.0033) [2024-06-13 09:53:45,501][73265] Fps is (10 sec: 42599.3, 60 sec: 44236.9, 300 sec: 45319.8). Total num frames: 3564568576. Throughput: 0: 45113.0. Samples: 83166560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 09:53:45,502][73265] Avg episode reward: [(0, '0.379')] [2024-06-13 09:53:46,210][73497] Updated weights for policy 0, policy_version 217569 (0.0039) [2024-06-13 09:53:49,925][73497] Updated weights for policy 0, policy_version 217579 (0.0040) [2024-06-13 09:53:50,501][73265] Fps is (10 sec: 45875.8, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 3564847104. Throughput: 0: 44920.1. Samples: 83302520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 09:53:50,502][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 09:53:53,925][73497] Updated weights for policy 0, policy_version 217589 (0.0042) [2024-06-13 09:53:55,502][73265] Fps is (10 sec: 49150.9, 60 sec: 45328.9, 300 sec: 45375.3). Total num frames: 3565060096. Throughput: 0: 44970.5. Samples: 83571860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 09:53:55,502][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 09:53:56,851][73477] Signal inference workers to stop experience collection... (1250 times) [2024-06-13 09:53:56,858][73477] Signal inference workers to resume experience collection... (1250 times) [2024-06-13 09:53:56,879][73497] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-06-13 09:53:56,879][73497] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-06-13 09:53:57,008][73497] Updated weights for policy 0, policy_version 217599 (0.0033) [2024-06-13 09:54:00,502][73265] Fps is (10 sec: 42597.6, 60 sec: 45055.9, 300 sec: 45375.9). Total num frames: 3565273088. Throughput: 0: 45017.2. 
Samples: 83837340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 09:54:00,502][73265] Avg episode reward: [(0, '0.409')] [2024-06-13 09:54:01,115][73497] Updated weights for policy 0, policy_version 217609 (0.0035) [2024-06-13 09:54:04,451][73497] Updated weights for policy 0, policy_version 217619 (0.0034) [2024-06-13 09:54:05,501][73265] Fps is (10 sec: 44237.4, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 3565502464. Throughput: 0: 45122.3. Samples: 83978280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 09:54:05,502][73265] Avg episode reward: [(0, '0.405')] [2024-06-13 09:54:08,063][73497] Updated weights for policy 0, policy_version 217629 (0.0030) [2024-06-13 09:54:10,502][73265] Fps is (10 sec: 45875.3, 60 sec: 45329.0, 300 sec: 45320.8). Total num frames: 3565731840. Throughput: 0: 45246.0. Samples: 84250460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 09:54:10,502][73265] Avg episode reward: [(0, '0.399')] [2024-06-13 09:54:11,683][73497] Updated weights for policy 0, policy_version 217639 (0.0026) [2024-06-13 09:54:15,279][73497] Updated weights for policy 0, policy_version 217649 (0.0025) [2024-06-13 09:54:15,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.1, 300 sec: 45319.8). Total num frames: 3565961216. Throughput: 0: 45347.3. Samples: 84521040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 09:54:15,502][73265] Avg episode reward: [(0, '0.383')] [2024-06-13 09:54:19,026][73497] Updated weights for policy 0, policy_version 217659 (0.0028) [2024-06-13 09:54:20,501][73265] Fps is (10 sec: 44237.7, 60 sec: 44783.1, 300 sec: 45264.3). Total num frames: 3566174208. Throughput: 0: 45352.2. Samples: 84658640. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 09:54:20,502][73265] Avg episode reward: [(0, '0.427')] [2024-06-13 09:54:22,633][73497] Updated weights for policy 0, policy_version 217669 (0.0031) [2024-06-13 09:54:25,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45602.0, 300 sec: 45264.8). Total num frames: 3566403584. Throughput: 0: 45306.3. Samples: 84926740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 09:54:25,502][73265] Avg episode reward: [(0, '0.386')] [2024-06-13 09:54:25,510][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000217676_3566403584.pth... [2024-06-13 09:54:25,557][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000217013_3555540992.pth [2024-06-13 09:54:26,287][73497] Updated weights for policy 0, policy_version 217679 (0.0030) [2024-06-13 09:54:29,752][73497] Updated weights for policy 0, policy_version 217689 (0.0025) [2024-06-13 09:54:30,501][73265] Fps is (10 sec: 45874.7, 60 sec: 45330.9, 300 sec: 45319.8). Total num frames: 3566632960. Throughput: 0: 44889.2. Samples: 85186580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 09:54:30,510][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 09:54:33,873][73497] Updated weights for policy 0, policy_version 217699 (0.0035) [2024-06-13 09:54:35,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45329.1, 300 sec: 45264.3). Total num frames: 3566862336. Throughput: 0: 45051.5. Samples: 85329840. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 09:54:35,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 09:54:36,918][73497] Updated weights for policy 0, policy_version 217709 (0.0036) [2024-06-13 09:54:40,501][73265] Fps is (10 sec: 44237.3, 60 sec: 44783.0, 300 sec: 45264.3). Total num frames: 3567075328. Throughput: 0: 45153.1. Samples: 85603740. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 09:54:40,510][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 09:54:40,923][73497] Updated weights for policy 0, policy_version 217719 (0.0027) [2024-06-13 09:54:44,571][73497] Updated weights for policy 0, policy_version 217729 (0.0035) [2024-06-13 09:54:45,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 45375.4). Total num frames: 3567321088. Throughput: 0: 45215.7. Samples: 85872040. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 09:54:45,502][73265] Avg episode reward: [(0, '0.416')] [2024-06-13 09:54:48,246][73497] Updated weights for policy 0, policy_version 217739 (0.0038) [2024-06-13 09:54:50,502][73265] Fps is (10 sec: 45874.5, 60 sec: 44782.8, 300 sec: 45319.8). Total num frames: 3567534080. Throughput: 0: 45163.0. Samples: 86010620. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 09:54:50,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 09:54:51,623][73497] Updated weights for policy 0, policy_version 217749 (0.0036) [2024-06-13 09:54:55,501][73265] Fps is (10 sec: 42598.2, 60 sec: 44783.0, 300 sec: 45264.3). Total num frames: 3567747072. Throughput: 0: 45026.3. Samples: 86276640. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 09:54:55,502][73265] Avg episode reward: [(0, '0.386')] [2024-06-13 09:54:55,572][73497] Updated weights for policy 0, policy_version 217759 (0.0039) [2024-06-13 09:54:58,782][73497] Updated weights for policy 0, policy_version 217769 (0.0041) [2024-06-13 09:55:00,502][73265] Fps is (10 sec: 44236.7, 60 sec: 45056.0, 300 sec: 45264.3). Total num frames: 3567976448. Throughput: 0: 45025.3. Samples: 86547180. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 09:55:00,502][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 09:55:03,012][73497] Updated weights for policy 0, policy_version 217779 (0.0031) [2024-06-13 09:55:05,502][73265] Fps is (10 sec: 47513.3, 60 sec: 45329.0, 300 sec: 45320.2). 
Total num frames: 3568222208. Throughput: 0: 45074.5. Samples: 86687000. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 09:55:05,502][73265] Avg episode reward: [(0, '0.407')] [2024-06-13 09:55:05,927][73497] Updated weights for policy 0, policy_version 217789 (0.0031) [2024-06-13 09:55:09,876][73497] Updated weights for policy 0, policy_version 217799 (0.0033) [2024-06-13 09:55:10,508][73265] Fps is (10 sec: 44210.3, 60 sec: 44778.5, 300 sec: 45263.3). Total num frames: 3568418816. Throughput: 0: 45076.6. Samples: 86955460. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 09:55:10,508][73265] Avg episode reward: [(0, '0.489')] [2024-06-13 09:55:11,877][73477] Signal inference workers to stop experience collection... (1300 times) [2024-06-13 09:55:11,878][73477] Signal inference workers to resume experience collection... (1300 times) [2024-06-13 09:55:11,926][73497] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-06-13 09:55:11,926][73497] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-06-13 09:55:13,195][73497] Updated weights for policy 0, policy_version 217809 (0.0026) [2024-06-13 09:55:15,501][73265] Fps is (10 sec: 44237.5, 60 sec: 45056.1, 300 sec: 45208.7). Total num frames: 3568664576. Throughput: 0: 45358.8. Samples: 87227720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 09:55:15,502][73265] Avg episode reward: [(0, '0.392')] [2024-06-13 09:55:17,441][73497] Updated weights for policy 0, policy_version 217819 (0.0037) [2024-06-13 09:55:20,501][73265] Fps is (10 sec: 47542.7, 60 sec: 45329.0, 300 sec: 45265.3). Total num frames: 3568893952. Throughput: 0: 45129.4. Samples: 87360660. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 09:55:20,502][73265] Avg episode reward: [(0, '0.413')] [2024-06-13 09:55:20,746][73497] Updated weights for policy 0, policy_version 217829 (0.0033) [2024-06-13 09:55:24,844][73497] Updated weights for policy 0, policy_version 217839 (0.0036) [2024-06-13 09:55:25,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45056.1, 300 sec: 45264.3). Total num frames: 3569106944. Throughput: 0: 45129.8. Samples: 87634580. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 09:55:25,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 09:55:27,755][73497] Updated weights for policy 0, policy_version 217849 (0.0033) [2024-06-13 09:55:30,501][73265] Fps is (10 sec: 42598.2, 60 sec: 44782.9, 300 sec: 45097.7). Total num frames: 3569319936. Throughput: 0: 45167.9. Samples: 87904600. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 09:55:30,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 09:55:32,039][73497] Updated weights for policy 0, policy_version 217859 (0.0028) [2024-06-13 09:55:34,615][73497] Updated weights for policy 0, policy_version 217869 (0.0032) [2024-06-13 09:55:35,502][73265] Fps is (10 sec: 49151.4, 60 sec: 45602.1, 300 sec: 45319.8). Total num frames: 3569598464. Throughput: 0: 45252.0. Samples: 88046960. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-13 09:55:35,502][73265] Avg episode reward: [(0, '0.346')] [2024-06-13 09:55:39,178][73497] Updated weights for policy 0, policy_version 217879 (0.0032) [2024-06-13 09:55:40,501][73265] Fps is (10 sec: 50790.6, 60 sec: 45875.2, 300 sec: 45375.3). Total num frames: 3569827840. Throughput: 0: 45570.2. Samples: 88327300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-13 09:55:40,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 09:55:41,945][73497] Updated weights for policy 0, policy_version 217889 (0.0037) [2024-06-13 09:55:45,501][73265] Fps is (10 sec: 40960.5, 60 sec: 44783.0, 300 sec: 45153.2). 
Total num frames: 3570008064. Throughput: 0: 45529.9. Samples: 88596020. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-13 09:55:45,502][73265] Avg episode reward: [(0, '0.368')] [2024-06-13 09:55:46,532][73497] Updated weights for policy 0, policy_version 217899 (0.0047) [2024-06-13 09:55:49,258][73497] Updated weights for policy 0, policy_version 217909 (0.0034) [2024-06-13 09:55:50,504][73265] Fps is (10 sec: 44225.8, 60 sec: 45600.3, 300 sec: 45208.4). Total num frames: 3570270208. Throughput: 0: 45109.2. Samples: 88717020. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-13 09:55:50,504][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 09:55:53,569][73497] Updated weights for policy 0, policy_version 217919 (0.0026) [2024-06-13 09:55:55,502][73265] Fps is (10 sec: 50789.8, 60 sec: 46148.2, 300 sec: 45430.9). Total num frames: 3570515968. Throughput: 0: 45522.1. Samples: 89003680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-13 09:55:55,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 09:55:56,170][73497] Updated weights for policy 0, policy_version 217929 (0.0031) [2024-06-13 09:56:00,497][73497] Updated weights for policy 0, policy_version 217939 (0.0032) [2024-06-13 09:56:00,501][73265] Fps is (10 sec: 44248.1, 60 sec: 45602.3, 300 sec: 45264.3). Total num frames: 3570712576. Throughput: 0: 45808.9. Samples: 89289120. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-13 09:56:00,502][73265] Avg episode reward: [(0, '0.399')] [2024-06-13 09:56:03,204][73497] Updated weights for policy 0, policy_version 217949 (0.0037) [2024-06-13 09:56:05,502][73265] Fps is (10 sec: 40959.8, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 3570925568. Throughput: 0: 45593.7. Samples: 89412380. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-13 09:56:05,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 09:56:07,818][73497] Updated weights for policy 0, policy_version 217959 (0.0037) [2024-06-13 09:56:10,501][73265] Fps is (10 sec: 47513.3, 60 sec: 46153.0, 300 sec: 45430.9). Total num frames: 3571187712. Throughput: 0: 45549.3. Samples: 89684300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-13 09:56:10,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 09:56:10,545][73497] Updated weights for policy 0, policy_version 217969 (0.0034) [2024-06-13 09:56:14,999][73497] Updated weights for policy 0, policy_version 217979 (0.0048) [2024-06-13 09:56:15,501][73265] Fps is (10 sec: 47514.4, 60 sec: 45602.1, 300 sec: 45375.3). Total num frames: 3571400704. Throughput: 0: 45663.2. Samples: 89959440. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-13 09:56:15,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 09:56:18,040][73497] Updated weights for policy 0, policy_version 217989 (0.0031) [2024-06-13 09:56:18,846][73477] Signal inference workers to stop experience collection... (1350 times) [2024-06-13 09:56:18,846][73477] Signal inference workers to resume experience collection... (1350 times) [2024-06-13 09:56:18,856][73497] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-06-13 09:56:18,856][73497] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-06-13 09:56:20,501][73265] Fps is (10 sec: 40960.0, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 3571597312. Throughput: 0: 45342.8. Samples: 90087380. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-13 09:56:20,502][73265] Avg episode reward: [(0, '0.387')] [2024-06-13 09:56:22,353][73497] Updated weights for policy 0, policy_version 217999 (0.0035) [2024-06-13 09:56:25,157][73497] Updated weights for policy 0, policy_version 218009 (0.0044) [2024-06-13 09:56:25,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45875.2, 300 sec: 45264.3). Total num frames: 3571859456. Throughput: 0: 45080.0. Samples: 90355900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-13 09:56:25,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 09:56:25,518][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000218009_3571859456.pth... [2024-06-13 09:56:25,559][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000217345_3560980480.pth [2024-06-13 09:56:29,713][73497] Updated weights for policy 0, policy_version 218019 (0.0034) [2024-06-13 09:56:30,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45875.2, 300 sec: 45375.7). Total num frames: 3572072448. Throughput: 0: 45352.0. Samples: 90636860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-13 09:56:30,502][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 09:56:32,386][73497] Updated weights for policy 0, policy_version 218029 (0.0036) [2024-06-13 09:56:35,501][73265] Fps is (10 sec: 40960.3, 60 sec: 44510.0, 300 sec: 45098.0). Total num frames: 3572269056. Throughput: 0: 45519.0. Samples: 90765260. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-13 09:56:35,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 09:56:36,883][73497] Updated weights for policy 0, policy_version 218039 (0.0021) [2024-06-13 09:56:39,730][73497] Updated weights for policy 0, policy_version 218049 (0.0034) [2024-06-13 09:56:40,502][73265] Fps is (10 sec: 45874.8, 60 sec: 45055.9, 300 sec: 45209.1). Total num frames: 3572531200. Throughput: 0: 45160.9. Samples: 91035920. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-13 09:56:40,502][73265] Avg episode reward: [(0, '0.411')] [2024-06-13 09:56:43,939][73497] Updated weights for policy 0, policy_version 218059 (0.0031) [2024-06-13 09:56:45,504][73265] Fps is (10 sec: 49140.2, 60 sec: 45873.4, 300 sec: 45375.0). Total num frames: 3572760576. Throughput: 0: 44851.8. Samples: 91307560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-13 09:56:45,504][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 09:56:46,826][73497] Updated weights for policy 0, policy_version 218069 (0.0041) [2024-06-13 09:56:50,501][73265] Fps is (10 sec: 40960.2, 60 sec: 44511.7, 300 sec: 45153.2). Total num frames: 3572940800. Throughput: 0: 45062.3. Samples: 91440180. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-13 09:56:50,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 09:56:51,603][73497] Updated weights for policy 0, policy_version 218079 (0.0030) [2024-06-13 09:56:54,203][73497] Updated weights for policy 0, policy_version 218089 (0.0036) [2024-06-13 09:56:55,501][73265] Fps is (10 sec: 44247.2, 60 sec: 44783.0, 300 sec: 45097.7). Total num frames: 3573202944. Throughput: 0: 44884.4. Samples: 91704100. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-13 09:56:55,511][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 09:56:58,605][73497] Updated weights for policy 0, policy_version 218099 (0.0037) [2024-06-13 09:57:00,502][73265] Fps is (10 sec: 49151.7, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 3573432320. Throughput: 0: 44888.8. Samples: 91979440. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-13 09:57:00,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 09:57:01,854][73497] Updated weights for policy 0, policy_version 218109 (0.0049) [2024-06-13 09:57:05,502][73265] Fps is (10 sec: 42597.9, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 3573628928. Throughput: 0: 45182.5. Samples: 92120600. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-13 09:57:05,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 09:57:05,830][73497] Updated weights for policy 0, policy_version 218119 (0.0035) [2024-06-13 09:57:08,772][73497] Updated weights for policy 0, policy_version 218129 (0.0023) [2024-06-13 09:57:10,501][73265] Fps is (10 sec: 44237.3, 60 sec: 44783.0, 300 sec: 45153.2). Total num frames: 3573874688. Throughput: 0: 45042.3. Samples: 92382800. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-13 09:57:10,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 09:57:13,097][73497] Updated weights for policy 0, policy_version 218139 (0.0035) [2024-06-13 09:57:15,501][73265] Fps is (10 sec: 49152.7, 60 sec: 45329.1, 300 sec: 45264.3). Total num frames: 3574120448. Throughput: 0: 44826.2. Samples: 92654040. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-13 09:57:15,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 09:57:16,141][73497] Updated weights for policy 0, policy_version 218149 (0.0047) [2024-06-13 09:57:20,502][73265] Fps is (10 sec: 42597.9, 60 sec: 45055.9, 300 sec: 45153.2). Total num frames: 3574300672. Throughput: 0: 45148.7. Samples: 92796960. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-13 09:57:20,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 09:57:20,539][73497] Updated weights for policy 0, policy_version 218159 (0.0039) [2024-06-13 09:57:23,315][73497] Updated weights for policy 0, policy_version 218169 (0.0029) [2024-06-13 09:57:25,501][73265] Fps is (10 sec: 40959.8, 60 sec: 44509.9, 300 sec: 45097.7). Total num frames: 3574530048. Throughput: 0: 45076.9. Samples: 93064380. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-13 09:57:25,502][73265] Avg episode reward: [(0, '0.413')] [2024-06-13 09:57:27,486][73497] Updated weights for policy 0, policy_version 218179 (0.0034) [2024-06-13 09:57:30,501][73265] Fps is (10 sec: 47514.1, 60 sec: 45056.0, 300 sec: 45097.6). 
Total num frames: 3574775808. Throughput: 0: 45149.9. Samples: 93339200. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-13 09:57:30,502][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 09:57:30,770][73497] Updated weights for policy 0, policy_version 218189 (0.0032) [2024-06-13 09:57:34,151][73477] Signal inference workers to stop experience collection... (1400 times) [2024-06-13 09:57:34,152][73477] Signal inference workers to resume experience collection... (1400 times) [2024-06-13 09:57:34,200][73497] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-06-13 09:57:34,200][73497] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-06-13 09:57:34,739][73497] Updated weights for policy 0, policy_version 218199 (0.0030) [2024-06-13 09:57:35,502][73265] Fps is (10 sec: 45874.7, 60 sec: 45328.9, 300 sec: 45153.2). Total num frames: 3574988800. Throughput: 0: 45336.8. Samples: 93480340. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-13 09:57:35,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 09:57:37,834][73497] Updated weights for policy 0, policy_version 218209 (0.0023) [2024-06-13 09:57:40,501][73265] Fps is (10 sec: 42598.5, 60 sec: 44510.0, 300 sec: 45042.1). Total num frames: 3575201792. Throughput: 0: 45289.8. Samples: 93742140. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-13 09:57:40,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 09:57:42,249][73497] Updated weights for policy 0, policy_version 218219 (0.0045) [2024-06-13 09:57:45,465][73497] Updated weights for policy 0, policy_version 218229 (0.0038) [2024-06-13 09:57:45,501][73265] Fps is (10 sec: 47514.5, 60 sec: 45057.8, 300 sec: 45097.7). Total num frames: 3575463936. Throughput: 0: 45019.2. Samples: 94005300. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-13 09:57:45,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 09:57:49,405][73497] Updated weights for policy 0, policy_version 218239 (0.0034) [2024-06-13 09:57:50,501][73265] Fps is (10 sec: 47513.4, 60 sec: 45602.2, 300 sec: 45208.7). Total num frames: 3575676928. Throughput: 0: 45161.0. Samples: 94152840. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-13 09:57:50,510][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 09:57:52,460][73497] Updated weights for policy 0, policy_version 218249 (0.0029) [2024-06-13 09:57:55,501][73265] Fps is (10 sec: 42598.6, 60 sec: 44783.0, 300 sec: 45153.2). Total num frames: 3575889920. Throughput: 0: 45198.7. Samples: 94416740. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-13 09:57:55,508][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 09:57:56,378][73497] Updated weights for policy 0, policy_version 218259 (0.0031) [2024-06-13 09:58:00,020][73497] Updated weights for policy 0, policy_version 218269 (0.0031) [2024-06-13 09:58:00,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45056.1, 300 sec: 45153.2). Total num frames: 3576135680. Throughput: 0: 45344.0. Samples: 94694520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 09:58:00,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 09:58:03,928][73497] Updated weights for policy 0, policy_version 218279 (0.0026) [2024-06-13 09:58:05,501][73265] Fps is (10 sec: 49151.6, 60 sec: 45875.3, 300 sec: 45319.8). Total num frames: 3576381440. Throughput: 0: 45206.3. Samples: 94831240. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 09:58:05,502][73265] Avg episode reward: [(0, '0.478')] [2024-06-13 09:58:07,052][73497] Updated weights for policy 0, policy_version 218289 (0.0032) [2024-06-13 09:58:10,501][73265] Fps is (10 sec: 40959.9, 60 sec: 44509.8, 300 sec: 45153.2). Total num frames: 3576545280. Throughput: 0: 45191.1. Samples: 95097980. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 09:58:10,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 09:58:11,359][73497] Updated weights for policy 0, policy_version 218299 (0.0022) [2024-06-13 09:58:14,335][73497] Updated weights for policy 0, policy_version 218309 (0.0043) [2024-06-13 09:58:15,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45056.0, 300 sec: 45208.7). Total num frames: 3576823808. Throughput: 0: 44952.0. Samples: 95362040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 09:58:15,502][73265] Avg episode reward: [(0, '0.402')] [2024-06-13 09:58:18,653][73497] Updated weights for policy 0, policy_version 218319 (0.0024) [2024-06-13 09:58:20,501][73265] Fps is (10 sec: 49152.1, 60 sec: 45602.2, 300 sec: 45319.8). Total num frames: 3577036800. Throughput: 0: 44893.0. Samples: 95500520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 09:58:20,502][73265] Avg episode reward: [(0, '0.488')] [2024-06-13 09:58:21,869][73497] Updated weights for policy 0, policy_version 218329 (0.0037) [2024-06-13 09:58:25,501][73265] Fps is (10 sec: 42598.2, 60 sec: 45329.0, 300 sec: 45209.1). Total num frames: 3577249792. Throughput: 0: 45250.5. Samples: 95778420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 09:58:25,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 09:58:25,590][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000218339_3577266176.pth... [2024-06-13 09:58:25,593][73497] Updated weights for policy 0, policy_version 218339 (0.0035) [2024-06-13 09:58:25,641][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000217676_3566403584.pth [2024-06-13 09:58:29,267][73497] Updated weights for policy 0, policy_version 218349 (0.0040) [2024-06-13 09:58:30,502][73265] Fps is (10 sec: 44236.2, 60 sec: 45055.9, 300 sec: 45208.7). Total num frames: 3577479168. Throughput: 0: 45192.7. Samples: 96038980. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 09:58:30,502][73265] Avg episode reward: [(0, '0.474')] [2024-06-13 09:58:33,183][73497] Updated weights for policy 0, policy_version 218359 (0.0043) [2024-06-13 09:58:35,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45602.2, 300 sec: 45208.7). Total num frames: 3577724928. Throughput: 0: 45124.4. Samples: 96183440. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 09:58:35,502][73265] Avg episode reward: [(0, '0.408')] [2024-06-13 09:58:36,297][73497] Updated weights for policy 0, policy_version 218369 (0.0031) [2024-06-13 09:58:40,501][73265] Fps is (10 sec: 42598.7, 60 sec: 45055.9, 300 sec: 45208.7). Total num frames: 3577905152. Throughput: 0: 45066.1. Samples: 96444720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 09:58:40,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 09:58:40,777][73497] Updated weights for policy 0, policy_version 218379 (0.0035) [2024-06-13 09:58:43,579][73497] Updated weights for policy 0, policy_version 218389 (0.0043) [2024-06-13 09:58:45,501][73265] Fps is (10 sec: 40960.2, 60 sec: 44509.8, 300 sec: 45042.1). Total num frames: 3578134528. Throughput: 0: 44659.1. Samples: 96704180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 09:58:45,502][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 09:58:47,841][73497] Updated weights for policy 0, policy_version 218399 (0.0038) [2024-06-13 09:58:49,050][73477] Signal inference workers to stop experience collection... (1450 times) [2024-06-13 09:58:49,050][73477] Signal inference workers to resume experience collection... (1450 times) [2024-06-13 09:58:49,062][73497] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-06-13 09:58:49,066][73497] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-06-13 09:58:50,501][73265] Fps is (10 sec: 47513.7, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 3578380288. Throughput: 0: 44746.6. 
Samples: 96844840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 09:58:50,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 09:58:50,997][73497] Updated weights for policy 0, policy_version 218409 (0.0045) [2024-06-13 09:58:54,766][73497] Updated weights for policy 0, policy_version 218419 (0.0026) [2024-06-13 09:58:55,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 3578593280. Throughput: 0: 44940.6. Samples: 97120300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 09:58:55,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 09:58:58,373][73497] Updated weights for policy 0, policy_version 218429 (0.0043) [2024-06-13 09:59:00,502][73265] Fps is (10 sec: 44236.5, 60 sec: 44782.9, 300 sec: 45153.2). Total num frames: 3578822656. Throughput: 0: 45050.1. Samples: 97389300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-13 09:59:00,502][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 09:59:02,640][73497] Updated weights for policy 0, policy_version 218439 (0.0041) [2024-06-13 09:59:05,501][73265] Fps is (10 sec: 45875.0, 60 sec: 44509.9, 300 sec: 45153.2). Total num frames: 3579052032. Throughput: 0: 44941.4. Samples: 97522880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-13 09:59:05,502][73265] Avg episode reward: [(0, '0.398')] [2024-06-13 09:59:05,616][73497] Updated weights for policy 0, policy_version 218449 (0.0042) [2024-06-13 09:59:09,902][73497] Updated weights for policy 0, policy_version 218459 (0.0034) [2024-06-13 09:59:10,501][73265] Fps is (10 sec: 42598.8, 60 sec: 45056.0, 300 sec: 45042.1). Total num frames: 3579248640. Throughput: 0: 44726.3. Samples: 97791100. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-13 09:59:10,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 09:59:12,697][73497] Updated weights for policy 0, policy_version 218469 (0.0039) [2024-06-13 09:59:15,501][73265] Fps is (10 sec: 44236.7, 60 sec: 44509.9, 300 sec: 45153.2). Total num frames: 3579494400. Throughput: 0: 44844.1. Samples: 98056960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-13 09:59:15,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 09:59:17,194][73497] Updated weights for policy 0, policy_version 218479 (0.0034) [2024-06-13 09:59:19,934][73497] Updated weights for policy 0, policy_version 218489 (0.0026) [2024-06-13 09:59:20,504][73265] Fps is (10 sec: 49139.8, 60 sec: 45054.1, 300 sec: 45208.4). Total num frames: 3579740160. Throughput: 0: 44699.8. Samples: 98195040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-13 09:59:20,504][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 09:59:24,137][73497] Updated weights for policy 0, policy_version 218499 (0.0041) [2024-06-13 09:59:25,502][73265] Fps is (10 sec: 45874.5, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 3579953152. Throughput: 0: 45184.8. Samples: 98478040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-13 09:59:25,502][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 09:59:27,194][73497] Updated weights for policy 0, policy_version 218509 (0.0033) [2024-06-13 09:59:30,501][73265] Fps is (10 sec: 44247.9, 60 sec: 45056.1, 300 sec: 45153.2). Total num frames: 3580182528. Throughput: 0: 45199.6. Samples: 98738160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-13 09:59:30,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 09:59:31,599][73497] Updated weights for policy 0, policy_version 218519 (0.0047) [2024-06-13 09:59:34,490][73497] Updated weights for policy 0, policy_version 218529 (0.0035) [2024-06-13 09:59:35,501][73265] Fps is (10 sec: 45875.8, 60 sec: 44782.9, 300 sec: 45208.7). 
Total num frames: 3580411904. Throughput: 0: 45237.4. Samples: 98880520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-13 09:59:35,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 09:59:38,814][73497] Updated weights for policy 0, policy_version 218539 (0.0036) [2024-06-13 09:59:40,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45056.1, 300 sec: 45042.1). Total num frames: 3580608512. Throughput: 0: 45030.6. Samples: 99146680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-13 09:59:40,502][73265] Avg episode reward: [(0, '0.391')] [2024-06-13 09:59:41,643][73497] Updated weights for policy 0, policy_version 218549 (0.0032) [2024-06-13 09:59:45,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45056.0, 300 sec: 45097.7). Total num frames: 3580837888. Throughput: 0: 45047.6. Samples: 99416440. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-13 09:59:45,502][73265] Avg episode reward: [(0, '0.542')] [2024-06-13 09:59:45,609][73477] Saving new best policy, reward=0.542! [2024-06-13 09:59:45,911][73497] Updated weights for policy 0, policy_version 218559 (0.0032) [2024-06-13 09:59:48,994][73497] Updated weights for policy 0, policy_version 218569 (0.0035) [2024-06-13 09:59:50,502][73265] Fps is (10 sec: 47512.9, 60 sec: 45055.9, 300 sec: 45208.7). Total num frames: 3581083648. Throughput: 0: 45215.4. Samples: 99557580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-13 09:59:50,502][73265] Avg episode reward: [(0, '0.387')] [2024-06-13 09:59:53,044][73497] Updated weights for policy 0, policy_version 218579 (0.0039) [2024-06-13 09:59:55,501][73265] Fps is (10 sec: 47513.7, 60 sec: 45329.0, 300 sec: 45208.8). Total num frames: 3581313024. Throughput: 0: 45326.7. Samples: 99830800. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-13 09:59:55,502][73265] Avg episode reward: [(0, '0.384')] [2024-06-13 09:59:56,190][73497] Updated weights for policy 0, policy_version 218589 (0.0040) [2024-06-13 10:00:00,501][73265] Fps is (10 sec: 42598.5, 60 sec: 44782.9, 300 sec: 45042.1). Total num frames: 3581509632. Throughput: 0: 45462.1. Samples: 100102760. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-13 10:00:00,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 10:00:00,538][73497] Updated weights for policy 0, policy_version 218599 (0.0036) [2024-06-13 10:00:03,431][73497] Updated weights for policy 0, policy_version 218609 (0.0030) [2024-06-13 10:00:05,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45329.1, 300 sec: 45265.2). Total num frames: 3581771776. Throughput: 0: 45281.2. Samples: 100232580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-13 10:00:05,502][73265] Avg episode reward: [(0, '0.397')] [2024-06-13 10:00:07,691][73497] Updated weights for policy 0, policy_version 218619 (0.0036) [2024-06-13 10:00:10,501][73265] Fps is (10 sec: 49152.7, 60 sec: 45875.3, 300 sec: 45208.7). Total num frames: 3582001152. Throughput: 0: 45242.9. Samples: 100513960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-13 10:00:10,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 10:00:10,530][73497] Updated weights for policy 0, policy_version 218629 (0.0031) [2024-06-13 10:00:14,655][73497] Updated weights for policy 0, policy_version 218639 (0.0037) [2024-06-13 10:00:15,501][73265] Fps is (10 sec: 42598.1, 60 sec: 45056.0, 300 sec: 45097.7). Total num frames: 3582197760. Throughput: 0: 45363.5. Samples: 100779520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 19.0) [2024-06-13 10:00:15,502][73265] Avg episode reward: [(0, '0.380')] [2024-06-13 10:00:16,141][73477] Signal inference workers to stop experience collection... 
(1500 times) [2024-06-13 10:00:16,190][73497] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-06-13 10:00:16,192][73477] Signal inference workers to resume experience collection... (1500 times) [2024-06-13 10:00:16,201][73497] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-06-13 10:00:18,054][73497] Updated weights for policy 0, policy_version 218649 (0.0031) [2024-06-13 10:00:20,501][73265] Fps is (10 sec: 45874.6, 60 sec: 45330.9, 300 sec: 45264.3). Total num frames: 3582459904. Throughput: 0: 45280.4. Samples: 100918140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 19.0) [2024-06-13 10:00:20,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 10:00:22,198][73497] Updated weights for policy 0, policy_version 218659 (0.0036) [2024-06-13 10:00:24,929][73497] Updated weights for policy 0, policy_version 218669 (0.0031) [2024-06-13 10:00:25,501][73265] Fps is (10 sec: 49152.2, 60 sec: 45602.3, 300 sec: 45319.8). Total num frames: 3582689280. Throughput: 0: 45460.9. Samples: 101192420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 19.0) [2024-06-13 10:00:25,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:00:25,524][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000218670_3582689280.pth... [2024-06-13 10:00:25,570][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000218009_3571859456.pth [2024-06-13 10:00:29,343][73497] Updated weights for policy 0, policy_version 218679 (0.0028) [2024-06-13 10:00:30,501][73265] Fps is (10 sec: 40960.2, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 3582869504. Throughput: 0: 45499.5. Samples: 101463920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 19.0) [2024-06-13 10:00:30,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 10:00:32,337][73497] Updated weights for policy 0, policy_version 218689 (0.0041) [2024-06-13 10:00:35,502][73265] Fps is (10 sec: 42597.7, 60 sec: 45055.9, 300 sec: 45042.1). 
Total num frames: 3583115264. Throughput: 0: 45116.9. Samples: 101587840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 19.0) [2024-06-13 10:00:35,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 10:00:36,376][73497] Updated weights for policy 0, policy_version 218699 (0.0040) [2024-06-13 10:00:39,802][73497] Updated weights for policy 0, policy_version 218709 (0.0039) [2024-06-13 10:00:40,501][73265] Fps is (10 sec: 49152.1, 60 sec: 45875.2, 300 sec: 45264.3). Total num frames: 3583361024. Throughput: 0: 45279.1. Samples: 101868360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 19.0) [2024-06-13 10:00:40,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 10:00:43,542][73497] Updated weights for policy 0, policy_version 218719 (0.0043) [2024-06-13 10:00:45,501][73265] Fps is (10 sec: 44237.4, 60 sec: 45329.1, 300 sec: 45042.5). Total num frames: 3583557632. Throughput: 0: 45429.0. Samples: 102147060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 19.0) [2024-06-13 10:00:45,502][73265] Avg episode reward: [(0, '0.479')] [2024-06-13 10:00:47,024][73497] Updated weights for policy 0, policy_version 218729 (0.0028) [2024-06-13 10:00:50,502][73265] Fps is (10 sec: 44236.4, 60 sec: 45329.1, 300 sec: 45042.1). Total num frames: 3583803392. Throughput: 0: 45390.1. Samples: 102275140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 19.0) [2024-06-13 10:00:50,502][73265] Avg episode reward: [(0, '0.398')] [2024-06-13 10:00:50,566][73497] Updated weights for policy 0, policy_version 218739 (0.0038) [2024-06-13 10:00:53,975][73497] Updated weights for policy 0, policy_version 218749 (0.0032) [2024-06-13 10:00:55,501][73265] Fps is (10 sec: 49152.1, 60 sec: 45602.1, 300 sec: 45208.7). Total num frames: 3584049152. Throughput: 0: 45157.3. Samples: 102546040. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 19.0) [2024-06-13 10:00:55,502][73265] Avg episode reward: [(0, '0.384')] [2024-06-13 10:00:58,023][73497] Updated weights for policy 0, policy_version 218759 (0.0032) [2024-06-13 10:01:00,502][73265] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 3584229376. Throughput: 0: 45381.7. Samples: 102821700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 19.0) [2024-06-13 10:01:00,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 10:01:01,567][73497] Updated weights for policy 0, policy_version 218769 (0.0026) [2024-06-13 10:01:05,054][73497] Updated weights for policy 0, policy_version 218779 (0.0036) [2024-06-13 10:01:05,502][73265] Fps is (10 sec: 42597.9, 60 sec: 45055.9, 300 sec: 45042.1). Total num frames: 3584475136. Throughput: 0: 44941.3. Samples: 102940500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 19.0) [2024-06-13 10:01:05,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 10:01:08,706][73497] Updated weights for policy 0, policy_version 218789 (0.0044) [2024-06-13 10:01:10,501][73265] Fps is (10 sec: 49152.6, 60 sec: 45329.0, 300 sec: 45153.2). Total num frames: 3584720896. Throughput: 0: 45111.5. Samples: 103222440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 19.0) [2024-06-13 10:01:10,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 10:01:12,142][73497] Updated weights for policy 0, policy_version 218799 (0.0029) [2024-06-13 10:01:15,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45602.1, 300 sec: 45208.7). Total num frames: 3584933888. Throughput: 0: 45270.2. Samples: 103501080. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 19.0) [2024-06-13 10:01:15,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 10:01:15,905][73497] Updated weights for policy 0, policy_version 218809 (0.0033) [2024-06-13 10:01:19,205][73497] Updated weights for policy 0, policy_version 218819 (0.0029) [2024-06-13 10:01:20,501][73265] Fps is (10 sec: 42598.5, 60 sec: 44783.0, 300 sec: 45042.1). Total num frames: 3585146880. Throughput: 0: 45539.7. Samples: 103637120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:01:20,502][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 10:01:23,029][73497] Updated weights for policy 0, policy_version 218829 (0.0037) [2024-06-13 10:01:25,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 3585392640. Throughput: 0: 45261.8. Samples: 103905140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:01:25,503][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 10:01:26,621][73497] Updated weights for policy 0, policy_version 218839 (0.0026) [2024-06-13 10:01:30,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.2, 300 sec: 45208.7). Total num frames: 3585605632. Throughput: 0: 45260.9. Samples: 104183800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:01:30,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 10:01:30,595][73497] Updated weights for policy 0, policy_version 218849 (0.0040) [2024-06-13 10:01:30,763][73477] Signal inference workers to stop experience collection... (1550 times) [2024-06-13 10:01:30,763][73477] Signal inference workers to resume experience collection... 
(1550 times) [2024-06-13 10:01:30,809][73497] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-06-13 10:01:30,810][73497] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-06-13 10:01:33,658][73497] Updated weights for policy 0, policy_version 218859 (0.0031) [2024-06-13 10:01:35,501][73265] Fps is (10 sec: 40959.9, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 3585802240. Throughput: 0: 45321.0. Samples: 104314580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:01:35,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 10:01:37,727][73497] Updated weights for policy 0, policy_version 218869 (0.0039) [2024-06-13 10:01:40,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45329.1, 300 sec: 45153.6). Total num frames: 3586080768. Throughput: 0: 45192.5. Samples: 104579700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:01:40,502][73265] Avg episode reward: [(0, '0.411')] [2024-06-13 10:01:41,044][73497] Updated weights for policy 0, policy_version 218879 (0.0034) [2024-06-13 10:01:44,579][73497] Updated weights for policy 0, policy_version 218889 (0.0042) [2024-06-13 10:01:45,501][73265] Fps is (10 sec: 49151.5, 60 sec: 45602.1, 300 sec: 45264.3). Total num frames: 3586293760. Throughput: 0: 45241.3. Samples: 104857560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:01:45,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 10:01:48,121][73497] Updated weights for policy 0, policy_version 218899 (0.0037) [2024-06-13 10:01:50,501][73265] Fps is (10 sec: 40959.8, 60 sec: 44783.0, 300 sec: 45042.1). Total num frames: 3586490368. Throughput: 0: 45620.5. Samples: 104993420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:01:50,502][73265] Avg episode reward: [(0, '0.391')] [2024-06-13 10:01:52,461][73497] Updated weights for policy 0, policy_version 218909 (0.0038) [2024-06-13 10:01:55,327][73497] Updated weights for policy 0, policy_version 218919 (0.0036) [2024-06-13 10:01:55,502][73265] Fps is (10 sec: 47513.3, 60 sec: 45328.9, 300 sec: 45208.7). Total num frames: 3586768896. Throughput: 0: 45190.9. Samples: 105256040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:01:55,511][73265] Avg episode reward: [(0, '0.408')] [2024-06-13 10:01:59,743][73497] Updated weights for policy 0, policy_version 218929 (0.0029) [2024-06-13 10:02:00,501][73265] Fps is (10 sec: 49151.9, 60 sec: 45875.3, 300 sec: 45264.3). Total num frames: 3586981888. Throughput: 0: 44926.7. Samples: 105522780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:02:00,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 10:02:02,892][73497] Updated weights for policy 0, policy_version 218939 (0.0039) [2024-06-13 10:02:05,502][73265] Fps is (10 sec: 40960.0, 60 sec: 45056.0, 300 sec: 45097.6). Total num frames: 3587178496. Throughput: 0: 44893.6. Samples: 105657340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:02:05,502][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 10:02:06,834][73497] Updated weights for policy 0, policy_version 218949 (0.0036) [2024-06-13 10:02:10,002][73497] Updated weights for policy 0, policy_version 218959 (0.0024) [2024-06-13 10:02:10,502][73265] Fps is (10 sec: 44236.5, 60 sec: 45055.9, 300 sec: 45097.6). Total num frames: 3587424256. Throughput: 0: 45114.5. Samples: 105935300. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:02:10,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:02:13,653][73497] Updated weights for policy 0, policy_version 218969 (0.0041) [2024-06-13 10:02:15,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45329.0, 300 sec: 45264.3). Total num frames: 3587653632. Throughput: 0: 45079.0. Samples: 106212360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:02:15,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 10:02:17,427][73497] Updated weights for policy 0, policy_version 218979 (0.0028) [2024-06-13 10:02:20,501][73265] Fps is (10 sec: 42598.9, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 3587850240. Throughput: 0: 45250.2. Samples: 106350840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:02:20,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 10:02:21,519][73497] Updated weights for policy 0, policy_version 218989 (0.0030) [2024-06-13 10:02:24,589][73497] Updated weights for policy 0, policy_version 218999 (0.0040) [2024-06-13 10:02:25,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45055.9, 300 sec: 45153.2). Total num frames: 3588096000. Throughput: 0: 45165.7. Samples: 106612160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:02:25,502][73265] Avg episode reward: [(0, '0.389')] [2024-06-13 10:02:25,517][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000219000_3588096000.pth... [2024-06-13 10:02:25,602][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000218339_3577266176.pth [2024-06-13 10:02:28,895][73497] Updated weights for policy 0, policy_version 219009 (0.0029) [2024-06-13 10:02:30,503][73265] Fps is (10 sec: 49142.2, 60 sec: 45600.6, 300 sec: 45264.0). Total num frames: 3588341760. Throughput: 0: 45045.2. Samples: 106884680. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 10:02:30,504][73265] Avg episode reward: [(0, '0.416')] [2024-06-13 10:02:31,639][73497] Updated weights for policy 0, policy_version 219019 (0.0041) [2024-06-13 10:02:35,501][73265] Fps is (10 sec: 40960.2, 60 sec: 45056.0, 300 sec: 45097.6). Total num frames: 3588505600. Throughput: 0: 45065.3. Samples: 107021360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 10:02:35,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 10:02:36,089][73497] Updated weights for policy 0, policy_version 219029 (0.0033) [2024-06-13 10:02:36,836][73477] Signal inference workers to stop experience collection... (1600 times) [2024-06-13 10:02:36,837][73477] Signal inference workers to resume experience collection... (1600 times) [2024-06-13 10:02:36,857][73497] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-06-13 10:02:36,858][73497] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-06-13 10:02:38,962][73497] Updated weights for policy 0, policy_version 219039 (0.0038) [2024-06-13 10:02:40,504][73265] Fps is (10 sec: 42596.3, 60 sec: 44781.1, 300 sec: 45097.3). Total num frames: 3588767744. Throughput: 0: 45183.4. Samples: 107289400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 10:02:40,505][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 10:02:42,863][73497] Updated weights for policy 0, policy_version 219049 (0.0032) [2024-06-13 10:02:45,501][73265] Fps is (10 sec: 50790.3, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 3589013504. Throughput: 0: 45544.9. Samples: 107572300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 10:02:45,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 10:02:46,462][73497] Updated weights for policy 0, policy_version 219059 (0.0033) [2024-06-13 10:02:50,501][73265] Fps is (10 sec: 44247.7, 60 sec: 45329.1, 300 sec: 45153.2). Total num frames: 3589210112. Throughput: 0: 45584.2. 
Samples: 107708620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 10:02:50,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 10:02:50,583][73497] Updated weights for policy 0, policy_version 219069 (0.0042) [2024-06-13 10:02:53,638][73497] Updated weights for policy 0, policy_version 219079 (0.0047) [2024-06-13 10:02:55,501][73265] Fps is (10 sec: 42598.6, 60 sec: 44510.0, 300 sec: 45097.7). Total num frames: 3589439488. Throughput: 0: 45195.2. Samples: 107969080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 10:02:55,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 10:02:58,122][73497] Updated weights for policy 0, policy_version 219089 (0.0034) [2024-06-13 10:03:00,501][73265] Fps is (10 sec: 47513.4, 60 sec: 45056.0, 300 sec: 45097.6). Total num frames: 3589685248. Throughput: 0: 44898.7. Samples: 108232800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 10:03:00,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 10:03:00,822][73497] Updated weights for policy 0, policy_version 219099 (0.0033) [2024-06-13 10:03:05,243][73497] Updated weights for policy 0, policy_version 219109 (0.0034) [2024-06-13 10:03:05,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45329.2, 300 sec: 45264.3). Total num frames: 3589898240. Throughput: 0: 44925.3. Samples: 108372480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 10:03:05,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 10:03:08,279][73497] Updated weights for policy 0, policy_version 219119 (0.0039) [2024-06-13 10:03:10,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45056.1, 300 sec: 45097.7). Total num frames: 3590127616. Throughput: 0: 45103.2. Samples: 108641800. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 10:03:10,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 10:03:12,267][73497] Updated weights for policy 0, policy_version 219129 (0.0026) [2024-06-13 10:03:15,502][73265] Fps is (10 sec: 45874.4, 60 sec: 45055.9, 300 sec: 45153.2). Total num frames: 3590356992. Throughput: 0: 45135.2. Samples: 108915680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 10:03:15,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 10:03:15,683][73497] Updated weights for policy 0, policy_version 219139 (0.0036) [2024-06-13 10:03:20,070][73497] Updated weights for policy 0, policy_version 219149 (0.0028) [2024-06-13 10:03:20,501][73265] Fps is (10 sec: 40960.4, 60 sec: 44783.0, 300 sec: 45042.1). Total num frames: 3590537216. Throughput: 0: 44972.1. Samples: 109045100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 10:03:20,502][73265] Avg episode reward: [(0, '0.408')] [2024-06-13 10:03:22,766][73497] Updated weights for policy 0, policy_version 219159 (0.0035) [2024-06-13 10:03:25,501][73265] Fps is (10 sec: 42599.0, 60 sec: 44783.0, 300 sec: 45097.7). Total num frames: 3590782976. Throughput: 0: 44810.0. Samples: 109305740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 10:03:25,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 10:03:27,619][73497] Updated weights for policy 0, policy_version 219169 (0.0037) [2024-06-13 10:03:30,355][73497] Updated weights for policy 0, policy_version 219179 (0.0039) [2024-06-13 10:03:30,501][73265] Fps is (10 sec: 49152.2, 60 sec: 44784.5, 300 sec: 45097.7). Total num frames: 3591028736. Throughput: 0: 44448.6. Samples: 109572480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 10:03:30,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 10:03:34,678][73497] Updated weights for policy 0, policy_version 219189 (0.0038) [2024-06-13 10:03:35,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45602.2, 300 sec: 45208.8). 
Total num frames: 3591241728. Throughput: 0: 44613.0. Samples: 109716200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-13 10:03:35,502][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 10:03:37,673][73497] Updated weights for policy 0, policy_version 219199 (0.0034) [2024-06-13 10:03:40,501][73265] Fps is (10 sec: 42597.7, 60 sec: 44784.7, 300 sec: 45153.2). Total num frames: 3591454720. Throughput: 0: 44816.9. Samples: 109985840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 10:03:40,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 10:03:41,455][73497] Updated weights for policy 0, policy_version 219209 (0.0053) [2024-06-13 10:03:44,862][73497] Updated weights for policy 0, policy_version 219219 (0.0032) [2024-06-13 10:03:45,501][73265] Fps is (10 sec: 45875.1, 60 sec: 44783.0, 300 sec: 45153.2). Total num frames: 3591700480. Throughput: 0: 45165.0. Samples: 110265220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 10:03:45,502][73265] Avg episode reward: [(0, '0.388')] [2024-06-13 10:03:49,044][73497] Updated weights for policy 0, policy_version 219229 (0.0036) [2024-06-13 10:03:50,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 3591929856. Throughput: 0: 45122.2. Samples: 110402980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 10:03:50,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 10:03:51,755][73497] Updated weights for policy 0, policy_version 219239 (0.0033) [2024-06-13 10:03:53,448][73477] Signal inference workers to stop experience collection... (1650 times) [2024-06-13 10:03:53,500][73497] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-06-13 10:03:53,566][73477] Signal inference workers to resume experience collection... 
(1650 times) [2024-06-13 10:03:53,566][73497] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-06-13 10:03:55,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 3592142848. Throughput: 0: 45088.4. Samples: 110670780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 10:03:55,502][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 10:03:56,507][73497] Updated weights for policy 0, policy_version 219249 (0.0032) [2024-06-13 10:03:58,969][73497] Updated weights for policy 0, policy_version 219259 (0.0029) [2024-06-13 10:04:00,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45329.1, 300 sec: 45264.3). Total num frames: 3592404992. Throughput: 0: 44994.3. Samples: 110940420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 10:04:00,502][73265] Avg episode reward: [(0, '0.389')] [2024-06-13 10:04:03,561][73497] Updated weights for policy 0, policy_version 219269 (0.0039) [2024-06-13 10:04:05,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45056.0, 300 sec: 45264.3). Total num frames: 3592601600. Throughput: 0: 45355.1. Samples: 111086080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 10:04:05,502][73265] Avg episode reward: [(0, '0.405')] [2024-06-13 10:04:06,590][73497] Updated weights for policy 0, policy_version 219279 (0.0023) [2024-06-13 10:04:10,501][73265] Fps is (10 sec: 40960.1, 60 sec: 44782.9, 300 sec: 45153.2). Total num frames: 3592814592. Throughput: 0: 45418.7. Samples: 111349580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 10:04:10,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 10:04:10,551][73497] Updated weights for policy 0, policy_version 219289 (0.0027) [2024-06-13 10:04:13,938][73497] Updated weights for policy 0, policy_version 219299 (0.0029) [2024-06-13 10:04:15,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45329.2, 300 sec: 45209.1). Total num frames: 3593076736. Throughput: 0: 45437.3. Samples: 111617160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 10:04:15,502][73265] Avg episode reward: [(0, '0.508')] [2024-06-13 10:04:17,988][73497] Updated weights for policy 0, policy_version 219309 (0.0038) [2024-06-13 10:04:20,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45875.2, 300 sec: 45208.8). Total num frames: 3593289728. Throughput: 0: 45446.2. Samples: 111761280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 10:04:20,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 10:04:20,903][73497] Updated weights for policy 0, policy_version 219319 (0.0030) [2024-06-13 10:04:25,497][73497] Updated weights for policy 0, policy_version 219329 (0.0034) [2024-06-13 10:04:25,501][73265] Fps is (10 sec: 40959.7, 60 sec: 45056.0, 300 sec: 45097.6). Total num frames: 3593486336. Throughput: 0: 45479.6. Samples: 112032420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 10:04:25,502][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 10:04:25,525][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000219329_3593486336.pth... [2024-06-13 10:04:25,568][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000218670_3582689280.pth [2024-06-13 10:04:27,945][73497] Updated weights for policy 0, policy_version 219339 (0.0030) [2024-06-13 10:04:30,501][73265] Fps is (10 sec: 45874.5, 60 sec: 45328.9, 300 sec: 45208.7). Total num frames: 3593748480. Throughput: 0: 45154.5. Samples: 112297180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 10:04:30,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 10:04:32,686][73497] Updated weights for policy 0, policy_version 219349 (0.0036) [2024-06-13 10:04:35,436][73497] Updated weights for policy 0, policy_version 219359 (0.0041) [2024-06-13 10:04:35,502][73265] Fps is (10 sec: 49151.5, 60 sec: 45602.0, 300 sec: 45319.8). Total num frames: 3593977856. Throughput: 0: 45423.0. Samples: 112447020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 10:04:35,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 10:04:39,722][73497] Updated weights for policy 0, policy_version 219369 (0.0034) [2024-06-13 10:04:40,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45602.1, 300 sec: 45264.3). Total num frames: 3594190848. Throughput: 0: 45532.9. Samples: 112719760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 10:04:40,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 10:04:42,862][73497] Updated weights for policy 0, policy_version 219379 (0.0040) [2024-06-13 10:04:45,501][73265] Fps is (10 sec: 44237.5, 60 sec: 45329.1, 300 sec: 45208.8). Total num frames: 3594420224. Throughput: 0: 45437.4. Samples: 112985100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-13 10:04:45,502][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 10:04:46,929][73497] Updated weights for policy 0, policy_version 219389 (0.0038) [2024-06-13 10:04:49,766][73497] Updated weights for policy 0, policy_version 219399 (0.0041) [2024-06-13 10:04:50,504][73265] Fps is (10 sec: 47501.9, 60 sec: 45600.2, 300 sec: 45263.9). Total num frames: 3594665984. Throughput: 0: 45367.7. Samples: 113127740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:04:50,504][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 10:04:54,251][73497] Updated weights for policy 0, policy_version 219409 (0.0038) [2024-06-13 10:04:55,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45329.1, 300 sec: 45264.3). Total num frames: 3594862592. Throughput: 0: 45544.0. Samples: 113399060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:04:55,507][73265] Avg episode reward: [(0, '0.454')] [2024-06-13 10:04:57,050][73497] Updated weights for policy 0, policy_version 219419 (0.0038) [2024-06-13 10:05:00,502][73265] Fps is (10 sec: 40969.5, 60 sec: 44509.8, 300 sec: 45097.6). Total num frames: 3595075584. Throughput: 0: 45417.6. Samples: 113660960. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:05:00,511][73265] Avg episode reward: [(0, '0.361')] [2024-06-13 10:05:01,295][73497] Updated weights for policy 0, policy_version 219429 (0.0025) [2024-06-13 10:05:02,141][73477] Signal inference workers to stop experience collection... (1700 times) [2024-06-13 10:05:02,141][73477] Signal inference workers to resume experience collection... (1700 times) [2024-06-13 10:05:02,170][73497] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-06-13 10:05:02,176][73497] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-06-13 10:05:04,527][73497] Updated weights for policy 0, policy_version 219439 (0.0034) [2024-06-13 10:05:05,501][73265] Fps is (10 sec: 47513.7, 60 sec: 45602.1, 300 sec: 45208.7). Total num frames: 3595337728. Throughput: 0: 45238.6. Samples: 113797020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:05:05,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 10:05:08,519][73497] Updated weights for policy 0, policy_version 219449 (0.0029) [2024-06-13 10:05:10,502][73265] Fps is (10 sec: 47513.4, 60 sec: 45602.0, 300 sec: 45264.2). Total num frames: 3595550720. Throughput: 0: 45298.9. Samples: 114070880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:05:10,502][73265] Avg episode reward: [(0, '0.477')] [2024-06-13 10:05:11,994][73497] Updated weights for policy 0, policy_version 219459 (0.0042) [2024-06-13 10:05:15,501][73265] Fps is (10 sec: 40959.9, 60 sec: 44509.8, 300 sec: 45042.1). Total num frames: 3595747328. Throughput: 0: 45181.4. Samples: 114330340. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:05:15,502][73265] Avg episode reward: [(0, '0.392')] [2024-06-13 10:05:16,186][73497] Updated weights for policy 0, policy_version 219469 (0.0039) [2024-06-13 10:05:19,281][73497] Updated weights for policy 0, policy_version 219479 (0.0040) [2024-06-13 10:05:20,501][73265] Fps is (10 sec: 44237.7, 60 sec: 45056.0, 300 sec: 45097.7). Total num frames: 3595993088. Throughput: 0: 44857.9. Samples: 114465620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:05:20,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:05:23,243][73497] Updated weights for policy 0, policy_version 219489 (0.0045) [2024-06-13 10:05:25,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45602.1, 300 sec: 45264.3). Total num frames: 3596222464. Throughput: 0: 45015.1. Samples: 114745440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:05:25,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 10:05:26,157][73497] Updated weights for policy 0, policy_version 219499 (0.0023) [2024-06-13 10:05:30,149][73497] Updated weights for policy 0, policy_version 219509 (0.0045) [2024-06-13 10:05:30,501][73265] Fps is (10 sec: 44236.5, 60 sec: 44783.0, 300 sec: 45153.2). Total num frames: 3596435456. Throughput: 0: 45091.0. Samples: 115014200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:05:30,504][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 10:05:33,587][73497] Updated weights for policy 0, policy_version 219519 (0.0029) [2024-06-13 10:05:35,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45056.1, 300 sec: 45153.2). Total num frames: 3596681216. Throughput: 0: 44859.3. Samples: 115146300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:05:35,502][73265] Avg episode reward: [(0, '0.403')] [2024-06-13 10:05:37,813][73497] Updated weights for policy 0, policy_version 219529 (0.0040) [2024-06-13 10:05:40,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45056.1, 300 sec: 45208.7). 
Total num frames: 3596894208. Throughput: 0: 44876.1. Samples: 115418480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:05:40,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 10:05:40,800][73497] Updated weights for policy 0, policy_version 219539 (0.0031) [2024-06-13 10:05:44,946][73497] Updated weights for policy 0, policy_version 219549 (0.0034) [2024-06-13 10:05:45,501][73265] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 45097.7). Total num frames: 3597107200. Throughput: 0: 45113.4. Samples: 115691060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:05:45,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 10:05:48,250][73497] Updated weights for policy 0, policy_version 219559 (0.0032) [2024-06-13 10:05:50,501][73265] Fps is (10 sec: 45875.1, 60 sec: 44784.8, 300 sec: 45097.7). Total num frames: 3597352960. Throughput: 0: 44922.7. Samples: 115818540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:05:50,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 10:05:52,013][73497] Updated weights for policy 0, policy_version 219569 (0.0034) [2024-06-13 10:05:55,171][73497] Updated weights for policy 0, policy_version 219579 (0.0037) [2024-06-13 10:05:55,501][73265] Fps is (10 sec: 49152.0, 60 sec: 45602.1, 300 sec: 45319.8). Total num frames: 3597598720. Throughput: 0: 45161.9. Samples: 116103160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 10:05:55,502][73265] Avg episode reward: [(0, '0.427')] [2024-06-13 10:05:58,935][73497] Updated weights for policy 0, policy_version 219589 (0.0039) [2024-06-13 10:06:00,501][73265] Fps is (10 sec: 42598.3, 60 sec: 45056.1, 300 sec: 45097.7). Total num frames: 3597778944. Throughput: 0: 45405.8. Samples: 116373600. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 10:06:00,502][73265] Avg episode reward: [(0, '0.364')] [2024-06-13 10:06:02,419][73497] Updated weights for policy 0, policy_version 219599 (0.0026) [2024-06-13 10:06:05,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 3598041088. Throughput: 0: 45317.7. Samples: 116504920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 10:06:05,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 10:06:06,093][73497] Updated weights for policy 0, policy_version 219609 (0.0039) [2024-06-13 10:06:09,592][73497] Updated weights for policy 0, policy_version 219619 (0.0028) [2024-06-13 10:06:10,501][73265] Fps is (10 sec: 49152.0, 60 sec: 45329.2, 300 sec: 45208.7). Total num frames: 3598270464. Throughput: 0: 45131.6. Samples: 116776360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 10:06:10,508][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 10:06:13,205][73497] Updated weights for policy 0, policy_version 219629 (0.0043) [2024-06-13 10:06:15,501][73265] Fps is (10 sec: 42598.7, 60 sec: 45329.1, 300 sec: 45153.2). Total num frames: 3598467072. Throughput: 0: 45299.6. Samples: 117052680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 10:06:15,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 10:06:16,965][73497] Updated weights for policy 0, policy_version 219639 (0.0030) [2024-06-13 10:06:20,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45329.0, 300 sec: 45153.2). Total num frames: 3598712832. Throughput: 0: 45228.9. Samples: 117181600. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 10:06:20,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 10:06:20,772][73497] Updated weights for policy 0, policy_version 219649 (0.0041) [2024-06-13 10:06:24,128][73497] Updated weights for policy 0, policy_version 219659 (0.0046) [2024-06-13 10:06:25,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 3598942208. Throughput: 0: 45266.7. Samples: 117455480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 10:06:25,502][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 10:06:25,517][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000219662_3598942208.pth... [2024-06-13 10:06:25,579][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000219000_3588096000.pth [2024-06-13 10:06:27,779][73497] Updated weights for policy 0, policy_version 219669 (0.0031) [2024-06-13 10:06:30,502][73265] Fps is (10 sec: 44236.3, 60 sec: 45329.0, 300 sec: 45264.2). Total num frames: 3599155200. Throughput: 0: 45271.9. Samples: 117728300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 10:06:30,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 10:06:31,553][73497] Updated weights for policy 0, policy_version 219679 (0.0033) [2024-06-13 10:06:35,083][73497] Updated weights for policy 0, policy_version 219689 (0.0034) [2024-06-13 10:06:35,501][73265] Fps is (10 sec: 45874.7, 60 sec: 45329.1, 300 sec: 45153.2). Total num frames: 3599400960. Throughput: 0: 45301.7. Samples: 117857120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 10:06:35,502][73265] Avg episode reward: [(0, '0.481')] [2024-06-13 10:06:38,594][73497] Updated weights for policy 0, policy_version 219699 (0.0044) [2024-06-13 10:06:40,502][73265] Fps is (10 sec: 47513.3, 60 sec: 45602.0, 300 sec: 45208.7). Total num frames: 3599630336. Throughput: 0: 44917.6. Samples: 118124460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 10:06:40,502][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 10:06:41,798][73477] Signal inference workers to stop experience collection... (1750 times) [2024-06-13 10:06:41,798][73477] Signal inference workers to resume experience collection... (1750 times) [2024-06-13 10:06:41,807][73497] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-06-13 10:06:41,807][73497] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-06-13 10:06:42,251][73497] Updated weights for policy 0, policy_version 219709 (0.0032) [2024-06-13 10:06:45,502][73265] Fps is (10 sec: 42597.8, 60 sec: 45329.0, 300 sec: 45208.7). Total num frames: 3599826944. Throughput: 0: 45136.7. Samples: 118404760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 10:06:45,502][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 10:06:46,470][73497] Updated weights for policy 0, policy_version 219719 (0.0047) [2024-06-13 10:06:49,619][73497] Updated weights for policy 0, policy_version 219729 (0.0040) [2024-06-13 10:06:50,501][73265] Fps is (10 sec: 42599.4, 60 sec: 45056.0, 300 sec: 45042.1). Total num frames: 3600056320. Throughput: 0: 45013.4. Samples: 118530520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 10:06:50,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 10:06:53,423][73497] Updated weights for policy 0, policy_version 219739 (0.0024) [2024-06-13 10:06:55,501][73265] Fps is (10 sec: 49153.0, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 3600318464. Throughput: 0: 45204.5. Samples: 118810560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 10:06:55,502][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 10:06:56,533][73497] Updated weights for policy 0, policy_version 219749 (0.0022) [2024-06-13 10:07:00,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.2, 300 sec: 45208.8). Total num frames: 3600515072. Throughput: 0: 45080.5. 
Samples: 119081300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-13 10:07:00,502][73265] Avg episode reward: [(0, '0.479')] [2024-06-13 10:07:00,599][73497] Updated weights for policy 0, policy_version 219759 (0.0044) [2024-06-13 10:07:03,820][73497] Updated weights for policy 0, policy_version 219769 (0.0046) [2024-06-13 10:07:05,502][73265] Fps is (10 sec: 40959.3, 60 sec: 44782.9, 300 sec: 45097.7). Total num frames: 3600728064. Throughput: 0: 45210.1. Samples: 119216060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-13 10:07:05,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 10:07:07,807][73497] Updated weights for policy 0, policy_version 219779 (0.0024) [2024-06-13 10:07:10,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 3600973824. Throughput: 0: 45087.5. Samples: 119484420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-13 10:07:10,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 10:07:10,888][73497] Updated weights for policy 0, policy_version 219789 (0.0030) [2024-06-13 10:07:15,329][73497] Updated weights for policy 0, policy_version 219799 (0.0029) [2024-06-13 10:07:15,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45329.0, 300 sec: 45208.7). Total num frames: 3601186816. Throughput: 0: 45200.0. Samples: 119762300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-13 10:07:15,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 10:07:18,521][73497] Updated weights for policy 0, policy_version 219809 (0.0046) [2024-06-13 10:07:20,505][73265] Fps is (10 sec: 44219.8, 60 sec: 45053.1, 300 sec: 45152.6). Total num frames: 3601416192. Throughput: 0: 45177.0. Samples: 119890260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-13 10:07:20,506][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 10:07:22,466][73497] Updated weights for policy 0, policy_version 219819 (0.0030) [2024-06-13 10:07:25,501][73265] Fps is (10 sec: 47514.0, 60 sec: 45329.0, 300 sec: 45153.5). Total num frames: 3601661952. Throughput: 0: 45327.3. Samples: 120164180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-13 10:07:25,502][73265] Avg episode reward: [(0, '0.485')] [2024-06-13 10:07:25,672][73497] Updated weights for policy 0, policy_version 219829 (0.0031) [2024-06-13 10:07:29,577][73497] Updated weights for policy 0, policy_version 219839 (0.0039) [2024-06-13 10:07:30,502][73265] Fps is (10 sec: 45892.1, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 3601874944. Throughput: 0: 45250.7. Samples: 120441040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-13 10:07:30,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 10:07:32,794][73497] Updated weights for policy 0, policy_version 219849 (0.0031) [2024-06-13 10:07:35,501][73265] Fps is (10 sec: 40960.2, 60 sec: 44509.9, 300 sec: 45098.0). Total num frames: 3602071552. Throughput: 0: 45356.9. Samples: 120571580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-13 10:07:35,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 10:07:37,063][73497] Updated weights for policy 0, policy_version 219859 (0.0028) [2024-06-13 10:07:39,740][73497] Updated weights for policy 0, policy_version 219869 (0.0034) [2024-06-13 10:07:40,501][73265] Fps is (10 sec: 47514.3, 60 sec: 45329.2, 300 sec: 45208.7). Total num frames: 3602350080. Throughput: 0: 45279.0. Samples: 120848120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-13 10:07:40,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:07:44,241][73497] Updated weights for policy 0, policy_version 219879 (0.0025) [2024-06-13 10:07:45,501][73265] Fps is (10 sec: 49151.2, 60 sec: 45602.2, 300 sec: 45264.3). Total num frames: 3602563072. Throughput: 0: 45394.1. Samples: 121124040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-13 10:07:45,502][73265] Avg episode reward: [(0, '0.508')] [2024-06-13 10:07:47,011][73497] Updated weights for policy 0, policy_version 219889 (0.0034) [2024-06-13 10:07:50,501][73265] Fps is (10 sec: 40960.3, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 3602759680. Throughput: 0: 45389.1. Samples: 121258560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-13 10:07:50,502][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 10:07:51,442][73497] Updated weights for policy 0, policy_version 219899 (0.0034) [2024-06-13 10:07:54,123][73497] Updated weights for policy 0, policy_version 219909 (0.0033) [2024-06-13 10:07:55,501][73265] Fps is (10 sec: 44237.1, 60 sec: 44782.9, 300 sec: 45153.2). Total num frames: 3603005440. Throughput: 0: 45325.3. Samples: 121524060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-13 10:07:55,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 10:07:58,490][73497] Updated weights for policy 0, policy_version 219919 (0.0030) [2024-06-13 10:08:00,501][73265] Fps is (10 sec: 49151.9, 60 sec: 45602.1, 300 sec: 45264.3). Total num frames: 3603251200. Throughput: 0: 45271.7. Samples: 121799520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-13 10:08:00,502][73265] Avg episode reward: [(0, '0.427')] [2024-06-13 10:08:01,777][73497] Updated weights for policy 0, policy_version 219929 (0.0033) [2024-06-13 10:08:05,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45056.1, 300 sec: 45097.7). Total num frames: 3603431424. Throughput: 0: 45495.0. Samples: 121937360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-13 10:08:05,502][73265] Avg episode reward: [(0, '0.405')] [2024-06-13 10:08:06,141][73497] Updated weights for policy 0, policy_version 219939 (0.0034) [2024-06-13 10:08:06,145][73477] Signal inference workers to stop experience collection... (1800 times) [2024-06-13 10:08:06,146][73477] Signal inference workers to resume experience collection... (1800 times) [2024-06-13 10:08:06,160][73497] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-06-13 10:08:06,161][73497] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-06-13 10:08:08,844][73497] Updated weights for policy 0, policy_version 219949 (0.0033) [2024-06-13 10:08:10,501][73265] Fps is (10 sec: 44236.4, 60 sec: 45329.0, 300 sec: 45208.7). Total num frames: 3603693568. Throughput: 0: 45343.5. Samples: 122204640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-13 10:08:10,503][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 10:08:13,356][73497] Updated weights for policy 0, policy_version 219959 (0.0036) [2024-06-13 10:08:15,501][73265] Fps is (10 sec: 49151.9, 60 sec: 45602.2, 300 sec: 45375.3). Total num frames: 3603922944. Throughput: 0: 45257.9. Samples: 122477640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 10:08:15,502][73265] Avg episode reward: [(0, '0.406')] [2024-06-13 10:08:15,978][73497] Updated weights for policy 0, policy_version 219969 (0.0031) [2024-06-13 10:08:20,504][73265] Fps is (10 sec: 42588.4, 60 sec: 45057.1, 300 sec: 45208.4). Total num frames: 3604119552. Throughput: 0: 45417.0. Samples: 122615460. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 10:08:20,504][73265] Avg episode reward: [(0, '0.399')] [2024-06-13 10:08:20,533][73497] Updated weights for policy 0, policy_version 219979 (0.0028) [2024-06-13 10:08:23,060][73497] Updated weights for policy 0, policy_version 219989 (0.0032) [2024-06-13 10:08:25,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45329.0, 300 sec: 45264.2). Total num frames: 3604381696. Throughput: 0: 45324.8. Samples: 122887740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 10:08:25,502][73265] Avg episode reward: [(0, '0.379')] [2024-06-13 10:08:25,507][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000219994_3604381696.pth... [2024-06-13 10:08:25,568][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000219329_3593486336.pth [2024-06-13 10:08:27,627][73497] Updated weights for policy 0, policy_version 219999 (0.0036) [2024-06-13 10:08:30,501][73265] Fps is (10 sec: 49163.6, 60 sec: 45602.2, 300 sec: 45319.8). Total num frames: 3604611072. Throughput: 0: 45204.9. Samples: 123158260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 10:08:30,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 10:08:30,957][73497] Updated weights for policy 0, policy_version 220009 (0.0040) [2024-06-13 10:08:35,204][73497] Updated weights for policy 0, policy_version 220019 (0.0041) [2024-06-13 10:08:35,501][73265] Fps is (10 sec: 42599.0, 60 sec: 45602.1, 300 sec: 45264.3). Total num frames: 3604807680. Throughput: 0: 45336.0. Samples: 123298680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 10:08:35,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 10:08:37,862][73497] Updated weights for policy 0, policy_version 220029 (0.0033) [2024-06-13 10:08:40,501][73265] Fps is (10 sec: 42598.5, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 3605037056. Throughput: 0: 45306.2. Samples: 123562840. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 10:08:40,502][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 10:08:42,428][73497] Updated weights for policy 0, policy_version 220039 (0.0043) [2024-06-13 10:08:44,693][73497] Updated weights for policy 0, policy_version 220049 (0.0024) [2024-06-13 10:08:45,501][73265] Fps is (10 sec: 49151.7, 60 sec: 45602.2, 300 sec: 45319.8). Total num frames: 3605299200. Throughput: 0: 45313.3. Samples: 123838620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 10:08:45,502][73265] Avg episode reward: [(0, '0.487')] [2024-06-13 10:08:49,439][73497] Updated weights for policy 0, policy_version 220059 (0.0035) [2024-06-13 10:08:50,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.1, 300 sec: 45264.3). Total num frames: 3605495808. Throughput: 0: 45514.7. Samples: 123985520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 10:08:50,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 10:08:51,947][73497] Updated weights for policy 0, policy_version 220069 (0.0033) [2024-06-13 10:08:55,501][73265] Fps is (10 sec: 40960.1, 60 sec: 45056.0, 300 sec: 45097.7). Total num frames: 3605708800. Throughput: 0: 45435.6. Samples: 124249240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 10:08:55,502][73265] Avg episode reward: [(0, '0.385')] [2024-06-13 10:08:56,462][73497] Updated weights for policy 0, policy_version 220079 (0.0038) [2024-06-13 10:08:59,753][73497] Updated weights for policy 0, policy_version 220089 (0.0050) [2024-06-13 10:09:00,501][73265] Fps is (10 sec: 47513.2, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 3605970944. Throughput: 0: 45453.7. Samples: 124523060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 10:09:00,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 10:09:03,729][73497] Updated weights for policy 0, policy_version 220099 (0.0036) [2024-06-13 10:09:05,502][73265] Fps is (10 sec: 49151.3, 60 sec: 46148.2, 300 sec: 45375.3). 
Total num frames: 3606200320. Throughput: 0: 45617.4. Samples: 124668140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 10:09:05,502][73265] Avg episode reward: [(0, '0.395')] [2024-06-13 10:09:06,639][73497] Updated weights for policy 0, policy_version 220109 (0.0036) [2024-06-13 10:09:10,501][73265] Fps is (10 sec: 42598.9, 60 sec: 45056.1, 300 sec: 45153.2). Total num frames: 3606396928. Throughput: 0: 45506.8. Samples: 124935540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 10:09:10,502][73265] Avg episode reward: [(0, '0.488')] [2024-06-13 10:09:11,131][73497] Updated weights for policy 0, policy_version 220119 (0.0028) [2024-06-13 10:09:12,931][73477] Signal inference workers to stop experience collection... (1850 times) [2024-06-13 10:09:12,932][73477] Signal inference workers to resume experience collection... (1850 times) [2024-06-13 10:09:12,949][73497] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-06-13 10:09:12,979][73497] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-06-13 10:09:13,765][73497] Updated weights for policy 0, policy_version 220129 (0.0039) [2024-06-13 10:09:15,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45329.0, 300 sec: 45264.3). Total num frames: 3606642688. Throughput: 0: 45363.1. Samples: 125199600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-13 10:09:15,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 10:09:18,081][73497] Updated weights for policy 0, policy_version 220139 (0.0036) [2024-06-13 10:09:20,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45877.1, 300 sec: 45375.4). Total num frames: 3606872064. Throughput: 0: 45457.7. Samples: 125344280. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:09:20,502][73265] Avg episode reward: [(0, '0.384')] [2024-06-13 10:09:21,222][73497] Updated weights for policy 0, policy_version 220149 (0.0034) [2024-06-13 10:09:25,110][73497] Updated weights for policy 0, policy_version 220159 (0.0030) [2024-06-13 10:09:25,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45329.2, 300 sec: 45264.3). Total num frames: 3607101440. Throughput: 0: 45689.8. Samples: 125618880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:09:25,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 10:09:28,303][73497] Updated weights for policy 0, policy_version 220169 (0.0034) [2024-06-13 10:09:30,501][73265] Fps is (10 sec: 45874.8, 60 sec: 45329.1, 300 sec: 45264.3). Total num frames: 3607330816. Throughput: 0: 45504.4. Samples: 125886320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:09:30,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 10:09:32,725][73497] Updated weights for policy 0, policy_version 220179 (0.0024) [2024-06-13 10:09:35,392][73497] Updated weights for policy 0, policy_version 220189 (0.0028) [2024-06-13 10:09:35,501][73265] Fps is (10 sec: 47513.5, 60 sec: 46148.2, 300 sec: 45375.4). Total num frames: 3607576576. Throughput: 0: 45311.6. Samples: 126024540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:09:35,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 10:09:39,696][73497] Updated weights for policy 0, policy_version 220199 (0.0027) [2024-06-13 10:09:40,501][73265] Fps is (10 sec: 42598.8, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 3607756800. Throughput: 0: 45464.5. Samples: 126295140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:09:40,502][73265] Avg episode reward: [(0, '0.495')] [2024-06-13 10:09:42,791][73497] Updated weights for policy 0, policy_version 220209 (0.0034) [2024-06-13 10:09:45,501][73265] Fps is (10 sec: 40960.2, 60 sec: 44783.0, 300 sec: 45153.6). 
Total num frames: 3607986176. Throughput: 0: 45465.5. Samples: 126569000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:09:45,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 10:09:46,805][73497] Updated weights for policy 0, policy_version 220219 (0.0035) [2024-06-13 10:09:50,235][73497] Updated weights for policy 0, policy_version 220229 (0.0037) [2024-06-13 10:09:50,501][73265] Fps is (10 sec: 49151.5, 60 sec: 45875.2, 300 sec: 45375.3). Total num frames: 3608248320. Throughput: 0: 45225.4. Samples: 126703280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:09:50,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 10:09:53,807][73497] Updated weights for policy 0, policy_version 220239 (0.0038) [2024-06-13 10:09:55,504][73265] Fps is (10 sec: 47501.5, 60 sec: 45873.3, 300 sec: 45375.0). Total num frames: 3608461312. Throughput: 0: 45319.7. Samples: 126975040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:09:55,504][73265] Avg episode reward: [(0, '0.486')] [2024-06-13 10:09:57,322][73497] Updated weights for policy 0, policy_version 220249 (0.0044) [2024-06-13 10:10:00,501][73265] Fps is (10 sec: 42598.7, 60 sec: 45056.1, 300 sec: 45208.7). Total num frames: 3608674304. Throughput: 0: 45577.9. Samples: 127250600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:10:00,502][73265] Avg episode reward: [(0, '0.413')] [2024-06-13 10:10:01,306][73497] Updated weights for policy 0, policy_version 220259 (0.0029) [2024-06-13 10:10:04,366][73497] Updated weights for policy 0, policy_version 220269 (0.0031) [2024-06-13 10:10:05,501][73265] Fps is (10 sec: 47525.3, 60 sec: 45602.2, 300 sec: 45375.4). Total num frames: 3608936448. Throughput: 0: 45415.5. Samples: 127387980. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:10:05,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 10:10:08,277][73497] Updated weights for policy 0, policy_version 220279 (0.0031) [2024-06-13 10:10:10,501][73265] Fps is (10 sec: 42598.3, 60 sec: 45056.0, 300 sec: 45264.3). Total num frames: 3609100288. Throughput: 0: 45384.4. Samples: 127661180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:10:10,502][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 10:10:11,717][73497] Updated weights for policy 0, policy_version 220289 (0.0036) [2024-06-13 10:10:15,462][73497] Updated weights for policy 0, policy_version 220299 (0.0037) [2024-06-13 10:10:15,503][73265] Fps is (10 sec: 44229.4, 60 sec: 45600.9, 300 sec: 45375.1). Total num frames: 3609378816. Throughput: 0: 45310.8. Samples: 127925380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:10:15,504][73265] Avg episode reward: [(0, '0.472')] [2024-06-13 10:10:19,310][73497] Updated weights for policy 0, policy_version 220309 (0.0032) [2024-06-13 10:10:20,504][73265] Fps is (10 sec: 49139.9, 60 sec: 45327.2, 300 sec: 45319.4). Total num frames: 3609591808. Throughput: 0: 45296.6. Samples: 128063000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:10:20,504][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 10:10:21,889][73477] Signal inference workers to stop experience collection... (1900 times) [2024-06-13 10:10:21,889][73477] Signal inference workers to resume experience collection... (1900 times) [2024-06-13 10:10:21,940][73497] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-06-13 10:10:21,941][73497] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-06-13 10:10:22,591][73497] Updated weights for policy 0, policy_version 220319 (0.0023) [2024-06-13 10:10:25,501][73265] Fps is (10 sec: 42605.5, 60 sec: 45056.0, 300 sec: 45319.8). Total num frames: 3609804800. Throughput: 0: 45313.3. 
Samples: 128334240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:10:25,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 10:10:25,526][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000220326_3609821184.pth... [2024-06-13 10:10:25,574][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000219662_3598942208.pth [2024-06-13 10:10:26,267][73497] Updated weights for policy 0, policy_version 220329 (0.0030) [2024-06-13 10:10:29,865][73497] Updated weights for policy 0, policy_version 220339 (0.0038) [2024-06-13 10:10:30,501][73265] Fps is (10 sec: 45886.4, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 3610050560. Throughput: 0: 45279.5. Samples: 128606580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 10:10:30,502][73265] Avg episode reward: [(0, '0.393')] [2024-06-13 10:10:33,341][73497] Updated weights for policy 0, policy_version 220349 (0.0030) [2024-06-13 10:10:35,502][73265] Fps is (10 sec: 47513.1, 60 sec: 45055.9, 300 sec: 45375.3). Total num frames: 3610279936. Throughput: 0: 45412.8. Samples: 128746860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 10:10:35,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 10:10:37,095][73497] Updated weights for policy 0, policy_version 220359 (0.0031) [2024-06-13 10:10:40,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45875.1, 300 sec: 45430.9). Total num frames: 3610509312. Throughput: 0: 45615.8. Samples: 129027640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 10:10:40,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 10:10:40,660][73497] Updated weights for policy 0, policy_version 220369 (0.0036) [2024-06-13 10:10:44,046][73497] Updated weights for policy 0, policy_version 220379 (0.0030) [2024-06-13 10:10:45,501][73265] Fps is (10 sec: 42599.0, 60 sec: 45329.0, 300 sec: 45264.3). Total num frames: 3610705920. Throughput: 0: 45359.1. Samples: 129291760. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 10:10:45,502][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 10:10:48,003][73497] Updated weights for policy 0, policy_version 220389 (0.0033) [2024-06-13 10:10:50,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 3610968064. Throughput: 0: 45449.8. Samples: 129433220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 10:10:50,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 10:10:51,274][73497] Updated weights for policy 0, policy_version 220399 (0.0030) [2024-06-13 10:10:54,960][73497] Updated weights for policy 0, policy_version 220409 (0.0043) [2024-06-13 10:10:55,501][73265] Fps is (10 sec: 49152.3, 60 sec: 45604.1, 300 sec: 45486.4). Total num frames: 3611197440. Throughput: 0: 45524.1. Samples: 129709760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 10:10:55,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 10:10:58,179][73497] Updated weights for policy 0, policy_version 220419 (0.0036) [2024-06-13 10:11:00,501][73265] Fps is (10 sec: 40959.9, 60 sec: 45056.0, 300 sec: 45208.7). Total num frames: 3611377664. Throughput: 0: 45758.6. Samples: 129984440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 10:11:00,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 10:11:02,179][73497] Updated weights for policy 0, policy_version 220429 (0.0035) [2024-06-13 10:11:05,501][73265] Fps is (10 sec: 45874.7, 60 sec: 45329.1, 300 sec: 45375.3). Total num frames: 3611656192. Throughput: 0: 45407.3. Samples: 130106220. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 10:11:05,502][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 10:11:05,941][73497] Updated weights for policy 0, policy_version 220439 (0.0034) [2024-06-13 10:11:09,765][73497] Updated weights for policy 0, policy_version 220449 (0.0038) [2024-06-13 10:11:10,501][73265] Fps is (10 sec: 50790.4, 60 sec: 46421.4, 300 sec: 45486.4). Total num frames: 3611885568. Throughput: 0: 45612.0. Samples: 130386780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 10:11:10,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 10:11:12,972][73497] Updated weights for policy 0, policy_version 220459 (0.0036) [2024-06-13 10:11:15,502][73265] Fps is (10 sec: 39321.3, 60 sec: 44511.0, 300 sec: 45208.7). Total num frames: 3612049408. Throughput: 0: 45620.4. Samples: 130659500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 10:11:15,502][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 10:11:16,965][73497] Updated weights for policy 0, policy_version 220469 (0.0030) [2024-06-13 10:11:20,080][73497] Updated weights for policy 0, policy_version 220479 (0.0033) [2024-06-13 10:11:20,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45877.1, 300 sec: 45430.9). Total num frames: 3612344320. Throughput: 0: 45349.9. Samples: 130787600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 10:11:20,502][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 10:11:24,084][73497] Updated weights for policy 0, policy_version 220489 (0.0034) [2024-06-13 10:11:25,501][73265] Fps is (10 sec: 49153.2, 60 sec: 45602.2, 300 sec: 45375.4). Total num frames: 3612540928. Throughput: 0: 45227.7. Samples: 131062880. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 10:11:25,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 10:11:27,010][73497] Updated weights for policy 0, policy_version 220499 (0.0042) [2024-06-13 10:11:30,501][73265] Fps is (10 sec: 40959.8, 60 sec: 45056.0, 300 sec: 45264.3). Total num frames: 3612753920. Throughput: 0: 45719.9. Samples: 131349160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 10:11:30,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 10:11:31,317][73497] Updated weights for policy 0, policy_version 220509 (0.0031) [2024-06-13 10:11:34,387][73497] Updated weights for policy 0, policy_version 220519 (0.0037) [2024-06-13 10:11:35,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45329.2, 300 sec: 45319.8). Total num frames: 3612999680. Throughput: 0: 45268.0. Samples: 131470280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-13 10:11:35,502][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 10:11:37,105][73477] Signal inference workers to stop experience collection... (1950 times) [2024-06-13 10:11:37,105][73477] Signal inference workers to resume experience collection... (1950 times) [2024-06-13 10:11:37,126][73497] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-06-13 10:11:37,126][73497] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-06-13 10:11:38,639][73497] Updated weights for policy 0, policy_version 220529 (0.0036) [2024-06-13 10:11:40,501][73265] Fps is (10 sec: 49152.0, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3613245440. Throughput: 0: 45283.0. Samples: 131747500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:11:40,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 10:11:41,544][73497] Updated weights for policy 0, policy_version 220539 (0.0035) [2024-06-13 10:11:45,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45602.1, 300 sec: 45375.3). Total num frames: 3613442048. Throughput: 0: 45575.5. 
Samples: 132035340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:11:45,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 10:11:45,660][73497] Updated weights for policy 0, policy_version 220549 (0.0026) [2024-06-13 10:11:48,372][73497] Updated weights for policy 0, policy_version 220559 (0.0035) [2024-06-13 10:11:50,504][73265] Fps is (10 sec: 42587.9, 60 sec: 45054.1, 300 sec: 45263.9). Total num frames: 3613671424. Throughput: 0: 45601.1. Samples: 132158380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:11:50,505][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 10:11:52,929][73497] Updated weights for policy 0, policy_version 220569 (0.0039) [2024-06-13 10:11:55,502][73265] Fps is (10 sec: 50789.9, 60 sec: 45875.1, 300 sec: 45541.9). Total num frames: 3613949952. Throughput: 0: 45589.2. Samples: 132438300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:11:55,502][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 10:11:55,805][73497] Updated weights for policy 0, policy_version 220579 (0.0038) [2024-06-13 10:12:00,143][73497] Updated weights for policy 0, policy_version 220589 (0.0034) [2024-06-13 10:12:00,502][73265] Fps is (10 sec: 47525.1, 60 sec: 46148.2, 300 sec: 45486.4). Total num frames: 3614146560. Throughput: 0: 45830.3. Samples: 132721860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:12:00,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 10:12:02,815][73497] Updated weights for policy 0, policy_version 220599 (0.0031) [2024-06-13 10:12:05,502][73265] Fps is (10 sec: 40959.8, 60 sec: 45055.9, 300 sec: 45375.3). Total num frames: 3614359552. Throughput: 0: 45683.8. Samples: 132843380. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:12:05,502][73265] Avg episode reward: [(0, '0.501')] [2024-06-13 10:12:07,582][73497] Updated weights for policy 0, policy_version 220609 (0.0033) [2024-06-13 10:12:09,664][73497] Updated weights for policy 0, policy_version 220619 (0.0041) [2024-06-13 10:12:10,502][73265] Fps is (10 sec: 47513.6, 60 sec: 45602.0, 300 sec: 45542.0). Total num frames: 3614621696. Throughput: 0: 45628.2. Samples: 133116160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:12:10,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 10:12:14,481][73497] Updated weights for policy 0, policy_version 220629 (0.0045) [2024-06-13 10:12:15,502][73265] Fps is (10 sec: 49152.1, 60 sec: 46694.4, 300 sec: 45542.5). Total num frames: 3614851072. Throughput: 0: 45729.7. Samples: 133407000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:12:15,502][73265] Avg episode reward: [(0, '0.381')] [2024-06-13 10:12:16,825][73497] Updated weights for policy 0, policy_version 220639 (0.0030) [2024-06-13 10:12:20,501][73265] Fps is (10 sec: 42598.7, 60 sec: 45056.0, 300 sec: 45375.3). Total num frames: 3615047680. Throughput: 0: 46099.1. Samples: 133544740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:12:20,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:12:21,626][73497] Updated weights for policy 0, policy_version 220649 (0.0040) [2024-06-13 10:12:24,051][73497] Updated weights for policy 0, policy_version 220659 (0.0039) [2024-06-13 10:12:25,501][73265] Fps is (10 sec: 45875.6, 60 sec: 46148.1, 300 sec: 45542.0). Total num frames: 3615309824. Throughput: 0: 45870.2. Samples: 133811660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:12:25,502][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 10:12:25,515][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000220661_3615309824.pth... 
[2024-06-13 10:12:25,562][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000219994_3604381696.pth [2024-06-13 10:12:28,992][73497] Updated weights for policy 0, policy_version 220669 (0.0032) [2024-06-13 10:12:30,501][73265] Fps is (10 sec: 49152.6, 60 sec: 46421.5, 300 sec: 45653.0). Total num frames: 3615539200. Throughput: 0: 45547.2. Samples: 134084960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:12:30,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 10:12:31,149][73497] Updated weights for policy 0, policy_version 220679 (0.0046) [2024-06-13 10:12:32,868][73477] Signal inference workers to stop experience collection... (2000 times) [2024-06-13 10:12:32,910][73497] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-06-13 10:12:32,919][73477] Signal inference workers to resume experience collection... (2000 times) [2024-06-13 10:12:32,923][73497] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-06-13 10:12:35,501][73265] Fps is (10 sec: 39321.7, 60 sec: 45056.0, 300 sec: 45264.3). Total num frames: 3615703040. Throughput: 0: 45934.6. Samples: 134225320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:12:35,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 10:12:36,109][73497] Updated weights for policy 0, policy_version 220689 (0.0027) [2024-06-13 10:12:38,254][73497] Updated weights for policy 0, policy_version 220699 (0.0030) [2024-06-13 10:12:40,502][73265] Fps is (10 sec: 44235.7, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3615981568. Throughput: 0: 45677.7. Samples: 134493800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:12:40,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 10:12:43,279][73497] Updated weights for policy 0, policy_version 220709 (0.0027) [2024-06-13 10:12:45,501][73265] Fps is (10 sec: 52429.2, 60 sec: 46421.4, 300 sec: 45653.0). Total num frames: 3616227328. 
Throughput: 0: 45372.6. Samples: 134763620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 10:12:45,502][73265] Avg episode reward: [(0, '0.412')] [2024-06-13 10:12:45,771][73497] Updated weights for policy 0, policy_version 220719 (0.0026) [2024-06-13 10:12:50,501][73265] Fps is (10 sec: 42599.2, 60 sec: 45604.1, 300 sec: 45430.9). Total num frames: 3616407552. Throughput: 0: 45846.4. Samples: 134906460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 10:12:50,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 10:12:50,612][73497] Updated weights for policy 0, policy_version 220729 (0.0038) [2024-06-13 10:12:52,733][73497] Updated weights for policy 0, policy_version 220739 (0.0033) [2024-06-13 10:12:55,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45056.2, 300 sec: 45430.9). Total num frames: 3616653312. Throughput: 0: 45885.5. Samples: 135181000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 10:12:55,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 10:12:57,806][73497] Updated weights for policy 0, policy_version 220749 (0.0040) [2024-06-13 10:12:59,782][73497] Updated weights for policy 0, policy_version 220759 (0.0038) [2024-06-13 10:13:00,502][73265] Fps is (10 sec: 50789.4, 60 sec: 46148.2, 300 sec: 45708.6). Total num frames: 3616915456. Throughput: 0: 45171.5. Samples: 135439720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 10:13:00,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 10:13:05,100][73497] Updated weights for policy 0, policy_version 220769 (0.0035) [2024-06-13 10:13:05,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45602.3, 300 sec: 45430.9). Total num frames: 3617095680. Throughput: 0: 45463.2. Samples: 135590580. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 10:13:05,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 10:13:06,978][73497] Updated weights for policy 0, policy_version 220779 (0.0033) [2024-06-13 10:13:10,501][73265] Fps is (10 sec: 40960.5, 60 sec: 45056.1, 300 sec: 45430.9). Total num frames: 3617325056. Throughput: 0: 45671.6. Samples: 135866880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 10:13:10,502][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 10:13:11,945][73497] Updated weights for policy 0, policy_version 220789 (0.0032) [2024-06-13 10:13:14,815][73497] Updated weights for policy 0, policy_version 220799 (0.0032) [2024-06-13 10:13:15,501][73265] Fps is (10 sec: 49151.5, 60 sec: 45602.2, 300 sec: 45653.4). Total num frames: 3617587200. Throughput: 0: 45589.7. Samples: 136136500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 10:13:15,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 10:13:19,490][73497] Updated weights for policy 0, policy_version 220809 (0.0029) [2024-06-13 10:13:20,501][73265] Fps is (10 sec: 50790.9, 60 sec: 46421.4, 300 sec: 45597.5). Total num frames: 3617832960. Throughput: 0: 45852.1. Samples: 136288660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 10:13:20,502][73265] Avg episode reward: [(0, '0.388')] [2024-06-13 10:13:21,808][73497] Updated weights for policy 0, policy_version 220819 (0.0027) [2024-06-13 10:13:25,504][73265] Fps is (10 sec: 42587.6, 60 sec: 45054.1, 300 sec: 45430.5). Total num frames: 3618013184. Throughput: 0: 45785.6. Samples: 136554260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 10:13:25,505][73265] Avg episode reward: [(0, '0.479')] [2024-06-13 10:13:26,751][73497] Updated weights for policy 0, policy_version 220829 (0.0035) [2024-06-13 10:13:27,482][73477] Signal inference workers to stop experience collection... 
(2050 times) [2024-06-13 10:13:27,511][73497] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-06-13 10:13:27,526][73477] Signal inference workers to resume experience collection... (2050 times) [2024-06-13 10:13:27,527][73497] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-06-13 10:13:28,850][73497] Updated weights for policy 0, policy_version 220839 (0.0046) [2024-06-13 10:13:30,501][73265] Fps is (10 sec: 44236.4, 60 sec: 45602.0, 300 sec: 45653.0). Total num frames: 3618275328. Throughput: 0: 45703.5. Samples: 136820280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 10:13:30,502][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 10:13:33,832][73497] Updated weights for policy 0, policy_version 220849 (0.0038) [2024-06-13 10:13:35,504][73265] Fps is (10 sec: 49153.8, 60 sec: 46692.7, 300 sec: 45652.7). Total num frames: 3618504704. Throughput: 0: 45666.2. Samples: 136961540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 10:13:35,504][73265] Avg episode reward: [(0, '0.399')] [2024-06-13 10:13:35,999][73497] Updated weights for policy 0, policy_version 220859 (0.0035) [2024-06-13 10:13:40,501][73265] Fps is (10 sec: 40960.2, 60 sec: 45056.1, 300 sec: 45375.4). Total num frames: 3618684928. Throughput: 0: 45484.8. Samples: 137227820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 10:13:40,502][73265] Avg episode reward: [(0, '0.405')] [2024-06-13 10:13:40,687][73497] Updated weights for policy 0, policy_version 220869 (0.0035) [2024-06-13 10:13:43,882][73497] Updated weights for policy 0, policy_version 220879 (0.0036) [2024-06-13 10:13:45,501][73265] Fps is (10 sec: 42607.7, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3618930688. Throughput: 0: 45911.7. Samples: 137505740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 10:13:45,502][73265] Avg episode reward: [(0, '0.378')] [2024-06-13 10:13:48,223][73497] Updated weights for policy 0, policy_version 220889 (0.0024) [2024-06-13 10:13:50,501][73265] Fps is (10 sec: 49151.8, 60 sec: 46148.2, 300 sec: 45653.0). Total num frames: 3619176448. Throughput: 0: 45712.3. Samples: 137647640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 10:13:50,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 10:13:50,802][73497] Updated weights for policy 0, policy_version 220899 (0.0030) [2024-06-13 10:13:55,499][73497] Updated weights for policy 0, policy_version 220909 (0.0034) [2024-06-13 10:13:55,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3619373056. Throughput: 0: 45576.1. Samples: 137917800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 10:13:55,502][73265] Avg episode reward: [(0, '0.393')] [2024-06-13 10:13:57,944][73497] Updated weights for policy 0, policy_version 220919 (0.0053) [2024-06-13 10:14:00,502][73265] Fps is (10 sec: 44236.5, 60 sec: 45056.0, 300 sec: 45486.4). Total num frames: 3619618816. Throughput: 0: 45514.6. Samples: 138184660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 10:14:00,504][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 10:14:02,601][73497] Updated weights for policy 0, policy_version 220929 (0.0036) [2024-06-13 10:14:05,009][73497] Updated weights for policy 0, policy_version 220939 (0.0027) [2024-06-13 10:14:05,501][73265] Fps is (10 sec: 50790.4, 60 sec: 46421.3, 300 sec: 45708.6). Total num frames: 3619880960. Throughput: 0: 45390.6. Samples: 138331240. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 10:14:05,502][73265] Avg episode reward: [(0, '0.427')] [2024-06-13 10:14:09,369][73497] Updated weights for policy 0, policy_version 220949 (0.0036) [2024-06-13 10:14:10,501][73265] Fps is (10 sec: 47513.9, 60 sec: 46148.3, 300 sec: 45597.5). Total num frames: 3620093952. Throughput: 0: 45534.1. Samples: 138603180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 10:14:10,502][73265] Avg episode reward: [(0, '0.482')] [2024-06-13 10:14:12,074][73497] Updated weights for policy 0, policy_version 220959 (0.0032) [2024-06-13 10:14:15,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3620306944. Throughput: 0: 45687.6. Samples: 138876220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 10:14:15,510][73265] Avg episode reward: [(0, '0.371')] [2024-06-13 10:14:16,916][73497] Updated weights for policy 0, policy_version 220969 (0.0037) [2024-06-13 10:14:19,198][73497] Updated weights for policy 0, policy_version 220979 (0.0022) [2024-06-13 10:14:20,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3620536320. Throughput: 0: 45602.7. Samples: 139013560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 10:14:20,502][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 10:14:24,124][73497] Updated weights for policy 0, policy_version 220989 (0.0032) [2024-06-13 10:14:25,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45877.2, 300 sec: 45542.0). Total num frames: 3620765696. Throughput: 0: 45660.9. Samples: 139282560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 10:14:25,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 10:14:25,524][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000220994_3620765696.pth... 
[2024-06-13 10:14:25,569][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000220326_3609821184.pth [2024-06-13 10:14:26,768][73497] Updated weights for policy 0, policy_version 220999 (0.0030) [2024-06-13 10:14:30,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45056.1, 300 sec: 45430.9). Total num frames: 3620978688. Throughput: 0: 45381.4. Samples: 139547900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 10:14:30,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 10:14:31,130][73497] Updated weights for policy 0, policy_version 221009 (0.0038) [2024-06-13 10:14:33,255][73477] Signal inference workers to stop experience collection... (2100 times) [2024-06-13 10:14:33,255][73477] Signal inference workers to resume experience collection... (2100 times) [2024-06-13 10:14:33,278][73497] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-06-13 10:14:33,278][73497] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-06-13 10:14:33,932][73497] Updated weights for policy 0, policy_version 221019 (0.0039) [2024-06-13 10:14:35,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45330.7, 300 sec: 45653.0). Total num frames: 3621224448. Throughput: 0: 45510.2. Samples: 139695600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 10:14:35,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 10:14:38,106][73497] Updated weights for policy 0, policy_version 221029 (0.0030) [2024-06-13 10:14:40,501][73265] Fps is (10 sec: 49151.4, 60 sec: 46421.3, 300 sec: 45708.6). Total num frames: 3621470208. Throughput: 0: 45384.8. Samples: 139960120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 10:14:40,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 10:14:41,029][73497] Updated weights for policy 0, policy_version 221039 (0.0028) [2024-06-13 10:14:45,501][73265] Fps is (10 sec: 40959.9, 60 sec: 45056.0, 300 sec: 45375.3). Total num frames: 3621634048. 
Throughput: 0: 45448.9. Samples: 140229860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 10:14:45,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 10:14:45,759][73497] Updated weights for policy 0, policy_version 221049 (0.0032) [2024-06-13 10:14:48,623][73497] Updated weights for policy 0, policy_version 221059 (0.0036) [2024-06-13 10:14:50,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45329.1, 300 sec: 45542.4). Total num frames: 3621896192. Throughput: 0: 45134.2. Samples: 140362280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 10:14:50,502][73265] Avg episode reward: [(0, '0.390')] [2024-06-13 10:14:52,862][73497] Updated weights for policy 0, policy_version 221069 (0.0043) [2024-06-13 10:14:55,501][73265] Fps is (10 sec: 47514.0, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3622109184. Throughput: 0: 45254.7. Samples: 140639640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 10:14:55,502][73265] Avg episode reward: [(0, '0.405')] [2024-06-13 10:14:56,043][73497] Updated weights for policy 0, policy_version 221079 (0.0032) [2024-06-13 10:14:59,836][73497] Updated weights for policy 0, policy_version 221089 (0.0031) [2024-06-13 10:15:00,501][73265] Fps is (10 sec: 42598.2, 60 sec: 45056.1, 300 sec: 45375.4). Total num frames: 3622322176. Throughput: 0: 45093.3. Samples: 140905420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-13 10:15:00,502][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 10:15:03,233][73497] Updated weights for policy 0, policy_version 221099 (0.0042) [2024-06-13 10:15:05,501][73265] Fps is (10 sec: 49152.5, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 3622600704. Throughput: 0: 45161.4. Samples: 141045820. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 10:15:05,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 10:15:06,767][73497] Updated weights for policy 0, policy_version 221109 (0.0037) [2024-06-13 10:15:10,134][73497] Updated weights for policy 0, policy_version 221119 (0.0029) [2024-06-13 10:15:10,501][73265] Fps is (10 sec: 49152.0, 60 sec: 45329.1, 300 sec: 45542.2). Total num frames: 3622813696. Throughput: 0: 45393.8. Samples: 141325280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 10:15:10,502][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 10:15:14,528][73497] Updated weights for policy 0, policy_version 221129 (0.0027) [2024-06-13 10:15:15,501][73265] Fps is (10 sec: 39321.3, 60 sec: 44782.9, 300 sec: 45431.3). Total num frames: 3622993920. Throughput: 0: 45489.3. Samples: 141594920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 10:15:15,502][73265] Avg episode reward: [(0, '0.352')] [2024-06-13 10:15:17,655][73497] Updated weights for policy 0, policy_version 221139 (0.0041) [2024-06-13 10:15:20,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3623272448. Throughput: 0: 45061.4. Samples: 141723360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 10:15:20,502][73265] Avg episode reward: [(0, '0.382')] [2024-06-13 10:15:21,565][73497] Updated weights for policy 0, policy_version 221149 (0.0037) [2024-06-13 10:15:25,102][73497] Updated weights for policy 0, policy_version 221159 (0.0035) [2024-06-13 10:15:25,501][73265] Fps is (10 sec: 49151.9, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3623485440. Throughput: 0: 45267.1. Samples: 141997140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 10:15:25,502][73265] Avg episode reward: [(0, '0.478')] [2024-06-13 10:15:28,440][73497] Updated weights for policy 0, policy_version 221169 (0.0042) [2024-06-13 10:15:30,504][73265] Fps is (10 sec: 40949.8, 60 sec: 45054.0, 300 sec: 45430.5). 
Total num frames: 3623682048. Throughput: 0: 45435.3. Samples: 142274560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 10:15:30,505][73265] Avg episode reward: [(0, '0.477')] [2024-06-13 10:15:32,505][73497] Updated weights for policy 0, policy_version 221179 (0.0035) [2024-06-13 10:15:35,393][73497] Updated weights for policy 0, policy_version 221189 (0.0047) [2024-06-13 10:15:35,501][73265] Fps is (10 sec: 47513.1, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3623960576. Throughput: 0: 45373.2. Samples: 142404080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 10:15:35,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 10:15:39,457][73497] Updated weights for policy 0, policy_version 221199 (0.0034) [2024-06-13 10:15:40,286][73477] Signal inference workers to stop experience collection... (2150 times) [2024-06-13 10:15:40,287][73477] Signal inference workers to resume experience collection... (2150 times) [2024-06-13 10:15:40,314][73497] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-06-13 10:15:40,314][73497] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-06-13 10:15:40,501][73265] Fps is (10 sec: 49164.1, 60 sec: 45056.0, 300 sec: 45653.0). Total num frames: 3624173568. Throughput: 0: 45296.8. Samples: 142678000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 10:15:40,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 10:15:42,950][73497] Updated weights for policy 0, policy_version 221209 (0.0028) [2024-06-13 10:15:45,502][73265] Fps is (10 sec: 40959.6, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 3624370176. Throughput: 0: 45742.5. Samples: 142963840. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 10:15:45,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 10:15:46,578][73497] Updated weights for policy 0, policy_version 221219 (0.0040) [2024-06-13 10:15:49,825][73497] Updated weights for policy 0, policy_version 221229 (0.0029) [2024-06-13 10:15:50,502][73265] Fps is (10 sec: 44236.6, 60 sec: 45329.0, 300 sec: 45486.4). Total num frames: 3624615936. Throughput: 0: 45310.0. Samples: 143084780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 10:15:50,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 10:15:53,882][73497] Updated weights for policy 0, policy_version 221239 (0.0045) [2024-06-13 10:15:55,501][73265] Fps is (10 sec: 50791.1, 60 sec: 46148.2, 300 sec: 45764.1). Total num frames: 3624878080. Throughput: 0: 45264.9. Samples: 143362200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 10:15:55,503][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 10:15:56,699][73497] Updated weights for policy 0, policy_version 221249 (0.0032) [2024-06-13 10:16:00,502][73265] Fps is (10 sec: 44236.7, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 3625058304. Throughput: 0: 45599.4. Samples: 143646900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 10:16:00,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 10:16:01,261][73497] Updated weights for policy 0, policy_version 221259 (0.0030) [2024-06-13 10:16:03,715][73497] Updated weights for policy 0, policy_version 221269 (0.0032) [2024-06-13 10:16:05,501][73265] Fps is (10 sec: 40960.5, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 3625287680. Throughput: 0: 45524.1. Samples: 143771940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-13 10:16:05,502][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 10:16:08,029][73497] Updated weights for policy 0, policy_version 221279 (0.0030) [2024-06-13 10:16:10,501][73265] Fps is (10 sec: 49152.9, 60 sec: 45602.2, 300 sec: 45764.2). 
Total num frames: 3625549824. Throughput: 0: 45677.4. Samples: 144052620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 10:16:10,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 10:16:11,279][73497] Updated weights for policy 0, policy_version 221289 (0.0041) [2024-06-13 10:16:15,241][73497] Updated weights for policy 0, policy_version 221299 (0.0045) [2024-06-13 10:16:15,501][73265] Fps is (10 sec: 47513.1, 60 sec: 46148.2, 300 sec: 45486.4). Total num frames: 3625762816. Throughput: 0: 45707.9. Samples: 144331300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 10:16:15,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:16:18,154][73497] Updated weights for policy 0, policy_version 221309 (0.0035) [2024-06-13 10:16:20,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45056.1, 300 sec: 45542.0). Total num frames: 3625975808. Throughput: 0: 45758.8. Samples: 144463220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 10:16:20,502][73265] Avg episode reward: [(0, '0.485')] [2024-06-13 10:16:22,233][73497] Updated weights for policy 0, policy_version 221319 (0.0038) [2024-06-13 10:16:24,970][73497] Updated weights for policy 0, policy_version 221329 (0.0028) [2024-06-13 10:16:25,501][73265] Fps is (10 sec: 49151.7, 60 sec: 46148.2, 300 sec: 45764.1). Total num frames: 3626254336. Throughput: 0: 45853.3. Samples: 144741400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 10:16:25,502][73265] Avg episode reward: [(0, '0.377')] [2024-06-13 10:16:25,521][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000221329_3626254336.pth... [2024-06-13 10:16:25,583][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000220661_3615309824.pth [2024-06-13 10:16:29,872][73497] Updated weights for policy 0, policy_version 221339 (0.0038) [2024-06-13 10:16:30,501][73265] Fps is (10 sec: 49151.8, 60 sec: 46423.3, 300 sec: 45653.0). Total num frames: 3626467328. 
Throughput: 0: 45758.4. Samples: 145022960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 10:16:30,502][73265] Avg episode reward: [(0, '0.374')] [2024-06-13 10:16:32,440][73497] Updated weights for policy 0, policy_version 221349 (0.0032) [2024-06-13 10:16:35,502][73265] Fps is (10 sec: 40959.8, 60 sec: 45056.0, 300 sec: 45486.4). Total num frames: 3626663936. Throughput: 0: 46005.8. Samples: 145155040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 10:16:35,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 10:16:36,806][73497] Updated weights for policy 0, policy_version 221359 (0.0040) [2024-06-13 10:16:39,866][73497] Updated weights for policy 0, policy_version 221369 (0.0027) [2024-06-13 10:16:40,504][73265] Fps is (10 sec: 45863.6, 60 sec: 45873.3, 300 sec: 45708.2). Total num frames: 3626926080. Throughput: 0: 45940.2. Samples: 145429620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 10:16:40,504][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 10:16:43,306][73477] Signal inference workers to stop experience collection... (2200 times) [2024-06-13 10:16:43,308][73477] Signal inference workers to resume experience collection... (2200 times) [2024-06-13 10:16:43,316][73497] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-06-13 10:16:43,346][73497] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-06-13 10:16:44,020][73497] Updated weights for policy 0, policy_version 221379 (0.0034) [2024-06-13 10:16:45,501][73265] Fps is (10 sec: 50790.9, 60 sec: 46694.5, 300 sec: 45764.5). Total num frames: 3627171840. Throughput: 0: 45578.8. Samples: 145697940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 10:16:45,502][73265] Avg episode reward: [(0, '0.488')] [2024-06-13 10:16:46,906][73497] Updated weights for policy 0, policy_version 221389 (0.0037) [2024-06-13 10:16:50,501][73265] Fps is (10 sec: 39331.7, 60 sec: 45056.1, 300 sec: 45319.8). 
Total num frames: 3627319296. Throughput: 0: 45924.0. Samples: 145838520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 10:16:50,502][73265] Avg episode reward: [(0, '0.405')] [2024-06-13 10:16:51,372][73497] Updated weights for policy 0, policy_version 221399 (0.0029) [2024-06-13 10:16:53,979][73497] Updated weights for policy 0, policy_version 221409 (0.0029) [2024-06-13 10:16:55,506][73265] Fps is (10 sec: 42580.8, 60 sec: 45326.0, 300 sec: 45596.9). Total num frames: 3627597824. Throughput: 0: 45595.7. Samples: 146104620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 10:16:55,506][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 10:16:58,693][73497] Updated weights for policy 0, policy_version 221419 (0.0028) [2024-06-13 10:17:00,502][73265] Fps is (10 sec: 54065.7, 60 sec: 46694.3, 300 sec: 45764.1). Total num frames: 3627859968. Throughput: 0: 45639.3. Samples: 146385080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 10:17:00,502][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 10:17:01,466][73497] Updated weights for policy 0, policy_version 221429 (0.0031) [2024-06-13 10:17:05,454][73497] Updated weights for policy 0, policy_version 221439 (0.0030) [2024-06-13 10:17:05,501][73265] Fps is (10 sec: 45894.5, 60 sec: 46148.2, 300 sec: 45542.0). Total num frames: 3628056576. Throughput: 0: 45911.1. Samples: 146529220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 10:17:05,502][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 10:17:08,644][73497] Updated weights for policy 0, policy_version 221449 (0.0026) [2024-06-13 10:17:10,501][73265] Fps is (10 sec: 40960.7, 60 sec: 45329.0, 300 sec: 45486.4). Total num frames: 3628269568. Throughput: 0: 45678.3. Samples: 146796920. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 10:17:10,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 10:17:12,891][73497] Updated weights for policy 0, policy_version 221459 (0.0028) [2024-06-13 10:17:15,504][73265] Fps is (10 sec: 47501.6, 60 sec: 46146.4, 300 sec: 45708.2). Total num frames: 3628531712. Throughput: 0: 45405.9. Samples: 147066340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-13 10:17:15,504][73265] Avg episode reward: [(0, '0.476')] [2024-06-13 10:17:15,673][73497] Updated weights for policy 0, policy_version 221469 (0.0034) [2024-06-13 10:17:20,162][73497] Updated weights for policy 0, policy_version 221479 (0.0029) [2024-06-13 10:17:20,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45875.2, 300 sec: 45486.4). Total num frames: 3628728320. Throughput: 0: 45613.5. Samples: 147207640. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 10:17:20,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 10:17:23,031][73497] Updated weights for policy 0, policy_version 221489 (0.0033) [2024-06-13 10:17:25,501][73265] Fps is (10 sec: 39331.3, 60 sec: 44509.9, 300 sec: 45375.3). Total num frames: 3628924928. Throughput: 0: 45372.3. Samples: 147471260. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 10:17:25,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 10:17:27,374][73497] Updated weights for policy 0, policy_version 221499 (0.0030) [2024-06-13 10:17:30,422][73497] Updated weights for policy 0, policy_version 221509 (0.0030) [2024-06-13 10:17:30,501][73265] Fps is (10 sec: 47513.1, 60 sec: 45602.1, 300 sec: 45764.1). Total num frames: 3629203456. Throughput: 0: 45307.5. Samples: 147736780. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 10:17:30,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 10:17:34,677][73497] Updated weights for policy 0, policy_version 221519 (0.0033) [2024-06-13 10:17:35,501][73265] Fps is (10 sec: 49152.3, 60 sec: 45875.3, 300 sec: 45542.0). Total num frames: 3629416448. Throughput: 0: 45454.2. Samples: 147883960. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 10:17:35,502][73265] Avg episode reward: [(0, '0.488')] [2024-06-13 10:17:38,144][73497] Updated weights for policy 0, policy_version 221529 (0.0036) [2024-06-13 10:17:40,501][73265] Fps is (10 sec: 39321.6, 60 sec: 44511.7, 300 sec: 45319.8). Total num frames: 3629596672. Throughput: 0: 45305.5. Samples: 148143180. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 10:17:40,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 10:17:42,133][73497] Updated weights for policy 0, policy_version 221539 (0.0042) [2024-06-13 10:17:45,347][73497] Updated weights for policy 0, policy_version 221549 (0.0028) [2024-06-13 10:17:45,501][73265] Fps is (10 sec: 44236.7, 60 sec: 44783.0, 300 sec: 45597.5). Total num frames: 3629858816. Throughput: 0: 45052.7. Samples: 148412440. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 10:17:45,502][73265] Avg episode reward: [(0, '0.494')] [2024-06-13 10:17:49,345][73497] Updated weights for policy 0, policy_version 221559 (0.0037) [2024-06-13 10:17:50,501][73265] Fps is (10 sec: 47514.3, 60 sec: 45875.2, 300 sec: 45486.4). Total num frames: 3630071808. Throughput: 0: 45035.2. Samples: 148555800. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 10:17:50,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 10:17:50,974][73477] Signal inference workers to stop experience collection... 
(2250 times) [2024-06-13 10:17:51,019][73497] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-06-13 10:17:51,084][73477] Signal inference workers to resume experience collection... (2250 times) [2024-06-13 10:17:51,084][73497] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-06-13 10:17:52,424][73497] Updated weights for policy 0, policy_version 221569 (0.0028) [2024-06-13 10:17:55,501][73265] Fps is (10 sec: 40959.9, 60 sec: 44512.9, 300 sec: 45264.3). Total num frames: 3630268416. Throughput: 0: 44847.6. Samples: 148815060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 10:17:55,502][73265] Avg episode reward: [(0, '0.481')] [2024-06-13 10:17:56,698][73497] Updated weights for policy 0, policy_version 221579 (0.0037) [2024-06-13 10:17:59,872][73497] Updated weights for policy 0, policy_version 221589 (0.0029) [2024-06-13 10:18:00,501][73265] Fps is (10 sec: 45874.7, 60 sec: 44510.0, 300 sec: 45541.9). Total num frames: 3630530560. Throughput: 0: 44758.4. Samples: 149080360. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 10:18:00,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:18:03,782][73497] Updated weights for policy 0, policy_version 221599 (0.0038) [2024-06-13 10:18:05,501][73265] Fps is (10 sec: 47513.6, 60 sec: 44782.9, 300 sec: 45486.4). Total num frames: 3630743552. Throughput: 0: 44869.7. Samples: 149226780. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 10:18:05,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 10:18:07,179][73497] Updated weights for policy 0, policy_version 221609 (0.0032) [2024-06-13 10:18:10,504][73265] Fps is (10 sec: 42587.9, 60 sec: 44781.1, 300 sec: 45319.4). Total num frames: 3630956544. Throughput: 0: 44885.5. Samples: 149491220. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 10:18:10,504][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 10:18:11,303][73497] Updated weights for policy 0, policy_version 221619 (0.0044) [2024-06-13 10:18:14,360][73497] Updated weights for policy 0, policy_version 221629 (0.0040) [2024-06-13 10:18:15,501][73265] Fps is (10 sec: 45875.5, 60 sec: 44511.8, 300 sec: 45319.8). Total num frames: 3631202304. Throughput: 0: 45081.0. Samples: 149765420. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 10:18:15,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 10:18:18,323][73497] Updated weights for policy 0, policy_version 221639 (0.0045) [2024-06-13 10:18:20,501][73265] Fps is (10 sec: 47525.6, 60 sec: 45056.0, 300 sec: 45486.8). Total num frames: 3631431680. Throughput: 0: 44796.0. Samples: 149899780. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 10:18:20,502][73265] Avg episode reward: [(0, '0.416')] [2024-06-13 10:18:21,452][73497] Updated weights for policy 0, policy_version 221649 (0.0037) [2024-06-13 10:18:25,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 3631644672. Throughput: 0: 45186.8. Samples: 150176580. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-13 10:18:25,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 10:18:25,541][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000221659_3631661056.pth... [2024-06-13 10:18:25,550][73497] Updated weights for policy 0, policy_version 221659 (0.0032) [2024-06-13 10:18:25,588][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000220994_3620765696.pth [2024-06-13 10:18:28,771][73497] Updated weights for policy 0, policy_version 221669 (0.0030) [2024-06-13 10:18:30,501][73265] Fps is (10 sec: 44237.0, 60 sec: 44510.0, 300 sec: 45320.2). Total num frames: 3631874048. Throughput: 0: 45257.0. Samples: 150449000. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:18:30,502][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 10:18:32,646][73497] Updated weights for policy 0, policy_version 221679 (0.0032) [2024-06-13 10:18:35,502][73265] Fps is (10 sec: 47512.8, 60 sec: 45055.9, 300 sec: 45541.9). Total num frames: 3632119808. Throughput: 0: 45072.2. Samples: 150584060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:18:35,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 10:18:35,939][73497] Updated weights for policy 0, policy_version 221689 (0.0045) [2024-06-13 10:18:40,304][73497] Updated weights for policy 0, policy_version 221699 (0.0039) [2024-06-13 10:18:40,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.1, 300 sec: 45375.4). Total num frames: 3632316416. Throughput: 0: 45394.3. Samples: 150857800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:18:40,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 10:18:43,403][73497] Updated weights for policy 0, policy_version 221709 (0.0037) [2024-06-13 10:18:45,501][73265] Fps is (10 sec: 42599.0, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 3632545792. Throughput: 0: 45662.7. Samples: 151135180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:18:45,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 10:18:47,156][73497] Updated weights for policy 0, policy_version 221719 (0.0027) [2024-06-13 10:18:50,297][73497] Updated weights for policy 0, policy_version 221729 (0.0022) [2024-06-13 10:18:50,501][73265] Fps is (10 sec: 49151.4, 60 sec: 45602.0, 300 sec: 45541.9). Total num frames: 3632807936. Throughput: 0: 45354.1. Samples: 151267720. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:18:50,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 10:18:54,063][73497] Updated weights for policy 0, policy_version 221739 (0.0030) [2024-06-13 10:18:55,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 3633020928. Throughput: 0: 45707.8. Samples: 151547960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:18:55,502][73265] Avg episode reward: [(0, '0.413')] [2024-06-13 10:18:57,485][73497] Updated weights for policy 0, policy_version 221749 (0.0031) [2024-06-13 10:19:00,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.1, 300 sec: 45375.3). Total num frames: 3633266688. Throughput: 0: 45593.7. Samples: 151817140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:19:00,502][73265] Avg episode reward: [(0, '0.367')] [2024-06-13 10:19:01,431][73497] Updated weights for policy 0, policy_version 221759 (0.0037) [2024-06-13 10:19:04,718][73497] Updated weights for policy 0, policy_version 221769 (0.0029) [2024-06-13 10:19:05,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 3633496064. Throughput: 0: 45624.8. Samples: 151952900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:19:05,502][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 10:19:08,967][73497] Updated weights for policy 0, policy_version 221779 (0.0043) [2024-06-13 10:19:10,504][73265] Fps is (10 sec: 42588.0, 60 sec: 45602.1, 300 sec: 45375.0). Total num frames: 3633692672. Throughput: 0: 45740.5. Samples: 152235020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:19:10,504][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 10:19:11,810][73497] Updated weights for policy 0, policy_version 221789 (0.0028) [2024-06-13 10:19:15,501][73265] Fps is (10 sec: 42598.2, 60 sec: 45329.0, 300 sec: 45375.3). Total num frames: 3633922048. Throughput: 0: 45518.1. Samples: 152497320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:19:15,502][73265] Avg episode reward: [(0, '0.476')] [2024-06-13 10:19:15,787][73497] Updated weights for policy 0, policy_version 221799 (0.0044) [2024-06-13 10:19:17,977][73477] Signal inference workers to stop experience collection... (2300 times) [2024-06-13 10:19:18,021][73497] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-06-13 10:19:18,028][73477] Signal inference workers to resume experience collection... (2300 times) [2024-06-13 10:19:18,035][73497] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-06-13 10:19:19,296][73497] Updated weights for policy 0, policy_version 221809 (0.0037) [2024-06-13 10:19:20,501][73265] Fps is (10 sec: 47525.5, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 3634167808. Throughput: 0: 45832.1. Samples: 152646500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:19:20,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 10:19:22,740][73497] Updated weights for policy 0, policy_version 221819 (0.0033) [2024-06-13 10:19:25,501][73265] Fps is (10 sec: 49152.0, 60 sec: 46148.2, 300 sec: 45541.9). Total num frames: 3634413568. Throughput: 0: 45884.8. Samples: 152922620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:19:25,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 10:19:26,458][73497] Updated weights for policy 0, policy_version 221829 (0.0038) [2024-06-13 10:19:30,381][73497] Updated weights for policy 0, policy_version 221839 (0.0038) [2024-06-13 10:19:30,501][73265] Fps is (10 sec: 44236.4, 60 sec: 45602.0, 300 sec: 45375.3). Total num frames: 3634610176. Throughput: 0: 45736.8. Samples: 153193340. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:19:30,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 10:19:33,567][73497] Updated weights for policy 0, policy_version 221849 (0.0025) [2024-06-13 10:19:35,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45875.3, 300 sec: 45430.9). Total num frames: 3634872320. Throughput: 0: 45869.8. Samples: 153331860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-13 10:19:35,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 10:19:37,590][73497] Updated weights for policy 0, policy_version 221859 (0.0036) [2024-06-13 10:19:40,501][73265] Fps is (10 sec: 47514.0, 60 sec: 46148.2, 300 sec: 45597.5). Total num frames: 3635085312. Throughput: 0: 45724.1. Samples: 153605540. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 10:19:40,502][73265] Avg episode reward: [(0, '0.402')] [2024-06-13 10:19:40,523][73497] Updated weights for policy 0, policy_version 221869 (0.0036) [2024-06-13 10:19:44,484][73497] Updated weights for policy 0, policy_version 221879 (0.0043) [2024-06-13 10:19:45,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 3635298304. Throughput: 0: 45544.9. Samples: 153866660. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 10:19:45,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 10:19:48,200][73497] Updated weights for policy 0, policy_version 221889 (0.0041) [2024-06-13 10:19:50,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45875.3, 300 sec: 45597.5). Total num frames: 3635560448. Throughput: 0: 45754.7. Samples: 154011860. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 10:19:50,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 10:19:51,517][73497] Updated weights for policy 0, policy_version 221899 (0.0037) [2024-06-13 10:19:55,080][73497] Updated weights for policy 0, policy_version 221909 (0.0030) [2024-06-13 10:19:55,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45875.3, 300 sec: 45597.5). Total num frames: 3635773440. Throughput: 0: 45586.1. Samples: 154286280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 10:19:55,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 10:19:58,933][73497] Updated weights for policy 0, policy_version 221919 (0.0028) [2024-06-13 10:20:00,501][73265] Fps is (10 sec: 40959.8, 60 sec: 45056.0, 300 sec: 45319.8). Total num frames: 3635970048. Throughput: 0: 45876.0. Samples: 154561740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 10:20:00,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:20:02,387][73497] Updated weights for policy 0, policy_version 221929 (0.0032) [2024-06-13 10:20:05,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3636215808. Throughput: 0: 45352.5. Samples: 154687360. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 10:20:05,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 10:20:05,974][73497] Updated weights for policy 0, policy_version 221939 (0.0028) [2024-06-13 10:20:09,453][73497] Updated weights for policy 0, policy_version 221949 (0.0036) [2024-06-13 10:20:10,501][73265] Fps is (10 sec: 49152.2, 60 sec: 46150.2, 300 sec: 45653.0). Total num frames: 3636461568. Throughput: 0: 45640.1. Samples: 154976420. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 10:20:10,502][73265] Avg episode reward: [(0, '0.398')] [2024-06-13 10:20:11,619][73477] Signal inference workers to stop experience collection... 
(2350 times) [2024-06-13 10:20:11,624][73477] Signal inference workers to resume experience collection... (2350 times) [2024-06-13 10:20:11,634][73497] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-06-13 10:20:11,665][73497] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-06-13 10:20:12,877][73497] Updated weights for policy 0, policy_version 221959 (0.0026) [2024-06-13 10:20:15,501][73265] Fps is (10 sec: 42597.9, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 3636641792. Throughput: 0: 45621.3. Samples: 155246300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 10:20:15,502][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 10:20:16,654][73497] Updated weights for policy 0, policy_version 221969 (0.0033) [2024-06-13 10:20:20,134][73497] Updated weights for policy 0, policy_version 221979 (0.0032) [2024-06-13 10:20:20,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3636903936. Throughput: 0: 45307.2. Samples: 155370680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 10:20:20,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 10:20:23,933][73497] Updated weights for policy 0, policy_version 221989 (0.0044) [2024-06-13 10:20:25,501][73265] Fps is (10 sec: 50790.7, 60 sec: 45602.2, 300 sec: 45653.4). Total num frames: 3637149696. Throughput: 0: 45571.1. Samples: 155656240. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 10:20:25,502][73265] Avg episode reward: [(0, '0.400')] [2024-06-13 10:20:25,514][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000221995_3637166080.pth... [2024-06-13 10:20:25,567][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000221329_3626254336.pth [2024-06-13 10:20:27,627][73497] Updated weights for policy 0, policy_version 221999 (0.0034) [2024-06-13 10:20:30,502][73265] Fps is (10 sec: 42597.8, 60 sec: 45329.0, 300 sec: 45319.8). 
Total num frames: 3637329920. Throughput: 0: 45819.4. Samples: 155928540. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 10:20:30,502][73265] Avg episode reward: [(0, '0.393')] [2024-06-13 10:20:31,326][73497] Updated weights for policy 0, policy_version 222009 (0.0038) [2024-06-13 10:20:34,593][73497] Updated weights for policy 0, policy_version 222019 (0.0026) [2024-06-13 10:20:35,501][73265] Fps is (10 sec: 40960.1, 60 sec: 44783.0, 300 sec: 45375.4). Total num frames: 3637559296. Throughput: 0: 45278.2. Samples: 156049380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 10:20:35,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 10:20:38,394][73497] Updated weights for policy 0, policy_version 222029 (0.0035) [2024-06-13 10:20:40,504][73265] Fps is (10 sec: 50778.7, 60 sec: 45873.3, 300 sec: 45652.7). Total num frames: 3637837824. Throughput: 0: 45372.6. Samples: 156328160. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-13 10:20:40,504][73265] Avg episode reward: [(0, '0.529')] [2024-06-13 10:20:41,717][73497] Updated weights for policy 0, policy_version 222039 (0.0038) [2024-06-13 10:20:45,501][73265] Fps is (10 sec: 47513.4, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3638034432. Throughput: 0: 45556.9. Samples: 156611800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 10:20:45,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 10:20:45,694][73497] Updated weights for policy 0, policy_version 222049 (0.0027) [2024-06-13 10:20:49,106][73497] Updated weights for policy 0, policy_version 222059 (0.0032) [2024-06-13 10:20:50,501][73265] Fps is (10 sec: 42608.5, 60 sec: 45055.9, 300 sec: 45375.3). Total num frames: 3638263808. Throughput: 0: 45527.0. Samples: 156736080. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 10:20:50,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 10:20:52,800][73497] Updated weights for policy 0, policy_version 222069 (0.0041) [2024-06-13 10:20:55,502][73265] Fps is (10 sec: 47513.3, 60 sec: 45602.0, 300 sec: 45597.5). Total num frames: 3638509568. Throughput: 0: 45193.7. Samples: 157010140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 10:20:55,502][73265] Avg episode reward: [(0, '0.454')] [2024-06-13 10:20:56,207][73497] Updated weights for policy 0, policy_version 222079 (0.0035) [2024-06-13 10:21:00,015][73497] Updated weights for policy 0, policy_version 222089 (0.0037) [2024-06-13 10:21:00,502][73265] Fps is (10 sec: 47513.3, 60 sec: 46148.2, 300 sec: 45597.5). Total num frames: 3638738944. Throughput: 0: 45492.8. Samples: 157293480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 10:21:00,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:21:03,027][73497] Updated weights for policy 0, policy_version 222099 (0.0038) [2024-06-13 10:21:05,501][73265] Fps is (10 sec: 40960.4, 60 sec: 45056.0, 300 sec: 45319.8). Total num frames: 3638919168. Throughput: 0: 45633.3. Samples: 157424180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 10:21:05,502][73265] Avg episode reward: [(0, '0.400')] [2024-06-13 10:21:07,250][73497] Updated weights for policy 0, policy_version 222109 (0.0035) [2024-06-13 10:21:10,206][73497] Updated weights for policy 0, policy_version 222119 (0.0032) [2024-06-13 10:21:10,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3639197696. Throughput: 0: 45191.6. Samples: 157689860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 10:21:10,502][73265] Avg episode reward: [(0, '0.480')] [2024-06-13 10:21:14,232][73497] Updated weights for policy 0, policy_version 222129 (0.0038) [2024-06-13 10:21:15,501][73265] Fps is (10 sec: 50790.2, 60 sec: 46421.4, 300 sec: 45597.5). 
Total num frames: 3639427072. Throughput: 0: 45574.3. Samples: 157979380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 10:21:15,502][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 10:21:17,334][73497] Updated weights for policy 0, policy_version 222139 (0.0035) [2024-06-13 10:21:20,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45602.1, 300 sec: 45375.4). Total num frames: 3639640064. Throughput: 0: 45850.2. Samples: 158112640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 10:21:20,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 10:21:21,237][73497] Updated weights for policy 0, policy_version 222149 (0.0033) [2024-06-13 10:21:24,579][73497] Updated weights for policy 0, policy_version 222159 (0.0037) [2024-06-13 10:21:25,502][73265] Fps is (10 sec: 44236.0, 60 sec: 45328.9, 300 sec: 45430.8). Total num frames: 3639869440. Throughput: 0: 45699.1. Samples: 158384520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 10:21:25,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 10:21:28,354][73497] Updated weights for policy 0, policy_version 222169 (0.0035) [2024-06-13 10:21:30,502][73265] Fps is (10 sec: 49151.5, 60 sec: 46694.4, 300 sec: 45653.0). Total num frames: 3640131584. Throughput: 0: 45602.2. Samples: 158663900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 10:21:30,502][73265] Avg episode reward: [(0, '0.487')] [2024-06-13 10:21:31,083][73477] Signal inference workers to stop experience collection... (2400 times) [2024-06-13 10:21:31,125][73497] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-06-13 10:21:31,191][73477] Signal inference workers to resume experience collection... 
(2400 times) [2024-06-13 10:21:31,192][73497] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-06-13 10:21:31,323][73497] Updated weights for policy 0, policy_version 222179 (0.0038) [2024-06-13 10:21:35,507][73265] Fps is (10 sec: 45851.7, 60 sec: 46144.2, 300 sec: 45430.4). Total num frames: 3640328192. Throughput: 0: 46159.0. Samples: 158813480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 10:21:35,507][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 10:21:35,699][73497] Updated weights for policy 0, policy_version 222189 (0.0044) [2024-06-13 10:21:38,508][73497] Updated weights for policy 0, policy_version 222199 (0.0029) [2024-06-13 10:21:40,501][73265] Fps is (10 sec: 39322.3, 60 sec: 44784.8, 300 sec: 45264.3). Total num frames: 3640524800. Throughput: 0: 45792.6. Samples: 159070800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 10:21:40,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:21:42,687][73497] Updated weights for policy 0, policy_version 222209 (0.0030) [2024-06-13 10:21:45,501][73265] Fps is (10 sec: 47539.3, 60 sec: 46148.4, 300 sec: 45708.6). Total num frames: 3640803328. Throughput: 0: 45720.7. Samples: 159350900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 10:21:45,502][73265] Avg episode reward: [(0, '0.402')] [2024-06-13 10:21:45,995][73497] Updated weights for policy 0, policy_version 222219 (0.0038) [2024-06-13 10:21:49,822][73497] Updated weights for policy 0, policy_version 222229 (0.0038) [2024-06-13 10:21:50,502][73265] Fps is (10 sec: 52427.9, 60 sec: 46421.3, 300 sec: 45598.1). Total num frames: 3641049088. Throughput: 0: 46152.3. Samples: 159501040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-13 10:21:50,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 10:21:52,995][73497] Updated weights for policy 0, policy_version 222239 (0.0037) [2024-06-13 10:21:55,501][73265] Fps is (10 sec: 40959.5, 60 sec: 45056.0, 300 sec: 45264.3). 
Total num frames: 3641212928. Throughput: 0: 46234.6. Samples: 159770420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-13 10:21:55,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 10:21:56,820][73497] Updated weights for policy 0, policy_version 222249 (0.0037) [2024-06-13 10:21:59,841][73497] Updated weights for policy 0, policy_version 222259 (0.0031) [2024-06-13 10:22:00,501][73265] Fps is (10 sec: 45875.7, 60 sec: 46148.4, 300 sec: 45597.5). Total num frames: 3641507840. Throughput: 0: 45703.6. Samples: 160036040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-13 10:22:00,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 10:22:04,424][73497] Updated weights for policy 0, policy_version 222269 (0.0038) [2024-06-13 10:22:05,501][73265] Fps is (10 sec: 52429.0, 60 sec: 46967.5, 300 sec: 45653.0). Total num frames: 3641737216. Throughput: 0: 46231.1. Samples: 160193040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-13 10:22:05,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 10:22:06,670][73497] Updated weights for policy 0, policy_version 222279 (0.0039) [2024-06-13 10:22:10,504][73265] Fps is (10 sec: 40949.8, 60 sec: 45327.2, 300 sec: 45375.3). Total num frames: 3641917440. Throughput: 0: 46235.9. Samples: 160465240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-13 10:22:10,504][73265] Avg episode reward: [(0, '0.454')] [2024-06-13 10:22:11,465][73497] Updated weights for policy 0, policy_version 222289 (0.0035) [2024-06-13 10:22:14,482][73497] Updated weights for policy 0, policy_version 222299 (0.0032) [2024-06-13 10:22:15,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 3642163200. Throughput: 0: 45949.4. Samples: 160731620. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-13 10:22:15,502][73265] Avg episode reward: [(0, '0.476')] [2024-06-13 10:22:18,387][73497] Updated weights for policy 0, policy_version 222309 (0.0027) [2024-06-13 10:22:20,501][73265] Fps is (10 sec: 49164.5, 60 sec: 46148.3, 300 sec: 45708.6). Total num frames: 3642408960. Throughput: 0: 45798.8. Samples: 160874180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-13 10:22:20,502][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 10:22:21,263][73477] Signal inference workers to stop experience collection... (2450 times) [2024-06-13 10:22:21,316][73497] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-06-13 10:22:21,318][73477] Signal inference workers to resume experience collection... (2450 times) [2024-06-13 10:22:21,331][73497] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-06-13 10:22:21,445][73497] Updated weights for policy 0, policy_version 222319 (0.0038) [2024-06-13 10:22:25,379][73497] Updated weights for policy 0, policy_version 222329 (0.0029) [2024-06-13 10:22:25,501][73265] Fps is (10 sec: 47513.9, 60 sec: 46148.5, 300 sec: 45542.0). Total num frames: 3642638336. Throughput: 0: 46468.0. Samples: 161161860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-13 10:22:25,502][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 10:22:25,510][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000222329_3642638336.pth... [2024-06-13 10:22:25,562][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000221659_3631661056.pth [2024-06-13 10:22:28,162][73497] Updated weights for policy 0, policy_version 222339 (0.0035) [2024-06-13 10:22:30,501][73265] Fps is (10 sec: 40960.0, 60 sec: 44783.1, 300 sec: 45430.9). Total num frames: 3642818560. Throughput: 0: 46146.7. Samples: 161427500. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-13 10:22:30,502][73265] Avg episode reward: [(0, '0.372')] [2024-06-13 10:22:32,693][73497] Updated weights for policy 0, policy_version 222349 (0.0039) [2024-06-13 10:22:35,086][73497] Updated weights for policy 0, policy_version 222359 (0.0031) [2024-06-13 10:22:35,501][73265] Fps is (10 sec: 49151.7, 60 sec: 46698.5, 300 sec: 45875.2). Total num frames: 3643129856. Throughput: 0: 45805.4. Samples: 161562280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-13 10:22:35,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 10:22:40,001][73497] Updated weights for policy 0, policy_version 222369 (0.0035) [2024-06-13 10:22:40,502][73265] Fps is (10 sec: 50789.5, 60 sec: 46694.3, 300 sec: 45653.0). Total num frames: 3643326464. Throughput: 0: 45931.5. Samples: 161837340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-13 10:22:40,502][73265] Avg episode reward: [(0, '0.373')] [2024-06-13 10:22:43,134][73497] Updated weights for policy 0, policy_version 222379 (0.0033) [2024-06-13 10:22:45,501][73265] Fps is (10 sec: 37683.0, 60 sec: 45055.9, 300 sec: 45541.9). Total num frames: 3643506688. Throughput: 0: 46117.3. Samples: 162111320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-13 10:22:45,504][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 10:22:47,006][73497] Updated weights for policy 0, policy_version 222389 (0.0031) [2024-06-13 10:22:50,269][73497] Updated weights for policy 0, policy_version 222399 (0.0036) [2024-06-13 10:22:50,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45602.2, 300 sec: 45819.7). Total num frames: 3643785216. Throughput: 0: 45573.0. Samples: 162243820. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-13 10:22:50,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 10:22:54,038][73497] Updated weights for policy 0, policy_version 222409 (0.0031) [2024-06-13 10:22:55,501][73265] Fps is (10 sec: 50791.1, 60 sec: 46694.5, 300 sec: 45708.6). Total num frames: 3644014592. Throughput: 0: 45751.5. Samples: 162523940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-13 10:22:55,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 10:22:56,995][73497] Updated weights for policy 0, policy_version 222419 (0.0036) [2024-06-13 10:23:00,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45056.0, 300 sec: 45653.1). Total num frames: 3644211200. Throughput: 0: 46001.0. Samples: 162801660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-13 10:23:00,502][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 10:23:01,371][73497] Updated weights for policy 0, policy_version 222429 (0.0023) [2024-06-13 10:23:03,949][73497] Updated weights for policy 0, policy_version 222439 (0.0034) [2024-06-13 10:23:05,501][73265] Fps is (10 sec: 44236.4, 60 sec: 45329.1, 300 sec: 45764.5). Total num frames: 3644456960. Throughput: 0: 45781.7. Samples: 162934360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 10:23:05,502][73265] Avg episode reward: [(0, '0.409')] [2024-06-13 10:23:08,529][73497] Updated weights for policy 0, policy_version 222449 (0.0044) [2024-06-13 10:23:10,501][73265] Fps is (10 sec: 47513.1, 60 sec: 46150.1, 300 sec: 45708.6). Total num frames: 3644686336. Throughput: 0: 45516.3. Samples: 163210100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 10:23:10,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 10:23:11,887][73497] Updated weights for policy 0, policy_version 222459 (0.0032) [2024-06-13 10:23:15,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45875.3, 300 sec: 45708.6). Total num frames: 3644915712. Throughput: 0: 45702.7. Samples: 163484120. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 10:23:15,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 10:23:15,687][73497] Updated weights for policy 0, policy_version 222469 (0.0035) [2024-06-13 10:23:18,664][73497] Updated weights for policy 0, policy_version 222479 (0.0028) [2024-06-13 10:23:20,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45602.1, 300 sec: 45764.1). Total num frames: 3645145088. Throughput: 0: 45794.7. Samples: 163623040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 10:23:20,502][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 10:23:23,034][73497] Updated weights for policy 0, policy_version 222489 (0.0039) [2024-06-13 10:23:25,373][73477] Signal inference workers to stop experience collection... (2500 times) [2024-06-13 10:23:25,373][73477] Signal inference workers to resume experience collection... (2500 times) [2024-06-13 10:23:25,388][73497] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-06-13 10:23:25,389][73497] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-06-13 10:23:25,501][73265] Fps is (10 sec: 49151.8, 60 sec: 46148.3, 300 sec: 45875.2). Total num frames: 3645407232. Throughput: 0: 45785.9. Samples: 163897700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 10:23:25,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 10:23:25,524][73497] Updated weights for policy 0, policy_version 222499 (0.0049) [2024-06-13 10:23:30,010][73497] Updated weights for policy 0, policy_version 222509 (0.0036) [2024-06-13 10:23:30,502][73265] Fps is (10 sec: 44236.1, 60 sec: 46148.1, 300 sec: 45653.0). Total num frames: 3645587456. Throughput: 0: 45811.5. Samples: 164172840. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 10:23:30,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 10:23:33,050][73497] Updated weights for policy 0, policy_version 222519 (0.0030) [2024-06-13 10:23:35,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45056.0, 300 sec: 45819.7). Total num frames: 3645833216. Throughput: 0: 45704.4. Samples: 164300520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 10:23:35,502][73265] Avg episode reward: [(0, '0.402')] [2024-06-13 10:23:37,013][73497] Updated weights for policy 0, policy_version 222529 (0.0027) [2024-06-13 10:23:40,501][73265] Fps is (10 sec: 47514.0, 60 sec: 45602.2, 300 sec: 45819.7). Total num frames: 3646062592. Throughput: 0: 45718.6. Samples: 164581280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 10:23:40,502][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 10:23:40,650][73497] Updated weights for policy 0, policy_version 222539 (0.0049) [2024-06-13 10:23:44,668][73497] Updated weights for policy 0, policy_version 222549 (0.0026) [2024-06-13 10:23:45,501][73265] Fps is (10 sec: 44236.7, 60 sec: 46148.3, 300 sec: 45653.1). Total num frames: 3646275584. Throughput: 0: 45672.4. Samples: 164856920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 10:23:45,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 10:23:47,724][73497] Updated weights for policy 0, policy_version 222559 (0.0030) [2024-06-13 10:23:50,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.1, 300 sec: 45764.1). Total num frames: 3646521344. Throughput: 0: 45736.0. Samples: 164992480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 10:23:50,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 10:23:51,851][73497] Updated weights for policy 0, policy_version 222569 (0.0039) [2024-06-13 10:23:54,661][73497] Updated weights for policy 0, policy_version 222579 (0.0048) [2024-06-13 10:23:55,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45329.1, 300 sec: 45653.1). Total num frames: 3646734336. Throughput: 0: 45519.3. Samples: 165258460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 10:23:55,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 10:23:58,925][73497] Updated weights for policy 0, policy_version 222589 (0.0027) [2024-06-13 10:24:00,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45875.1, 300 sec: 45653.0). Total num frames: 3646963712. Throughput: 0: 45625.2. Samples: 165537260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 10:24:00,502][73265] Avg episode reward: [(0, '0.398')] [2024-06-13 10:24:02,493][73497] Updated weights for policy 0, policy_version 222599 (0.0034) [2024-06-13 10:24:05,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45602.2, 300 sec: 45764.5). Total num frames: 3647193088. Throughput: 0: 45334.2. Samples: 165663080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-13 10:24:05,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 10:24:05,889][73497] Updated weights for policy 0, policy_version 222609 (0.0044) [2024-06-13 10:24:09,401][73497] Updated weights for policy 0, policy_version 222619 (0.0033) [2024-06-13 10:24:10,501][73265] Fps is (10 sec: 47514.2, 60 sec: 45875.3, 300 sec: 45819.7). Total num frames: 3647438848. Throughput: 0: 45544.9. Samples: 165947220. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-13 10:24:10,502][73265] Avg episode reward: [(0, '0.407')] [2024-06-13 10:24:13,401][73497] Updated weights for policy 0, policy_version 222629 (0.0024) [2024-06-13 10:24:15,501][73265] Fps is (10 sec: 47513.2, 60 sec: 45875.1, 300 sec: 45764.1). Total num frames: 3647668224. Throughput: 0: 45716.5. Samples: 166230080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-13 10:24:15,502][73265] Avg episode reward: [(0, '0.373')] [2024-06-13 10:24:16,407][73497] Updated weights for policy 0, policy_version 222639 (0.0025) [2024-06-13 10:24:20,378][73497] Updated weights for policy 0, policy_version 222649 (0.0034) [2024-06-13 10:24:20,507][73265] Fps is (10 sec: 44212.1, 60 sec: 45597.9, 300 sec: 45652.2). Total num frames: 3647881216. Throughput: 0: 45777.0. Samples: 166360740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-13 10:24:20,507][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 10:24:23,261][73497] Updated weights for policy 0, policy_version 222659 (0.0036) [2024-06-13 10:24:25,502][73265] Fps is (10 sec: 45874.9, 60 sec: 45329.0, 300 sec: 45819.7). Total num frames: 3648126976. Throughput: 0: 45737.3. Samples: 166639460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-13 10:24:25,502][73265] Avg episode reward: [(0, '0.376')] [2024-06-13 10:24:25,631][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000222665_3648143360.pth... [2024-06-13 10:24:25,677][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000221995_3637166080.pth [2024-06-13 10:24:27,613][73497] Updated weights for policy 0, policy_version 222669 (0.0035) [2024-06-13 10:24:30,501][73265] Fps is (10 sec: 47539.9, 60 sec: 46148.4, 300 sec: 45708.6). Total num frames: 3648356352. Throughput: 0: 45680.9. Samples: 166912560. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-13 10:24:30,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 10:24:30,635][73497] Updated weights for policy 0, policy_version 222679 (0.0033) [2024-06-13 10:24:34,643][73497] Updated weights for policy 0, policy_version 222689 (0.0029) [2024-06-13 10:24:35,501][73265] Fps is (10 sec: 44237.5, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 3648569344. Throughput: 0: 45633.9. Samples: 167046000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-13 10:24:35,502][73265] Avg episode reward: [(0, '0.518')] [2024-06-13 10:24:37,694][73497] Updated weights for policy 0, policy_version 222699 (0.0032) [2024-06-13 10:24:40,501][73265] Fps is (10 sec: 47513.3, 60 sec: 46148.3, 300 sec: 45875.2). Total num frames: 3648831488. Throughput: 0: 46022.1. Samples: 167329460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-13 10:24:40,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 10:24:41,478][73497] Updated weights for policy 0, policy_version 222709 (0.0037) [2024-06-13 10:24:44,879][73497] Updated weights for policy 0, policy_version 222719 (0.0025) [2024-06-13 10:24:45,501][73265] Fps is (10 sec: 49152.1, 60 sec: 46421.4, 300 sec: 45764.1). Total num frames: 3649060864. Throughput: 0: 46017.5. Samples: 167608040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-13 10:24:45,502][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 10:24:48,791][73497] Updated weights for policy 0, policy_version 222729 (0.0031) [2024-06-13 10:24:49,671][73477] Signal inference workers to stop experience collection... (2550 times) [2024-06-13 10:24:49,718][73497] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-06-13 10:24:49,777][73477] Signal inference workers to resume experience collection... 
(2550 times) [2024-06-13 10:24:49,777][73497] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-06-13 10:24:50,506][73265] Fps is (10 sec: 42578.4, 60 sec: 45598.6, 300 sec: 45707.8). Total num frames: 3649257472. Throughput: 0: 46232.4. Samples: 167743760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-13 10:24:50,507][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 10:24:51,801][73497] Updated weights for policy 0, policy_version 222739 (0.0030) [2024-06-13 10:24:55,504][73265] Fps is (10 sec: 42587.4, 60 sec: 45873.2, 300 sec: 45819.3). Total num frames: 3649486848. Throughput: 0: 46037.8. Samples: 168019040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-13 10:24:55,504][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 10:24:56,016][73497] Updated weights for policy 0, policy_version 222749 (0.0028) [2024-06-13 10:24:59,219][73497] Updated weights for policy 0, policy_version 222759 (0.0033) [2024-06-13 10:25:00,501][73265] Fps is (10 sec: 49175.1, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 3649748992. Throughput: 0: 45934.6. Samples: 168297140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-13 10:25:00,502][73265] Avg episode reward: [(0, '0.406')] [2024-06-13 10:25:02,892][73497] Updated weights for policy 0, policy_version 222769 (0.0030) [2024-06-13 10:25:05,501][73265] Fps is (10 sec: 49164.5, 60 sec: 46421.3, 300 sec: 45819.7). Total num frames: 3649978368. Throughput: 0: 46167.0. Samples: 168438000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-13 10:25:05,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 10:25:06,190][73497] Updated weights for policy 0, policy_version 222779 (0.0040) [2024-06-13 10:25:09,601][73497] Updated weights for policy 0, policy_version 222789 (0.0033) [2024-06-13 10:25:10,501][73265] Fps is (10 sec: 44237.2, 60 sec: 45875.2, 300 sec: 45930.8). Total num frames: 3650191360. Throughput: 0: 46093.0. Samples: 168713640. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-13 10:25:10,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 10:25:13,220][73497] Updated weights for policy 0, policy_version 222799 (0.0035) [2024-06-13 10:25:15,502][73265] Fps is (10 sec: 45874.4, 60 sec: 46148.2, 300 sec: 45875.2). Total num frames: 3650437120. Throughput: 0: 46042.1. Samples: 168984460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-13 10:25:15,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 10:25:17,035][73497] Updated weights for policy 0, policy_version 222809 (0.0044) [2024-06-13 10:25:20,314][73497] Updated weights for policy 0, policy_version 222819 (0.0030) [2024-06-13 10:25:20,501][73265] Fps is (10 sec: 47513.5, 60 sec: 46425.6, 300 sec: 45819.7). Total num frames: 3650666496. Throughput: 0: 46243.1. Samples: 169126940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:25:20,502][73265] Avg episode reward: [(0, '0.399')] [2024-06-13 10:25:24,415][73497] Updated weights for policy 0, policy_version 222829 (0.0035) [2024-06-13 10:25:25,501][73265] Fps is (10 sec: 42598.7, 60 sec: 45602.2, 300 sec: 45875.2). Total num frames: 3650863104. Throughput: 0: 46029.8. Samples: 169400800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:25:25,502][73265] Avg episode reward: [(0, '0.491')] [2024-06-13 10:25:27,897][73497] Updated weights for policy 0, policy_version 222839 (0.0040) [2024-06-13 10:25:30,502][73265] Fps is (10 sec: 45874.5, 60 sec: 46148.1, 300 sec: 45986.3). Total num frames: 3651125248. Throughput: 0: 45797.6. Samples: 169668940. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:25:30,502][73265] Avg episode reward: [(0, '0.478')] [2024-06-13 10:25:31,303][73497] Updated weights for policy 0, policy_version 222849 (0.0027) [2024-06-13 10:25:34,788][73497] Updated weights for policy 0, policy_version 222859 (0.0027) [2024-06-13 10:25:35,502][73265] Fps is (10 sec: 49151.5, 60 sec: 46421.2, 300 sec: 45820.0). Total num frames: 3651354624. Throughput: 0: 46072.2. Samples: 169816800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:25:35,502][73265] Avg episode reward: [(0, '0.412')] [2024-06-13 10:25:37,995][73497] Updated weights for policy 0, policy_version 222869 (0.0038) [2024-06-13 10:25:40,501][73265] Fps is (10 sec: 44237.6, 60 sec: 45602.2, 300 sec: 45875.2). Total num frames: 3651567616. Throughput: 0: 46144.8. Samples: 170095440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:25:40,502][73265] Avg episode reward: [(0, '0.487')] [2024-06-13 10:25:41,877][73497] Updated weights for policy 0, policy_version 222879 (0.0034) [2024-06-13 10:25:45,027][73497] Updated weights for policy 0, policy_version 222889 (0.0041) [2024-06-13 10:25:45,501][73265] Fps is (10 sec: 45876.3, 60 sec: 45875.2, 300 sec: 45930.8). Total num frames: 3651813376. Throughput: 0: 45892.6. Samples: 170362300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:25:45,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 10:25:48,917][73497] Updated weights for policy 0, policy_version 222899 (0.0040) [2024-06-13 10:25:50,501][73265] Fps is (10 sec: 47513.1, 60 sec: 46425.0, 300 sec: 45875.2). Total num frames: 3652042752. Throughput: 0: 45950.1. Samples: 170505760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:25:50,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:25:52,566][73497] Updated weights for policy 0, policy_version 222909 (0.0028) [2024-06-13 10:25:55,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45877.2, 300 sec: 45764.2). 
Total num frames: 3652239360. Throughput: 0: 45985.8. Samples: 170783000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:25:55,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 10:25:56,398][73497] Updated weights for policy 0, policy_version 222919 (0.0040) [2024-06-13 10:25:59,555][73497] Updated weights for policy 0, policy_version 222929 (0.0037) [2024-06-13 10:26:00,501][73265] Fps is (10 sec: 42598.7, 60 sec: 45329.1, 300 sec: 45930.7). Total num frames: 3652468736. Throughput: 0: 45873.0. Samples: 171048740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:26:00,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 10:26:03,252][73497] Updated weights for policy 0, policy_version 222939 (0.0030) [2024-06-13 10:26:05,504][73265] Fps is (10 sec: 47501.5, 60 sec: 45600.2, 300 sec: 45819.3). Total num frames: 3652714496. Throughput: 0: 45785.9. Samples: 171187420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:26:05,504][73265] Avg episode reward: [(0, '0.375')] [2024-06-13 10:26:06,739][73497] Updated weights for policy 0, policy_version 222949 (0.0030) [2024-06-13 10:26:10,501][73265] Fps is (10 sec: 47513.4, 60 sec: 45875.2, 300 sec: 45819.7). Total num frames: 3652943872. Throughput: 0: 45939.2. Samples: 171468060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:26:10,502][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 10:26:10,801][73497] Updated weights for policy 0, policy_version 222959 (0.0034) [2024-06-13 10:26:13,683][73497] Updated weights for policy 0, policy_version 222969 (0.0033) [2024-06-13 10:26:15,502][73265] Fps is (10 sec: 42608.4, 60 sec: 45056.0, 300 sec: 45764.1). Total num frames: 3653140480. Throughput: 0: 45876.4. Samples: 171733380. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:26:15,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 10:26:17,958][73497] Updated weights for policy 0, policy_version 222979 (0.0046) [2024-06-13 10:26:20,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45875.2, 300 sec: 45930.8). Total num frames: 3653419008. Throughput: 0: 45642.9. Samples: 171870720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:26:20,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 10:26:21,458][73497] Updated weights for policy 0, policy_version 222989 (0.0036) [2024-06-13 10:26:24,825][73477] Signal inference workers to stop experience collection... (2600 times) [2024-06-13 10:26:24,826][73477] Signal inference workers to resume experience collection... (2600 times) [2024-06-13 10:26:24,845][73497] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-06-13 10:26:24,845][73497] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-06-13 10:26:25,138][73497] Updated weights for policy 0, policy_version 222999 (0.0039) [2024-06-13 10:26:25,506][73265] Fps is (10 sec: 49129.7, 60 sec: 46144.7, 300 sec: 45763.4). Total num frames: 3653632000. Throughput: 0: 45393.9. Samples: 172138380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:26:25,507][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:26:25,521][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000223000_3653632000.pth... [2024-06-13 10:26:25,571][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000222329_3642638336.pth [2024-06-13 10:26:28,602][73497] Updated weights for policy 0, policy_version 223009 (0.0033) [2024-06-13 10:26:30,502][73265] Fps is (10 sec: 42597.5, 60 sec: 45329.0, 300 sec: 45820.5). Total num frames: 3653844992. Throughput: 0: 45617.5. Samples: 172415100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 10:26:30,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 10:26:32,015][73497] Updated weights for policy 0, policy_version 223019 (0.0037) [2024-06-13 10:26:35,501][73265] Fps is (10 sec: 44257.4, 60 sec: 45329.2, 300 sec: 45930.7). Total num frames: 3654074368. Throughput: 0: 45369.4. Samples: 172547380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 10:26:35,502][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 10:26:35,862][73497] Updated weights for policy 0, policy_version 223029 (0.0035) [2024-06-13 10:26:39,665][73497] Updated weights for policy 0, policy_version 223039 (0.0037) [2024-06-13 10:26:40,501][73265] Fps is (10 sec: 45875.9, 60 sec: 45602.1, 300 sec: 45764.1). Total num frames: 3654303744. Throughput: 0: 45196.4. Samples: 172816840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 10:26:40,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 10:26:42,947][73497] Updated weights for policy 0, policy_version 223049 (0.0024) [2024-06-13 10:26:45,502][73265] Fps is (10 sec: 44236.4, 60 sec: 45055.9, 300 sec: 45653.0). Total num frames: 3654516736. Throughput: 0: 45437.7. Samples: 173093440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 10:26:45,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 10:26:46,982][73497] Updated weights for policy 0, policy_version 223059 (0.0039) [2024-06-13 10:26:50,134][73497] Updated weights for policy 0, policy_version 223069 (0.0038) [2024-06-13 10:26:50,504][73265] Fps is (10 sec: 45863.8, 60 sec: 45327.2, 300 sec: 45930.4). Total num frames: 3654762496. Throughput: 0: 45296.0. Samples: 173225740. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 10:26:50,504][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 10:26:54,256][73497] Updated weights for policy 0, policy_version 223079 (0.0039) [2024-06-13 10:26:55,501][73265] Fps is (10 sec: 45876.0, 60 sec: 45602.1, 300 sec: 45653.1). Total num frames: 3654975488. Throughput: 0: 45000.1. Samples: 173493060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 10:26:55,502][73265] Avg episode reward: [(0, '0.387')] [2024-06-13 10:26:57,678][73497] Updated weights for policy 0, policy_version 223089 (0.0038) [2024-06-13 10:27:00,501][73265] Fps is (10 sec: 45886.5, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 3655221248. Throughput: 0: 45203.7. Samples: 173767540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 10:27:00,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 10:27:01,043][73497] Updated weights for policy 0, policy_version 223099 (0.0029) [2024-06-13 10:27:05,020][73497] Updated weights for policy 0, policy_version 223109 (0.0033) [2024-06-13 10:27:05,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45330.9, 300 sec: 45820.0). Total num frames: 3655434240. Throughput: 0: 45130.2. Samples: 173901580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 10:27:05,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 10:27:08,529][73497] Updated weights for policy 0, policy_version 223119 (0.0035) [2024-06-13 10:27:10,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 3655663616. Throughput: 0: 45312.3. Samples: 174177220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 10:27:10,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 10:27:11,935][73497] Updated weights for policy 0, policy_version 223129 (0.0032) [2024-06-13 10:27:15,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45602.2, 300 sec: 45653.0). Total num frames: 3655876608. Throughput: 0: 45241.1. Samples: 174450940. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 10:27:15,502][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 10:27:15,934][73497] Updated weights for policy 0, policy_version 223139 (0.0033) [2024-06-13 10:27:19,442][73497] Updated weights for policy 0, policy_version 223149 (0.0041) [2024-06-13 10:27:20,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45055.9, 300 sec: 45708.6). Total num frames: 3656122368. Throughput: 0: 45094.2. Samples: 174576620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 10:27:20,502][73265] Avg episode reward: [(0, '0.397')] [2024-06-13 10:27:23,347][73497] Updated weights for policy 0, policy_version 223159 (0.0029) [2024-06-13 10:27:25,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45059.5, 300 sec: 45819.6). Total num frames: 3656335360. Throughput: 0: 45288.0. Samples: 174854800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 10:27:25,502][73265] Avg episode reward: [(0, '0.413')] [2024-06-13 10:27:26,695][73497] Updated weights for policy 0, policy_version 223169 (0.0040) [2024-06-13 10:27:30,258][73497] Updated weights for policy 0, policy_version 223179 (0.0041) [2024-06-13 10:27:30,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.3, 300 sec: 45597.5). Total num frames: 3656581120. Throughput: 0: 45324.6. Samples: 175133040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-13 10:27:30,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 10:27:34,153][73497] Updated weights for policy 0, policy_version 223189 (0.0042) [2024-06-13 10:27:35,501][73265] Fps is (10 sec: 47513.4, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 3656810496. Throughput: 0: 45287.3. Samples: 175263560. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 10:27:35,502][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 10:27:37,273][73497] Updated weights for policy 0, policy_version 223199 (0.0040) [2024-06-13 10:27:40,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45329.0, 300 sec: 45819.7). Total num frames: 3657023488. Throughput: 0: 45555.9. Samples: 175543080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 10:27:40,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 10:27:40,991][73497] Updated weights for policy 0, policy_version 223209 (0.0032) [2024-06-13 10:27:44,881][73497] Updated weights for policy 0, policy_version 223219 (0.0035) [2024-06-13 10:27:45,501][73265] Fps is (10 sec: 44237.2, 60 sec: 45602.2, 300 sec: 45653.0). Total num frames: 3657252864. Throughput: 0: 45474.7. Samples: 175813900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 10:27:45,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 10:27:47,904][73477] Signal inference workers to stop experience collection... (2650 times) [2024-06-13 10:27:47,905][73477] Signal inference workers to resume experience collection... (2650 times) [2024-06-13 10:27:47,952][73497] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-06-13 10:27:47,952][73497] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-06-13 10:27:48,038][73497] Updated weights for policy 0, policy_version 223229 (0.0040) [2024-06-13 10:27:50,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45331.0, 300 sec: 45653.0). Total num frames: 3657482240. Throughput: 0: 45426.7. Samples: 175945780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 10:27:50,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 10:27:52,220][73497] Updated weights for policy 0, policy_version 223239 (0.0032) [2024-06-13 10:27:55,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45329.0, 300 sec: 45708.6). Total num frames: 3657695232. Throughput: 0: 45444.9. 
Samples: 176222240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 10:27:55,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 10:27:55,512][73497] Updated weights for policy 0, policy_version 223249 (0.0037) [2024-06-13 10:27:59,066][73497] Updated weights for policy 0, policy_version 223259 (0.0034) [2024-06-13 10:28:00,502][73265] Fps is (10 sec: 45874.7, 60 sec: 45329.0, 300 sec: 45708.6). Total num frames: 3657940992. Throughput: 0: 45387.5. Samples: 176493380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 10:28:00,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 10:28:02,756][73497] Updated weights for policy 0, policy_version 223269 (0.0051) [2024-06-13 10:28:05,502][73265] Fps is (10 sec: 44236.3, 60 sec: 45055.9, 300 sec: 45597.5). Total num frames: 3658137600. Throughput: 0: 45585.3. Samples: 176627960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 10:28:05,502][73265] Avg episode reward: [(0, '0.476')] [2024-06-13 10:28:06,465][73497] Updated weights for policy 0, policy_version 223279 (0.0045) [2024-06-13 10:28:09,862][73497] Updated weights for policy 0, policy_version 223289 (0.0028) [2024-06-13 10:28:10,501][73265] Fps is (10 sec: 45875.9, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 3658399744. Throughput: 0: 45535.2. Samples: 176903880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 10:28:10,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 10:28:13,929][73497] Updated weights for policy 0, policy_version 223299 (0.0036) [2024-06-13 10:28:15,501][73265] Fps is (10 sec: 47514.0, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3658612736. Throughput: 0: 45254.7. Samples: 177169500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 10:28:15,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:28:17,258][73497] Updated weights for policy 0, policy_version 223309 (0.0027) [2024-06-13 10:28:20,501][73265] Fps is (10 sec: 40959.9, 60 sec: 44783.0, 300 sec: 45430.9). Total num frames: 3658809344. Throughput: 0: 45363.7. Samples: 177304920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 10:28:20,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 10:28:21,006][73497] Updated weights for policy 0, policy_version 223319 (0.0035) [2024-06-13 10:28:24,497][73497] Updated weights for policy 0, policy_version 223329 (0.0036) [2024-06-13 10:28:25,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 3659071488. Throughput: 0: 45389.9. Samples: 177585620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 10:28:25,502][73265] Avg episode reward: [(0, '0.472')] [2024-06-13 10:28:25,529][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000223332_3659071488.pth... [2024-06-13 10:28:25,578][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000222665_3648143360.pth [2024-06-13 10:28:27,838][73497] Updated weights for policy 0, policy_version 223339 (0.0029) [2024-06-13 10:28:30,501][73265] Fps is (10 sec: 47513.0, 60 sec: 45056.0, 300 sec: 45597.5). Total num frames: 3659284480. Throughput: 0: 45339.0. Samples: 177854160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 10:28:30,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 10:28:31,645][73497] Updated weights for policy 0, policy_version 223349 (0.0036) [2024-06-13 10:28:35,501][73265] Fps is (10 sec: 42598.4, 60 sec: 44783.0, 300 sec: 45542.0). Total num frames: 3659497472. Throughput: 0: 45419.1. Samples: 177989640. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 10:28:35,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 10:28:35,534][73497] Updated weights for policy 0, policy_version 223359 (0.0050) [2024-06-13 10:28:38,720][73497] Updated weights for policy 0, policy_version 223369 (0.0054) [2024-06-13 10:28:40,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 3659759616. Throughput: 0: 45300.9. Samples: 178260780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-13 10:28:40,502][73265] Avg episode reward: [(0, '0.412')] [2024-06-13 10:28:42,570][73497] Updated weights for policy 0, policy_version 223379 (0.0039) [2024-06-13 10:28:45,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3659956224. Throughput: 0: 45429.8. Samples: 178537720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 10:28:45,502][73265] Avg episode reward: [(0, '0.401')] [2024-06-13 10:28:46,182][73497] Updated weights for policy 0, policy_version 223389 (0.0029) [2024-06-13 10:28:49,566][73497] Updated weights for policy 0, policy_version 223399 (0.0026) [2024-06-13 10:28:50,503][73265] Fps is (10 sec: 44229.4, 60 sec: 45327.8, 300 sec: 45652.8). Total num frames: 3660201984. Throughput: 0: 45298.9. Samples: 178666480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 10:28:50,504][73265] Avg episode reward: [(0, '0.489')] [2024-06-13 10:28:53,236][73497] Updated weights for policy 0, policy_version 223409 (0.0034) [2024-06-13 10:28:55,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3660431360. Throughput: 0: 45465.7. Samples: 178949840. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 10:28:55,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 10:28:56,552][73497] Updated weights for policy 0, policy_version 223419 (0.0030) [2024-06-13 10:29:00,161][73497] Updated weights for policy 0, policy_version 223429 (0.0036) [2024-06-13 10:29:00,504][73265] Fps is (10 sec: 45872.8, 60 sec: 45327.5, 300 sec: 45652.7). Total num frames: 3660660736. Throughput: 0: 45664.0. Samples: 179224480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 10:29:00,504][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 10:29:04,009][73497] Updated weights for policy 0, policy_version 223439 (0.0033) [2024-06-13 10:29:05,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 3660873728. Throughput: 0: 45701.7. Samples: 179361500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 10:29:05,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 10:29:07,351][73497] Updated weights for policy 0, policy_version 223449 (0.0043) [2024-06-13 10:29:10,501][73265] Fps is (10 sec: 44246.8, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3661103104. Throughput: 0: 45417.8. Samples: 179629420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 10:29:10,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:29:11,119][73497] Updated weights for policy 0, policy_version 223459 (0.0031) [2024-06-13 10:29:14,773][73497] Updated weights for policy 0, policy_version 223469 (0.0040) [2024-06-13 10:29:15,269][73477] Signal inference workers to stop experience collection... (2700 times) [2024-06-13 10:29:15,269][73477] Signal inference workers to resume experience collection... 
(2700 times) [2024-06-13 10:29:15,294][73497] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-06-13 10:29:15,294][73497] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-06-13 10:29:15,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45602.2, 300 sec: 45653.9). Total num frames: 3661348864. Throughput: 0: 45585.0. Samples: 179905480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 10:29:15,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 10:29:18,201][73497] Updated weights for policy 0, policy_version 223479 (0.0040) [2024-06-13 10:29:20,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3661561856. Throughput: 0: 45812.0. Samples: 180051180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 10:29:20,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 10:29:21,626][73497] Updated weights for policy 0, policy_version 223489 (0.0040) [2024-06-13 10:29:25,228][73497] Updated weights for policy 0, policy_version 223499 (0.0030) [2024-06-13 10:29:25,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 3661807616. Throughput: 0: 45893.8. Samples: 180326000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 10:29:25,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 10:29:29,065][73497] Updated weights for policy 0, policy_version 223509 (0.0024) [2024-06-13 10:29:30,501][73265] Fps is (10 sec: 47513.4, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 3662036992. Throughput: 0: 45653.8. Samples: 180592140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 10:29:30,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 10:29:32,721][73497] Updated weights for policy 0, policy_version 223519 (0.0042) [2024-06-13 10:29:35,501][73265] Fps is (10 sec: 45874.7, 60 sec: 46148.2, 300 sec: 45542.0). Total num frames: 3662266368. Throughput: 0: 45807.9. Samples: 180727760. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 10:29:35,507][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 10:29:36,175][73497] Updated weights for policy 0, policy_version 223529 (0.0024) [2024-06-13 10:29:40,313][73497] Updated weights for policy 0, policy_version 223539 (0.0035) [2024-06-13 10:29:40,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 3662462976. Throughput: 0: 45535.6. Samples: 180998940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 10:29:40,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 10:29:43,243][73497] Updated weights for policy 0, policy_version 223549 (0.0036) [2024-06-13 10:29:45,501][73265] Fps is (10 sec: 45875.3, 60 sec: 46148.3, 300 sec: 45653.8). Total num frames: 3662725120. Throughput: 0: 45645.8. Samples: 181278440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 10:29:45,502][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 10:29:47,133][73497] Updated weights for policy 0, policy_version 223559 (0.0023) [2024-06-13 10:29:50,501][73265] Fps is (10 sec: 47513.4, 60 sec: 45603.4, 300 sec: 45597.9). Total num frames: 3662938112. Throughput: 0: 45630.6. Samples: 181414880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-13 10:29:50,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 10:29:50,605][73497] Updated weights for policy 0, policy_version 223569 (0.0039) [2024-06-13 10:29:54,350][73497] Updated weights for policy 0, policy_version 223579 (0.0031) [2024-06-13 10:29:55,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45875.3, 300 sec: 45542.0). Total num frames: 3663183872. Throughput: 0: 45730.7. Samples: 181687300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 10:29:55,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 10:29:57,872][73497] Updated weights for policy 0, policy_version 223589 (0.0036) [2024-06-13 10:30:00,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45603.9, 300 sec: 45486.4). 
Total num frames: 3663396864. Throughput: 0: 45595.6. Samples: 181957280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 10:30:00,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 10:30:01,833][73497] Updated weights for policy 0, policy_version 223599 (0.0037) [2024-06-13 10:30:04,963][73497] Updated weights for policy 0, policy_version 223609 (0.0027) [2024-06-13 10:30:05,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3663626240. Throughput: 0: 45378.6. Samples: 182093220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 10:30:05,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 10:30:08,905][73497] Updated weights for policy 0, policy_version 223619 (0.0036) [2024-06-13 10:30:10,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 45486.5). Total num frames: 3663855616. Throughput: 0: 45352.0. Samples: 182366840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 10:30:10,502][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 10:30:12,311][73497] Updated weights for policy 0, policy_version 223629 (0.0036) [2024-06-13 10:30:15,502][73265] Fps is (10 sec: 45874.6, 60 sec: 45602.0, 300 sec: 45486.4). Total num frames: 3664084992. Throughput: 0: 45633.2. Samples: 182645640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 10:30:15,502][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 10:30:15,706][73497] Updated weights for policy 0, policy_version 223639 (0.0034) [2024-06-13 10:30:19,169][73497] Updated weights for policy 0, policy_version 223649 (0.0022) [2024-06-13 10:30:20,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3664297984. Throughput: 0: 45708.5. Samples: 182784640. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 10:30:20,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:30:22,923][73497] Updated weights for policy 0, policy_version 223659 (0.0040) [2024-06-13 10:30:25,501][73265] Fps is (10 sec: 45875.9, 60 sec: 45602.1, 300 sec: 45486.5). Total num frames: 3664543744. Throughput: 0: 45730.2. Samples: 183056800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 10:30:25,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 10:30:25,510][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000223666_3664543744.pth... [2024-06-13 10:30:25,571][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000223000_3653632000.pth [2024-06-13 10:30:26,514][73497] Updated weights for policy 0, policy_version 223669 (0.0031) [2024-06-13 10:30:30,025][73497] Updated weights for policy 0, policy_version 223679 (0.0030) [2024-06-13 10:30:30,502][73265] Fps is (10 sec: 45874.4, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3664756736. Throughput: 0: 45671.0. Samples: 183333640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 10:30:30,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 10:30:33,499][73497] Updated weights for policy 0, policy_version 223689 (0.0028) [2024-06-13 10:30:35,501][73265] Fps is (10 sec: 45874.8, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3665002496. Throughput: 0: 45648.4. Samples: 183469060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 10:30:35,502][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 10:30:37,004][73497] Updated weights for policy 0, policy_version 223699 (0.0029) [2024-06-13 10:30:40,502][73265] Fps is (10 sec: 47512.2, 60 sec: 46147.9, 300 sec: 45486.4). Total num frames: 3665231872. Throughput: 0: 45737.7. Samples: 183745520. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 10:30:40,502][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 10:30:40,577][73497] Updated weights for policy 0, policy_version 223709 (0.0030) [2024-06-13 10:30:42,964][73477] Signal inference workers to stop experience collection... (2750 times) [2024-06-13 10:30:42,964][73477] Signal inference workers to resume experience collection... (2750 times) [2024-06-13 10:30:42,995][73497] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-06-13 10:30:42,995][73497] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-06-13 10:30:44,380][73497] Updated weights for policy 0, policy_version 223719 (0.0024) [2024-06-13 10:30:45,502][73265] Fps is (10 sec: 45875.0, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3665461248. Throughput: 0: 45871.3. Samples: 184021500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 10:30:45,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 10:30:47,922][73497] Updated weights for policy 0, policy_version 223729 (0.0033) [2024-06-13 10:30:50,501][73265] Fps is (10 sec: 44238.5, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3665674240. Throughput: 0: 45963.1. Samples: 184161560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 10:30:50,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 10:30:51,462][73497] Updated weights for policy 0, policy_version 223739 (0.0037) [2024-06-13 10:30:55,027][73497] Updated weights for policy 0, policy_version 223749 (0.0031) [2024-06-13 10:30:55,502][73265] Fps is (10 sec: 45874.9, 60 sec: 45602.0, 300 sec: 45597.5). Total num frames: 3665920000. Throughput: 0: 45877.1. Samples: 184431320. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 10:30:55,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 10:30:58,706][73497] Updated weights for policy 0, policy_version 223759 (0.0031) [2024-06-13 10:31:00,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45875.1, 300 sec: 45542.3). Total num frames: 3666149376. Throughput: 0: 45849.9. Samples: 184708880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 10:31:00,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 10:31:02,080][73497] Updated weights for policy 0, policy_version 223769 (0.0035) [2024-06-13 10:31:05,504][73265] Fps is (10 sec: 45864.7, 60 sec: 45873.3, 300 sec: 45541.6). Total num frames: 3666378752. Throughput: 0: 45796.1. Samples: 184845580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 10:31:05,504][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 10:31:05,850][73497] Updated weights for policy 0, policy_version 223779 (0.0033) [2024-06-13 10:31:09,107][73497] Updated weights for policy 0, policy_version 223789 (0.0042) [2024-06-13 10:31:10,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3666575360. Throughput: 0: 45710.2. Samples: 185113760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 10:31:10,502][73265] Avg episode reward: [(0, '0.376')] [2024-06-13 10:31:12,985][73497] Updated weights for policy 0, policy_version 223799 (0.0036) [2024-06-13 10:31:15,501][73265] Fps is (10 sec: 45886.7, 60 sec: 45875.3, 300 sec: 45486.4). Total num frames: 3666837504. Throughput: 0: 45663.3. Samples: 185388480. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 10:31:15,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 10:31:16,336][73497] Updated weights for policy 0, policy_version 223809 (0.0032) [2024-06-13 10:31:20,447][73497] Updated weights for policy 0, policy_version 223819 (0.0024) [2024-06-13 10:31:20,507][73265] Fps is (10 sec: 47488.0, 60 sec: 45871.1, 300 sec: 45486.3). Total num frames: 3667050496. Throughput: 0: 45795.5. Samples: 185530100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 10:31:20,507][73265] Avg episode reward: [(0, '0.377')] [2024-06-13 10:31:23,502][73497] Updated weights for policy 0, policy_version 223829 (0.0025) [2024-06-13 10:31:25,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3667279872. Throughput: 0: 45654.6. Samples: 185799960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 10:31:25,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 10:31:27,518][73497] Updated weights for policy 0, policy_version 223839 (0.0024) [2024-06-13 10:31:30,461][73497] Updated weights for policy 0, policy_version 223849 (0.0044) [2024-06-13 10:31:30,501][73265] Fps is (10 sec: 49178.3, 60 sec: 46421.4, 300 sec: 45653.0). Total num frames: 3667542016. Throughput: 0: 45621.0. Samples: 186074440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 10:31:30,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 10:31:34,442][73497] Updated weights for policy 0, policy_version 223859 (0.0029) [2024-06-13 10:31:35,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 3667738624. Throughput: 0: 45615.6. Samples: 186214260. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 10:31:35,503][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 10:31:38,077][73497] Updated weights for policy 0, policy_version 223869 (0.0033) [2024-06-13 10:31:40,504][73265] Fps is (10 sec: 40950.2, 60 sec: 45327.5, 300 sec: 45541.6). Total num frames: 3667951616. Throughput: 0: 45777.7. Samples: 186491420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 10:31:40,505][73265] Avg episode reward: [(0, '0.382')] [2024-06-13 10:31:41,411][73497] Updated weights for policy 0, policy_version 223879 (0.0033) [2024-06-13 10:31:44,907][73497] Updated weights for policy 0, policy_version 223889 (0.0047) [2024-06-13 10:31:45,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45875.3, 300 sec: 45597.9). Total num frames: 3668213760. Throughput: 0: 45607.5. Samples: 186761220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 10:31:45,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 10:31:48,975][73497] Updated weights for policy 0, policy_version 223899 (0.0031) [2024-06-13 10:31:50,504][73265] Fps is (10 sec: 49151.7, 60 sec: 46146.4, 300 sec: 45652.6). Total num frames: 3668443136. Throughput: 0: 45701.7. Samples: 186902160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 10:31:50,505][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 10:31:52,265][73497] Updated weights for policy 0, policy_version 223909 (0.0031) [2024-06-13 10:31:55,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45329.2, 300 sec: 45486.4). Total num frames: 3668639744. Throughput: 0: 45758.7. Samples: 187172900. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 10:31:55,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 10:31:56,155][73497] Updated weights for policy 0, policy_version 223919 (0.0032) [2024-06-13 10:31:59,545][73497] Updated weights for policy 0, policy_version 223929 (0.0031) [2024-06-13 10:32:00,501][73265] Fps is (10 sec: 42609.0, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3668869120. Throughput: 0: 45694.2. Samples: 187444720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 10:32:00,502][73265] Avg episode reward: [(0, '0.407')] [2024-06-13 10:32:03,039][73497] Updated weights for policy 0, policy_version 223939 (0.0037) [2024-06-13 10:32:05,501][73265] Fps is (10 sec: 49151.9, 60 sec: 45877.1, 300 sec: 45653.0). Total num frames: 3669131264. Throughput: 0: 45555.7. Samples: 187579860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-13 10:32:05,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:32:06,879][73497] Updated weights for policy 0, policy_version 223949 (0.0025) [2024-06-13 10:32:10,354][73497] Updated weights for policy 0, policy_version 223959 (0.0025) [2024-06-13 10:32:10,501][73265] Fps is (10 sec: 47513.9, 60 sec: 46148.3, 300 sec: 45653.1). Total num frames: 3669344256. Throughput: 0: 45832.9. Samples: 187862440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-13 10:32:10,502][73265] Avg episode reward: [(0, '0.392')] [2024-06-13 10:32:13,724][73497] Updated weights for policy 0, policy_version 223969 (0.0037) [2024-06-13 10:32:15,501][73265] Fps is (10 sec: 42598.1, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3669557248. Throughput: 0: 45627.5. Samples: 188127680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-13 10:32:15,502][73265] Avg episode reward: [(0, '0.399')] [2024-06-13 10:32:17,519][73477] Signal inference workers to stop experience collection... 
(2800 times) [2024-06-13 10:32:17,519][73477] Signal inference workers to resume experience collection... (2800 times) [2024-06-13 10:32:17,532][73497] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-06-13 10:32:17,532][73497] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-06-13 10:32:17,663][73497] Updated weights for policy 0, policy_version 223979 (0.0034) [2024-06-13 10:32:20,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45879.4, 300 sec: 45653.1). Total num frames: 3669803008. Throughput: 0: 45445.0. Samples: 188259280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-13 10:32:20,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 10:32:21,623][73497] Updated weights for policy 0, policy_version 223989 (0.0025) [2024-06-13 10:32:24,865][73497] Updated weights for policy 0, policy_version 223999 (0.0041) [2024-06-13 10:32:25,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3670016000. Throughput: 0: 45384.7. Samples: 188533620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-13 10:32:25,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 10:32:25,513][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000224000_3670016000.pth... [2024-06-13 10:32:25,564][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000223332_3659071488.pth [2024-06-13 10:32:28,743][73497] Updated weights for policy 0, policy_version 224009 (0.0040) [2024-06-13 10:32:30,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3670245376. Throughput: 0: 45469.8. Samples: 188807360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-13 10:32:30,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 10:32:31,771][73497] Updated weights for policy 0, policy_version 224019 (0.0031) [2024-06-13 10:32:35,503][73265] Fps is (10 sec: 45866.2, 60 sec: 45600.6, 300 sec: 45597.2). 
Total num frames: 3670474752. Throughput: 0: 45411.6. Samples: 188945660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-13 10:32:35,504][73265] Avg episode reward: [(0, '0.483')] [2024-06-13 10:32:35,558][73497] Updated weights for policy 0, policy_version 224029 (0.0035) [2024-06-13 10:32:39,299][73497] Updated weights for policy 0, policy_version 224039 (0.0037) [2024-06-13 10:32:40,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45877.1, 300 sec: 45597.5). Total num frames: 3670704128. Throughput: 0: 45543.1. Samples: 189222340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-13 10:32:40,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 10:32:42,407][73497] Updated weights for policy 0, policy_version 224049 (0.0043) [2024-06-13 10:32:45,501][73265] Fps is (10 sec: 42607.0, 60 sec: 44783.0, 300 sec: 45486.4). Total num frames: 3670900736. Throughput: 0: 45404.1. Samples: 189487900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-13 10:32:45,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 10:32:46,427][73497] Updated weights for policy 0, policy_version 224059 (0.0033) [2024-06-13 10:32:50,263][73497] Updated weights for policy 0, policy_version 224069 (0.0029) [2024-06-13 10:32:50,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45057.9, 300 sec: 45597.5). Total num frames: 3671146496. Throughput: 0: 45434.2. Samples: 189624400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-13 10:32:50,502][73265] Avg episode reward: [(0, '0.479')] [2024-06-13 10:32:53,549][73497] Updated weights for policy 0, policy_version 224079 (0.0031) [2024-06-13 10:32:55,501][73265] Fps is (10 sec: 49151.6, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 3671392256. Throughput: 0: 45392.8. Samples: 189905120. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-13 10:32:55,502][73265] Avg episode reward: [(0, '0.500')] [2024-06-13 10:32:57,622][73497] Updated weights for policy 0, policy_version 224089 (0.0026) [2024-06-13 10:33:00,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45602.2, 300 sec: 45653.1). Total num frames: 3671605248. Throughput: 0: 45575.3. Samples: 190178560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-13 10:33:00,502][73265] Avg episode reward: [(0, '0.402')] [2024-06-13 10:33:01,033][73497] Updated weights for policy 0, policy_version 224099 (0.0026) [2024-06-13 10:33:04,517][73497] Updated weights for policy 0, policy_version 224109 (0.0040) [2024-06-13 10:33:05,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3671851008. Throughput: 0: 45631.1. Samples: 190312680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-13 10:33:05,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 10:33:08,220][73497] Updated weights for policy 0, policy_version 224119 (0.0037) [2024-06-13 10:33:10,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3672064000. Throughput: 0: 45645.4. Samples: 190587660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-13 10:33:10,502][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 10:33:11,386][73497] Updated weights for policy 0, policy_version 224129 (0.0039) [2024-06-13 10:33:15,228][73497] Updated weights for policy 0, policy_version 224139 (0.0031) [2024-06-13 10:33:15,504][73265] Fps is (10 sec: 44225.6, 60 sec: 45600.3, 300 sec: 45708.2). Total num frames: 3672293376. Throughput: 0: 45577.0. Samples: 190858440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-13 10:33:15,504][73265] Avg episode reward: [(0, '0.413')] [2024-06-13 10:33:18,951][73497] Updated weights for policy 0, policy_version 224149 (0.0029) [2024-06-13 10:33:20,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45056.0, 300 sec: 45542.0). 
Total num frames: 3672506368. Throughput: 0: 45522.5. Samples: 190994080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 10:33:20,502][73265] Avg episode reward: [(0, '0.482')] [2024-06-13 10:33:22,328][73497] Updated weights for policy 0, policy_version 224159 (0.0026) [2024-06-13 10:33:25,501][73265] Fps is (10 sec: 44247.8, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3672735744. Throughput: 0: 45364.0. Samples: 191263720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 10:33:25,502][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 10:33:26,387][73497] Updated weights for policy 0, policy_version 224169 (0.0034) [2024-06-13 10:33:30,077][73497] Updated weights for policy 0, policy_version 224179 (0.0039) [2024-06-13 10:33:30,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45056.0, 300 sec: 45597.5). Total num frames: 3672948736. Throughput: 0: 45441.4. Samples: 191532760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 10:33:30,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 10:33:33,555][73497] Updated weights for policy 0, policy_version 224189 (0.0030) [2024-06-13 10:33:35,502][73265] Fps is (10 sec: 45874.8, 60 sec: 45330.5, 300 sec: 45542.0). Total num frames: 3673194496. Throughput: 0: 45446.6. Samples: 191669500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 10:33:35,502][73265] Avg episode reward: [(0, '0.485')] [2024-06-13 10:33:37,263][73497] Updated weights for policy 0, policy_version 224199 (0.0035) [2024-06-13 10:33:40,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45329.1, 300 sec: 45653.1). Total num frames: 3673423872. Throughput: 0: 45339.3. Samples: 191945380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 10:33:40,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 10:33:40,549][73497] Updated weights for policy 0, policy_version 224209 (0.0032) [2024-06-13 10:33:41,118][73477] Signal inference workers to stop experience collection... 
(2850 times) [2024-06-13 10:33:41,118][73477] Signal inference workers to resume experience collection... (2850 times) [2024-06-13 10:33:41,162][73497] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-06-13 10:33:41,162][73497] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-06-13 10:33:44,191][73497] Updated weights for policy 0, policy_version 224219 (0.0035) [2024-06-13 10:33:45,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45875.2, 300 sec: 45597.8). Total num frames: 3673653248. Throughput: 0: 45234.6. Samples: 192214120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 10:33:45,502][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 10:33:47,887][73497] Updated weights for policy 0, policy_version 224229 (0.0035) [2024-06-13 10:33:50,501][73265] Fps is (10 sec: 44236.2, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3673866240. Throughput: 0: 45256.8. Samples: 192349240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 10:33:50,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:33:51,358][73497] Updated weights for policy 0, policy_version 224239 (0.0027) [2024-06-13 10:33:55,182][73497] Updated weights for policy 0, policy_version 224249 (0.0037) [2024-06-13 10:33:55,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45056.0, 300 sec: 45542.3). Total num frames: 3674095616. Throughput: 0: 45355.4. Samples: 192628660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 10:33:55,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 10:33:58,884][73497] Updated weights for policy 0, policy_version 224259 (0.0035) [2024-06-13 10:34:00,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45329.0, 300 sec: 45597.5). Total num frames: 3674324992. Throughput: 0: 45300.7. Samples: 192896860. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 10:34:00,502][73265] Avg episode reward: [(0, '0.498')] [2024-06-13 10:34:02,144][73497] Updated weights for policy 0, policy_version 224269 (0.0037) [2024-06-13 10:34:05,502][73265] Fps is (10 sec: 45874.7, 60 sec: 45055.8, 300 sec: 45597.5). Total num frames: 3674554368. Throughput: 0: 45183.8. Samples: 193027360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 10:34:05,502][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 10:34:06,250][73497] Updated weights for policy 0, policy_version 224279 (0.0033) [2024-06-13 10:34:09,597][73497] Updated weights for policy 0, policy_version 224289 (0.0030) [2024-06-13 10:34:10,502][73265] Fps is (10 sec: 45874.9, 60 sec: 45328.9, 300 sec: 45541.9). Total num frames: 3674783744. Throughput: 0: 45367.9. Samples: 193305280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 10:34:10,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 10:34:13,275][73497] Updated weights for policy 0, policy_version 224299 (0.0041) [2024-06-13 10:34:15,501][73265] Fps is (10 sec: 44237.6, 60 sec: 45057.9, 300 sec: 45542.0). Total num frames: 3674996736. Throughput: 0: 45287.5. Samples: 193570700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 10:34:15,502][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 10:34:16,818][73497] Updated weights for policy 0, policy_version 224309 (0.0040) [2024-06-13 10:34:20,171][73497] Updated weights for policy 0, policy_version 224319 (0.0037) [2024-06-13 10:34:20,504][73265] Fps is (10 sec: 45864.2, 60 sec: 45600.2, 300 sec: 45541.6). Total num frames: 3675242496. Throughput: 0: 45222.0. Samples: 193704600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 10:34:20,504][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 10:34:24,086][73497] Updated weights for policy 0, policy_version 224329 (0.0031) [2024-06-13 10:34:25,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3675471872. Throughput: 0: 45407.0. Samples: 193988700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-13 10:34:25,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 10:34:25,538][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000224334_3675488256.pth... [2024-06-13 10:34:25,581][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000223666_3664543744.pth [2024-06-13 10:34:27,419][73497] Updated weights for policy 0, policy_version 224339 (0.0032) [2024-06-13 10:34:30,502][73265] Fps is (10 sec: 44247.4, 60 sec: 45602.0, 300 sec: 45486.4). Total num frames: 3675684864. Throughput: 0: 45411.9. Samples: 194257660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 10:34:30,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 10:34:31,087][73497] Updated weights for policy 0, policy_version 224349 (0.0035) [2024-06-13 10:34:34,871][73497] Updated weights for policy 0, policy_version 224359 (0.0047) [2024-06-13 10:34:35,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45056.1, 300 sec: 45542.0). Total num frames: 3675897856. Throughput: 0: 45372.9. Samples: 194391020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 10:34:35,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 10:34:38,584][73497] Updated weights for policy 0, policy_version 224369 (0.0031) [2024-06-13 10:34:40,502][73265] Fps is (10 sec: 45874.9, 60 sec: 45328.9, 300 sec: 45486.4). Total num frames: 3676143616. Throughput: 0: 45099.4. Samples: 194658140. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 10:34:40,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 10:34:41,981][73497] Updated weights for policy 0, policy_version 224379 (0.0031) [2024-06-13 10:34:45,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45055.9, 300 sec: 45486.4). Total num frames: 3676356608. Throughput: 0: 45284.0. Samples: 194934640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 10:34:45,502][73265] Avg episode reward: [(0, '0.501')] [2024-06-13 10:34:46,058][73497] Updated weights for policy 0, policy_version 224389 (0.0035) [2024-06-13 10:34:49,187][73497] Updated weights for policy 0, policy_version 224399 (0.0032) [2024-06-13 10:34:50,502][73265] Fps is (10 sec: 45875.2, 60 sec: 45602.0, 300 sec: 45486.4). Total num frames: 3676602368. Throughput: 0: 45392.0. Samples: 195070000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 10:34:50,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 10:34:53,022][73497] Updated weights for policy 0, policy_version 224409 (0.0036) [2024-06-13 10:34:55,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3676815360. Throughput: 0: 45242.8. Samples: 195341200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 10:34:55,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 10:34:56,652][73497] Updated weights for policy 0, policy_version 224419 (0.0036) [2024-06-13 10:35:00,238][73497] Updated weights for policy 0, policy_version 224429 (0.0035) [2024-06-13 10:35:00,501][73265] Fps is (10 sec: 44237.7, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3677044736. Throughput: 0: 45382.2. Samples: 195612900. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 10:35:00,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 10:35:03,795][73497] Updated weights for policy 0, policy_version 224439 (0.0038) [2024-06-13 10:35:05,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45056.1, 300 sec: 45430.9). Total num frames: 3677257728. Throughput: 0: 45331.9. Samples: 195744420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 10:35:05,502][73265] Avg episode reward: [(0, '0.491')] [2024-06-13 10:35:07,554][73477] Signal inference workers to stop experience collection... (2900 times) [2024-06-13 10:35:07,605][73477] Signal inference workers to resume experience collection... (2900 times) [2024-06-13 10:35:07,605][73497] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-06-13 10:35:07,614][73497] Updated weights for policy 0, policy_version 224449 (0.0036) [2024-06-13 10:35:07,641][73497] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-06-13 10:35:10,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45329.1, 300 sec: 45486.5). Total num frames: 3677503488. Throughput: 0: 45063.6. Samples: 196016560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 10:35:10,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 10:35:11,058][73497] Updated weights for policy 0, policy_version 224459 (0.0039) [2024-06-13 10:35:14,783][73497] Updated weights for policy 0, policy_version 224469 (0.0040) [2024-06-13 10:35:15,501][73265] Fps is (10 sec: 47513.0, 60 sec: 45602.0, 300 sec: 45541.9). Total num frames: 3677732864. Throughput: 0: 45136.5. Samples: 196288800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 10:35:15,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 10:35:18,125][73497] Updated weights for policy 0, policy_version 224479 (0.0032) [2024-06-13 10:35:20,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45331.0, 300 sec: 45486.4). Total num frames: 3677962240. Throughput: 0: 45172.5. 
Samples: 196423780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 10:35:20,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:35:22,043][73497] Updated weights for policy 0, policy_version 224489 (0.0030) [2024-06-13 10:35:25,280][73497] Updated weights for policy 0, policy_version 224499 (0.0033) [2024-06-13 10:35:25,502][73265] Fps is (10 sec: 45875.1, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3678191616. Throughput: 0: 45321.0. Samples: 196697580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 10:35:25,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 10:35:29,290][73497] Updated weights for policy 0, policy_version 224509 (0.0034) [2024-06-13 10:35:30,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.2, 300 sec: 45486.4). Total num frames: 3678420992. Throughput: 0: 45419.7. Samples: 196978520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 10:35:30,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 10:35:32,698][73497] Updated weights for policy 0, policy_version 224519 (0.0037) [2024-06-13 10:35:35,501][73265] Fps is (10 sec: 42598.7, 60 sec: 45329.0, 300 sec: 45375.4). Total num frames: 3678617600. Throughput: 0: 45228.1. Samples: 197105260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-13 10:35:35,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 10:35:36,823][73497] Updated weights for policy 0, policy_version 224529 (0.0036) [2024-06-13 10:35:39,939][73497] Updated weights for policy 0, policy_version 224539 (0.0044) [2024-06-13 10:35:40,502][73265] Fps is (10 sec: 44236.1, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3678863360. Throughput: 0: 45155.0. Samples: 197373180. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 10:35:40,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 10:35:44,021][73497] Updated weights for policy 0, policy_version 224549 (0.0040) [2024-06-13 10:35:45,501][73265] Fps is (10 sec: 47514.1, 60 sec: 45602.2, 300 sec: 45486.4). Total num frames: 3679092736. Throughput: 0: 45187.6. Samples: 197646340. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 10:35:45,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 10:35:47,117][73497] Updated weights for policy 0, policy_version 224559 (0.0043) [2024-06-13 10:35:50,501][73265] Fps is (10 sec: 44237.2, 60 sec: 45056.1, 300 sec: 45375.4). Total num frames: 3679305728. Throughput: 0: 45346.6. Samples: 197785020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 10:35:50,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 10:35:51,097][73497] Updated weights for policy 0, policy_version 224569 (0.0037) [2024-06-13 10:35:54,035][73497] Updated weights for policy 0, policy_version 224579 (0.0037) [2024-06-13 10:35:55,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 3679551488. Throughput: 0: 45416.9. Samples: 198060320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 10:35:55,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 10:35:58,192][73497] Updated weights for policy 0, policy_version 224589 (0.0032) [2024-06-13 10:36:00,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45602.1, 300 sec: 45431.3). Total num frames: 3679780864. Throughput: 0: 45373.0. Samples: 198330580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 10:36:00,502][73265] Avg episode reward: [(0, '0.483')] [2024-06-13 10:36:01,363][73497] Updated weights for policy 0, policy_version 224599 (0.0033) [2024-06-13 10:36:05,501][73265] Fps is (10 sec: 40960.1, 60 sec: 45056.0, 300 sec: 45375.4). Total num frames: 3679961088. Throughput: 0: 45387.1. Samples: 198466200. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 10:36:05,502][73265] Avg episode reward: [(0, '0.406')] [2024-06-13 10:36:05,956][73497] Updated weights for policy 0, policy_version 224609 (0.0026) [2024-06-13 10:36:08,855][73497] Updated weights for policy 0, policy_version 224619 (0.0027) [2024-06-13 10:36:10,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 3680239616. Throughput: 0: 45299.2. Samples: 198736040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 10:36:10,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 10:36:12,849][73497] Updated weights for policy 0, policy_version 224629 (0.0037) [2024-06-13 10:36:13,775][73477] Signal inference workers to stop experience collection... (2950 times) [2024-06-13 10:36:13,777][73477] Signal inference workers to resume experience collection... (2950 times) [2024-06-13 10:36:13,786][73497] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-06-13 10:36:13,816][73497] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-06-13 10:36:15,504][73265] Fps is (10 sec: 49139.6, 60 sec: 45327.2, 300 sec: 45431.3). Total num frames: 3680452608. Throughput: 0: 45267.7. Samples: 199015680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 10:36:15,513][73265] Avg episode reward: [(0, '0.412')] [2024-06-13 10:36:16,051][73497] Updated weights for policy 0, policy_version 224639 (0.0031) [2024-06-13 10:36:19,847][73497] Updated weights for policy 0, policy_version 224649 (0.0036) [2024-06-13 10:36:20,501][73265] Fps is (10 sec: 40960.7, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 3680649216. Throughput: 0: 45491.7. Samples: 199152380. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 10:36:20,502][73265] Avg episode reward: [(0, '0.391')] [2024-06-13 10:36:23,113][73497] Updated weights for policy 0, policy_version 224659 (0.0032) [2024-06-13 10:36:25,501][73265] Fps is (10 sec: 45886.5, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 3680911360. Throughput: 0: 45551.7. Samples: 199423000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 10:36:25,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 10:36:25,522][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000224665_3680911360.pth... [2024-06-13 10:36:25,579][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000224000_3670016000.pth [2024-06-13 10:36:27,290][73497] Updated weights for policy 0, policy_version 224669 (0.0036) [2024-06-13 10:36:30,083][73497] Updated weights for policy 0, policy_version 224679 (0.0026) [2024-06-13 10:36:30,501][73265] Fps is (10 sec: 50789.9, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3681157120. Throughput: 0: 45520.4. Samples: 199694760. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 10:36:30,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 10:36:34,376][73497] Updated weights for policy 0, policy_version 224689 (0.0037) [2024-06-13 10:36:35,505][73265] Fps is (10 sec: 44221.2, 60 sec: 45599.5, 300 sec: 45430.7). Total num frames: 3681353728. Throughput: 0: 45422.7. Samples: 199829200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 10:36:35,505][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 10:36:37,278][73497] Updated weights for policy 0, policy_version 224699 (0.0042) [2024-06-13 10:36:40,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45329.2, 300 sec: 45319.8). Total num frames: 3681583104. Throughput: 0: 45271.6. Samples: 200097540. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-13 10:36:40,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 10:36:41,418][73497] Updated weights for policy 0, policy_version 224709 (0.0031) [2024-06-13 10:36:44,999][73497] Updated weights for policy 0, policy_version 224719 (0.0030) [2024-06-13 10:36:45,501][73265] Fps is (10 sec: 45891.5, 60 sec: 45329.0, 300 sec: 45320.2). Total num frames: 3681812480. Throughput: 0: 45559.1. Samples: 200380740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-13 10:36:45,507][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 10:36:48,843][73497] Updated weights for policy 0, policy_version 224729 (0.0034) [2024-06-13 10:36:50,502][73265] Fps is (10 sec: 45874.4, 60 sec: 45602.0, 300 sec: 45430.9). Total num frames: 3682041856. Throughput: 0: 45423.8. Samples: 200510280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-13 10:36:50,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 10:36:51,993][73497] Updated weights for policy 0, policy_version 224739 (0.0030) [2024-06-13 10:36:55,501][73265] Fps is (10 sec: 42598.6, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 3682238464. Throughput: 0: 45537.4. Samples: 200785220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-13 10:36:55,502][73265] Avg episode reward: [(0, '0.412')] [2024-06-13 10:36:56,125][73497] Updated weights for policy 0, policy_version 224749 (0.0037) [2024-06-13 10:36:59,123][73497] Updated weights for policy 0, policy_version 224759 (0.0035) [2024-06-13 10:37:00,501][73265] Fps is (10 sec: 44237.6, 60 sec: 45056.0, 300 sec: 45264.3). Total num frames: 3682484224. Throughput: 0: 45232.3. Samples: 201051020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-13 10:37:00,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 10:37:03,446][73497] Updated weights for policy 0, policy_version 224769 (0.0033) [2024-06-13 10:37:05,501][73265] Fps is (10 sec: 49151.4, 60 sec: 46148.2, 300 sec: 45375.3). Total num frames: 3682729984. Throughput: 0: 45369.6. Samples: 201194020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-13 10:37:05,511][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 10:37:06,206][73497] Updated weights for policy 0, policy_version 224779 (0.0043) [2024-06-13 10:37:10,398][73497] Updated weights for policy 0, policy_version 224789 (0.0040) [2024-06-13 10:37:10,501][73265] Fps is (10 sec: 45874.7, 60 sec: 45056.0, 300 sec: 45375.4). Total num frames: 3682942976. Throughput: 0: 45260.0. Samples: 201459700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-13 10:37:10,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 10:37:13,800][73497] Updated weights for policy 0, policy_version 224799 (0.0037) [2024-06-13 10:37:15,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45604.1, 300 sec: 45375.3). Total num frames: 3683188736. Throughput: 0: 45290.7. Samples: 201732840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-13 10:37:15,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 10:37:17,630][73497] Updated weights for policy 0, policy_version 224809 (0.0029) [2024-06-13 10:37:20,502][73265] Fps is (10 sec: 45875.0, 60 sec: 45875.0, 300 sec: 45375.3). Total num frames: 3683401728. Throughput: 0: 45321.2. Samples: 201868500. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-13 10:37:20,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 10:37:20,800][73497] Updated weights for policy 0, policy_version 224819 (0.0029) [2024-06-13 10:37:25,152][73497] Updated weights for policy 0, policy_version 224829 (0.0034) [2024-06-13 10:37:25,504][73265] Fps is (10 sec: 42587.6, 60 sec: 45054.2, 300 sec: 45319.4). Total num frames: 3683614720. Throughput: 0: 45503.7. Samples: 202145320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-13 10:37:25,504][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:37:28,198][73497] Updated weights for policy 0, policy_version 224839 (0.0032) [2024-06-13 10:37:30,506][73265] Fps is (10 sec: 44215.8, 60 sec: 44779.3, 300 sec: 45319.4). Total num frames: 3683844096. Throughput: 0: 44997.8. Samples: 202405860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-13 10:37:30,507][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 10:37:32,426][73497] Updated weights for policy 0, policy_version 224849 (0.0040) [2024-06-13 10:37:33,598][73477] Signal inference workers to stop experience collection... (3000 times) [2024-06-13 10:37:33,604][73477] Signal inference workers to resume experience collection... (3000 times) [2024-06-13 10:37:33,617][73497] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-06-13 10:37:33,645][73497] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-06-13 10:37:35,383][73497] Updated weights for policy 0, policy_version 224859 (0.0027) [2024-06-13 10:37:35,501][73265] Fps is (10 sec: 47525.2, 60 sec: 45604.8, 300 sec: 45375.3). Total num frames: 3684089856. Throughput: 0: 45218.8. Samples: 202545120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-13 10:37:35,502][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 10:37:39,252][73497] Updated weights for policy 0, policy_version 224869 (0.0038) [2024-06-13 10:37:40,502][73265] Fps is (10 sec: 45896.7, 60 sec: 45328.9, 300 sec: 45430.9). Total num frames: 3684302848. Throughput: 0: 45239.3. Samples: 202821000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-13 10:37:40,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 10:37:42,680][73497] Updated weights for policy 0, policy_version 224879 (0.0030) [2024-06-13 10:37:45,501][73265] Fps is (10 sec: 42598.3, 60 sec: 45056.0, 300 sec: 45319.8). Total num frames: 3684515840. Throughput: 0: 45265.7. Samples: 203087980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-13 10:37:45,502][73265] Avg episode reward: [(0, '0.485')] [2024-06-13 10:37:46,372][73497] Updated weights for policy 0, policy_version 224889 (0.0033) [2024-06-13 10:37:49,638][73497] Updated weights for policy 0, policy_version 224899 (0.0035) [2024-06-13 10:37:50,501][73265] Fps is (10 sec: 45876.1, 60 sec: 45329.2, 300 sec: 45319.8). Total num frames: 3684761600. Throughput: 0: 45269.4. Samples: 203231140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-13 10:37:50,502][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 10:37:53,994][73497] Updated weights for policy 0, policy_version 224909 (0.0037) [2024-06-13 10:37:55,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45602.1, 300 sec: 45319.8). Total num frames: 3684974592. Throughput: 0: 45449.4. Samples: 203504920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 10:37:55,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 10:37:56,962][73497] Updated weights for policy 0, policy_version 224919 (0.0032) [2024-06-13 10:38:00,501][73265] Fps is (10 sec: 44236.4, 60 sec: 45329.0, 300 sec: 45264.3). Total num frames: 3685203968. Throughput: 0: 45337.6. Samples: 203773040. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 10:38:00,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 10:38:01,182][73497] Updated weights for policy 0, policy_version 224929 (0.0031) [2024-06-13 10:38:04,261][73497] Updated weights for policy 0, policy_version 224939 (0.0039) [2024-06-13 10:38:05,502][73265] Fps is (10 sec: 45874.9, 60 sec: 45056.0, 300 sec: 45319.8). Total num frames: 3685433344. Throughput: 0: 45357.8. Samples: 203909600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 10:38:05,502][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 10:38:08,082][73497] Updated weights for policy 0, policy_version 224949 (0.0038) [2024-06-13 10:38:10,502][73265] Fps is (10 sec: 45875.0, 60 sec: 45329.0, 300 sec: 45320.2). Total num frames: 3685662720. Throughput: 0: 45272.6. Samples: 204182480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 10:38:10,502][73265] Avg episode reward: [(0, '0.392')] [2024-06-13 10:38:11,609][73497] Updated weights for policy 0, policy_version 224959 (0.0039) [2024-06-13 10:38:15,249][73497] Updated weights for policy 0, policy_version 224969 (0.0034) [2024-06-13 10:38:15,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45056.0, 300 sec: 45375.3). Total num frames: 3685892096. Throughput: 0: 45554.7. Samples: 204455600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 10:38:15,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 10:38:18,645][73497] Updated weights for policy 0, policy_version 224979 (0.0029) [2024-06-13 10:38:20,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45329.1, 300 sec: 45375.3). Total num frames: 3686121472. Throughput: 0: 45441.7. Samples: 204590000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 10:38:20,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 10:38:22,830][73497] Updated weights for policy 0, policy_version 224989 (0.0039) [2024-06-13 10:38:25,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45604.0, 300 sec: 45430.9). 
Total num frames: 3686350848. Throughput: 0: 45229.6. Samples: 204856320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 10:38:25,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 10:38:25,600][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000224998_3686367232.pth... [2024-06-13 10:38:25,676][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000224334_3675488256.pth [2024-06-13 10:38:26,047][73497] Updated weights for policy 0, policy_version 224999 (0.0037) [2024-06-13 10:38:29,971][73497] Updated weights for policy 0, policy_version 225009 (0.0031) [2024-06-13 10:38:30,501][73265] Fps is (10 sec: 44237.3, 60 sec: 45332.8, 300 sec: 45319.8). Total num frames: 3686563840. Throughput: 0: 45301.4. Samples: 205126540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 10:38:30,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 10:38:33,431][73497] Updated weights for policy 0, policy_version 225019 (0.0028) [2024-06-13 10:38:35,504][73265] Fps is (10 sec: 44225.7, 60 sec: 45054.2, 300 sec: 45319.4). Total num frames: 3686793216. Throughput: 0: 45203.7. Samples: 205265420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 10:38:35,505][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 10:38:37,144][73497] Updated weights for policy 0, policy_version 225029 (0.0044) [2024-06-13 10:38:40,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45329.2, 300 sec: 45319.8). Total num frames: 3687022592. Throughput: 0: 45156.5. Samples: 205536960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 10:38:40,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 10:38:40,587][73497] Updated weights for policy 0, policy_version 225039 (0.0042) [2024-06-13 10:38:44,271][73497] Updated weights for policy 0, policy_version 225049 (0.0025) [2024-06-13 10:38:45,502][73265] Fps is (10 sec: 44247.4, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 3687235584. 
Throughput: 0: 45308.0. Samples: 205811900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 10:38:45,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 10:38:47,659][73497] Updated weights for policy 0, policy_version 225059 (0.0034) [2024-06-13 10:38:50,502][73265] Fps is (10 sec: 45874.6, 60 sec: 45329.0, 300 sec: 45375.3). Total num frames: 3687481344. Throughput: 0: 45160.0. Samples: 205941800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 10:38:50,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 10:38:51,616][73497] Updated weights for policy 0, policy_version 225069 (0.0039) [2024-06-13 10:38:54,417][73477] Signal inference workers to stop experience collection... (3050 times) [2024-06-13 10:38:54,417][73477] Signal inference workers to resume experience collection... (3050 times) [2024-06-13 10:38:54,452][73497] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-06-13 10:38:54,482][73497] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-06-13 10:38:55,098][73497] Updated weights for policy 0, policy_version 225079 (0.0041) [2024-06-13 10:38:55,501][73265] Fps is (10 sec: 47514.5, 60 sec: 45602.2, 300 sec: 45375.4). Total num frames: 3687710720. Throughput: 0: 45246.5. Samples: 206218560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 10:38:55,502][73265] Avg episode reward: [(0, '0.483')] [2024-06-13 10:38:58,655][73497] Updated weights for policy 0, policy_version 225089 (0.0038) [2024-06-13 10:39:00,501][73265] Fps is (10 sec: 44237.2, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 3687923712. Throughput: 0: 45292.9. Samples: 206493780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-13 10:39:00,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 10:39:02,079][73497] Updated weights for policy 0, policy_version 225099 (0.0035) [2024-06-13 10:39:05,501][73265] Fps is (10 sec: 45874.6, 60 sec: 45602.2, 300 sec: 45375.4). 
Total num frames: 3688169472. Throughput: 0: 45374.7. Samples: 206631860. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-13 10:39:05,502][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 10:39:05,645][73497] Updated weights for policy 0, policy_version 225109 (0.0035) [2024-06-13 10:39:09,569][73497] Updated weights for policy 0, policy_version 225119 (0.0035) [2024-06-13 10:39:10,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45056.1, 300 sec: 45319.8). Total num frames: 3688366080. Throughput: 0: 45439.0. Samples: 206901080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-13 10:39:10,502][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 10:39:12,817][73497] Updated weights for policy 0, policy_version 225129 (0.0037) [2024-06-13 10:39:15,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.0, 300 sec: 45320.2). Total num frames: 3688611840. Throughput: 0: 45523.0. Samples: 207175080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-13 10:39:15,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 10:39:16,602][73497] Updated weights for policy 0, policy_version 225139 (0.0042) [2024-06-13 10:39:20,304][73497] Updated weights for policy 0, policy_version 225149 (0.0030) [2024-06-13 10:39:20,501][73265] Fps is (10 sec: 47513.4, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 3688841216. Throughput: 0: 45399.3. Samples: 207308280. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-13 10:39:20,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 10:39:23,828][73497] Updated weights for policy 0, policy_version 225159 (0.0039) [2024-06-13 10:39:25,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45329.1, 300 sec: 45375.4). Total num frames: 3689070592. Throughput: 0: 45298.2. Samples: 207575380. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-13 10:39:25,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 10:39:27,462][73497] Updated weights for policy 0, policy_version 225169 (0.0030) [2024-06-13 10:39:30,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 3689299968. Throughput: 0: 45489.4. Samples: 207858920. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-13 10:39:30,502][73265] Avg episode reward: [(0, '0.411')] [2024-06-13 10:39:31,329][73497] Updated weights for policy 0, policy_version 225179 (0.0036) [2024-06-13 10:39:34,437][73497] Updated weights for policy 0, policy_version 225189 (0.0034) [2024-06-13 10:39:35,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45604.0, 300 sec: 45375.4). Total num frames: 3689529344. Throughput: 0: 45593.0. Samples: 207993480. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-13 10:39:35,502][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 10:39:38,402][73497] Updated weights for policy 0, policy_version 225199 (0.0030) [2024-06-13 10:39:40,502][73265] Fps is (10 sec: 44236.4, 60 sec: 45329.0, 300 sec: 45375.3). Total num frames: 3689742336. Throughput: 0: 45551.8. Samples: 208268400. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-13 10:39:40,502][73265] Avg episode reward: [(0, '0.486')] [2024-06-13 10:39:41,633][73497] Updated weights for policy 0, policy_version 225209 (0.0029) [2024-06-13 10:39:45,324][73497] Updated weights for policy 0, policy_version 225219 (0.0042) [2024-06-13 10:39:45,502][73265] Fps is (10 sec: 45874.6, 60 sec: 45875.2, 300 sec: 45375.4). Total num frames: 3689988096. Throughput: 0: 45346.5. Samples: 208534380. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-13 10:39:45,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 10:39:48,974][73497] Updated weights for policy 0, policy_version 225229 (0.0034) [2024-06-13 10:39:50,504][73265] Fps is (10 sec: 44227.7, 60 sec: 45054.5, 300 sec: 45319.5). Total num frames: 3690184704. Throughput: 0: 45395.6. Samples: 208674760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-13 10:39:50,504][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 10:39:53,070][73497] Updated weights for policy 0, policy_version 225239 (0.0040) [2024-06-13 10:39:55,501][73265] Fps is (10 sec: 42598.9, 60 sec: 45055.9, 300 sec: 45319.8). Total num frames: 3690414080. Throughput: 0: 45343.6. Samples: 208941540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-13 10:39:55,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 10:39:56,347][73497] Updated weights for policy 0, policy_version 225249 (0.0037) [2024-06-13 10:40:00,280][73497] Updated weights for policy 0, policy_version 225259 (0.0029) [2024-06-13 10:40:00,501][73265] Fps is (10 sec: 45885.0, 60 sec: 45329.1, 300 sec: 45375.3). Total num frames: 3690643456. Throughput: 0: 45432.0. Samples: 209219520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-13 10:40:00,502][73265] Avg episode reward: [(0, '0.495')] [2024-06-13 10:40:03,615][73497] Updated weights for policy 0, policy_version 225269 (0.0038) [2024-06-13 10:40:05,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45329.1, 300 sec: 45375.3). Total num frames: 3690889216. Throughput: 0: 45427.6. Samples: 209352520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-13 10:40:05,502][73265] Avg episode reward: [(0, '0.402')] [2024-06-13 10:40:07,406][73497] Updated weights for policy 0, policy_version 225279 (0.0033) [2024-06-13 10:40:10,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45875.3, 300 sec: 45375.4). Total num frames: 3691118592. Throughput: 0: 45687.6. Samples: 209631320. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-13 10:40:10,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 10:40:10,621][73497] Updated weights for policy 0, policy_version 225289 (0.0032) [2024-06-13 10:40:13,314][73477] Signal inference workers to stop experience collection... (3100 times) [2024-06-13 10:40:13,320][73477] Signal inference workers to resume experience collection... (3100 times) [2024-06-13 10:40:13,338][73497] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-06-13 10:40:13,338][73497] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-06-13 10:40:14,317][73497] Updated weights for policy 0, policy_version 225299 (0.0041) [2024-06-13 10:40:15,502][73265] Fps is (10 sec: 44236.4, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 3691331584. Throughput: 0: 45163.5. Samples: 209891280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 10:40:15,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 10:40:17,899][73497] Updated weights for policy 0, policy_version 225309 (0.0036) [2024-06-13 10:40:20,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45329.2, 300 sec: 45319.8). Total num frames: 3691560960. Throughput: 0: 45157.4. Samples: 210025560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 10:40:20,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 10:40:21,770][73497] Updated weights for policy 0, policy_version 225319 (0.0026) [2024-06-13 10:40:24,873][73497] Updated weights for policy 0, policy_version 225329 (0.0029) [2024-06-13 10:40:25,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 3691790336. Throughput: 0: 45358.7. Samples: 210309540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 10:40:25,502][73265] Avg episode reward: [(0, '0.492')] [2024-06-13 10:40:25,587][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000225330_3691806720.pth... 
[2024-06-13 10:40:25,635][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000224665_3680911360.pth [2024-06-13 10:40:29,283][73497] Updated weights for policy 0, policy_version 225339 (0.0029) [2024-06-13 10:40:30,501][73265] Fps is (10 sec: 45874.8, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3692019712. Throughput: 0: 45557.0. Samples: 210584440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 10:40:30,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 10:40:32,319][73497] Updated weights for policy 0, policy_version 225349 (0.0032) [2024-06-13 10:40:35,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45055.9, 300 sec: 45319.8). Total num frames: 3692232704. Throughput: 0: 45541.2. Samples: 210724020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 10:40:35,502][73265] Avg episode reward: [(0, '0.454')] [2024-06-13 10:40:36,234][73497] Updated weights for policy 0, policy_version 225359 (0.0040) [2024-06-13 10:40:39,441][73497] Updated weights for policy 0, policy_version 225369 (0.0040) [2024-06-13 10:40:40,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45602.3, 300 sec: 45375.3). Total num frames: 3692478464. Throughput: 0: 45571.2. Samples: 210992240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 10:40:40,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 10:40:43,272][73497] Updated weights for policy 0, policy_version 225379 (0.0030) [2024-06-13 10:40:45,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45056.1, 300 sec: 45375.4). Total num frames: 3692691456. Throughput: 0: 45541.3. Samples: 211268880. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 10:40:45,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 10:40:46,486][73497] Updated weights for policy 0, policy_version 225389 (0.0046) [2024-06-13 10:40:50,354][73497] Updated weights for policy 0, policy_version 225399 (0.0040) [2024-06-13 10:40:50,501][73265] Fps is (10 sec: 45874.7, 60 sec: 45876.8, 300 sec: 45375.3). Total num frames: 3692937216. Throughput: 0: 45564.4. Samples: 211402920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 10:40:50,502][73265] Avg episode reward: [(0, '0.501')] [2024-06-13 10:40:53,611][73497] Updated weights for policy 0, policy_version 225409 (0.0045) [2024-06-13 10:40:55,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45875.3, 300 sec: 45375.4). Total num frames: 3693166592. Throughput: 0: 45345.3. Samples: 211671860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 10:40:55,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 10:40:57,962][73497] Updated weights for policy 0, policy_version 225419 (0.0039) [2024-06-13 10:41:00,501][73265] Fps is (10 sec: 45875.9, 60 sec: 45875.3, 300 sec: 45542.0). Total num frames: 3693395968. Throughput: 0: 45814.9. Samples: 211952940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 10:41:00,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 10:41:01,072][73497] Updated weights for policy 0, policy_version 225429 (0.0043) [2024-06-13 10:41:05,041][73497] Updated weights for policy 0, policy_version 225439 (0.0034) [2024-06-13 10:41:05,501][73265] Fps is (10 sec: 44236.3, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 3693608960. Throughput: 0: 45775.4. Samples: 212085460. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 10:41:05,502][73265] Avg episode reward: [(0, '0.488')] [2024-06-13 10:41:08,467][73497] Updated weights for policy 0, policy_version 225449 (0.0036) [2024-06-13 10:41:10,502][73265] Fps is (10 sec: 44235.8, 60 sec: 45328.9, 300 sec: 45375.7). Total num frames: 3693838336. Throughput: 0: 45289.3. Samples: 212347560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 10:41:10,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 10:41:12,179][73497] Updated weights for policy 0, policy_version 225459 (0.0030) [2024-06-13 10:41:15,501][73265] Fps is (10 sec: 44237.3, 60 sec: 45329.2, 300 sec: 45430.9). Total num frames: 3694051328. Throughput: 0: 45220.5. Samples: 212619360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-13 10:41:15,502][73265] Avg episode reward: [(0, '0.399')] [2024-06-13 10:41:15,719][73497] Updated weights for policy 0, policy_version 225469 (0.0046) [2024-06-13 10:41:19,604][73497] Updated weights for policy 0, policy_version 225479 (0.0032) [2024-06-13 10:41:20,501][73265] Fps is (10 sec: 42599.0, 60 sec: 45055.9, 300 sec: 45264.3). Total num frames: 3694264320. Throughput: 0: 45264.5. Samples: 212760920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-13 10:41:20,502][73265] Avg episode reward: [(0, '0.413')] [2024-06-13 10:41:22,545][73497] Updated weights for policy 0, policy_version 225489 (0.0026) [2024-06-13 10:41:25,501][73265] Fps is (10 sec: 45874.8, 60 sec: 45329.1, 300 sec: 45264.3). Total num frames: 3694510080. Throughput: 0: 45280.4. Samples: 213029860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-13 10:41:25,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 10:41:26,691][73497] Updated weights for policy 0, policy_version 225499 (0.0041) [2024-06-13 10:41:29,111][73477] Signal inference workers to stop experience collection... 
(3150 times) [2024-06-13 10:41:29,155][73497] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-06-13 10:41:29,166][73477] Signal inference workers to resume experience collection... (3150 times) [2024-06-13 10:41:29,171][73497] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-06-13 10:41:30,096][73497] Updated weights for policy 0, policy_version 225509 (0.0036) [2024-06-13 10:41:30,502][73265] Fps is (10 sec: 49151.6, 60 sec: 45602.1, 300 sec: 45431.4). Total num frames: 3694755840. Throughput: 0: 45275.5. Samples: 213306280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-13 10:41:30,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:41:33,751][73497] Updated weights for policy 0, policy_version 225519 (0.0037) [2024-06-13 10:41:35,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 3694952448. Throughput: 0: 45343.6. Samples: 213443380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-13 10:41:35,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 10:41:37,679][73497] Updated weights for policy 0, policy_version 225529 (0.0034) [2024-06-13 10:41:40,501][73265] Fps is (10 sec: 42598.9, 60 sec: 45056.0, 300 sec: 45319.8). Total num frames: 3695181824. Throughput: 0: 45261.3. Samples: 213708620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-13 10:41:40,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 10:41:41,115][73497] Updated weights for policy 0, policy_version 225539 (0.0040) [2024-06-13 10:41:44,650][73497] Updated weights for policy 0, policy_version 225549 (0.0031) [2024-06-13 10:41:45,501][73265] Fps is (10 sec: 49151.8, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 3695443968. Throughput: 0: 45348.8. Samples: 213993640. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-13 10:41:45,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 10:41:48,134][73497] Updated weights for policy 0, policy_version 225559 (0.0039) [2024-06-13 10:41:50,501][73265] Fps is (10 sec: 49152.3, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 3695673344. Throughput: 0: 45565.5. Samples: 214135900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-13 10:41:50,502][73265] Avg episode reward: [(0, '0.482')] [2024-06-13 10:41:51,280][73497] Updated weights for policy 0, policy_version 225569 (0.0031) [2024-06-13 10:41:55,025][73497] Updated weights for policy 0, policy_version 225579 (0.0032) [2024-06-13 10:41:55,502][73265] Fps is (10 sec: 45874.9, 60 sec: 45602.0, 300 sec: 45486.4). Total num frames: 3695902720. Throughput: 0: 45928.5. Samples: 214414340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-13 10:41:55,504][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 10:41:58,704][73497] Updated weights for policy 0, policy_version 225589 (0.0040) [2024-06-13 10:42:00,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45875.2, 300 sec: 45486.4). Total num frames: 3696148480. Throughput: 0: 45940.0. Samples: 214686660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-13 10:42:00,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 10:42:02,079][73497] Updated weights for policy 0, policy_version 225599 (0.0033) [2024-06-13 10:42:05,502][73265] Fps is (10 sec: 42598.3, 60 sec: 45329.0, 300 sec: 45375.3). Total num frames: 3696328704. Throughput: 0: 45730.1. Samples: 214818780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-13 10:42:05,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 10:42:06,323][73497] Updated weights for policy 0, policy_version 225609 (0.0036) [2024-06-13 10:42:09,915][73497] Updated weights for policy 0, policy_version 225619 (0.0035) [2024-06-13 10:42:10,502][73265] Fps is (10 sec: 40959.2, 60 sec: 45329.1, 300 sec: 45319.8). 
Total num frames: 3696558080. Throughput: 0: 45813.2. Samples: 215091460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-13 10:42:10,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 10:42:13,499][73497] Updated weights for policy 0, policy_version 225629 (0.0036) [2024-06-13 10:42:15,501][73265] Fps is (10 sec: 47514.3, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 3696803840. Throughput: 0: 45574.8. Samples: 215357140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-13 10:42:15,502][73265] Avg episode reward: [(0, '0.411')] [2024-06-13 10:42:16,837][73497] Updated weights for policy 0, policy_version 225639 (0.0040) [2024-06-13 10:42:20,274][73497] Updated weights for policy 0, policy_version 225649 (0.0032) [2024-06-13 10:42:20,501][73265] Fps is (10 sec: 47514.2, 60 sec: 46148.2, 300 sec: 45486.8). Total num frames: 3697033216. Throughput: 0: 45857.7. Samples: 215506980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-13 10:42:20,502][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 10:42:23,759][73497] Updated weights for policy 0, policy_version 225659 (0.0034) [2024-06-13 10:42:25,501][73265] Fps is (10 sec: 42598.0, 60 sec: 45329.0, 300 sec: 45376.1). Total num frames: 3697229824. Throughput: 0: 45744.3. Samples: 215767120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-13 10:42:25,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 10:42:25,622][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000225663_3697262592.pth... [2024-06-13 10:42:25,659][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000224998_3686367232.pth [2024-06-13 10:42:27,578][73497] Updated weights for policy 0, policy_version 225669 (0.0031) [2024-06-13 10:42:30,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45602.2, 300 sec: 45430.9). Total num frames: 3697491968. Throughput: 0: 45438.3. Samples: 216038360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-13 10:42:30,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 10:42:30,980][73497] Updated weights for policy 0, policy_version 225679 (0.0043) [2024-06-13 10:42:35,340][73497] Updated weights for policy 0, policy_version 225689 (0.0040) [2024-06-13 10:42:35,502][73265] Fps is (10 sec: 45874.8, 60 sec: 45602.0, 300 sec: 45375.4). Total num frames: 3697688576. Throughput: 0: 45221.5. Samples: 216170880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-13 10:42:35,502][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 10:42:38,770][73497] Updated weights for policy 0, policy_version 225699 (0.0027) [2024-06-13 10:42:40,501][73265] Fps is (10 sec: 44236.3, 60 sec: 45875.1, 300 sec: 45486.4). Total num frames: 3697934336. Throughput: 0: 45148.0. Samples: 216446000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-13 10:42:40,502][73265] Avg episode reward: [(0, '0.401')] [2024-06-13 10:42:42,239][73497] Updated weights for policy 0, policy_version 225709 (0.0031) [2024-06-13 10:42:45,501][73265] Fps is (10 sec: 47514.2, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3698163712. Throughput: 0: 45213.7. Samples: 216721280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-13 10:42:45,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 10:42:45,547][73497] Updated weights for policy 0, policy_version 225719 (0.0033) [2024-06-13 10:42:49,264][73497] Updated weights for policy 0, policy_version 225729 (0.0042) [2024-06-13 10:42:50,502][73265] Fps is (10 sec: 44236.6, 60 sec: 45055.9, 300 sec: 45430.9). Total num frames: 3698376704. Throughput: 0: 45285.8. Samples: 216856640. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-13 10:42:50,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 10:42:52,612][73497] Updated weights for policy 0, policy_version 225739 (0.0030) [2024-06-13 10:42:55,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45056.1, 300 sec: 45430.9). Total num frames: 3698606080. Throughput: 0: 45415.7. Samples: 217135160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-13 10:42:55,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 10:42:56,279][73497] Updated weights for policy 0, policy_version 225749 (0.0033) [2024-06-13 10:43:00,047][73497] Updated weights for policy 0, policy_version 225759 (0.0034) [2024-06-13 10:43:00,501][73265] Fps is (10 sec: 45875.3, 60 sec: 44782.8, 300 sec: 45430.9). Total num frames: 3698835456. Throughput: 0: 45368.3. Samples: 217398720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-13 10:43:00,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 10:43:02,656][73477] Signal inference workers to stop experience collection... (3200 times) [2024-06-13 10:43:02,698][73497] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-06-13 10:43:02,771][73477] Signal inference workers to resume experience collection... (3200 times) [2024-06-13 10:43:02,771][73497] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-06-13 10:43:03,891][73497] Updated weights for policy 0, policy_version 225769 (0.0039) [2024-06-13 10:43:05,501][73265] Fps is (10 sec: 44237.2, 60 sec: 45329.2, 300 sec: 45375.4). Total num frames: 3699048448. Throughput: 0: 45032.1. Samples: 217533420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-13 10:43:05,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 10:43:07,711][73497] Updated weights for policy 0, policy_version 225779 (0.0032) [2024-06-13 10:43:10,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45602.2, 300 sec: 45430.9). Total num frames: 3699294208. Throughput: 0: 45338.2. 
Samples: 217807340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-13 10:43:10,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 10:43:11,179][73497] Updated weights for policy 0, policy_version 225789 (0.0038) [2024-06-13 10:43:14,435][73497] Updated weights for policy 0, policy_version 225799 (0.0025) [2024-06-13 10:43:15,501][73265] Fps is (10 sec: 47513.1, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3699523584. Throughput: 0: 45605.3. Samples: 218090600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-13 10:43:15,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 10:43:18,143][73497] Updated weights for policy 0, policy_version 225809 (0.0033) [2024-06-13 10:43:20,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3699752960. Throughput: 0: 45660.7. Samples: 218225600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-13 10:43:20,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 10:43:21,451][73497] Updated weights for policy 0, policy_version 225819 (0.0040) [2024-06-13 10:43:25,215][73497] Updated weights for policy 0, policy_version 225829 (0.0031) [2024-06-13 10:43:25,504][73265] Fps is (10 sec: 45863.8, 60 sec: 45873.3, 300 sec: 45486.0). Total num frames: 3699982336. Throughput: 0: 45615.3. Samples: 218498800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-13 10:43:25,504][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:43:29,186][73497] Updated weights for policy 0, policy_version 225839 (0.0035) [2024-06-13 10:43:30,502][73265] Fps is (10 sec: 45874.3, 60 sec: 45328.9, 300 sec: 45486.8). Total num frames: 3700211712. Throughput: 0: 45396.3. Samples: 218764120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-13 10:43:30,502][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 10:43:32,309][73497] Updated weights for policy 0, policy_version 225849 (0.0045) [2024-06-13 10:43:35,501][73265] Fps is (10 sec: 45886.9, 60 sec: 45875.4, 300 sec: 45486.4). Total num frames: 3700441088. Throughput: 0: 45393.0. Samples: 218899320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-13 10:43:35,502][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 10:43:36,338][73497] Updated weights for policy 0, policy_version 225859 (0.0029) [2024-06-13 10:43:39,522][73497] Updated weights for policy 0, policy_version 225869 (0.0033) [2024-06-13 10:43:40,504][73265] Fps is (10 sec: 45864.4, 60 sec: 45600.3, 300 sec: 45541.6). Total num frames: 3700670464. Throughput: 0: 45438.4. Samples: 219180000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 10:43:40,505][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 10:43:43,451][73497] Updated weights for policy 0, policy_version 225879 (0.0041) [2024-06-13 10:43:45,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3700899840. Throughput: 0: 45581.4. Samples: 219449880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 10:43:45,502][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 10:43:47,373][73497] Updated weights for policy 0, policy_version 225889 (0.0036) [2024-06-13 10:43:50,491][73497] Updated weights for policy 0, policy_version 225899 (0.0036) [2024-06-13 10:43:50,501][73265] Fps is (10 sec: 45886.6, 60 sec: 45875.3, 300 sec: 45486.4). Total num frames: 3701129216. Throughput: 0: 45716.4. Samples: 219590660. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 10:43:50,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 10:43:54,295][73497] Updated weights for policy 0, policy_version 225909 (0.0034) [2024-06-13 10:43:55,502][73265] Fps is (10 sec: 42598.0, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3701325824. Throughput: 0: 45603.5. Samples: 219859500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 10:43:55,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 10:43:57,953][73497] Updated weights for policy 0, policy_version 225919 (0.0038) [2024-06-13 10:44:00,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45329.1, 300 sec: 45375.4). Total num frames: 3701555200. Throughput: 0: 45215.1. Samples: 220125280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 10:44:00,502][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 10:44:01,268][73497] Updated weights for policy 0, policy_version 225929 (0.0032) [2024-06-13 10:44:05,166][73497] Updated weights for policy 0, policy_version 225939 (0.0033) [2024-06-13 10:44:05,501][73265] Fps is (10 sec: 47514.0, 60 sec: 45875.1, 300 sec: 45542.0). Total num frames: 3701800960. Throughput: 0: 45404.8. Samples: 220268820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 10:44:05,510][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 10:44:08,259][73497] Updated weights for policy 0, policy_version 225949 (0.0035) [2024-06-13 10:44:10,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45602.2, 300 sec: 45486.4). Total num frames: 3702030336. Throughput: 0: 45609.3. Samples: 220551100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 10:44:10,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 10:44:12,194][73497] Updated weights for policy 0, policy_version 225959 (0.0027) [2024-06-13 10:44:15,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3702259712. Throughput: 0: 45762.8. Samples: 220823440. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 10:44:15,502][73265] Avg episode reward: [(0, '0.400')] [2024-06-13 10:44:15,568][73497] Updated weights for policy 0, policy_version 225969 (0.0033) [2024-06-13 10:44:19,025][73477] Signal inference workers to stop experience collection... (3250 times) [2024-06-13 10:44:19,026][73477] Signal inference workers to resume experience collection... (3250 times) [2024-06-13 10:44:19,037][73497] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-06-13 10:44:19,037][73497] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-06-13 10:44:19,170][73497] Updated weights for policy 0, policy_version 225979 (0.0019) [2024-06-13 10:44:20,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3702505472. Throughput: 0: 45814.2. Samples: 220960960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 10:44:20,502][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 10:44:22,898][73497] Updated weights for policy 0, policy_version 225989 (0.0046) [2024-06-13 10:44:25,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45057.9, 300 sec: 45375.3). Total num frames: 3702685696. Throughput: 0: 45622.1. Samples: 221232880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 10:44:25,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 10:44:25,544][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000225995_3702702080.pth... [2024-06-13 10:44:25,613][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000225330_3691806720.pth [2024-06-13 10:44:26,679][73497] Updated weights for policy 0, policy_version 225999 (0.0039) [2024-06-13 10:44:30,046][73497] Updated weights for policy 0, policy_version 226009 (0.0038) [2024-06-13 10:44:30,501][73265] Fps is (10 sec: 44236.3, 60 sec: 45602.2, 300 sec: 45486.4). Total num frames: 3702947840. Throughput: 0: 45364.4. Samples: 221491280. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 10:44:30,502][73265] Avg episode reward: [(0, '0.507')] [2024-06-13 10:44:33,738][73497] Updated weights for policy 0, policy_version 226019 (0.0033) [2024-06-13 10:44:35,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45329.0, 300 sec: 45486.4). Total num frames: 3703160832. Throughput: 0: 45513.3. Samples: 221638760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 10:44:35,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 10:44:37,180][73497] Updated weights for policy 0, policy_version 226029 (0.0031) [2024-06-13 10:44:40,501][73265] Fps is (10 sec: 42598.8, 60 sec: 45057.9, 300 sec: 45375.4). Total num frames: 3703373824. Throughput: 0: 45535.3. Samples: 221908580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 10:44:40,502][73265] Avg episode reward: [(0, '0.413')] [2024-06-13 10:44:41,239][73497] Updated weights for policy 0, policy_version 226039 (0.0026) [2024-06-13 10:44:44,492][73497] Updated weights for policy 0, policy_version 226049 (0.0034) [2024-06-13 10:44:45,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45056.0, 300 sec: 45486.8). Total num frames: 3703603200. Throughput: 0: 45698.2. Samples: 222181700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 10:44:45,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 10:44:48,286][73497] Updated weights for policy 0, policy_version 226059 (0.0036) [2024-06-13 10:44:50,502][73265] Fps is (10 sec: 47512.9, 60 sec: 45329.0, 300 sec: 45541.9). Total num frames: 3703848960. Throughput: 0: 45663.5. Samples: 222323680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-13 10:44:50,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 10:44:51,812][73497] Updated weights for policy 0, policy_version 226069 (0.0039) [2024-06-13 10:44:55,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45602.2, 300 sec: 45486.4). Total num frames: 3704061952. Throughput: 0: 45259.5. Samples: 222587780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-13 10:44:55,502][73265] Avg episode reward: [(0, '0.478')] [2024-06-13 10:44:55,510][73497] Updated weights for policy 0, policy_version 226079 (0.0039) [2024-06-13 10:44:58,996][73497] Updated weights for policy 0, policy_version 226089 (0.0031) [2024-06-13 10:45:00,501][73265] Fps is (10 sec: 44237.4, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 3704291328. Throughput: 0: 45272.0. Samples: 222860680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-13 10:45:00,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 10:45:02,523][73497] Updated weights for policy 0, policy_version 226099 (0.0033) [2024-06-13 10:45:05,502][73265] Fps is (10 sec: 45874.7, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3704520704. Throughput: 0: 45255.0. Samples: 222997440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-13 10:45:05,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 10:45:06,396][73497] Updated weights for policy 0, policy_version 226109 (0.0039) [2024-06-13 10:45:10,055][73497] Updated weights for policy 0, policy_version 226119 (0.0028) [2024-06-13 10:45:10,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3704750080. Throughput: 0: 45301.4. Samples: 223271440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-13 10:45:10,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:45:13,613][73497] Updated weights for policy 0, policy_version 226129 (0.0034) [2024-06-13 10:45:15,501][73265] Fps is (10 sec: 44237.5, 60 sec: 45056.1, 300 sec: 45430.9). Total num frames: 3704963072. Throughput: 0: 45662.4. Samples: 223546080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-13 10:45:15,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 10:45:17,123][73497] Updated weights for policy 0, policy_version 226139 (0.0026) [2024-06-13 10:45:20,501][73265] Fps is (10 sec: 44236.7, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 3705192448. Throughput: 0: 45260.9. Samples: 223675500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-13 10:45:20,502][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 10:45:20,935][73497] Updated weights for policy 0, policy_version 226149 (0.0043) [2024-06-13 10:45:24,711][73497] Updated weights for policy 0, policy_version 226159 (0.0035) [2024-06-13 10:45:25,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.2, 300 sec: 45430.9). Total num frames: 3705421824. Throughput: 0: 45253.4. Samples: 223944980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-13 10:45:25,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 10:45:28,115][73497] Updated weights for policy 0, policy_version 226169 (0.0040) [2024-06-13 10:45:30,502][73265] Fps is (10 sec: 45874.9, 60 sec: 45056.0, 300 sec: 45486.4). Total num frames: 3705651200. Throughput: 0: 45483.1. Samples: 224228440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-13 10:45:30,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 10:45:31,505][73497] Updated weights for policy 0, policy_version 226179 (0.0030) [2024-06-13 10:45:32,764][73477] Signal inference workers to stop experience collection... (3300 times) [2024-06-13 10:45:32,764][73477] Signal inference workers to resume experience collection... 
(3300 times) [2024-06-13 10:45:32,803][73497] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-06-13 10:45:32,804][73497] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-06-13 10:45:35,288][73497] Updated weights for policy 0, policy_version 226189 (0.0042) [2024-06-13 10:45:35,502][73265] Fps is (10 sec: 45874.2, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3705880576. Throughput: 0: 45211.1. Samples: 224358180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-13 10:45:35,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 10:45:38,371][73497] Updated weights for policy 0, policy_version 226199 (0.0046) [2024-06-13 10:45:40,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3706126336. Throughput: 0: 45595.5. Samples: 224639580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-13 10:45:40,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 10:45:42,587][73497] Updated weights for policy 0, policy_version 226209 (0.0042) [2024-06-13 10:45:45,501][73265] Fps is (10 sec: 44237.5, 60 sec: 45329.1, 300 sec: 45375.4). Total num frames: 3706322944. Throughput: 0: 45526.7. Samples: 224909380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-13 10:45:45,502][73265] Avg episode reward: [(0, '0.477')] [2024-06-13 10:45:46,227][73497] Updated weights for policy 0, policy_version 226219 (0.0040) [2024-06-13 10:45:49,882][73497] Updated weights for policy 0, policy_version 226229 (0.0031) [2024-06-13 10:45:50,501][73265] Fps is (10 sec: 44237.3, 60 sec: 45329.2, 300 sec: 45430.9). Total num frames: 3706568704. Throughput: 0: 45377.5. Samples: 225039420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-13 10:45:50,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 10:45:53,496][73497] Updated weights for policy 0, policy_version 226239 (0.0039) [2024-06-13 10:45:55,501][73265] Fps is (10 sec: 47513.0, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 3706798080. Throughput: 0: 45345.7. Samples: 225312000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-13 10:45:55,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 10:45:57,026][73497] Updated weights for policy 0, policy_version 226249 (0.0030) [2024-06-13 10:46:00,462][73497] Updated weights for policy 0, policy_version 226259 (0.0033) [2024-06-13 10:46:00,502][73265] Fps is (10 sec: 45874.2, 60 sec: 45602.0, 300 sec: 45486.4). Total num frames: 3707027456. Throughput: 0: 45488.7. Samples: 225593080. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-13 10:46:00,502][73265] Avg episode reward: [(0, '0.500')] [2024-06-13 10:46:04,394][73497] Updated weights for policy 0, policy_version 226269 (0.0034) [2024-06-13 10:46:05,501][73265] Fps is (10 sec: 42599.1, 60 sec: 45056.2, 300 sec: 45375.4). Total num frames: 3707224064. Throughput: 0: 45421.0. Samples: 225719440. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-13 10:46:05,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 10:46:07,483][73497] Updated weights for policy 0, policy_version 226279 (0.0043) [2024-06-13 10:46:10,501][73265] Fps is (10 sec: 44237.5, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3707469824. Throughput: 0: 45530.2. Samples: 225993840. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-13 10:46:10,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 10:46:11,724][73497] Updated weights for policy 0, policy_version 226289 (0.0034) [2024-06-13 10:46:15,443][73497] Updated weights for policy 0, policy_version 226299 (0.0033) [2024-06-13 10:46:15,501][73265] Fps is (10 sec: 45874.6, 60 sec: 45329.0, 300 sec: 45486.4). Total num frames: 3707682816. Throughput: 0: 45304.0. Samples: 226267120. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-13 10:46:15,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 10:46:18,853][73497] Updated weights for policy 0, policy_version 226309 (0.0032) [2024-06-13 10:46:20,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3707928576. Throughput: 0: 45230.3. Samples: 226393540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-13 10:46:20,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 10:46:22,710][73497] Updated weights for policy 0, policy_version 226319 (0.0044) [2024-06-13 10:46:25,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45328.9, 300 sec: 45375.3). Total num frames: 3708141568. Throughput: 0: 44974.6. Samples: 226663440. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-13 10:46:25,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 10:46:25,690][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000226329_3708174336.pth... [2024-06-13 10:46:25,696][73497] Updated weights for policy 0, policy_version 226329 (0.0025) [2024-06-13 10:46:25,744][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000225663_3697262592.pth [2024-06-13 10:46:29,355][73497] Updated weights for policy 0, policy_version 226339 (0.0026) [2024-06-13 10:46:30,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 3708387328. Throughput: 0: 45186.2. Samples: 226942760. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-13 10:46:30,502][73265] Avg episode reward: [(0, '0.406')] [2024-06-13 10:46:33,205][73497] Updated weights for policy 0, policy_version 226349 (0.0040) [2024-06-13 10:46:35,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 3708616704. Throughput: 0: 45487.4. Samples: 227086360. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-13 10:46:35,502][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 10:46:36,151][73497] Updated weights for policy 0, policy_version 226359 (0.0036) [2024-06-13 10:46:40,078][73497] Updated weights for policy 0, policy_version 226369 (0.0044) [2024-06-13 10:46:40,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3708846080. Throughput: 0: 45564.5. Samples: 227362400. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-13 10:46:40,502][73265] Avg episode reward: [(0, '0.500')] [2024-06-13 10:46:43,943][73477] Signal inference workers to stop experience collection... (3350 times) [2024-06-13 10:46:43,943][73477] Signal inference workers to resume experience collection... (3350 times) [2024-06-13 10:46:43,950][73497] Updated weights for policy 0, policy_version 226379 (0.0035) [2024-06-13 10:46:43,972][73497] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-06-13 10:46:43,972][73497] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-06-13 10:46:45,504][73265] Fps is (10 sec: 45864.0, 60 sec: 45873.3, 300 sec: 45430.5). Total num frames: 3709075456. Throughput: 0: 45298.1. Samples: 227631600. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-13 10:46:45,504][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 10:46:47,211][73497] Updated weights for policy 0, policy_version 226389 (0.0032) [2024-06-13 10:46:50,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45329.0, 300 sec: 45375.4). Total num frames: 3709288448. Throughput: 0: 45497.3. 
Samples: 227766820. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-13 10:46:50,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 10:46:50,810][73497] Updated weights for policy 0, policy_version 226399 (0.0030) [2024-06-13 10:46:54,324][73497] Updated weights for policy 0, policy_version 226409 (0.0028) [2024-06-13 10:46:55,504][73265] Fps is (10 sec: 45875.4, 60 sec: 45600.4, 300 sec: 45375.0). Total num frames: 3709534208. Throughput: 0: 45737.0. Samples: 228052120. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-13 10:46:55,504][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 10:46:57,567][73497] Updated weights for policy 0, policy_version 226419 (0.0021) [2024-06-13 10:47:00,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 3709763584. Throughput: 0: 45732.0. Samples: 228325060. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-13 10:47:00,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 10:47:01,398][73497] Updated weights for policy 0, policy_version 226429 (0.0029) [2024-06-13 10:47:05,012][73497] Updated weights for policy 0, policy_version 226439 (0.0026) [2024-06-13 10:47:05,501][73265] Fps is (10 sec: 47524.8, 60 sec: 46421.2, 300 sec: 45597.5). Total num frames: 3710009344. Throughput: 0: 46064.8. Samples: 228466460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 10:47:05,502][73265] Avg episode reward: [(0, '0.499')] [2024-06-13 10:47:08,382][73497] Updated weights for policy 0, policy_version 226449 (0.0040) [2024-06-13 10:47:10,502][73265] Fps is (10 sec: 44236.1, 60 sec: 45602.0, 300 sec: 45430.9). Total num frames: 3710205952. Throughput: 0: 45934.2. Samples: 228730480. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 10:47:10,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 10:47:12,393][73497] Updated weights for policy 0, policy_version 226459 (0.0039) [2024-06-13 10:47:15,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 3710435328. Throughput: 0: 45671.5. Samples: 228997980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 10:47:15,504][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 10:47:15,683][73497] Updated weights for policy 0, policy_version 226469 (0.0035) [2024-06-13 10:47:19,411][73497] Updated weights for policy 0, policy_version 226479 (0.0027) [2024-06-13 10:47:20,501][73265] Fps is (10 sec: 44237.4, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3710648320. Throughput: 0: 45466.2. Samples: 229132340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 10:47:20,510][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 10:47:23,271][73497] Updated weights for policy 0, policy_version 226489 (0.0034) [2024-06-13 10:47:25,502][73265] Fps is (10 sec: 44236.4, 60 sec: 45602.1, 300 sec: 45375.3). Total num frames: 3710877696. Throughput: 0: 45493.2. Samples: 229409600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 10:47:25,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 10:47:26,306][73497] Updated weights for policy 0, policy_version 226499 (0.0029) [2024-06-13 10:47:30,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 45486.5). Total num frames: 3711107072. Throughput: 0: 45530.5. Samples: 229680360. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 10:47:30,502][73265] Avg episode reward: [(0, '0.479')] [2024-06-13 10:47:30,547][73497] Updated weights for policy 0, policy_version 226509 (0.0045) [2024-06-13 10:47:33,762][73497] Updated weights for policy 0, policy_version 226519 (0.0027) [2024-06-13 10:47:35,501][73265] Fps is (10 sec: 47514.5, 60 sec: 45602.2, 300 sec: 45486.4). Total num frames: 3711352832. Throughput: 0: 45591.1. Samples: 229818420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 10:47:35,502][73265] Avg episode reward: [(0, '0.479')] [2024-06-13 10:47:37,454][73497] Updated weights for policy 0, policy_version 226529 (0.0036) [2024-06-13 10:47:40,502][73265] Fps is (10 sec: 44236.2, 60 sec: 45055.9, 300 sec: 45375.3). Total num frames: 3711549440. Throughput: 0: 45121.9. Samples: 230082500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 10:47:40,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 10:47:41,091][73497] Updated weights for policy 0, policy_version 226539 (0.0035) [2024-06-13 10:47:44,696][73497] Updated weights for policy 0, policy_version 226549 (0.0027) [2024-06-13 10:47:45,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45331.0, 300 sec: 45486.4). Total num frames: 3711795200. Throughput: 0: 45028.0. Samples: 230351320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 10:47:45,502][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 10:47:48,105][73497] Updated weights for policy 0, policy_version 226559 (0.0048) [2024-06-13 10:47:50,501][73265] Fps is (10 sec: 45876.0, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3712008192. Throughput: 0: 44967.2. Samples: 230489980. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 10:47:50,502][73265] Avg episode reward: [(0, '0.393')] [2024-06-13 10:47:51,960][73497] Updated weights for policy 0, policy_version 226569 (0.0028) [2024-06-13 10:47:53,964][73477] Signal inference workers to stop experience collection... (3400 times) [2024-06-13 10:47:53,964][73477] Signal inference workers to resume experience collection... (3400 times) [2024-06-13 10:47:53,978][73497] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-06-13 10:47:53,978][73497] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-06-13 10:47:55,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45330.9, 300 sec: 45486.4). Total num frames: 3712253952. Throughput: 0: 45318.3. Samples: 230769800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 10:47:55,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 10:47:55,527][73497] Updated weights for policy 0, policy_version 226579 (0.0028) [2024-06-13 10:47:59,274][73497] Updated weights for policy 0, policy_version 226589 (0.0028) [2024-06-13 10:48:00,502][73265] Fps is (10 sec: 45874.4, 60 sec: 45055.9, 300 sec: 45486.4). Total num frames: 3712466944. Throughput: 0: 45226.6. Samples: 231033180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 10:48:00,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 10:48:02,630][73497] Updated weights for policy 0, policy_version 226599 (0.0032) [2024-06-13 10:48:05,501][73265] Fps is (10 sec: 42598.5, 60 sec: 44509.9, 300 sec: 45375.4). Total num frames: 3712679936. Throughput: 0: 45117.8. Samples: 231162640. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 10:48:05,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 10:48:06,375][73497] Updated weights for policy 0, policy_version 226609 (0.0025) [2024-06-13 10:48:10,198][73497] Updated weights for policy 0, policy_version 226619 (0.0031) [2024-06-13 10:48:10,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3712925696. Throughput: 0: 45189.0. Samples: 231443100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-13 10:48:10,502][73265] Avg episode reward: [(0, '0.427')] [2024-06-13 10:48:13,701][73497] Updated weights for policy 0, policy_version 226629 (0.0039) [2024-06-13 10:48:15,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45056.0, 300 sec: 45375.3). Total num frames: 3713138688. Throughput: 0: 45220.9. Samples: 231715300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 10:48:15,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 10:48:17,038][73497] Updated weights for policy 0, policy_version 226639 (0.0038) [2024-06-13 10:48:20,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45602.2, 300 sec: 45431.3). Total num frames: 3713384448. Throughput: 0: 45125.3. Samples: 231849060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 10:48:20,502][73265] Avg episode reward: [(0, '0.478')] [2024-06-13 10:48:21,031][73497] Updated weights for policy 0, policy_version 226649 (0.0038) [2024-06-13 10:48:24,350][73497] Updated weights for policy 0, policy_version 226659 (0.0028) [2024-06-13 10:48:25,504][73265] Fps is (10 sec: 45863.7, 60 sec: 45327.3, 300 sec: 45375.0). Total num frames: 3713597440. Throughput: 0: 45313.6. Samples: 232121720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 10:48:25,504][73265] Avg episode reward: [(0, '0.411')] [2024-06-13 10:48:25,512][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000226660_3713597440.pth... 
[2024-06-13 10:48:25,557][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000225995_3702702080.pth [2024-06-13 10:48:27,972][73497] Updated weights for policy 0, policy_version 226669 (0.0039) [2024-06-13 10:48:30,504][73265] Fps is (10 sec: 44225.8, 60 sec: 45327.2, 300 sec: 45375.0). Total num frames: 3713826816. Throughput: 0: 45500.6. Samples: 232398960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 10:48:30,504][73265] Avg episode reward: [(0, '0.477')] [2024-06-13 10:48:31,998][73497] Updated weights for policy 0, policy_version 226679 (0.0035) [2024-06-13 10:48:35,091][73497] Updated weights for policy 0, policy_version 226689 (0.0038) [2024-06-13 10:48:35,502][73265] Fps is (10 sec: 47524.8, 60 sec: 45328.9, 300 sec: 45431.2). Total num frames: 3714072576. Throughput: 0: 45505.6. Samples: 232537740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 10:48:35,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:48:39,032][73497] Updated weights for policy 0, policy_version 226699 (0.0048) [2024-06-13 10:48:40,501][73265] Fps is (10 sec: 44247.6, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 3714269184. Throughput: 0: 45105.3. Samples: 232799540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 10:48:40,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 10:48:42,657][73497] Updated weights for policy 0, policy_version 226709 (0.0039) [2024-06-13 10:48:45,501][73265] Fps is (10 sec: 44237.7, 60 sec: 45329.1, 300 sec: 45375.4). Total num frames: 3714514944. Throughput: 0: 45360.2. Samples: 233074380. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 10:48:45,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 10:48:46,319][73497] Updated weights for policy 0, policy_version 226719 (0.0032) [2024-06-13 10:48:50,239][73497] Updated weights for policy 0, policy_version 226729 (0.0033) [2024-06-13 10:48:50,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45602.0, 300 sec: 45486.4). Total num frames: 3714744320. Throughput: 0: 45627.0. Samples: 233215860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 10:48:50,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 10:48:53,476][73497] Updated weights for policy 0, policy_version 226739 (0.0033) [2024-06-13 10:48:55,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45056.1, 300 sec: 45430.9). Total num frames: 3714957312. Throughput: 0: 45452.6. Samples: 233488460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 10:48:55,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 10:48:57,096][73497] Updated weights for policy 0, policy_version 226749 (0.0031) [2024-06-13 10:49:00,501][73265] Fps is (10 sec: 44237.5, 60 sec: 45329.2, 300 sec: 45375.4). Total num frames: 3715186688. Throughput: 0: 45417.4. Samples: 233759080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 10:49:00,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 10:49:00,867][73497] Updated weights for policy 0, policy_version 226759 (0.0025) [2024-06-13 10:49:04,192][73497] Updated weights for policy 0, policy_version 226769 (0.0036) [2024-06-13 10:49:05,501][73265] Fps is (10 sec: 45874.8, 60 sec: 45602.1, 300 sec: 45375.3). Total num frames: 3715416064. Throughput: 0: 45439.1. Samples: 233893820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 10:49:05,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 10:49:08,114][73497] Updated weights for policy 0, policy_version 226779 (0.0032) [2024-06-13 10:49:10,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45602.3, 300 sec: 45430.9). Total num frames: 3715661824. Throughput: 0: 45663.6. Samples: 234176460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 10:49:10,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 10:49:11,674][73497] Updated weights for policy 0, policy_version 226789 (0.0034) [2024-06-13 10:49:12,408][73477] Signal inference workers to stop experience collection... (3450 times) [2024-06-13 10:49:12,457][73477] Signal inference workers to resume experience collection... (3450 times) [2024-06-13 10:49:12,458][73497] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-06-13 10:49:12,476][73497] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-06-13 10:49:14,946][73497] Updated weights for policy 0, policy_version 226799 (0.0029) [2024-06-13 10:49:15,501][73265] Fps is (10 sec: 47513.7, 60 sec: 45875.3, 300 sec: 45375.4). Total num frames: 3715891200. Throughput: 0: 45649.7. Samples: 234453080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 10:49:15,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 10:49:19,031][73497] Updated weights for policy 0, policy_version 226809 (0.0029) [2024-06-13 10:49:20,502][73265] Fps is (10 sec: 47512.7, 60 sec: 45875.1, 300 sec: 45597.5). Total num frames: 3716136960. Throughput: 0: 45703.6. Samples: 234594400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-13 10:49:20,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:49:22,201][73497] Updated weights for policy 0, policy_version 226819 (0.0038) [2024-06-13 10:49:25,502][73265] Fps is (10 sec: 44235.9, 60 sec: 45603.9, 300 sec: 45375.3). Total num frames: 3716333568. Throughput: 0: 45886.5. 
Samples: 234864440. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-13 10:49:25,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 10:49:25,963][73497] Updated weights for policy 0, policy_version 226829 (0.0028) [2024-06-13 10:49:29,649][73497] Updated weights for policy 0, policy_version 226839 (0.0026) [2024-06-13 10:49:30,501][73265] Fps is (10 sec: 44237.2, 60 sec: 45877.1, 300 sec: 45486.4). Total num frames: 3716579328. Throughput: 0: 45843.1. Samples: 235137320. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-13 10:49:30,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 10:49:33,327][73497] Updated weights for policy 0, policy_version 226849 (0.0039) [2024-06-13 10:49:35,501][73265] Fps is (10 sec: 45876.1, 60 sec: 45329.2, 300 sec: 45486.4). Total num frames: 3716792320. Throughput: 0: 45696.1. Samples: 235272180. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-13 10:49:35,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 10:49:36,867][73497] Updated weights for policy 0, policy_version 226859 (0.0030) [2024-06-13 10:49:40,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45875.2, 300 sec: 45486.4). Total num frames: 3717021696. Throughput: 0: 45715.9. Samples: 235545680. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-13 10:49:40,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 10:49:40,503][73497] Updated weights for policy 0, policy_version 226869 (0.0033) [2024-06-13 10:49:43,851][73497] Updated weights for policy 0, policy_version 226879 (0.0033) [2024-06-13 10:49:45,502][73265] Fps is (10 sec: 47512.7, 60 sec: 45875.1, 300 sec: 45486.4). Total num frames: 3717267456. Throughput: 0: 45706.0. Samples: 235815860. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-13 10:49:45,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 10:49:47,506][73497] Updated weights for policy 0, policy_version 226889 (0.0039) [2024-06-13 10:49:50,502][73265] Fps is (10 sec: 44236.4, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3717464064. Throughput: 0: 45884.3. Samples: 235958620. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-13 10:49:50,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 10:49:51,145][73497] Updated weights for policy 0, policy_version 226899 (0.0036) [2024-06-13 10:49:54,960][73497] Updated weights for policy 0, policy_version 226909 (0.0039) [2024-06-13 10:49:55,501][73265] Fps is (10 sec: 44237.3, 60 sec: 45875.1, 300 sec: 45486.4). Total num frames: 3717709824. Throughput: 0: 45736.3. Samples: 236234600. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-13 10:49:55,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 10:49:58,449][73497] Updated weights for policy 0, policy_version 226919 (0.0040) [2024-06-13 10:50:00,501][73265] Fps is (10 sec: 49152.0, 60 sec: 46148.2, 300 sec: 45542.0). Total num frames: 3717955584. Throughput: 0: 45397.6. Samples: 236495980. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-13 10:50:00,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 10:50:02,150][73497] Updated weights for policy 0, policy_version 226929 (0.0038) [2024-06-13 10:50:05,374][73497] Updated weights for policy 0, policy_version 226939 (0.0029) [2024-06-13 10:50:05,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45875.2, 300 sec: 45486.4). Total num frames: 3718168576. Throughput: 0: 45369.0. Samples: 236636000. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-13 10:50:05,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:50:09,057][73497] Updated weights for policy 0, policy_version 226949 (0.0032) [2024-06-13 10:50:10,502][73265] Fps is (10 sec: 42598.4, 60 sec: 45328.9, 300 sec: 45486.4). 
Total num frames: 3718381568. Throughput: 0: 45737.8. Samples: 236922640. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-13 10:50:10,502][73265] Avg episode reward: [(0, '0.477')] [2024-06-13 10:50:12,795][73497] Updated weights for policy 0, policy_version 226959 (0.0037) [2024-06-13 10:50:15,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.0, 300 sec: 45486.4). Total num frames: 3718610944. Throughput: 0: 45624.5. Samples: 237190420. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-13 10:50:15,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:50:16,034][73497] Updated weights for policy 0, policy_version 226969 (0.0040) [2024-06-13 10:50:19,890][73497] Updated weights for policy 0, policy_version 226979 (0.0036) [2024-06-13 10:50:20,139][73477] Signal inference workers to stop experience collection... (3500 times) [2024-06-13 10:50:20,139][73477] Signal inference workers to resume experience collection... (3500 times) [2024-06-13 10:50:20,161][73497] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-06-13 10:50:20,162][73497] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-06-13 10:50:20,501][73265] Fps is (10 sec: 49152.3, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 3718873088. Throughput: 0: 45755.9. Samples: 237331200. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-13 10:50:20,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 10:50:23,527][73497] Updated weights for policy 0, policy_version 226989 (0.0029) [2024-06-13 10:50:25,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45602.2, 300 sec: 45486.4). Total num frames: 3719069696. Throughput: 0: 45747.5. Samples: 237604320. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-13 10:50:25,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:50:25,526][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000226995_3719086080.pth... 
[2024-06-13 10:50:25,568][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000226329_3708174336.pth [2024-06-13 10:50:26,943][73497] Updated weights for policy 0, policy_version 226999 (0.0023) [2024-06-13 10:50:30,501][73265] Fps is (10 sec: 40960.2, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 3719282688. Throughput: 0: 45714.4. Samples: 237873000. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-13 10:50:30,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 10:50:30,682][73497] Updated weights for policy 0, policy_version 227009 (0.0042) [2024-06-13 10:50:33,978][73497] Updated weights for policy 0, policy_version 227019 (0.0034) [2024-06-13 10:50:35,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45875.2, 300 sec: 45486.4). Total num frames: 3719544832. Throughput: 0: 45822.8. Samples: 238020640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:50:35,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 10:50:37,531][73497] Updated weights for policy 0, policy_version 227029 (0.0030) [2024-06-13 10:50:40,501][73265] Fps is (10 sec: 47513.4, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3719757824. Throughput: 0: 45759.6. Samples: 238293780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:50:40,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 10:50:41,287][73497] Updated weights for policy 0, policy_version 227039 (0.0026) [2024-06-13 10:50:44,403][73497] Updated weights for policy 0, policy_version 227049 (0.0035) [2024-06-13 10:50:45,502][73265] Fps is (10 sec: 45874.5, 60 sec: 45602.2, 300 sec: 45541.9). Total num frames: 3720003584. Throughput: 0: 46041.3. Samples: 238567840. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:50:45,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 10:50:48,395][73497] Updated weights for policy 0, policy_version 227059 (0.0034) [2024-06-13 10:50:50,501][73265] Fps is (10 sec: 47513.7, 60 sec: 46148.3, 300 sec: 45542.0). Total num frames: 3720232960. Throughput: 0: 46203.1. Samples: 238715140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:50:50,502][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 10:50:51,909][73497] Updated weights for policy 0, policy_version 227069 (0.0027) [2024-06-13 10:50:55,244][73497] Updated weights for policy 0, policy_version 227079 (0.0035) [2024-06-13 10:50:55,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3720462336. Throughput: 0: 45698.2. Samples: 238979060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:50:55,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 10:50:58,982][73497] Updated weights for policy 0, policy_version 227089 (0.0031) [2024-06-13 10:51:00,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3720691712. Throughput: 0: 45859.0. Samples: 239254080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:51:00,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 10:51:02,591][73497] Updated weights for policy 0, policy_version 227099 (0.0034) [2024-06-13 10:51:05,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 3720921088. Throughput: 0: 45764.9. Samples: 239390620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:51:05,502][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 10:51:05,858][73497] Updated weights for policy 0, policy_version 227109 (0.0025) [2024-06-13 10:51:09,903][73497] Updated weights for policy 0, policy_version 227119 (0.0039) [2024-06-13 10:51:10,501][73265] Fps is (10 sec: 45875.5, 60 sec: 46148.3, 300 sec: 45653.0). 
Total num frames: 3721150464. Throughput: 0: 45981.4. Samples: 239673480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:51:10,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 10:51:12,765][73497] Updated weights for policy 0, policy_version 227129 (0.0035) [2024-06-13 10:51:15,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3721363456. Throughput: 0: 45891.9. Samples: 239938140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:51:15,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 10:51:17,239][73497] Updated weights for policy 0, policy_version 227139 (0.0029) [2024-06-13 10:51:20,507][73265] Fps is (10 sec: 44211.8, 60 sec: 45324.8, 300 sec: 45596.6). Total num frames: 3721592832. Throughput: 0: 45576.0. Samples: 240071820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:51:20,508][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 10:51:20,759][73497] Updated weights for policy 0, policy_version 227149 (0.0043) [2024-06-13 10:51:24,334][73497] Updated weights for policy 0, policy_version 227159 (0.0038) [2024-06-13 10:51:25,502][73265] Fps is (10 sec: 45874.9, 60 sec: 45875.1, 300 sec: 45542.0). Total num frames: 3721822208. Throughput: 0: 45363.5. Samples: 240335140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:51:25,502][73265] Avg episode reward: [(0, '0.427')] [2024-06-13 10:51:26,048][73477] Signal inference workers to stop experience collection... (3550 times) [2024-06-13 10:51:26,049][73477] Signal inference workers to resume experience collection... 
(3550 times) [2024-06-13 10:51:26,071][73497] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-06-13 10:51:26,072][73497] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-06-13 10:51:27,846][73497] Updated weights for policy 0, policy_version 227169 (0.0040) [2024-06-13 10:51:30,503][73265] Fps is (10 sec: 44254.3, 60 sec: 45873.9, 300 sec: 45486.2). Total num frames: 3722035200. Throughput: 0: 45258.9. Samples: 240604560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:51:30,504][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 10:51:31,321][73497] Updated weights for policy 0, policy_version 227179 (0.0031) [2024-06-13 10:51:34,809][73497] Updated weights for policy 0, policy_version 227189 (0.0024) [2024-06-13 10:51:35,501][73265] Fps is (10 sec: 45876.0, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 3722280960. Throughput: 0: 45295.6. Samples: 240753440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 10:51:35,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 10:51:38,910][73497] Updated weights for policy 0, policy_version 227199 (0.0034) [2024-06-13 10:51:40,501][73265] Fps is (10 sec: 44244.3, 60 sec: 45329.1, 300 sec: 45431.3). Total num frames: 3722477568. Throughput: 0: 45289.9. Samples: 241017100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:51:40,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 10:51:41,708][73497] Updated weights for policy 0, policy_version 227209 (0.0029) [2024-06-13 10:51:45,501][73265] Fps is (10 sec: 44236.3, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3722723328. Throughput: 0: 45248.1. Samples: 241290240. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:51:45,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 10:51:45,889][73497] Updated weights for policy 0, policy_version 227219 (0.0031) [2024-06-13 10:51:49,494][73497] Updated weights for policy 0, policy_version 227229 (0.0042) [2024-06-13 10:51:50,502][73265] Fps is (10 sec: 45874.7, 60 sec: 45055.9, 300 sec: 45431.2). Total num frames: 3722936320. Throughput: 0: 45139.9. Samples: 241421920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:51:50,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 10:51:52,901][73497] Updated weights for policy 0, policy_version 227239 (0.0033) [2024-06-13 10:51:55,502][73265] Fps is (10 sec: 45874.6, 60 sec: 45329.0, 300 sec: 45486.4). Total num frames: 3723182080. Throughput: 0: 44845.2. Samples: 241691520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:51:55,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 10:51:56,756][73497] Updated weights for policy 0, policy_version 227249 (0.0039) [2024-06-13 10:52:00,209][73497] Updated weights for policy 0, policy_version 227259 (0.0038) [2024-06-13 10:52:00,501][73265] Fps is (10 sec: 47514.3, 60 sec: 45329.2, 300 sec: 45430.9). Total num frames: 3723411456. Throughput: 0: 45026.3. Samples: 241964320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:52:00,502][73265] Avg episode reward: [(0, '0.485')] [2024-06-13 10:52:03,554][73497] Updated weights for policy 0, policy_version 227269 (0.0035) [2024-06-13 10:52:05,501][73265] Fps is (10 sec: 45875.9, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3723640832. Throughput: 0: 45245.7. Samples: 242107620. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:52:05,502][73265] Avg episode reward: [(0, '0.480')] [2024-06-13 10:52:07,713][73497] Updated weights for policy 0, policy_version 227279 (0.0034) [2024-06-13 10:52:10,441][73497] Updated weights for policy 0, policy_version 227289 (0.0033) [2024-06-13 10:52:10,501][73265] Fps is (10 sec: 49152.0, 60 sec: 45875.2, 300 sec: 45653.1). Total num frames: 3723902976. Throughput: 0: 45610.4. Samples: 242387600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:52:10,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 10:52:14,538][73497] Updated weights for policy 0, policy_version 227299 (0.0031) [2024-06-13 10:52:15,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3724083200. Throughput: 0: 45528.4. Samples: 242653260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:52:15,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 10:52:18,095][73497] Updated weights for policy 0, policy_version 227309 (0.0030) [2024-06-13 10:52:20,501][73265] Fps is (10 sec: 40959.8, 60 sec: 45333.3, 300 sec: 45542.0). Total num frames: 3724312576. Throughput: 0: 45133.7. Samples: 242784460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:52:20,502][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 10:52:21,591][73497] Updated weights for policy 0, policy_version 227319 (0.0031) [2024-06-13 10:52:25,341][73497] Updated weights for policy 0, policy_version 227329 (0.0042) [2024-06-13 10:52:25,501][73265] Fps is (10 sec: 47514.2, 60 sec: 45602.3, 300 sec: 45597.5). Total num frames: 3724558336. Throughput: 0: 45513.0. Samples: 243065180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:52:25,502][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 10:52:25,526][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000227329_3724558336.pth... 
[2024-06-13 10:52:25,585][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000226660_3713597440.pth [2024-06-13 10:52:28,455][73497] Updated weights for policy 0, policy_version 227339 (0.0031) [2024-06-13 10:52:30,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45603.5, 300 sec: 45486.4). Total num frames: 3724771328. Throughput: 0: 45724.1. Samples: 243347820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:52:30,512][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 10:52:32,250][73497] Updated weights for policy 0, policy_version 227349 (0.0025) [2024-06-13 10:52:33,833][73477] Signal inference workers to stop experience collection... (3600 times) [2024-06-13 10:52:33,833][73477] Signal inference workers to resume experience collection... (3600 times) [2024-06-13 10:52:33,854][73497] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-06-13 10:52:33,854][73497] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-06-13 10:52:35,501][73265] Fps is (10 sec: 44236.3, 60 sec: 45329.0, 300 sec: 45597.5). Total num frames: 3725000704. Throughput: 0: 45620.1. Samples: 243474820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:52:35,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:52:36,251][73497] Updated weights for policy 0, policy_version 227359 (0.0032) [2024-06-13 10:52:39,235][73497] Updated weights for policy 0, policy_version 227369 (0.0038) [2024-06-13 10:52:40,502][73265] Fps is (10 sec: 47512.8, 60 sec: 46148.2, 300 sec: 45597.5). Total num frames: 3725246464. Throughput: 0: 45623.1. Samples: 243744560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:52:40,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 10:52:43,247][73497] Updated weights for policy 0, policy_version 227379 (0.0032) [2024-06-13 10:52:45,501][73265] Fps is (10 sec: 42598.7, 60 sec: 45056.1, 300 sec: 45486.4). Total num frames: 3725426688. 
Throughput: 0: 45668.5. Samples: 244019400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-13 10:52:45,502][73265] Avg episode reward: [(0, '0.515')] [2024-06-13 10:52:47,311][73497] Updated weights for policy 0, policy_version 227389 (0.0038) [2024-06-13 10:52:50,271][73497] Updated weights for policy 0, policy_version 227399 (0.0038) [2024-06-13 10:52:50,502][73265] Fps is (10 sec: 45875.1, 60 sec: 46148.2, 300 sec: 45597.5). Total num frames: 3725705216. Throughput: 0: 45389.2. Samples: 244150140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 10:52:50,502][73265] Avg episode reward: [(0, '0.490')] [2024-06-13 10:52:54,389][73497] Updated weights for policy 0, policy_version 227409 (0.0027) [2024-06-13 10:52:55,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45329.2, 300 sec: 45542.0). Total num frames: 3725901824. Throughput: 0: 45330.6. Samples: 244427480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 10:52:55,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 10:52:57,446][73497] Updated weights for policy 0, policy_version 227419 (0.0025) [2024-06-13 10:53:00,501][73265] Fps is (10 sec: 44237.2, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3726147584. Throughput: 0: 45474.2. Samples: 244699600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 10:53:00,502][73265] Avg episode reward: [(0, '0.479')] [2024-06-13 10:53:01,386][73497] Updated weights for policy 0, policy_version 227429 (0.0045) [2024-06-13 10:53:05,086][73497] Updated weights for policy 0, policy_version 227439 (0.0025) [2024-06-13 10:53:05,502][73265] Fps is (10 sec: 45874.6, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3726360576. Throughput: 0: 45514.5. Samples: 244832620. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 10:53:05,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 10:53:08,287][73497] Updated weights for policy 0, policy_version 227449 (0.0028) [2024-06-13 10:53:10,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45056.0, 300 sec: 45653.0). Total num frames: 3726606336. Throughput: 0: 45419.0. Samples: 245109040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 10:53:10,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 10:53:12,092][73497] Updated weights for policy 0, policy_version 227459 (0.0037) [2024-06-13 10:53:15,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 3726835712. Throughput: 0: 45151.0. Samples: 245379620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 10:53:15,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 10:53:15,899][73497] Updated weights for policy 0, policy_version 227469 (0.0042) [2024-06-13 10:53:19,246][73497] Updated weights for policy 0, policy_version 227479 (0.0038) [2024-06-13 10:53:20,501][73265] Fps is (10 sec: 40960.2, 60 sec: 45056.0, 300 sec: 45486.8). Total num frames: 3727015936. Throughput: 0: 45346.3. Samples: 245515400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 10:53:20,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:53:23,420][73497] Updated weights for policy 0, policy_version 227489 (0.0050) [2024-06-13 10:53:25,502][73265] Fps is (10 sec: 44236.6, 60 sec: 45328.9, 300 sec: 45597.9). Total num frames: 3727278080. Throughput: 0: 45479.6. Samples: 245791140. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 10:53:25,502][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 10:53:26,377][73497] Updated weights for policy 0, policy_version 227499 (0.0040) [2024-06-13 10:53:30,394][73497] Updated weights for policy 0, policy_version 227509 (0.0031) [2024-06-13 10:53:30,501][73265] Fps is (10 sec: 49151.9, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3727507456. Throughput: 0: 45277.7. Samples: 246056900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 10:53:30,502][73265] Avg episode reward: [(0, '0.517')] [2024-06-13 10:53:34,029][73497] Updated weights for policy 0, policy_version 227519 (0.0049) [2024-06-13 10:53:35,347][73477] Signal inference workers to stop experience collection... (3650 times) [2024-06-13 10:53:35,347][73477] Signal inference workers to resume experience collection... (3650 times) [2024-06-13 10:53:35,389][73497] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-06-13 10:53:35,389][73497] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-06-13 10:53:35,501][73265] Fps is (10 sec: 44237.3, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3727720448. Throughput: 0: 45541.5. Samples: 246199500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 10:53:35,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 10:53:37,567][73497] Updated weights for policy 0, policy_version 227529 (0.0039) [2024-06-13 10:53:40,502][73265] Fps is (10 sec: 44236.2, 60 sec: 45056.0, 300 sec: 45541.9). Total num frames: 3727949824. Throughput: 0: 45284.8. Samples: 246465300. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 10:53:40,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 10:53:41,105][73497] Updated weights for policy 0, policy_version 227539 (0.0041) [2024-06-13 10:53:45,257][73497] Updated weights for policy 0, policy_version 227549 (0.0039) [2024-06-13 10:53:45,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45875.1, 300 sec: 45542.0). Total num frames: 3728179200. Throughput: 0: 45354.7. Samples: 246740560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 10:53:45,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 10:53:48,463][73497] Updated weights for policy 0, policy_version 227559 (0.0028) [2024-06-13 10:53:50,501][73265] Fps is (10 sec: 42599.0, 60 sec: 44510.0, 300 sec: 45486.4). Total num frames: 3728375808. Throughput: 0: 45357.9. Samples: 246873720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 10:53:50,508][73265] Avg episode reward: [(0, '0.478')] [2024-06-13 10:53:52,390][73497] Updated weights for policy 0, policy_version 227569 (0.0031) [2024-06-13 10:53:55,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3728621568. Throughput: 0: 45261.8. Samples: 247145820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-13 10:53:55,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 10:53:55,719][73497] Updated weights for policy 0, policy_version 227579 (0.0040) [2024-06-13 10:53:59,325][73497] Updated weights for policy 0, policy_version 227589 (0.0033) [2024-06-13 10:54:00,501][73265] Fps is (10 sec: 49151.7, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3728867328. Throughput: 0: 45317.4. Samples: 247418900. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:54:00,502][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 10:54:03,251][73497] Updated weights for policy 0, policy_version 227599 (0.0045) [2024-06-13 10:54:05,502][73265] Fps is (10 sec: 44236.0, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 3729063936. Throughput: 0: 45313.1. Samples: 247554500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:54:05,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 10:54:06,706][73497] Updated weights for policy 0, policy_version 227609 (0.0039) [2024-06-13 10:54:10,378][73497] Updated weights for policy 0, policy_version 227619 (0.0026) [2024-06-13 10:54:10,502][73265] Fps is (10 sec: 44236.1, 60 sec: 45055.9, 300 sec: 45486.4). Total num frames: 3729309696. Throughput: 0: 45183.5. Samples: 247824400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:54:10,502][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 10:54:13,835][73497] Updated weights for policy 0, policy_version 227629 (0.0030) [2024-06-13 10:54:15,501][73265] Fps is (10 sec: 50791.0, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 3729571840. Throughput: 0: 45475.1. Samples: 248103280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:54:15,502][73265] Avg episode reward: [(0, '0.480')] [2024-06-13 10:54:17,343][73497] Updated weights for policy 0, policy_version 227639 (0.0035) [2024-06-13 10:54:20,501][73265] Fps is (10 sec: 47514.5, 60 sec: 46148.2, 300 sec: 45597.5). Total num frames: 3729784832. Throughput: 0: 45484.5. Samples: 248246300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:54:20,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 10:54:20,627][73497] Updated weights for policy 0, policy_version 227649 (0.0025) [2024-06-13 10:54:24,587][73497] Updated weights for policy 0, policy_version 227659 (0.0024) [2024-06-13 10:54:25,502][73265] Fps is (10 sec: 44236.3, 60 sec: 45602.1, 300 sec: 45541.9). 
Total num frames: 3730014208. Throughput: 0: 45724.9. Samples: 248522920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:54:25,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:54:25,517][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000227662_3730014208.pth... [2024-06-13 10:54:25,568][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000226995_3719086080.pth [2024-06-13 10:54:28,213][73497] Updated weights for policy 0, policy_version 227669 (0.0023) [2024-06-13 10:54:30,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3730227200. Throughput: 0: 45410.7. Samples: 248784040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:54:30,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 10:54:31,938][73497] Updated weights for policy 0, policy_version 227679 (0.0031) [2024-06-13 10:54:35,284][73497] Updated weights for policy 0, policy_version 227689 (0.0032) [2024-06-13 10:54:35,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45602.1, 300 sec: 45541.9). Total num frames: 3730456576. Throughput: 0: 45690.5. Samples: 248929800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:54:35,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 10:54:39,142][73497] Updated weights for policy 0, policy_version 227699 (0.0043) [2024-06-13 10:54:40,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3730669568. Throughput: 0: 45691.0. Samples: 249201920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:54:40,505][73265] Avg episode reward: [(0, '0.519')] [2024-06-13 10:54:42,542][73497] Updated weights for policy 0, policy_version 227709 (0.0036) [2024-06-13 10:54:45,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3730898944. Throughput: 0: 45609.8. Samples: 249471340. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:54:45,502][73265] Avg episode reward: [(0, '0.502')] [2024-06-13 10:54:46,345][73497] Updated weights for policy 0, policy_version 227719 (0.0035) [2024-06-13 10:54:49,682][73497] Updated weights for policy 0, policy_version 227729 (0.0019) [2024-06-13 10:54:50,501][73265] Fps is (10 sec: 49152.2, 60 sec: 46421.3, 300 sec: 45597.5). Total num frames: 3731161088. Throughput: 0: 45775.7. Samples: 249614400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:54:50,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 10:54:53,452][73497] Updated weights for policy 0, policy_version 227739 (0.0038) [2024-06-13 10:54:55,286][73477] Signal inference workers to stop experience collection... (3700 times) [2024-06-13 10:54:55,287][73477] Signal inference workers to resume experience collection... (3700 times) [2024-06-13 10:54:55,325][73497] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-06-13 10:54:55,325][73497] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-06-13 10:54:55,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45602.0, 300 sec: 45430.9). Total num frames: 3731357696. Throughput: 0: 45777.9. Samples: 249884400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:54:55,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 10:54:57,064][73497] Updated weights for policy 0, policy_version 227749 (0.0037) [2024-06-13 10:55:00,501][73265] Fps is (10 sec: 42598.8, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3731587072. Throughput: 0: 45726.7. Samples: 250160980. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:55:00,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 10:55:00,600][73497] Updated weights for policy 0, policy_version 227759 (0.0043) [2024-06-13 10:55:03,998][73497] Updated weights for policy 0, policy_version 227769 (0.0024) [2024-06-13 10:55:05,501][73265] Fps is (10 sec: 47513.9, 60 sec: 46148.3, 300 sec: 45597.5). Total num frames: 3731832832. Throughput: 0: 45585.3. Samples: 250297640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 10:55:05,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 10:55:07,926][73497] Updated weights for policy 0, policy_version 227779 (0.0041) [2024-06-13 10:55:10,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45329.2, 300 sec: 45486.4). Total num frames: 3732029440. Throughput: 0: 45532.1. Samples: 250571860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 10:55:10,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 10:55:11,017][73497] Updated weights for policy 0, policy_version 227789 (0.0031) [2024-06-13 10:55:14,764][73497] Updated weights for policy 0, policy_version 227799 (0.0038) [2024-06-13 10:55:15,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 3732275200. Throughput: 0: 45847.5. Samples: 250847180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 10:55:15,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 10:55:18,193][73497] Updated weights for policy 0, policy_version 227809 (0.0042) [2024-06-13 10:55:20,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3732504576. Throughput: 0: 45684.6. Samples: 250985600. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 10:55:20,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 10:55:21,856][73497] Updated weights for policy 0, policy_version 227819 (0.0030) [2024-06-13 10:55:25,463][73497] Updated weights for policy 0, policy_version 227829 (0.0036) [2024-06-13 10:55:25,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45602.2, 300 sec: 45653.0). Total num frames: 3732750336. Throughput: 0: 45676.6. Samples: 251257360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 10:55:25,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:55:29,185][73497] Updated weights for policy 0, policy_version 227839 (0.0037) [2024-06-13 10:55:30,501][73265] Fps is (10 sec: 45874.6, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3732963328. Throughput: 0: 45783.1. Samples: 251531580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 10:55:30,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 10:55:32,803][73497] Updated weights for policy 0, policy_version 227849 (0.0028) [2024-06-13 10:55:35,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45875.3, 300 sec: 45597.5). Total num frames: 3733209088. Throughput: 0: 45573.4. Samples: 251665200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 10:55:35,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 10:55:36,142][73497] Updated weights for policy 0, policy_version 227859 (0.0036) [2024-06-13 10:55:39,773][73497] Updated weights for policy 0, policy_version 227869 (0.0031) [2024-06-13 10:55:40,501][73265] Fps is (10 sec: 49152.5, 60 sec: 46421.4, 300 sec: 45597.5). Total num frames: 3733454848. Throughput: 0: 45857.0. Samples: 251947960. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 10:55:40,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 10:55:43,592][73497] Updated weights for policy 0, policy_version 227879 (0.0033) [2024-06-13 10:55:45,501][73265] Fps is (10 sec: 44236.3, 60 sec: 45875.2, 300 sec: 45486.4). Total num frames: 3733651456. Throughput: 0: 45893.2. Samples: 252226180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 10:55:45,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 10:55:46,836][73497] Updated weights for policy 0, policy_version 227889 (0.0024) [2024-06-13 10:55:50,395][73497] Updated weights for policy 0, policy_version 227899 (0.0038) [2024-06-13 10:55:50,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 3733897216. Throughput: 0: 45751.2. Samples: 252356440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 10:55:50,502][73265] Avg episode reward: [(0, '0.501')] [2024-06-13 10:55:53,871][73497] Updated weights for policy 0, policy_version 227909 (0.0037) [2024-06-13 10:55:55,501][73265] Fps is (10 sec: 47513.7, 60 sec: 46148.3, 300 sec: 45542.0). Total num frames: 3734126592. Throughput: 0: 45906.6. Samples: 252637660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 10:55:55,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 10:55:57,440][73497] Updated weights for policy 0, policy_version 227919 (0.0040) [2024-06-13 10:56:00,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45875.1, 300 sec: 45486.4). Total num frames: 3734339584. Throughput: 0: 45750.7. Samples: 252905960. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 10:56:00,502][73265] Avg episode reward: [(0, '0.409')] [2024-06-13 10:56:01,435][73497] Updated weights for policy 0, policy_version 227929 (0.0034) [2024-06-13 10:56:04,550][73497] Updated weights for policy 0, policy_version 227939 (0.0048) [2024-06-13 10:56:05,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3734552576. Throughput: 0: 45668.8. Samples: 253040700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 10:56:05,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 10:56:08,229][73497] Updated weights for policy 0, policy_version 227949 (0.0031) [2024-06-13 10:56:10,501][73265] Fps is (10 sec: 45875.4, 60 sec: 46148.3, 300 sec: 45542.0). Total num frames: 3734798336. Throughput: 0: 45769.8. Samples: 253317000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 10:56:10,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 10:56:11,967][73497] Updated weights for policy 0, policy_version 227959 (0.0034) [2024-06-13 10:56:13,129][73477] Signal inference workers to stop experience collection... (3750 times) [2024-06-13 10:56:13,130][73477] Signal inference workers to resume experience collection... (3750 times) [2024-06-13 10:56:13,142][73497] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-06-13 10:56:13,142][73497] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-06-13 10:56:15,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45875.2, 300 sec: 45542.8). Total num frames: 3735027712. Throughput: 0: 45838.3. Samples: 253594300. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-13 10:56:15,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 10:56:15,562][73497] Updated weights for policy 0, policy_version 227969 (0.0040) [2024-06-13 10:56:19,080][73497] Updated weights for policy 0, policy_version 227979 (0.0048) [2024-06-13 10:56:20,502][73265] Fps is (10 sec: 44236.2, 60 sec: 45602.0, 300 sec: 45486.4). Total num frames: 3735240704. Throughput: 0: 45948.7. Samples: 253732900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 10:56:20,502][73265] Avg episode reward: [(0, '0.406')] [2024-06-13 10:56:22,572][73497] Updated weights for policy 0, policy_version 227989 (0.0032) [2024-06-13 10:56:25,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45875.1, 300 sec: 45653.3). Total num frames: 3735502848. Throughput: 0: 45873.2. Samples: 254012260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 10:56:25,502][73265] Avg episode reward: [(0, '0.396')] [2024-06-13 10:56:25,517][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000227997_3735502848.pth... [2024-06-13 10:56:25,571][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000227329_3724558336.pth [2024-06-13 10:56:26,119][73497] Updated weights for policy 0, policy_version 227999 (0.0034) [2024-06-13 10:56:30,031][73497] Updated weights for policy 0, policy_version 228009 (0.0032) [2024-06-13 10:56:30,501][73265] Fps is (10 sec: 47514.3, 60 sec: 45875.3, 300 sec: 45542.0). Total num frames: 3735715840. Throughput: 0: 45664.1. Samples: 254281060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 10:56:30,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 10:56:33,202][73497] Updated weights for policy 0, policy_version 228019 (0.0036) [2024-06-13 10:56:35,502][73265] Fps is (10 sec: 44236.2, 60 sec: 45602.0, 300 sec: 45653.0). Total num frames: 3735945216. Throughput: 0: 45776.7. Samples: 254416400. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 10:56:35,502][73265] Avg episode reward: [(0, '0.387')] [2024-06-13 10:56:36,755][73497] Updated weights for policy 0, policy_version 228029 (0.0028) [2024-06-13 10:56:40,504][73265] Fps is (10 sec: 45863.9, 60 sec: 45327.2, 300 sec: 45597.1). Total num frames: 3736174592. Throughput: 0: 45728.7. Samples: 254695560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 10:56:40,504][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 10:56:40,815][73497] Updated weights for policy 0, policy_version 228039 (0.0036) [2024-06-13 10:56:44,031][73497] Updated weights for policy 0, policy_version 228049 (0.0032) [2024-06-13 10:56:45,501][73265] Fps is (10 sec: 45876.1, 60 sec: 45875.3, 300 sec: 45653.1). Total num frames: 3736403968. Throughput: 0: 46041.8. Samples: 254977840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 10:56:45,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:56:47,905][73497] Updated weights for policy 0, policy_version 228059 (0.0040) [2024-06-13 10:56:50,501][73265] Fps is (10 sec: 45886.7, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 3736633344. Throughput: 0: 46110.7. Samples: 255115680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 10:56:50,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 10:56:51,070][73497] Updated weights for policy 0, policy_version 228069 (0.0033) [2024-06-13 10:56:55,091][73497] Updated weights for policy 0, policy_version 228079 (0.0047) [2024-06-13 10:56:55,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3736846336. Throughput: 0: 45870.2. Samples: 255381160. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 10:56:55,502][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 10:56:58,218][73497] Updated weights for policy 0, policy_version 228089 (0.0034) [2024-06-13 10:57:00,501][73265] Fps is (10 sec: 45874.8, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 3737092096. Throughput: 0: 45829.7. Samples: 255656640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 10:57:00,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 10:57:01,946][73497] Updated weights for policy 0, policy_version 228099 (0.0034) [2024-06-13 10:57:05,447][73497] Updated weights for policy 0, policy_version 228109 (0.0030) [2024-06-13 10:57:05,501][73265] Fps is (10 sec: 49152.0, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 3737337856. Throughput: 0: 45906.8. Samples: 255798700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 10:57:05,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 10:57:09,498][73497] Updated weights for policy 0, policy_version 228119 (0.0039) [2024-06-13 10:57:10,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 3737550848. Throughput: 0: 45885.8. Samples: 256077120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 10:57:10,502][73265] Avg episode reward: [(0, '0.409')] [2024-06-13 10:57:12,678][73497] Updated weights for policy 0, policy_version 228129 (0.0035) [2024-06-13 10:57:15,501][73265] Fps is (10 sec: 45875.1, 60 sec: 46148.2, 300 sec: 45708.6). Total num frames: 3737796608. Throughput: 0: 45817.7. Samples: 256342860. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 10:57:15,502][73265] Avg episode reward: [(0, '0.495')] [2024-06-13 10:57:16,542][73497] Updated weights for policy 0, policy_version 228139 (0.0038) [2024-06-13 10:57:19,814][73497] Updated weights for policy 0, policy_version 228149 (0.0033) [2024-06-13 10:57:20,502][73265] Fps is (10 sec: 45874.5, 60 sec: 46148.2, 300 sec: 45597.5). Total num frames: 3738009600. Throughput: 0: 45890.2. Samples: 256481460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-13 10:57:20,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 10:57:23,834][73497] Updated weights for policy 0, policy_version 228159 (0.0034) [2024-06-13 10:57:25,502][73265] Fps is (10 sec: 44236.5, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3738238976. Throughput: 0: 45757.9. Samples: 256754560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:57:25,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 10:57:26,983][73497] Updated weights for policy 0, policy_version 228169 (0.0031) [2024-06-13 10:57:30,502][73265] Fps is (10 sec: 44236.8, 60 sec: 45602.0, 300 sec: 45597.5). Total num frames: 3738451968. Throughput: 0: 45470.0. Samples: 257024000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:57:30,502][73265] Avg episode reward: [(0, '0.382')] [2024-06-13 10:57:30,995][73497] Updated weights for policy 0, policy_version 228179 (0.0033) [2024-06-13 10:57:34,176][73497] Updated weights for policy 0, policy_version 228189 (0.0040) [2024-06-13 10:57:35,501][73265] Fps is (10 sec: 45875.9, 60 sec: 45875.4, 300 sec: 45597.5). Total num frames: 3738697728. Throughput: 0: 45613.3. Samples: 257168280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:57:35,502][73265] Avg episode reward: [(0, '0.495')] [2024-06-13 10:57:38,242][73497] Updated weights for policy 0, policy_version 228199 (0.0026) [2024-06-13 10:57:40,501][73265] Fps is (10 sec: 45876.2, 60 sec: 45604.0, 300 sec: 45708.6). Total num frames: 3738910720. Throughput: 0: 45836.1. Samples: 257443780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:57:40,502][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 10:57:41,427][73497] Updated weights for policy 0, policy_version 228209 (0.0029) [2024-06-13 10:57:45,297][73497] Updated weights for policy 0, policy_version 228219 (0.0042) [2024-06-13 10:57:45,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3739140096. Throughput: 0: 45825.8. Samples: 257718800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:57:45,502][73265] Avg episode reward: [(0, '0.454')] [2024-06-13 10:57:46,405][73477] Signal inference workers to stop experience collection... (3800 times) [2024-06-13 10:57:46,405][73477] Signal inference workers to resume experience collection... (3800 times) [2024-06-13 10:57:46,443][73497] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-06-13 10:57:46,443][73497] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-06-13 10:57:48,413][73497] Updated weights for policy 0, policy_version 228229 (0.0033) [2024-06-13 10:57:50,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3739369472. Throughput: 0: 45528.9. Samples: 257847500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:57:50,502][73265] Avg episode reward: [(0, '0.488')] [2024-06-13 10:57:52,547][73497] Updated weights for policy 0, policy_version 228239 (0.0035) [2024-06-13 10:57:55,501][73265] Fps is (10 sec: 47513.4, 60 sec: 46148.2, 300 sec: 45653.0). Total num frames: 3739615232. Throughput: 0: 45327.1. 
Samples: 258116840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:57:55,502][73265] Avg episode reward: [(0, '0.417')] [2024-06-13 10:57:55,666][73497] Updated weights for policy 0, policy_version 228249 (0.0044) [2024-06-13 10:57:59,688][73497] Updated weights for policy 0, policy_version 228259 (0.0031) [2024-06-13 10:58:00,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.2, 300 sec: 45653.1). Total num frames: 3739828224. Throughput: 0: 45638.3. Samples: 258396580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:58:00,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 10:58:03,062][73497] Updated weights for policy 0, policy_version 228269 (0.0026) [2024-06-13 10:58:05,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3740073984. Throughput: 0: 45589.9. Samples: 258533000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:58:05,502][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 10:58:06,814][73497] Updated weights for policy 0, policy_version 228279 (0.0036) [2024-06-13 10:58:10,178][73497] Updated weights for policy 0, policy_version 228289 (0.0037) [2024-06-13 10:58:10,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3740286976. Throughput: 0: 45675.2. Samples: 258809940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:58:10,502][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 10:58:14,007][73497] Updated weights for policy 0, policy_version 228299 (0.0035) [2024-06-13 10:58:15,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 3740516352. Throughput: 0: 45604.6. Samples: 259076200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:58:15,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 10:58:17,223][73497] Updated weights for policy 0, policy_version 228309 (0.0030) [2024-06-13 10:58:20,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45329.2, 300 sec: 45597.5). Total num frames: 3740729344. Throughput: 0: 45437.3. Samples: 259212960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:58:20,502][73265] Avg episode reward: [(0, '0.393')] [2024-06-13 10:58:21,296][73497] Updated weights for policy 0, policy_version 228319 (0.0023) [2024-06-13 10:58:24,183][73497] Updated weights for policy 0, policy_version 228329 (0.0036) [2024-06-13 10:58:25,502][73265] Fps is (10 sec: 44236.3, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3740958720. Throughput: 0: 45368.7. Samples: 259485380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:58:25,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 10:58:25,712][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000228332_3740991488.pth... [2024-06-13 10:58:25,755][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000227662_3730014208.pth [2024-06-13 10:58:28,194][73497] Updated weights for policy 0, policy_version 228339 (0.0031) [2024-06-13 10:58:30,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45875.3, 300 sec: 45708.6). Total num frames: 3741204480. Throughput: 0: 45573.8. Samples: 259769620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-13 10:58:30,502][73265] Avg episode reward: [(0, '0.405')] [2024-06-13 10:58:31,587][73497] Updated weights for policy 0, policy_version 228349 (0.0027) [2024-06-13 10:58:35,035][73497] Updated weights for policy 0, policy_version 228359 (0.0032) [2024-06-13 10:58:35,501][73265] Fps is (10 sec: 47514.0, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 3741433856. Throughput: 0: 45784.0. Samples: 259907780. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 10:58:35,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 10:58:38,603][73497] Updated weights for policy 0, policy_version 228369 (0.0031) [2024-06-13 10:58:40,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 3741663232. Throughput: 0: 45865.0. Samples: 260180760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 10:58:40,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 10:58:42,362][73497] Updated weights for policy 0, policy_version 228379 (0.0041) [2024-06-13 10:58:45,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 45819.6). Total num frames: 3741892608. Throughput: 0: 45611.1. Samples: 260449080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 10:58:45,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 10:58:45,990][73497] Updated weights for policy 0, policy_version 228389 (0.0037) [2024-06-13 10:58:49,661][73497] Updated weights for policy 0, policy_version 228399 (0.0033) [2024-06-13 10:58:50,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 3742105600. Throughput: 0: 45728.1. Samples: 260590760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 10:58:50,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 10:58:53,297][73497] Updated weights for policy 0, policy_version 228409 (0.0032) [2024-06-13 10:58:55,502][73265] Fps is (10 sec: 45874.8, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 3742351360. Throughput: 0: 45738.5. Samples: 260868180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 10:58:55,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 10:58:56,546][73497] Updated weights for policy 0, policy_version 228419 (0.0028) [2024-06-13 10:59:00,251][73497] Updated weights for policy 0, policy_version 228429 (0.0033) [2024-06-13 10:59:00,502][73265] Fps is (10 sec: 47513.0, 60 sec: 45875.1, 300 sec: 45819.7). Total num frames: 3742580736. Throughput: 0: 45753.2. Samples: 261135100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 10:59:00,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 10:59:03,883][73497] Updated weights for policy 0, policy_version 228439 (0.0037) [2024-06-13 10:59:05,501][73265] Fps is (10 sec: 44237.4, 60 sec: 45329.1, 300 sec: 45708.6). Total num frames: 3742793728. Throughput: 0: 45726.2. Samples: 261270640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 10:59:05,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 10:59:06,992][73477] Signal inference workers to stop experience collection... (3850 times) [2024-06-13 10:59:07,039][73477] Signal inference workers to resume experience collection... (3850 times) [2024-06-13 10:59:07,040][73497] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-06-13 10:59:07,052][73497] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-06-13 10:59:07,385][73497] Updated weights for policy 0, policy_version 228449 (0.0039) [2024-06-13 10:59:10,502][73265] Fps is (10 sec: 44236.9, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3743023104. Throughput: 0: 45786.7. Samples: 261545780. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 10:59:10,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 10:59:11,306][73497] Updated weights for policy 0, policy_version 228459 (0.0052) [2024-06-13 10:59:15,035][73497] Updated weights for policy 0, policy_version 228469 (0.0042) [2024-06-13 10:59:15,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 3743268864. Throughput: 0: 45473.8. Samples: 261815940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 10:59:15,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 10:59:18,795][73497] Updated weights for policy 0, policy_version 228479 (0.0033) [2024-06-13 10:59:20,501][73265] Fps is (10 sec: 47514.3, 60 sec: 46148.3, 300 sec: 45708.6). Total num frames: 3743498240. Throughput: 0: 45427.2. Samples: 261952000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 10:59:20,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 10:59:22,131][73497] Updated weights for policy 0, policy_version 228489 (0.0042) [2024-06-13 10:59:25,502][73265] Fps is (10 sec: 42597.7, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3743694848. Throughput: 0: 45520.7. Samples: 262229200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 10:59:25,502][73265] Avg episode reward: [(0, '0.409')] [2024-06-13 10:59:25,941][73497] Updated weights for policy 0, policy_version 228499 (0.0037) [2024-06-13 10:59:29,103][73497] Updated weights for policy 0, policy_version 228509 (0.0038) [2024-06-13 10:59:30,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 3743940608. Throughput: 0: 45481.9. Samples: 262495760. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 10:59:30,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 10:59:33,074][73497] Updated weights for policy 0, policy_version 228519 (0.0023) [2024-06-13 10:59:35,501][73265] Fps is (10 sec: 45876.1, 60 sec: 45329.1, 300 sec: 45708.6). Total num frames: 3744153600. Throughput: 0: 45427.6. Samples: 262635000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 10:59:35,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 10:59:36,692][73497] Updated weights for policy 0, policy_version 228529 (0.0031) [2024-06-13 10:59:40,504][73265] Fps is (10 sec: 42587.7, 60 sec: 45054.1, 300 sec: 45652.7). Total num frames: 3744366592. Throughput: 0: 45170.6. Samples: 262900960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-13 10:59:40,504][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 10:59:40,596][73497] Updated weights for policy 0, policy_version 228539 (0.0033) [2024-06-13 10:59:44,134][73497] Updated weights for policy 0, policy_version 228549 (0.0036) [2024-06-13 10:59:45,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3744595968. Throughput: 0: 45342.3. Samples: 263175500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 10:59:45,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 10:59:48,040][73497] Updated weights for policy 0, policy_version 228559 (0.0036) [2024-06-13 10:59:50,501][73265] Fps is (10 sec: 47525.2, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 3744841728. Throughput: 0: 45427.5. Samples: 263314880. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 10:59:50,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 10:59:51,151][73497] Updated weights for policy 0, policy_version 228569 (0.0027) [2024-06-13 10:59:55,126][73497] Updated weights for policy 0, policy_version 228579 (0.0030) [2024-06-13 10:59:55,501][73265] Fps is (10 sec: 47513.7, 60 sec: 45329.2, 300 sec: 45708.6). Total num frames: 3745071104. Throughput: 0: 45284.1. Samples: 263583560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 10:59:55,507][73265] Avg episode reward: [(0, '0.408')] [2024-06-13 10:59:58,079][73497] Updated weights for policy 0, policy_version 228589 (0.0034) [2024-06-13 11:00:00,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45056.1, 300 sec: 45597.5). Total num frames: 3745284096. Throughput: 0: 45296.5. Samples: 263854280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 11:00:00,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 11:00:02,088][73497] Updated weights for policy 0, policy_version 228599 (0.0037) [2024-06-13 11:00:05,463][73497] Updated weights for policy 0, policy_version 228609 (0.0031) [2024-06-13 11:00:05,502][73265] Fps is (10 sec: 45874.5, 60 sec: 45602.0, 300 sec: 45764.1). Total num frames: 3745529856. Throughput: 0: 45306.0. Samples: 263990780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 11:00:05,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 11:00:09,415][73497] Updated weights for policy 0, policy_version 228619 (0.0044) [2024-06-13 11:00:10,502][73265] Fps is (10 sec: 45874.5, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 3745742848. Throughput: 0: 45237.8. Samples: 264264900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 11:00:10,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 11:00:12,770][73497] Updated weights for policy 0, policy_version 228629 (0.0040) [2024-06-13 11:00:15,502][73265] Fps is (10 sec: 42598.0, 60 sec: 44782.7, 300 sec: 45597.5). Total num frames: 3745955840. Throughput: 0: 45270.8. Samples: 264532960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 11:00:15,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 11:00:16,901][73497] Updated weights for policy 0, policy_version 228639 (0.0026) [2024-06-13 11:00:19,832][73497] Updated weights for policy 0, policy_version 228649 (0.0034) [2024-06-13 11:00:20,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45056.0, 300 sec: 45597.5). Total num frames: 3746201600. Throughput: 0: 45160.9. Samples: 264667240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 11:00:20,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 11:00:23,841][73497] Updated weights for policy 0, policy_version 228659 (0.0027) [2024-06-13 11:00:25,501][73265] Fps is (10 sec: 47514.7, 60 sec: 45602.2, 300 sec: 45653.1). Total num frames: 3746430976. Throughput: 0: 45403.3. Samples: 264944000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 11:00:25,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 11:00:25,536][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000228665_3746447360.pth... [2024-06-13 11:00:25,574][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000227997_3735502848.pth [2024-06-13 11:00:27,282][73497] Updated weights for policy 0, policy_version 228669 (0.0035) [2024-06-13 11:00:30,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45055.9, 300 sec: 45542.0). Total num frames: 3746643968. Throughput: 0: 45417.3. Samples: 265219280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 11:00:30,502][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 11:00:31,061][73497] Updated weights for policy 0, policy_version 228679 (0.0037) [2024-06-13 11:00:31,433][73477] Signal inference workers to stop experience collection... (3900 times) [2024-06-13 11:00:31,433][73477] Signal inference workers to resume experience collection... (3900 times) [2024-06-13 11:00:31,478][73497] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-06-13 11:00:31,479][73497] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-06-13 11:00:34,370][73497] Updated weights for policy 0, policy_version 228689 (0.0032) [2024-06-13 11:00:35,504][73265] Fps is (10 sec: 45863.8, 60 sec: 45600.2, 300 sec: 45541.6). Total num frames: 3746889728. Throughput: 0: 45337.5. Samples: 265355180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 11:00:35,504][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 11:00:38,303][73497] Updated weights for policy 0, policy_version 228699 (0.0032) [2024-06-13 11:00:40,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45877.1, 300 sec: 45653.1). Total num frames: 3747119104. Throughput: 0: 45271.1. Samples: 265620760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 11:00:40,502][73265] Avg episode reward: [(0, '0.389')] [2024-06-13 11:00:41,707][73497] Updated weights for policy 0, policy_version 228709 (0.0037) [2024-06-13 11:00:45,501][73265] Fps is (10 sec: 42609.3, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3747315712. Throughput: 0: 45371.6. Samples: 265896000. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 11:00:45,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 11:00:45,584][73497] Updated weights for policy 0, policy_version 228719 (0.0033) [2024-06-13 11:00:48,699][73497] Updated weights for policy 0, policy_version 228729 (0.0029) [2024-06-13 11:00:50,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3747561472. Throughput: 0: 45419.3. Samples: 266034640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-13 11:00:50,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 11:00:52,768][73497] Updated weights for policy 0, policy_version 228739 (0.0037) [2024-06-13 11:00:55,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3747790848. Throughput: 0: 45331.7. Samples: 266304820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 11:00:55,502][73265] Avg episode reward: [(0, '0.477')] [2024-06-13 11:00:55,992][73497] Updated weights for policy 0, policy_version 228749 (0.0021) [2024-06-13 11:00:59,915][73497] Updated weights for policy 0, policy_version 228759 (0.0033) [2024-06-13 11:01:00,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3748020224. Throughput: 0: 45607.4. Samples: 266585280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 11:01:00,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 11:01:03,253][73497] Updated weights for policy 0, policy_version 228769 (0.0033) [2024-06-13 11:01:05,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45056.1, 300 sec: 45542.0). Total num frames: 3748233216. Throughput: 0: 45516.8. Samples: 266715500. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 11:01:05,502][73265] Avg episode reward: [(0, '0.408')] [2024-06-13 11:01:07,262][73497] Updated weights for policy 0, policy_version 228779 (0.0029) [2024-06-13 11:01:10,487][73497] Updated weights for policy 0, policy_version 228789 (0.0038) [2024-06-13 11:01:10,502][73265] Fps is (10 sec: 45874.7, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3748478976. Throughput: 0: 45363.9. Samples: 266985380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 11:01:10,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 11:01:14,239][73497] Updated weights for policy 0, policy_version 228799 (0.0037) [2024-06-13 11:01:15,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.3, 300 sec: 45597.5). Total num frames: 3748691968. Throughput: 0: 45431.6. Samples: 267263700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 11:01:15,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 11:01:17,407][73497] Updated weights for policy 0, policy_version 228809 (0.0035) [2024-06-13 11:01:20,502][73265] Fps is (10 sec: 40960.0, 60 sec: 44782.8, 300 sec: 45375.3). Total num frames: 3748888576. Throughput: 0: 45270.8. Samples: 267392260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 11:01:20,502][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 11:01:21,722][73497] Updated weights for policy 0, policy_version 228819 (0.0046) [2024-06-13 11:01:24,476][73497] Updated weights for policy 0, policy_version 228829 (0.0031) [2024-06-13 11:01:25,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3749167104. Throughput: 0: 45537.3. Samples: 267669940. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 11:01:25,502][73265] Avg episode reward: [(0, '0.416')] [2024-06-13 11:01:28,766][73497] Updated weights for policy 0, policy_version 228839 (0.0035) [2024-06-13 11:01:30,501][73265] Fps is (10 sec: 49153.1, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 3749380096. Throughput: 0: 45427.6. Samples: 267940240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 11:01:30,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 11:01:31,896][73497] Updated weights for policy 0, policy_version 228849 (0.0042) [2024-06-13 11:01:35,503][73265] Fps is (10 sec: 42590.6, 60 sec: 45056.5, 300 sec: 45486.5). Total num frames: 3749593088. Throughput: 0: 45360.3. Samples: 268075940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 11:01:35,504][73265] Avg episode reward: [(0, '0.472')] [2024-06-13 11:01:35,954][73497] Updated weights for policy 0, policy_version 228859 (0.0036) [2024-06-13 11:01:38,920][73497] Updated weights for policy 0, policy_version 228869 (0.0046) [2024-06-13 11:01:40,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3749838848. Throughput: 0: 45383.1. Samples: 268347060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 11:01:40,502][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 11:01:42,883][73497] Updated weights for policy 0, policy_version 228879 (0.0030) [2024-06-13 11:01:45,501][73265] Fps is (10 sec: 47522.3, 60 sec: 45875.1, 300 sec: 45542.0). Total num frames: 3750068224. Throughput: 0: 45517.7. Samples: 268633580. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 11:01:45,502][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 11:01:45,927][73497] Updated weights for policy 0, policy_version 228889 (0.0040) [2024-06-13 11:01:50,291][73497] Updated weights for policy 0, policy_version 228899 (0.0040) [2024-06-13 11:01:50,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3750297600. Throughput: 0: 45592.5. Samples: 268767160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 11:01:50,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 11:01:53,451][73497] Updated weights for policy 0, policy_version 228909 (0.0040) [2024-06-13 11:01:55,502][73265] Fps is (10 sec: 44236.5, 60 sec: 45329.0, 300 sec: 45486.4). Total num frames: 3750510592. Throughput: 0: 45542.2. Samples: 269034780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 11:01:55,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 11:01:57,408][73497] Updated weights for policy 0, policy_version 228919 (0.0032) [2024-06-13 11:02:00,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45602.2, 300 sec: 45486.4). Total num frames: 3750756352. Throughput: 0: 45366.3. Samples: 269305180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-13 11:02:00,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 11:02:00,700][73497] Updated weights for policy 0, policy_version 228929 (0.0034) [2024-06-13 11:02:02,420][73477] Signal inference workers to stop experience collection... (3950 times) [2024-06-13 11:02:02,467][73477] Signal inference workers to resume experience collection... 
(3950 times) [2024-06-13 11:02:02,467][73497] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-06-13 11:02:02,483][73497] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-06-13 11:02:04,645][73497] Updated weights for policy 0, policy_version 228939 (0.0029) [2024-06-13 11:02:05,502][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3750952960. Throughput: 0: 45491.5. Samples: 269439380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 11:02:05,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 11:02:07,853][73497] Updated weights for policy 0, policy_version 228949 (0.0034) [2024-06-13 11:02:10,501][73265] Fps is (10 sec: 42597.9, 60 sec: 45056.0, 300 sec: 45375.3). Total num frames: 3751182336. Throughput: 0: 45295.5. Samples: 269708240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 11:02:10,502][73265] Avg episode reward: [(0, '0.492')] [2024-06-13 11:02:11,761][73497] Updated weights for policy 0, policy_version 228959 (0.0043) [2024-06-13 11:02:15,089][73497] Updated weights for policy 0, policy_version 228969 (0.0037) [2024-06-13 11:02:15,501][73265] Fps is (10 sec: 49152.4, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3751444480. Throughput: 0: 45387.4. Samples: 269982680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 11:02:15,502][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 11:02:19,340][73497] Updated weights for policy 0, policy_version 228979 (0.0040) [2024-06-13 11:02:20,502][73265] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 3751641088. Throughput: 0: 45528.0. Samples: 270124620. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 11:02:20,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 11:02:22,139][73497] Updated weights for policy 0, policy_version 228989 (0.0040) [2024-06-13 11:02:25,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3751886848. Throughput: 0: 45565.3. Samples: 270397500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 11:02:25,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 11:02:25,521][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000228997_3751886848.pth... [2024-06-13 11:02:25,573][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000228332_3740991488.pth [2024-06-13 11:02:26,315][73497] Updated weights for policy 0, policy_version 228999 (0.0035) [2024-06-13 11:02:29,725][73497] Updated weights for policy 0, policy_version 229009 (0.0037) [2024-06-13 11:02:30,502][73265] Fps is (10 sec: 45875.4, 60 sec: 45328.9, 300 sec: 45430.9). Total num frames: 3752099840. Throughput: 0: 45117.3. Samples: 270663860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 11:02:30,502][73265] Avg episode reward: [(0, '0.472')] [2024-06-13 11:02:33,487][73497] Updated weights for policy 0, policy_version 229019 (0.0030) [2024-06-13 11:02:35,501][73265] Fps is (10 sec: 42598.3, 60 sec: 45330.5, 300 sec: 45430.9). Total num frames: 3752312832. Throughput: 0: 45309.8. Samples: 270806100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 11:02:35,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 11:02:36,931][73497] Updated weights for policy 0, policy_version 229029 (0.0032) [2024-06-13 11:02:40,501][73265] Fps is (10 sec: 45875.9, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3752558592. Throughput: 0: 45237.5. Samples: 271070460. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 11:02:40,502][73265] Avg episode reward: [(0, '0.494')] [2024-06-13 11:02:40,802][73497] Updated weights for policy 0, policy_version 229039 (0.0036) [2024-06-13 11:02:44,101][73497] Updated weights for policy 0, policy_version 229049 (0.0034) [2024-06-13 11:02:45,501][73265] Fps is (10 sec: 47513.7, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3752787968. Throughput: 0: 45334.7. Samples: 271345240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 11:02:45,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 11:02:47,978][73497] Updated weights for policy 0, policy_version 229059 (0.0028) [2024-06-13 11:02:50,501][73265] Fps is (10 sec: 44236.4, 60 sec: 45056.0, 300 sec: 45375.4). Total num frames: 3753000960. Throughput: 0: 45434.3. Samples: 271483920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 11:02:50,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 11:02:51,032][73497] Updated weights for policy 0, policy_version 229069 (0.0037) [2024-06-13 11:02:55,011][73497] Updated weights for policy 0, policy_version 229079 (0.0032) [2024-06-13 11:02:55,502][73265] Fps is (10 sec: 44236.1, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3753230336. Throughput: 0: 45535.1. Samples: 271757320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 11:02:55,508][73265] Avg episode reward: [(0, '0.481')] [2024-06-13 11:02:58,691][73497] Updated weights for policy 0, policy_version 229089 (0.0034) [2024-06-13 11:03:00,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45056.0, 300 sec: 45375.4). Total num frames: 3753459712. Throughput: 0: 45272.5. Samples: 272019940. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 11:03:00,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 11:03:02,641][73497] Updated weights for policy 0, policy_version 229099 (0.0027) [2024-06-13 11:03:05,503][73265] Fps is (10 sec: 45867.6, 60 sec: 45600.9, 300 sec: 45430.6). Total num frames: 3753689088. Throughput: 0: 45226.8. Samples: 272159900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-13 11:03:05,504][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 11:03:06,332][73497] Updated weights for policy 0, policy_version 229109 (0.0033) [2024-06-13 11:03:09,635][73497] Updated weights for policy 0, policy_version 229119 (0.0030) [2024-06-13 11:03:10,504][73265] Fps is (10 sec: 45863.9, 60 sec: 45600.3, 300 sec: 45430.5). Total num frames: 3753918464. Throughput: 0: 45266.4. Samples: 272434600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-13 11:03:10,504][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 11:03:13,424][73497] Updated weights for policy 0, policy_version 229129 (0.0031) [2024-06-13 11:03:15,501][73265] Fps is (10 sec: 45883.7, 60 sec: 45056.1, 300 sec: 45486.4). Total num frames: 3754147840. Throughput: 0: 45392.6. Samples: 272706520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-13 11:03:15,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 11:03:16,764][73497] Updated weights for policy 0, policy_version 229139 (0.0030) [2024-06-13 11:03:20,297][73497] Updated weights for policy 0, policy_version 229149 (0.0032) [2024-06-13 11:03:20,501][73265] Fps is (10 sec: 45886.5, 60 sec: 45602.2, 300 sec: 45486.4). Total num frames: 3754377216. Throughput: 0: 45253.8. Samples: 272842520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-13 11:03:20,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 11:03:21,688][73477] Signal inference workers to stop experience collection... 
(4000 times) [2024-06-13 11:03:21,689][73477] Signal inference workers to resume experience collection... (4000 times) [2024-06-13 11:03:21,708][73497] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-06-13 11:03:21,708][73497] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-06-13 11:03:23,749][73497] Updated weights for policy 0, policy_version 229159 (0.0033) [2024-06-13 11:03:25,501][73265] Fps is (10 sec: 44236.4, 60 sec: 45055.9, 300 sec: 45375.3). Total num frames: 3754590208. Throughput: 0: 45373.7. Samples: 273112280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-13 11:03:25,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 11:03:27,386][73497] Updated weights for policy 0, policy_version 229169 (0.0033) [2024-06-13 11:03:30,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45329.2, 300 sec: 45375.4). Total num frames: 3754819584. Throughput: 0: 45416.9. Samples: 273389000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-13 11:03:30,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 11:03:31,339][73497] Updated weights for policy 0, policy_version 229179 (0.0027) [2024-06-13 11:03:35,202][73497] Updated weights for policy 0, policy_version 229189 (0.0034) [2024-06-13 11:03:35,502][73265] Fps is (10 sec: 44236.6, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 3755032576. Throughput: 0: 45316.4. Samples: 273523160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-13 11:03:35,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 11:03:38,487][73497] Updated weights for policy 0, policy_version 229199 (0.0029) [2024-06-13 11:03:40,506][73265] Fps is (10 sec: 47492.0, 60 sec: 45598.7, 300 sec: 45430.2). Total num frames: 3755294720. Throughput: 0: 45301.8. Samples: 273796100. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-13 11:03:40,506][73265] Avg episode reward: [(0, '0.491')] [2024-06-13 11:03:42,499][73497] Updated weights for policy 0, policy_version 229209 (0.0028) [2024-06-13 11:03:45,391][73497] Updated weights for policy 0, policy_version 229219 (0.0029) [2024-06-13 11:03:45,502][73265] Fps is (10 sec: 49151.9, 60 sec: 45602.0, 300 sec: 45486.4). Total num frames: 3755524096. Throughput: 0: 45526.1. Samples: 274068620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-13 11:03:45,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 11:03:49,652][73497] Updated weights for policy 0, policy_version 229229 (0.0036) [2024-06-13 11:03:50,501][73265] Fps is (10 sec: 42617.6, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 3755720704. Throughput: 0: 45472.5. Samples: 274206080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-13 11:03:50,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 11:03:52,639][73497] Updated weights for policy 0, policy_version 229239 (0.0031) [2024-06-13 11:03:55,502][73265] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 3755950080. Throughput: 0: 45360.6. Samples: 274475720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-13 11:03:55,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 11:03:56,742][73497] Updated weights for policy 0, policy_version 229249 (0.0039) [2024-06-13 11:04:00,026][73497] Updated weights for policy 0, policy_version 229259 (0.0035) [2024-06-13 11:04:00,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 3756195840. Throughput: 0: 45382.6. Samples: 274748740. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-13 11:04:00,502][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 11:04:04,290][73497] Updated weights for policy 0, policy_version 229269 (0.0029) [2024-06-13 11:04:05,501][73265] Fps is (10 sec: 47514.0, 60 sec: 45603.5, 300 sec: 45430.9). Total num frames: 3756425216. Throughput: 0: 45529.3. Samples: 274891340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-13 11:04:05,502][73265] Avg episode reward: [(0, '0.520')] [2024-06-13 11:04:06,851][73497] Updated weights for policy 0, policy_version 229279 (0.0025) [2024-06-13 11:04:10,502][73265] Fps is (10 sec: 44236.5, 60 sec: 45330.9, 300 sec: 45319.8). Total num frames: 3756638208. Throughput: 0: 45496.4. Samples: 275159620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-13 11:04:10,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 11:04:11,607][73497] Updated weights for policy 0, policy_version 229289 (0.0038) [2024-06-13 11:04:13,799][73497] Updated weights for policy 0, policy_version 229299 (0.0030) [2024-06-13 11:04:15,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45602.1, 300 sec: 45375.3). Total num frames: 3756883968. Throughput: 0: 45385.7. Samples: 275431360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-13 11:04:15,502][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 11:04:18,538][73497] Updated weights for policy 0, policy_version 229309 (0.0031) [2024-06-13 11:04:20,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3757096960. Throughput: 0: 45617.0. Samples: 275575920. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-13 11:04:20,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 11:04:21,317][73497] Updated weights for policy 0, policy_version 229319 (0.0029) [2024-06-13 11:04:25,502][73265] Fps is (10 sec: 40959.7, 60 sec: 45055.9, 300 sec: 45264.2). Total num frames: 3757293568. Throughput: 0: 45379.0. Samples: 275837960. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-13 11:04:25,502][73265] Avg episode reward: [(0, '0.496')] [2024-06-13 11:04:25,554][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000229328_3757309952.pth... [2024-06-13 11:04:25,564][73477] Signal inference workers to stop experience collection... (4050 times) [2024-06-13 11:04:25,565][73477] Signal inference workers to resume experience collection... (4050 times) [2024-06-13 11:04:25,584][73497] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-06-13 11:04:25,584][73497] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-06-13 11:04:25,602][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000228665_3746447360.pth [2024-06-13 11:04:25,753][73497] Updated weights for policy 0, policy_version 229329 (0.0044) [2024-06-13 11:04:28,492][73497] Updated weights for policy 0, policy_version 229339 (0.0032) [2024-06-13 11:04:30,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 3757555712. Throughput: 0: 45315.7. Samples: 276107820. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-13 11:04:30,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 11:04:33,307][73497] Updated weights for policy 0, policy_version 229349 (0.0040) [2024-06-13 11:04:35,501][73265] Fps is (10 sec: 50791.6, 60 sec: 46148.4, 300 sec: 45542.4). Total num frames: 3757801472. Throughput: 0: 45553.0. Samples: 276255960. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-13 11:04:35,502][73265] Avg episode reward: [(0, '0.474')] [2024-06-13 11:04:35,546][73497] Updated weights for policy 0, policy_version 229359 (0.0032) [2024-06-13 11:04:40,439][73497] Updated weights for policy 0, policy_version 229369 (0.0029) [2024-06-13 11:04:40,501][73265] Fps is (10 sec: 42598.4, 60 sec: 44786.3, 300 sec: 45375.3). Total num frames: 3757981696. Throughput: 0: 45525.4. Samples: 276524360. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-13 11:04:40,505][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 11:04:42,987][73497] Updated weights for policy 0, policy_version 229379 (0.0038) [2024-06-13 11:04:45,502][73265] Fps is (10 sec: 42597.2, 60 sec: 45055.9, 300 sec: 45375.3). Total num frames: 3758227456. Throughput: 0: 45424.7. Samples: 276792860. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-13 11:04:45,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 11:04:47,410][73497] Updated weights for policy 0, policy_version 229389 (0.0041) [2024-06-13 11:04:50,115][73497] Updated weights for policy 0, policy_version 229399 (0.0046) [2024-06-13 11:04:50,501][73265] Fps is (10 sec: 50790.6, 60 sec: 46148.3, 300 sec: 45486.4). Total num frames: 3758489600. Throughput: 0: 45470.3. Samples: 276937500. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-13 11:04:50,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 11:04:54,924][73497] Updated weights for policy 0, policy_version 229409 (0.0030) [2024-06-13 11:04:55,501][73265] Fps is (10 sec: 44237.9, 60 sec: 45329.2, 300 sec: 45375.4). Total num frames: 3758669824. Throughput: 0: 45562.4. Samples: 277209920. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-13 11:04:55,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 11:04:57,227][73497] Updated weights for policy 0, policy_version 229419 (0.0031) [2024-06-13 11:05:00,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45329.1, 300 sec: 45375.4). Total num frames: 3758915584. Throughput: 0: 45547.7. Samples: 277481000. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-13 11:05:00,502][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 11:05:02,079][73497] Updated weights for policy 0, policy_version 229429 (0.0036) [2024-06-13 11:05:04,035][73497] Updated weights for policy 0, policy_version 229439 (0.0028) [2024-06-13 11:05:05,501][73265] Fps is (10 sec: 49151.2, 60 sec: 45602.1, 300 sec: 45486.4). 
Total num frames: 3759161344. Throughput: 0: 45433.7. Samples: 277620440. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-13 11:05:05,502][73265] Avg episode reward: [(0, '0.480')] [2024-06-13 11:05:09,227][73497] Updated weights for policy 0, policy_version 229449 (0.0030) [2024-06-13 11:05:10,502][73265] Fps is (10 sec: 45874.2, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3759374336. Throughput: 0: 45765.3. Samples: 277897400. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-13 11:05:10,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 11:05:11,613][73497] Updated weights for policy 0, policy_version 229459 (0.0032) [2024-06-13 11:05:15,501][73265] Fps is (10 sec: 42598.8, 60 sec: 45056.0, 300 sec: 45375.3). Total num frames: 3759587328. Throughput: 0: 45797.8. Samples: 278168720. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-13 11:05:15,502][73265] Avg episode reward: [(0, '0.506')] [2024-06-13 11:05:16,299][73497] Updated weights for policy 0, policy_version 229469 (0.0037) [2024-06-13 11:05:18,825][73497] Updated weights for policy 0, policy_version 229479 (0.0038) [2024-06-13 11:05:20,501][73265] Fps is (10 sec: 45876.1, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 3759833088. Throughput: 0: 45353.7. Samples: 278296880. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-13 11:05:20,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 11:05:23,637][73497] Updated weights for policy 0, policy_version 229489 (0.0025) [2024-06-13 11:05:25,502][73265] Fps is (10 sec: 49151.5, 60 sec: 46421.4, 300 sec: 45542.0). Total num frames: 3760078848. Throughput: 0: 45769.3. Samples: 278583980. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-13 11:05:25,502][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 11:05:25,690][73497] Updated weights for policy 0, policy_version 229499 (0.0037) [2024-06-13 11:05:30,501][73265] Fps is (10 sec: 42598.8, 60 sec: 45056.1, 300 sec: 45320.2). 
Total num frames: 3760259072. Throughput: 0: 45854.1. Samples: 278856280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 11:05:30,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 11:05:30,518][73497] Updated weights for policy 0, policy_version 229509 (0.0036) [2024-06-13 11:05:32,847][73477] Signal inference workers to stop experience collection... (4100 times) [2024-06-13 11:05:32,852][73477] Signal inference workers to resume experience collection... (4100 times) [2024-06-13 11:05:32,891][73497] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-06-13 11:05:32,891][73497] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-06-13 11:05:32,997][73497] Updated weights for policy 0, policy_version 229519 (0.0044) [2024-06-13 11:05:35,501][73265] Fps is (10 sec: 40960.4, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 3760488448. Throughput: 0: 45443.1. Samples: 278982440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 11:05:35,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 11:05:37,937][73497] Updated weights for policy 0, policy_version 229529 (0.0033) [2024-06-13 11:05:40,391][73497] Updated weights for policy 0, policy_version 229539 (0.0037) [2024-06-13 11:05:40,501][73265] Fps is (10 sec: 50790.0, 60 sec: 46421.4, 300 sec: 45597.5). Total num frames: 3760766976. Throughput: 0: 45514.2. Samples: 279258060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 11:05:40,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 11:05:44,979][73497] Updated weights for policy 0, policy_version 229549 (0.0043) [2024-06-13 11:05:45,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45329.2, 300 sec: 45375.3). Total num frames: 3760947200. Throughput: 0: 45695.5. Samples: 279537300. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 11:05:45,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 11:05:47,434][73497] Updated weights for policy 0, policy_version 229559 (0.0032) [2024-06-13 11:05:50,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 3761192960. Throughput: 0: 45436.6. Samples: 279665080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 11:05:50,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 11:05:52,431][73497] Updated weights for policy 0, policy_version 229569 (0.0037) [2024-06-13 11:05:54,873][73497] Updated weights for policy 0, policy_version 229579 (0.0031) [2024-06-13 11:05:55,501][73265] Fps is (10 sec: 49152.0, 60 sec: 46148.2, 300 sec: 45486.4). Total num frames: 3761438720. Throughput: 0: 45334.8. Samples: 279937460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 11:05:55,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 11:05:59,543][73497] Updated weights for policy 0, policy_version 229589 (0.0026) [2024-06-13 11:06:00,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3761635328. Throughput: 0: 45370.2. Samples: 280210380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 11:06:00,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 11:06:02,215][73497] Updated weights for policy 0, policy_version 229599 (0.0040) [2024-06-13 11:06:05,502][73265] Fps is (10 sec: 40959.5, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 3761848320. Throughput: 0: 45423.4. Samples: 280340940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 11:06:05,502][73265] Avg episode reward: [(0, '0.498')] [2024-06-13 11:06:06,373][73497] Updated weights for policy 0, policy_version 229609 (0.0032) [2024-06-13 11:06:09,397][73497] Updated weights for policy 0, policy_version 229619 (0.0035) [2024-06-13 11:06:10,501][73265] Fps is (10 sec: 47513.2, 60 sec: 45602.2, 300 sec: 45486.4). 
Total num frames: 3762110464. Throughput: 0: 45156.9. Samples: 280616040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 11:06:10,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 11:06:13,894][73497] Updated weights for policy 0, policy_version 229629 (0.0034) [2024-06-13 11:06:15,501][73265] Fps is (10 sec: 49152.6, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 3762339840. Throughput: 0: 45243.9. Samples: 280892260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 11:06:15,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 11:06:16,483][73497] Updated weights for policy 0, policy_version 229639 (0.0044) [2024-06-13 11:06:20,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45055.9, 300 sec: 45319.8). Total num frames: 3762536448. Throughput: 0: 45434.6. Samples: 281027000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 11:06:20,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 11:06:21,033][73497] Updated weights for policy 0, policy_version 229649 (0.0037) [2024-06-13 11:06:23,869][73497] Updated weights for policy 0, policy_version 229659 (0.0028) [2024-06-13 11:06:25,502][73265] Fps is (10 sec: 44236.2, 60 sec: 45056.0, 300 sec: 45430.8). Total num frames: 3762782208. Throughput: 0: 45144.3. Samples: 281289560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 11:06:25,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 11:06:25,515][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000229662_3762782208.pth... [2024-06-13 11:06:25,566][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000228997_3751886848.pth [2024-06-13 11:06:28,043][73497] Updated weights for policy 0, policy_version 229669 (0.0039) [2024-06-13 11:06:30,501][73265] Fps is (10 sec: 47514.2, 60 sec: 45875.2, 300 sec: 45486.7). Total num frames: 3763011584. Throughput: 0: 45213.0. Samples: 281571880. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 11:06:30,502][73265] Avg episode reward: [(0, '0.406')] [2024-06-13 11:06:31,009][73497] Updated weights for policy 0, policy_version 229679 (0.0035) [2024-06-13 11:06:35,220][73497] Updated weights for policy 0, policy_version 229689 (0.0043) [2024-06-13 11:06:35,501][73265] Fps is (10 sec: 45876.3, 60 sec: 45875.3, 300 sec: 45430.9). Total num frames: 3763240960. Throughput: 0: 45408.1. Samples: 281708440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-13 11:06:35,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 11:06:38,733][73497] Updated weights for policy 0, policy_version 229699 (0.0040) [2024-06-13 11:06:40,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 3763470336. Throughput: 0: 45210.7. Samples: 281971940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-06-13 11:06:40,502][73265] Avg episode reward: [(0, '0.369')] [2024-06-13 11:06:42,425][73477] Signal inference workers to stop experience collection... (4150 times) [2024-06-13 11:06:42,425][73477] Signal inference workers to resume experience collection... (4150 times) [2024-06-13 11:06:42,442][73497] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-06-13 11:06:42,442][73497] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-06-13 11:06:42,559][73497] Updated weights for policy 0, policy_version 229709 (0.0040) [2024-06-13 11:06:45,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45602.2, 300 sec: 45375.4). Total num frames: 3763683328. Throughput: 0: 45315.6. Samples: 282249580. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-06-13 11:06:45,502][73265] Avg episode reward: [(0, '0.400')] [2024-06-13 11:06:45,807][73497] Updated weights for policy 0, policy_version 229719 (0.0034) [2024-06-13 11:06:49,779][73497] Updated weights for policy 0, policy_version 229729 (0.0031) [2024-06-13 11:06:50,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3763912704. Throughput: 0: 45451.7. Samples: 282386260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-06-13 11:06:50,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 11:06:52,949][73497] Updated weights for policy 0, policy_version 229739 (0.0041) [2024-06-13 11:06:55,501][73265] Fps is (10 sec: 44236.5, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 3764125696. Throughput: 0: 45458.7. Samples: 282661680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-06-13 11:06:55,502][73265] Avg episode reward: [(0, '0.488')] [2024-06-13 11:06:56,823][73497] Updated weights for policy 0, policy_version 229749 (0.0035) [2024-06-13 11:06:59,986][73497] Updated weights for policy 0, policy_version 229759 (0.0039) [2024-06-13 11:07:00,507][73265] Fps is (10 sec: 45850.0, 60 sec: 45598.0, 300 sec: 45485.6). Total num frames: 3764371456. Throughput: 0: 45376.2. Samples: 282934440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-06-13 11:07:00,507][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 11:07:03,943][73497] Updated weights for policy 0, policy_version 229769 (0.0034) [2024-06-13 11:07:05,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45602.3, 300 sec: 45430.9). Total num frames: 3764584448. Throughput: 0: 45401.0. Samples: 283070040. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-06-13 11:07:05,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 11:07:07,765][73497] Updated weights for policy 0, policy_version 229779 (0.0043) [2024-06-13 11:07:10,504][73265] Fps is (10 sec: 42611.5, 60 sec: 44781.2, 300 sec: 45263.9). Total num frames: 3764797440. Throughput: 0: 45355.0. Samples: 283330640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-06-13 11:07:10,504][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 11:07:11,339][73497] Updated weights for policy 0, policy_version 229789 (0.0032) [2024-06-13 11:07:15,008][73497] Updated weights for policy 0, policy_version 229799 (0.0042) [2024-06-13 11:07:15,502][73265] Fps is (10 sec: 49151.3, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3765075968. Throughput: 0: 45436.3. Samples: 283616520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-06-13 11:07:15,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 11:07:18,521][73497] Updated weights for policy 0, policy_version 229809 (0.0029) [2024-06-13 11:07:20,501][73265] Fps is (10 sec: 47525.1, 60 sec: 45602.2, 300 sec: 45375.3). Total num frames: 3765272576. Throughput: 0: 45378.6. Samples: 283750480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-06-13 11:07:20,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 11:07:22,007][73497] Updated weights for policy 0, policy_version 229819 (0.0032) [2024-06-13 11:07:25,502][73265] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3765501952. Throughput: 0: 45601.6. Samples: 284024020. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-06-13 11:07:25,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 11:07:25,662][73497] Updated weights for policy 0, policy_version 229829 (0.0031) [2024-06-13 11:07:29,168][73497] Updated weights for policy 0, policy_version 229839 (0.0040) [2024-06-13 11:07:30,502][73265] Fps is (10 sec: 47513.0, 60 sec: 45602.0, 300 sec: 45541.9). Total num frames: 3765747712. Throughput: 0: 45589.1. Samples: 284301100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-06-13 11:07:30,502][73265] Avg episode reward: [(0, '0.395')] [2024-06-13 11:07:32,667][73497] Updated weights for policy 0, policy_version 229849 (0.0035) [2024-06-13 11:07:35,501][73265] Fps is (10 sec: 44237.3, 60 sec: 45055.9, 300 sec: 45375.3). Total num frames: 3765944320. Throughput: 0: 45485.8. Samples: 284433120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-06-13 11:07:35,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 11:07:36,596][73497] Updated weights for policy 0, policy_version 229859 (0.0047) [2024-06-13 11:07:40,059][73497] Updated weights for policy 0, policy_version 229869 (0.0026) [2024-06-13 11:07:40,501][73265] Fps is (10 sec: 44237.7, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3766190080. Throughput: 0: 45470.3. Samples: 284707840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-06-13 11:07:40,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 11:07:43,471][73497] Updated weights for policy 0, policy_version 229879 (0.0029) [2024-06-13 11:07:45,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3766419456. Throughput: 0: 45638.0. Samples: 284987900. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-06-13 11:07:45,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 11:07:47,188][73497] Updated weights for policy 0, policy_version 229889 (0.0037) [2024-06-13 11:07:50,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3766632448. Throughput: 0: 45410.2. Samples: 285113500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-13 11:07:50,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 11:07:50,766][73497] Updated weights for policy 0, policy_version 229899 (0.0042) [2024-06-13 11:07:54,237][73497] Updated weights for policy 0, policy_version 229909 (0.0030) [2024-06-13 11:07:55,501][73265] Fps is (10 sec: 47514.0, 60 sec: 46148.3, 300 sec: 45542.0). Total num frames: 3766894592. Throughput: 0: 46000.3. Samples: 285400540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-13 11:07:55,502][73265] Avg episode reward: [(0, '0.498')] [2024-06-13 11:07:57,699][73497] Updated weights for policy 0, policy_version 229919 (0.0037) [2024-06-13 11:08:00,502][73265] Fps is (10 sec: 45874.5, 60 sec: 45333.1, 300 sec: 45431.1). Total num frames: 3767091200. Throughput: 0: 45563.5. Samples: 285666880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-13 11:08:00,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 11:08:00,759][73477] Signal inference workers to stop experience collection... (4200 times) [2024-06-13 11:08:00,812][73497] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-06-13 11:08:00,871][73477] Signal inference workers to resume experience collection... 
(4200 times) [2024-06-13 11:08:00,871][73497] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-06-13 11:08:01,171][73497] Updated weights for policy 0, policy_version 229929 (0.0033) [2024-06-13 11:08:05,195][73497] Updated weights for policy 0, policy_version 229939 (0.0029) [2024-06-13 11:08:05,501][73265] Fps is (10 sec: 42598.3, 60 sec: 45602.1, 300 sec: 45431.3). Total num frames: 3767320576. Throughput: 0: 45504.1. Samples: 285798160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-13 11:08:05,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 11:08:08,478][73497] Updated weights for policy 0, policy_version 229949 (0.0036) [2024-06-13 11:08:10,502][73265] Fps is (10 sec: 49151.6, 60 sec: 46423.0, 300 sec: 45541.9). Total num frames: 3767582720. Throughput: 0: 45766.6. Samples: 286083520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-13 11:08:10,502][73265] Avg episode reward: [(0, '0.500')] [2024-06-13 11:08:12,117][73497] Updated weights for policy 0, policy_version 229959 (0.0035) [2024-06-13 11:08:15,501][73265] Fps is (10 sec: 45874.8, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 3767779328. Throughput: 0: 45732.5. Samples: 286359060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-13 11:08:15,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 11:08:15,665][73497] Updated weights for policy 0, policy_version 229969 (0.0041) [2024-06-13 11:08:19,348][73497] Updated weights for policy 0, policy_version 229979 (0.0032) [2024-06-13 11:08:20,501][73265] Fps is (10 sec: 44237.4, 60 sec: 45875.1, 300 sec: 45542.0). Total num frames: 3768025088. Throughput: 0: 45831.9. Samples: 286495560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-13 11:08:20,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 11:08:22,697][73497] Updated weights for policy 0, policy_version 229989 (0.0043) [2024-06-13 11:08:25,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45875.3, 300 sec: 45542.0). 
Total num frames: 3768254464. Throughput: 0: 45876.4. Samples: 286772280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-13 11:08:25,502][73265] Avg episode reward: [(0, '0.478')] [2024-06-13 11:08:25,629][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000229997_3768270848.pth... [2024-06-13 11:08:25,675][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000229328_3757309952.pth [2024-06-13 11:08:26,361][73497] Updated weights for policy 0, policy_version 229999 (0.0034) [2024-06-13 11:08:29,626][73497] Updated weights for policy 0, policy_version 230009 (0.0041) [2024-06-13 11:08:30,502][73265] Fps is (10 sec: 45874.6, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3768483840. Throughput: 0: 45650.0. Samples: 287042160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-13 11:08:30,502][73265] Avg episode reward: [(0, '0.391')] [2024-06-13 11:08:33,816][73497] Updated weights for policy 0, policy_version 230019 (0.0044) [2024-06-13 11:08:35,501][73265] Fps is (10 sec: 45874.9, 60 sec: 46148.2, 300 sec: 45487.1). Total num frames: 3768713216. Throughput: 0: 45916.8. Samples: 287179760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-13 11:08:35,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 11:08:37,310][73497] Updated weights for policy 0, policy_version 230029 (0.0029) [2024-06-13 11:08:40,501][73265] Fps is (10 sec: 44237.5, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 3768926208. Throughput: 0: 45590.1. Samples: 287452100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-13 11:08:40,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 11:08:41,167][73497] Updated weights for policy 0, policy_version 230039 (0.0035) [2024-06-13 11:08:44,274][73497] Updated weights for policy 0, policy_version 230049 (0.0027) [2024-06-13 11:08:45,502][73265] Fps is (10 sec: 45874.8, 60 sec: 45875.1, 300 sec: 45597.5). Total num frames: 3769171968. 
Throughput: 0: 45747.6. Samples: 287725520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-13 11:08:45,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 11:08:48,492][73497] Updated weights for policy 0, policy_version 230059 (0.0033) [2024-06-13 11:08:50,504][73265] Fps is (10 sec: 45864.1, 60 sec: 45873.3, 300 sec: 45541.6). Total num frames: 3769384960. Throughput: 0: 46047.2. Samples: 287870400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-13 11:08:50,504][73265] Avg episode reward: [(0, '0.405')] [2024-06-13 11:08:51,326][73497] Updated weights for policy 0, policy_version 230069 (0.0033) [2024-06-13 11:08:55,504][73265] Fps is (10 sec: 42588.3, 60 sec: 45054.1, 300 sec: 45430.5). Total num frames: 3769597952. Throughput: 0: 45603.5. Samples: 288135780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:08:55,505][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 11:08:55,556][73497] Updated weights for policy 0, policy_version 230079 (0.0037) [2024-06-13 11:08:58,703][73497] Updated weights for policy 0, policy_version 230089 (0.0031) [2024-06-13 11:09:00,501][73265] Fps is (10 sec: 45886.7, 60 sec: 45875.3, 300 sec: 45486.4). Total num frames: 3769843712. Throughput: 0: 45425.9. Samples: 288403220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:09:00,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 11:09:03,205][73497] Updated weights for policy 0, policy_version 230099 (0.0041) [2024-06-13 11:09:05,501][73265] Fps is (10 sec: 47525.4, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3770073088. Throughput: 0: 45668.5. Samples: 288550640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:09:05,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 11:09:05,902][73497] Updated weights for policy 0, policy_version 230109 (0.0027) [2024-06-13 11:09:10,335][73497] Updated weights for policy 0, policy_version 230119 (0.0026) [2024-06-13 11:09:10,502][73265] Fps is (10 sec: 44236.1, 60 sec: 45056.1, 300 sec: 45430.9). Total num frames: 3770286080. Throughput: 0: 45333.7. Samples: 288812300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:09:10,502][73265] Avg episode reward: [(0, '0.481')] [2024-06-13 11:09:10,774][73477] Signal inference workers to stop experience collection... (4250 times) [2024-06-13 11:09:10,774][73477] Signal inference workers to resume experience collection... (4250 times) [2024-06-13 11:09:10,803][73497] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-06-13 11:09:10,804][73497] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-06-13 11:09:12,939][73497] Updated weights for policy 0, policy_version 230129 (0.0034) [2024-06-13 11:09:15,501][73265] Fps is (10 sec: 47513.7, 60 sec: 46148.3, 300 sec: 45597.5). Total num frames: 3770548224. Throughput: 0: 45542.5. Samples: 289091560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:09:15,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 11:09:17,146][73497] Updated weights for policy 0, policy_version 230139 (0.0033) [2024-06-13 11:09:20,179][73497] Updated weights for policy 0, policy_version 230149 (0.0037) [2024-06-13 11:09:20,501][73265] Fps is (10 sec: 49152.4, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 3770777600. Throughput: 0: 45634.2. Samples: 289233300. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:09:20,502][73265] Avg episode reward: [(0, '0.411')] [2024-06-13 11:09:24,852][73497] Updated weights for policy 0, policy_version 230159 (0.0045) [2024-06-13 11:09:25,502][73265] Fps is (10 sec: 42597.7, 60 sec: 45329.0, 300 sec: 45486.4). Total num frames: 3770974208. Throughput: 0: 45575.0. Samples: 289502980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:09:25,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 11:09:27,392][73497] Updated weights for policy 0, policy_version 230169 (0.0028) [2024-06-13 11:09:30,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45329.2, 300 sec: 45430.9). Total num frames: 3771203584. Throughput: 0: 45361.9. Samples: 289766800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:09:30,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 11:09:31,851][73497] Updated weights for policy 0, policy_version 230179 (0.0029) [2024-06-13 11:09:34,382][73497] Updated weights for policy 0, policy_version 230189 (0.0039) [2024-06-13 11:09:35,501][73265] Fps is (10 sec: 49152.5, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 3771465728. Throughput: 0: 45335.3. Samples: 289910380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:09:35,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 11:09:39,181][73497] Updated weights for policy 0, policy_version 230199 (0.0045) [2024-06-13 11:09:40,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 3771678720. Throughput: 0: 45602.0. Samples: 290187760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:09:40,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 11:09:41,675][73497] Updated weights for policy 0, policy_version 230209 (0.0041) [2024-06-13 11:09:45,501][73265] Fps is (10 sec: 40960.0, 60 sec: 45056.1, 300 sec: 45375.3). Total num frames: 3771875328. Throughput: 0: 45485.7. Samples: 290450080. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:09:45,502][73265] Avg episode reward: [(0, '0.499')] [2024-06-13 11:09:46,422][73497] Updated weights for policy 0, policy_version 230219 (0.0038) [2024-06-13 11:09:48,809][73497] Updated weights for policy 0, policy_version 230229 (0.0029) [2024-06-13 11:09:50,502][73265] Fps is (10 sec: 44236.7, 60 sec: 45603.9, 300 sec: 45597.5). Total num frames: 3772121088. Throughput: 0: 45319.5. Samples: 290590020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:09:50,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 11:09:53,490][73497] Updated weights for policy 0, policy_version 230239 (0.0042) [2024-06-13 11:09:55,501][73265] Fps is (10 sec: 49152.1, 60 sec: 46150.2, 300 sec: 45597.5). Total num frames: 3772366848. Throughput: 0: 45714.8. Samples: 290869460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:09:55,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 11:09:56,108][73497] Updated weights for policy 0, policy_version 230249 (0.0024) [2024-06-13 11:10:00,456][73497] Updated weights for policy 0, policy_version 230259 (0.0035) [2024-06-13 11:10:00,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3772563456. Throughput: 0: 45601.7. Samples: 291143640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:10:00,502][73265] Avg episode reward: [(0, '0.490')] [2024-06-13 11:10:03,000][73497] Updated weights for policy 0, policy_version 230269 (0.0038) [2024-06-13 11:10:05,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3772809216. Throughput: 0: 45351.1. Samples: 291274100. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 11:10:05,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 11:10:07,655][73497] Updated weights for policy 0, policy_version 230279 (0.0031) [2024-06-13 11:10:10,025][73497] Updated weights for policy 0, policy_version 230289 (0.0036) [2024-06-13 11:10:10,501][73265] Fps is (10 sec: 49152.0, 60 sec: 46148.3, 300 sec: 45653.0). Total num frames: 3773054976. Throughput: 0: 45731.2. Samples: 291560880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 11:10:10,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 11:10:14,858][73497] Updated weights for policy 0, policy_version 230299 (0.0034) [2024-06-13 11:10:15,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45056.0, 300 sec: 45486.4). Total num frames: 3773251584. Throughput: 0: 45897.7. Samples: 291832200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 11:10:15,502][73265] Avg episode reward: [(0, '0.479')] [2024-06-13 11:10:15,969][73477] Signal inference workers to stop experience collection... (4300 times) [2024-06-13 11:10:15,971][73477] Signal inference workers to resume experience collection... (4300 times) [2024-06-13 11:10:16,010][73497] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-06-13 11:10:16,010][73497] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-06-13 11:10:17,403][73497] Updated weights for policy 0, policy_version 230309 (0.0034) [2024-06-13 11:10:20,501][73265] Fps is (10 sec: 42598.8, 60 sec: 45056.1, 300 sec: 45430.9). Total num frames: 3773480960. Throughput: 0: 45504.1. Samples: 291958060. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 11:10:20,502][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 11:10:21,974][73497] Updated weights for policy 0, policy_version 230319 (0.0032) [2024-06-13 11:10:24,527][73497] Updated weights for policy 0, policy_version 230329 (0.0029) [2024-06-13 11:10:25,501][73265] Fps is (10 sec: 50790.4, 60 sec: 46421.4, 300 sec: 45764.1). Total num frames: 3773759488. Throughput: 0: 45624.9. Samples: 292240880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 11:10:25,502][73265] Avg episode reward: [(0, '0.399')] [2024-06-13 11:10:25,518][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000230332_3773759488.pth... [2024-06-13 11:10:25,573][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000229662_3762782208.pth [2024-06-13 11:10:29,007][73497] Updated weights for policy 0, policy_version 230339 (0.0040) [2024-06-13 11:10:30,501][73265] Fps is (10 sec: 47513.7, 60 sec: 45875.2, 300 sec: 45653.1). Total num frames: 3773956096. Throughput: 0: 46039.2. Samples: 292521840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 11:10:30,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 11:10:31,692][73497] Updated weights for policy 0, policy_version 230349 (0.0040) [2024-06-13 11:10:35,501][73265] Fps is (10 sec: 40960.4, 60 sec: 45056.1, 300 sec: 45430.9). Total num frames: 3774169088. Throughput: 0: 45728.6. Samples: 292647800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 11:10:35,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 11:10:36,075][73497] Updated weights for policy 0, policy_version 230359 (0.0037) [2024-06-13 11:10:38,769][73497] Updated weights for policy 0, policy_version 230369 (0.0033) [2024-06-13 11:10:40,501][73265] Fps is (10 sec: 47512.9, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 3774431232. Throughput: 0: 45626.2. Samples: 292922640. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 11:10:40,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 11:10:43,156][73497] Updated weights for policy 0, policy_version 230379 (0.0032) [2024-06-13 11:10:45,501][73265] Fps is (10 sec: 49151.8, 60 sec: 46421.4, 300 sec: 45653.0). Total num frames: 3774660608. Throughput: 0: 45820.5. Samples: 293205560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 11:10:45,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 11:10:45,706][73497] Updated weights for policy 0, policy_version 230389 (0.0033) [2024-06-13 11:10:50,433][73497] Updated weights for policy 0, policy_version 230399 (0.0027) [2024-06-13 11:10:50,502][73265] Fps is (10 sec: 42598.3, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3774857216. Throughput: 0: 46060.8. Samples: 293346840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 11:10:50,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 11:10:53,143][73497] Updated weights for policy 0, policy_version 230409 (0.0043) [2024-06-13 11:10:55,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45602.2, 300 sec: 45653.1). Total num frames: 3775102976. Throughput: 0: 45589.4. Samples: 293612400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 11:10:55,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 11:10:57,316][73497] Updated weights for policy 0, policy_version 230419 (0.0035) [2024-06-13 11:11:00,168][73497] Updated weights for policy 0, policy_version 230429 (0.0027) [2024-06-13 11:11:00,501][73265] Fps is (10 sec: 50790.7, 60 sec: 46694.4, 300 sec: 45819.7). Total num frames: 3775365120. Throughput: 0: 45759.1. Samples: 293891360. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 11:11:00,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 11:11:04,847][73497] Updated weights for policy 0, policy_version 230439 (0.0027) [2024-06-13 11:11:05,501][73265] Fps is (10 sec: 44236.4, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3775545344. Throughput: 0: 46315.5. Samples: 294042260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 11:11:05,502][73265] Avg episode reward: [(0, '0.492')] [2024-06-13 11:11:07,091][73497] Updated weights for policy 0, policy_version 230449 (0.0041) [2024-06-13 11:11:10,501][73265] Fps is (10 sec: 39322.0, 60 sec: 45056.1, 300 sec: 45486.4). Total num frames: 3775758336. Throughput: 0: 45974.3. Samples: 294309720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-13 11:11:10,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 11:11:11,741][73497] Updated weights for policy 0, policy_version 230459 (0.0031) [2024-06-13 11:11:14,335][73497] Updated weights for policy 0, policy_version 230469 (0.0034) [2024-06-13 11:11:15,501][73265] Fps is (10 sec: 50790.6, 60 sec: 46694.4, 300 sec: 45819.7). Total num frames: 3776053248. Throughput: 0: 45751.5. Samples: 294580660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-13 11:11:15,502][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 11:11:18,705][73497] Updated weights for policy 0, policy_version 230479 (0.0027) [2024-06-13 11:11:20,501][73265] Fps is (10 sec: 50789.8, 60 sec: 46421.2, 300 sec: 45708.6). Total num frames: 3776266240. Throughput: 0: 46362.5. Samples: 294734120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-13 11:11:20,502][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 11:11:21,105][73477] Signal inference workers to stop experience collection... 
(4350 times) [2024-06-13 11:11:21,152][73497] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-06-13 11:11:21,152][73477] Signal inference workers to resume experience collection... (4350 times) [2024-06-13 11:11:21,171][73497] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-06-13 11:11:21,610][73497] Updated weights for policy 0, policy_version 230489 (0.0031) [2024-06-13 11:11:25,504][73265] Fps is (10 sec: 42587.9, 60 sec: 45327.2, 300 sec: 45652.6). Total num frames: 3776479232. Throughput: 0: 46400.6. Samples: 295010780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-13 11:11:25,504][73265] Avg episode reward: [(0, '0.406')] [2024-06-13 11:11:25,634][73497] Updated weights for policy 0, policy_version 230499 (0.0034) [2024-06-13 11:11:28,629][73497] Updated weights for policy 0, policy_version 230509 (0.0025) [2024-06-13 11:11:30,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 3776708608. Throughput: 0: 46137.3. Samples: 295281740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-13 11:11:30,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 11:11:32,983][73497] Updated weights for policy 0, policy_version 230519 (0.0030) [2024-06-13 11:11:35,501][73265] Fps is (10 sec: 47525.2, 60 sec: 46421.3, 300 sec: 45708.6). Total num frames: 3776954368. Throughput: 0: 46181.8. Samples: 295425020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-13 11:11:35,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 11:11:35,974][73497] Updated weights for policy 0, policy_version 230529 (0.0032) [2024-06-13 11:11:40,115][73497] Updated weights for policy 0, policy_version 230539 (0.0026) [2024-06-13 11:11:40,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 3777150976. Throughput: 0: 46347.1. Samples: 295698020. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-13 11:11:40,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 11:11:42,902][73497] Updated weights for policy 0, policy_version 230549 (0.0024) [2024-06-13 11:11:45,501][73265] Fps is (10 sec: 42598.7, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 3777380352. Throughput: 0: 46189.4. Samples: 295969880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-13 11:11:45,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 11:11:47,120][73497] Updated weights for policy 0, policy_version 230559 (0.0032) [2024-06-13 11:11:50,159][73497] Updated weights for policy 0, policy_version 230569 (0.0033) [2024-06-13 11:11:50,501][73265] Fps is (10 sec: 49152.1, 60 sec: 46421.4, 300 sec: 45819.7). Total num frames: 3777642496. Throughput: 0: 45883.2. Samples: 296107000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-13 11:11:50,502][73265] Avg episode reward: [(0, '0.416')] [2024-06-13 11:11:54,422][73497] Updated weights for policy 0, policy_version 230579 (0.0030) [2024-06-13 11:11:55,501][73265] Fps is (10 sec: 49152.1, 60 sec: 46148.3, 300 sec: 45765.0). Total num frames: 3777871872. Throughput: 0: 45977.8. Samples: 296378720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-13 11:11:55,502][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 11:11:57,448][73497] Updated weights for policy 0, policy_version 230589 (0.0047) [2024-06-13 11:12:00,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 3778084864. Throughput: 0: 46056.4. Samples: 296653200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-13 11:12:00,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 11:12:01,633][73497] Updated weights for policy 0, policy_version 230599 (0.0037) [2024-06-13 11:12:04,784][73497] Updated weights for policy 0, policy_version 230609 (0.0030) [2024-06-13 11:12:05,501][73265] Fps is (10 sec: 45875.0, 60 sec: 46421.4, 300 sec: 45875.6). 
Total num frames: 3778330624. Throughput: 0: 45765.4. Samples: 296793560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-13 11:12:05,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 11:12:08,875][73497] Updated weights for policy 0, policy_version 230619 (0.0030) [2024-06-13 11:12:10,501][73265] Fps is (10 sec: 45875.2, 60 sec: 46421.3, 300 sec: 45653.1). Total num frames: 3778543616. Throughput: 0: 45524.3. Samples: 297059260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-13 11:12:10,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 11:12:11,869][73497] Updated weights for policy 0, policy_version 230629 (0.0037) [2024-06-13 11:12:15,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45056.0, 300 sec: 45708.6). Total num frames: 3778756608. Throughput: 0: 45588.0. Samples: 297333200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-13 11:12:15,511][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 11:12:15,808][73497] Updated weights for policy 0, policy_version 230639 (0.0024) [2024-06-13 11:12:18,934][73497] Updated weights for policy 0, policy_version 230649 (0.0033) [2024-06-13 11:12:20,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45056.0, 300 sec: 45653.1). Total num frames: 3778969600. Throughput: 0: 45343.2. Samples: 297465460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-13 11:12:20,502][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 11:12:23,283][73497] Updated weights for policy 0, policy_version 230659 (0.0030) [2024-06-13 11:12:25,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45877.1, 300 sec: 45708.6). Total num frames: 3779231744. Throughput: 0: 45393.4. Samples: 297740720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-13 11:12:25,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 11:12:25,597][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000230667_3779248128.pth... 
[2024-06-13 11:12:25,637][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000229997_3768270848.pth [2024-06-13 11:12:26,205][73497] Updated weights for policy 0, policy_version 230669 (0.0028) [2024-06-13 11:12:30,323][73497] Updated weights for policy 0, policy_version 230679 (0.0035) [2024-06-13 11:12:30,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45602.1, 300 sec: 45764.1). Total num frames: 3779444736. Throughput: 0: 45479.1. Samples: 298016440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-13 11:12:30,510][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 11:12:33,923][73497] Updated weights for policy 0, policy_version 230689 (0.0033) [2024-06-13 11:12:35,502][73265] Fps is (10 sec: 44235.6, 60 sec: 45328.9, 300 sec: 45708.5). Total num frames: 3779674112. Throughput: 0: 45505.6. Samples: 298154760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-13 11:12:35,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 11:12:37,597][73497] Updated weights for policy 0, policy_version 230699 (0.0028) [2024-06-13 11:12:40,501][73265] Fps is (10 sec: 47513.3, 60 sec: 46148.2, 300 sec: 45764.1). Total num frames: 3779919872. Throughput: 0: 45399.4. Samples: 298421700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-13 11:12:40,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 11:12:40,860][73497] Updated weights for policy 0, policy_version 230709 (0.0035) [2024-06-13 11:12:44,447][73497] Updated weights for policy 0, policy_version 230719 (0.0027) [2024-06-13 11:12:45,501][73265] Fps is (10 sec: 44237.9, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 3780116480. Throughput: 0: 45475.6. Samples: 298699600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-13 11:12:45,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 11:12:45,964][73477] Signal inference workers to stop experience collection... 
(4400 times) [2024-06-13 11:12:45,964][73477] Signal inference workers to resume experience collection... (4400 times) [2024-06-13 11:12:45,986][73497] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-06-13 11:12:45,986][73497] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-06-13 11:12:48,196][73497] Updated weights for policy 0, policy_version 230729 (0.0035) [2024-06-13 11:12:50,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 3780378624. Throughput: 0: 45456.4. Samples: 298839100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-13 11:12:50,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 11:12:51,859][73497] Updated weights for policy 0, policy_version 230739 (0.0031) [2024-06-13 11:12:55,466][73497] Updated weights for policy 0, policy_version 230749 (0.0033) [2024-06-13 11:12:55,502][73265] Fps is (10 sec: 47512.9, 60 sec: 45328.9, 300 sec: 45764.1). Total num frames: 3780591616. Throughput: 0: 45514.6. Samples: 299107420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-13 11:12:55,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 11:12:58,754][73497] Updated weights for policy 0, policy_version 230759 (0.0027) [2024-06-13 11:13:00,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45875.2, 300 sec: 45819.7). Total num frames: 3780837376. Throughput: 0: 45536.0. Samples: 299382320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-13 11:13:00,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 11:13:02,658][73497] Updated weights for policy 0, policy_version 230769 (0.0023) [2024-06-13 11:13:05,502][73265] Fps is (10 sec: 45875.2, 60 sec: 45329.0, 300 sec: 45653.1). Total num frames: 3781050368. Throughput: 0: 45680.3. Samples: 299521080. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-13 11:13:05,502][73265] Avg episode reward: [(0, '0.482')] [2024-06-13 11:13:06,306][73497] Updated weights for policy 0, policy_version 230779 (0.0038) [2024-06-13 11:13:09,934][73497] Updated weights for policy 0, policy_version 230789 (0.0028) [2024-06-13 11:13:10,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45602.2, 300 sec: 45764.1). Total num frames: 3781279744. Throughput: 0: 45676.9. Samples: 299796180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-13 11:13:10,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 11:13:13,139][73497] Updated weights for policy 0, policy_version 230799 (0.0025) [2024-06-13 11:13:15,501][73265] Fps is (10 sec: 47514.0, 60 sec: 46148.3, 300 sec: 45764.1). Total num frames: 3781525504. Throughput: 0: 45635.6. Samples: 300070040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-13 11:13:15,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 11:13:16,909][73497] Updated weights for policy 0, policy_version 230809 (0.0045) [2024-06-13 11:13:20,074][73497] Updated weights for policy 0, policy_version 230819 (0.0032) [2024-06-13 11:13:20,501][73265] Fps is (10 sec: 47513.3, 60 sec: 46421.3, 300 sec: 45764.1). Total num frames: 3781754880. Throughput: 0: 45864.2. Samples: 300218640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-13 11:13:20,504][73265] Avg episode reward: [(0, '0.478')] [2024-06-13 11:13:23,716][73497] Updated weights for policy 0, policy_version 230829 (0.0038) [2024-06-13 11:13:25,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 3781967872. Throughput: 0: 46078.7. Samples: 300495240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-13 11:13:25,502][73265] Avg episode reward: [(0, '0.496')] [2024-06-13 11:13:27,082][73497] Updated weights for policy 0, policy_version 230839 (0.0034) [2024-06-13 11:13:30,502][73265] Fps is (10 sec: 44235.7, 60 sec: 45875.0, 300 sec: 45708.5). Total num frames: 3782197248. Throughput: 0: 45892.6. Samples: 300764780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-13 11:13:30,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 11:13:31,494][73497] Updated weights for policy 0, policy_version 230849 (0.0028) [2024-06-13 11:13:34,779][73497] Updated weights for policy 0, policy_version 230859 (0.0036) [2024-06-13 11:13:35,501][73265] Fps is (10 sec: 47513.7, 60 sec: 46148.4, 300 sec: 45819.7). Total num frames: 3782443008. Throughput: 0: 45749.8. Samples: 300897840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 11:13:35,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 11:13:38,358][73497] Updated weights for policy 0, policy_version 230869 (0.0026) [2024-06-13 11:13:40,501][73265] Fps is (10 sec: 45876.4, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 3782656000. Throughput: 0: 46088.5. Samples: 301181400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 11:13:40,502][73265] Avg episode reward: [(0, '0.496')] [2024-06-13 11:13:41,606][73497] Updated weights for policy 0, policy_version 230879 (0.0033) [2024-06-13 11:13:45,331][73497] Updated weights for policy 0, policy_version 230889 (0.0029) [2024-06-13 11:13:45,502][73265] Fps is (10 sec: 44236.4, 60 sec: 46148.2, 300 sec: 45764.5). Total num frames: 3782885376. Throughput: 0: 46125.7. Samples: 301457980. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 11:13:45,502][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 11:13:48,432][73497] Updated weights for policy 0, policy_version 230899 (0.0024) [2024-06-13 11:13:50,501][73265] Fps is (10 sec: 49152.1, 60 sec: 46148.3, 300 sec: 45931.1). Total num frames: 3783147520. Throughput: 0: 46177.9. Samples: 301599080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 11:13:50,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 11:13:52,171][73497] Updated weights for policy 0, policy_version 230909 (0.0031) [2024-06-13 11:13:55,502][73265] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 3783344128. Throughput: 0: 46160.3. Samples: 301873400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 11:13:55,502][73265] Avg episode reward: [(0, '0.511')] [2024-06-13 11:13:56,007][73497] Updated weights for policy 0, policy_version 230919 (0.0040) [2024-06-13 11:14:00,075][73497] Updated weights for policy 0, policy_version 230929 (0.0037) [2024-06-13 11:14:00,501][73265] Fps is (10 sec: 40960.0, 60 sec: 45329.1, 300 sec: 45708.6). Total num frames: 3783557120. Throughput: 0: 46070.7. Samples: 302143220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 11:14:00,502][73265] Avg episode reward: [(0, '0.478')] [2024-06-13 11:14:03,371][73497] Updated weights for policy 0, policy_version 230939 (0.0038) [2024-06-13 11:14:05,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 45819.7). Total num frames: 3783802880. Throughput: 0: 45729.7. Samples: 302276480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 11:14:05,502][73265] Avg episode reward: [(0, '0.476')] [2024-06-13 11:14:07,018][73497] Updated weights for policy 0, policy_version 230949 (0.0042) [2024-06-13 11:14:10,163][73497] Updated weights for policy 0, policy_version 230959 (0.0026) [2024-06-13 11:14:10,501][73265] Fps is (10 sec: 49152.1, 60 sec: 46148.3, 300 sec: 45764.1). 
Total num frames: 3784048640. Throughput: 0: 45636.1. Samples: 302548860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 11:14:10,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 11:14:13,978][73497] Updated weights for policy 0, policy_version 230969 (0.0036) [2024-06-13 11:14:15,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 3784245248. Throughput: 0: 45790.5. Samples: 302825340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 11:14:15,502][73265] Avg episode reward: [(0, '0.400')] [2024-06-13 11:14:16,299][73477] Signal inference workers to stop experience collection... (4450 times) [2024-06-13 11:14:16,323][73497] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-06-13 11:14:16,358][73477] Signal inference workers to resume experience collection... (4450 times) [2024-06-13 11:14:16,358][73497] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-06-13 11:14:17,411][73497] Updated weights for policy 0, policy_version 230979 (0.0032) [2024-06-13 11:14:20,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45602.1, 300 sec: 45819.7). Total num frames: 3784491008. Throughput: 0: 45751.1. Samples: 302956640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 11:14:20,502][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 11:14:21,561][73497] Updated weights for policy 0, policy_version 230989 (0.0028) [2024-06-13 11:14:25,058][73497] Updated weights for policy 0, policy_version 230999 (0.0035) [2024-06-13 11:14:25,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45602.2, 300 sec: 45764.1). Total num frames: 3784704000. Throughput: 0: 45392.9. Samples: 303224080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 11:14:25,502][73265] Avg episode reward: [(0, '0.515')] [2024-06-13 11:14:25,615][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000231001_3784720384.pth... 
[2024-06-13 11:14:25,669][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000230332_3773759488.pth [2024-06-13 11:14:28,834][73497] Updated weights for policy 0, policy_version 231009 (0.0032) [2024-06-13 11:14:30,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45602.3, 300 sec: 45653.0). Total num frames: 3784933376. Throughput: 0: 45321.4. Samples: 303497440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 11:14:30,502][73265] Avg episode reward: [(0, '0.474')] [2024-06-13 11:14:32,181][73497] Updated weights for policy 0, policy_version 231019 (0.0032) [2024-06-13 11:14:35,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 45708.6). Total num frames: 3785162752. Throughput: 0: 45020.0. Samples: 303624980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-13 11:14:35,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 11:14:35,617][73497] Updated weights for policy 0, policy_version 231029 (0.0036) [2024-06-13 11:14:39,077][73497] Updated weights for policy 0, policy_version 231039 (0.0031) [2024-06-13 11:14:40,504][73265] Fps is (10 sec: 45864.3, 60 sec: 45600.3, 300 sec: 45819.3). Total num frames: 3785392128. Throughput: 0: 45098.1. Samples: 303902920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 11:14:40,504][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 11:14:43,453][73497] Updated weights for policy 0, policy_version 231049 (0.0030) [2024-06-13 11:14:45,501][73265] Fps is (10 sec: 44236.4, 60 sec: 45329.1, 300 sec: 45708.6). Total num frames: 3785605120. Throughput: 0: 45127.1. Samples: 304173940. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 11:14:45,502][73265] Avg episode reward: [(0, '0.427')] [2024-06-13 11:14:46,377][73497] Updated weights for policy 0, policy_version 231059 (0.0030) [2024-06-13 11:14:50,346][73497] Updated weights for policy 0, policy_version 231069 (0.0029) [2024-06-13 11:14:50,501][73265] Fps is (10 sec: 44247.4, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 3785834496. Throughput: 0: 45330.7. Samples: 304316360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 11:14:50,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 11:14:53,682][73497] Updated weights for policy 0, policy_version 231079 (0.0042) [2024-06-13 11:14:55,502][73265] Fps is (10 sec: 47513.3, 60 sec: 45602.1, 300 sec: 45819.6). Total num frames: 3786080256. Throughput: 0: 45226.5. Samples: 304584060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 11:14:55,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 11:14:57,555][73497] Updated weights for policy 0, policy_version 231089 (0.0029) [2024-06-13 11:15:00,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 3786293248. Throughput: 0: 45076.1. Samples: 304853760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 11:15:00,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 11:15:01,085][73497] Updated weights for policy 0, policy_version 231099 (0.0039) [2024-06-13 11:15:04,756][73497] Updated weights for policy 0, policy_version 231109 (0.0029) [2024-06-13 11:15:05,501][73265] Fps is (10 sec: 42598.8, 60 sec: 45056.0, 300 sec: 45597.5). Total num frames: 3786506240. Throughput: 0: 45189.4. Samples: 304990160. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 11:15:05,502][73265] Avg episode reward: [(0, '0.489')] [2024-06-13 11:15:08,066][73497] Updated weights for policy 0, policy_version 231119 (0.0033) [2024-06-13 11:15:10,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45056.0, 300 sec: 45764.1). Total num frames: 3786752000. Throughput: 0: 45211.9. Samples: 305258620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 11:15:10,502][73265] Avg episode reward: [(0, '0.472')] [2024-06-13 11:15:11,938][73497] Updated weights for policy 0, policy_version 231129 (0.0037) [2024-06-13 11:15:15,303][73497] Updated weights for policy 0, policy_version 231139 (0.0029) [2024-06-13 11:15:15,502][73265] Fps is (10 sec: 49151.5, 60 sec: 45875.1, 300 sec: 45819.6). Total num frames: 3786997760. Throughput: 0: 45376.0. Samples: 305539360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 11:15:15,502][73265] Avg episode reward: [(0, '0.481')] [2024-06-13 11:15:18,431][73477] Signal inference workers to stop experience collection... (4500 times) [2024-06-13 11:15:18,432][73477] Signal inference workers to resume experience collection... (4500 times) [2024-06-13 11:15:18,460][73497] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-06-13 11:15:18,461][73497] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-06-13 11:15:18,860][73497] Updated weights for policy 0, policy_version 231149 (0.0027) [2024-06-13 11:15:20,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3787194368. Throughput: 0: 45499.5. Samples: 305672460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 11:15:20,502][73265] Avg episode reward: [(0, '0.480')] [2024-06-13 11:15:22,682][73497] Updated weights for policy 0, policy_version 231159 (0.0038) [2024-06-13 11:15:25,501][73265] Fps is (10 sec: 40960.6, 60 sec: 45056.0, 300 sec: 45597.5). Total num frames: 3787407360. Throughput: 0: 45223.4. 
Samples: 305937860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 11:15:25,502][73265] Avg episode reward: [(0, '0.494')] [2024-06-13 11:15:26,569][73497] Updated weights for policy 0, policy_version 231169 (0.0046) [2024-06-13 11:15:29,603][73497] Updated weights for policy 0, policy_version 231179 (0.0033) [2024-06-13 11:15:30,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45602.2, 300 sec: 45764.1). Total num frames: 3787669504. Throughput: 0: 45307.6. Samples: 306212780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 11:15:30,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 11:15:33,619][73497] Updated weights for policy 0, policy_version 231189 (0.0034) [2024-06-13 11:15:35,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3787882496. Throughput: 0: 45420.5. Samples: 306360280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 11:15:35,502][73265] Avg episode reward: [(0, '0.493')] [2024-06-13 11:15:36,935][73497] Updated weights for policy 0, policy_version 231199 (0.0041) [2024-06-13 11:15:40,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45057.9, 300 sec: 45542.0). Total num frames: 3788095488. Throughput: 0: 45310.0. Samples: 306623000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 11:15:40,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 11:15:40,915][73497] Updated weights for policy 0, policy_version 231209 (0.0034) [2024-06-13 11:15:44,137][73497] Updated weights for policy 0, policy_version 231219 (0.0037) [2024-06-13 11:15:45,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 3788341248. Throughput: 0: 45272.4. Samples: 306891020. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-13 11:15:45,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 11:15:48,191][73497] Updated weights for policy 0, policy_version 231229 (0.0033) [2024-06-13 11:15:50,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3788537856. Throughput: 0: 45390.7. Samples: 307032740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 11:15:50,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 11:15:51,438][73497] Updated weights for policy 0, policy_version 231239 (0.0042) [2024-06-13 11:15:55,311][73497] Updated weights for policy 0, policy_version 231249 (0.0021) [2024-06-13 11:15:55,502][73265] Fps is (10 sec: 44236.1, 60 sec: 45056.0, 300 sec: 45486.4). Total num frames: 3788783616. Throughput: 0: 45471.5. Samples: 307304840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 11:15:55,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 11:15:58,431][73497] Updated weights for policy 0, policy_version 231259 (0.0040) [2024-06-13 11:16:00,501][73265] Fps is (10 sec: 49151.8, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 3789029376. Throughput: 0: 45324.5. Samples: 307578960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 11:16:00,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 11:16:02,485][73497] Updated weights for policy 0, policy_version 231269 (0.0035) [2024-06-13 11:16:05,502][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.0, 300 sec: 45708.6). Total num frames: 3789242368. Throughput: 0: 45549.6. Samples: 307722200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 11:16:05,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 11:16:05,868][73497] Updated weights for policy 0, policy_version 231279 (0.0040) [2024-06-13 11:16:09,773][73497] Updated weights for policy 0, policy_version 231289 (0.0044) [2024-06-13 11:16:10,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45329.1, 300 sec: 45486.4). 
Total num frames: 3789471744. Throughput: 0: 45665.8. Samples: 307992820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 11:16:10,502][73265] Avg episode reward: [(0, '0.416')] [2024-06-13 11:16:12,965][73497] Updated weights for policy 0, policy_version 231299 (0.0043) [2024-06-13 11:16:15,501][73265] Fps is (10 sec: 47514.4, 60 sec: 45329.2, 300 sec: 45597.5). Total num frames: 3789717504. Throughput: 0: 45760.9. Samples: 308272020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 11:16:15,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 11:16:16,613][73497] Updated weights for policy 0, policy_version 231309 (0.0035) [2024-06-13 11:16:19,992][73497] Updated weights for policy 0, policy_version 231319 (0.0039) [2024-06-13 11:16:20,501][73265] Fps is (10 sec: 49151.8, 60 sec: 46148.2, 300 sec: 45709.0). Total num frames: 3789963264. Throughput: 0: 45662.2. Samples: 308415080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 11:16:20,503][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 11:16:23,810][73497] Updated weights for policy 0, policy_version 231329 (0.0033) [2024-06-13 11:16:25,501][73265] Fps is (10 sec: 42598.0, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3790143488. Throughput: 0: 45699.9. Samples: 308679500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 11:16:25,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 11:16:25,550][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000231333_3790159872.pth... [2024-06-13 11:16:25,592][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000230667_3779248128.pth [2024-06-13 11:16:26,938][73497] Updated weights for policy 0, policy_version 231339 (0.0039) [2024-06-13 11:16:28,727][73477] Signal inference workers to stop experience collection... (4550 times) [2024-06-13 11:16:28,729][73477] Signal inference workers to resume experience collection... 
(4550 times) [2024-06-13 11:16:28,745][73497] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-06-13 11:16:28,745][73497] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-06-13 11:16:30,502][73265] Fps is (10 sec: 42598.1, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3790389248. Throughput: 0: 45962.1. Samples: 308959320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 11:16:30,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 11:16:31,318][73497] Updated weights for policy 0, policy_version 231349 (0.0036) [2024-06-13 11:16:34,553][73497] Updated weights for policy 0, policy_version 231359 (0.0028) [2024-06-13 11:16:35,501][73265] Fps is (10 sec: 49152.2, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 3790635008. Throughput: 0: 45855.1. Samples: 309096220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 11:16:35,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 11:16:38,235][73497] Updated weights for policy 0, policy_version 231369 (0.0030) [2024-06-13 11:16:40,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45602.0, 300 sec: 45597.5). Total num frames: 3790831616. Throughput: 0: 45830.7. Samples: 309367220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 11:16:40,503][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 11:16:41,541][73497] Updated weights for policy 0, policy_version 231379 (0.0027) [2024-06-13 11:16:45,083][73497] Updated weights for policy 0, policy_version 231389 (0.0028) [2024-06-13 11:16:45,502][73265] Fps is (10 sec: 45874.8, 60 sec: 45875.1, 300 sec: 45597.5). Total num frames: 3791093760. Throughput: 0: 45820.9. Samples: 309640900. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 11:16:45,502][73265] Avg episode reward: [(0, '0.416')] [2024-06-13 11:16:48,918][73497] Updated weights for policy 0, policy_version 231399 (0.0032) [2024-06-13 11:16:50,502][73265] Fps is (10 sec: 49151.7, 60 sec: 46421.3, 300 sec: 45597.5). Total num frames: 3791323136. Throughput: 0: 45758.2. Samples: 309781320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 11:16:50,502][73265] Avg episode reward: [(0, '0.463')] [2024-06-13 11:16:52,422][73497] Updated weights for policy 0, policy_version 231409 (0.0040) [2024-06-13 11:16:55,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 3791536128. Throughput: 0: 45887.9. Samples: 310057780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-13 11:16:55,502][73265] Avg episode reward: [(0, '0.488')] [2024-06-13 11:16:55,911][73497] Updated weights for policy 0, policy_version 231419 (0.0029) [2024-06-13 11:16:59,830][73497] Updated weights for policy 0, policy_version 231429 (0.0037) [2024-06-13 11:17:00,501][73265] Fps is (10 sec: 42599.1, 60 sec: 45329.2, 300 sec: 45486.4). Total num frames: 3791749120. Throughput: 0: 45863.1. Samples: 310335860. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-13 11:17:00,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 11:17:03,321][73497] Updated weights for policy 0, policy_version 231439 (0.0031) [2024-06-13 11:17:05,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45875.3, 300 sec: 45597.5). Total num frames: 3791994880. Throughput: 0: 45660.4. Samples: 310469800. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-13 11:17:05,502][73265] Avg episode reward: [(0, '0.399')] [2024-06-13 11:17:06,680][73497] Updated weights for policy 0, policy_version 231449 (0.0037) [2024-06-13 11:17:10,366][73497] Updated weights for policy 0, policy_version 231459 (0.0039) [2024-06-13 11:17:10,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 3792224256. Throughput: 0: 45924.9. Samples: 310746120. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-13 11:17:10,502][73265] Avg episode reward: [(0, '0.503')] [2024-06-13 11:17:13,567][73497] Updated weights for policy 0, policy_version 231469 (0.0035) [2024-06-13 11:17:15,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 3792437248. Throughput: 0: 45784.1. Samples: 311019600. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-13 11:17:15,502][73265] Avg episode reward: [(0, '0.396')] [2024-06-13 11:17:17,537][73497] Updated weights for policy 0, policy_version 231479 (0.0042) [2024-06-13 11:17:20,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45602.2, 300 sec: 45653.0). Total num frames: 3792699392. Throughput: 0: 45607.2. Samples: 311148540. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-13 11:17:20,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 11:17:21,547][73497] Updated weights for policy 0, policy_version 231489 (0.0039) [2024-06-13 11:17:24,626][73497] Updated weights for policy 0, policy_version 231499 (0.0031) [2024-06-13 11:17:25,504][73265] Fps is (10 sec: 45863.9, 60 sec: 45873.3, 300 sec: 45597.1). Total num frames: 3792896000. Throughput: 0: 45848.2. Samples: 311430500. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-13 11:17:25,504][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 11:17:28,715][73497] Updated weights for policy 0, policy_version 231509 (0.0032) [2024-06-13 11:17:30,501][73265] Fps is (10 sec: 42598.3, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 3793125376. Throughput: 0: 45573.0. Samples: 311691680. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-13 11:17:30,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 11:17:32,212][73497] Updated weights for policy 0, policy_version 231519 (0.0027) [2024-06-13 11:17:35,458][73497] Updated weights for policy 0, policy_version 231529 (0.0036) [2024-06-13 11:17:35,501][73265] Fps is (10 sec: 47525.6, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 3793371136. Throughput: 0: 45548.6. Samples: 311831000. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-13 11:17:35,502][73265] Avg episode reward: [(0, '0.497')] [2024-06-13 11:17:39,347][73497] Updated weights for policy 0, policy_version 231539 (0.0034) [2024-06-13 11:17:40,502][73265] Fps is (10 sec: 47513.2, 60 sec: 46148.3, 300 sec: 45708.6). Total num frames: 3793600512. Throughput: 0: 45654.6. Samples: 312112240. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-13 11:17:40,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 11:17:42,687][73497] Updated weights for policy 0, policy_version 231549 (0.0034) [2024-06-13 11:17:45,501][73265] Fps is (10 sec: 40960.0, 60 sec: 44783.0, 300 sec: 45430.9). Total num frames: 3793780736. Throughput: 0: 45404.9. Samples: 312379080. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-13 11:17:45,502][73265] Avg episode reward: [(0, '0.499')] [2024-06-13 11:17:46,510][73477] Signal inference workers to stop experience collection... (4600 times) [2024-06-13 11:17:46,513][73477] Signal inference workers to resume experience collection... 
(4600 times) [2024-06-13 11:17:46,518][73497] Updated weights for policy 0, policy_version 231559 (0.0032) [2024-06-13 11:17:46,529][73497] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-06-13 11:17:46,529][73497] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-06-13 11:17:49,780][73497] Updated weights for policy 0, policy_version 231569 (0.0033) [2024-06-13 11:17:50,501][73265] Fps is (10 sec: 44237.2, 60 sec: 45329.2, 300 sec: 45597.5). Total num frames: 3794042880. Throughput: 0: 45388.0. Samples: 312512260. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-13 11:17:50,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 11:17:53,474][73497] Updated weights for policy 0, policy_version 231579 (0.0024) [2024-06-13 11:17:55,501][73265] Fps is (10 sec: 50790.4, 60 sec: 45875.3, 300 sec: 45597.5). Total num frames: 3794288640. Throughput: 0: 45562.3. Samples: 312796420. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-13 11:17:55,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 11:17:57,066][73497] Updated weights for policy 0, policy_version 231589 (0.0030) [2024-06-13 11:18:00,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3794485248. Throughput: 0: 45549.4. Samples: 313069320. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-13 11:18:00,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 11:18:00,834][73497] Updated weights for policy 0, policy_version 231599 (0.0027) [2024-06-13 11:18:04,146][73497] Updated weights for policy 0, policy_version 231609 (0.0041) [2024-06-13 11:18:05,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 3794731008. Throughput: 0: 45653.8. Samples: 313202960. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-13 11:18:05,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 11:18:08,149][73497] Updated weights for policy 0, policy_version 231619 (0.0025) [2024-06-13 11:18:10,502][73265] Fps is (10 sec: 47511.2, 60 sec: 45601.8, 300 sec: 45541.9). Total num frames: 3794960384. Throughput: 0: 45532.7. Samples: 313479380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 11:18:10,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 11:18:11,386][73497] Updated weights for policy 0, policy_version 231629 (0.0029) [2024-06-13 11:18:14,943][73497] Updated weights for policy 0, policy_version 231639 (0.0025) [2024-06-13 11:18:15,502][73265] Fps is (10 sec: 47512.8, 60 sec: 46148.2, 300 sec: 45597.5). Total num frames: 3795206144. Throughput: 0: 46139.9. Samples: 313767980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 11:18:15,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 11:18:18,161][73497] Updated weights for policy 0, policy_version 231649 (0.0032) [2024-06-13 11:18:20,501][73265] Fps is (10 sec: 45877.5, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3795419136. Throughput: 0: 45916.0. Samples: 313897220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 11:18:20,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 11:18:21,995][73497] Updated weights for policy 0, policy_version 231659 (0.0022) [2024-06-13 11:18:25,251][73497] Updated weights for policy 0, policy_version 231669 (0.0032) [2024-06-13 11:18:25,501][73265] Fps is (10 sec: 45876.0, 60 sec: 46150.2, 300 sec: 45653.1). Total num frames: 3795664896. Throughput: 0: 45745.0. Samples: 314170760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 11:18:25,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 11:18:25,521][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000231669_3795664896.pth... 
[2024-06-13 11:18:25,587][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000231001_3784720384.pth [2024-06-13 11:18:29,287][73497] Updated weights for policy 0, policy_version 231679 (0.0042) [2024-06-13 11:18:30,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3795877888. Throughput: 0: 46074.7. Samples: 314452440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 11:18:30,502][73265] Avg episode reward: [(0, '0.489')] [2024-06-13 11:18:32,510][73497] Updated weights for policy 0, policy_version 231689 (0.0031) [2024-06-13 11:18:35,501][73265] Fps is (10 sec: 42598.2, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3796090880. Throughput: 0: 46015.6. Samples: 314582960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 11:18:35,502][73265] Avg episode reward: [(0, '0.472')] [2024-06-13 11:18:36,323][73497] Updated weights for policy 0, policy_version 231699 (0.0035) [2024-06-13 11:18:39,639][73497] Updated weights for policy 0, policy_version 231709 (0.0031) [2024-06-13 11:18:40,501][73265] Fps is (10 sec: 45874.7, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3796336640. Throughput: 0: 45930.6. Samples: 314863300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 11:18:40,502][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 11:18:43,417][73497] Updated weights for policy 0, policy_version 231719 (0.0033) [2024-06-13 11:18:45,501][73265] Fps is (10 sec: 49152.1, 60 sec: 46694.4, 300 sec: 45542.0). Total num frames: 3796582400. Throughput: 0: 46104.0. Samples: 315144000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 11:18:45,502][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 11:18:46,410][73497] Updated weights for policy 0, policy_version 231729 (0.0028) [2024-06-13 11:18:50,501][73265] Fps is (10 sec: 45876.0, 60 sec: 45875.3, 300 sec: 45597.5). Total num frames: 3796795392. Throughput: 0: 46141.8. Samples: 315279340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 11:18:50,502][73265] Avg episode reward: [(0, '0.369')] [2024-06-13 11:18:50,610][73497] Updated weights for policy 0, policy_version 231739 (0.0036) [2024-06-13 11:18:53,805][73497] Updated weights for policy 0, policy_version 231749 (0.0043) [2024-06-13 11:18:55,502][73265] Fps is (10 sec: 47512.9, 60 sec: 46148.1, 300 sec: 45764.1). Total num frames: 3797057536. Throughput: 0: 46033.7. Samples: 315550880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 11:18:55,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 11:18:57,731][73497] Updated weights for policy 0, policy_version 231759 (0.0034) [2024-06-13 11:19:00,502][73265] Fps is (10 sec: 47512.0, 60 sec: 46421.1, 300 sec: 45653.0). Total num frames: 3797270528. Throughput: 0: 45603.5. Samples: 315820140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 11:19:00,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 11:19:01,004][73497] Updated weights for policy 0, policy_version 231769 (0.0037) [2024-06-13 11:19:04,661][73497] Updated weights for policy 0, policy_version 231779 (0.0035) [2024-06-13 11:19:05,501][73265] Fps is (10 sec: 42598.9, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3797483520. Throughput: 0: 45678.6. Samples: 315952760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 11:19:05,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 11:19:08,247][73497] Updated weights for policy 0, policy_version 231789 (0.0031) [2024-06-13 11:19:09,266][73477] Signal inference workers to stop experience collection... (4650 times) [2024-06-13 11:19:09,272][73477] Signal inference workers to resume experience collection... 
(4650 times) [2024-06-13 11:19:09,297][73497] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-06-13 11:19:09,297][73497] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-06-13 11:19:10,502][73265] Fps is (10 sec: 45875.6, 60 sec: 46148.5, 300 sec: 45708.6). Total num frames: 3797729280. Throughput: 0: 45828.7. Samples: 316233060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 11:19:10,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 11:19:12,031][73497] Updated weights for policy 0, policy_version 231799 (0.0034) [2024-06-13 11:19:15,232][73497] Updated weights for policy 0, policy_version 231809 (0.0040) [2024-06-13 11:19:15,502][73265] Fps is (10 sec: 47512.3, 60 sec: 45875.1, 300 sec: 45653.0). Total num frames: 3797958656. Throughput: 0: 45693.5. Samples: 316508660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-13 11:19:15,508][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 11:19:19,343][73497] Updated weights for policy 0, policy_version 231819 (0.0032) [2024-06-13 11:19:20,501][73265] Fps is (10 sec: 44237.7, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 3798171648. Throughput: 0: 45852.1. Samples: 316646300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 11:19:20,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 11:19:22,629][73497] Updated weights for policy 0, policy_version 231829 (0.0033) [2024-06-13 11:19:25,502][73265] Fps is (10 sec: 42599.1, 60 sec: 45329.0, 300 sec: 45597.5). Total num frames: 3798384640. Throughput: 0: 45460.0. Samples: 316909000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 11:19:25,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 11:19:26,385][73497] Updated weights for policy 0, policy_version 231839 (0.0042) [2024-06-13 11:19:29,935][73497] Updated weights for policy 0, policy_version 231849 (0.0027) [2024-06-13 11:19:30,502][73265] Fps is (10 sec: 44235.9, 60 sec: 45602.0, 300 sec: 45597.5). Total num frames: 3798614016. Throughput: 0: 45218.0. Samples: 317178820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 11:19:30,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 11:19:33,470][73497] Updated weights for policy 0, policy_version 231859 (0.0034) [2024-06-13 11:19:35,501][73265] Fps is (10 sec: 44237.3, 60 sec: 45602.1, 300 sec: 45542.3). Total num frames: 3798827008. Throughput: 0: 45216.8. Samples: 317314100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 11:19:35,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 11:19:37,085][73497] Updated weights for policy 0, policy_version 231869 (0.0030) [2024-06-13 11:19:40,501][73265] Fps is (10 sec: 47514.5, 60 sec: 45875.3, 300 sec: 45708.6). Total num frames: 3799089152. Throughput: 0: 45342.8. Samples: 317591300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 11:19:40,502][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 11:19:40,827][73497] Updated weights for policy 0, policy_version 231879 (0.0033) [2024-06-13 11:19:44,559][73497] Updated weights for policy 0, policy_version 231889 (0.0038) [2024-06-13 11:19:45,501][73265] Fps is (10 sec: 49151.7, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 3799318528. Throughput: 0: 45248.2. Samples: 317856300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 11:19:45,504][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 11:19:48,249][73497] Updated weights for policy 0, policy_version 231899 (0.0039) [2024-06-13 11:19:50,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3799515136. Throughput: 0: 45554.2. Samples: 318002700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 11:19:50,502][73265] Avg episode reward: [(0, '0.392')] [2024-06-13 11:19:51,689][73497] Updated weights for policy 0, policy_version 231909 (0.0027) [2024-06-13 11:19:55,421][73497] Updated weights for policy 0, policy_version 231919 (0.0023) [2024-06-13 11:19:55,501][73265] Fps is (10 sec: 44237.2, 60 sec: 45056.1, 300 sec: 45653.0). Total num frames: 3799760896. Throughput: 0: 45251.7. Samples: 318269380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 11:19:55,502][73265] Avg episode reward: [(0, '0.501')] [2024-06-13 11:19:58,898][73497] Updated weights for policy 0, policy_version 231929 (0.0032) [2024-06-13 11:20:00,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45056.1, 300 sec: 45653.0). Total num frames: 3799973888. Throughput: 0: 45163.8. Samples: 318541020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 11:20:00,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 11:20:02,434][73497] Updated weights for policy 0, policy_version 231939 (0.0045) [2024-06-13 11:20:05,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3800203264. Throughput: 0: 45050.7. Samples: 318673580. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 11:20:05,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 11:20:06,077][73497] Updated weights for policy 0, policy_version 231949 (0.0032) [2024-06-13 11:20:09,578][73497] Updated weights for policy 0, policy_version 231959 (0.0040) [2024-06-13 11:20:10,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45056.1, 300 sec: 45542.0). Total num frames: 3800432640. Throughput: 0: 45413.0. Samples: 318952580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 11:20:10,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 11:20:13,522][73497] Updated weights for policy 0, policy_version 231969 (0.0043) [2024-06-13 11:20:15,501][73265] Fps is (10 sec: 45874.6, 60 sec: 45056.1, 300 sec: 45653.0). Total num frames: 3800662016. Throughput: 0: 45201.9. Samples: 319212900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 11:20:15,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 11:20:17,594][73497] Updated weights for policy 0, policy_version 231979 (0.0034) [2024-06-13 11:20:20,503][73265] Fps is (10 sec: 44230.1, 60 sec: 45054.8, 300 sec: 45652.8). Total num frames: 3800875008. Throughput: 0: 45311.8. Samples: 319353200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 11:20:20,504][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 11:20:20,760][73497] Updated weights for policy 0, policy_version 231989 (0.0030) [2024-06-13 11:20:24,566][73497] Updated weights for policy 0, policy_version 231999 (0.0042) [2024-06-13 11:20:25,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45056.1, 300 sec: 45486.4). Total num frames: 3801088000. Throughput: 0: 45172.4. Samples: 319624060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-13 11:20:25,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 11:20:25,508][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000232000_3801088000.pth... 
[2024-06-13 11:20:25,569][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000231333_3790159872.pth [2024-06-13 11:20:28,216][73497] Updated weights for policy 0, policy_version 232009 (0.0031) [2024-06-13 11:20:30,169][73477] Signal inference workers to stop experience collection... (4700 times) [2024-06-13 11:20:30,215][73497] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-06-13 11:20:30,224][73477] Signal inference workers to resume experience collection... (4700 times) [2024-06-13 11:20:30,233][73497] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-06-13 11:20:30,501][73265] Fps is (10 sec: 45882.4, 60 sec: 45329.2, 300 sec: 45597.5). Total num frames: 3801333760. Throughput: 0: 45339.2. Samples: 319896560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 11:20:30,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 11:20:31,607][73497] Updated weights for policy 0, policy_version 232019 (0.0034) [2024-06-13 11:20:35,304][73497] Updated weights for policy 0, policy_version 232029 (0.0030) [2024-06-13 11:20:35,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3801563136. Throughput: 0: 45175.4. Samples: 320035600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 11:20:35,502][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 11:20:38,699][73497] Updated weights for policy 0, policy_version 232039 (0.0029) [2024-06-13 11:20:40,501][73265] Fps is (10 sec: 44236.4, 60 sec: 44782.9, 300 sec: 45542.0). Total num frames: 3801776128. Throughput: 0: 45224.8. Samples: 320304500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 11:20:40,502][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 11:20:42,676][73497] Updated weights for policy 0, policy_version 232049 (0.0037) [2024-06-13 11:20:45,501][73265] Fps is (10 sec: 42598.9, 60 sec: 44509.9, 300 sec: 45597.5). Total num frames: 3801989120. 
Throughput: 0: 45256.9. Samples: 320577580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 11:20:45,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 11:20:46,253][73497] Updated weights for policy 0, policy_version 232059 (0.0032) [2024-06-13 11:20:49,679][73497] Updated weights for policy 0, policy_version 232069 (0.0028) [2024-06-13 11:20:50,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45329.0, 300 sec: 45597.5). Total num frames: 3802234880. Throughput: 0: 45380.8. Samples: 320715720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 11:20:50,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 11:20:53,118][73497] Updated weights for policy 0, policy_version 232079 (0.0041) [2024-06-13 11:20:55,502][73265] Fps is (10 sec: 47513.0, 60 sec: 45055.9, 300 sec: 45542.0). Total num frames: 3802464256. Throughput: 0: 45128.3. Samples: 320983360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 11:20:55,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 11:20:56,851][73497] Updated weights for policy 0, policy_version 232089 (0.0040) [2024-06-13 11:21:00,032][73497] Updated weights for policy 0, policy_version 232099 (0.0040) [2024-06-13 11:21:00,502][73265] Fps is (10 sec: 47513.0, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3802710016. Throughput: 0: 45318.6. Samples: 321252240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 11:21:00,502][73265] Avg episode reward: [(0, '0.489')] [2024-06-13 11:21:04,075][73497] Updated weights for policy 0, policy_version 232109 (0.0030) [2024-06-13 11:21:05,503][73265] Fps is (10 sec: 45866.7, 60 sec: 45327.5, 300 sec: 45597.2). Total num frames: 3802923008. Throughput: 0: 45657.3. Samples: 321407800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 11:21:05,504][73265] Avg episode reward: [(0, '0.397')] [2024-06-13 11:21:07,255][73497] Updated weights for policy 0, policy_version 232119 (0.0021) [2024-06-13 11:21:10,501][73265] Fps is (10 sec: 45875.9, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3803168768. Throughput: 0: 45680.9. Samples: 321679700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 11:21:10,502][73265] Avg episode reward: [(0, '0.492')] [2024-06-13 11:21:10,915][73497] Updated weights for policy 0, policy_version 232129 (0.0033) [2024-06-13 11:21:14,786][73497] Updated weights for policy 0, policy_version 232139 (0.0036) [2024-06-13 11:21:15,501][73265] Fps is (10 sec: 44245.5, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 3803365376. Throughput: 0: 45667.9. Samples: 321951620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 11:21:15,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 11:21:18,762][73497] Updated weights for policy 0, policy_version 232149 (0.0032) [2024-06-13 11:21:20,501][73265] Fps is (10 sec: 45874.8, 60 sec: 45876.3, 300 sec: 45708.6). Total num frames: 3803627520. Throughput: 0: 45672.5. Samples: 322090860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 11:21:20,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 11:21:21,599][73497] Updated weights for policy 0, policy_version 232159 (0.0031) [2024-06-13 11:21:25,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3803824128. Throughput: 0: 45734.7. Samples: 322362560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 11:21:25,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 11:21:25,806][73497] Updated weights for policy 0, policy_version 232169 (0.0031) [2024-06-13 11:21:28,908][73497] Updated weights for policy 0, policy_version 232179 (0.0026) [2024-06-13 11:21:30,502][73265] Fps is (10 sec: 44236.4, 60 sec: 45602.0, 300 sec: 45541.9). 
Total num frames: 3804069888. Throughput: 0: 45742.5. Samples: 322636000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-13 11:21:30,502][73265] Avg episode reward: [(0, '0.474')] [2024-06-13 11:21:32,774][73497] Updated weights for policy 0, policy_version 232189 (0.0027) [2024-06-13 11:21:35,501][73265] Fps is (10 sec: 49151.7, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 3804315648. Throughput: 0: 45684.8. Samples: 322771540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 11:21:35,502][73265] Avg episode reward: [(0, '0.501')] [2024-06-13 11:21:36,622][73497] Updated weights for policy 0, policy_version 232199 (0.0033) [2024-06-13 11:21:39,727][73497] Updated weights for policy 0, policy_version 232209 (0.0035) [2024-06-13 11:21:40,506][73265] Fps is (10 sec: 45856.8, 60 sec: 45872.0, 300 sec: 45541.3). Total num frames: 3804528640. Throughput: 0: 45775.4. Samples: 323043440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 11:21:40,506][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 11:21:43,680][73497] Updated weights for policy 0, policy_version 232219 (0.0032) [2024-06-13 11:21:45,501][73265] Fps is (10 sec: 44237.0, 60 sec: 46148.2, 300 sec: 45542.0). Total num frames: 3804758016. Throughput: 0: 45894.3. Samples: 323317480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 11:21:45,502][73265] Avg episode reward: [(0, '0.413')] [2024-06-13 11:21:47,587][73497] Updated weights for policy 0, policy_version 232229 (0.0034) [2024-06-13 11:21:50,501][73265] Fps is (10 sec: 44255.3, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3804971008. Throughput: 0: 45345.6. Samples: 323448260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 11:21:50,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 11:21:50,513][73477] Signal inference workers to stop experience collection... (4750 times) [2024-06-13 11:21:50,516][73477] Signal inference workers to resume experience collection... 
(4750 times) [2024-06-13 11:21:50,531][73497] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-06-13 11:21:50,531][73497] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-06-13 11:21:50,651][73497] Updated weights for policy 0, policy_version 232239 (0.0031) [2024-06-13 11:21:54,643][73497] Updated weights for policy 0, policy_version 232249 (0.0031) [2024-06-13 11:21:55,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45875.4, 300 sec: 45653.0). Total num frames: 3805216768. Throughput: 0: 45693.4. Samples: 323735900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 11:21:55,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 11:21:57,808][73497] Updated weights for policy 0, policy_version 232259 (0.0036) [2024-06-13 11:22:00,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45329.2, 300 sec: 45542.0). Total num frames: 3805429760. Throughput: 0: 45680.9. Samples: 324007260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 11:22:00,502][73265] Avg episode reward: [(0, '0.405')] [2024-06-13 11:22:01,409][73497] Updated weights for policy 0, policy_version 232269 (0.0026) [2024-06-13 11:22:04,944][73497] Updated weights for policy 0, policy_version 232279 (0.0040) [2024-06-13 11:22:05,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45603.6, 300 sec: 45542.0). Total num frames: 3805659136. Throughput: 0: 45598.3. Samples: 324142780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 11:22:05,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 11:22:08,358][73497] Updated weights for policy 0, policy_version 232289 (0.0030) [2024-06-13 11:22:10,502][73265] Fps is (10 sec: 45874.4, 60 sec: 45328.9, 300 sec: 45597.5). Total num frames: 3805888512. Throughput: 0: 45624.3. Samples: 324415660. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 11:22:10,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 11:22:11,897][73497] Updated weights for policy 0, policy_version 232299 (0.0031) [2024-06-13 11:22:15,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45602.2, 300 sec: 45430.9). Total num frames: 3806101504. Throughput: 0: 45670.9. Samples: 324691180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 11:22:15,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 11:22:16,008][73497] Updated weights for policy 0, policy_version 232309 (0.0035) [2024-06-13 11:22:19,531][73497] Updated weights for policy 0, policy_version 232319 (0.0034) [2024-06-13 11:22:20,504][73265] Fps is (10 sec: 45864.7, 60 sec: 45327.3, 300 sec: 45597.5). Total num frames: 3806347264. Throughput: 0: 45554.0. Samples: 324821580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 11:22:20,504][73265] Avg episode reward: [(0, '0.427')] [2024-06-13 11:22:23,072][73497] Updated weights for policy 0, policy_version 232329 (0.0026) [2024-06-13 11:22:25,501][73265] Fps is (10 sec: 49151.8, 60 sec: 46148.3, 300 sec: 45653.0). Total num frames: 3806593024. Throughput: 0: 45862.0. Samples: 325107040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 11:22:25,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 11:22:25,527][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000232336_3806593024.pth... [2024-06-13 11:22:25,584][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000231669_3795664896.pth [2024-06-13 11:22:26,383][73497] Updated weights for policy 0, policy_version 232339 (0.0039) [2024-06-13 11:22:30,254][73497] Updated weights for policy 0, policy_version 232349 (0.0024) [2024-06-13 11:22:30,501][73265] Fps is (10 sec: 45886.2, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 3806806016. Throughput: 0: 46008.9. Samples: 325387880. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 11:22:30,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 11:22:33,574][73497] Updated weights for policy 0, policy_version 232359 (0.0033) [2024-06-13 11:22:35,501][73265] Fps is (10 sec: 45874.7, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3807051776. Throughput: 0: 45953.2. Samples: 325516160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 11:22:35,502][73265] Avg episode reward: [(0, '0.496')] [2024-06-13 11:22:37,247][73497] Updated weights for policy 0, policy_version 232369 (0.0029) [2024-06-13 11:22:40,504][73265] Fps is (10 sec: 47502.1, 60 sec: 45876.5, 300 sec: 45763.7). Total num frames: 3807281152. Throughput: 0: 45709.4. Samples: 325792940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-13 11:22:40,504][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 11:22:40,636][73497] Updated weights for policy 0, policy_version 232379 (0.0034) [2024-06-13 11:22:44,924][73497] Updated weights for policy 0, policy_version 232389 (0.0031) [2024-06-13 11:22:45,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45875.2, 300 sec: 45653.1). Total num frames: 3807510528. Throughput: 0: 45733.4. Samples: 326065260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 11:22:45,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 11:22:48,031][73497] Updated weights for policy 0, policy_version 232399 (0.0039) [2024-06-13 11:22:50,502][73265] Fps is (10 sec: 42608.5, 60 sec: 45602.0, 300 sec: 45486.4). Total num frames: 3807707136. Throughput: 0: 45647.9. Samples: 326196940. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 11:22:50,502][73265] Avg episode reward: [(0, '0.509')] [2024-06-13 11:22:51,950][73497] Updated weights for policy 0, policy_version 232409 (0.0041) [2024-06-13 11:22:55,070][73497] Updated weights for policy 0, policy_version 232419 (0.0033) [2024-06-13 11:22:55,501][73265] Fps is (10 sec: 45874.7, 60 sec: 45875.1, 300 sec: 45708.6). Total num frames: 3807969280. Throughput: 0: 45861.4. Samples: 326479420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 11:22:55,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 11:22:59,051][73497] Updated weights for policy 0, policy_version 232429 (0.0040) [2024-06-13 11:23:00,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3808165888. Throughput: 0: 45795.1. Samples: 326751960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 11:23:00,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 11:23:01,966][73497] Updated weights for policy 0, policy_version 232439 (0.0039) [2024-06-13 11:23:05,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45875.2, 300 sec: 45597.6). Total num frames: 3808411648. Throughput: 0: 45805.2. Samples: 326882700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 11:23:05,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 11:23:05,970][73497] Updated weights for policy 0, policy_version 232449 (0.0025) [2024-06-13 11:23:07,666][73477] Signal inference workers to stop experience collection... (4800 times) [2024-06-13 11:23:07,669][73477] Signal inference workers to resume experience collection... 
(4800 times) [2024-06-13 11:23:07,711][73497] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-06-13 11:23:07,711][73497] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-06-13 11:23:09,245][73497] Updated weights for policy 0, policy_version 232459 (0.0037) [2024-06-13 11:23:10,502][73265] Fps is (10 sec: 49151.3, 60 sec: 46148.3, 300 sec: 45597.5). Total num frames: 3808657408. Throughput: 0: 45571.0. Samples: 327157740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 11:23:10,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 11:23:13,477][73497] Updated weights for policy 0, policy_version 232469 (0.0034) [2024-06-13 11:23:15,501][73265] Fps is (10 sec: 45875.1, 60 sec: 46148.2, 300 sec: 45597.5). Total num frames: 3808870400. Throughput: 0: 45181.8. Samples: 327421060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 11:23:15,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 11:23:16,765][73497] Updated weights for policy 0, policy_version 232479 (0.0039) [2024-06-13 11:23:20,355][73497] Updated weights for policy 0, policy_version 232489 (0.0029) [2024-06-13 11:23:20,501][73265] Fps is (10 sec: 44237.4, 60 sec: 45877.1, 300 sec: 45542.0). Total num frames: 3809099776. Throughput: 0: 45523.6. Samples: 327564720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 11:23:20,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 11:23:23,813][73497] Updated weights for policy 0, policy_version 232499 (0.0028) [2024-06-13 11:23:25,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 3809329152. Throughput: 0: 45461.2. Samples: 327838580. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 11:23:25,502][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 11:23:27,699][73497] Updated weights for policy 0, policy_version 232509 (0.0032) [2024-06-13 11:23:30,502][73265] Fps is (10 sec: 45874.6, 60 sec: 45875.1, 300 sec: 45653.0). Total num frames: 3809558528. Throughput: 0: 45495.8. Samples: 328112580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 11:23:30,502][73265] Avg episode reward: [(0, '0.474')] [2024-06-13 11:23:31,228][73497] Updated weights for policy 0, policy_version 232519 (0.0036) [2024-06-13 11:23:34,994][73497] Updated weights for policy 0, policy_version 232529 (0.0032) [2024-06-13 11:23:35,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 3809787904. Throughput: 0: 45625.5. Samples: 328250080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 11:23:35,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 11:23:38,184][73497] Updated weights for policy 0, policy_version 232539 (0.0041) [2024-06-13 11:23:40,501][73265] Fps is (10 sec: 44237.8, 60 sec: 45331.0, 300 sec: 45486.4). Total num frames: 3810000896. Throughput: 0: 45320.6. Samples: 328518840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 11:23:40,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 11:23:42,004][73497] Updated weights for policy 0, policy_version 232549 (0.0037) [2024-06-13 11:23:45,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3810230272. Throughput: 0: 45347.1. Samples: 328792580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 11:23:45,502][73265] Avg episode reward: [(0, '0.490')] [2024-06-13 11:23:45,725][73497] Updated weights for policy 0, policy_version 232559 (0.0037) [2024-06-13 11:23:49,022][73497] Updated weights for policy 0, policy_version 232569 (0.0035) [2024-06-13 11:23:50,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45875.3, 300 sec: 45430.9). 
Total num frames: 3810459648. Throughput: 0: 45446.2. Samples: 328927780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-13 11:23:50,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 11:23:52,878][73497] Updated weights for policy 0, policy_version 232579 (0.0051) [2024-06-13 11:23:55,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 45486.5). Total num frames: 3810689024. Throughput: 0: 45569.9. Samples: 329208380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-13 11:23:55,502][73265] Avg episode reward: [(0, '0.479')] [2024-06-13 11:23:56,470][73497] Updated weights for policy 0, policy_version 232589 (0.0035) [2024-06-13 11:23:59,966][73497] Updated weights for policy 0, policy_version 232599 (0.0055) [2024-06-13 11:24:00,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3810918400. Throughput: 0: 45594.7. Samples: 329472820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-13 11:24:00,502][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 11:24:03,787][73497] Updated weights for policy 0, policy_version 232609 (0.0030) [2024-06-13 11:24:05,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.1, 300 sec: 45486.5). Total num frames: 3811147776. Throughput: 0: 45492.0. Samples: 329611860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-13 11:24:05,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 11:24:06,852][73497] Updated weights for policy 0, policy_version 232619 (0.0030) [2024-06-13 11:24:10,502][73265] Fps is (10 sec: 44236.1, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 3811360768. Throughput: 0: 45408.2. Samples: 329881960. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-13 11:24:10,502][73265] Avg episode reward: [(0, '0.487')] [2024-06-13 11:24:10,975][73497] Updated weights for policy 0, policy_version 232629 (0.0030) [2024-06-13 11:24:14,778][73497] Updated weights for policy 0, policy_version 232639 (0.0041) [2024-06-13 11:24:15,501][73265] Fps is (10 sec: 40959.9, 60 sec: 44782.9, 300 sec: 45375.3). Total num frames: 3811557376. Throughput: 0: 45483.2. Samples: 330159320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-13 11:24:15,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 11:24:17,904][73497] Updated weights for policy 0, policy_version 232649 (0.0028) [2024-06-13 11:24:20,502][73265] Fps is (10 sec: 49152.2, 60 sec: 45875.1, 300 sec: 45653.0). Total num frames: 3811852288. Throughput: 0: 45457.2. Samples: 330295660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-13 11:24:20,502][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 11:24:21,629][73497] Updated weights for policy 0, policy_version 232659 (0.0035) [2024-06-13 11:24:24,871][73497] Updated weights for policy 0, policy_version 232669 (0.0032) [2024-06-13 11:24:25,502][73265] Fps is (10 sec: 49151.3, 60 sec: 45328.9, 300 sec: 45542.0). Total num frames: 3812048896. Throughput: 0: 45603.3. Samples: 330571000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-13 11:24:25,502][73265] Avg episode reward: [(0, '0.496')] [2024-06-13 11:24:25,515][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000232669_3812048896.pth... [2024-06-13 11:24:25,588][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000232000_3801088000.pth [2024-06-13 11:24:28,721][73497] Updated weights for policy 0, policy_version 232679 (0.0032) [2024-06-13 11:24:30,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3812278272. Throughput: 0: 45636.8. Samples: 330846240. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-13 11:24:30,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 11:24:32,546][73497] Updated weights for policy 0, policy_version 232689 (0.0031) [2024-06-13 11:24:35,501][73265] Fps is (10 sec: 45876.1, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3812507648. Throughput: 0: 45688.0. Samples: 330983740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-13 11:24:35,502][73265] Avg episode reward: [(0, '0.420')] [2024-06-13 11:24:35,814][73497] Updated weights for policy 0, policy_version 232699 (0.0032) [2024-06-13 11:24:39,689][73497] Updated weights for policy 0, policy_version 232709 (0.0038) [2024-06-13 11:24:40,502][73265] Fps is (10 sec: 47513.3, 60 sec: 45875.1, 300 sec: 45542.0). Total num frames: 3812753408. Throughput: 0: 45363.9. Samples: 331249760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-13 11:24:40,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 11:24:43,203][73497] Updated weights for policy 0, policy_version 232719 (0.0036) [2024-06-13 11:24:45,501][73265] Fps is (10 sec: 45874.8, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3812966400. Throughput: 0: 45753.7. Samples: 331531740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-13 11:24:45,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 11:24:46,615][73497] Updated weights for policy 0, policy_version 232729 (0.0036) [2024-06-13 11:24:46,928][73477] Signal inference workers to stop experience collection... (4850 times) [2024-06-13 11:24:46,956][73497] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-06-13 11:24:47,035][73477] Signal inference workers to resume experience collection... (4850 times) [2024-06-13 11:24:47,036][73497] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-06-13 11:24:50,508][73265] Fps is (10 sec: 42571.0, 60 sec: 45324.1, 300 sec: 45485.4). Total num frames: 3813179392. Throughput: 0: 45476.5. 
Samples: 331658600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-13 11:24:50,509][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 11:24:50,691][73497] Updated weights for policy 0, policy_version 232739 (0.0024) [2024-06-13 11:24:53,945][73497] Updated weights for policy 0, policy_version 232749 (0.0037) [2024-06-13 11:24:55,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 3813425152. Throughput: 0: 45479.7. Samples: 331928540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-13 11:24:55,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 11:24:57,679][73497] Updated weights for policy 0, policy_version 232759 (0.0036) [2024-06-13 11:25:00,501][73265] Fps is (10 sec: 45905.0, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3813638144. Throughput: 0: 45456.9. Samples: 332204880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-13 11:25:00,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 11:25:01,075][73497] Updated weights for policy 0, policy_version 232769 (0.0033) [2024-06-13 11:25:04,990][73497] Updated weights for policy 0, policy_version 232779 (0.0036) [2024-06-13 11:25:05,501][73265] Fps is (10 sec: 42598.0, 60 sec: 45055.9, 300 sec: 45486.4). Total num frames: 3813851136. Throughput: 0: 45397.8. Samples: 332338560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 11:25:05,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 11:25:08,262][73497] Updated weights for policy 0, policy_version 232789 (0.0036) [2024-06-13 11:25:10,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45602.3, 300 sec: 45542.0). Total num frames: 3814096896. Throughput: 0: 45106.9. Samples: 332600800. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 11:25:10,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 11:25:12,342][73497] Updated weights for policy 0, policy_version 232799 (0.0031) [2024-06-13 11:25:15,406][73497] Updated weights for policy 0, policy_version 232809 (0.0031) [2024-06-13 11:25:15,501][73265] Fps is (10 sec: 49151.8, 60 sec: 46421.3, 300 sec: 45653.3). Total num frames: 3814342656. Throughput: 0: 45295.5. Samples: 332884540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 11:25:15,502][73265] Avg episode reward: [(0, '0.439')] [2024-06-13 11:25:19,461][73497] Updated weights for policy 0, policy_version 232819 (0.0036) [2024-06-13 11:25:20,501][73265] Fps is (10 sec: 42598.3, 60 sec: 44510.0, 300 sec: 45542.0). Total num frames: 3814522880. Throughput: 0: 45256.9. Samples: 333020300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 11:25:20,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 11:25:22,679][73497] Updated weights for policy 0, policy_version 232829 (0.0038) [2024-06-13 11:25:25,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45329.2, 300 sec: 45542.0). Total num frames: 3814768640. Throughput: 0: 45361.4. Samples: 333291020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 11:25:25,502][73265] Avg episode reward: [(0, '0.492')] [2024-06-13 11:25:26,436][73497] Updated weights for policy 0, policy_version 232839 (0.0042) [2024-06-13 11:25:29,865][73497] Updated weights for policy 0, policy_version 232849 (0.0037) [2024-06-13 11:25:30,501][73265] Fps is (10 sec: 49151.4, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3815014400. Throughput: 0: 45296.8. Samples: 333570100. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 11:25:30,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 11:25:33,882][73497] Updated weights for policy 0, policy_version 232859 (0.0029) [2024-06-13 11:25:35,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45329.0, 300 sec: 45597.5). Total num frames: 3815227392. Throughput: 0: 45587.1. Samples: 333709720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 11:25:35,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 11:25:37,047][73497] Updated weights for policy 0, policy_version 232869 (0.0031) [2024-06-13 11:25:40,504][73265] Fps is (10 sec: 44226.3, 60 sec: 45054.2, 300 sec: 45652.7). Total num frames: 3815456768. Throughput: 0: 45653.5. Samples: 333983060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 11:25:40,504][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 11:25:40,956][73497] Updated weights for policy 0, policy_version 232879 (0.0032) [2024-06-13 11:25:44,077][73497] Updated weights for policy 0, policy_version 232889 (0.0033) [2024-06-13 11:25:45,501][73265] Fps is (10 sec: 49152.4, 60 sec: 45875.3, 300 sec: 45708.6). Total num frames: 3815718912. Throughput: 0: 45683.7. Samples: 334260640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 11:25:45,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 11:25:47,875][73497] Updated weights for policy 0, policy_version 232899 (0.0037) [2024-06-13 11:25:50,502][73265] Fps is (10 sec: 45885.9, 60 sec: 45607.0, 300 sec: 45597.5). Total num frames: 3815915520. Throughput: 0: 45844.4. Samples: 334401560. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 11:25:50,502][73265] Avg episode reward: [(0, '0.454')] [2024-06-13 11:25:51,403][73497] Updated weights for policy 0, policy_version 232909 (0.0023) [2024-06-13 11:25:55,077][73497] Updated weights for policy 0, policy_version 232919 (0.0033) [2024-06-13 11:25:55,501][73265] Fps is (10 sec: 42598.3, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3816144896. Throughput: 0: 46185.3. Samples: 334679140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 11:25:55,502][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 11:25:58,811][73497] Updated weights for policy 0, policy_version 232929 (0.0030) [2024-06-13 11:25:58,844][73477] Signal inference workers to stop experience collection... (4900 times) [2024-06-13 11:25:58,845][73477] Signal inference workers to resume experience collection... (4900 times) [2024-06-13 11:25:58,889][73497] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-06-13 11:25:58,889][73497] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-06-13 11:26:00,501][73265] Fps is (10 sec: 47514.0, 60 sec: 45875.2, 300 sec: 45653.3). Total num frames: 3816390656. Throughput: 0: 45869.4. Samples: 334948660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 11:26:00,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 11:26:02,187][73497] Updated weights for policy 0, policy_version 232939 (0.0038) [2024-06-13 11:26:05,502][73265] Fps is (10 sec: 45874.3, 60 sec: 45875.1, 300 sec: 45541.9). Total num frames: 3816603648. Throughput: 0: 45906.0. Samples: 335086080. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 11:26:05,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 11:26:05,946][73497] Updated weights for policy 0, policy_version 232949 (0.0027) [2024-06-13 11:26:09,702][73497] Updated weights for policy 0, policy_version 232959 (0.0024) [2024-06-13 11:26:10,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45875.1, 300 sec: 45708.6). Total num frames: 3816849408. Throughput: 0: 46002.7. Samples: 335361140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-13 11:26:10,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 11:26:13,010][73497] Updated weights for policy 0, policy_version 232969 (0.0031) [2024-06-13 11:26:15,501][73265] Fps is (10 sec: 47514.3, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 3817078784. Throughput: 0: 45934.8. Samples: 335637160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 11:26:15,502][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 11:26:16,759][73497] Updated weights for policy 0, policy_version 232979 (0.0036) [2024-06-13 11:26:19,882][73497] Updated weights for policy 0, policy_version 232989 (0.0044) [2024-06-13 11:26:20,506][73265] Fps is (10 sec: 45853.9, 60 sec: 46417.7, 300 sec: 45707.9). Total num frames: 3817308160. Throughput: 0: 45970.3. Samples: 335778600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 11:26:20,507][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 11:26:23,855][73497] Updated weights for policy 0, policy_version 232999 (0.0037) [2024-06-13 11:26:25,502][73265] Fps is (10 sec: 45874.4, 60 sec: 46148.2, 300 sec: 45653.0). Total num frames: 3817537536. Throughput: 0: 45996.1. Samples: 336052780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 11:26:25,502][73265] Avg episode reward: [(0, '0.488')] [2024-06-13 11:26:25,510][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000233004_3817537536.pth... 
[2024-06-13 11:26:25,564][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000232336_3806593024.pth [2024-06-13 11:26:27,170][73497] Updated weights for policy 0, policy_version 233009 (0.0031) [2024-06-13 11:26:30,504][73265] Fps is (10 sec: 45886.9, 60 sec: 45873.6, 300 sec: 45597.2). Total num frames: 3817766912. Throughput: 0: 45865.3. Samples: 336324680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 11:26:30,504][73265] Avg episode reward: [(0, '0.486')] [2024-06-13 11:26:30,775][73497] Updated weights for policy 0, policy_version 233019 (0.0036) [2024-06-13 11:26:34,456][73497] Updated weights for policy 0, policy_version 233029 (0.0034) [2024-06-13 11:26:35,501][73265] Fps is (10 sec: 45875.7, 60 sec: 46148.2, 300 sec: 45653.7). Total num frames: 3817996288. Throughput: 0: 45780.5. Samples: 336461680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 11:26:35,502][73265] Avg episode reward: [(0, '0.406')] [2024-06-13 11:26:38,327][73497] Updated weights for policy 0, policy_version 233039 (0.0038) [2024-06-13 11:26:40,501][73265] Fps is (10 sec: 44246.1, 60 sec: 45877.1, 300 sec: 45597.5). Total num frames: 3818209280. Throughput: 0: 45740.4. Samples: 336737460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 11:26:40,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 11:26:41,542][73497] Updated weights for policy 0, policy_version 233049 (0.0036) [2024-06-13 11:26:45,488][73497] Updated weights for policy 0, policy_version 233059 (0.0034) [2024-06-13 11:26:45,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 3818438656. Throughput: 0: 45798.3. Samples: 337009580. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 11:26:45,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 11:26:48,651][73497] Updated weights for policy 0, policy_version 233069 (0.0034) [2024-06-13 11:26:50,501][73265] Fps is (10 sec: 47513.7, 60 sec: 46148.4, 300 sec: 45653.0). Total num frames: 3818684416. Throughput: 0: 45796.6. Samples: 337146920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 11:26:50,502][73265] Avg episode reward: [(0, '0.501')] [2024-06-13 11:26:52,861][73497] Updated weights for policy 0, policy_version 233079 (0.0033) [2024-06-13 11:26:55,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45875.1, 300 sec: 45653.0). Total num frames: 3818897408. Throughput: 0: 45711.5. Samples: 337418160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 11:26:55,502][73265] Avg episode reward: [(0, '0.399')] [2024-06-13 11:26:55,917][73497] Updated weights for policy 0, policy_version 233089 (0.0038) [2024-06-13 11:26:59,879][73497] Updated weights for policy 0, policy_version 233099 (0.0034) [2024-06-13 11:27:00,502][73265] Fps is (10 sec: 44236.4, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3819126784. Throughput: 0: 45609.7. Samples: 337689600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 11:27:00,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 11:27:03,211][73497] Updated weights for policy 0, policy_version 233109 (0.0043) [2024-06-13 11:27:05,502][73265] Fps is (10 sec: 45874.8, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 3819356160. Throughput: 0: 45560.2. Samples: 337828600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 11:27:05,502][73265] Avg episode reward: [(0, '0.478')] [2024-06-13 11:27:06,792][73497] Updated weights for policy 0, policy_version 233119 (0.0027) [2024-06-13 11:27:10,362][73497] Updated weights for policy 0, policy_version 233129 (0.0030) [2024-06-13 11:27:10,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45602.1, 300 sec: 45708.6). 
Total num frames: 3819585536. Throughput: 0: 45606.3. Samples: 338105060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 11:27:10,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 11:27:14,105][73497] Updated weights for policy 0, policy_version 233139 (0.0030) [2024-06-13 11:27:15,502][73265] Fps is (10 sec: 44236.9, 60 sec: 45329.0, 300 sec: 45597.9). Total num frames: 3819798528. Throughput: 0: 45610.1. Samples: 338377040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 11:27:15,511][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 11:27:17,587][73497] Updated weights for policy 0, policy_version 233149 (0.0033) [2024-06-13 11:27:20,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45605.6, 300 sec: 45597.5). Total num frames: 3820044288. Throughput: 0: 45541.3. Samples: 338511040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-13 11:27:20,502][73265] Avg episode reward: [(0, '0.474')] [2024-06-13 11:27:21,544][73497] Updated weights for policy 0, policy_version 233159 (0.0036) [2024-06-13 11:27:24,633][73497] Updated weights for policy 0, policy_version 233169 (0.0041) [2024-06-13 11:27:25,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45329.2, 300 sec: 45597.5). Total num frames: 3820257280. Throughput: 0: 45441.8. Samples: 338782340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 11:27:25,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 11:27:28,647][73497] Updated weights for policy 0, policy_version 233179 (0.0039) [2024-06-13 11:27:30,501][73265] Fps is (10 sec: 44237.4, 60 sec: 45330.7, 300 sec: 45542.0). Total num frames: 3820486656. Throughput: 0: 45361.0. Samples: 339050820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 11:27:30,502][73265] Avg episode reward: [(0, '0.485')] [2024-06-13 11:27:31,953][73497] Updated weights for policy 0, policy_version 233189 (0.0029) [2024-06-13 11:27:35,242][73477] Signal inference workers to stop experience collection... 
(4950 times) [2024-06-13 11:27:35,290][73477] Signal inference workers to resume experience collection... (4950 times) [2024-06-13 11:27:35,291][73497] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-06-13 11:27:35,323][73497] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-06-13 11:27:35,417][73497] Updated weights for policy 0, policy_version 233199 (0.0025) [2024-06-13 11:27:35,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45602.2, 300 sec: 45597.9). Total num frames: 3820732416. Throughput: 0: 45560.0. Samples: 339197120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 11:27:35,502][73265] Avg episode reward: [(0, '0.474')] [2024-06-13 11:27:39,204][73497] Updated weights for policy 0, policy_version 233209 (0.0041) [2024-06-13 11:27:40,501][73265] Fps is (10 sec: 49151.6, 60 sec: 46148.3, 300 sec: 45653.0). Total num frames: 3820978176. Throughput: 0: 45648.0. Samples: 339472320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 11:27:40,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 11:27:43,049][73497] Updated weights for policy 0, policy_version 233219 (0.0027) [2024-06-13 11:27:45,502][73265] Fps is (10 sec: 44235.9, 60 sec: 45602.0, 300 sec: 45653.0). Total num frames: 3821174784. Throughput: 0: 45719.9. Samples: 339747000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 11:27:45,502][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 11:27:46,156][73497] Updated weights for policy 0, policy_version 233229 (0.0040) [2024-06-13 11:27:49,975][73497] Updated weights for policy 0, policy_version 233239 (0.0034) [2024-06-13 11:27:50,501][73265] Fps is (10 sec: 42598.2, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3821404160. Throughput: 0: 45559.6. Samples: 339878780. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 11:27:50,502][73265] Avg episode reward: [(0, '0.479')] [2024-06-13 11:27:53,675][73497] Updated weights for policy 0, policy_version 233249 (0.0025) [2024-06-13 11:27:55,501][73265] Fps is (10 sec: 45876.4, 60 sec: 45602.2, 300 sec: 45653.1). Total num frames: 3821633536. Throughput: 0: 45500.6. Samples: 340152580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 11:27:55,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 11:27:57,074][73497] Updated weights for policy 0, policy_version 233259 (0.0030) [2024-06-13 11:28:00,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 3821862912. Throughput: 0: 45478.3. Samples: 340423560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 11:28:00,502][73265] Avg episode reward: [(0, '0.422')] [2024-06-13 11:28:00,733][73497] Updated weights for policy 0, policy_version 233269 (0.0029) [2024-06-13 11:28:04,206][73497] Updated weights for policy 0, policy_version 233279 (0.0036) [2024-06-13 11:28:05,501][73265] Fps is (10 sec: 47513.4, 60 sec: 45875.3, 300 sec: 45597.5). Total num frames: 3822108672. Throughput: 0: 45665.0. Samples: 340565960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 11:28:05,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 11:28:07,884][73497] Updated weights for policy 0, policy_version 233289 (0.0039) [2024-06-13 11:28:10,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3822305280. Throughput: 0: 45719.1. Samples: 340839700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 11:28:10,502][73265] Avg episode reward: [(0, '0.498')] [2024-06-13 11:28:11,706][73497] Updated weights for policy 0, policy_version 233299 (0.0040) [2024-06-13 11:28:15,028][73497] Updated weights for policy 0, policy_version 233309 (0.0038) [2024-06-13 11:28:15,501][73265] Fps is (10 sec: 42598.2, 60 sec: 45602.2, 300 sec: 45542.0). 
Total num frames: 3822534656. Throughput: 0: 45728.3. Samples: 341108600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 11:28:15,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 11:28:18,621][73497] Updated weights for policy 0, policy_version 233319 (0.0041) [2024-06-13 11:28:20,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 3822780416. Throughput: 0: 45565.8. Samples: 341247580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 11:28:20,502][73265] Avg episode reward: [(0, '0.391')] [2024-06-13 11:28:22,452][73497] Updated weights for policy 0, policy_version 233329 (0.0034) [2024-06-13 11:28:25,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 3823009792. Throughput: 0: 45515.1. Samples: 341520500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 11:28:25,502][73265] Avg episode reward: [(0, '0.496')] [2024-06-13 11:28:25,513][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000233338_3823009792.pth... [2024-06-13 11:28:25,564][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000232669_3812048896.pth [2024-06-13 11:28:25,712][73497] Updated weights for policy 0, policy_version 233339 (0.0041) [2024-06-13 11:28:29,490][73497] Updated weights for policy 0, policy_version 233349 (0.0036) [2024-06-13 11:28:30,504][73265] Fps is (10 sec: 45863.8, 60 sec: 45873.3, 300 sec: 45597.1). Total num frames: 3823239168. Throughput: 0: 45698.6. Samples: 341803540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-13 11:28:30,504][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 11:28:32,934][73497] Updated weights for policy 0, policy_version 233359 (0.0033) [2024-06-13 11:28:35,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3823452160. Throughput: 0: 45737.4. Samples: 341936960. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 11:28:35,502][73265] Avg episode reward: [(0, '0.474')] [2024-06-13 11:28:36,440][73497] Updated weights for policy 0, policy_version 233369 (0.0031) [2024-06-13 11:28:40,263][73497] Updated weights for policy 0, policy_version 233379 (0.0041) [2024-06-13 11:28:40,501][73265] Fps is (10 sec: 44247.6, 60 sec: 45056.0, 300 sec: 45597.5). Total num frames: 3823681536. Throughput: 0: 45623.9. Samples: 342205660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 11:28:40,502][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 11:28:44,022][73497] Updated weights for policy 0, policy_version 233389 (0.0029) [2024-06-13 11:28:45,505][73265] Fps is (10 sec: 45858.3, 60 sec: 45599.5, 300 sec: 45596.9). Total num frames: 3823910912. Throughput: 0: 45504.7. Samples: 342471440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 11:28:45,505][73265] Avg episode reward: [(0, '0.415')] [2024-06-13 11:28:47,363][73497] Updated weights for policy 0, policy_version 233399 (0.0041) [2024-06-13 11:28:50,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 3824140288. Throughput: 0: 45390.3. Samples: 342608520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 11:28:50,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 11:28:51,436][73497] Updated weights for policy 0, policy_version 233409 (0.0033) [2024-06-13 11:28:54,640][73497] Updated weights for policy 0, policy_version 233419 (0.0044) [2024-06-13 11:28:55,501][73265] Fps is (10 sec: 47531.2, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 3824386048. Throughput: 0: 45396.9. Samples: 342882560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 11:28:55,502][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 11:28:56,425][73477] Signal inference workers to stop experience collection... 
(5000 times) [2024-06-13 11:28:56,475][73477] Signal inference workers to resume experience collection... (5000 times) [2024-06-13 11:28:56,475][73497] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-06-13 11:28:56,517][73497] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-06-13 11:28:58,394][73497] Updated weights for policy 0, policy_version 233429 (0.0041) [2024-06-13 11:29:00,501][73265] Fps is (10 sec: 44236.4, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3824582656. Throughput: 0: 45443.1. Samples: 343153540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 11:29:00,502][73265] Avg episode reward: [(0, '0.483')] [2024-06-13 11:29:02,060][73497] Updated weights for policy 0, policy_version 233439 (0.0029) [2024-06-13 11:29:05,416][73497] Updated weights for policy 0, policy_version 233449 (0.0040) [2024-06-13 11:29:05,502][73265] Fps is (10 sec: 44236.4, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 3824828416. Throughput: 0: 45255.9. Samples: 343284100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 11:29:05,502][73265] Avg episode reward: [(0, '0.482')] [2024-06-13 11:29:09,098][73497] Updated weights for policy 0, policy_version 233459 (0.0038) [2024-06-13 11:29:10,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 3825041408. Throughput: 0: 45278.8. Samples: 343558040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 11:29:10,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 11:29:12,879][73497] Updated weights for policy 0, policy_version 233469 (0.0038) [2024-06-13 11:29:15,504][73265] Fps is (10 sec: 42588.0, 60 sec: 45327.2, 300 sec: 45430.5). Total num frames: 3825254400. Throughput: 0: 44868.8. Samples: 343822640. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 11:29:15,505][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 11:29:16,345][73497] Updated weights for policy 0, policy_version 233479 (0.0030) [2024-06-13 11:29:20,218][73497] Updated weights for policy 0, policy_version 233489 (0.0030) [2024-06-13 11:29:20,501][73265] Fps is (10 sec: 44236.4, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3825483776. Throughput: 0: 44973.3. Samples: 343960760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 11:29:20,502][73265] Avg episode reward: [(0, '0.480')] [2024-06-13 11:29:23,364][73497] Updated weights for policy 0, policy_version 233499 (0.0041) [2024-06-13 11:29:25,501][73265] Fps is (10 sec: 47526.1, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3825729536. Throughput: 0: 45135.2. Samples: 344236740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 11:29:25,502][73265] Avg episode reward: [(0, '0.487')] [2024-06-13 11:29:26,966][73497] Updated weights for policy 0, policy_version 233509 (0.0036) [2024-06-13 11:29:30,501][73265] Fps is (10 sec: 47514.2, 60 sec: 45331.0, 300 sec: 45597.5). Total num frames: 3825958912. Throughput: 0: 45386.5. Samples: 344513660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 11:29:30,502][73265] Avg episode reward: [(0, '0.510')] [2024-06-13 11:29:30,601][73497] Updated weights for policy 0, policy_version 233519 (0.0036) [2024-06-13 11:29:34,408][73497] Updated weights for policy 0, policy_version 233529 (0.0038) [2024-06-13 11:29:35,502][73265] Fps is (10 sec: 42593.5, 60 sec: 45055.2, 300 sec: 45430.7). Total num frames: 3826155520. Throughput: 0: 45365.5. Samples: 344650020. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-13 11:29:35,503][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 11:29:37,597][73497] Updated weights for policy 0, policy_version 233539 (0.0026) [2024-06-13 11:29:40,504][73265] Fps is (10 sec: 44225.5, 60 sec: 45327.2, 300 sec: 45541.6). Total num frames: 3826401280. Throughput: 0: 45186.4. Samples: 344916060. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-13 11:29:40,504][73265] Avg episode reward: [(0, '0.504')] [2024-06-13 11:29:41,798][73497] Updated weights for policy 0, policy_version 233549 (0.0037) [2024-06-13 11:29:45,145][73497] Updated weights for policy 0, policy_version 233559 (0.0034) [2024-06-13 11:29:45,502][73265] Fps is (10 sec: 49156.7, 60 sec: 45604.9, 300 sec: 45654.0). Total num frames: 3826647040. Throughput: 0: 45260.8. Samples: 345190280. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-13 11:29:45,502][73265] Avg episode reward: [(0, '0.495')] [2024-06-13 11:29:48,992][73497] Updated weights for policy 0, policy_version 233569 (0.0028) [2024-06-13 11:29:50,501][73265] Fps is (10 sec: 44247.9, 60 sec: 45056.0, 300 sec: 45486.4). Total num frames: 3826843648. Throughput: 0: 45480.6. Samples: 345330720. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-13 11:29:50,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 11:29:52,062][73497] Updated weights for policy 0, policy_version 233579 (0.0042) [2024-06-13 11:29:53,574][73477] Signal inference workers to stop experience collection... (5050 times) [2024-06-13 11:29:53,575][73477] Signal inference workers to resume experience collection... (5050 times) [2024-06-13 11:29:53,629][73497] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-06-13 11:29:53,629][73497] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-06-13 11:29:55,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 3827105792. Throughput: 0: 45360.7. 
Samples: 345599280. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-13 11:29:55,502][73265] Avg episode reward: [(0, '0.476')] [2024-06-13 11:29:55,783][73497] Updated weights for policy 0, policy_version 233589 (0.0034) [2024-06-13 11:29:59,370][73497] Updated weights for policy 0, policy_version 233599 (0.0028) [2024-06-13 11:30:00,501][73265] Fps is (10 sec: 47513.2, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3827318784. Throughput: 0: 45455.4. Samples: 345868020. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-13 11:30:00,502][73265] Avg episode reward: [(0, '0.510')] [2024-06-13 11:30:02,867][73497] Updated weights for policy 0, policy_version 233609 (0.0033) [2024-06-13 11:30:05,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45056.0, 300 sec: 45541.9). Total num frames: 3827531776. Throughput: 0: 45492.9. Samples: 346007940. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-13 11:30:05,502][73265] Avg episode reward: [(0, '0.413')] [2024-06-13 11:30:06,526][73497] Updated weights for policy 0, policy_version 233619 (0.0040) [2024-06-13 11:30:10,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45329.0, 300 sec: 45486.4). Total num frames: 3827761152. Throughput: 0: 45479.9. Samples: 346283340. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-13 11:30:10,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 11:30:10,577][73497] Updated weights for policy 0, policy_version 233629 (0.0035) [2024-06-13 11:30:13,853][73497] Updated weights for policy 0, policy_version 233639 (0.0034) [2024-06-13 11:30:15,502][73265] Fps is (10 sec: 45875.0, 60 sec: 45604.0, 300 sec: 45653.0). Total num frames: 3827990528. Throughput: 0: 45295.4. Samples: 346551960. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-13 11:30:15,502][73265] Avg episode reward: [(0, '0.472')] [2024-06-13 11:30:17,738][73497] Updated weights for policy 0, policy_version 233649 (0.0033) [2024-06-13 11:30:20,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 3828236288. Throughput: 0: 45403.7. Samples: 346693140. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-13 11:30:20,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 11:30:20,803][73497] Updated weights for policy 0, policy_version 233659 (0.0034) [2024-06-13 11:30:24,599][73497] Updated weights for policy 0, policy_version 233669 (0.0036) [2024-06-13 11:30:25,502][73265] Fps is (10 sec: 44236.5, 60 sec: 45055.8, 300 sec: 45486.4). Total num frames: 3828432896. Throughput: 0: 45586.8. Samples: 346967360. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-13 11:30:25,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 11:30:25,659][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000233670_3828449280.pth... [2024-06-13 11:30:25,704][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000233004_3817537536.pth [2024-06-13 11:30:28,143][73497] Updated weights for policy 0, policy_version 233679 (0.0035) [2024-06-13 11:30:30,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45055.9, 300 sec: 45542.0). Total num frames: 3828662272. Throughput: 0: 45613.0. Samples: 347242860. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-13 11:30:30,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 11:30:31,961][73497] Updated weights for policy 0, policy_version 233689 (0.0032) [2024-06-13 11:30:35,496][73497] Updated weights for policy 0, policy_version 233699 (0.0045) [2024-06-13 11:30:35,501][73265] Fps is (10 sec: 49152.4, 60 sec: 46149.0, 300 sec: 45653.4). Total num frames: 3828924416. Throughput: 0: 45422.1. Samples: 347374720. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-13 11:30:35,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 11:30:39,703][73497] Updated weights for policy 0, policy_version 233709 (0.0035) [2024-06-13 11:30:40,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45330.9, 300 sec: 45430.9). Total num frames: 3829121024. Throughput: 0: 45239.7. Samples: 347635060. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-13 11:30:40,502][73265] Avg episode reward: [(0, '0.478')] [2024-06-13 11:30:42,907][73497] Updated weights for policy 0, policy_version 233719 (0.0026) [2024-06-13 11:30:45,501][73265] Fps is (10 sec: 42598.7, 60 sec: 45056.1, 300 sec: 45542.0). Total num frames: 3829350400. Throughput: 0: 45520.5. Samples: 347916440. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-13 11:30:45,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 11:30:46,708][73497] Updated weights for policy 0, policy_version 233729 (0.0029) [2024-06-13 11:30:49,745][73497] Updated weights for policy 0, policy_version 233739 (0.0039) [2024-06-13 11:30:50,504][73265] Fps is (10 sec: 47501.8, 60 sec: 45873.3, 300 sec: 45597.1). Total num frames: 3829596160. Throughput: 0: 45613.5. Samples: 348060660. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-13 11:30:50,504][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 11:30:53,810][73497] Updated weights for policy 0, policy_version 233749 (0.0030) [2024-06-13 11:30:55,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3829825536. Throughput: 0: 45540.8. Samples: 348332680. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-13 11:30:55,502][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 11:30:57,063][73497] Updated weights for policy 0, policy_version 233759 (0.0035) [2024-06-13 11:31:00,501][73265] Fps is (10 sec: 44248.0, 60 sec: 45329.2, 300 sec: 45542.0). Total num frames: 3830038528. Throughput: 0: 45567.3. Samples: 348602480. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-13 11:31:00,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 11:31:00,789][73497] Updated weights for policy 0, policy_version 233769 (0.0024) [2024-06-13 11:31:03,442][73477] Signal inference workers to stop experience collection... (5100 times) [2024-06-13 11:31:03,494][73497] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-06-13 11:31:03,494][73477] Signal inference workers to resume experience collection... (5100 times) [2024-06-13 11:31:03,519][73497] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-06-13 11:31:04,490][73497] Updated weights for policy 0, policy_version 233779 (0.0026) [2024-06-13 11:31:05,501][73265] Fps is (10 sec: 42598.8, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3830251520. Throughput: 0: 45517.4. Samples: 348741420. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-13 11:31:05,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 11:31:08,089][73497] Updated weights for policy 0, policy_version 233789 (0.0035) [2024-06-13 11:31:10,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3830513664. Throughput: 0: 45371.3. Samples: 349009060. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-13 11:31:10,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 11:31:11,811][73497] Updated weights for policy 0, policy_version 233799 (0.0022) [2024-06-13 11:31:15,502][73265] Fps is (10 sec: 45874.5, 60 sec: 45329.1, 300 sec: 45431.6). Total num frames: 3830710272. Throughput: 0: 45553.2. Samples: 349292760. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-13 11:31:15,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 11:31:15,554][73497] Updated weights for policy 0, policy_version 233809 (0.0034) [2024-06-13 11:31:18,806][73497] Updated weights for policy 0, policy_version 233819 (0.0027) [2024-06-13 11:31:20,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3830972416. Throughput: 0: 45559.6. Samples: 349424900. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-13 11:31:20,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 11:31:22,688][73497] Updated weights for policy 0, policy_version 233829 (0.0039) [2024-06-13 11:31:25,502][73265] Fps is (10 sec: 47513.4, 60 sec: 45875.2, 300 sec: 45486.7). Total num frames: 3831185408. Throughput: 0: 45891.8. Samples: 349700200. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-13 11:31:25,502][73265] Avg episode reward: [(0, '0.477')] [2024-06-13 11:31:25,901][73497] Updated weights for policy 0, policy_version 233839 (0.0032) [2024-06-13 11:31:29,797][73497] Updated weights for policy 0, policy_version 233849 (0.0040) [2024-06-13 11:31:30,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45875.1, 300 sec: 45486.4). Total num frames: 3831414784. Throughput: 0: 45766.2. Samples: 349975920. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-13 11:31:30,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 11:31:33,411][73497] Updated weights for policy 0, policy_version 233859 (0.0034) [2024-06-13 11:31:35,501][73265] Fps is (10 sec: 44237.6, 60 sec: 45056.1, 300 sec: 45486.4). Total num frames: 3831627776. Throughput: 0: 45594.5. Samples: 350112300. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-13 11:31:35,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 11:31:36,806][73497] Updated weights for policy 0, policy_version 233869 (0.0032) [2024-06-13 11:31:40,406][73497] Updated weights for policy 0, policy_version 233879 (0.0028) [2024-06-13 11:31:40,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3831873536. Throughput: 0: 45786.8. Samples: 350393080. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-13 11:31:40,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 11:31:44,091][73497] Updated weights for policy 0, policy_version 233889 (0.0049) [2024-06-13 11:31:45,501][73265] Fps is (10 sec: 49152.1, 60 sec: 46148.3, 300 sec: 45542.0). Total num frames: 3832119296. Throughput: 0: 46108.0. Samples: 350677340. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-13 11:31:45,502][73265] Avg episode reward: [(0, '0.413')] [2024-06-13 11:31:47,198][73497] Updated weights for policy 0, policy_version 233899 (0.0024) [2024-06-13 11:31:50,501][73265] Fps is (10 sec: 44236.4, 60 sec: 45330.9, 300 sec: 45486.4). Total num frames: 3832315904. Throughput: 0: 45992.3. Samples: 350811080. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-13 11:31:50,502][73265] Avg episode reward: [(0, '0.433')] [2024-06-13 11:31:50,994][73497] Updated weights for policy 0, policy_version 233909 (0.0029) [2024-06-13 11:31:54,557][73497] Updated weights for policy 0, policy_version 233919 (0.0030) [2024-06-13 11:31:55,501][73265] Fps is (10 sec: 44236.2, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3832561664. Throughput: 0: 46261.7. Samples: 351090840. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-13 11:31:55,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 11:31:58,148][73497] Updated weights for policy 0, policy_version 233929 (0.0041) [2024-06-13 11:32:00,501][73265] Fps is (10 sec: 47514.0, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3832791040. Throughput: 0: 45788.6. Samples: 351353240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-13 11:32:00,502][73265] Avg episode reward: [(0, '0.480')] [2024-06-13 11:32:01,841][73497] Updated weights for policy 0, policy_version 233939 (0.0057) [2024-06-13 11:32:05,345][73497] Updated weights for policy 0, policy_version 233949 (0.0034) [2024-06-13 11:32:05,501][73265] Fps is (10 sec: 45875.7, 60 sec: 46148.3, 300 sec: 45542.0). Total num frames: 3833020416. Throughput: 0: 46022.3. Samples: 351495900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-13 11:32:05,502][73265] Avg episode reward: [(0, '0.477')] [2024-06-13 11:32:08,558][73497] Updated weights for policy 0, policy_version 233959 (0.0034) [2024-06-13 11:32:10,502][73265] Fps is (10 sec: 47512.8, 60 sec: 45875.1, 300 sec: 45653.0). Total num frames: 3833266176. Throughput: 0: 46197.3. Samples: 351779080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-13 11:32:10,502][73265] Avg episode reward: [(0, '0.487')] [2024-06-13 11:32:12,526][73497] Updated weights for policy 0, policy_version 233969 (0.0038) [2024-06-13 11:32:15,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45875.3, 300 sec: 45486.4). Total num frames: 3833462784. Throughput: 0: 46234.3. Samples: 352056460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-13 11:32:15,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 11:32:15,964][73477] Signal inference workers to stop experience collection... (5150 times) [2024-06-13 11:32:15,964][73477] Signal inference workers to resume experience collection... 
(5150 times) [2024-06-13 11:32:16,006][73497] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-06-13 11:32:16,006][73497] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-06-13 11:32:16,117][73497] Updated weights for policy 0, policy_version 233979 (0.0034) [2024-06-13 11:32:19,631][73497] Updated weights for policy 0, policy_version 233989 (0.0032) [2024-06-13 11:32:20,502][73265] Fps is (10 sec: 47513.6, 60 sec: 46148.2, 300 sec: 45708.6). Total num frames: 3833741312. Throughput: 0: 46079.8. Samples: 352185900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-13 11:32:20,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 11:32:23,211][73497] Updated weights for policy 0, policy_version 233999 (0.0029) [2024-06-13 11:32:25,501][73265] Fps is (10 sec: 47513.2, 60 sec: 45875.3, 300 sec: 45597.5). Total num frames: 3833937920. Throughput: 0: 46131.0. Samples: 352468980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-13 11:32:25,502][73265] Avg episode reward: [(0, '0.488')] [2024-06-13 11:32:25,524][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000234006_3833954304.pth... [2024-06-13 11:32:25,581][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000233338_3823009792.pth [2024-06-13 11:32:26,589][73497] Updated weights for policy 0, policy_version 234009 (0.0033) [2024-06-13 11:32:30,430][73497] Updated weights for policy 0, policy_version 234019 (0.0028) [2024-06-13 11:32:30,501][73265] Fps is (10 sec: 42598.9, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3834167296. Throughput: 0: 46033.3. Samples: 352748840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-13 11:32:30,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 11:32:33,431][73497] Updated weights for policy 0, policy_version 234029 (0.0035) [2024-06-13 11:32:35,505][73265] Fps is (10 sec: 47495.4, 60 sec: 46418.3, 300 sec: 45541.4). 
Total num frames: 3834413056. Throughput: 0: 45885.0. Samples: 352876080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-13 11:32:35,506][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 11:32:37,417][73497] Updated weights for policy 0, policy_version 234039 (0.0032) [2024-06-13 11:32:40,501][73265] Fps is (10 sec: 47513.3, 60 sec: 46148.2, 300 sec: 45653.1). Total num frames: 3834642432. Throughput: 0: 45844.0. Samples: 353153820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-13 11:32:40,502][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 11:32:40,641][73497] Updated weights for policy 0, policy_version 234049 (0.0025) [2024-06-13 11:32:44,508][73497] Updated weights for policy 0, policy_version 234059 (0.0030) [2024-06-13 11:32:45,501][73265] Fps is (10 sec: 44254.0, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3834855424. Throughput: 0: 46308.4. Samples: 353437120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-13 11:32:45,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 11:32:47,744][73497] Updated weights for policy 0, policy_version 234069 (0.0034) [2024-06-13 11:32:50,501][73265] Fps is (10 sec: 44237.1, 60 sec: 46148.3, 300 sec: 45597.5). Total num frames: 3835084800. Throughput: 0: 45988.8. Samples: 353565400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-13 11:32:50,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 11:32:52,015][73497] Updated weights for policy 0, policy_version 234079 (0.0035) [2024-06-13 11:32:55,116][73497] Updated weights for policy 0, policy_version 234089 (0.0038) [2024-06-13 11:32:55,502][73265] Fps is (10 sec: 47513.1, 60 sec: 46148.2, 300 sec: 45653.0). Total num frames: 3835330560. Throughput: 0: 45588.9. Samples: 353830580. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-13 11:32:55,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 11:32:59,198][73497] Updated weights for policy 0, policy_version 234099 (0.0044) [2024-06-13 11:33:00,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 3835543552. Throughput: 0: 45614.2. Samples: 354109100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-13 11:33:00,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 11:33:02,132][73497] Updated weights for policy 0, policy_version 234109 (0.0037) [2024-06-13 11:33:05,501][73265] Fps is (10 sec: 42599.3, 60 sec: 45602.2, 300 sec: 45597.5). Total num frames: 3835756544. Throughput: 0: 45755.3. Samples: 354244880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-13 11:33:05,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 11:33:05,919][73497] Updated weights for policy 0, policy_version 234119 (0.0037) [2024-06-13 11:33:08,829][73497] Updated weights for policy 0, policy_version 234129 (0.0031) [2024-06-13 11:33:10,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45875.3, 300 sec: 45708.6). Total num frames: 3836018688. Throughput: 0: 45729.0. Samples: 354526780. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-13 11:33:10,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 11:33:13,356][73497] Updated weights for policy 0, policy_version 234139 (0.0045) [2024-06-13 11:33:15,502][73265] Fps is (10 sec: 50789.4, 60 sec: 46694.3, 300 sec: 45708.6). Total num frames: 3836264448. Throughput: 0: 45717.7. Samples: 354806140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-13 11:33:15,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 11:33:16,020][73497] Updated weights for policy 0, policy_version 234149 (0.0026) [2024-06-13 11:33:20,501][73265] Fps is (10 sec: 42598.3, 60 sec: 45056.1, 300 sec: 45542.0). Total num frames: 3836444672. Throughput: 0: 46083.5. Samples: 354949660. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-13 11:33:20,502][73265] Avg episode reward: [(0, '0.482')] [2024-06-13 11:33:20,602][73497] Updated weights for policy 0, policy_version 234159 (0.0033) [2024-06-13 11:33:23,347][73497] Updated weights for policy 0, policy_version 234169 (0.0039) [2024-06-13 11:33:25,501][73265] Fps is (10 sec: 42598.9, 60 sec: 45875.3, 300 sec: 45597.9). Total num frames: 3836690432. Throughput: 0: 45841.0. Samples: 355216660. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-13 11:33:25,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 11:33:27,685][73497] Updated weights for policy 0, policy_version 234179 (0.0027) [2024-06-13 11:33:30,170][73497] Updated weights for policy 0, policy_version 234189 (0.0035) [2024-06-13 11:33:30,502][73265] Fps is (10 sec: 52428.2, 60 sec: 46694.3, 300 sec: 45819.6). Total num frames: 3836968960. Throughput: 0: 45795.9. Samples: 355497940. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-13 11:33:30,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 11:33:34,511][73497] Updated weights for policy 0, policy_version 234199 (0.0027) [2024-06-13 11:33:35,327][73477] Signal inference workers to stop experience collection... (5200 times) [2024-06-13 11:33:35,375][73497] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-06-13 11:33:35,385][73477] Signal inference workers to resume experience collection... (5200 times) [2024-06-13 11:33:35,387][73497] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-06-13 11:33:35,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45605.1, 300 sec: 45653.0). Total num frames: 3837149184. Throughput: 0: 46221.3. Samples: 355645360. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-13 11:33:35,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 11:33:37,057][73497] Updated weights for policy 0, policy_version 234209 (0.0029) [2024-06-13 11:33:40,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45875.2, 300 sec: 45709.1). Total num frames: 3837394944. Throughput: 0: 46388.0. Samples: 355918040. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-13 11:33:40,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 11:33:41,943][73497] Updated weights for policy 0, policy_version 234219 (0.0025) [2024-06-13 11:33:44,247][73497] Updated weights for policy 0, policy_version 234229 (0.0039) [2024-06-13 11:33:45,501][73265] Fps is (10 sec: 49152.1, 60 sec: 46421.3, 300 sec: 45764.1). Total num frames: 3837640704. Throughput: 0: 46055.9. Samples: 356181620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-13 11:33:45,502][73265] Avg episode reward: [(0, '0.408')] [2024-06-13 11:33:49,025][73497] Updated weights for policy 0, policy_version 234239 (0.0033) [2024-06-13 11:33:50,501][73265] Fps is (10 sec: 47513.7, 60 sec: 46421.3, 300 sec: 45708.6). Total num frames: 3837870080. Throughput: 0: 46362.1. Samples: 356331180. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-13 11:33:50,502][73265] Avg episode reward: [(0, '0.474')] [2024-06-13 11:33:51,564][73497] Updated weights for policy 0, policy_version 234249 (0.0044) [2024-06-13 11:33:55,501][73265] Fps is (10 sec: 42598.2, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 3838066688. Throughput: 0: 46206.6. Samples: 356606080. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-13 11:33:55,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 11:33:56,109][73497] Updated weights for policy 0, policy_version 234259 (0.0034) [2024-06-13 11:33:58,723][73497] Updated weights for policy 0, policy_version 234269 (0.0030) [2024-06-13 11:34:00,504][73265] Fps is (10 sec: 44226.1, 60 sec: 46146.4, 300 sec: 45708.2). Total num frames: 3838312448. Throughput: 0: 45906.5. Samples: 356872040. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-13 11:34:00,504][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 11:34:03,166][73497] Updated weights for policy 0, policy_version 234279 (0.0031) [2024-06-13 11:34:05,501][73265] Fps is (10 sec: 50790.9, 60 sec: 46967.4, 300 sec: 45875.2). Total num frames: 3838574592. Throughput: 0: 46097.4. Samples: 357024040. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-13 11:34:05,502][73265] Avg episode reward: [(0, '0.390')] [2024-06-13 11:34:05,622][73497] Updated weights for policy 0, policy_version 234289 (0.0024) [2024-06-13 11:34:10,200][73497] Updated weights for policy 0, policy_version 234299 (0.0031) [2024-06-13 11:34:10,504][73265] Fps is (10 sec: 44236.6, 60 sec: 45600.2, 300 sec: 45764.1). Total num frames: 3838754816. Throughput: 0: 46209.4. Samples: 357296200. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-13 11:34:10,504][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 11:34:12,978][73497] Updated weights for policy 0, policy_version 234309 (0.0024) [2024-06-13 11:34:15,501][73265] Fps is (10 sec: 42598.2, 60 sec: 45602.2, 300 sec: 45819.7). Total num frames: 3839000576. Throughput: 0: 45934.3. Samples: 357564980. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-13 11:34:15,502][73265] Avg episode reward: [(0, '0.476')] [2024-06-13 11:34:17,575][73497] Updated weights for policy 0, policy_version 234319 (0.0033) [2024-06-13 11:34:20,263][73497] Updated weights for policy 0, policy_version 234329 (0.0026) [2024-06-13 11:34:20,501][73265] Fps is (10 sec: 49164.0, 60 sec: 46694.4, 300 sec: 45819.6). Total num frames: 3839246336. Throughput: 0: 45732.0. Samples: 357703300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 11:34:20,502][73265] Avg episode reward: [(0, '0.497')] [2024-06-13 11:34:24,571][73497] Updated weights for policy 0, policy_version 234339 (0.0039) [2024-06-13 11:34:25,501][73265] Fps is (10 sec: 45875.3, 60 sec: 46148.3, 300 sec: 45764.1). Total num frames: 3839459328. Throughput: 0: 45586.3. Samples: 357969420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 11:34:25,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 11:34:25,521][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000234342_3839459328.pth... [2024-06-13 11:34:25,566][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000233670_3828449280.pth [2024-06-13 11:34:27,586][73497] Updated weights for policy 0, policy_version 234349 (0.0035) [2024-06-13 11:34:30,501][73265] Fps is (10 sec: 42598.9, 60 sec: 45056.1, 300 sec: 45819.8). Total num frames: 3839672320. Throughput: 0: 45785.0. Samples: 358241940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 11:34:30,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 11:34:31,855][73497] Updated weights for policy 0, policy_version 234359 (0.0045) [2024-06-13 11:34:34,711][73497] Updated weights for policy 0, policy_version 234369 (0.0042) [2024-06-13 11:34:35,501][73265] Fps is (10 sec: 45875.0, 60 sec: 46148.3, 300 sec: 45820.0). Total num frames: 3839918080. Throughput: 0: 45501.8. Samples: 358378760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 11:34:35,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 11:34:39,125][73497] Updated weights for policy 0, policy_version 234379 (0.0027) [2024-06-13 11:34:40,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 3840147456. Throughput: 0: 45440.5. Samples: 358650900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 11:34:40,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 11:34:41,883][73497] Updated weights for policy 0, policy_version 234389 (0.0035) [2024-06-13 11:34:45,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45056.0, 300 sec: 45764.1). Total num frames: 3840344064. Throughput: 0: 45478.9. Samples: 358918480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 11:34:45,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 11:34:46,243][73497] Updated weights for policy 0, policy_version 234399 (0.0028) [2024-06-13 11:34:49,237][73477] Signal inference workers to stop experience collection... (5250 times) [2024-06-13 11:34:49,237][73477] Signal inference workers to resume experience collection... (5250 times) [2024-06-13 11:34:49,282][73497] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-06-13 11:34:49,282][73497] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-06-13 11:34:49,363][73497] Updated weights for policy 0, policy_version 234409 (0.0030) [2024-06-13 11:34:50,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45329.2, 300 sec: 45708.6). Total num frames: 3840589824. Throughput: 0: 45090.2. Samples: 359053100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 11:34:50,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 11:34:53,560][73497] Updated weights for policy 0, policy_version 234419 (0.0033) [2024-06-13 11:34:55,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 3840819200. Throughput: 0: 45087.4. 
Samples: 359325020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 11:34:55,502][73265] Avg episode reward: [(0, '0.381')] [2024-06-13 11:34:56,543][73497] Updated weights for policy 0, policy_version 234429 (0.0034) [2024-06-13 11:35:00,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45057.9, 300 sec: 45708.6). Total num frames: 3841015808. Throughput: 0: 45479.6. Samples: 359611560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 11:35:00,502][73265] Avg episode reward: [(0, '0.454')] [2024-06-13 11:35:00,835][73497] Updated weights for policy 0, policy_version 234439 (0.0026) [2024-06-13 11:35:03,664][73497] Updated weights for policy 0, policy_version 234449 (0.0034) [2024-06-13 11:35:05,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45056.0, 300 sec: 45819.7). Total num frames: 3841277952. Throughput: 0: 45219.2. Samples: 359738160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 11:35:05,502][73265] Avg episode reward: [(0, '0.488')] [2024-06-13 11:35:08,043][73497] Updated weights for policy 0, policy_version 234459 (0.0025) [2024-06-13 11:35:10,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45604.1, 300 sec: 45764.1). Total num frames: 3841490944. Throughput: 0: 45267.6. Samples: 360006460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 11:35:10,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 11:35:11,005][73497] Updated weights for policy 0, policy_version 234469 (0.0041) [2024-06-13 11:35:15,043][73497] Updated weights for policy 0, policy_version 234479 (0.0035) [2024-06-13 11:35:15,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45602.1, 300 sec: 45764.1). Total num frames: 3841736704. Throughput: 0: 45444.8. Samples: 360286960. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 11:35:15,508][73265] Avg episode reward: [(0, '0.512')] [2024-06-13 11:35:18,583][73497] Updated weights for policy 0, policy_version 234489 (0.0038) [2024-06-13 11:35:20,501][73265] Fps is (10 sec: 44236.8, 60 sec: 44783.0, 300 sec: 45764.2). Total num frames: 3841933312. Throughput: 0: 45319.2. Samples: 360418120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 11:35:20,502][73265] Avg episode reward: [(0, '0.493')] [2024-06-13 11:35:22,145][73497] Updated weights for policy 0, policy_version 234499 (0.0021) [2024-06-13 11:35:25,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.0, 300 sec: 45819.7). Total num frames: 3842179072. Throughput: 0: 45270.2. Samples: 360688060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-13 11:35:25,502][73265] Avg episode reward: [(0, '0.481')] [2024-06-13 11:35:25,607][73497] Updated weights for policy 0, policy_version 234509 (0.0031) [2024-06-13 11:35:29,495][73497] Updated weights for policy 0, policy_version 234519 (0.0037) [2024-06-13 11:35:30,502][73265] Fps is (10 sec: 47512.3, 60 sec: 45601.9, 300 sec: 45708.6). Total num frames: 3842408448. Throughput: 0: 45619.3. Samples: 360971360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-13 11:35:30,502][73265] Avg episode reward: [(0, '0.514')] [2024-06-13 11:35:32,791][73497] Updated weights for policy 0, policy_version 234529 (0.0030) [2024-06-13 11:35:35,501][73265] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 45708.6). Total num frames: 3842605056. Throughput: 0: 45495.0. Samples: 361100380. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-13 11:35:35,503][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 11:35:36,600][73497] Updated weights for policy 0, policy_version 234539 (0.0040) [2024-06-13 11:35:40,222][73497] Updated weights for policy 0, policy_version 234549 (0.0030) [2024-06-13 11:35:40,501][73265] Fps is (10 sec: 44237.7, 60 sec: 45056.0, 300 sec: 45764.1). Total num frames: 3842850816. Throughput: 0: 45568.5. Samples: 361375600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-13 11:35:40,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 11:35:43,674][73497] Updated weights for policy 0, policy_version 234559 (0.0045) [2024-06-13 11:35:45,501][73265] Fps is (10 sec: 49152.3, 60 sec: 45875.2, 300 sec: 45764.5). Total num frames: 3843096576. Throughput: 0: 45118.1. Samples: 361641880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-13 11:35:45,502][73265] Avg episode reward: [(0, '0.474')] [2024-06-13 11:35:47,522][73497] Updated weights for policy 0, policy_version 234569 (0.0031) [2024-06-13 11:35:50,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45602.1, 300 sec: 45764.1). Total num frames: 3843325952. Throughput: 0: 45548.9. Samples: 361787860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-13 11:35:50,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 11:35:50,743][73497] Updated weights for policy 0, policy_version 234579 (0.0029) [2024-06-13 11:35:54,462][73497] Updated weights for policy 0, policy_version 234589 (0.0034) [2024-06-13 11:35:55,503][73265] Fps is (10 sec: 45867.7, 60 sec: 45600.9, 300 sec: 45819.4). Total num frames: 3843555328. Throughput: 0: 45658.7. Samples: 362061180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-13 11:35:55,504][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 11:35:57,959][73497] Updated weights for policy 0, policy_version 234599 (0.0041) [2024-06-13 11:36:00,501][73265] Fps is (10 sec: 44237.2, 60 sec: 45875.2, 300 sec: 45819.7). 
Total num frames: 3843768320. Throughput: 0: 45375.7. Samples: 362328860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-13 11:36:00,502][73265] Avg episode reward: [(0, '0.546')] [2024-06-13 11:36:00,502][73477] Saving new best policy, reward=0.546! [2024-06-13 11:36:01,799][73497] Updated weights for policy 0, policy_version 234609 (0.0046) [2024-06-13 11:36:05,091][73497] Updated weights for policy 0, policy_version 234619 (0.0032) [2024-06-13 11:36:05,501][73265] Fps is (10 sec: 44243.7, 60 sec: 45329.0, 300 sec: 45708.6). Total num frames: 3843997696. Throughput: 0: 45515.4. Samples: 362466320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-13 11:36:05,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 11:36:09,494][73497] Updated weights for policy 0, policy_version 234629 (0.0031) [2024-06-13 11:36:10,503][73265] Fps is (10 sec: 45867.2, 60 sec: 45600.8, 300 sec: 45819.4). Total num frames: 3844227072. Throughput: 0: 45681.5. Samples: 362743800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-13 11:36:10,503][73265] Avg episode reward: [(0, '0.502')] [2024-06-13 11:36:12,254][73497] Updated weights for policy 0, policy_version 234639 (0.0027) [2024-06-13 11:36:13,892][73477] Signal inference workers to stop experience collection... (5300 times) [2024-06-13 11:36:13,925][73497] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-06-13 11:36:13,952][73477] Signal inference workers to resume experience collection... (5300 times) [2024-06-13 11:36:13,952][73497] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-06-13 11:36:15,501][73265] Fps is (10 sec: 42598.9, 60 sec: 44783.0, 300 sec: 45597.5). Total num frames: 3844423680. Throughput: 0: 45438.9. Samples: 363016100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-13 11:36:15,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 11:36:16,608][73497] Updated weights for policy 0, policy_version 234649 (0.0037) [2024-06-13 11:36:19,067][73497] Updated weights for policy 0, policy_version 234659 (0.0038) [2024-06-13 11:36:20,501][73265] Fps is (10 sec: 45883.2, 60 sec: 45875.2, 300 sec: 45764.2). Total num frames: 3844685824. Throughput: 0: 45557.1. Samples: 363150440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-13 11:36:20,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 11:36:23,534][73497] Updated weights for policy 0, policy_version 234669 (0.0027) [2024-06-13 11:36:25,501][73265] Fps is (10 sec: 49151.8, 60 sec: 45602.1, 300 sec: 45764.1). Total num frames: 3844915200. Throughput: 0: 45566.6. Samples: 363426100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-13 11:36:25,502][73265] Avg episode reward: [(0, '0.495')] [2024-06-13 11:36:25,527][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000234675_3844915200.pth... [2024-06-13 11:36:25,583][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000234006_3833954304.pth [2024-06-13 11:36:26,660][73497] Updated weights for policy 0, policy_version 234679 (0.0031) [2024-06-13 11:36:30,501][73265] Fps is (10 sec: 44236.3, 60 sec: 45329.2, 300 sec: 45764.1). Total num frames: 3845128192. Throughput: 0: 45828.0. Samples: 363704140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-13 11:36:30,502][73265] Avg episode reward: [(0, '0.476')] [2024-06-13 11:36:30,539][73497] Updated weights for policy 0, policy_version 234689 (0.0033) [2024-06-13 11:36:33,779][73497] Updated weights for policy 0, policy_version 234699 (0.0035) [2024-06-13 11:36:35,501][73265] Fps is (10 sec: 45875.2, 60 sec: 46148.3, 300 sec: 45764.1). Total num frames: 3845373952. Throughput: 0: 45438.6. Samples: 363832600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 11:36:35,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 11:36:38,187][73497] Updated weights for policy 0, policy_version 234709 (0.0036) [2024-06-13 11:36:40,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 3845603328. Throughput: 0: 45474.1. Samples: 364107440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 11:36:40,502][73265] Avg episode reward: [(0, '0.424')] [2024-06-13 11:36:41,325][73497] Updated weights for policy 0, policy_version 234719 (0.0036) [2024-06-13 11:36:45,194][73497] Updated weights for policy 0, policy_version 234729 (0.0023) [2024-06-13 11:36:45,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45056.0, 300 sec: 45708.6). Total num frames: 3845799936. Throughput: 0: 45906.6. Samples: 364394660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 11:36:45,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 11:36:47,991][73497] Updated weights for policy 0, policy_version 234739 (0.0028) [2024-06-13 11:36:50,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45329.1, 300 sec: 45708.6). Total num frames: 3846045696. Throughput: 0: 45619.2. Samples: 364519180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 11:36:50,502][73265] Avg episode reward: [(0, '0.476')] [2024-06-13 11:36:52,046][73497] Updated weights for policy 0, policy_version 234749 (0.0034) [2024-06-13 11:36:54,965][73497] Updated weights for policy 0, policy_version 234759 (0.0029) [2024-06-13 11:36:55,501][73265] Fps is (10 sec: 49152.1, 60 sec: 45603.4, 300 sec: 45764.1). Total num frames: 3846291456. Throughput: 0: 45675.9. Samples: 364799140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 11:36:55,502][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 11:36:59,685][73497] Updated weights for policy 0, policy_version 234769 (0.0032) [2024-06-13 11:37:00,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45602.0, 300 sec: 45708.6). Total num frames: 3846504448. Throughput: 0: 45655.5. Samples: 365070600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 11:37:00,502][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 11:37:02,243][73497] Updated weights for policy 0, policy_version 234779 (0.0045) [2024-06-13 11:37:05,501][73265] Fps is (10 sec: 42598.1, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3846717440. Throughput: 0: 45532.7. Samples: 365199420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 11:37:05,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 11:37:06,697][73497] Updated weights for policy 0, policy_version 234789 (0.0034) [2024-06-13 11:37:09,643][73497] Updated weights for policy 0, policy_version 234799 (0.0033) [2024-06-13 11:37:10,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45603.4, 300 sec: 45764.1). Total num frames: 3846963200. Throughput: 0: 45441.4. Samples: 365470960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 11:37:10,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 11:37:14,145][73497] Updated weights for policy 0, policy_version 234809 (0.0031) [2024-06-13 11:37:15,502][73265] Fps is (10 sec: 47513.4, 60 sec: 46148.2, 300 sec: 45597.5). Total num frames: 3847192576. Throughput: 0: 45644.4. Samples: 365758140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 11:37:15,502][73265] Avg episode reward: [(0, '0.481')] [2024-06-13 11:37:16,623][73497] Updated weights for policy 0, policy_version 234819 (0.0031) [2024-06-13 11:37:20,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45329.0, 300 sec: 45653.1). Total num frames: 3847405568. Throughput: 0: 45811.6. Samples: 365894120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 11:37:20,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 11:37:20,982][73497] Updated weights for policy 0, policy_version 234829 (0.0027) [2024-06-13 11:37:23,450][73497] Updated weights for policy 0, policy_version 234839 (0.0041) [2024-06-13 11:37:25,502][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 3847651328. Throughput: 0: 45778.6. Samples: 366167480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 11:37:25,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 11:37:28,037][73477] Signal inference workers to stop experience collection... (5350 times) [2024-06-13 11:37:28,039][73477] Signal inference workers to resume experience collection... (5350 times) [2024-06-13 11:37:28,084][73497] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-06-13 11:37:28,085][73497] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-06-13 11:37:28,165][73497] Updated weights for policy 0, policy_version 234849 (0.0028) [2024-06-13 11:37:30,504][73265] Fps is (10 sec: 50777.6, 60 sec: 46419.4, 300 sec: 45764.3). Total num frames: 3847913472. Throughput: 0: 45404.2. Samples: 366437960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 11:37:30,504][73265] Avg episode reward: [(0, '0.418')] [2024-06-13 11:37:30,779][73497] Updated weights for policy 0, policy_version 234859 (0.0030) [2024-06-13 11:37:35,501][73265] Fps is (10 sec: 42599.0, 60 sec: 45056.1, 300 sec: 45542.0). Total num frames: 3848077312. Throughput: 0: 45872.5. Samples: 366583440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 11:37:35,502][73265] Avg episode reward: [(0, '0.476')] [2024-06-13 11:37:35,534][73497] Updated weights for policy 0, policy_version 234869 (0.0037) [2024-06-13 11:37:37,858][73497] Updated weights for policy 0, policy_version 234879 (0.0032) [2024-06-13 11:37:40,501][73265] Fps is (10 sec: 42609.1, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 3848339456. Throughput: 0: 45638.2. Samples: 366852860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-13 11:37:40,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 11:37:42,697][73497] Updated weights for policy 0, policy_version 234889 (0.0033) [2024-06-13 11:37:45,121][73497] Updated weights for policy 0, policy_version 234899 (0.0040) [2024-06-13 11:37:45,501][73265] Fps is (10 sec: 52428.2, 60 sec: 46694.4, 300 sec: 45819.7). Total num frames: 3848601600. Throughput: 0: 45599.1. Samples: 367122560. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-13 11:37:45,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 11:37:49,663][73497] Updated weights for policy 0, policy_version 234909 (0.0028) [2024-06-13 11:37:50,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 45653.1). Total num frames: 3848798208. Throughput: 0: 46217.0. Samples: 367279180. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-13 11:37:50,502][73265] Avg episode reward: [(0, '0.481')] [2024-06-13 11:37:51,905][73497] Updated weights for policy 0, policy_version 234919 (0.0037) [2024-06-13 11:37:55,501][73265] Fps is (10 sec: 40960.2, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 3849011200. Throughput: 0: 46284.4. Samples: 367553760. 
Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-13 11:37:55,502][73265] Avg episode reward: [(0, '0.464')] [2024-06-13 11:37:56,894][73497] Updated weights for policy 0, policy_version 234929 (0.0037) [2024-06-13 11:37:59,171][73497] Updated weights for policy 0, policy_version 234939 (0.0025) [2024-06-13 11:38:00,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 3849256960. Throughput: 0: 45739.7. Samples: 367816420. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-13 11:38:00,502][73265] Avg episode reward: [(0, '0.425')] [2024-06-13 11:38:03,719][73497] Updated weights for policy 0, policy_version 234949 (0.0036) [2024-06-13 11:38:05,501][73265] Fps is (10 sec: 49152.8, 60 sec: 46421.5, 300 sec: 45708.6). Total num frames: 3849502720. Throughput: 0: 46072.6. Samples: 367967380. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-13 11:38:05,501][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 11:38:06,030][73497] Updated weights for policy 0, policy_version 234959 (0.0029) [2024-06-13 11:38:10,502][73265] Fps is (10 sec: 44236.2, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3849699328. Throughput: 0: 46253.8. Samples: 368248900. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-13 11:38:10,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 11:38:10,831][73497] Updated weights for policy 0, policy_version 234969 (0.0035) [2024-06-13 11:38:13,463][73497] Updated weights for policy 0, policy_version 234979 (0.0038) [2024-06-13 11:38:15,504][73265] Fps is (10 sec: 44225.0, 60 sec: 45873.4, 300 sec: 45763.7). Total num frames: 3849945088. Throughput: 0: 46252.9. Samples: 368519340. 
Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-13 11:38:15,505][73265] Avg episode reward: [(0, '0.479')] [2024-06-13 11:38:18,188][73497] Updated weights for policy 0, policy_version 234989 (0.0024) [2024-06-13 11:38:20,386][73497] Updated weights for policy 0, policy_version 234999 (0.0047) [2024-06-13 11:38:20,501][73265] Fps is (10 sec: 52429.1, 60 sec: 46967.4, 300 sec: 45875.2). Total num frames: 3850223616. Throughput: 0: 46197.7. Samples: 368662340. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-13 11:38:20,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 11:38:25,120][73497] Updated weights for policy 0, policy_version 235009 (0.0034) [2024-06-13 11:38:25,501][73265] Fps is (10 sec: 44248.1, 60 sec: 45602.2, 300 sec: 45486.5). Total num frames: 3850387456. Throughput: 0: 46207.1. Samples: 368932180. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-13 11:38:25,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 11:38:25,506][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000235009_3850387456.pth... [2024-06-13 11:38:25,555][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000234342_3839459328.pth [2024-06-13 11:38:27,216][73477] Signal inference workers to stop experience collection... (5400 times) [2024-06-13 11:38:27,217][73477] Signal inference workers to resume experience collection... (5400 times) [2024-06-13 11:38:27,259][73497] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-06-13 11:38:27,260][73497] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-06-13 11:38:27,524][73497] Updated weights for policy 0, policy_version 235019 (0.0031) [2024-06-13 11:38:30,501][73265] Fps is (10 sec: 39321.7, 60 sec: 45057.9, 300 sec: 45653.1). Total num frames: 3850616832. Throughput: 0: 46287.6. Samples: 369205500. 
Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-13 11:38:30,502][73265] Avg episode reward: [(0, '0.482')] [2024-06-13 11:38:31,986][73497] Updated weights for policy 0, policy_version 235029 (0.0034) [2024-06-13 11:38:35,023][73497] Updated weights for policy 0, policy_version 235039 (0.0025) [2024-06-13 11:38:35,501][73265] Fps is (10 sec: 50790.3, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 3850895360. Throughput: 0: 45982.2. Samples: 369348380. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-13 11:38:35,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 11:38:39,547][73497] Updated weights for policy 0, policy_version 235049 (0.0025) [2024-06-13 11:38:40,501][73265] Fps is (10 sec: 49151.6, 60 sec: 46148.2, 300 sec: 45653.0). Total num frames: 3851108352. Throughput: 0: 46003.9. Samples: 369623940. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-13 11:38:40,502][73265] Avg episode reward: [(0, '0.481')] [2024-06-13 11:38:42,370][73497] Updated weights for policy 0, policy_version 235059 (0.0031) [2024-06-13 11:38:45,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3851321344. Throughput: 0: 46143.5. Samples: 369892880. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-13 11:38:45,502][73265] Avg episode reward: [(0, '0.427')] [2024-06-13 11:38:46,514][73497] Updated weights for policy 0, policy_version 235069 (0.0029) [2024-06-13 11:38:49,158][73497] Updated weights for policy 0, policy_version 235079 (0.0032) [2024-06-13 11:38:50,501][73265] Fps is (10 sec: 45875.8, 60 sec: 46148.3, 300 sec: 45764.1). Total num frames: 3851567104. Throughput: 0: 45934.6. Samples: 370034440. 
Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-13 11:38:50,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 11:38:53,378][73497] Updated weights for policy 0, policy_version 235089 (0.0034) [2024-06-13 11:38:55,501][73265] Fps is (10 sec: 45875.1, 60 sec: 46148.3, 300 sec: 45653.4). Total num frames: 3851780096. Throughput: 0: 45750.8. Samples: 370307680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:38:55,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 11:38:56,528][73497] Updated weights for policy 0, policy_version 235099 (0.0039) [2024-06-13 11:39:00,328][73497] Updated weights for policy 0, policy_version 235109 (0.0038) [2024-06-13 11:39:00,501][73265] Fps is (10 sec: 45875.3, 60 sec: 46148.3, 300 sec: 45597.5). Total num frames: 3852025856. Throughput: 0: 45770.6. Samples: 370578900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:39:00,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 11:39:04,161][73497] Updated weights for policy 0, policy_version 235119 (0.0041) [2024-06-13 11:39:05,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.0, 300 sec: 45709.0). Total num frames: 3852238848. Throughput: 0: 45640.9. Samples: 370716180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:39:05,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 11:39:07,957][73497] Updated weights for policy 0, policy_version 235129 (0.0033) [2024-06-13 11:39:10,501][73265] Fps is (10 sec: 42598.2, 60 sec: 45875.3, 300 sec: 45597.5). Total num frames: 3852451840. Throughput: 0: 45655.1. Samples: 370986660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:39:10,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 11:39:11,355][73497] Updated weights for policy 0, policy_version 235139 (0.0044) [2024-06-13 11:39:14,803][73497] Updated weights for policy 0, policy_version 235149 (0.0029) [2024-06-13 11:39:15,502][73265] Fps is (10 sec: 45874.6, 60 sec: 45877.0, 300 sec: 45597.5). 
Total num frames: 3852697600. Throughput: 0: 45672.3. Samples: 371260760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:39:15,502][73265] Avg episode reward: [(0, '0.472')] [2024-06-13 11:39:18,323][73497] Updated weights for policy 0, policy_version 235159 (0.0030) [2024-06-13 11:39:20,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45056.0, 300 sec: 45653.0). Total num frames: 3852926976. Throughput: 0: 45504.4. Samples: 371396080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:39:20,502][73265] Avg episode reward: [(0, '0.482')] [2024-06-13 11:39:22,151][73497] Updated weights for policy 0, policy_version 235169 (0.0032) [2024-06-13 11:39:25,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45875.1, 300 sec: 45653.0). Total num frames: 3853139968. Throughput: 0: 45425.8. Samples: 371668100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:39:25,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 11:39:25,858][73497] Updated weights for policy 0, policy_version 235179 (0.0040) [2024-06-13 11:39:29,239][73497] Updated weights for policy 0, policy_version 235189 (0.0037) [2024-06-13 11:39:30,501][73265] Fps is (10 sec: 45875.3, 60 sec: 46148.3, 300 sec: 45653.0). Total num frames: 3853385728. Throughput: 0: 45397.7. Samples: 371935780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:39:30,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 11:39:33,245][73497] Updated weights for policy 0, policy_version 235199 (0.0034) [2024-06-13 11:39:35,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 3853615104. Throughput: 0: 45469.2. Samples: 372080560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:39:35,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 11:39:36,551][73497] Updated weights for policy 0, policy_version 235209 (0.0033) [2024-06-13 11:39:36,777][73477] Signal inference workers to stop experience collection... 
(5450 times) [2024-06-13 11:39:36,777][73477] Signal inference workers to resume experience collection... (5450 times) [2024-06-13 11:39:36,814][73497] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-06-13 11:39:36,814][73497] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-06-13 11:39:40,310][73497] Updated weights for policy 0, policy_version 235219 (0.0028) [2024-06-13 11:39:40,501][73265] Fps is (10 sec: 44237.2, 60 sec: 45329.2, 300 sec: 45708.6). Total num frames: 3853828096. Throughput: 0: 45495.2. Samples: 372354960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:39:40,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 11:39:43,860][73497] Updated weights for policy 0, policy_version 235229 (0.0039) [2024-06-13 11:39:45,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3854057472. Throughput: 0: 45326.5. Samples: 372618600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:39:45,505][73265] Avg episode reward: [(0, '0.513')] [2024-06-13 11:39:47,419][73497] Updated weights for policy 0, policy_version 235239 (0.0036) [2024-06-13 11:39:50,502][73265] Fps is (10 sec: 45874.5, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 3854286848. Throughput: 0: 45321.7. Samples: 372755660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:39:50,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 11:39:51,131][73497] Updated weights for policy 0, policy_version 235249 (0.0040) [2024-06-13 11:39:54,787][73497] Updated weights for policy 0, policy_version 235259 (0.0033) [2024-06-13 11:39:55,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45602.2, 300 sec: 45764.1). Total num frames: 3854516224. Throughput: 0: 45500.9. Samples: 373034200. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:39:55,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 11:39:58,189][73497] Updated weights for policy 0, policy_version 235269 (0.0040) [2024-06-13 11:40:00,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45055.9, 300 sec: 45597.5). Total num frames: 3854729216. Throughput: 0: 45134.2. Samples: 373291800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-13 11:40:00,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 11:40:02,270][73497] Updated weights for policy 0, policy_version 235279 (0.0039) [2024-06-13 11:40:05,502][73265] Fps is (10 sec: 44236.0, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 3854958592. Throughput: 0: 45189.7. Samples: 373429620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-13 11:40:05,502][73265] Avg episode reward: [(0, '0.498')] [2024-06-13 11:40:05,563][73497] Updated weights for policy 0, policy_version 235289 (0.0023) [2024-06-13 11:40:09,523][73497] Updated weights for policy 0, policy_version 235299 (0.0038) [2024-06-13 11:40:10,501][73265] Fps is (10 sec: 47514.2, 60 sec: 45875.2, 300 sec: 45653.1). Total num frames: 3855204352. Throughput: 0: 45429.9. Samples: 373712440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-13 11:40:10,502][73265] Avg episode reward: [(0, '0.535')] [2024-06-13 11:40:13,027][73497] Updated weights for policy 0, policy_version 235309 (0.0043) [2024-06-13 11:40:15,502][73265] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 45597.5). Total num frames: 3855384576. Throughput: 0: 45372.8. Samples: 373977560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-13 11:40:15,502][73265] Avg episode reward: [(0, '0.479')] [2024-06-13 11:40:16,612][73497] Updated weights for policy 0, policy_version 235319 (0.0035) [2024-06-13 11:40:20,162][73497] Updated weights for policy 0, policy_version 235329 (0.0036) [2024-06-13 11:40:20,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.1, 300 sec: 45653.1). Total num frames: 3855646720. Throughput: 0: 44957.0. Samples: 374103620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-13 11:40:20,502][73265] Avg episode reward: [(0, '0.481')] [2024-06-13 11:40:23,760][73497] Updated weights for policy 0, policy_version 235339 (0.0039) [2024-06-13 11:40:25,502][73265] Fps is (10 sec: 50790.0, 60 sec: 45875.1, 300 sec: 45708.6). Total num frames: 3855892480. Throughput: 0: 45259.3. Samples: 374391640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-13 11:40:25,502][73265] Avg episode reward: [(0, '0.479')] [2024-06-13 11:40:25,519][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000235345_3855892480.pth... [2024-06-13 11:40:25,572][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000234675_3844915200.pth [2024-06-13 11:40:26,911][73497] Updated weights for policy 0, policy_version 235349 (0.0034) [2024-06-13 11:40:30,501][73265] Fps is (10 sec: 42598.5, 60 sec: 44783.0, 300 sec: 45653.1). Total num frames: 3856072704. Throughput: 0: 45556.1. Samples: 374668620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-13 11:40:30,502][73265] Avg episode reward: [(0, '0.477')] [2024-06-13 11:40:31,029][73497] Updated weights for policy 0, policy_version 235359 (0.0033) [2024-06-13 11:40:33,998][73497] Updated weights for policy 0, policy_version 235369 (0.0038) [2024-06-13 11:40:35,502][73265] Fps is (10 sec: 45875.6, 60 sec: 45602.1, 300 sec: 45764.1). Total num frames: 3856351232. Throughput: 0: 45450.6. Samples: 374800940. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-13 11:40:35,502][73265] Avg episode reward: [(0, '0.507')] [2024-06-13 11:40:38,296][73497] Updated weights for policy 0, policy_version 235379 (0.0036) [2024-06-13 11:40:40,501][73265] Fps is (10 sec: 50790.3, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 3856580608. Throughput: 0: 45238.6. Samples: 375069940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-13 11:40:40,502][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 11:40:41,443][73497] Updated weights for policy 0, policy_version 235389 (0.0038) [2024-06-13 11:40:45,355][73497] Updated weights for policy 0, policy_version 235399 (0.0027) [2024-06-13 11:40:45,501][73265] Fps is (10 sec: 42599.0, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3856777216. Throughput: 0: 45708.1. Samples: 375348660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-13 11:40:45,502][73265] Avg episode reward: [(0, '0.404')] [2024-06-13 11:40:48,831][73497] Updated weights for policy 0, policy_version 235409 (0.0040) [2024-06-13 11:40:50,507][73265] Fps is (10 sec: 44213.6, 60 sec: 45598.2, 300 sec: 45652.5). Total num frames: 3857022976. Throughput: 0: 45456.6. Samples: 375475400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-13 11:40:50,507][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 11:40:52,673][73497] Updated weights for policy 0, policy_version 235419 (0.0041) [2024-06-13 11:40:55,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 3857235968. Throughput: 0: 45282.2. Samples: 375750140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-13 11:40:55,502][73265] Avg episode reward: [(0, '0.499')] [2024-06-13 11:40:55,701][73497] Updated weights for policy 0, policy_version 235429 (0.0024) [2024-06-13 11:40:59,807][73497] Updated weights for policy 0, policy_version 235439 (0.0035) [2024-06-13 11:41:00,501][73265] Fps is (10 sec: 44260.0, 60 sec: 45602.2, 300 sec: 45653.1). Total num frames: 3857465344. Throughput: 0: 45605.1. Samples: 376029780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-13 11:41:00,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 11:41:02,804][73477] Signal inference workers to stop experience collection... (5500 times) [2024-06-13 11:41:02,858][73497] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-06-13 11:41:02,915][73477] Signal inference workers to resume experience collection... (5500 times) [2024-06-13 11:41:02,915][73497] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-06-13 11:41:03,052][73497] Updated weights for policy 0, policy_version 235449 (0.0033) [2024-06-13 11:41:05,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45329.2, 300 sec: 45597.8). Total num frames: 3857678336. Throughput: 0: 45754.7. Samples: 376162580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-13 11:41:05,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 11:41:07,173][73497] Updated weights for policy 0, policy_version 235459 (0.0031) [2024-06-13 11:41:10,383][73497] Updated weights for policy 0, policy_version 235469 (0.0032) [2024-06-13 11:41:10,501][73265] Fps is (10 sec: 45874.7, 60 sec: 45329.0, 300 sec: 45764.1). Total num frames: 3857924096. Throughput: 0: 45310.8. Samples: 376430620. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-13 11:41:10,502][73265] Avg episode reward: [(0, '0.416')] [2024-06-13 11:41:14,338][73497] Updated weights for policy 0, policy_version 235479 (0.0035) [2024-06-13 11:41:15,501][73265] Fps is (10 sec: 49152.6, 60 sec: 46421.6, 300 sec: 45708.6). Total num frames: 3858169856. Throughput: 0: 45220.5. Samples: 376703540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 11:41:15,502][73265] Avg episode reward: [(0, '0.500')] [2024-06-13 11:41:17,507][73497] Updated weights for policy 0, policy_version 235489 (0.0041) [2024-06-13 11:41:20,501][73265] Fps is (10 sec: 40960.2, 60 sec: 44782.9, 300 sec: 45486.4). Total num frames: 3858333696. Throughput: 0: 45361.0. Samples: 376842180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 11:41:20,502][73265] Avg episode reward: [(0, '0.485')] [2024-06-13 11:41:21,462][73497] Updated weights for policy 0, policy_version 235499 (0.0045) [2024-06-13 11:41:24,414][73497] Updated weights for policy 0, policy_version 235509 (0.0033) [2024-06-13 11:41:25,501][73265] Fps is (10 sec: 42597.5, 60 sec: 45056.1, 300 sec: 45653.0). Total num frames: 3858595840. Throughput: 0: 45418.1. Samples: 377113760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 11:41:25,502][73265] Avg episode reward: [(0, '0.502')] [2024-06-13 11:41:28,676][73497] Updated weights for policy 0, policy_version 235519 (0.0027) [2024-06-13 11:41:30,501][73265] Fps is (10 sec: 50790.6, 60 sec: 46148.2, 300 sec: 45653.0). Total num frames: 3858841600. Throughput: 0: 45429.8. Samples: 377393000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 11:41:30,502][73265] Avg episode reward: [(0, '0.495')] [2024-06-13 11:41:31,663][73497] Updated weights for policy 0, policy_version 235529 (0.0034) [2024-06-13 11:41:35,503][73265] Fps is (10 sec: 44231.5, 60 sec: 44782.1, 300 sec: 45541.8). Total num frames: 3859038208. Throughput: 0: 45686.2. Samples: 377531100. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 11:41:35,503][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 11:41:36,118][73497] Updated weights for policy 0, policy_version 235539 (0.0040) [2024-06-13 11:41:39,027][73497] Updated weights for policy 0, policy_version 235549 (0.0035) [2024-06-13 11:41:40,501][73265] Fps is (10 sec: 42598.5, 60 sec: 44782.9, 300 sec: 45653.1). Total num frames: 3859267584. Throughput: 0: 45538.7. Samples: 377799380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 11:41:40,502][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 11:41:43,054][73497] Updated weights for policy 0, policy_version 235559 (0.0033) [2024-06-13 11:41:45,501][73265] Fps is (10 sec: 50796.7, 60 sec: 46148.3, 300 sec: 45764.1). Total num frames: 3859546112. Throughput: 0: 45335.5. Samples: 378069880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 11:41:45,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 11:41:46,066][73497] Updated weights for policy 0, policy_version 235569 (0.0029) [2024-06-13 11:41:49,971][73497] Updated weights for policy 0, policy_version 235579 (0.0034) [2024-06-13 11:41:50,502][73265] Fps is (10 sec: 45874.4, 60 sec: 45059.8, 300 sec: 45541.9). Total num frames: 3859726336. Throughput: 0: 45566.5. Samples: 378213080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 11:41:50,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 11:41:53,056][73497] Updated weights for policy 0, policy_version 235589 (0.0045) [2024-06-13 11:41:55,501][73265] Fps is (10 sec: 40960.1, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3859955712. Throughput: 0: 45681.0. Samples: 378486260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 11:41:55,502][73265] Avg episode reward: [(0, '0.477')] [2024-06-13 11:41:57,899][73497] Updated weights for policy 0, policy_version 235599 (0.0034) [2024-06-13 11:42:00,501][73265] Fps is (10 sec: 47514.1, 60 sec: 45602.1, 300 sec: 45708.6). Total num frames: 3860201472. Throughput: 0: 45508.3. Samples: 378751420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 11:42:00,502][73265] Avg episode reward: [(0, '0.397')] [2024-06-13 11:42:00,886][73497] Updated weights for policy 0, policy_version 235609 (0.0034) [2024-06-13 11:42:04,793][73497] Updated weights for policy 0, policy_version 235619 (0.0026) [2024-06-13 11:42:05,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 3860398080. Throughput: 0: 45683.1. Samples: 378897920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 11:42:05,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 11:42:07,967][73497] Updated weights for policy 0, policy_version 235629 (0.0035) [2024-06-13 11:42:10,501][73265] Fps is (10 sec: 42598.3, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3860627456. Throughput: 0: 45637.8. Samples: 379167460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 11:42:10,502][73265] Avg episode reward: [(0, '0.402')] [2024-06-13 11:42:11,647][73497] Updated weights for policy 0, policy_version 235639 (0.0036) [2024-06-13 11:42:14,922][73497] Updated weights for policy 0, policy_version 235649 (0.0032) [2024-06-13 11:42:15,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45055.9, 300 sec: 45653.0). Total num frames: 3860873216. Throughput: 0: 45527.2. Samples: 379441720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 11:42:15,502][73265] Avg episode reward: [(0, '0.489')] [2024-06-13 11:42:16,367][73477] Signal inference workers to stop experience collection... 
(5550 times) [2024-06-13 11:42:16,373][73477] Signal inference workers to resume experience collection... (5550 times) [2024-06-13 11:42:16,380][73497] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-06-13 11:42:16,414][73497] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-06-13 11:42:19,183][73497] Updated weights for policy 0, policy_version 235659 (0.0032) [2024-06-13 11:42:20,501][73265] Fps is (10 sec: 49152.5, 60 sec: 46421.4, 300 sec: 45653.1). Total num frames: 3861118976. Throughput: 0: 45598.7. Samples: 379582980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-13 11:42:20,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 11:42:21,959][73497] Updated weights for policy 0, policy_version 235669 (0.0027) [2024-06-13 11:42:25,506][73265] Fps is (10 sec: 47489.6, 60 sec: 45871.4, 300 sec: 45541.6). Total num frames: 3861348352. Throughput: 0: 45845.1. Samples: 379862640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 11:42:25,507][73265] Avg episode reward: [(0, '0.407')] [2024-06-13 11:42:25,521][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000235678_3861348352.pth... [2024-06-13 11:42:25,574][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000235009_3850387456.pth [2024-06-13 11:42:26,139][73497] Updated weights for policy 0, policy_version 235679 (0.0029) [2024-06-13 11:42:29,485][73497] Updated weights for policy 0, policy_version 235689 (0.0031) [2024-06-13 11:42:30,501][73265] Fps is (10 sec: 42598.2, 60 sec: 45056.0, 300 sec: 45653.0). Total num frames: 3861544960. Throughput: 0: 45816.0. Samples: 380131600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 11:42:30,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 11:42:33,173][73497] Updated weights for policy 0, policy_version 235699 (0.0030) [2024-06-13 11:42:35,501][73265] Fps is (10 sec: 45898.3, 60 sec: 46149.3, 300 sec: 45653.0). 
Total num frames: 3861807104. Throughput: 0: 45706.4. Samples: 380269860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 11:42:35,502][73265] Avg episode reward: [(0, '0.483')] [2024-06-13 11:42:36,489][73497] Updated weights for policy 0, policy_version 235709 (0.0053) [2024-06-13 11:42:40,214][73497] Updated weights for policy 0, policy_version 235719 (0.0035) [2024-06-13 11:42:40,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45875.1, 300 sec: 45486.4). Total num frames: 3862020096. Throughput: 0: 45715.9. Samples: 380543480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 11:42:40,502][73265] Avg episode reward: [(0, '0.419')] [2024-06-13 11:42:43,531][73497] Updated weights for policy 0, policy_version 235729 (0.0036) [2024-06-13 11:42:45,501][73265] Fps is (10 sec: 40959.8, 60 sec: 44509.9, 300 sec: 45486.4). Total num frames: 3862216704. Throughput: 0: 45900.9. Samples: 380816960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 11:42:45,502][73265] Avg episode reward: [(0, '0.444')] [2024-06-13 11:42:47,720][73497] Updated weights for policy 0, policy_version 235739 (0.0039) [2024-06-13 11:42:50,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45875.3, 300 sec: 45653.1). Total num frames: 3862478848. Throughput: 0: 45560.5. Samples: 380948140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 11:42:50,502][73265] Avg episode reward: [(0, '0.493')] [2024-06-13 11:42:50,925][73497] Updated weights for policy 0, policy_version 235749 (0.0040) [2024-06-13 11:42:54,496][73497] Updated weights for policy 0, policy_version 235759 (0.0039) [2024-06-13 11:42:55,501][73265] Fps is (10 sec: 47513.5, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3862691840. Throughput: 0: 45644.0. Samples: 381221440. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 11:42:55,502][73265] Avg episode reward: [(0, '0.426')] [2024-06-13 11:42:58,099][73497] Updated weights for policy 0, policy_version 235769 (0.0025) [2024-06-13 11:43:00,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3862921216. Throughput: 0: 45579.9. Samples: 381492820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 11:43:00,502][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 11:43:01,657][73497] Updated weights for policy 0, policy_version 235779 (0.0032) [2024-06-13 11:43:05,183][73497] Updated weights for policy 0, policy_version 235789 (0.0035) [2024-06-13 11:43:05,501][73265] Fps is (10 sec: 47513.6, 60 sec: 46148.3, 300 sec: 45653.1). Total num frames: 3863166976. Throughput: 0: 45675.0. Samples: 381638360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 11:43:05,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 11:43:09,188][73497] Updated weights for policy 0, policy_version 235799 (0.0032) [2024-06-13 11:43:10,504][73265] Fps is (10 sec: 45863.8, 60 sec: 45873.3, 300 sec: 45542.0). Total num frames: 3863379968. Throughput: 0: 45311.9. Samples: 381901560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 11:43:10,504][73265] Avg episode reward: [(0, '0.498')] [2024-06-13 11:43:12,699][73497] Updated weights for policy 0, policy_version 235809 (0.0035) [2024-06-13 11:43:15,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45602.1, 300 sec: 45375.4). Total num frames: 3863609344. Throughput: 0: 45448.9. Samples: 382176800. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 11:43:15,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 11:43:16,320][73497] Updated weights for policy 0, policy_version 235819 (0.0039) [2024-06-13 11:43:19,659][73497] Updated weights for policy 0, policy_version 235829 (0.0031) [2024-06-13 11:43:20,501][73265] Fps is (10 sec: 45886.7, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 3863838720. Throughput: 0: 45635.5. Samples: 382323460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 11:43:20,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 11:43:22,981][73497] Updated weights for policy 0, policy_version 235839 (0.0031) [2024-06-13 11:43:25,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45059.8, 300 sec: 45542.0). Total num frames: 3864051712. Throughput: 0: 45593.0. Samples: 382595160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 11:43:25,502][73265] Avg episode reward: [(0, '0.506')] [2024-06-13 11:43:26,967][73497] Updated weights for policy 0, policy_version 235849 (0.0034) [2024-06-13 11:43:30,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 3864297472. Throughput: 0: 45327.1. Samples: 382856680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-13 11:43:30,502][73265] Avg episode reward: [(0, '0.490')] [2024-06-13 11:43:30,610][73497] Updated weights for policy 0, policy_version 235859 (0.0035) [2024-06-13 11:43:32,525][73477] Signal inference workers to stop experience collection... (5600 times) [2024-06-13 11:43:32,580][73497] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-06-13 11:43:32,582][73477] Signal inference workers to resume experience collection... 
(5600 times) [2024-06-13 11:43:32,596][73497] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-06-13 11:43:34,403][73497] Updated weights for policy 0, policy_version 235869 (0.0026) [2024-06-13 11:43:35,502][73265] Fps is (10 sec: 47512.7, 60 sec: 45328.9, 300 sec: 45486.4). Total num frames: 3864526848. Throughput: 0: 45399.4. Samples: 382991120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-13 11:43:35,502][73265] Avg episode reward: [(0, '0.474')] [2024-06-13 11:43:38,118][73497] Updated weights for policy 0, policy_version 235879 (0.0033) [2024-06-13 11:43:40,501][73265] Fps is (10 sec: 42598.6, 60 sec: 45056.1, 300 sec: 45430.9). Total num frames: 3864723456. Throughput: 0: 45344.5. Samples: 383261940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-13 11:43:40,502][73265] Avg episode reward: [(0, '0.497')] [2024-06-13 11:43:41,394][73497] Updated weights for policy 0, policy_version 235889 (0.0031) [2024-06-13 11:43:45,165][73497] Updated weights for policy 0, policy_version 235899 (0.0036) [2024-06-13 11:43:45,501][73265] Fps is (10 sec: 45875.8, 60 sec: 46148.3, 300 sec: 45486.4). Total num frames: 3864985600. Throughput: 0: 45359.1. Samples: 383533980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-13 11:43:45,502][73265] Avg episode reward: [(0, '0.508')] [2024-06-13 11:43:48,759][73497] Updated weights for policy 0, policy_version 235909 (0.0028) [2024-06-13 11:43:50,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3865198592. Throughput: 0: 45175.2. Samples: 383671240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-13 11:43:50,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 11:43:52,224][73497] Updated weights for policy 0, policy_version 235919 (0.0035) [2024-06-13 11:43:55,501][73265] Fps is (10 sec: 40959.9, 60 sec: 45056.0, 300 sec: 45319.8). Total num frames: 3865395200. Throughput: 0: 45323.8. Samples: 383941020. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-13 11:43:55,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 11:43:56,208][73497] Updated weights for policy 0, policy_version 235929 (0.0043) [2024-06-13 11:43:59,975][73497] Updated weights for policy 0, policy_version 235939 (0.0033) [2024-06-13 11:44:00,502][73265] Fps is (10 sec: 44235.9, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3865640960. Throughput: 0: 45049.2. Samples: 384204020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-13 11:44:00,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 11:44:03,637][73497] Updated weights for policy 0, policy_version 235949 (0.0031) [2024-06-13 11:44:05,501][73265] Fps is (10 sec: 49152.3, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 3865886720. Throughput: 0: 44823.1. Samples: 384340500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-13 11:44:05,502][73265] Avg episode reward: [(0, '0.482')] [2024-06-13 11:44:07,353][73497] Updated weights for policy 0, policy_version 235959 (0.0025) [2024-06-13 11:44:10,501][73265] Fps is (10 sec: 44237.3, 60 sec: 45057.9, 300 sec: 45375.4). Total num frames: 3866083328. Throughput: 0: 44756.4. Samples: 384609200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-13 11:44:10,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 11:44:10,826][73497] Updated weights for policy 0, policy_version 235969 (0.0034) [2024-06-13 11:44:14,432][73497] Updated weights for policy 0, policy_version 235979 (0.0027) [2024-06-13 11:44:15,501][73265] Fps is (10 sec: 42598.2, 60 sec: 45055.9, 300 sec: 45375.4). Total num frames: 3866312704. Throughput: 0: 44900.4. Samples: 384877200. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-13 11:44:15,502][73265] Avg episode reward: [(0, '0.521')] [2024-06-13 11:44:17,986][73497] Updated weights for policy 0, policy_version 235989 (0.0034) [2024-06-13 11:44:20,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 3866542080. Throughput: 0: 44965.5. Samples: 385014560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-13 11:44:20,502][73265] Avg episode reward: [(0, '0.547')] [2024-06-13 11:44:21,407][73497] Updated weights for policy 0, policy_version 235999 (0.0047) [2024-06-13 11:44:25,353][73497] Updated weights for policy 0, policy_version 236009 (0.0043) [2024-06-13 11:44:25,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45329.0, 300 sec: 45375.4). Total num frames: 3866771456. Throughput: 0: 45031.5. Samples: 385288360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-13 11:44:25,502][73265] Avg episode reward: [(0, '0.486')] [2024-06-13 11:44:25,506][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000236009_3866771456.pth... [2024-06-13 11:44:25,567][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000235345_3855892480.pth [2024-06-13 11:44:29,209][73497] Updated weights for policy 0, policy_version 236019 (0.0031) [2024-06-13 11:44:30,501][73265] Fps is (10 sec: 44236.6, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 3866984448. Throughput: 0: 44855.5. Samples: 385552480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-13 11:44:30,502][73265] Avg episode reward: [(0, '0.509')] [2024-06-13 11:44:32,781][73497] Updated weights for policy 0, policy_version 236029 (0.0030) [2024-06-13 11:44:35,502][73265] Fps is (10 sec: 45874.4, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 3867230208. Throughput: 0: 44936.2. Samples: 385693380. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-13 11:44:35,502][73265] Avg episode reward: [(0, '0.472')] [2024-06-13 11:44:36,391][73497] Updated weights for policy 0, policy_version 236039 (0.0024) [2024-06-13 11:44:39,696][73477] Signal inference workers to stop experience collection... (5650 times) [2024-06-13 11:44:39,746][73497] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-06-13 11:44:39,754][73477] Signal inference workers to resume experience collection... (5650 times) [2024-06-13 11:44:39,761][73497] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-06-13 11:44:39,766][73497] Updated weights for policy 0, policy_version 236049 (0.0029) [2024-06-13 11:44:40,501][73265] Fps is (10 sec: 50790.3, 60 sec: 46148.2, 300 sec: 45542.0). Total num frames: 3867492352. Throughput: 0: 45210.6. Samples: 385975500. Policy #0 lag: (min: 2.0, avg: 10.5, max: 20.0) [2024-06-13 11:44:40,502][73265] Avg episode reward: [(0, '0.495')] [2024-06-13 11:44:43,357][73497] Updated weights for policy 0, policy_version 236059 (0.0035) [2024-06-13 11:44:45,501][73265] Fps is (10 sec: 40960.7, 60 sec: 44236.8, 300 sec: 45264.3). Total num frames: 3867639808. Throughput: 0: 45423.7. Samples: 386248080. Policy #0 lag: (min: 2.0, avg: 10.5, max: 20.0) [2024-06-13 11:44:45,502][73265] Avg episode reward: [(0, '0.472')] [2024-06-13 11:44:46,893][73497] Updated weights for policy 0, policy_version 236069 (0.0030) [2024-06-13 11:44:50,273][73497] Updated weights for policy 0, policy_version 236079 (0.0043) [2024-06-13 11:44:50,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3867918336. Throughput: 0: 45246.6. Samples: 386376600. 
Policy #0 lag: (min: 2.0, avg: 10.5, max: 20.0) [2024-06-13 11:44:50,502][73265] Avg episode reward: [(0, '0.440')] [2024-06-13 11:44:54,431][73497] Updated weights for policy 0, policy_version 236089 (0.0035) [2024-06-13 11:44:55,501][73265] Fps is (10 sec: 52428.6, 60 sec: 46148.3, 300 sec: 45542.0). Total num frames: 3868164096. Throughput: 0: 45339.1. Samples: 386649460. Policy #0 lag: (min: 2.0, avg: 10.5, max: 20.0) [2024-06-13 11:44:55,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 11:44:57,718][73497] Updated weights for policy 0, policy_version 236099 (0.0029) [2024-06-13 11:45:00,501][73265] Fps is (10 sec: 40960.5, 60 sec: 44783.1, 300 sec: 45319.8). Total num frames: 3868327936. Throughput: 0: 45491.6. Samples: 386924320. Policy #0 lag: (min: 2.0, avg: 10.5, max: 20.0) [2024-06-13 11:45:00,502][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 11:45:01,549][73497] Updated weights for policy 0, policy_version 236109 (0.0032) [2024-06-13 11:45:05,021][73497] Updated weights for policy 0, policy_version 236119 (0.0039) [2024-06-13 11:45:05,502][73265] Fps is (10 sec: 40959.5, 60 sec: 44782.8, 300 sec: 45319.8). Total num frames: 3868573696. Throughput: 0: 45154.5. Samples: 387046520. Policy #0 lag: (min: 2.0, avg: 10.5, max: 20.0) [2024-06-13 11:45:05,502][73265] Avg episode reward: [(0, '0.481')] [2024-06-13 11:45:08,577][73497] Updated weights for policy 0, policy_version 236129 (0.0031) [2024-06-13 11:45:10,501][73265] Fps is (10 sec: 50789.9, 60 sec: 45875.2, 300 sec: 45597.5). Total num frames: 3868835840. Throughput: 0: 45319.5. Samples: 387327740. Policy #0 lag: (min: 2.0, avg: 10.5, max: 20.0) [2024-06-13 11:45:10,502][73265] Avg episode reward: [(0, '0.430')] [2024-06-13 11:45:12,430][73497] Updated weights for policy 0, policy_version 236139 (0.0038) [2024-06-13 11:45:15,502][73265] Fps is (10 sec: 45875.3, 60 sec: 45329.0, 300 sec: 45375.3). Total num frames: 3869032448. Throughput: 0: 45684.4. Samples: 387608280. 
Policy #0 lag: (min: 2.0, avg: 10.5, max: 20.0) [2024-06-13 11:45:15,502][73265] Avg episode reward: [(0, '0.442')] [2024-06-13 11:45:15,694][73497] Updated weights for policy 0, policy_version 236149 (0.0030) [2024-06-13 11:45:19,359][73497] Updated weights for policy 0, policy_version 236159 (0.0044) [2024-06-13 11:45:20,501][73265] Fps is (10 sec: 42598.2, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 3869261824. Throughput: 0: 45322.7. Samples: 387732900. Policy #0 lag: (min: 2.0, avg: 10.5, max: 20.0) [2024-06-13 11:45:20,503][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 11:45:23,338][73497] Updated weights for policy 0, policy_version 236169 (0.0028) [2024-06-13 11:45:25,501][73265] Fps is (10 sec: 47514.5, 60 sec: 45602.2, 300 sec: 45542.0). Total num frames: 3869507584. Throughput: 0: 45033.5. Samples: 388002000. Policy #0 lag: (min: 2.0, avg: 10.5, max: 20.0) [2024-06-13 11:45:25,502][73265] Avg episode reward: [(0, '0.457')] [2024-06-13 11:45:26,382][73497] Updated weights for policy 0, policy_version 236179 (0.0039) [2024-06-13 11:45:30,450][73497] Updated weights for policy 0, policy_version 236189 (0.0031) [2024-06-13 11:45:30,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45602.1, 300 sec: 45319.8). Total num frames: 3869720576. Throughput: 0: 45220.4. Samples: 388283000. Policy #0 lag: (min: 2.0, avg: 10.5, max: 20.0) [2024-06-13 11:45:30,502][73265] Avg episode reward: [(0, '0.445')] [2024-06-13 11:45:33,630][73497] Updated weights for policy 0, policy_version 236199 (0.0040) [2024-06-13 11:45:35,502][73265] Fps is (10 sec: 42597.3, 60 sec: 45056.0, 300 sec: 45264.2). Total num frames: 3869933568. Throughput: 0: 45157.7. Samples: 388408700. 
Policy #0 lag: (min: 2.0, avg: 10.5, max: 20.0) [2024-06-13 11:45:35,502][73265] Avg episode reward: [(0, '0.478')] [2024-06-13 11:45:37,679][73497] Updated weights for policy 0, policy_version 236209 (0.0023) [2024-06-13 11:45:40,502][73265] Fps is (10 sec: 45874.8, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 3870179328. Throughput: 0: 45120.3. Samples: 388679880. Policy #0 lag: (min: 2.0, avg: 10.5, max: 20.0) [2024-06-13 11:45:40,502][73265] Avg episode reward: [(0, '0.472')] [2024-06-13 11:45:41,463][73497] Updated weights for policy 0, policy_version 236219 (0.0025) [2024-06-13 11:45:45,070][73497] Updated weights for policy 0, policy_version 236229 (0.0028) [2024-06-13 11:45:45,501][73265] Fps is (10 sec: 45875.9, 60 sec: 45875.2, 300 sec: 45320.6). Total num frames: 3870392320. Throughput: 0: 45217.3. Samples: 388959100. Policy #0 lag: (min: 2.0, avg: 10.5, max: 20.0) [2024-06-13 11:45:45,502][73265] Avg episode reward: [(0, '0.520')] [2024-06-13 11:45:48,536][73497] Updated weights for policy 0, policy_version 236239 (0.0044) [2024-06-13 11:45:50,501][73265] Fps is (10 sec: 42599.0, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 3870605312. Throughput: 0: 45324.6. Samples: 389086120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-13 11:45:50,502][73265] Avg episode reward: [(0, '0.529')] [2024-06-13 11:45:52,423][73497] Updated weights for policy 0, policy_version 236249 (0.0045) [2024-06-13 11:45:53,727][73477] Signal inference workers to stop experience collection... (5700 times) [2024-06-13 11:45:53,727][73477] Signal inference workers to resume experience collection... (5700 times) [2024-06-13 11:45:53,740][73497] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-06-13 11:45:53,740][73497] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-06-13 11:45:55,501][73265] Fps is (10 sec: 45875.5, 60 sec: 44783.0, 300 sec: 45375.4). Total num frames: 3870851072. Throughput: 0: 44881.0. 
Samples: 389347380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-13 11:45:55,502][73265] Avg episode reward: [(0, '0.454')] [2024-06-13 11:45:55,548][73497] Updated weights for policy 0, policy_version 236259 (0.0044) [2024-06-13 11:45:59,523][73497] Updated weights for policy 0, policy_version 236269 (0.0034) [2024-06-13 11:46:00,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.1, 300 sec: 45375.4). Total num frames: 3871064064. Throughput: 0: 44905.5. Samples: 389629020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-13 11:46:00,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 11:46:03,150][73497] Updated weights for policy 0, policy_version 236279 (0.0030) [2024-06-13 11:46:05,501][73265] Fps is (10 sec: 44236.2, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 3871293440. Throughput: 0: 45102.2. Samples: 389762500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-13 11:46:05,502][73265] Avg episode reward: [(0, '0.543')] [2024-06-13 11:46:06,829][73497] Updated weights for policy 0, policy_version 236289 (0.0026) [2024-06-13 11:46:10,502][73265] Fps is (10 sec: 44236.1, 60 sec: 44509.8, 300 sec: 45208.7). Total num frames: 3871506432. Throughput: 0: 45210.5. Samples: 390036480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-13 11:46:10,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 11:46:10,927][73497] Updated weights for policy 0, policy_version 236299 (0.0028) [2024-06-13 11:46:13,854][73497] Updated weights for policy 0, policy_version 236309 (0.0028) [2024-06-13 11:46:15,502][73265] Fps is (10 sec: 47513.6, 60 sec: 45602.1, 300 sec: 45542.0). Total num frames: 3871768576. Throughput: 0: 44956.4. Samples: 390306040. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-13 11:46:15,502][73265] Avg episode reward: [(0, '0.489')] [2024-06-13 11:46:17,816][73497] Updated weights for policy 0, policy_version 236319 (0.0043) [2024-06-13 11:46:20,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45329.1, 300 sec: 45375.4). Total num frames: 3871981568. Throughput: 0: 45286.4. Samples: 390446580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-13 11:46:20,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 11:46:21,350][73497] Updated weights for policy 0, policy_version 236329 (0.0027) [2024-06-13 11:46:24,643][73497] Updated weights for policy 0, policy_version 236339 (0.0027) [2024-06-13 11:46:25,501][73265] Fps is (10 sec: 40960.4, 60 sec: 44509.8, 300 sec: 45208.7). Total num frames: 3872178176. Throughput: 0: 45355.2. Samples: 390720860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-13 11:46:25,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 11:46:25,602][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000236340_3872194560.pth... [2024-06-13 11:46:25,657][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000235678_3861348352.pth [2024-06-13 11:46:28,210][73497] Updated weights for policy 0, policy_version 236349 (0.0049) [2024-06-13 11:46:30,502][73265] Fps is (10 sec: 47513.4, 60 sec: 45602.1, 300 sec: 45486.6). Total num frames: 3872456704. Throughput: 0: 45219.9. Samples: 390994000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-13 11:46:30,502][73265] Avg episode reward: [(0, '0.472')] [2024-06-13 11:46:32,101][73497] Updated weights for policy 0, policy_version 236359 (0.0040) [2024-06-13 11:46:35,387][73497] Updated weights for policy 0, policy_version 236369 (0.0040) [2024-06-13 11:46:35,501][73265] Fps is (10 sec: 49152.3, 60 sec: 45602.3, 300 sec: 45430.9). Total num frames: 3872669696. Throughput: 0: 45812.9. Samples: 391147700. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-13 11:46:35,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 11:46:39,535][73497] Updated weights for policy 0, policy_version 236379 (0.0037) [2024-06-13 11:46:40,501][73265] Fps is (10 sec: 42598.5, 60 sec: 45056.0, 300 sec: 45208.7). Total num frames: 3872882688. Throughput: 0: 45917.7. Samples: 391413680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-13 11:46:40,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 11:46:42,487][73497] Updated weights for policy 0, policy_version 236389 (0.0032) [2024-06-13 11:46:45,504][73265] Fps is (10 sec: 45863.5, 60 sec: 45600.2, 300 sec: 45430.5). Total num frames: 3873128448. Throughput: 0: 45456.5. Samples: 391674680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-13 11:46:45,504][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 11:46:46,782][73497] Updated weights for policy 0, policy_version 236399 (0.0040) [2024-06-13 11:46:50,083][73497] Updated weights for policy 0, policy_version 236409 (0.0025) [2024-06-13 11:46:50,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45602.1, 300 sec: 45375.4). Total num frames: 3873341440. Throughput: 0: 45658.8. Samples: 391817140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-13 11:46:50,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 11:46:53,675][73497] Updated weights for policy 0, policy_version 236419 (0.0038) [2024-06-13 11:46:55,501][73265] Fps is (10 sec: 42608.8, 60 sec: 45055.9, 300 sec: 45264.3). Total num frames: 3873554432. Throughput: 0: 45632.5. Samples: 392089940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-13 11:46:55,502][73265] Avg episode reward: [(0, '0.401')] [2024-06-13 11:46:57,019][73497] Updated weights for policy 0, policy_version 236429 (0.0042) [2024-06-13 11:47:00,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.1, 300 sec: 45430.9). Total num frames: 3873800192. Throughput: 0: 45577.5. Samples: 392357020. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 11:47:00,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 11:47:00,775][73497] Updated weights for policy 0, policy_version 236439 (0.0033) [2024-06-13 11:47:03,874][73497] Updated weights for policy 0, policy_version 236449 (0.0027) [2024-06-13 11:47:05,502][73265] Fps is (10 sec: 45874.8, 60 sec: 45329.0, 300 sec: 45375.3). Total num frames: 3874013184. Throughput: 0: 45655.9. Samples: 392501100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 11:47:05,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 11:47:08,034][73497] Updated weights for policy 0, policy_version 236459 (0.0035) [2024-06-13 11:47:10,501][73265] Fps is (10 sec: 47513.3, 60 sec: 46148.3, 300 sec: 45430.9). Total num frames: 3874275328. Throughput: 0: 45707.5. Samples: 392777700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 11:47:10,502][73265] Avg episode reward: [(0, '0.498')] [2024-06-13 11:47:10,907][73497] Updated weights for policy 0, policy_version 236469 (0.0024) [2024-06-13 11:47:13,202][73477] Signal inference workers to stop experience collection... (5750 times) [2024-06-13 11:47:13,203][73477] Signal inference workers to resume experience collection... (5750 times) [2024-06-13 11:47:13,247][73497] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-06-13 11:47:13,247][73497] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-06-13 11:47:15,099][73497] Updated weights for policy 0, policy_version 236479 (0.0034) [2024-06-13 11:47:15,502][73265] Fps is (10 sec: 45875.5, 60 sec: 45056.0, 300 sec: 45264.2). Total num frames: 3874471936. Throughput: 0: 45564.9. Samples: 393044420. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 11:47:15,502][73265] Avg episode reward: [(0, '0.497')] [2024-06-13 11:47:18,458][73497] Updated weights for policy 0, policy_version 236489 (0.0031) [2024-06-13 11:47:20,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45602.2, 300 sec: 45320.6). Total num frames: 3874717696. Throughput: 0: 45135.1. Samples: 393178780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 11:47:20,502][73265] Avg episode reward: [(0, '0.499')] [2024-06-13 11:47:22,175][73497] Updated weights for policy 0, policy_version 236499 (0.0033) [2024-06-13 11:47:25,501][73265] Fps is (10 sec: 47513.8, 60 sec: 46148.2, 300 sec: 45430.9). Total num frames: 3874947072. Throughput: 0: 45421.8. Samples: 393457660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 11:47:25,502][73265] Avg episode reward: [(0, '0.480')] [2024-06-13 11:47:25,674][73497] Updated weights for policy 0, policy_version 236509 (0.0028) [2024-06-13 11:47:29,572][73497] Updated weights for policy 0, policy_version 236519 (0.0026) [2024-06-13 11:47:30,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45329.2, 300 sec: 45319.8). Total num frames: 3875176448. Throughput: 0: 45747.5. Samples: 393733200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 11:47:30,502][73265] Avg episode reward: [(0, '0.454')] [2024-06-13 11:47:32,654][73497] Updated weights for policy 0, policy_version 236529 (0.0030) [2024-06-13 11:47:35,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 3875389440. Throughput: 0: 45528.4. Samples: 393865920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 11:47:35,502][73265] Avg episode reward: [(0, '0.489')] [2024-06-13 11:47:36,735][73497] Updated weights for policy 0, policy_version 236539 (0.0041) [2024-06-13 11:47:39,798][73497] Updated weights for policy 0, policy_version 236549 (0.0036) [2024-06-13 11:47:40,502][73265] Fps is (10 sec: 45874.2, 60 sec: 45875.1, 300 sec: 45486.4). 
Total num frames: 3875635200. Throughput: 0: 45571.0. Samples: 394140640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 11:47:40,502][73265] Avg episode reward: [(0, '0.496')] [2024-06-13 11:47:43,943][73497] Updated weights for policy 0, policy_version 236559 (0.0030) [2024-06-13 11:47:45,502][73265] Fps is (10 sec: 47513.1, 60 sec: 45603.9, 300 sec: 45375.3). Total num frames: 3875864576. Throughput: 0: 45797.2. Samples: 394417900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 11:47:45,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 11:47:47,159][73497] Updated weights for policy 0, policy_version 236569 (0.0025) [2024-06-13 11:47:50,501][73265] Fps is (10 sec: 44237.7, 60 sec: 45602.2, 300 sec: 45375.4). Total num frames: 3876077568. Throughput: 0: 45713.1. Samples: 394558180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 11:47:50,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 11:47:50,772][73497] Updated weights for policy 0, policy_version 236579 (0.0026) [2024-06-13 11:47:54,385][73497] Updated weights for policy 0, policy_version 236589 (0.0038) [2024-06-13 11:47:55,501][73265] Fps is (10 sec: 45875.8, 60 sec: 46148.3, 300 sec: 45430.9). Total num frames: 3876323328. Throughput: 0: 45869.4. Samples: 394841820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 11:47:55,502][73265] Avg episode reward: [(0, '0.471')] [2024-06-13 11:47:57,937][73497] Updated weights for policy 0, policy_version 236599 (0.0031) [2024-06-13 11:48:00,501][73265] Fps is (10 sec: 47513.7, 60 sec: 45875.2, 300 sec: 45375.4). Total num frames: 3876552704. Throughput: 0: 45951.3. Samples: 395112220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 11:48:00,502][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 11:48:01,222][73497] Updated weights for policy 0, policy_version 236609 (0.0034) [2024-06-13 11:48:05,074][73497] Updated weights for policy 0, policy_version 236619 (0.0040) [2024-06-13 11:48:05,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45875.3, 300 sec: 45375.7). Total num frames: 3876765696. Throughput: 0: 46202.2. Samples: 395257880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-13 11:48:05,502][73265] Avg episode reward: [(0, '0.435')] [2024-06-13 11:48:08,344][73497] Updated weights for policy 0, policy_version 236629 (0.0032) [2024-06-13 11:48:10,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.1, 300 sec: 45375.3). Total num frames: 3876995072. Throughput: 0: 45811.7. Samples: 395519180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 11:48:10,502][73265] Avg episode reward: [(0, '0.408')] [2024-06-13 11:48:12,137][73497] Updated weights for policy 0, policy_version 236639 (0.0028) [2024-06-13 11:48:15,504][73265] Fps is (10 sec: 47502.2, 60 sec: 46146.4, 300 sec: 45430.5). Total num frames: 3877240832. Throughput: 0: 45630.3. Samples: 395786680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 11:48:15,504][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 11:48:16,095][73497] Updated weights for policy 0, policy_version 236649 (0.0044) [2024-06-13 11:48:19,488][73497] Updated weights for policy 0, policy_version 236659 (0.0036) [2024-06-13 11:48:20,501][73265] Fps is (10 sec: 47513.2, 60 sec: 45875.2, 300 sec: 45486.4). Total num frames: 3877470208. Throughput: 0: 45977.3. Samples: 395934900. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 11:48:20,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 11:48:23,226][73497] Updated weights for policy 0, policy_version 236669 (0.0036) [2024-06-13 11:48:25,501][73265] Fps is (10 sec: 42608.7, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 3877666816. Throughput: 0: 45996.2. Samples: 396210460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 11:48:25,502][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 11:48:25,528][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000236675_3877683200.pth... [2024-06-13 11:48:25,571][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000236009_3866771456.pth [2024-06-13 11:48:26,455][73497] Updated weights for policy 0, policy_version 236679 (0.0032) [2024-06-13 11:48:27,415][73477] Signal inference workers to stop experience collection... (5800 times) [2024-06-13 11:48:27,415][73477] Signal inference workers to resume experience collection... (5800 times) [2024-06-13 11:48:27,432][73497] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-06-13 11:48:27,432][73497] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-06-13 11:48:30,206][73497] Updated weights for policy 0, policy_version 236689 (0.0041) [2024-06-13 11:48:30,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45602.1, 300 sec: 45375.4). Total num frames: 3877912576. Throughput: 0: 45840.6. Samples: 396480720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 11:48:30,502][73265] Avg episode reward: [(0, '0.450')] [2024-06-13 11:48:33,954][73497] Updated weights for policy 0, policy_version 236699 (0.0028) [2024-06-13 11:48:35,501][73265] Fps is (10 sec: 49152.4, 60 sec: 46148.3, 300 sec: 45542.0). Total num frames: 3878158336. Throughput: 0: 45739.6. Samples: 396616460. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 11:48:35,502][73265] Avg episode reward: [(0, '0.472')] [2024-06-13 11:48:37,659][73497] Updated weights for policy 0, policy_version 236709 (0.0032) [2024-06-13 11:48:40,501][73265] Fps is (10 sec: 44237.0, 60 sec: 45329.2, 300 sec: 45319.8). Total num frames: 3878354944. Throughput: 0: 45411.1. Samples: 396885320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 11:48:40,502][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 11:48:41,238][73497] Updated weights for policy 0, policy_version 236719 (0.0027) [2024-06-13 11:48:44,722][73497] Updated weights for policy 0, policy_version 236729 (0.0039) [2024-06-13 11:48:45,501][73265] Fps is (10 sec: 42598.2, 60 sec: 45329.2, 300 sec: 45375.3). Total num frames: 3878584320. Throughput: 0: 45534.2. Samples: 397161260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 11:48:45,502][73265] Avg episode reward: [(0, '0.452')] [2024-06-13 11:48:48,110][73497] Updated weights for policy 0, policy_version 236739 (0.0023) [2024-06-13 11:48:50,504][73265] Fps is (10 sec: 47501.6, 60 sec: 45873.3, 300 sec: 45541.6). Total num frames: 3878830080. Throughput: 0: 45464.2. Samples: 397303880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 11:48:50,505][73265] Avg episode reward: [(0, '0.472')] [2024-06-13 11:48:52,247][73497] Updated weights for policy 0, policy_version 236749 (0.0026) [2024-06-13 11:48:55,116][73497] Updated weights for policy 0, policy_version 236759 (0.0031) [2024-06-13 11:48:55,502][73265] Fps is (10 sec: 47512.8, 60 sec: 45602.0, 300 sec: 45486.4). Total num frames: 3879059456. Throughput: 0: 45805.1. Samples: 397580420. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 11:48:55,502][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 11:48:59,262][73497] Updated weights for policy 0, policy_version 236769 (0.0040) [2024-06-13 11:49:00,501][73265] Fps is (10 sec: 42609.2, 60 sec: 45056.0, 300 sec: 45319.8). Total num frames: 3879256064. Throughput: 0: 45818.5. Samples: 397848400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 11:49:00,502][73265] Avg episode reward: [(0, '0.454')] [2024-06-13 11:49:02,577][73497] Updated weights for policy 0, policy_version 236779 (0.0038) [2024-06-13 11:49:05,501][73265] Fps is (10 sec: 45876.1, 60 sec: 45875.3, 300 sec: 45542.0). Total num frames: 3879518208. Throughput: 0: 45351.2. Samples: 397975700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 11:49:05,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 11:49:06,443][73497] Updated weights for policy 0, policy_version 236789 (0.0035) [2024-06-13 11:49:09,807][73497] Updated weights for policy 0, policy_version 236799 (0.0039) [2024-06-13 11:49:10,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 3879714816. Throughput: 0: 45367.6. Samples: 398252000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 11:49:10,502][73265] Avg episode reward: [(0, '0.506')] [2024-06-13 11:49:13,273][73497] Updated weights for policy 0, policy_version 236809 (0.0027) [2024-06-13 11:49:15,502][73265] Fps is (10 sec: 42597.7, 60 sec: 45057.7, 300 sec: 45430.9). Total num frames: 3879944192. Throughput: 0: 45696.8. Samples: 398537080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-13 11:49:15,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 11:49:17,338][73497] Updated weights for policy 0, policy_version 236819 (0.0042) [2024-06-13 11:49:20,501][73265] Fps is (10 sec: 47513.8, 60 sec: 45329.1, 300 sec: 45486.4). Total num frames: 3880189952. Throughput: 0: 45534.6. Samples: 398665520. 
Policy #0 lag: (min: 2.0, avg: 10.6, max: 26.0) [2024-06-13 11:49:20,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 11:49:21,267][73497] Updated weights for policy 0, policy_version 236829 (0.0034) [2024-06-13 11:49:24,140][73497] Updated weights for policy 0, policy_version 236839 (0.0049) [2024-06-13 11:49:24,397][73477] Signal inference workers to stop experience collection... (5850 times) [2024-06-13 11:49:24,446][73497] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-06-13 11:49:24,452][73477] Signal inference workers to resume experience collection... (5850 times) [2024-06-13 11:49:24,461][73497] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-06-13 11:49:25,501][73265] Fps is (10 sec: 50790.7, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 3880452096. Throughput: 0: 45794.6. Samples: 398946080. Policy #0 lag: (min: 2.0, avg: 10.6, max: 26.0) [2024-06-13 11:49:25,502][73265] Avg episode reward: [(0, '0.429')] [2024-06-13 11:49:28,156][73497] Updated weights for policy 0, policy_version 236849 (0.0038) [2024-06-13 11:49:30,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 3880632320. Throughput: 0: 45707.6. Samples: 399218100. Policy #0 lag: (min: 2.0, avg: 10.6, max: 26.0) [2024-06-13 11:49:30,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 11:49:31,135][73497] Updated weights for policy 0, policy_version 236859 (0.0037) [2024-06-13 11:49:35,174][73497] Updated weights for policy 0, policy_version 236869 (0.0031) [2024-06-13 11:49:35,501][73265] Fps is (10 sec: 40960.6, 60 sec: 45056.0, 300 sec: 45319.8). Total num frames: 3880861696. Throughput: 0: 45372.8. Samples: 399345540. 
Policy #0 lag: (min: 2.0, avg: 10.6, max: 26.0) [2024-06-13 11:49:35,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 11:49:38,452][73497] Updated weights for policy 0, policy_version 236879 (0.0034) [2024-06-13 11:49:40,501][73265] Fps is (10 sec: 49151.6, 60 sec: 46148.2, 300 sec: 45708.6). Total num frames: 3881123840. Throughput: 0: 45415.7. Samples: 399624120. Policy #0 lag: (min: 2.0, avg: 10.6, max: 26.0) [2024-06-13 11:49:40,503][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 11:49:41,985][73497] Updated weights for policy 0, policy_version 236889 (0.0031) [2024-06-13 11:49:45,501][73265] Fps is (10 sec: 45875.0, 60 sec: 45602.2, 300 sec: 45430.9). Total num frames: 3881320448. Throughput: 0: 45792.4. Samples: 399909060. Policy #0 lag: (min: 2.0, avg: 10.6, max: 26.0) [2024-06-13 11:49:45,502][73265] Avg episode reward: [(0, '0.488')] [2024-06-13 11:49:45,652][73497] Updated weights for policy 0, policy_version 236899 (0.0035) [2024-06-13 11:49:49,694][73497] Updated weights for policy 0, policy_version 236909 (0.0026) [2024-06-13 11:49:50,501][73265] Fps is (10 sec: 42598.8, 60 sec: 45331.0, 300 sec: 45375.4). Total num frames: 3881549824. Throughput: 0: 45783.6. Samples: 400035960. Policy #0 lag: (min: 2.0, avg: 10.6, max: 26.0) [2024-06-13 11:49:50,502][73265] Avg episode reward: [(0, '0.407')] [2024-06-13 11:49:52,311][73497] Updated weights for policy 0, policy_version 236919 (0.0040) [2024-06-13 11:49:55,502][73265] Fps is (10 sec: 47512.7, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 3881795584. Throughput: 0: 45843.4. Samples: 400314960. 
Policy #0 lag: (min: 2.0, avg: 10.6, max: 26.0) [2024-06-13 11:49:55,502][73265] Avg episode reward: [(0, '0.423')] [2024-06-13 11:49:56,476][73497] Updated weights for policy 0, policy_version 236929 (0.0035) [2024-06-13 11:49:59,790][73497] Updated weights for policy 0, policy_version 236939 (0.0038) [2024-06-13 11:50:00,501][73265] Fps is (10 sec: 50790.3, 60 sec: 46694.4, 300 sec: 45708.6). Total num frames: 3882057728. Throughput: 0: 45687.3. Samples: 400593000. Policy #0 lag: (min: 2.0, avg: 10.6, max: 26.0) [2024-06-13 11:50:00,502][73265] Avg episode reward: [(0, '0.447')] [2024-06-13 11:50:03,298][73497] Updated weights for policy 0, policy_version 236949 (0.0030) [2024-06-13 11:50:05,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45602.1, 300 sec: 45486.4). Total num frames: 3882254336. Throughput: 0: 45985.8. Samples: 400734880. Policy #0 lag: (min: 2.0, avg: 10.6, max: 26.0) [2024-06-13 11:50:05,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 11:50:06,867][73497] Updated weights for policy 0, policy_version 236959 (0.0036) [2024-06-13 11:50:10,125][73497] Updated weights for policy 0, policy_version 236969 (0.0035) [2024-06-13 11:50:10,502][73265] Fps is (10 sec: 44236.1, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 3882500096. Throughput: 0: 45812.8. Samples: 401007660. Policy #0 lag: (min: 2.0, avg: 10.6, max: 26.0) [2024-06-13 11:50:10,502][73265] Avg episode reward: [(0, '0.477')] [2024-06-13 11:50:13,802][73497] Updated weights for policy 0, policy_version 236979 (0.0024) [2024-06-13 11:50:15,501][73265] Fps is (10 sec: 47513.7, 60 sec: 46421.4, 300 sec: 45653.1). Total num frames: 3882729472. Throughput: 0: 45930.6. Samples: 401284980. 
Policy #0 lag: (min: 2.0, avg: 10.6, max: 26.0) [2024-06-13 11:50:15,502][73265] Avg episode reward: [(0, '0.414')] [2024-06-13 11:50:17,591][73497] Updated weights for policy 0, policy_version 236989 (0.0040) [2024-06-13 11:50:20,501][73265] Fps is (10 sec: 47514.4, 60 sec: 46421.4, 300 sec: 45653.0). Total num frames: 3882975232. Throughput: 0: 46344.4. Samples: 401431040. Policy #0 lag: (min: 2.0, avg: 10.6, max: 26.0) [2024-06-13 11:50:20,502][73265] Avg episode reward: [(0, '0.449')] [2024-06-13 11:50:20,715][73497] Updated weights for policy 0, policy_version 236999 (0.0027) [2024-06-13 11:50:24,835][73497] Updated weights for policy 0, policy_version 237009 (0.0029) [2024-06-13 11:50:25,502][73265] Fps is (10 sec: 42597.9, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 3883155456. Throughput: 0: 46156.8. Samples: 401701180. Policy #0 lag: (min: 2.0, avg: 10.6, max: 26.0) [2024-06-13 11:50:25,502][73265] Avg episode reward: [(0, '0.480')] [2024-06-13 11:50:25,594][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000237010_3883171840.pth... [2024-06-13 11:50:25,662][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000236340_3872194560.pth [2024-06-13 11:50:28,247][73497] Updated weights for policy 0, policy_version 237019 (0.0037) [2024-06-13 11:50:30,501][73265] Fps is (10 sec: 44236.9, 60 sec: 46421.4, 300 sec: 45708.6). Total num frames: 3883417600. Throughput: 0: 45819.1. Samples: 401970920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 11:50:30,502][73265] Avg episode reward: [(0, '0.490')] [2024-06-13 11:50:31,891][73497] Updated weights for policy 0, policy_version 237029 (0.0038) [2024-06-13 11:50:35,501][73265] Fps is (10 sec: 47514.3, 60 sec: 46148.2, 300 sec: 45597.5). Total num frames: 3883630592. Throughput: 0: 46193.3. Samples: 402114660. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 11:50:35,502][73265] Avg episode reward: [(0, '0.470')] [2024-06-13 11:50:35,704][73497] Updated weights for policy 0, policy_version 237039 (0.0030) [2024-06-13 11:50:37,891][73477] Signal inference workers to stop experience collection... (5900 times) [2024-06-13 11:50:37,892][73477] Signal inference workers to resume experience collection... (5900 times) [2024-06-13 11:50:37,933][73497] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-06-13 11:50:37,934][73497] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-06-13 11:50:38,728][73497] Updated weights for policy 0, policy_version 237049 (0.0031) [2024-06-13 11:50:40,501][73265] Fps is (10 sec: 44236.5, 60 sec: 45602.2, 300 sec: 45653.1). Total num frames: 3883859968. Throughput: 0: 45958.4. Samples: 402383080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 11:50:40,502][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 11:50:42,582][73497] Updated weights for policy 0, policy_version 237059 (0.0031) [2024-06-13 11:50:45,501][73265] Fps is (10 sec: 49152.1, 60 sec: 46694.4, 300 sec: 45819.7). Total num frames: 3884122112. Throughput: 0: 45949.8. Samples: 402660740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 11:50:45,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 11:50:45,768][73497] Updated weights for policy 0, policy_version 237069 (0.0039) [2024-06-13 11:50:49,457][73497] Updated weights for policy 0, policy_version 237079 (0.0037) [2024-06-13 11:50:50,501][73265] Fps is (10 sec: 45875.0, 60 sec: 46148.2, 300 sec: 45653.0). Total num frames: 3884318720. Throughput: 0: 45813.3. Samples: 402796480. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 11:50:50,502][73265] Avg episode reward: [(0, '0.490')] [2024-06-13 11:50:53,563][73497] Updated weights for policy 0, policy_version 237089 (0.0035) [2024-06-13 11:50:55,504][73265] Fps is (10 sec: 42587.4, 60 sec: 45873.4, 300 sec: 45708.2). Total num frames: 3884548096. Throughput: 0: 45821.1. Samples: 403069720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 11:50:55,504][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 11:50:56,696][73497] Updated weights for policy 0, policy_version 237099 (0.0025) [2024-06-13 11:51:00,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45329.1, 300 sec: 45708.6). Total num frames: 3884777472. Throughput: 0: 45716.5. Samples: 403342220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 11:51:00,502][73265] Avg episode reward: [(0, '0.488')] [2024-06-13 11:51:00,580][73497] Updated weights for policy 0, policy_version 237109 (0.0035) [2024-06-13 11:51:04,041][73497] Updated weights for policy 0, policy_version 237119 (0.0021) [2024-06-13 11:51:05,501][73265] Fps is (10 sec: 45886.6, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 3885006848. Throughput: 0: 45558.6. Samples: 403481180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 11:51:05,502][73265] Avg episode reward: [(0, '0.485')] [2024-06-13 11:51:07,731][73497] Updated weights for policy 0, policy_version 237129 (0.0025) [2024-06-13 11:51:10,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45875.3, 300 sec: 45708.6). Total num frames: 3885252608. Throughput: 0: 45781.5. Samples: 403761340. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 11:51:10,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 11:51:10,947][73497] Updated weights for policy 0, policy_version 237139 (0.0042) [2024-06-13 11:51:14,675][73497] Updated weights for policy 0, policy_version 237149 (0.0044) [2024-06-13 11:51:15,504][73265] Fps is (10 sec: 47502.0, 60 sec: 45873.3, 300 sec: 45763.7). Total num frames: 3885481984. Throughput: 0: 45961.8. Samples: 404039320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 11:51:15,505][73265] Avg episode reward: [(0, '0.469')] [2024-06-13 11:51:17,847][73497] Updated weights for policy 0, policy_version 237159 (0.0028) [2024-06-13 11:51:20,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45329.0, 300 sec: 45819.7). Total num frames: 3885694976. Throughput: 0: 45646.6. Samples: 404168760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 11:51:20,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 11:51:21,941][73497] Updated weights for policy 0, policy_version 237169 (0.0041) [2024-06-13 11:51:25,326][73497] Updated weights for policy 0, policy_version 237179 (0.0025) [2024-06-13 11:51:25,501][73265] Fps is (10 sec: 45887.1, 60 sec: 46421.5, 300 sec: 45708.6). Total num frames: 3885940736. Throughput: 0: 45989.0. Samples: 404452580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 11:51:25,502][73265] Avg episode reward: [(0, '0.409')] [2024-06-13 11:51:29,183][73497] Updated weights for policy 0, policy_version 237189 (0.0033) [2024-06-13 11:51:30,501][73265] Fps is (10 sec: 47513.9, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 3886170112. Throughput: 0: 45902.2. Samples: 404726340. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 11:51:30,502][73265] Avg episode reward: [(0, '0.501')] [2024-06-13 11:51:32,375][73497] Updated weights for policy 0, policy_version 237199 (0.0033) [2024-06-13 11:51:35,501][73265] Fps is (10 sec: 44236.3, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 3886383104. Throughput: 0: 45975.6. Samples: 404865380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-13 11:51:35,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 11:51:36,024][73497] Updated weights for policy 0, policy_version 237209 (0.0033) [2024-06-13 11:51:39,255][73497] Updated weights for policy 0, policy_version 237219 (0.0031) [2024-06-13 11:51:40,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45875.2, 300 sec: 45709.0). Total num frames: 3886612480. Throughput: 0: 46022.6. Samples: 405140620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 11:51:40,502][73265] Avg episode reward: [(0, '0.462')] [2024-06-13 11:51:43,166][73497] Updated weights for policy 0, policy_version 237229 (0.0047) [2024-06-13 11:51:45,501][73265] Fps is (10 sec: 45875.6, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 3886841856. Throughput: 0: 46069.3. Samples: 405415340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 11:51:45,502][73265] Avg episode reward: [(0, '0.451')] [2024-06-13 11:51:46,559][73497] Updated weights for policy 0, policy_version 237239 (0.0030) [2024-06-13 11:51:49,272][73477] Signal inference workers to stop experience collection... (5950 times) [2024-06-13 11:51:49,273][73477] Signal inference workers to resume experience collection... 
(5950 times) [2024-06-13 11:51:49,289][73497] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-06-13 11:51:49,290][73497] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-06-13 11:51:50,492][73497] Updated weights for policy 0, policy_version 237249 (0.0037) [2024-06-13 11:51:50,502][73265] Fps is (10 sec: 47512.8, 60 sec: 46148.2, 300 sec: 45875.2). Total num frames: 3887087616. Throughput: 0: 45945.7. Samples: 405548740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 11:51:50,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 11:51:53,973][73497] Updated weights for policy 0, policy_version 237259 (0.0039) [2024-06-13 11:51:55,501][73265] Fps is (10 sec: 45874.8, 60 sec: 45877.1, 300 sec: 45764.1). Total num frames: 3887300608. Throughput: 0: 45712.4. Samples: 405818400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 11:51:55,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 11:51:57,575][73497] Updated weights for policy 0, policy_version 237269 (0.0036) [2024-06-13 11:52:00,501][73265] Fps is (10 sec: 45875.7, 60 sec: 46148.2, 300 sec: 45875.2). Total num frames: 3887546368. Throughput: 0: 45662.1. Samples: 406094000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 11:52:00,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 11:52:01,165][73497] Updated weights for policy 0, policy_version 237279 (0.0038) [2024-06-13 11:52:04,722][73497] Updated weights for policy 0, policy_version 237289 (0.0031) [2024-06-13 11:52:05,502][73265] Fps is (10 sec: 47513.0, 60 sec: 46148.2, 300 sec: 45764.1). Total num frames: 3887775744. Throughput: 0: 45982.1. Samples: 406237960. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 11:52:05,502][73265] Avg episode reward: [(0, '0.477')] [2024-06-13 11:52:08,255][73497] Updated weights for policy 0, policy_version 237299 (0.0026) [2024-06-13 11:52:10,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45602.1, 300 sec: 45819.7). Total num frames: 3887988736. Throughput: 0: 45903.0. Samples: 406518220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 11:52:10,502][73265] Avg episode reward: [(0, '0.448')] [2024-06-13 11:52:11,807][73497] Updated weights for policy 0, policy_version 237309 (0.0029) [2024-06-13 11:52:15,501][73265] Fps is (10 sec: 44237.5, 60 sec: 45604.0, 300 sec: 45764.1). Total num frames: 3888218112. Throughput: 0: 45756.4. Samples: 406785380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 11:52:15,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 11:52:15,531][73497] Updated weights for policy 0, policy_version 237319 (0.0031) [2024-06-13 11:52:19,152][73497] Updated weights for policy 0, policy_version 237329 (0.0042) [2024-06-13 11:52:20,501][73265] Fps is (10 sec: 45875.4, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 3888447488. Throughput: 0: 45789.4. Samples: 406925900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 11:52:20,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 11:52:22,940][73497] Updated weights for policy 0, policy_version 237339 (0.0034) [2024-06-13 11:52:25,501][73265] Fps is (10 sec: 45874.9, 60 sec: 45602.0, 300 sec: 45764.1). Total num frames: 3888676864. Throughput: 0: 45608.3. Samples: 407193000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 11:52:25,502][73265] Avg episode reward: [(0, '0.421')] [2024-06-13 11:52:25,583][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000237347_3888693248.pth... 
[2024-06-13 11:52:25,640][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000236675_3877683200.pth [2024-06-13 11:52:26,444][73497] Updated weights for policy 0, policy_version 237349 (0.0037) [2024-06-13 11:52:30,115][73497] Updated weights for policy 0, policy_version 237359 (0.0029) [2024-06-13 11:52:30,502][73265] Fps is (10 sec: 45874.3, 60 sec: 45602.0, 300 sec: 45819.6). Total num frames: 3888906240. Throughput: 0: 45608.2. Samples: 407467720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 11:52:30,502][73265] Avg episode reward: [(0, '0.441')] [2024-06-13 11:52:33,577][73497] Updated weights for policy 0, policy_version 237369 (0.0038) [2024-06-13 11:52:35,504][73265] Fps is (10 sec: 45864.2, 60 sec: 45873.3, 300 sec: 45763.8). Total num frames: 3889135616. Throughput: 0: 45709.2. Samples: 407605760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 11:52:35,504][73265] Avg episode reward: [(0, '0.475')] [2024-06-13 11:52:37,169][73497] Updated weights for policy 0, policy_version 237379 (0.0035) [2024-06-13 11:52:40,502][73265] Fps is (10 sec: 45875.2, 60 sec: 45875.1, 300 sec: 45764.1). Total num frames: 3889364992. Throughput: 0: 45929.7. Samples: 407885240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 11:52:40,502][73265] Avg episode reward: [(0, '0.461')] [2024-06-13 11:52:40,700][73497] Updated weights for policy 0, policy_version 237389 (0.0035) [2024-06-13 11:52:44,315][73497] Updated weights for policy 0, policy_version 237399 (0.0041) [2024-06-13 11:52:45,501][73265] Fps is (10 sec: 44247.7, 60 sec: 45602.1, 300 sec: 45764.1). Total num frames: 3889577984. Throughput: 0: 45793.4. Samples: 408154700. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-13 11:52:45,502][73265] Avg episode reward: [(0, '0.446')] [2024-06-13 11:52:47,853][73497] Updated weights for policy 0, policy_version 237409 (0.0028) [2024-06-13 11:52:50,504][73265] Fps is (10 sec: 44226.4, 60 sec: 45327.3, 300 sec: 45708.2). Total num frames: 3889807360. Throughput: 0: 45479.4. Samples: 408284640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:52:50,504][73265] Avg episode reward: [(0, '0.412')] [2024-06-13 11:52:51,856][73497] Updated weights for policy 0, policy_version 237419 (0.0034) [2024-06-13 11:52:55,263][73497] Updated weights for policy 0, policy_version 237429 (0.0031) [2024-06-13 11:52:55,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 3890036736. Throughput: 0: 45477.8. Samples: 408564720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:52:55,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 11:52:58,783][73497] Updated weights for policy 0, policy_version 237439 (0.0038) [2024-06-13 11:53:00,501][73265] Fps is (10 sec: 47525.5, 60 sec: 45602.2, 300 sec: 45819.7). Total num frames: 3890282496. Throughput: 0: 45756.9. Samples: 408844440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:53:00,502][73265] Avg episode reward: [(0, '0.508')] [2024-06-13 11:53:02,393][73497] Updated weights for policy 0, policy_version 237449 (0.0041) [2024-06-13 11:53:05,501][73265] Fps is (10 sec: 45875.1, 60 sec: 45329.2, 300 sec: 45764.1). Total num frames: 3890495488. Throughput: 0: 45604.4. Samples: 408978100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:53:05,502][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 11:53:06,255][73497] Updated weights for policy 0, policy_version 237459 (0.0034) [2024-06-13 11:53:09,447][73497] Updated weights for policy 0, policy_version 237469 (0.0032) [2024-06-13 11:53:10,501][73265] Fps is (10 sec: 44237.1, 60 sec: 45602.2, 300 sec: 45709.0). Total num frames: 3890724864. Throughput: 0: 45693.9. Samples: 409249220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:53:10,502][73265] Avg episode reward: [(0, '0.466')] [2024-06-13 11:53:13,388][73497] Updated weights for policy 0, policy_version 237479 (0.0040) [2024-06-13 11:53:15,505][73265] Fps is (10 sec: 45860.4, 60 sec: 45599.7, 300 sec: 45708.1). Total num frames: 3890954240. Throughput: 0: 45634.2. Samples: 409521400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:53:15,505][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 11:53:16,773][73497] Updated weights for policy 0, policy_version 237489 (0.0031) [2024-06-13 11:53:20,275][73497] Updated weights for policy 0, policy_version 237499 (0.0033) [2024-06-13 11:53:20,501][73265] Fps is (10 sec: 45874.8, 60 sec: 45602.1, 300 sec: 45819.7). Total num frames: 3891183616. Throughput: 0: 45509.1. Samples: 409653560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:53:20,502][73265] Avg episode reward: [(0, '0.474')] [2024-06-13 11:53:23,969][73497] Updated weights for policy 0, policy_version 237509 (0.0030) [2024-06-13 11:53:25,502][73265] Fps is (10 sec: 47528.2, 60 sec: 45875.1, 300 sec: 45819.6). Total num frames: 3891429376. Throughput: 0: 45521.8. Samples: 409933720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:53:25,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 11:53:27,387][73497] Updated weights for policy 0, policy_version 237519 (0.0033) [2024-06-13 11:53:28,678][73477] Signal inference workers to stop experience collection... (6000 times) [2024-06-13 11:53:28,679][73477] Signal inference workers to resume experience collection... (6000 times) [2024-06-13 11:53:28,712][73497] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-06-13 11:53:28,713][73497] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-06-13 11:53:30,501][73265] Fps is (10 sec: 45874.8, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 3891642368. Throughput: 0: 45659.9. Samples: 410209400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:53:30,502][73265] Avg episode reward: [(0, '0.500')] [2024-06-13 11:53:30,870][73497] Updated weights for policy 0, policy_version 237529 (0.0026) [2024-06-13 11:53:34,957][73497] Updated weights for policy 0, policy_version 237539 (0.0019) [2024-06-13 11:53:35,502][73265] Fps is (10 sec: 44236.7, 60 sec: 45603.9, 300 sec: 45819.6). Total num frames: 3891871744. Throughput: 0: 45750.4. Samples: 410343300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:53:35,502][73265] Avg episode reward: [(0, '0.431')] [2024-06-13 11:53:38,381][73497] Updated weights for policy 0, policy_version 237549 (0.0031) [2024-06-13 11:53:40,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45602.3, 300 sec: 45819.7). Total num frames: 3892101120. Throughput: 0: 45579.1. Samples: 410615780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:53:40,502][73265] Avg episode reward: [(0, '0.487')] [2024-06-13 11:53:42,036][73497] Updated weights for policy 0, policy_version 237559 (0.0044) [2024-06-13 11:53:45,501][73265] Fps is (10 sec: 44237.8, 60 sec: 45602.2, 300 sec: 45709.0). Total num frames: 3892314112. Throughput: 0: 45434.3. 
Samples: 410888980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:53:45,502][73265] Avg episode reward: [(0, '0.478')] [2024-06-13 11:53:45,509][73497] Updated weights for policy 0, policy_version 237569 (0.0041) [2024-06-13 11:53:48,951][73497] Updated weights for policy 0, policy_version 237579 (0.0025) [2024-06-13 11:53:50,502][73265] Fps is (10 sec: 45874.5, 60 sec: 45877.0, 300 sec: 45764.1). Total num frames: 3892559872. Throughput: 0: 45571.4. Samples: 411028820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:53:50,502][73265] Avg episode reward: [(0, '0.491')] [2024-06-13 11:53:52,362][73497] Updated weights for policy 0, policy_version 237589 (0.0035) [2024-06-13 11:53:55,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45875.1, 300 sec: 45875.2). Total num frames: 3892789248. Throughput: 0: 45696.3. Samples: 411305560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:53:55,502][73265] Avg episode reward: [(0, '0.489')] [2024-06-13 11:53:56,207][73497] Updated weights for policy 0, policy_version 237599 (0.0045) [2024-06-13 11:53:59,245][73497] Updated weights for policy 0, policy_version 237609 (0.0039) [2024-06-13 11:54:00,501][73265] Fps is (10 sec: 44237.5, 60 sec: 45329.1, 300 sec: 45708.6). Total num frames: 3893002240. Throughput: 0: 45832.6. Samples: 411583720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:54:00,502][73265] Avg episode reward: [(0, '0.434')] [2024-06-13 11:54:03,343][73497] Updated weights for policy 0, policy_version 237619 (0.0030) [2024-06-13 11:54:05,501][73265] Fps is (10 sec: 45875.2, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 3893248000. Throughput: 0: 45975.1. Samples: 411722440. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:54:05,502][73265] Avg episode reward: [(0, '0.436')] [2024-06-13 11:54:06,966][73497] Updated weights for policy 0, policy_version 237629 (0.0048) [2024-06-13 11:54:10,358][73497] Updated weights for policy 0, policy_version 237639 (0.0026) [2024-06-13 11:54:10,501][73265] Fps is (10 sec: 47513.6, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 3893477376. Throughput: 0: 45745.5. Samples: 411992260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:54:10,502][73265] Avg episode reward: [(0, '0.468')] [2024-06-13 11:54:14,089][73497] Updated weights for policy 0, policy_version 237649 (0.0034) [2024-06-13 11:54:15,501][73265] Fps is (10 sec: 44236.9, 60 sec: 45604.6, 300 sec: 45764.1). Total num frames: 3893690368. Throughput: 0: 45611.2. Samples: 412261900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:54:15,502][73265] Avg episode reward: [(0, '0.518')] [2024-06-13 11:54:17,693][73497] Updated weights for policy 0, policy_version 237659 (0.0031) [2024-06-13 11:54:20,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45602.2, 300 sec: 45653.1). Total num frames: 3893919744. Throughput: 0: 45634.0. Samples: 412396820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:54:20,502][73265] Avg episode reward: [(0, '0.410')] [2024-06-13 11:54:20,927][73497] Updated weights for policy 0, policy_version 237669 (0.0032) [2024-06-13 11:54:25,097][73497] Updated weights for policy 0, policy_version 237679 (0.0027) [2024-06-13 11:54:25,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45329.2, 300 sec: 45819.7). Total num frames: 3894149120. Throughput: 0: 45653.4. Samples: 412670180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:54:25,502][73265] Avg episode reward: [(0, '0.456')] [2024-06-13 11:54:25,538][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000237681_3894165504.pth... 
[2024-06-13 11:54:25,588][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000237010_3883171840.pth [2024-06-13 11:54:27,981][73497] Updated weights for policy 0, policy_version 237689 (0.0028) [2024-06-13 11:54:30,502][73265] Fps is (10 sec: 45874.4, 60 sec: 45602.1, 300 sec: 45819.6). Total num frames: 3894378496. Throughput: 0: 45760.3. Samples: 412948200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:54:30,502][73265] Avg episode reward: [(0, '0.503')] [2024-06-13 11:54:31,907][73497] Updated weights for policy 0, policy_version 237699 (0.0033) [2024-06-13 11:54:35,459][73497] Updated weights for policy 0, policy_version 237709 (0.0037) [2024-06-13 11:54:35,502][73265] Fps is (10 sec: 47512.4, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 3894624256. Throughput: 0: 45721.7. Samples: 413086300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:54:35,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 11:54:39,154][73497] Updated weights for policy 0, policy_version 237719 (0.0039) [2024-06-13 11:54:40,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45602.1, 300 sec: 45819.7). Total num frames: 3894837248. Throughput: 0: 45672.5. Samples: 413360820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:54:40,502][73265] Avg episode reward: [(0, '0.437')] [2024-06-13 11:54:42,793][73497] Updated weights for policy 0, policy_version 237729 (0.0033) [2024-06-13 11:54:45,502][73265] Fps is (10 sec: 45875.5, 60 sec: 46148.1, 300 sec: 45875.2). Total num frames: 3895083008. Throughput: 0: 45652.7. Samples: 413638100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:54:45,502][73265] Avg episode reward: [(0, '0.493')] [2024-06-13 11:54:46,378][73497] Updated weights for policy 0, policy_version 237739 (0.0029) [2024-06-13 11:54:49,842][73497] Updated weights for policy 0, policy_version 237749 (0.0030) [2024-06-13 11:54:50,501][73265] Fps is (10 sec: 45875.3, 60 sec: 45602.2, 300 sec: 45764.1). Total num frames: 3895296000. Throughput: 0: 45536.9. Samples: 413771600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:54:50,502][73265] Avg episode reward: [(0, '0.484')] [2024-06-13 11:54:51,052][73477] Signal inference workers to stop experience collection... (6050 times) [2024-06-13 11:54:51,103][73497] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-06-13 11:54:51,106][73477] Signal inference workers to resume experience collection... (6050 times) [2024-06-13 11:54:51,114][73497] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-06-13 11:54:53,308][73497] Updated weights for policy 0, policy_version 237759 (0.0033) [2024-06-13 11:54:55,501][73265] Fps is (10 sec: 45875.5, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 3895541760. Throughput: 0: 45646.1. Samples: 414046340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:54:55,502][73265] Avg episode reward: [(0, '0.477')] [2024-06-13 11:54:56,669][73497] Updated weights for policy 0, policy_version 237769 (0.0035) [2024-06-13 11:55:00,110][73497] Updated weights for policy 0, policy_version 237779 (0.0029) [2024-06-13 11:55:00,504][73265] Fps is (10 sec: 47501.7, 60 sec: 46146.3, 300 sec: 45819.3). Total num frames: 3895771136. Throughput: 0: 45705.5. Samples: 414318760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-13 11:55:00,504][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 11:55:03,875][73497] Updated weights for policy 0, policy_version 237789 (0.0033) [2024-06-13 11:55:05,501][73265] Fps is (10 sec: 44237.2, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 3895984128. Throughput: 0: 45718.7. Samples: 414454160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 11:55:05,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 11:55:07,646][73497] Updated weights for policy 0, policy_version 237799 (0.0053) [2024-06-13 11:55:10,501][73265] Fps is (10 sec: 45886.4, 60 sec: 45875.1, 300 sec: 45764.1). Total num frames: 3896229888. Throughput: 0: 45930.6. Samples: 414737060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 11:55:10,502][73265] Avg episode reward: [(0, '0.458')] [2024-06-13 11:55:11,373][73497] Updated weights for policy 0, policy_version 237809 (0.0036) [2024-06-13 11:55:14,559][73497] Updated weights for policy 0, policy_version 237819 (0.0043) [2024-06-13 11:55:15,501][73265] Fps is (10 sec: 45874.8, 60 sec: 45875.1, 300 sec: 45653.0). Total num frames: 3896442880. Throughput: 0: 45747.6. Samples: 415006840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 11:55:15,502][73265] Avg episode reward: [(0, '0.438')] [2024-06-13 11:55:18,255][73497] Updated weights for policy 0, policy_version 237829 (0.0035) [2024-06-13 11:55:20,502][73265] Fps is (10 sec: 44236.6, 60 sec: 45875.1, 300 sec: 45819.7). Total num frames: 3896672256. Throughput: 0: 45841.0. Samples: 415149140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 11:55:20,502][73265] Avg episode reward: [(0, '0.483')] [2024-06-13 11:55:21,799][73497] Updated weights for policy 0, policy_version 237839 (0.0027) [2024-06-13 11:55:25,276][73497] Updated weights for policy 0, policy_version 237849 (0.0044) [2024-06-13 11:55:25,502][73265] Fps is (10 sec: 47513.3, 60 sec: 46148.1, 300 sec: 45764.1). 
Total num frames: 3896918016. Throughput: 0: 45777.6. Samples: 415420820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 11:55:25,502][73265] Avg episode reward: [(0, '0.400')] [2024-06-13 11:55:28,749][73497] Updated weights for policy 0, policy_version 237859 (0.0034) [2024-06-13 11:55:30,501][73265] Fps is (10 sec: 45875.8, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 3897131008. Throughput: 0: 45773.5. Samples: 415697900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 11:55:30,502][73265] Avg episode reward: [(0, '0.459')] [2024-06-13 11:55:32,686][73497] Updated weights for policy 0, policy_version 237869 (0.0049) [2024-06-13 11:55:35,501][73265] Fps is (10 sec: 44237.6, 60 sec: 45602.3, 300 sec: 45764.1). Total num frames: 3897360384. Throughput: 0: 45779.6. Samples: 415831680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 11:55:35,502][73265] Avg episode reward: [(0, '0.476')] [2024-06-13 11:55:36,199][73497] Updated weights for policy 0, policy_version 237879 (0.0039) [2024-06-13 11:55:40,451][73497] Updated weights for policy 0, policy_version 237889 (0.0029) [2024-06-13 11:55:40,501][73265] Fps is (10 sec: 44236.6, 60 sec: 45602.1, 300 sec: 45597.5). Total num frames: 3897573376. Throughput: 0: 45711.2. Samples: 416103340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 11:55:40,502][73265] Avg episode reward: [(0, '0.487')] [2024-06-13 11:55:42,899][73497] Updated weights for policy 0, policy_version 237899 (0.0030) [2024-06-13 11:55:45,501][73265] Fps is (10 sec: 47513.3, 60 sec: 45875.3, 300 sec: 45819.7). Total num frames: 3897835520. Throughput: 0: 45929.2. Samples: 416385460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 11:55:45,502][73265] Avg episode reward: [(0, '0.473')] [2024-06-13 11:55:47,327][73497] Updated weights for policy 0, policy_version 237909 (0.0040) [2024-06-13 11:55:50,501][73265] Fps is (10 sec: 49152.2, 60 sec: 46148.3, 300 sec: 45820.1). 
Total num frames: 3898064896. Throughput: 0: 46120.9. Samples: 416529600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 11:55:50,502][73265] Avg episode reward: [(0, '0.505')] [2024-06-13 11:55:50,505][73497] Updated weights for policy 0, policy_version 237919 (0.0039) [2024-06-13 11:55:54,315][73497] Updated weights for policy 0, policy_version 237929 (0.0027) [2024-06-13 11:55:55,501][73265] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 45708.6). Total num frames: 3898261504. Throughput: 0: 45902.7. Samples: 416802680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 11:55:55,502][73265] Avg episode reward: [(0, '0.500')] [2024-06-13 11:55:57,358][73497] Updated weights for policy 0, policy_version 237939 (0.0032) [2024-06-13 11:55:58,657][73477] Signal inference workers to stop experience collection... (6100 times) [2024-06-13 11:55:58,687][73497] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-06-13 11:55:58,770][73477] Signal inference workers to resume experience collection... (6100 times) [2024-06-13 11:55:58,770][73497] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-06-13 11:56:00,504][73265] Fps is (10 sec: 45863.9, 60 sec: 45875.2, 300 sec: 45819.3). Total num frames: 3898523648. Throughput: 0: 45991.4. Samples: 417076560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 11:56:00,504][73265] Avg episode reward: [(0, '0.443')] [2024-06-13 11:56:01,514][73497] Updated weights for policy 0, policy_version 237949 (0.0035) [2024-06-13 11:56:04,425][73497] Updated weights for policy 0, policy_version 237959 (0.0025) [2024-06-13 11:56:05,501][73265] Fps is (10 sec: 47513.4, 60 sec: 45875.1, 300 sec: 45708.6). Total num frames: 3898736640. Throughput: 0: 45915.2. Samples: 417215320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 11:56:05,502][73265] Avg episode reward: [(0, '0.428')] [2024-06-13 11:56:08,287][73497] Updated weights for policy 0, policy_version 237969 (0.0039) [2024-06-13 11:56:10,501][73265] Fps is (10 sec: 47525.4, 60 sec: 46148.3, 300 sec: 45820.1). Total num frames: 3898998784. Throughput: 0: 46232.6. Samples: 417501280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-13 11:56:10,502][73265] Avg episode reward: [(0, '0.460')] [2024-06-13 11:56:11,312][73497] Updated weights for policy 0, policy_version 237979 (0.0028) [2024-06-13 11:56:15,501][73265] Fps is (10 sec: 45875.7, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 3899195392. Throughput: 0: 46105.4. Samples: 417772640. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-13 11:56:15,502][73265] Avg episode reward: [(0, '0.465')] [2024-06-13 11:56:15,603][73497] Updated weights for policy 0, policy_version 237989 (0.0031) [2024-06-13 11:56:18,990][73497] Updated weights for policy 0, policy_version 237999 (0.0048) [2024-06-13 11:56:20,501][73265] Fps is (10 sec: 44236.7, 60 sec: 46148.4, 300 sec: 45764.1). Total num frames: 3899441152. Throughput: 0: 46224.0. Samples: 417911760. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-13 11:56:20,502][73265] Avg episode reward: [(0, '0.467')] [2024-06-13 11:56:22,941][73497] Updated weights for policy 0, policy_version 238009 (0.0024) [2024-06-13 11:56:25,502][73265] Fps is (10 sec: 47512.8, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 3899670528. Throughput: 0: 46115.9. Samples: 418178560. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-13 11:56:25,502][73265] Avg episode reward: [(0, '0.478')] [2024-06-13 11:56:25,512][73477] Saving /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000238017_3899670528.pth... 
[2024-06-13 11:56:25,568][73477] Removing /workspace/metta/train_dir/p2.death/checkpoint_p0/checkpoint_000237347_3888693248.pth [2024-06-13 11:56:26,099][73497] Updated weights for policy 0, policy_version 238019 (0.0036) [2024-06-13 11:56:29,946][73497] Updated weights for policy 0, policy_version 238029 (0.0025) [2024-06-13 11:56:30,501][73265] Fps is (10 sec: 45875.1, 60 sec: 46148.2, 300 sec: 45819.7). Total num frames: 3899899904. Throughput: 0: 46061.8. Samples: 418458240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-13 11:56:30,502][73265] Avg episode reward: [(0, '0.432')] [2024-06-13 11:56:33,468][73497] Updated weights for policy 0, policy_version 238039 (0.0028) [2024-06-13 11:56:35,501][73265] Fps is (10 sec: 44236.8, 60 sec: 45875.1, 300 sec: 45764.1). Total num frames: 3900112896. Throughput: 0: 45883.4. Samples: 418594360. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-13 11:56:35,502][73265] Avg episode reward: [(0, '0.453')] [2024-06-13 11:56:36,981][73497] Updated weights for policy 0, policy_version 238049 (0.0026) [2024-06-13 11:56:40,195][73497] Updated weights for policy 0, policy_version 238059 (0.0034) [2024-06-13 11:56:40,501][73265] Fps is (10 sec: 45875.2, 60 sec: 46421.3, 300 sec: 45819.7). Total num frames: 3900358656. Throughput: 0: 45861.3. Samples: 418866440. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-13 11:56:40,502][73265] Avg episode reward: [(0, '0.455')] [2024-06-13 11:56:44,469][73497] Updated weights for policy 0, policy_version 238069 (0.0035) [2024-06-13 11:56:45,501][73265] Fps is (10 sec: 47514.3, 60 sec: 45875.2, 300 sec: 45764.2). Total num frames: 3900588032. Throughput: 0: 46028.8. Samples: 419147740. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-13 11:56:45,502][73265] Avg episode reward: [(0, '0.514')] [2024-06-13 11:56:47,544][73497] Updated weights for policy 0, policy_version 238079 (0.0033) [2024-06-13 11:56:50,501][73265] Fps is (10 sec: 44236.7, 60 sec: 45602.1, 300 sec: 45764.1). Total num frames: 3900801024. Throughput: 0: 46051.1. Samples: 419287620. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-13 11:56:50,502][73265] Avg episode reward: [(0, '0.477')] [2024-06-13 11:56:51,193][73497] Updated weights for policy 0, policy_version 238089 (0.0038)