[2023-03-06 13:45:33,322][1875328] Saving configuration to /home/qgallouedec/gia/data/envs/metaworld/train_dir/button-press-topdown-v2/config.json... [2023-03-06 13:45:33,337][1875328] Rollout worker 0 uses device cpu [2023-03-06 13:45:33,337][1875328] Rollout worker 1 uses device cpu [2023-03-06 13:45:33,337][1875328] Rollout worker 2 uses device cpu [2023-03-06 13:45:33,337][1875328] Rollout worker 3 uses device cpu [2023-03-06 13:45:33,338][1875328] Rollout worker 4 uses device cpu [2023-03-06 13:45:33,338][1875328] Rollout worker 5 uses device cpu [2023-03-06 13:45:33,338][1875328] Rollout worker 6 uses device cpu [2023-03-06 13:45:33,338][1875328] Rollout worker 7 uses device cpu [2023-03-06 13:45:33,338][1875328] Rollout worker 8 uses device cpu [2023-03-06 13:45:33,338][1875328] Rollout worker 9 uses device cpu [2023-03-06 13:45:33,338][1875328] Rollout worker 10 uses device cpu [2023-03-06 13:45:33,339][1875328] Rollout worker 11 uses device cpu [2023-03-06 13:45:33,339][1875328] Rollout worker 12 uses device cpu [2023-03-06 13:45:33,339][1875328] Rollout worker 13 uses device cpu [2023-03-06 13:45:33,339][1875328] Rollout worker 14 uses device cpu [2023-03-06 13:45:33,339][1875328] Rollout worker 15 uses device cpu [2023-03-06 13:45:33,339][1875328] Rollout worker 16 uses device cpu [2023-03-06 13:45:33,339][1875328] Rollout worker 17 uses device cpu [2023-03-06 13:45:33,340][1875328] Rollout worker 18 uses device cpu [2023-03-06 13:45:33,340][1875328] Rollout worker 19 uses device cpu [2023-03-06 13:45:33,340][1875328] Rollout worker 20 uses device cpu [2023-03-06 13:45:33,340][1875328] Rollout worker 21 uses device cpu [2023-03-06 13:45:33,340][1875328] Rollout worker 22 uses device cpu [2023-03-06 13:45:33,340][1875328] Rollout worker 23 uses device cpu [2023-03-06 13:45:33,340][1875328] Rollout worker 24 uses device cpu [2023-03-06 13:45:33,341][1875328] Rollout worker 25 uses device cpu [2023-03-06 13:45:33,341][1875328] Rollout worker 26 uses device 
cpu [2023-03-06 13:45:33,341][1875328] Rollout worker 27 uses device cpu [2023-03-06 13:45:33,341][1875328] Rollout worker 28 uses device cpu [2023-03-06 13:45:33,341][1875328] Rollout worker 29 uses device cpu [2023-03-06 13:45:33,341][1875328] Rollout worker 30 uses device cpu [2023-03-06 13:45:33,341][1875328] Rollout worker 31 uses device cpu [2023-03-06 13:45:33,355][1875328] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-03-06 13:45:33,355][1875328] InferenceWorker_p0-w0: min num requests: 10 [2023-03-06 13:45:33,440][1875328] Starting all processes... [2023-03-06 13:45:33,440][1875328] Starting process learner_proc0 [2023-03-06 13:45:33,490][1875328] Starting all processes... [2023-03-06 13:45:33,541][1875328] Starting process inference_proc0-0 [2023-03-06 13:45:33,541][1875328] Starting process rollout_proc0 [2023-03-06 13:45:33,541][1875328] Starting process rollout_proc1 [2023-03-06 13:45:33,541][1875328] Starting process rollout_proc2 [2023-03-06 13:45:33,541][1875328] Starting process rollout_proc3 [2023-03-06 13:45:33,541][1875328] Starting process rollout_proc4 [2023-03-06 13:45:33,542][1875328] Starting process rollout_proc5 [2023-03-06 13:45:33,542][1875328] Starting process rollout_proc6 [2023-03-06 13:45:33,543][1875328] Starting process rollout_proc7 [2023-03-06 13:45:33,544][1875328] Starting process rollout_proc8 [2023-03-06 13:45:33,544][1875328] Starting process rollout_proc9 [2023-03-06 13:45:33,552][1875328] Starting process rollout_proc10 [2023-03-06 13:45:33,553][1875328] Starting process rollout_proc11 [2023-03-06 13:45:33,553][1875328] Starting process rollout_proc12 [2023-03-06 13:45:33,561][1875328] Starting process rollout_proc13 [2023-03-06 13:45:33,562][1875328] Starting process rollout_proc14 [2023-03-06 13:45:33,566][1875328] Starting process rollout_proc15 [2023-03-06 13:45:33,567][1875328] Starting process rollout_proc16 [2023-03-06 13:45:33,567][1875328] Starting process rollout_proc17 [2023-03-06 
13:45:33,580][1875328] Starting process rollout_proc18 [2023-03-06 13:45:33,580][1875328] Starting process rollout_proc19 [2023-03-06 13:45:33,585][1875328] Starting process rollout_proc20 [2023-03-06 13:45:33,593][1875328] Starting process rollout_proc21 [2023-03-06 13:45:33,710][1875328] Starting process rollout_proc22 [2023-03-06 13:45:33,732][1875328] Starting process rollout_proc23 [2023-03-06 13:45:33,733][1875328] Starting process rollout_proc24 [2023-03-06 13:45:33,741][1875328] Starting process rollout_proc25 [2023-03-06 13:45:33,741][1875328] Starting process rollout_proc26 [2023-03-06 13:45:33,742][1875328] Starting process rollout_proc27 [2023-03-06 13:45:33,742][1875328] Starting process rollout_proc28 [2023-03-06 13:45:33,742][1875328] Starting process rollout_proc29 [2023-03-06 13:45:33,743][1875328] Starting process rollout_proc30 [2023-03-06 13:45:33,743][1875328] Starting process rollout_proc31 [2023-03-06 13:45:35,414][1875604] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-03-06 13:45:35,414][1875604] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2023-03-06 13:45:35,424][1875604] Num visible devices: 1 [2023-03-06 13:45:35,461][1875604] WARNING! It is generally recommended to enable Fixed KL loss (https://arxiv.org/pdf/1707.06347.pdf) for continuous action tasks to avoid potential numerical issues. I.e. 
set --kl_loss_coeff=0.1 [2023-03-06 13:45:35,461][1875604] Starting seed is not provided [2023-03-06 13:45:35,461][1875604] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-03-06 13:45:35,461][1875604] Initializing actor-critic model on device cuda:0 [2023-03-06 13:45:35,461][1875604] RunningMeanStd input shape: (39,) [2023-03-06 13:45:35,462][1875604] RunningMeanStd input shape: (1,) [2023-03-06 13:45:35,481][1875656] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-03-06 13:45:35,481][1875656] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2023-03-06 13:45:35,491][1875656] Num visible devices: 1 [2023-03-06 13:45:35,522][1875857] Worker 17 uses CPU cores [17] [2023-03-06 13:45:35,551][1875604] Created Actor Critic model with architecture: [2023-03-06 13:45:35,551][1875604] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (obs): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): MultiInputEncoder( (encoders): ModuleDict( (obs): MlpEncoder( (mlp_head): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Linear) (1): RecursiveScriptModule(original_name=ELU) (2): RecursiveScriptModule(original_name=Linear) (3): RecursiveScriptModule(original_name=ELU) ) ) ) ) (core): ModelCoreRNN( (core): GRU(512, 512) ) (decoder): MlpDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=8, bias=True) ) ) [2023-03-06 13:45:35,726][1876204] Worker 27 uses CPU cores [27] [2023-03-06 13:45:35,896][1875859] Worker 14 uses CPU cores [14] [2023-03-06 13:45:35,899][1876219] Worker 30 uses CPU cores [30] [2023-03-06 13:45:35,956][1876218] Worker 29 uses CPU 
cores [29] [2023-03-06 13:45:36,195][1876220] Worker 24 uses CPU cores [24] [2023-03-06 13:45:36,218][1875895] Worker 18 uses CPU cores [18] [2023-03-06 13:45:36,297][1875854] Worker 10 uses CPU cores [10] [2023-03-06 13:45:36,513][1876221] Worker 31 uses CPU cores [31] [2023-03-06 13:45:36,587][1876025] Worker 22 uses CPU cores [22] [2023-03-06 13:45:36,697][1875659] Worker 2 uses CPU cores [2] [2023-03-06 13:45:36,775][1876089] Worker 25 uses CPU cores [25] [2023-03-06 13:45:36,952][1875665] Worker 5 uses CPU cores [5] [2023-03-06 13:45:37,018][1875894] Worker 11 uses CPU cores [11] [2023-03-06 13:45:37,063][1875604] Using optimizer [2023-03-06 13:45:37,074][1875604] No checkpoints found [2023-03-06 13:45:37,074][1875604] Did not load from checkpoint, starting from scratch! [2023-03-06 13:45:37,075][1875604] Initialized policy 0 weights for model version 0 [2023-03-06 13:45:37,077][1875604] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-03-06 13:45:37,080][1875604] LearnerWorker_p0 finished initialization! [2023-03-06 13:45:37,146][1875855] Worker 21 uses CPU cores [21] [2023-03-06 13:45:37,166][1875656] RunningMeanStd input shape: (39,) [2023-03-06 13:45:37,167][1875656] RunningMeanStd input shape: (1,) [2023-03-06 13:45:37,202][1875897] Worker 7 uses CPU cores [7] [2023-03-06 13:45:37,338][1875658] Worker 1 uses CPU cores [1] [2023-03-06 13:45:37,529][1875661] Worker 3 uses CPU cores [3] [2023-03-06 13:45:37,581][1875993] Worker 23 uses CPU cores [23] [2023-03-06 13:45:37,683][1875862] Worker 9 uses CPU cores [9] [2023-03-06 13:45:37,846][1876217] Worker 28 uses CPU cores [28] [2023-03-06 13:45:37,862][1875856] Worker 20 uses CPU cores [20] [2023-03-06 13:45:37,999][1875328] Inference worker 0-0 is ready! [2023-03-06 13:45:38,000][1875328] All inference workers are ready! Signal rollout workers to start! 
[2023-03-06 13:45:38,074][1875860] Worker 19 uses CPU cores [19] [2023-03-06 13:45:38,079][1875861] Worker 8 uses CPU cores [8] [2023-03-06 13:45:38,150][1875853] Worker 12 uses CPU cores [12] [2023-03-06 13:45:38,446][1875896] Worker 16 uses CPU cores [16] [2023-03-06 13:45:38,642][1875898] Worker 15 uses CPU cores [15] [2023-03-06 13:45:38,770][1875657] Worker 0 uses CPU cores [0] [2023-03-06 13:45:38,906][1875858] Worker 13 uses CPU cores [13] [2023-03-06 13:45:39,070][1875664] Worker 4 uses CPU cores [4] [2023-03-06 13:45:39,174][1876216] Worker 26 uses CPU cores [26] [2023-03-06 13:45:39,378][1875854] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,424][1875821] Worker 6 uses CPU cores [6] [2023-03-06 13:45:39,429][1876025] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,457][1876220] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,459][1875661] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,476][1876218] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,511][1875856] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,513][1875862] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,525][1875894] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,536][1876204] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,562][1876089] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,580][1875859] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,632][1875658] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,638][1875857] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,650][1875993] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,663][1876219] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,672][1875895] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,674][1875659] Decorrelating experience for 0 frames... 
[2023-03-06 13:45:39,683][1875665] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,683][1875855] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,734][1875860] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,743][1876217] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,749][1875897] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,837][1875861] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,926][1876221] Decorrelating experience for 0 frames... [2023-03-06 13:45:39,997][1875853] Decorrelating experience for 0 frames... [2023-03-06 13:45:40,375][1875896] Decorrelating experience for 0 frames... [2023-03-06 13:45:40,412][1875328] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-03-06 13:45:40,632][1875898] Decorrelating experience for 0 frames... [2023-03-06 13:45:40,634][1875858] Decorrelating experience for 0 frames... [2023-03-06 13:45:40,798][1875664] Decorrelating experience for 0 frames... [2023-03-06 13:45:40,900][1876025] Decorrelating experience for 32 frames... [2023-03-06 13:45:40,949][1876216] Decorrelating experience for 0 frames... [2023-03-06 13:45:40,971][1875854] Decorrelating experience for 32 frames... [2023-03-06 13:45:40,981][1875657] Decorrelating experience for 0 frames... [2023-03-06 13:45:41,023][1876220] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,051][1876218] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,057][1875661] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,060][1875856] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,085][1875894] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,116][1876204] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,137][1876089] Decorrelating experience for 32 frames... 
[2023-03-06 13:45:41,143][1875859] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,161][1875862] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,204][1875857] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,227][1875993] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,230][1875860] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,262][1876219] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,263][1875658] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,277][1875895] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,277][1875659] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,282][1875897] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,285][1875861] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,288][1875855] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,293][1875665] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,296][1876217] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,312][1875821] Decorrelating experience for 0 frames... [2023-03-06 13:45:41,460][1875853] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,487][1876221] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,710][1875604] Signal inference workers to stop experience collection... [2023-03-06 13:45:41,715][1875656] InferenceWorker_p0-w0: stopping experience collection [2023-03-06 13:45:41,779][1875858] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,863][1875896] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,888][1875664] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,922][1875898] Decorrelating experience for 32 frames... [2023-03-06 13:45:41,953][1876216] Decorrelating experience for 32 frames... [2023-03-06 13:45:42,018][1875604] Signal inference workers to resume experience collection... 
[2023-03-06 13:45:42,018][1875656] InferenceWorker_p0-w0: resuming experience collection [2023-03-06 13:45:42,200][1875657] Decorrelating experience for 32 frames... [2023-03-06 13:45:42,276][1875821] Decorrelating experience for 32 frames... [2023-03-06 13:45:43,202][1875656] Updated weights for policy 0, policy_version 10 (0.0215) [2023-03-06 13:45:43,953][1875656] Updated weights for policy 0, policy_version 20 (0.0005) [2023-03-06 13:45:44,733][1875656] Updated weights for policy 0, policy_version 30 (0.0005) [2023-03-06 13:45:45,412][1875328] Fps is (10 sec: 7987.5, 60 sec: 7987.5, 300 sec: 7987.5). Total num frames: 39936. Throughput: 0: 4501.4. Samples: 22506. Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0) [2023-03-06 13:45:45,413][1875328] Avg episode reward: [(0, '275.095')] [2023-03-06 13:45:45,476][1875656] Updated weights for policy 0, policy_version 40 (0.0006) [2023-03-06 13:45:46,229][1875656] Updated weights for policy 0, policy_version 50 (0.0005) [2023-03-06 13:45:47,002][1875656] Updated weights for policy 0, policy_version 60 (0.0006) [2023-03-06 13:45:47,769][1875656] Updated weights for policy 0, policy_version 70 (0.0006) [2023-03-06 13:45:48,520][1875656] Updated weights for policy 0, policy_version 80 (0.0007) [2023-03-06 13:45:49,274][1875656] Updated weights for policy 0, policy_version 90 (0.0006) [2023-03-06 13:45:50,041][1875656] Updated weights for policy 0, policy_version 100 (0.0005) [2023-03-06 13:45:50,412][1875328] Fps is (10 sec: 10649.8, 60 sec: 10649.8, 300 sec: 10649.8). Total num frames: 106496. Throughput: 0: 10315.1. Samples: 103149. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:45:50,413][1875328] Avg episode reward: [(0, '531.196')] [2023-03-06 13:45:50,420][1875604] Saving new best policy, reward=531.196! 
[2023-03-06 13:45:50,784][1875656] Updated weights for policy 0, policy_version 110 (0.0007) [2023-03-06 13:45:51,543][1875656] Updated weights for policy 0, policy_version 120 (0.0006) [2023-03-06 13:45:52,325][1875656] Updated weights for policy 0, policy_version 130 (0.0007) [2023-03-06 13:45:53,083][1875656] Updated weights for policy 0, policy_version 140 (0.0005) [2023-03-06 13:45:53,350][1875328] Heartbeat connected on Batcher_0 [2023-03-06 13:45:53,353][1875328] Heartbeat connected on LearnerWorker_p0 [2023-03-06 13:45:53,358][1875328] Heartbeat connected on RolloutWorker_w0 [2023-03-06 13:45:53,359][1875328] Heartbeat connected on InferenceWorker_p0-w0 [2023-03-06 13:45:53,360][1875328] Heartbeat connected on RolloutWorker_w1 [2023-03-06 13:45:53,362][1875328] Heartbeat connected on RolloutWorker_w2 [2023-03-06 13:45:53,364][1875328] Heartbeat connected on RolloutWorker_w3 [2023-03-06 13:45:53,365][1875328] Heartbeat connected on RolloutWorker_w4 [2023-03-06 13:45:53,367][1875328] Heartbeat connected on RolloutWorker_w5 [2023-03-06 13:45:53,370][1875328] Heartbeat connected on RolloutWorker_w6 [2023-03-06 13:45:53,371][1875328] Heartbeat connected on RolloutWorker_w7 [2023-03-06 13:45:53,375][1875328] Heartbeat connected on RolloutWorker_w9 [2023-03-06 13:45:53,376][1875328] Heartbeat connected on RolloutWorker_w8 [2023-03-06 13:45:53,377][1875328] Heartbeat connected on RolloutWorker_w10 [2023-03-06 13:45:53,402][1875328] Heartbeat connected on RolloutWorker_w11 [2023-03-06 13:45:53,403][1875328] Heartbeat connected on RolloutWorker_w12 [2023-03-06 13:45:53,405][1875328] Heartbeat connected on RolloutWorker_w13 [2023-03-06 13:45:53,407][1875328] Heartbeat connected on RolloutWorker_w14 [2023-03-06 13:45:53,409][1875328] Heartbeat connected on RolloutWorker_w15 [2023-03-06 13:45:53,410][1875328] Heartbeat connected on RolloutWorker_w16 [2023-03-06 13:45:53,412][1875328] Heartbeat connected on RolloutWorker_w17 [2023-03-06 13:45:53,414][1875328] Heartbeat 
connected on RolloutWorker_w18 [2023-03-06 13:45:53,416][1875328] Heartbeat connected on RolloutWorker_w19 [2023-03-06 13:45:53,419][1875328] Heartbeat connected on RolloutWorker_w20 [2023-03-06 13:45:53,419][1875328] Heartbeat connected on RolloutWorker_w21 [2023-03-06 13:45:53,422][1875328] Heartbeat connected on RolloutWorker_w22 [2023-03-06 13:45:53,424][1875328] Heartbeat connected on RolloutWorker_w23 [2023-03-06 13:45:53,427][1875328] Heartbeat connected on RolloutWorker_w25 [2023-03-06 13:45:53,428][1875328] Heartbeat connected on RolloutWorker_w24 [2023-03-06 13:45:53,429][1875328] Heartbeat connected on RolloutWorker_w26 [2023-03-06 13:45:53,431][1875328] Heartbeat connected on RolloutWorker_w27 [2023-03-06 13:45:53,432][1875328] Heartbeat connected on RolloutWorker_w28 [2023-03-06 13:45:53,436][1875328] Heartbeat connected on RolloutWorker_w29 [2023-03-06 13:45:53,436][1875328] Heartbeat connected on RolloutWorker_w30 [2023-03-06 13:45:53,438][1875328] Heartbeat connected on RolloutWorker_w31 [2023-03-06 13:45:53,828][1875656] Updated weights for policy 0, policy_version 150 (0.0005) [2023-03-06 13:45:54,610][1875656] Updated weights for policy 0, policy_version 160 (0.0006) [2023-03-06 13:45:55,369][1875656] Updated weights for policy 0, policy_version 170 (0.0007) [2023-03-06 13:45:55,412][1875328] Fps is (10 sec: 13414.3, 60 sec: 11605.4, 300 sec: 11605.4). Total num frames: 174080. Throughput: 0: 9568.3. Samples: 143523. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:45:55,413][1875328] Avg episode reward: [(0, '651.861')] [2023-03-06 13:45:55,413][1875604] Saving new best policy, reward=651.861! 
[2023-03-06 13:45:56,124][1875656] Updated weights for policy 0, policy_version 180 (0.0006) [2023-03-06 13:45:56,903][1875656] Updated weights for policy 0, policy_version 190 (0.0007) [2023-03-06 13:45:57,660][1875656] Updated weights for policy 0, policy_version 200 (0.0008) [2023-03-06 13:45:58,400][1875656] Updated weights for policy 0, policy_version 210 (0.0006) [2023-03-06 13:45:59,191][1875656] Updated weights for policy 0, policy_version 220 (0.0006) [2023-03-06 13:45:59,956][1875656] Updated weights for policy 0, policy_version 230 (0.0007) [2023-03-06 13:46:00,412][1875328] Fps is (10 sec: 13516.9, 60 sec: 12083.4, 300 sec: 12083.4). Total num frames: 241664. Throughput: 0: 11213.9. Samples: 224275. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:46:00,412][1875328] Avg episode reward: [(0, '1028.411')] [2023-03-06 13:46:00,416][1875604] Saving new best policy, reward=1028.411! [2023-03-06 13:46:00,717][1875656] Updated weights for policy 0, policy_version 240 (0.0006) [2023-03-06 13:46:01,480][1875656] Updated weights for policy 0, policy_version 250 (0.0006) [2023-03-06 13:46:02,229][1875656] Updated weights for policy 0, policy_version 260 (0.0005) [2023-03-06 13:46:02,977][1875656] Updated weights for policy 0, policy_version 270 (0.0006) [2023-03-06 13:46:03,745][1875656] Updated weights for policy 0, policy_version 280 (0.0006) [2023-03-06 13:46:04,517][1875656] Updated weights for policy 0, policy_version 290 (0.0006) [2023-03-06 13:46:05,281][1875656] Updated weights for policy 0, policy_version 300 (0.0007) [2023-03-06 13:46:05,412][1875328] Fps is (10 sec: 13414.5, 60 sec: 12329.0, 300 sec: 12329.0). Total num frames: 308224. Throughput: 0: 12191.5. Samples: 304787. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:46:05,413][1875328] Avg episode reward: [(0, '669.358')] [2023-03-06 13:46:06,036][1875656] Updated weights for policy 0, policy_version 310 (0.0006) [2023-03-06 13:46:06,843][1875656] Updated weights for policy 0, policy_version 320 (0.0006) [2023-03-06 13:46:07,582][1875656] Updated weights for policy 0, policy_version 330 (0.0006) [2023-03-06 13:46:08,347][1875656] Updated weights for policy 0, policy_version 340 (0.0005) [2023-03-06 13:46:09,132][1875656] Updated weights for policy 0, policy_version 350 (0.0006) [2023-03-06 13:46:09,884][1875656] Updated weights for policy 0, policy_version 360 (0.0005) [2023-03-06 13:46:10,412][1875328] Fps is (10 sec: 13414.3, 60 sec: 12527.0, 300 sec: 12527.0). Total num frames: 375808. Throughput: 0: 11491.4. Samples: 344739. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:46:10,412][1875328] Avg episode reward: [(0, '963.316')] [2023-03-06 13:46:10,649][1875656] Updated weights for policy 0, policy_version 370 (0.0006) [2023-03-06 13:46:11,448][1875656] Updated weights for policy 0, policy_version 380 (0.0006) [2023-03-06 13:46:12,214][1875656] Updated weights for policy 0, policy_version 390 (0.0006) [2023-03-06 13:46:12,957][1875656] Updated weights for policy 0, policy_version 400 (0.0006) [2023-03-06 13:46:13,738][1875656] Updated weights for policy 0, policy_version 410 (0.0006) [2023-03-06 13:46:14,492][1875656] Updated weights for policy 0, policy_version 420 (0.0006) [2023-03-06 13:46:15,241][1875656] Updated weights for policy 0, policy_version 430 (0.0005) [2023-03-06 13:46:15,412][1875328] Fps is (10 sec: 13414.6, 60 sec: 12639.2, 300 sec: 12639.2). Total num frames: 442368. Throughput: 0: 12135.2. Samples: 424729. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:46:15,413][1875328] Avg episode reward: [(0, '1058.506')] [2023-03-06 13:46:15,414][1875604] Saving new best policy, reward=1058.506! 
[2023-03-06 13:46:16,021][1875656] Updated weights for policy 0, policy_version 440 (0.0006) [2023-03-06 13:46:16,784][1875656] Updated weights for policy 0, policy_version 450 (0.0006) [2023-03-06 13:46:17,539][1875656] Updated weights for policy 0, policy_version 460 (0.0006) [2023-03-06 13:46:18,320][1875656] Updated weights for policy 0, policy_version 470 (0.0007) [2023-03-06 13:46:19,077][1875656] Updated weights for policy 0, policy_version 480 (0.0006) [2023-03-06 13:46:19,837][1875656] Updated weights for policy 0, policy_version 490 (0.0006) [2023-03-06 13:46:20,412][1875328] Fps is (10 sec: 13311.8, 60 sec: 12723.2, 300 sec: 12723.2). Total num frames: 508928. Throughput: 0: 12634.3. Samples: 505372. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:46:20,413][1875328] Avg episode reward: [(0, '1126.911')] [2023-03-06 13:46:20,417][1875604] Saving new best policy, reward=1126.911! [2023-03-06 13:46:20,621][1875656] Updated weights for policy 0, policy_version 500 (0.0006) [2023-03-06 13:46:21,372][1875656] Updated weights for policy 0, policy_version 510 (0.0006) [2023-03-06 13:46:22,146][1875656] Updated weights for policy 0, policy_version 520 (0.0006) [2023-03-06 13:46:22,916][1875656] Updated weights for policy 0, policy_version 530 (0.0005) [2023-03-06 13:46:23,695][1875656] Updated weights for policy 0, policy_version 540 (0.0005) [2023-03-06 13:46:24,451][1875656] Updated weights for policy 0, policy_version 550 (0.0007) [2023-03-06 13:46:25,201][1875656] Updated weights for policy 0, policy_version 560 (0.0006) [2023-03-06 13:46:25,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 12788.7, 300 sec: 12788.7). Total num frames: 575488. Throughput: 0: 12113.2. Samples: 545090. 
Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) [2023-03-06 13:46:25,412][1875328] Avg episode reward: [(0, '711.457')] [2023-03-06 13:46:25,984][1875656] Updated weights for policy 0, policy_version 570 (0.0006) [2023-03-06 13:46:26,732][1875656] Updated weights for policy 0, policy_version 580 (0.0005) [2023-03-06 13:46:27,508][1875656] Updated weights for policy 0, policy_version 590 (0.0006) [2023-03-06 13:46:28,265][1875656] Updated weights for policy 0, policy_version 600 (0.0006) [2023-03-06 13:46:29,009][1875656] Updated weights for policy 0, policy_version 610 (0.0006) [2023-03-06 13:46:29,769][1875656] Updated weights for policy 0, policy_version 620 (0.0007) [2023-03-06 13:46:30,412][1875328] Fps is (10 sec: 13414.6, 60 sec: 12861.5, 300 sec: 12861.5). Total num frames: 643072. Throughput: 0: 13409.6. Samples: 625940. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:46:30,412][1875328] Avg episode reward: [(0, '1037.907')] [2023-03-06 13:46:30,541][1875656] Updated weights for policy 0, policy_version 630 (0.0006) [2023-03-06 13:46:31,293][1875656] Updated weights for policy 0, policy_version 640 (0.0006) [2023-03-06 13:46:32,045][1875656] Updated weights for policy 0, policy_version 650 (0.0006) [2023-03-06 13:46:32,830][1875656] Updated weights for policy 0, policy_version 660 (0.0005) [2023-03-06 13:46:33,585][1875656] Updated weights for policy 0, policy_version 670 (0.0006) [2023-03-06 13:46:34,341][1875656] Updated weights for policy 0, policy_version 680 (0.0005) [2023-03-06 13:46:35,120][1875656] Updated weights for policy 0, policy_version 690 (0.0006) [2023-03-06 13:46:35,412][1875328] Fps is (10 sec: 13414.1, 60 sec: 12902.4, 300 sec: 12902.4). Total num frames: 709632. Throughput: 0: 13400.3. Samples: 706165. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:46:35,413][1875328] Avg episode reward: [(0, '1023.672')] [2023-03-06 13:46:35,891][1875656] Updated weights for policy 0, policy_version 700 (0.0006) [2023-03-06 13:46:36,645][1875656] Updated weights for policy 0, policy_version 710 (0.0005) [2023-03-06 13:46:37,429][1875656] Updated weights for policy 0, policy_version 720 (0.0005) [2023-03-06 13:46:38,202][1875656] Updated weights for policy 0, policy_version 730 (0.0006) [2023-03-06 13:46:38,955][1875656] Updated weights for policy 0, policy_version 740 (0.0005) [2023-03-06 13:46:39,728][1875656] Updated weights for policy 0, policy_version 750 (0.0006) [2023-03-06 13:46:40,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 12936.6, 300 sec: 12936.6). Total num frames: 776192. Throughput: 0: 13391.1. Samples: 746120. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:46:40,413][1875328] Avg episode reward: [(0, '1049.364')] [2023-03-06 13:46:40,490][1875656] Updated weights for policy 0, policy_version 760 (0.0006) [2023-03-06 13:46:41,237][1875656] Updated weights for policy 0, policy_version 770 (0.0005) [2023-03-06 13:46:42,015][1875656] Updated weights for policy 0, policy_version 780 (0.0007) [2023-03-06 13:46:42,779][1875656] Updated weights for policy 0, policy_version 790 (0.0006) [2023-03-06 13:46:43,544][1875656] Updated weights for policy 0, policy_version 800 (0.0006) [2023-03-06 13:46:44,299][1875656] Updated weights for policy 0, policy_version 810 (0.0005) [2023-03-06 13:46:45,050][1875656] Updated weights for policy 0, policy_version 820 (0.0006) [2023-03-06 13:46:45,412][1875328] Fps is (10 sec: 13414.5, 60 sec: 13397.3, 300 sec: 12981.2). Total num frames: 843776. Throughput: 0: 13389.9. Samples: 826823. 
Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-03-06 13:46:45,413][1875328] Avg episode reward: [(0, '968.504')] [2023-03-06 13:46:45,811][1875656] Updated weights for policy 0, policy_version 830 (0.0006) [2023-03-06 13:46:46,567][1875656] Updated weights for policy 0, policy_version 840 (0.0005) [2023-03-06 13:46:47,341][1875656] Updated weights for policy 0, policy_version 850 (0.0006) [2023-03-06 13:46:48,101][1875656] Updated weights for policy 0, policy_version 860 (0.0006) [2023-03-06 13:46:48,857][1875656] Updated weights for policy 0, policy_version 870 (0.0006) [2023-03-06 13:46:49,636][1875656] Updated weights for policy 0, policy_version 880 (0.0006) [2023-03-06 13:46:50,382][1875656] Updated weights for policy 0, policy_version 890 (0.0006) [2023-03-06 13:46:50,412][1875328] Fps is (10 sec: 13516.9, 60 sec: 13414.4, 300 sec: 13019.5). Total num frames: 911360. Throughput: 0: 13392.0. Samples: 907424. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:46:50,412][1875328] Avg episode reward: [(0, '1093.547')] [2023-03-06 13:46:51,133][1875656] Updated weights for policy 0, policy_version 900 (0.0006) [2023-03-06 13:46:51,910][1875656] Updated weights for policy 0, policy_version 910 (0.0007) [2023-03-06 13:46:52,696][1875656] Updated weights for policy 0, policy_version 920 (0.0006) [2023-03-06 13:46:53,457][1875656] Updated weights for policy 0, policy_version 930 (0.0006) [2023-03-06 13:46:54,230][1875656] Updated weights for policy 0, policy_version 940 (0.0006) [2023-03-06 13:46:54,994][1875656] Updated weights for policy 0, policy_version 950 (0.0006) [2023-03-06 13:46:55,412][1875328] Fps is (10 sec: 13414.5, 60 sec: 13397.4, 300 sec: 13039.0). Total num frames: 977920. Throughput: 0: 13398.4. Samples: 947666. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:46:55,413][1875328] Avg episode reward: [(0, '1125.832')] [2023-03-06 13:46:55,743][1875656] Updated weights for policy 0, policy_version 960 (0.0007) [2023-03-06 13:46:56,524][1875656] Updated weights for policy 0, policy_version 970 (0.0006) [2023-03-06 13:46:57,286][1875656] Updated weights for policy 0, policy_version 980 (0.0006) [2023-03-06 13:46:58,044][1875656] Updated weights for policy 0, policy_version 990 (0.0006) [2023-03-06 13:46:58,820][1875656] Updated weights for policy 0, policy_version 1000 (0.0006) [2023-03-06 13:46:59,587][1875656] Updated weights for policy 0, policy_version 1010 (0.0006) [2023-03-06 13:47:00,339][1875656] Updated weights for policy 0, policy_version 1020 (0.0006) [2023-03-06 13:47:00,412][1875328] Fps is (10 sec: 13414.4, 60 sec: 13397.3, 300 sec: 13068.8). Total num frames: 1045504. Throughput: 0: 13402.7. Samples: 1027851. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:47:00,412][1875328] Avg episode reward: [(0, '1079.160')] [2023-03-06 13:47:01,118][1875656] Updated weights for policy 0, policy_version 1030 (0.0006) [2023-03-06 13:47:01,876][1875656] Updated weights for policy 0, policy_version 1040 (0.0005) [2023-03-06 13:47:02,631][1875656] Updated weights for policy 0, policy_version 1050 (0.0006) [2023-03-06 13:47:03,397][1875656] Updated weights for policy 0, policy_version 1060 (0.0006) [2023-03-06 13:47:04,165][1875656] Updated weights for policy 0, policy_version 1070 (0.0007) [2023-03-06 13:47:04,925][1875656] Updated weights for policy 0, policy_version 1080 (0.0006) [2023-03-06 13:47:05,412][1875328] Fps is (10 sec: 13414.4, 60 sec: 13397.3, 300 sec: 13083.1). Total num frames: 1112064. Throughput: 0: 13401.1. Samples: 1108420. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2023-03-06 13:47:05,413][1875328] Avg episode reward: [(0, '1143.160')] [2023-03-06 13:47:05,413][1875604] Saving new best policy, reward=1143.160! 
[2023-03-06 13:47:05,696][1875656] Updated weights for policy 0, policy_version 1090 (0.0006) [2023-03-06 13:47:06,458][1875656] Updated weights for policy 0, policy_version 1100 (0.0005) [2023-03-06 13:47:07,214][1875656] Updated weights for policy 0, policy_version 1110 (0.0006) [2023-03-06 13:47:07,972][1875656] Updated weights for policy 0, policy_version 1120 (0.0006) [2023-03-06 13:47:08,751][1875656] Updated weights for policy 0, policy_version 1130 (0.0007) [2023-03-06 13:47:09,513][1875656] Updated weights for policy 0, policy_version 1140 (0.0005) [2023-03-06 13:47:10,273][1875656] Updated weights for policy 0, policy_version 1150 (0.0006) [2023-03-06 13:47:10,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13380.3, 300 sec: 13095.9). Total num frames: 1178624. Throughput: 0: 13410.9. Samples: 1148581. Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) [2023-03-06 13:47:10,412][1875328] Avg episode reward: [(0, '1147.620')] [2023-03-06 13:47:10,430][1875604] Saving new best policy, reward=1147.620! [2023-03-06 13:47:11,067][1875656] Updated weights for policy 0, policy_version 1160 (0.0007) [2023-03-06 13:47:11,833][1875656] Updated weights for policy 0, policy_version 1170 (0.0006) [2023-03-06 13:47:12,580][1875656] Updated weights for policy 0, policy_version 1180 (0.0005) [2023-03-06 13:47:13,347][1875656] Updated weights for policy 0, policy_version 1190 (0.0006) [2023-03-06 13:47:14,108][1875656] Updated weights for policy 0, policy_version 1200 (0.0007) [2023-03-06 13:47:14,860][1875656] Updated weights for policy 0, policy_version 1210 (0.0005) [2023-03-06 13:47:15,412][1875328] Fps is (10 sec: 13414.5, 60 sec: 13397.3, 300 sec: 13118.0). Total num frames: 1246208. Throughput: 0: 13397.3. Samples: 1228817. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:47:15,412][1875328] Avg episode reward: [(0, '1129.926')] [2023-03-06 13:47:15,635][1875656] Updated weights for policy 0, policy_version 1220 (0.0006) [2023-03-06 13:47:16,368][1875656] Updated weights for policy 0, policy_version 1230 (0.0006) [2023-03-06 13:47:17,130][1875656] Updated weights for policy 0, policy_version 1240 (0.0006) [2023-03-06 13:47:17,898][1875656] Updated weights for policy 0, policy_version 1250 (0.0006) [2023-03-06 13:47:18,672][1875656] Updated weights for policy 0, policy_version 1260 (0.0006) [2023-03-06 13:47:19,421][1875656] Updated weights for policy 0, policy_version 1270 (0.0006) [2023-03-06 13:47:20,205][1875656] Updated weights for policy 0, policy_version 1280 (0.0005) [2023-03-06 13:47:20,412][1875328] Fps is (10 sec: 13516.8, 60 sec: 13414.4, 300 sec: 13137.9). Total num frames: 1313792. Throughput: 0: 13407.8. Samples: 1309516. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:47:20,413][1875328] Avg episode reward: [(0, '1172.515')] [2023-03-06 13:47:20,416][1875604] Saving new best policy, reward=1172.515! [2023-03-06 13:47:20,954][1875656] Updated weights for policy 0, policy_version 1290 (0.0006) [2023-03-06 13:47:21,715][1875656] Updated weights for policy 0, policy_version 1300 (0.0006) [2023-03-06 13:47:22,488][1875656] Updated weights for policy 0, policy_version 1310 (0.0006) [2023-03-06 13:47:23,247][1875656] Updated weights for policy 0, policy_version 1320 (0.0006) [2023-03-06 13:47:24,004][1875656] Updated weights for policy 0, policy_version 1330 (0.0006) [2023-03-06 13:47:24,767][1875656] Updated weights for policy 0, policy_version 1340 (0.0007) [2023-03-06 13:47:25,412][1875328] Fps is (10 sec: 13414.3, 60 sec: 13414.4, 300 sec: 13146.2). Total num frames: 1380352. Throughput: 0: 13413.3. Samples: 1349717. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:47:25,413][1875328] Avg episode reward: [(0, '1084.860')] [2023-03-06 13:47:25,539][1875656] Updated weights for policy 0, policy_version 1350 (0.0007) [2023-03-06 13:47:26,304][1875656] Updated weights for policy 0, policy_version 1360 (0.0006) [2023-03-06 13:47:27,065][1875656] Updated weights for policy 0, policy_version 1370 (0.0006) [2023-03-06 13:47:27,828][1875656] Updated weights for policy 0, policy_version 1380 (0.0006) [2023-03-06 13:47:28,604][1875656] Updated weights for policy 0, policy_version 1390 (0.0006) [2023-03-06 13:47:29,357][1875656] Updated weights for policy 0, policy_version 1400 (0.0006) [2023-03-06 13:47:30,105][1875656] Updated weights for policy 0, policy_version 1410 (0.0006) [2023-03-06 13:47:30,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13397.3, 300 sec: 13153.8). Total num frames: 1446912. Throughput: 0: 13408.4. Samples: 1430202. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:47:30,412][1875328] Avg episode reward: [(0, '1069.382')] [2023-03-06 13:47:30,417][1875604] Saving /home/qgallouedec/gia/data/envs/metaworld/train_dir/button-press-topdown-v2/checkpoint_p0/checkpoint_000001413_1446912.pth... 
[2023-03-06 13:47:30,909][1875656] Updated weights for policy 0, policy_version 1420 (0.0006) [2023-03-06 13:47:31,418][1875604] KL-divergence is very high: 144.4700 [2023-03-06 13:47:31,507][1875604] KL-divergence is very high: 541.9474 [2023-03-06 13:47:31,660][1875604] KL-divergence is very high: 579.9358 [2023-03-06 13:47:31,668][1875656] Updated weights for policy 0, policy_version 1430 (0.0005) [2023-03-06 13:47:32,440][1875656] Updated weights for policy 0, policy_version 1440 (0.0008) [2023-03-06 13:47:33,207][1875656] Updated weights for policy 0, policy_version 1450 (0.0005) [2023-03-06 13:47:33,967][1875656] Updated weights for policy 0, policy_version 1460 (0.0007) [2023-03-06 13:47:34,732][1875656] Updated weights for policy 0, policy_version 1470 (0.0006) [2023-03-06 13:47:35,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13397.4, 300 sec: 13160.6). Total num frames: 1513472. Throughput: 0: 13394.6. Samples: 1510182. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:47:35,412][1875328] Avg episode reward: [(0, '1133.150')] [2023-03-06 13:47:35,501][1875656] Updated weights for policy 0, policy_version 1480 (0.0007) [2023-03-06 13:47:35,563][1875604] KL-divergence is very high: 272.0869 [2023-03-06 13:47:36,255][1875656] Updated weights for policy 0, policy_version 1490 (0.0006) [2023-03-06 13:47:37,033][1875656] Updated weights for policy 0, policy_version 1500 (0.0006) [2023-03-06 13:47:37,805][1875656] Updated weights for policy 0, policy_version 1510 (0.0006) [2023-03-06 13:47:38,319][1875604] KL-divergence is very high: 342.8143 [2023-03-06 13:47:38,565][1875656] Updated weights for policy 0, policy_version 1520 (0.0006) [2023-03-06 13:47:39,337][1875656] Updated weights for policy 0, policy_version 1530 (0.0005) [2023-03-06 13:47:40,090][1875656] Updated weights for policy 0, policy_version 1540 (0.0007) [2023-03-06 13:47:40,412][1875328] Fps is (10 sec: 13414.3, 60 sec: 13414.4, 300 sec: 13175.5). Total num frames: 1581056. 
Throughput: 0: 13393.9. Samples: 1550394. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:47:40,413][1875328] Avg episode reward: [(0, '1154.947')] [2023-03-06 13:47:40,852][1875656] Updated weights for policy 0, policy_version 1550 (0.0006) [2023-03-06 13:47:41,638][1875656] Updated weights for policy 0, policy_version 1560 (0.0007) [2023-03-06 13:47:42,419][1875656] Updated weights for policy 0, policy_version 1570 (0.0006) [2023-03-06 13:47:43,173][1875656] Updated weights for policy 0, policy_version 1580 (0.0006) [2023-03-06 13:47:43,924][1875656] Updated weights for policy 0, policy_version 1590 (0.0006) [2023-03-06 13:47:44,711][1875656] Updated weights for policy 0, policy_version 1600 (0.0005) [2023-03-06 13:47:45,412][1875328] Fps is (10 sec: 13414.3, 60 sec: 13397.3, 300 sec: 13180.9). Total num frames: 1647616. Throughput: 0: 13389.3. Samples: 1630373. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:47:45,413][1875328] Avg episode reward: [(0, '1173.399')] [2023-03-06 13:47:45,413][1875604] Saving new best policy, reward=1173.399! [2023-03-06 13:47:45,469][1875656] Updated weights for policy 0, policy_version 1610 (0.0006) [2023-03-06 13:47:46,241][1875656] Updated weights for policy 0, policy_version 1620 (0.0006) [2023-03-06 13:47:47,012][1875656] Updated weights for policy 0, policy_version 1630 (0.0006) [2023-03-06 13:47:47,776][1875656] Updated weights for policy 0, policy_version 1640 (0.0005) [2023-03-06 13:47:48,526][1875656] Updated weights for policy 0, policy_version 1650 (0.0006) [2023-03-06 13:47:49,285][1875656] Updated weights for policy 0, policy_version 1660 (0.0005) [2023-03-06 13:47:50,040][1875656] Updated weights for policy 0, policy_version 1670 (0.0007) [2023-03-06 13:47:50,412][1875328] Fps is (10 sec: 13414.5, 60 sec: 13397.3, 300 sec: 13193.9). Total num frames: 1715200. Throughput: 0: 13390.1. Samples: 1710976. 
Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) [2023-03-06 13:47:50,413][1875328] Avg episode reward: [(0, '1168.747')] [2023-03-06 13:47:50,801][1875656] Updated weights for policy 0, policy_version 1680 (0.0007) [2023-03-06 13:47:51,579][1875656] Updated weights for policy 0, policy_version 1690 (0.0006) [2023-03-06 13:47:52,326][1875656] Updated weights for policy 0, policy_version 1700 (0.0005) [2023-03-06 13:47:53,099][1875656] Updated weights for policy 0, policy_version 1710 (0.0005) [2023-03-06 13:47:53,876][1875656] Updated weights for policy 0, policy_version 1720 (0.0006) [2023-03-06 13:47:54,640][1875656] Updated weights for policy 0, policy_version 1730 (0.0006) [2023-03-06 13:47:55,396][1875656] Updated weights for policy 0, policy_version 1740 (0.0006) [2023-03-06 13:47:55,412][1875328] Fps is (10 sec: 13414.4, 60 sec: 13397.3, 300 sec: 13198.2). Total num frames: 1781760. Throughput: 0: 13392.1. Samples: 1751227. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:47:55,413][1875328] Avg episode reward: [(0, '1112.132')] [2023-03-06 13:47:56,171][1875656] Updated weights for policy 0, policy_version 1750 (0.0007) [2023-03-06 13:47:56,926][1875656] Updated weights for policy 0, policy_version 1760 (0.0006) [2023-03-06 13:47:57,667][1875656] Updated weights for policy 0, policy_version 1770 (0.0006) [2023-03-06 13:47:58,435][1875656] Updated weights for policy 0, policy_version 1780 (0.0007) [2023-03-06 13:47:59,205][1875656] Updated weights for policy 0, policy_version 1790 (0.0006) [2023-03-06 13:47:59,971][1875656] Updated weights for policy 0, policy_version 1800 (0.0006) [2023-03-06 13:48:00,412][1875328] Fps is (10 sec: 13312.1, 60 sec: 13380.3, 300 sec: 13202.3). Total num frames: 1848320. Throughput: 0: 13394.5. Samples: 1831569. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:48:00,412][1875328] Avg episode reward: [(0, '1152.745')] [2023-03-06 13:48:00,761][1875656] Updated weights for policy 0, policy_version 1810 (0.0006) [2023-03-06 13:48:01,508][1875656] Updated weights for policy 0, policy_version 1820 (0.0006) [2023-03-06 13:48:02,265][1875656] Updated weights for policy 0, policy_version 1830 (0.0006) [2023-03-06 13:48:03,056][1875656] Updated weights for policy 0, policy_version 1840 (0.0006) [2023-03-06 13:48:03,815][1875656] Updated weights for policy 0, policy_version 1850 (0.0005) [2023-03-06 13:48:04,585][1875656] Updated weights for policy 0, policy_version 1860 (0.0006) [2023-03-06 13:48:05,355][1875656] Updated weights for policy 0, policy_version 1870 (0.0006) [2023-03-06 13:48:05,412][1875328] Fps is (10 sec: 13312.2, 60 sec: 13380.3, 300 sec: 13206.1). Total num frames: 1914880. Throughput: 0: 13382.2. Samples: 1911715. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:48:05,412][1875328] Avg episode reward: [(0, '1192.677')] [2023-03-06 13:48:05,416][1875604] Saving new best policy, reward=1192.677! [2023-03-06 13:48:06,115][1875656] Updated weights for policy 0, policy_version 1880 (0.0006) [2023-03-06 13:48:06,879][1875656] Updated weights for policy 0, policy_version 1890 (0.0007) [2023-03-06 13:48:07,642][1875656] Updated weights for policy 0, policy_version 1900 (0.0005) [2023-03-06 13:48:08,410][1875656] Updated weights for policy 0, policy_version 1910 (0.0006) [2023-03-06 13:48:09,173][1875656] Updated weights for policy 0, policy_version 1920 (0.0005) [2023-03-06 13:48:09,940][1875656] Updated weights for policy 0, policy_version 1930 (0.0006) [2023-03-06 13:48:10,412][1875328] Fps is (10 sec: 13414.2, 60 sec: 13397.3, 300 sec: 13216.4). Total num frames: 1982464. Throughput: 0: 13377.4. Samples: 1951699. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:48:10,413][1875328] Avg episode reward: [(0, '1159.075')] [2023-03-06 13:48:10,702][1875656] Updated weights for policy 0, policy_version 1940 (0.0006) [2023-03-06 13:48:11,462][1875656] Updated weights for policy 0, policy_version 1950 (0.0006) [2023-03-06 13:48:12,253][1875656] Updated weights for policy 0, policy_version 1960 (0.0006) [2023-03-06 13:48:13,004][1875656] Updated weights for policy 0, policy_version 1970 (0.0006) [2023-03-06 13:48:13,768][1875656] Updated weights for policy 0, policy_version 1980 (0.0005) [2023-03-06 13:48:14,550][1875656] Updated weights for policy 0, policy_version 1990 (0.0006) [2023-03-06 13:48:15,330][1875656] Updated weights for policy 0, policy_version 2000 (0.0006) [2023-03-06 13:48:15,412][1875328] Fps is (10 sec: 13414.5, 60 sec: 13380.3, 300 sec: 13219.5). Total num frames: 2049024. Throughput: 0: 13370.7. Samples: 2031880. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:48:15,412][1875328] Avg episode reward: [(0, '1134.932')] [2023-03-06 13:48:16,101][1875656] Updated weights for policy 0, policy_version 2010 (0.0006) [2023-03-06 13:48:16,870][1875656] Updated weights for policy 0, policy_version 2020 (0.0006) [2023-03-06 13:48:17,639][1875656] Updated weights for policy 0, policy_version 2030 (0.0006) [2023-03-06 13:48:18,403][1875656] Updated weights for policy 0, policy_version 2040 (0.0005) [2023-03-06 13:48:19,182][1875656] Updated weights for policy 0, policy_version 2050 (0.0005) [2023-03-06 13:48:19,949][1875656] Updated weights for policy 0, policy_version 2060 (0.0006) [2023-03-06 13:48:20,412][1875328] Fps is (10 sec: 13209.6, 60 sec: 13346.1, 300 sec: 13216.0). Total num frames: 2114560. Throughput: 0: 13361.1. Samples: 2111433. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:48:20,413][1875328] Avg episode reward: [(0, '1110.894')] [2023-03-06 13:48:20,726][1875656] Updated weights for policy 0, policy_version 2070 (0.0005) [2023-03-06 13:48:21,488][1875656] Updated weights for policy 0, policy_version 2080 (0.0006) [2023-03-06 13:48:22,244][1875656] Updated weights for policy 0, policy_version 2090 (0.0006) [2023-03-06 13:48:23,020][1875656] Updated weights for policy 0, policy_version 2100 (0.0007) [2023-03-06 13:48:23,763][1875656] Updated weights for policy 0, policy_version 2110 (0.0006) [2023-03-06 13:48:24,541][1875656] Updated weights for policy 0, policy_version 2120 (0.0006) [2023-03-06 13:48:25,315][1875656] Updated weights for policy 0, policy_version 2130 (0.0005) [2023-03-06 13:48:25,412][1875328] Fps is (10 sec: 13209.5, 60 sec: 13346.1, 300 sec: 13218.9). Total num frames: 2181120. Throughput: 0: 13356.0. Samples: 2151413. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:48:25,413][1875328] Avg episode reward: [(0, '1016.088')] [2023-03-06 13:48:26,096][1875656] Updated weights for policy 0, policy_version 2140 (0.0008) [2023-03-06 13:48:26,856][1875656] Updated weights for policy 0, policy_version 2150 (0.0006) [2023-03-06 13:48:27,633][1875656] Updated weights for policy 0, policy_version 2160 (0.0006) [2023-03-06 13:48:28,424][1875656] Updated weights for policy 0, policy_version 2170 (0.0006) [2023-03-06 13:48:29,176][1875656] Updated weights for policy 0, policy_version 2180 (0.0005) [2023-03-06 13:48:29,957][1875656] Updated weights for policy 0, policy_version 2190 (0.0005) [2023-03-06 13:48:30,412][1875328] Fps is (10 sec: 13414.7, 60 sec: 13363.2, 300 sec: 13227.7). Total num frames: 2248704. Throughput: 0: 13352.5. Samples: 2231232. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:48:30,412][1875328] Avg episode reward: [(0, '1009.975')] [2023-03-06 13:48:30,708][1875656] Updated weights for policy 0, policy_version 2200 (0.0006) [2023-03-06 13:48:31,497][1875656] Updated weights for policy 0, policy_version 2210 (0.0005) [2023-03-06 13:48:32,284][1875656] Updated weights for policy 0, policy_version 2220 (0.0007) [2023-03-06 13:48:33,036][1875656] Updated weights for policy 0, policy_version 2230 (0.0006) [2023-03-06 13:48:33,809][1875656] Updated weights for policy 0, policy_version 2240 (0.0005) [2023-03-06 13:48:34,590][1875656] Updated weights for policy 0, policy_version 2250 (0.0006) [2023-03-06 13:48:35,350][1875656] Updated weights for policy 0, policy_version 2260 (0.0005) [2023-03-06 13:48:35,412][1875328] Fps is (10 sec: 13414.4, 60 sec: 13363.2, 300 sec: 13230.1). Total num frames: 2315264. Throughput: 0: 13332.6. Samples: 2310942. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:48:35,413][1875328] Avg episode reward: [(0, '1085.225')] [2023-03-06 13:48:36,116][1875656] Updated weights for policy 0, policy_version 2270 (0.0006) [2023-03-06 13:48:36,896][1875656] Updated weights for policy 0, policy_version 2280 (0.0005) [2023-03-06 13:48:37,640][1875656] Updated weights for policy 0, policy_version 2290 (0.0006) [2023-03-06 13:48:38,433][1875656] Updated weights for policy 0, policy_version 2300 (0.0005) [2023-03-06 13:48:39,200][1875656] Updated weights for policy 0, policy_version 2310 (0.0006) [2023-03-06 13:48:39,976][1875656] Updated weights for policy 0, policy_version 2320 (0.0005) [2023-03-06 13:48:40,412][1875328] Fps is (10 sec: 13209.5, 60 sec: 13329.1, 300 sec: 13226.7). Total num frames: 2380800. Throughput: 0: 13329.3. Samples: 2351043. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:48:40,423][1875328] Avg episode reward: [(0, '1042.952')] [2023-03-06 13:48:40,753][1875656] Updated weights for policy 0, policy_version 2330 (0.0006) [2023-03-06 13:48:41,535][1875656] Updated weights for policy 0, policy_version 2340 (0.0007) [2023-03-06 13:48:42,316][1875656] Updated weights for policy 0, policy_version 2350 (0.0006) [2023-03-06 13:48:43,066][1875656] Updated weights for policy 0, policy_version 2360 (0.0007) [2023-03-06 13:48:43,851][1875656] Updated weights for policy 0, policy_version 2370 (0.0005) [2023-03-06 13:48:44,617][1875656] Updated weights for policy 0, policy_version 2380 (0.0005) [2023-03-06 13:48:45,372][1875656] Updated weights for policy 0, policy_version 2390 (0.0006) [2023-03-06 13:48:45,412][1875328] Fps is (10 sec: 13209.5, 60 sec: 13329.1, 300 sec: 13229.0). Total num frames: 2447360. Throughput: 0: 13304.8. Samples: 2430289. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:48:45,427][1875328] Avg episode reward: [(0, '998.634')] [2023-03-06 13:48:46,153][1875656] Updated weights for policy 0, policy_version 2400 (0.0005) [2023-03-06 13:48:46,916][1875656] Updated weights for policy 0, policy_version 2410 (0.0005) [2023-03-06 13:48:47,690][1875656] Updated weights for policy 0, policy_version 2420 (0.0005) [2023-03-06 13:48:48,459][1875656] Updated weights for policy 0, policy_version 2430 (0.0006) [2023-03-06 13:48:49,226][1875656] Updated weights for policy 0, policy_version 2440 (0.0006) [2023-03-06 13:48:49,984][1875656] Updated weights for policy 0, policy_version 2450 (0.0006) [2023-03-06 13:48:50,412][1875328] Fps is (10 sec: 13312.1, 60 sec: 13312.0, 300 sec: 13231.2). Total num frames: 2513920. Throughput: 0: 13305.1. Samples: 2510445. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:48:50,412][1875328] Avg episode reward: [(0, '973.138')] [2023-03-06 13:48:50,762][1875656] Updated weights for policy 0, policy_version 2460 (0.0007) [2023-03-06 13:48:51,497][1875656] Updated weights for policy 0, policy_version 2470 (0.0006) [2023-03-06 13:48:52,280][1875656] Updated weights for policy 0, policy_version 2480 (0.0005) [2023-03-06 13:48:53,045][1875656] Updated weights for policy 0, policy_version 2490 (0.0006) [2023-03-06 13:48:53,816][1875656] Updated weights for policy 0, policy_version 2500 (0.0006) [2023-03-06 13:48:54,581][1875656] Updated weights for policy 0, policy_version 2510 (0.0006) [2023-03-06 13:48:55,351][1875656] Updated weights for policy 0, policy_version 2520 (0.0008) [2023-03-06 13:48:55,412][1875328] Fps is (10 sec: 13312.3, 60 sec: 13312.1, 300 sec: 13233.3). Total num frames: 2580480. Throughput: 0: 13308.2. Samples: 2550565. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:48:55,412][1875328] Avg episode reward: [(0, '1015.841')] [2023-03-06 13:48:56,115][1875656] Updated weights for policy 0, policy_version 2530 (0.0007) [2023-03-06 13:48:56,883][1875656] Updated weights for policy 0, policy_version 2540 (0.0005) [2023-03-06 13:48:57,649][1875656] Updated weights for policy 0, policy_version 2550 (0.0006) [2023-03-06 13:48:58,434][1875656] Updated weights for policy 0, policy_version 2560 (0.0006) [2023-03-06 13:48:59,185][1875656] Updated weights for policy 0, policy_version 2570 (0.0006) [2023-03-06 13:48:59,952][1875656] Updated weights for policy 0, policy_version 2580 (0.0006) [2023-03-06 13:49:00,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13312.0, 300 sec: 13235.2). Total num frames: 2647040. Throughput: 0: 13301.7. Samples: 2630458. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:49:00,412][1875328] Avg episode reward: [(0, '1009.998')] [2023-03-06 13:49:00,742][1875656] Updated weights for policy 0, policy_version 2590 (0.0006) [2023-03-06 13:49:01,510][1875656] Updated weights for policy 0, policy_version 2600 (0.0005) [2023-03-06 13:49:02,278][1875656] Updated weights for policy 0, policy_version 2610 (0.0006) [2023-03-06 13:49:03,044][1875656] Updated weights for policy 0, policy_version 2620 (0.0006) [2023-03-06 13:49:03,812][1875656] Updated weights for policy 0, policy_version 2630 (0.0006) [2023-03-06 13:49:04,573][1875656] Updated weights for policy 0, policy_version 2640 (0.0005) [2023-03-06 13:49:05,332][1875656] Updated weights for policy 0, policy_version 2650 (0.0007) [2023-03-06 13:49:05,412][1875328] Fps is (10 sec: 13414.3, 60 sec: 13329.1, 300 sec: 13242.1). Total num frames: 2714624. Throughput: 0: 13311.2. Samples: 2710436. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:49:05,412][1875328] Avg episode reward: [(0, '1133.452')] [2023-03-06 13:49:06,110][1875656] Updated weights for policy 0, policy_version 2660 (0.0006) [2023-03-06 13:49:06,864][1875656] Updated weights for policy 0, policy_version 2670 (0.0005) [2023-03-06 13:49:07,637][1875656] Updated weights for policy 0, policy_version 2680 (0.0006) [2023-03-06 13:49:08,400][1875656] Updated weights for policy 0, policy_version 2690 (0.0006) [2023-03-06 13:49:09,170][1875656] Updated weights for policy 0, policy_version 2700 (0.0006) [2023-03-06 13:49:09,942][1875656] Updated weights for policy 0, policy_version 2710 (0.0007) [2023-03-06 13:49:10,412][1875328] Fps is (10 sec: 13414.3, 60 sec: 13312.0, 300 sec: 13243.7). Total num frames: 2781184. Throughput: 0: 13313.7. Samples: 2750528. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:49:10,412][1875328] Avg episode reward: [(0, '1113.419')] [2023-03-06 13:49:10,702][1875656] Updated weights for policy 0, policy_version 2720 (0.0006) [2023-03-06 13:49:11,472][1875656] Updated weights for policy 0, policy_version 2730 (0.0006) [2023-03-06 13:49:12,244][1875656] Updated weights for policy 0, policy_version 2740 (0.0006) [2023-03-06 13:49:13,007][1875656] Updated weights for policy 0, policy_version 2750 (0.0006) [2023-03-06 13:49:13,772][1875656] Updated weights for policy 0, policy_version 2760 (0.0007) [2023-03-06 13:49:14,528][1875656] Updated weights for policy 0, policy_version 2770 (0.0006) [2023-03-06 13:49:15,300][1875656] Updated weights for policy 0, policy_version 2780 (0.0006) [2023-03-06 13:49:15,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13312.0, 300 sec: 13245.3). Total num frames: 2847744. Throughput: 0: 13322.1. Samples: 2830727. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:49:15,412][1875328] Avg episode reward: [(0, '1238.441')] [2023-03-06 13:49:15,413][1875604] Saving new best policy, reward=1238.441! [2023-03-06 13:49:16,057][1875656] Updated weights for policy 0, policy_version 2790 (0.0005) [2023-03-06 13:49:16,828][1875656] Updated weights for policy 0, policy_version 2800 (0.0006) [2023-03-06 13:49:17,610][1875656] Updated weights for policy 0, policy_version 2810 (0.0006) [2023-03-06 13:49:18,371][1875656] Updated weights for policy 0, policy_version 2820 (0.0006) [2023-03-06 13:49:19,141][1875656] Updated weights for policy 0, policy_version 2830 (0.0007) [2023-03-06 13:49:19,919][1875656] Updated weights for policy 0, policy_version 2840 (0.0007) [2023-03-06 13:49:20,412][1875328] Fps is (10 sec: 13311.8, 60 sec: 13329.1, 300 sec: 13246.8). Total num frames: 2914304. Throughput: 0: 13327.7. Samples: 2910688. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:49:20,413][1875328] Avg episode reward: [(0, '1192.236')] [2023-03-06 13:49:20,672][1875656] Updated weights for policy 0, policy_version 2850 (0.0006) [2023-03-06 13:49:21,431][1875656] Updated weights for policy 0, policy_version 2860 (0.0006) [2023-03-06 13:49:22,185][1875656] Updated weights for policy 0, policy_version 2870 (0.0005) [2023-03-06 13:49:22,964][1875656] Updated weights for policy 0, policy_version 2880 (0.0005) [2023-03-06 13:49:23,718][1875656] Updated weights for policy 0, policy_version 2890 (0.0006) [2023-03-06 13:49:24,501][1875656] Updated weights for policy 0, policy_version 2900 (0.0006) [2023-03-06 13:49:25,262][1875656] Updated weights for policy 0, policy_version 2910 (0.0005) [2023-03-06 13:49:25,412][1875328] Fps is (10 sec: 13414.3, 60 sec: 13346.1, 300 sec: 13252.8). Total num frames: 2981888. Throughput: 0: 13332.5. Samples: 2951008. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:49:25,413][1875328] Avg episode reward: [(0, '1197.595')] [2023-03-06 13:49:26,019][1875656] Updated weights for policy 0, policy_version 2920 (0.0006) [2023-03-06 13:49:26,791][1875656] Updated weights for policy 0, policy_version 2930 (0.0006) [2023-03-06 13:49:27,565][1875656] Updated weights for policy 0, policy_version 2940 (0.0006) [2023-03-06 13:49:28,337][1875656] Updated weights for policy 0, policy_version 2950 (0.0006) [2023-03-06 13:49:29,106][1875656] Updated weights for policy 0, policy_version 2960 (0.0006) [2023-03-06 13:49:29,864][1875656] Updated weights for policy 0, policy_version 2970 (0.0006) [2023-03-06 13:49:30,412][1875328] Fps is (10 sec: 13414.5, 60 sec: 13329.0, 300 sec: 13254.1). Total num frames: 3048448. Throughput: 0: 13352.5. Samples: 3031151. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:49:30,412][1875328] Avg episode reward: [(0, '1135.777')] [2023-03-06 13:49:30,418][1875604] Saving /home/qgallouedec/gia/data/envs/metaworld/train_dir/button-press-topdown-v2/checkpoint_p0/checkpoint_000002977_3048448.pth... [2023-03-06 13:49:30,624][1875656] Updated weights for policy 0, policy_version 2980 (0.0006) [2023-03-06 13:49:31,375][1875656] Updated weights for policy 0, policy_version 2990 (0.0005) [2023-03-06 13:49:32,162][1875656] Updated weights for policy 0, policy_version 3000 (0.0006) [2023-03-06 13:49:32,929][1875656] Updated weights for policy 0, policy_version 3010 (0.0005) [2023-03-06 13:49:33,694][1875656] Updated weights for policy 0, policy_version 3020 (0.0006) [2023-03-06 13:49:34,474][1875656] Updated weights for policy 0, policy_version 3030 (0.0005) [2023-03-06 13:49:35,240][1875656] Updated weights for policy 0, policy_version 3040 (0.0006) [2023-03-06 13:49:35,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13329.1, 300 sec: 13255.4). Total num frames: 3115008. Throughput: 0: 13351.1. Samples: 3111247. Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) [2023-03-06 13:49:35,413][1875328] Avg episode reward: [(0, '1161.626')] [2023-03-06 13:49:35,999][1875656] Updated weights for policy 0, policy_version 3050 (0.0006) [2023-03-06 13:49:36,766][1875656] Updated weights for policy 0, policy_version 3060 (0.0006) [2023-03-06 13:49:37,540][1875656] Updated weights for policy 0, policy_version 3070 (0.0005) [2023-03-06 13:49:38,325][1875656] Updated weights for policy 0, policy_version 3080 (0.0006) [2023-03-06 13:49:39,066][1875656] Updated weights for policy 0, policy_version 3090 (0.0006) [2023-03-06 13:49:39,852][1875656] Updated weights for policy 0, policy_version 3100 (0.0006) [2023-03-06 13:49:40,412][1875328] Fps is (10 sec: 13312.2, 60 sec: 13346.2, 300 sec: 13256.5). Total num frames: 3181568. Throughput: 0: 13345.0. Samples: 3151092. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:49:40,412][1875328] Avg episode reward: [(0, '1140.047')] [2023-03-06 13:49:40,608][1875656] Updated weights for policy 0, policy_version 3110 (0.0006) [2023-03-06 13:49:41,364][1875656] Updated weights for policy 0, policy_version 3120 (0.0005) [2023-03-06 13:49:42,140][1875656] Updated weights for policy 0, policy_version 3130 (0.0006) [2023-03-06 13:49:42,928][1875656] Updated weights for policy 0, policy_version 3140 (0.0005) [2023-03-06 13:49:43,694][1875656] Updated weights for policy 0, policy_version 3150 (0.0006) [2023-03-06 13:49:44,459][1875656] Updated weights for policy 0, policy_version 3160 (0.0006) [2023-03-06 13:49:45,245][1875656] Updated weights for policy 0, policy_version 3170 (0.0006) [2023-03-06 13:49:45,412][1875328] Fps is (10 sec: 13312.2, 60 sec: 13346.2, 300 sec: 13257.7). Total num frames: 3248128. Throughput: 0: 13348.3. Samples: 3231133. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:49:45,412][1875328] Avg episode reward: [(0, '1163.707')] [2023-03-06 13:49:45,994][1875656] Updated weights for policy 0, policy_version 3180 (0.0006) [2023-03-06 13:49:46,755][1875656] Updated weights for policy 0, policy_version 3190 (0.0005) [2023-03-06 13:49:47,524][1875656] Updated weights for policy 0, policy_version 3200 (0.0007) [2023-03-06 13:49:48,272][1875656] Updated weights for policy 0, policy_version 3210 (0.0007) [2023-03-06 13:49:49,024][1875656] Updated weights for policy 0, policy_version 3220 (0.0006) [2023-03-06 13:49:49,811][1875656] Updated weights for policy 0, policy_version 3230 (0.0006) [2023-03-06 13:49:50,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13346.1, 300 sec: 13258.8). Total num frames: 3314688. Throughput: 0: 13355.0. Samples: 3311413. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:49:50,413][1875328] Avg episode reward: [(0, '1228.486')] [2023-03-06 13:49:50,565][1875656] Updated weights for policy 0, policy_version 3240 (0.0006) [2023-03-06 13:49:51,348][1875656] Updated weights for policy 0, policy_version 3250 (0.0007) [2023-03-06 13:49:52,105][1875656] Updated weights for policy 0, policy_version 3260 (0.0006) [2023-03-06 13:49:52,860][1875656] Updated weights for policy 0, policy_version 3270 (0.0006) [2023-03-06 13:49:53,633][1875656] Updated weights for policy 0, policy_version 3280 (0.0006) [2023-03-06 13:49:54,392][1875656] Updated weights for policy 0, policy_version 3290 (0.0005) [2023-03-06 13:49:55,142][1875656] Updated weights for policy 0, policy_version 3300 (0.0005) [2023-03-06 13:49:55,412][1875328] Fps is (10 sec: 13414.3, 60 sec: 13363.2, 300 sec: 13263.8). Total num frames: 3382272. Throughput: 0: 13355.9. Samples: 3351542. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:49:55,413][1875328] Avg episode reward: [(0, '1247.222')] [2023-03-06 13:49:55,413][1875604] Saving new best policy, reward=1247.222! [2023-03-06 13:49:55,912][1875656] Updated weights for policy 0, policy_version 3310 (0.0007) [2023-03-06 13:49:56,686][1875656] Updated weights for policy 0, policy_version 3320 (0.0006) [2023-03-06 13:49:57,455][1875656] Updated weights for policy 0, policy_version 3330 (0.0006) [2023-03-06 13:49:58,218][1875656] Updated weights for policy 0, policy_version 3340 (0.0006) [2023-03-06 13:49:58,947][1875656] Updated weights for policy 0, policy_version 3350 (0.0006) [2023-03-06 13:49:59,741][1875656] Updated weights for policy 0, policy_version 3360 (0.0006) [2023-03-06 13:50:00,412][1875328] Fps is (10 sec: 13414.4, 60 sec: 13363.2, 300 sec: 13264.8). Total num frames: 3448832. Throughput: 0: 13369.4. Samples: 3432351. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:50:00,412][1875328] Avg episode reward: [(0, '1246.908')] [2023-03-06 13:50:00,495][1875656] Updated weights for policy 0, policy_version 3370 (0.0006) [2023-03-06 13:50:01,257][1875656] Updated weights for policy 0, policy_version 3380 (0.0006) [2023-03-06 13:50:02,031][1875656] Updated weights for policy 0, policy_version 3390 (0.0006) [2023-03-06 13:50:02,784][1875656] Updated weights for policy 0, policy_version 3400 (0.0006) [2023-03-06 13:50:03,547][1875656] Updated weights for policy 0, policy_version 3410 (0.0007) [2023-03-06 13:50:04,325][1875656] Updated weights for policy 0, policy_version 3420 (0.0007) [2023-03-06 13:50:05,074][1875656] Updated weights for policy 0, policy_version 3430 (0.0005) [2023-03-06 13:50:05,412][1875328] Fps is (10 sec: 13414.5, 60 sec: 13363.2, 300 sec: 13269.5). Total num frames: 3516416. Throughput: 0: 13377.1. Samples: 3512656. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:50:05,412][1875328] Avg episode reward: [(0, '1246.806')] [2023-03-06 13:50:05,845][1875656] Updated weights for policy 0, policy_version 3440 (0.0006) [2023-03-06 13:50:06,595][1875656] Updated weights for policy 0, policy_version 3450 (0.0007) [2023-03-06 13:50:07,349][1875656] Updated weights for policy 0, policy_version 3460 (0.0006) [2023-03-06 13:50:08,126][1875656] Updated weights for policy 0, policy_version 3470 (0.0006) [2023-03-06 13:50:08,896][1875656] Updated weights for policy 0, policy_version 3480 (0.0006) [2023-03-06 13:50:09,662][1875656] Updated weights for policy 0, policy_version 3490 (0.0006) [2023-03-06 13:50:10,412][1875328] Fps is (10 sec: 13414.3, 60 sec: 13363.2, 300 sec: 13270.3). Total num frames: 3582976. Throughput: 0: 13379.8. Samples: 3553101. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:50:10,413][1875328] Avg episode reward: [(0, '1224.365')] [2023-03-06 13:50:10,439][1875656] Updated weights for policy 0, policy_version 3500 (0.0006) [2023-03-06 13:50:11,193][1875656] Updated weights for policy 0, policy_version 3510 (0.0006) [2023-03-06 13:50:11,954][1875656] Updated weights for policy 0, policy_version 3520 (0.0005) [2023-03-06 13:50:12,724][1875656] Updated weights for policy 0, policy_version 3530 (0.0006) [2023-03-06 13:50:13,495][1875656] Updated weights for policy 0, policy_version 3540 (0.0007) [2023-03-06 13:50:14,262][1875656] Updated weights for policy 0, policy_version 3550 (0.0006) [2023-03-06 13:50:15,008][1875656] Updated weights for policy 0, policy_version 3560 (0.0005) [2023-03-06 13:50:15,412][1875328] Fps is (10 sec: 13414.2, 60 sec: 13380.2, 300 sec: 13274.8). Total num frames: 3650560. Throughput: 0: 13377.4. Samples: 3633133. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:50:15,413][1875328] Avg episode reward: [(0, '1245.104')] [2023-03-06 13:50:15,775][1875656] Updated weights for policy 0, policy_version 3570 (0.0006) [2023-03-06 13:50:16,544][1875656] Updated weights for policy 0, policy_version 3580 (0.0006) [2023-03-06 13:50:17,297][1875656] Updated weights for policy 0, policy_version 3590 (0.0006) [2023-03-06 13:50:18,054][1875656] Updated weights for policy 0, policy_version 3600 (0.0006) [2023-03-06 13:50:18,835][1875656] Updated weights for policy 0, policy_version 3610 (0.0006) [2023-03-06 13:50:19,596][1875656] Updated weights for policy 0, policy_version 3620 (0.0005) [2023-03-06 13:50:20,358][1875656] Updated weights for policy 0, policy_version 3630 (0.0006) [2023-03-06 13:50:20,412][1875328] Fps is (10 sec: 13414.3, 60 sec: 13380.3, 300 sec: 13275.4). Total num frames: 3717120. Throughput: 0: 13386.0. Samples: 3713616. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:50:20,413][1875328] Avg episode reward: [(0, '1275.406')] [2023-03-06 13:50:20,427][1875604] Saving new best policy, reward=1275.406! [2023-03-06 13:50:21,148][1875656] Updated weights for policy 0, policy_version 3640 (0.0007) [2023-03-06 13:50:21,911][1875656] Updated weights for policy 0, policy_version 3650 (0.0005) [2023-03-06 13:50:22,653][1875656] Updated weights for policy 0, policy_version 3660 (0.0006) [2023-03-06 13:50:23,430][1875656] Updated weights for policy 0, policy_version 3670 (0.0006) [2023-03-06 13:50:24,180][1875656] Updated weights for policy 0, policy_version 3680 (0.0006) [2023-03-06 13:50:24,950][1875656] Updated weights for policy 0, policy_version 3690 (0.0005) [2023-03-06 13:50:25,412][1875328] Fps is (10 sec: 13312.1, 60 sec: 13363.2, 300 sec: 13276.1). Total num frames: 3783680. Throughput: 0: 13389.6. Samples: 3753625. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:50:25,413][1875328] Avg episode reward: [(0, '1301.733')] [2023-03-06 13:50:25,416][1875604] Saving new best policy, reward=1301.733! [2023-03-06 13:50:25,725][1875656] Updated weights for policy 0, policy_version 3700 (0.0006) [2023-03-06 13:50:26,497][1875656] Updated weights for policy 0, policy_version 3710 (0.0005) [2023-03-06 13:50:27,243][1875656] Updated weights for policy 0, policy_version 3720 (0.0006) [2023-03-06 13:50:28,024][1875656] Updated weights for policy 0, policy_version 3730 (0.0007) [2023-03-06 13:50:28,790][1875656] Updated weights for policy 0, policy_version 3740 (0.0006) [2023-03-06 13:50:29,592][1875656] Updated weights for policy 0, policy_version 3750 (0.0006) [2023-03-06 13:50:30,357][1875656] Updated weights for policy 0, policy_version 3760 (0.0006) [2023-03-06 13:50:30,412][1875328] Fps is (10 sec: 13312.1, 60 sec: 13363.2, 300 sec: 13276.7). Total num frames: 3850240. Throughput: 0: 13395.4. Samples: 3833926. 
Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) [2023-03-06 13:50:30,413][1875328] Avg episode reward: [(0, '1256.368')] [2023-03-06 13:50:31,124][1875656] Updated weights for policy 0, policy_version 3770 (0.0006) [2023-03-06 13:50:31,917][1875656] Updated weights for policy 0, policy_version 3780 (0.0006) [2023-03-06 13:50:32,682][1875656] Updated weights for policy 0, policy_version 3790 (0.0006) [2023-03-06 13:50:33,459][1875656] Updated weights for policy 0, policy_version 3800 (0.0007) [2023-03-06 13:50:34,217][1875656] Updated weights for policy 0, policy_version 3810 (0.0006) [2023-03-06 13:50:34,973][1875656] Updated weights for policy 0, policy_version 3820 (0.0005) [2023-03-06 13:50:35,412][1875328] Fps is (10 sec: 13312.1, 60 sec: 13363.2, 300 sec: 13277.3). Total num frames: 3916800. Throughput: 0: 13375.5. Samples: 3913308. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:50:35,413][1875328] Avg episode reward: [(0, '1235.747')] [2023-03-06 13:50:35,759][1875656] Updated weights for policy 0, policy_version 3830 (0.0006) [2023-03-06 13:50:36,533][1875656] Updated weights for policy 0, policy_version 3840 (0.0006) [2023-03-06 13:50:37,286][1875656] Updated weights for policy 0, policy_version 3850 (0.0006) [2023-03-06 13:50:38,062][1875656] Updated weights for policy 0, policy_version 3860 (0.0005) [2023-03-06 13:50:38,825][1875656] Updated weights for policy 0, policy_version 3870 (0.0006) [2023-03-06 13:50:39,597][1875656] Updated weights for policy 0, policy_version 3880 (0.0006) [2023-03-06 13:50:40,378][1875656] Updated weights for policy 0, policy_version 3890 (0.0006) [2023-03-06 13:50:40,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13363.2, 300 sec: 13367.5). Total num frames: 3983360. Throughput: 0: 13369.3. Samples: 3953163. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:50:40,413][1875328] Avg episode reward: [(0, '1267.860')] [2023-03-06 13:50:41,154][1875656] Updated weights for policy 0, policy_version 3900 (0.0005) [2023-03-06 13:50:41,914][1875656] Updated weights for policy 0, policy_version 3910 (0.0008) [2023-03-06 13:50:42,699][1875656] Updated weights for policy 0, policy_version 3920 (0.0006) [2023-03-06 13:50:43,462][1875656] Updated weights for policy 0, policy_version 3930 (0.0006) [2023-03-06 13:50:44,225][1875656] Updated weights for policy 0, policy_version 3940 (0.0006) [2023-03-06 13:50:44,999][1875656] Updated weights for policy 0, policy_version 3950 (0.0005) [2023-03-06 13:50:45,412][1875328] Fps is (10 sec: 13311.8, 60 sec: 13363.2, 300 sec: 13367.5). Total num frames: 4049920. Throughput: 0: 13344.2. Samples: 4032839. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:50:45,413][1875328] Avg episode reward: [(0, '1233.508')] [2023-03-06 13:50:45,771][1875656] Updated weights for policy 0, policy_version 3960 (0.0007) [2023-03-06 13:50:46,533][1875656] Updated weights for policy 0, policy_version 3970 (0.0007) [2023-03-06 13:50:47,310][1875656] Updated weights for policy 0, policy_version 3980 (0.0007) [2023-03-06 13:50:48,082][1875656] Updated weights for policy 0, policy_version 3990 (0.0006) [2023-03-06 13:50:48,846][1875656] Updated weights for policy 0, policy_version 4000 (0.0006) [2023-03-06 13:50:49,625][1875656] Updated weights for policy 0, policy_version 4010 (0.0006) [2023-03-06 13:50:50,393][1875656] Updated weights for policy 0, policy_version 4020 (0.0006) [2023-03-06 13:50:50,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13363.2, 300 sec: 13364.1). Total num frames: 4116480. Throughput: 0: 13332.4. Samples: 4112617. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:50:50,413][1875328] Avg episode reward: [(0, '1218.227')] [2023-03-06 13:50:51,150][1875656] Updated weights for policy 0, policy_version 4030 (0.0006) [2023-03-06 13:50:51,913][1875656] Updated weights for policy 0, policy_version 4040 (0.0006) [2023-03-06 13:50:52,689][1875656] Updated weights for policy 0, policy_version 4050 (0.0006) [2023-03-06 13:50:53,458][1875656] Updated weights for policy 0, policy_version 4060 (0.0006) [2023-03-06 13:50:54,249][1875656] Updated weights for policy 0, policy_version 4070 (0.0006) [2023-03-06 13:50:55,025][1875656] Updated weights for policy 0, policy_version 4080 (0.0005) [2023-03-06 13:50:55,412][1875328] Fps is (10 sec: 13311.9, 60 sec: 13346.1, 300 sec: 13360.6). Total num frames: 4183040. Throughput: 0: 13320.2. Samples: 4152511. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:50:55,413][1875328] Avg episode reward: [(0, '1243.856')] [2023-03-06 13:50:55,785][1875656] Updated weights for policy 0, policy_version 4090 (0.0006) [2023-03-06 13:50:56,561][1875656] Updated weights for policy 0, policy_version 4100 (0.0006) [2023-03-06 13:50:57,335][1875656] Updated weights for policy 0, policy_version 4110 (0.0006) [2023-03-06 13:50:58,103][1875656] Updated weights for policy 0, policy_version 4120 (0.0005) [2023-03-06 13:50:58,869][1875656] Updated weights for policy 0, policy_version 4130 (0.0007) [2023-03-06 13:50:59,647][1875656] Updated weights for policy 0, policy_version 4140 (0.0007) [2023-03-06 13:51:00,409][1875656] Updated weights for policy 0, policy_version 4150 (0.0006) [2023-03-06 13:51:00,412][1875328] Fps is (10 sec: 13312.2, 60 sec: 13346.1, 300 sec: 13360.6). Total num frames: 4249600. Throughput: 0: 13312.5. Samples: 4232193. 
Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) [2023-03-06 13:51:00,412][1875328] Avg episode reward: [(0, '1256.569')] [2023-03-06 13:51:01,161][1875656] Updated weights for policy 0, policy_version 4160 (0.0006) [2023-03-06 13:51:01,940][1875656] Updated weights for policy 0, policy_version 4170 (0.0006) [2023-03-06 13:51:02,697][1875656] Updated weights for policy 0, policy_version 4180 (0.0006) [2023-03-06 13:51:03,482][1875656] Updated weights for policy 0, policy_version 4190 (0.0006) [2023-03-06 13:51:04,255][1875656] Updated weights for policy 0, policy_version 4200 (0.0006) [2023-03-06 13:51:05,035][1875656] Updated weights for policy 0, policy_version 4210 (0.0006) [2023-03-06 13:51:05,412][1875328] Fps is (10 sec: 13209.7, 60 sec: 13312.0, 300 sec: 13353.7). Total num frames: 4315136. Throughput: 0: 13293.1. Samples: 4311805. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:51:05,412][1875328] Avg episode reward: [(0, '1302.044')] [2023-03-06 13:51:05,413][1875604] Saving new best policy, reward=1302.044! [2023-03-06 13:51:05,801][1875656] Updated weights for policy 0, policy_version 4220 (0.0005) [2023-03-06 13:51:06,580][1875656] Updated weights for policy 0, policy_version 4230 (0.0006) [2023-03-06 13:51:07,343][1875656] Updated weights for policy 0, policy_version 4240 (0.0006) [2023-03-06 13:51:08,116][1875656] Updated weights for policy 0, policy_version 4250 (0.0006) [2023-03-06 13:51:08,895][1875656] Updated weights for policy 0, policy_version 4260 (0.0006) [2023-03-06 13:51:09,665][1875656] Updated weights for policy 0, policy_version 4270 (0.0007) [2023-03-06 13:51:10,412][1875328] Fps is (10 sec: 13209.3, 60 sec: 13312.0, 300 sec: 13353.6). Total num frames: 4381696. Throughput: 0: 13294.8. Samples: 4351891. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:51:10,413][1875328] Avg episode reward: [(0, '1329.857')] [2023-03-06 13:51:10,433][1875604] Saving new best policy, reward=1329.857! 
[2023-03-06 13:51:10,436][1875656] Updated weights for policy 0, policy_version 4280 (0.0007) [2023-03-06 13:51:11,212][1875656] Updated weights for policy 0, policy_version 4290 (0.0006) [2023-03-06 13:51:11,968][1875656] Updated weights for policy 0, policy_version 4300 (0.0006) [2023-03-06 13:51:12,730][1875656] Updated weights for policy 0, policy_version 4310 (0.0005) [2023-03-06 13:51:13,516][1875656] Updated weights for policy 0, policy_version 4320 (0.0006) [2023-03-06 13:51:14,266][1875656] Updated weights for policy 0, policy_version 4330 (0.0006) [2023-03-06 13:51:15,029][1875656] Updated weights for policy 0, policy_version 4340 (0.0006) [2023-03-06 13:51:15,412][1875328] Fps is (10 sec: 13414.6, 60 sec: 13312.1, 300 sec: 13357.1). Total num frames: 4449280. Throughput: 0: 13282.4. Samples: 4431631. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:51:15,412][1875328] Avg episode reward: [(0, '1310.632')] [2023-03-06 13:51:15,796][1875656] Updated weights for policy 0, policy_version 4350 (0.0005) [2023-03-06 13:51:16,564][1875656] Updated weights for policy 0, policy_version 4360 (0.0006) [2023-03-06 13:51:17,347][1875656] Updated weights for policy 0, policy_version 4370 (0.0006) [2023-03-06 13:51:18,122][1875656] Updated weights for policy 0, policy_version 4380 (0.0006) [2023-03-06 13:51:18,891][1875656] Updated weights for policy 0, policy_version 4390 (0.0008) [2023-03-06 13:51:19,659][1875656] Updated weights for policy 0, policy_version 4400 (0.0005) [2023-03-06 13:51:20,412][1875328] Fps is (10 sec: 13414.7, 60 sec: 13312.0, 300 sec: 13357.1). Total num frames: 4515840. Throughput: 0: 13294.2. Samples: 4511546. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:51:20,412][1875328] Avg episode reward: [(0, '1333.366')] [2023-03-06 13:51:20,414][1875656] Updated weights for policy 0, policy_version 4410 (0.0006) [2023-03-06 13:51:20,417][1875604] Saving new best policy, reward=1333.366! 
[2023-03-06 13:51:21,215][1875656] Updated weights for policy 0, policy_version 4420 (0.0007) [2023-03-06 13:51:21,973][1875656] Updated weights for policy 0, policy_version 4430 (0.0007) [2023-03-06 13:51:22,736][1875656] Updated weights for policy 0, policy_version 4440 (0.0006) [2023-03-06 13:51:23,514][1875656] Updated weights for policy 0, policy_version 4450 (0.0007) [2023-03-06 13:51:24,281][1875656] Updated weights for policy 0, policy_version 4460 (0.0006) [2023-03-06 13:51:25,045][1875656] Updated weights for policy 0, policy_version 4470 (0.0006) [2023-03-06 13:51:25,412][1875328] Fps is (10 sec: 13209.4, 60 sec: 13294.9, 300 sec: 13350.2). Total num frames: 4581376. Throughput: 0: 13292.0. Samples: 4551302. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:51:25,412][1875328] Avg episode reward: [(0, '1355.125')] [2023-03-06 13:51:25,413][1875604] Saving new best policy, reward=1355.125! [2023-03-06 13:51:25,837][1875656] Updated weights for policy 0, policy_version 4480 (0.0006) [2023-03-06 13:51:26,591][1875656] Updated weights for policy 0, policy_version 4490 (0.0006) [2023-03-06 13:51:27,367][1875656] Updated weights for policy 0, policy_version 4500 (0.0006) [2023-03-06 13:51:28,131][1875656] Updated weights for policy 0, policy_version 4510 (0.0006) [2023-03-06 13:51:28,899][1875656] Updated weights for policy 0, policy_version 4520 (0.0006) [2023-03-06 13:51:29,691][1875656] Updated weights for policy 0, policy_version 4530 (0.0005) [2023-03-06 13:51:30,412][1875328] Fps is (10 sec: 13209.5, 60 sec: 13294.9, 300 sec: 13350.2). Total num frames: 4647936. Throughput: 0: 13294.2. Samples: 4631079. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:51:30,413][1875328] Avg episode reward: [(0, '1372.634')] [2023-03-06 13:51:30,419][1875604] Saving /home/qgallouedec/gia/data/envs/metaworld/train_dir/button-press-topdown-v2/checkpoint_p0/checkpoint_000004539_4647936.pth... 
[2023-03-06 13:51:30,450][1875604] Removing /home/qgallouedec/gia/data/envs/metaworld/train_dir/button-press-topdown-v2/checkpoint_p0/checkpoint_000001413_1446912.pth [2023-03-06 13:51:30,453][1875604] Saving new best policy, reward=1372.634! [2023-03-06 13:51:30,507][1875656] Updated weights for policy 0, policy_version 4540 (0.0006) [2023-03-06 13:51:31,214][1875656] Updated weights for policy 0, policy_version 4550 (0.0007) [2023-03-06 13:51:32,002][1875656] Updated weights for policy 0, policy_version 4560 (0.0006) [2023-03-06 13:51:32,777][1875656] Updated weights for policy 0, policy_version 4570 (0.0006) [2023-03-06 13:51:33,554][1875656] Updated weights for policy 0, policy_version 4580 (0.0006) [2023-03-06 13:51:34,320][1875656] Updated weights for policy 0, policy_version 4590 (0.0005) [2023-03-06 13:51:35,096][1875656] Updated weights for policy 0, policy_version 4600 (0.0006) [2023-03-06 13:51:35,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13294.9, 300 sec: 13350.2). Total num frames: 4714496. Throughput: 0: 13288.9. Samples: 4710614. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:51:35,412][1875328] Avg episode reward: [(0, '1365.200')] [2023-03-06 13:51:35,861][1875656] Updated weights for policy 0, policy_version 4610 (0.0005) [2023-03-06 13:51:36,626][1875656] Updated weights for policy 0, policy_version 4620 (0.0007) [2023-03-06 13:51:37,393][1875656] Updated weights for policy 0, policy_version 4630 (0.0006) [2023-03-06 13:51:38,176][1875656] Updated weights for policy 0, policy_version 4640 (0.0006) [2023-03-06 13:51:38,922][1875656] Updated weights for policy 0, policy_version 4650 (0.0006) [2023-03-06 13:51:39,701][1875656] Updated weights for policy 0, policy_version 4660 (0.0006) [2023-03-06 13:51:40,412][1875328] Fps is (10 sec: 13311.9, 60 sec: 13294.9, 300 sec: 13346.7). Total num frames: 4781056. Throughput: 0: 13287.8. Samples: 4750463. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:51:40,413][1875328] Avg episode reward: [(0, '1419.025')] [2023-03-06 13:51:40,419][1875604] Saving new best policy, reward=1419.025! [2023-03-06 13:51:40,476][1875656] Updated weights for policy 0, policy_version 4670 (0.0006) [2023-03-06 13:51:41,224][1875656] Updated weights for policy 0, policy_version 4680 (0.0006) [2023-03-06 13:51:42,005][1875656] Updated weights for policy 0, policy_version 4690 (0.0006) [2023-03-06 13:51:42,771][1875656] Updated weights for policy 0, policy_version 4700 (0.0006) [2023-03-06 13:51:43,546][1875656] Updated weights for policy 0, policy_version 4710 (0.0006) [2023-03-06 13:51:44,315][1875656] Updated weights for policy 0, policy_version 4720 (0.0007) [2023-03-06 13:51:45,085][1875656] Updated weights for policy 0, policy_version 4730 (0.0007) [2023-03-06 13:51:45,412][1875328] Fps is (10 sec: 13311.9, 60 sec: 13294.9, 300 sec: 13343.2). Total num frames: 4847616. Throughput: 0: 13294.9. Samples: 4830465. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:51:45,413][1875328] Avg episode reward: [(0, '1415.643')] [2023-03-06 13:51:45,861][1875656] Updated weights for policy 0, policy_version 4740 (0.0005) [2023-03-06 13:51:46,649][1875656] Updated weights for policy 0, policy_version 4750 (0.0005) [2023-03-06 13:51:47,404][1875656] Updated weights for policy 0, policy_version 4760 (0.0006) [2023-03-06 13:51:48,172][1875656] Updated weights for policy 0, policy_version 4770 (0.0006) [2023-03-06 13:51:48,928][1875656] Updated weights for policy 0, policy_version 4780 (0.0005) [2023-03-06 13:51:49,708][1875656] Updated weights for policy 0, policy_version 4790 (0.0005) [2023-03-06 13:51:50,412][1875328] Fps is (10 sec: 13312.2, 60 sec: 13295.0, 300 sec: 13343.2). Total num frames: 4914176. Throughput: 0: 13300.7. Samples: 4910337. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:51:50,412][1875328] Avg episode reward: [(0, '1351.719')] [2023-03-06 13:51:50,465][1875656] Updated weights for policy 0, policy_version 4800 (0.0006) [2023-03-06 13:51:51,244][1875656] Updated weights for policy 0, policy_version 4810 (0.0005) [2023-03-06 13:51:52,017][1875656] Updated weights for policy 0, policy_version 4820 (0.0005) [2023-03-06 13:51:52,781][1875656] Updated weights for policy 0, policy_version 4830 (0.0006) [2023-03-06 13:51:53,552][1875656] Updated weights for policy 0, policy_version 4840 (0.0006) [2023-03-06 13:51:54,326][1875656] Updated weights for policy 0, policy_version 4850 (0.0006) [2023-03-06 13:51:55,098][1875656] Updated weights for policy 0, policy_version 4860 (0.0006) [2023-03-06 13:51:55,412][1875328] Fps is (10 sec: 13312.1, 60 sec: 13295.0, 300 sec: 13339.8). Total num frames: 4980736. Throughput: 0: 13296.3. Samples: 4950220. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:51:55,412][1875328] Avg episode reward: [(0, '1425.335')] [2023-03-06 13:51:55,413][1875604] Saving new best policy, reward=1425.335! [2023-03-06 13:51:55,872][1875656] Updated weights for policy 0, policy_version 4870 (0.0006) [2023-03-06 13:51:56,614][1875656] Updated weights for policy 0, policy_version 4880 (0.0005) [2023-03-06 13:51:57,392][1875656] Updated weights for policy 0, policy_version 4890 (0.0006) [2023-03-06 13:51:58,150][1875656] Updated weights for policy 0, policy_version 4900 (0.0006) [2023-03-06 13:51:58,918][1875656] Updated weights for policy 0, policy_version 4910 (0.0006) [2023-03-06 13:51:59,689][1875656] Updated weights for policy 0, policy_version 4920 (0.0006) [2023-03-06 13:52:00,412][1875328] Fps is (10 sec: 13311.9, 60 sec: 13294.9, 300 sec: 13339.8). Total num frames: 5047296. Throughput: 0: 13305.9. Samples: 5030402. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 2.0) [2023-03-06 13:52:00,413][1875328] Avg episode reward: [(0, '1452.284')] [2023-03-06 13:52:00,417][1875604] Saving new best policy, reward=1452.284! [2023-03-06 13:52:00,468][1875656] Updated weights for policy 0, policy_version 4930 (0.0006) [2023-03-06 13:52:01,220][1875656] Updated weights for policy 0, policy_version 4940 (0.0007) [2023-03-06 13:52:01,985][1875656] Updated weights for policy 0, policy_version 4950 (0.0007) [2023-03-06 13:52:02,748][1875656] Updated weights for policy 0, policy_version 4960 (0.0005) [2023-03-06 13:52:03,509][1875656] Updated weights for policy 0, policy_version 4970 (0.0005) [2023-03-06 13:52:04,287][1875656] Updated weights for policy 0, policy_version 4980 (0.0006) [2023-03-06 13:52:05,056][1875656] Updated weights for policy 0, policy_version 4990 (0.0006) [2023-03-06 13:52:05,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13312.0, 300 sec: 13339.8). Total num frames: 5113856. Throughput: 0: 13309.1. Samples: 5110455. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:52:05,412][1875328] Avg episode reward: [(0, '1480.337')] [2023-03-06 13:52:05,413][1875604] Saving new best policy, reward=1480.337! [2023-03-06 13:52:05,812][1875656] Updated weights for policy 0, policy_version 5000 (0.0007) [2023-03-06 13:52:06,598][1875656] Updated weights for policy 0, policy_version 5010 (0.0007) [2023-03-06 13:52:07,357][1875656] Updated weights for policy 0, policy_version 5020 (0.0006) [2023-03-06 13:52:08,118][1875656] Updated weights for policy 0, policy_version 5030 (0.0007) [2023-03-06 13:52:08,889][1875656] Updated weights for policy 0, policy_version 5040 (0.0007) [2023-03-06 13:52:09,664][1875656] Updated weights for policy 0, policy_version 5050 (0.0006) [2023-03-06 13:52:10,412][1875328] Fps is (10 sec: 13312.1, 60 sec: 13312.0, 300 sec: 13336.3). Total num frames: 5180416. Throughput: 0: 13313.6. Samples: 5150414. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:52:10,412][1875328] Avg episode reward: [(0, '1462.406')] [2023-03-06 13:52:10,434][1875656] Updated weights for policy 0, policy_version 5060 (0.0006) [2023-03-06 13:52:11,219][1875656] Updated weights for policy 0, policy_version 5070 (0.0005) [2023-03-06 13:52:11,967][1875656] Updated weights for policy 0, policy_version 5080 (0.0006) [2023-03-06 13:52:12,721][1875656] Updated weights for policy 0, policy_version 5090 (0.0006) [2023-03-06 13:52:13,485][1875656] Updated weights for policy 0, policy_version 5100 (0.0006) [2023-03-06 13:52:14,274][1875656] Updated weights for policy 0, policy_version 5110 (0.0006) [2023-03-06 13:52:15,049][1875656] Updated weights for policy 0, policy_version 5120 (0.0006) [2023-03-06 13:52:15,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13294.9, 300 sec: 13332.8). Total num frames: 5246976. Throughput: 0: 13320.0. Samples: 5230479. Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) [2023-03-06 13:52:15,412][1875328] Avg episode reward: [(0, '1534.189')] [2023-03-06 13:52:15,425][1875604] Saving new best policy, reward=1534.189! [2023-03-06 13:52:15,822][1875656] Updated weights for policy 0, policy_version 5130 (0.0006) [2023-03-06 13:52:16,583][1875656] Updated weights for policy 0, policy_version 5140 (0.0006) [2023-03-06 13:52:17,366][1875656] Updated weights for policy 0, policy_version 5150 (0.0006) [2023-03-06 13:52:18,120][1875656] Updated weights for policy 0, policy_version 5160 (0.0007) [2023-03-06 13:52:18,868][1875656] Updated weights for policy 0, policy_version 5170 (0.0006) [2023-03-06 13:52:19,648][1875656] Updated weights for policy 0, policy_version 5180 (0.0006) [2023-03-06 13:52:20,412][1875328] Fps is (10 sec: 13311.9, 60 sec: 13294.9, 300 sec: 13332.8). Total num frames: 5313536. Throughput: 0: 13327.8. Samples: 5310368. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:52:20,413][1875328] Avg episode reward: [(0, '1514.949')] [2023-03-06 13:52:20,421][1875656] Updated weights for policy 0, policy_version 5190 (0.0006) [2023-03-06 13:52:21,201][1875656] Updated weights for policy 0, policy_version 5200 (0.0006) [2023-03-06 13:52:21,977][1875656] Updated weights for policy 0, policy_version 5210 (0.0006) [2023-03-06 13:52:22,745][1875656] Updated weights for policy 0, policy_version 5220 (0.0006) [2023-03-06 13:52:23,532][1875656] Updated weights for policy 0, policy_version 5230 (0.0006) [2023-03-06 13:52:24,300][1875656] Updated weights for policy 0, policy_version 5240 (0.0006) [2023-03-06 13:52:25,071][1875656] Updated weights for policy 0, policy_version 5250 (0.0006) [2023-03-06 13:52:25,412][1875328] Fps is (10 sec: 13311.9, 60 sec: 13312.0, 300 sec: 13332.8). Total num frames: 5380096. Throughput: 0: 13322.4. Samples: 5349969. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:52:25,413][1875328] Avg episode reward: [(0, '1519.511')] [2023-03-06 13:52:25,828][1875656] Updated weights for policy 0, policy_version 5260 (0.0007) [2023-03-06 13:52:26,608][1875656] Updated weights for policy 0, policy_version 5270 (0.0006) [2023-03-06 13:52:27,383][1875656] Updated weights for policy 0, policy_version 5280 (0.0006) [2023-03-06 13:52:28,135][1875656] Updated weights for policy 0, policy_version 5290 (0.0006) [2023-03-06 13:52:28,914][1875656] Updated weights for policy 0, policy_version 5300 (0.0007) [2023-03-06 13:52:29,669][1875656] Updated weights for policy 0, policy_version 5310 (0.0006) [2023-03-06 13:52:30,412][1875328] Fps is (10 sec: 13312.1, 60 sec: 13312.0, 300 sec: 13332.8). Total num frames: 5446656. Throughput: 0: 13317.9. Samples: 5429768. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:52:30,412][1875328] Avg episode reward: [(0, '1561.516')] [2023-03-06 13:52:30,417][1875604] Saving new best policy, reward=1561.516! 
[2023-03-06 13:52:30,472][1875656] Updated weights for policy 0, policy_version 5320 (0.0007) [2023-03-06 13:52:31,208][1875656] Updated weights for policy 0, policy_version 5330 (0.0006) [2023-03-06 13:52:31,979][1875656] Updated weights for policy 0, policy_version 5340 (0.0006) [2023-03-06 13:52:32,744][1875656] Updated weights for policy 0, policy_version 5350 (0.0006) [2023-03-06 13:52:33,498][1875656] Updated weights for policy 0, policy_version 5360 (0.0007) [2023-03-06 13:52:34,264][1875656] Updated weights for policy 0, policy_version 5370 (0.0006) [2023-03-06 13:52:35,012][1875656] Updated weights for policy 0, policy_version 5380 (0.0006) [2023-03-06 13:52:35,412][1875328] Fps is (10 sec: 13414.5, 60 sec: 13329.1, 300 sec: 13332.8). Total num frames: 5514240. Throughput: 0: 13328.9. Samples: 5510135. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:52:35,412][1875328] Avg episode reward: [(0, '1534.413')] [2023-03-06 13:52:35,783][1875656] Updated weights for policy 0, policy_version 5390 (0.0006) [2023-03-06 13:52:36,562][1875656] Updated weights for policy 0, policy_version 5400 (0.0006) [2023-03-06 13:52:37,316][1875656] Updated weights for policy 0, policy_version 5410 (0.0006) [2023-03-06 13:52:38,068][1875656] Updated weights for policy 0, policy_version 5420 (0.0007) [2023-03-06 13:52:38,847][1875656] Updated weights for policy 0, policy_version 5430 (0.0006) [2023-03-06 13:52:39,624][1875656] Updated weights for policy 0, policy_version 5440 (0.0006) [2023-03-06 13:52:40,380][1875656] Updated weights for policy 0, policy_version 5450 (0.0007) [2023-03-06 13:52:40,412][1875328] Fps is (10 sec: 13414.3, 60 sec: 13329.1, 300 sec: 13332.8). Total num frames: 5580800. Throughput: 0: 13337.1. Samples: 5550389. 
Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) [2023-03-06 13:52:40,413][1875328] Avg episode reward: [(0, '1547.463')] [2023-03-06 13:52:41,144][1875656] Updated weights for policy 0, policy_version 5460 (0.0006) [2023-03-06 13:52:41,925][1875656] Updated weights for policy 0, policy_version 5470 (0.0006) [2023-03-06 13:52:42,687][1875656] Updated weights for policy 0, policy_version 5480 (0.0006) [2023-03-06 13:52:43,440][1875656] Updated weights for policy 0, policy_version 5490 (0.0006) [2023-03-06 13:52:44,224][1875656] Updated weights for policy 0, policy_version 5500 (0.0008) [2023-03-06 13:52:44,985][1875656] Updated weights for policy 0, policy_version 5510 (0.0006) [2023-03-06 13:52:45,412][1875328] Fps is (10 sec: 13311.8, 60 sec: 13329.1, 300 sec: 13329.4). Total num frames: 5647360. Throughput: 0: 13332.9. Samples: 5630384. Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) [2023-03-06 13:52:45,413][1875328] Avg episode reward: [(0, '1527.610')] [2023-03-06 13:52:45,746][1875656] Updated weights for policy 0, policy_version 5520 (0.0006) [2023-03-06 13:52:46,515][1875656] Updated weights for policy 0, policy_version 5530 (0.0006) [2023-03-06 13:52:47,283][1875656] Updated weights for policy 0, policy_version 5540 (0.0006) [2023-03-06 13:52:48,043][1875656] Updated weights for policy 0, policy_version 5550 (0.0006) [2023-03-06 13:52:48,828][1875656] Updated weights for policy 0, policy_version 5560 (0.0006) [2023-03-06 13:52:49,615][1875656] Updated weights for policy 0, policy_version 5570 (0.0005) [2023-03-06 13:52:50,389][1875656] Updated weights for policy 0, policy_version 5580 (0.0007) [2023-03-06 13:52:50,412][1875328] Fps is (10 sec: 13311.9, 60 sec: 13329.0, 300 sec: 13329.4). Total num frames: 5713920. Throughput: 0: 13330.1. Samples: 5710311. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:52:50,413][1875328] Avg episode reward: [(0, '1529.515')] [2023-03-06 13:52:51,145][1875656] Updated weights for policy 0, policy_version 5590 (0.0005) [2023-03-06 13:52:51,928][1875656] Updated weights for policy 0, policy_version 5600 (0.0006) [2023-03-06 13:52:52,694][1875656] Updated weights for policy 0, policy_version 5610 (0.0007) [2023-03-06 13:52:53,456][1875656] Updated weights for policy 0, policy_version 5620 (0.0006) [2023-03-06 13:52:54,237][1875656] Updated weights for policy 0, policy_version 5630 (0.0006) [2023-03-06 13:52:55,001][1875656] Updated weights for policy 0, policy_version 5640 (0.0006) [2023-03-06 13:52:55,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13329.0, 300 sec: 13329.4). Total num frames: 5780480. Throughput: 0: 13331.4. Samples: 5750328. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:52:55,413][1875328] Avg episode reward: [(0, '1563.322')] [2023-03-06 13:52:55,413][1875604] Saving new best policy, reward=1563.322! [2023-03-06 13:52:55,769][1875656] Updated weights for policy 0, policy_version 5650 (0.0006) [2023-03-06 13:52:56,565][1875656] Updated weights for policy 0, policy_version 5660 (0.0006) [2023-03-06 13:52:57,329][1875656] Updated weights for policy 0, policy_version 5670 (0.0005) [2023-03-06 13:52:58,092][1875656] Updated weights for policy 0, policy_version 5680 (0.0006) [2023-03-06 13:52:58,871][1875656] Updated weights for policy 0, policy_version 5690 (0.0007) [2023-03-06 13:52:59,641][1875656] Updated weights for policy 0, policy_version 5700 (0.0006) [2023-03-06 13:53:00,411][1875656] Updated weights for policy 0, policy_version 5710 (0.0006) [2023-03-06 13:53:00,412][1875328] Fps is (10 sec: 13312.3, 60 sec: 13329.1, 300 sec: 13329.4). Total num frames: 5847040. Throughput: 0: 13313.0. Samples: 5829562. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:53:00,413][1875328] Avg episode reward: [(0, '1534.513')] [2023-03-06 13:53:01,165][1875656] Updated weights for policy 0, policy_version 5720 (0.0006) [2023-03-06 13:53:01,944][1875656] Updated weights for policy 0, policy_version 5730 (0.0006) [2023-03-06 13:53:02,697][1875656] Updated weights for policy 0, policy_version 5740 (0.0007) [2023-03-06 13:53:03,463][1875656] Updated weights for policy 0, policy_version 5750 (0.0006) [2023-03-06 13:53:04,240][1875656] Updated weights for policy 0, policy_version 5760 (0.0007) [2023-03-06 13:53:05,016][1875656] Updated weights for policy 0, policy_version 5770 (0.0006) [2023-03-06 13:53:05,412][1875328] Fps is (10 sec: 13312.2, 60 sec: 13329.1, 300 sec: 13325.9). Total num frames: 5913600. Throughput: 0: 13320.5. Samples: 5909787. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:53:05,412][1875328] Avg episode reward: [(0, '1559.008')] [2023-03-06 13:53:05,779][1875656] Updated weights for policy 0, policy_version 5780 (0.0006) [2023-03-06 13:53:06,541][1875656] Updated weights for policy 0, policy_version 5790 (0.0007) [2023-03-06 13:53:07,332][1875656] Updated weights for policy 0, policy_version 5800 (0.0006) [2023-03-06 13:53:08,092][1875656] Updated weights for policy 0, policy_version 5810 (0.0007) [2023-03-06 13:53:08,877][1875656] Updated weights for policy 0, policy_version 5820 (0.0006) [2023-03-06 13:53:09,633][1875656] Updated weights for policy 0, policy_version 5830 (0.0007) [2023-03-06 13:53:10,386][1875656] Updated weights for policy 0, policy_version 5840 (0.0006) [2023-03-06 13:53:10,412][1875328] Fps is (10 sec: 13311.8, 60 sec: 13329.1, 300 sec: 13325.9). Total num frames: 5980160. Throughput: 0: 13325.2. Samples: 5949605. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:53:10,413][1875328] Avg episode reward: [(0, '1604.223')] [2023-03-06 13:53:10,417][1875604] Saving new best policy, reward=1604.223! 
[2023-03-06 13:53:11,161][1875656] Updated weights for policy 0, policy_version 5850 (0.0006) [2023-03-06 13:53:11,917][1875656] Updated weights for policy 0, policy_version 5860 (0.0007) [2023-03-06 13:53:12,682][1875656] Updated weights for policy 0, policy_version 5870 (0.0005) [2023-03-06 13:53:13,447][1875656] Updated weights for policy 0, policy_version 5880 (0.0005) [2023-03-06 13:53:14,228][1875656] Updated weights for policy 0, policy_version 5890 (0.0006) [2023-03-06 13:53:14,988][1875656] Updated weights for policy 0, policy_version 5900 (0.0006) [2023-03-06 13:53:15,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13329.1, 300 sec: 13329.4). Total num frames: 6046720. Throughput: 0: 13332.3. Samples: 6029722. Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) [2023-03-06 13:53:15,412][1875328] Avg episode reward: [(0, '1575.913')] [2023-03-06 13:53:15,761][1875656] Updated weights for policy 0, policy_version 5910 (0.0007) [2023-03-06 13:53:16,538][1875656] Updated weights for policy 0, policy_version 5920 (0.0005) [2023-03-06 13:53:17,294][1875656] Updated weights for policy 0, policy_version 5930 (0.0006) [2023-03-06 13:53:18,046][1875656] Updated weights for policy 0, policy_version 5940 (0.0006) [2023-03-06 13:53:18,832][1875656] Updated weights for policy 0, policy_version 5950 (0.0005) [2023-03-06 13:53:19,589][1875656] Updated weights for policy 0, policy_version 5960 (0.0006) [2023-03-06 13:53:20,352][1875656] Updated weights for policy 0, policy_version 5970 (0.0006) [2023-03-06 13:53:20,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13329.1, 300 sec: 13329.4). Total num frames: 6113280. Throughput: 0: 13327.1. Samples: 6109857. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:53:20,413][1875328] Avg episode reward: [(0, '1574.369')] [2023-03-06 13:53:21,135][1875656] Updated weights for policy 0, policy_version 5980 (0.0006) [2023-03-06 13:53:21,905][1875656] Updated weights for policy 0, policy_version 5990 (0.0006) [2023-03-06 13:53:22,666][1875656] Updated weights for policy 0, policy_version 6000 (0.0006) [2023-03-06 13:53:23,449][1875656] Updated weights for policy 0, policy_version 6010 (0.0006) [2023-03-06 13:53:24,203][1875656] Updated weights for policy 0, policy_version 6020 (0.0006) [2023-03-06 13:53:24,958][1875656] Updated weights for policy 0, policy_version 6030 (0.0006) [2023-03-06 13:53:25,412][1875328] Fps is (10 sec: 13311.8, 60 sec: 13329.0, 300 sec: 13325.9). Total num frames: 6179840. Throughput: 0: 13319.4. Samples: 6149762. Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) [2023-03-06 13:53:25,413][1875328] Avg episode reward: [(0, '1575.736')] [2023-03-06 13:53:25,751][1875656] Updated weights for policy 0, policy_version 6040 (0.0007) [2023-03-06 13:53:26,532][1875656] Updated weights for policy 0, policy_version 6050 (0.0006) [2023-03-06 13:53:27,280][1875656] Updated weights for policy 0, policy_version 6060 (0.0006) [2023-03-06 13:53:28,070][1875656] Updated weights for policy 0, policy_version 6070 (0.0006) [2023-03-06 13:53:28,828][1875656] Updated weights for policy 0, policy_version 6080 (0.0006) [2023-03-06 13:53:29,595][1875656] Updated weights for policy 0, policy_version 6090 (0.0006) [2023-03-06 13:53:30,368][1875656] Updated weights for policy 0, policy_version 6100 (0.0006) [2023-03-06 13:53:30,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13329.1, 300 sec: 13325.9). Total num frames: 6246400. Throughput: 0: 13316.0. Samples: 6229604. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:53:30,413][1875328] Avg episode reward: [(0, '1582.599')] [2023-03-06 13:53:30,432][1875604] Saving /home/qgallouedec/gia/data/envs/metaworld/train_dir/button-press-topdown-v2/checkpoint_p0/checkpoint_000006101_6247424.pth... [2023-03-06 13:53:30,463][1875604] Removing /home/qgallouedec/gia/data/envs/metaworld/train_dir/button-press-topdown-v2/checkpoint_p0/checkpoint_000002977_3048448.pth [2023-03-06 13:53:31,133][1875656] Updated weights for policy 0, policy_version 6110 (0.0006) [2023-03-06 13:53:31,904][1875656] Updated weights for policy 0, policy_version 6120 (0.0006) [2023-03-06 13:53:32,674][1875656] Updated weights for policy 0, policy_version 6130 (0.0006) [2023-03-06 13:53:33,437][1875656] Updated weights for policy 0, policy_version 6140 (0.0007) [2023-03-06 13:53:34,200][1875656] Updated weights for policy 0, policy_version 6150 (0.0006) [2023-03-06 13:53:34,980][1875656] Updated weights for policy 0, policy_version 6160 (0.0006) [2023-03-06 13:53:35,412][1875328] Fps is (10 sec: 13312.2, 60 sec: 13312.0, 300 sec: 13329.4). Total num frames: 6312960. Throughput: 0: 13316.6. Samples: 6309558. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:53:35,412][1875328] Avg episode reward: [(0, '1580.234')] [2023-03-06 13:53:35,742][1875656] Updated weights for policy 0, policy_version 6170 (0.0007) [2023-03-06 13:53:36,517][1875656] Updated weights for policy 0, policy_version 6180 (0.0006) [2023-03-06 13:53:37,311][1875656] Updated weights for policy 0, policy_version 6190 (0.0006) [2023-03-06 13:53:38,081][1875656] Updated weights for policy 0, policy_version 6200 (0.0007) [2023-03-06 13:53:38,835][1875656] Updated weights for policy 0, policy_version 6210 (0.0006) [2023-03-06 13:53:39,638][1875656] Updated weights for policy 0, policy_version 6220 (0.0006) [2023-03-06 13:53:40,388][1875656] Updated weights for policy 0, policy_version 6230 (0.0006) [2023-03-06 13:53:40,412][1875328] Fps is (10 sec: 13312.1, 60 sec: 13312.0, 300 sec: 13329.4). Total num frames: 6379520. Throughput: 0: 13308.4. Samples: 6349204. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:53:40,413][1875328] Avg episode reward: [(0, '1568.645')] [2023-03-06 13:53:41,152][1875656] Updated weights for policy 0, policy_version 6240 (0.0007) [2023-03-06 13:53:41,920][1875656] Updated weights for policy 0, policy_version 6250 (0.0007) [2023-03-06 13:53:42,696][1875656] Updated weights for policy 0, policy_version 6260 (0.0006) [2023-03-06 13:53:43,464][1875656] Updated weights for policy 0, policy_version 6270 (0.0006) [2023-03-06 13:53:44,241][1875656] Updated weights for policy 0, policy_version 6280 (0.0006) [2023-03-06 13:53:45,023][1875656] Updated weights for policy 0, policy_version 6290 (0.0006) [2023-03-06 13:53:45,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13312.0, 300 sec: 13329.4). Total num frames: 6446080. Throughput: 0: 13317.4. Samples: 6428846. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:53:45,413][1875328] Avg episode reward: [(0, '1598.673')] [2023-03-06 13:53:45,793][1875656] Updated weights for policy 0, policy_version 6300 (0.0006) [2023-03-06 13:53:46,554][1875656] Updated weights for policy 0, policy_version 6310 (0.0007) [2023-03-06 13:53:47,355][1875656] Updated weights for policy 0, policy_version 6320 (0.0005) [2023-03-06 13:53:48,116][1875656] Updated weights for policy 0, policy_version 6330 (0.0006) [2023-03-06 13:53:48,894][1875656] Updated weights for policy 0, policy_version 6340 (0.0006) [2023-03-06 13:53:49,676][1875656] Updated weights for policy 0, policy_version 6350 (0.0006) [2023-03-06 13:53:50,412][1875328] Fps is (10 sec: 13209.7, 60 sec: 13295.0, 300 sec: 13325.9). Total num frames: 6511616. Throughput: 0: 13296.2. Samples: 6508117. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:53:50,412][1875328] Avg episode reward: [(0, '1582.567')] [2023-03-06 13:53:50,448][1875656] Updated weights for policy 0, policy_version 6360 (0.0006) [2023-03-06 13:53:51,213][1875656] Updated weights for policy 0, policy_version 6370 (0.0006) [2023-03-06 13:53:51,981][1875656] Updated weights for policy 0, policy_version 6380 (0.0006) [2023-03-06 13:53:52,769][1875656] Updated weights for policy 0, policy_version 6390 (0.0006) [2023-03-06 13:53:53,531][1875656] Updated weights for policy 0, policy_version 6400 (0.0005) [2023-03-06 13:53:54,303][1875656] Updated weights for policy 0, policy_version 6410 (0.0006) [2023-03-06 13:53:55,062][1875656] Updated weights for policy 0, policy_version 6420 (0.0005) [2023-03-06 13:53:55,412][1875328] Fps is (10 sec: 13209.6, 60 sec: 13295.0, 300 sec: 13325.9). Total num frames: 6578176. Throughput: 0: 13293.3. Samples: 6547803. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:53:55,413][1875328] Avg episode reward: [(0, '1588.470')] [2023-03-06 13:53:55,826][1875656] Updated weights for policy 0, policy_version 6430 (0.0006) [2023-03-06 13:53:56,599][1875656] Updated weights for policy 0, policy_version 6440 (0.0006) [2023-03-06 13:53:57,361][1875656] Updated weights for policy 0, policy_version 6450 (0.0005) [2023-03-06 13:53:58,127][1875656] Updated weights for policy 0, policy_version 6460 (0.0007) [2023-03-06 13:53:58,888][1875656] Updated weights for policy 0, policy_version 6470 (0.0006) [2023-03-06 13:53:59,681][1875656] Updated weights for policy 0, policy_version 6480 (0.0006) [2023-03-06 13:54:00,412][1875328] Fps is (10 sec: 13311.7, 60 sec: 13294.9, 300 sec: 13322.4). Total num frames: 6644736. Throughput: 0: 13295.1. Samples: 6628005. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:54:00,413][1875328] Avg episode reward: [(0, '1613.833')] [2023-03-06 13:54:00,422][1875604] Saving new best policy, reward=1613.833! [2023-03-06 13:54:00,425][1875656] Updated weights for policy 0, policy_version 6490 (0.0006) [2023-03-06 13:54:01,211][1875656] Updated weights for policy 0, policy_version 6500 (0.0006) [2023-03-06 13:54:01,971][1875656] Updated weights for policy 0, policy_version 6510 (0.0006) [2023-03-06 13:54:02,731][1875656] Updated weights for policy 0, policy_version 6520 (0.0006) [2023-03-06 13:54:03,494][1875656] Updated weights for policy 0, policy_version 6530 (0.0006) [2023-03-06 13:54:04,271][1875656] Updated weights for policy 0, policy_version 6540 (0.0006) [2023-03-06 13:54:05,015][1875656] Updated weights for policy 0, policy_version 6550 (0.0006) [2023-03-06 13:54:05,412][1875328] Fps is (10 sec: 13312.1, 60 sec: 13294.9, 300 sec: 13322.4). Total num frames: 6711296. Throughput: 0: 13297.5. Samples: 6708244. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:54:05,412][1875328] Avg episode reward: [(0, '1617.709')] [2023-03-06 13:54:05,414][1875604] Saving new best policy, reward=1617.709! [2023-03-06 13:54:05,793][1875656] Updated weights for policy 0, policy_version 6560 (0.0006) [2023-03-06 13:54:06,584][1875656] Updated weights for policy 0, policy_version 6570 (0.0006) [2023-03-06 13:54:07,352][1875656] Updated weights for policy 0, policy_version 6580 (0.0005) [2023-03-06 13:54:08,106][1875656] Updated weights for policy 0, policy_version 6590 (0.0006) [2023-03-06 13:54:08,892][1875656] Updated weights for policy 0, policy_version 6600 (0.0006) [2023-03-06 13:54:09,661][1875656] Updated weights for policy 0, policy_version 6610 (0.0007) [2023-03-06 13:54:10,412][1875328] Fps is (10 sec: 13312.2, 60 sec: 13294.9, 300 sec: 13322.4). Total num frames: 6777856. Throughput: 0: 13297.1. Samples: 6748131. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:54:10,413][1875328] Avg episode reward: [(0, '1643.655')] [2023-03-06 13:54:10,417][1875604] Saving new best policy, reward=1643.655! [2023-03-06 13:54:10,418][1875656] Updated weights for policy 0, policy_version 6620 (0.0006) [2023-03-06 13:54:11,175][1875656] Updated weights for policy 0, policy_version 6630 (0.0006) [2023-03-06 13:54:11,933][1875656] Updated weights for policy 0, policy_version 6640 (0.0007) [2023-03-06 13:54:12,719][1875656] Updated weights for policy 0, policy_version 6650 (0.0006) [2023-03-06 13:54:13,478][1875656] Updated weights for policy 0, policy_version 6660 (0.0006) [2023-03-06 13:54:14,242][1875656] Updated weights for policy 0, policy_version 6670 (0.0007) [2023-03-06 13:54:14,996][1875656] Updated weights for policy 0, policy_version 6680 (0.0006) [2023-03-06 13:54:15,412][1875328] Fps is (10 sec: 13414.2, 60 sec: 13312.0, 300 sec: 13325.9). Total num frames: 6845440. Throughput: 0: 13303.9. Samples: 6828279. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:54:15,413][1875328] Avg episode reward: [(0, '1627.487')] [2023-03-06 13:54:15,782][1875656] Updated weights for policy 0, policy_version 6690 (0.0006) [2023-03-06 13:54:16,536][1875656] Updated weights for policy 0, policy_version 6700 (0.0006) [2023-03-06 13:54:17,315][1875656] Updated weights for policy 0, policy_version 6710 (0.0006) [2023-03-06 13:54:18,084][1875656] Updated weights for policy 0, policy_version 6720 (0.0006) [2023-03-06 13:54:18,862][1875656] Updated weights for policy 0, policy_version 6730 (0.0005) [2023-03-06 13:54:19,621][1875656] Updated weights for policy 0, policy_version 6740 (0.0007) [2023-03-06 13:54:20,387][1875656] Updated weights for policy 0, policy_version 6750 (0.0006) [2023-03-06 13:54:20,412][1875328] Fps is (10 sec: 13414.3, 60 sec: 13312.0, 300 sec: 13322.4). Total num frames: 6912000. Throughput: 0: 13304.1. Samples: 6908246. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:54:20,413][1875328] Avg episode reward: [(0, '1666.531')] [2023-03-06 13:54:20,417][1875604] Saving new best policy, reward=1666.531! [2023-03-06 13:54:21,156][1875656] Updated weights for policy 0, policy_version 6760 (0.0006) [2023-03-06 13:54:21,931][1875656] Updated weights for policy 0, policy_version 6770 (0.0006) [2023-03-06 13:54:22,693][1875656] Updated weights for policy 0, policy_version 6780 (0.0005) [2023-03-06 13:54:23,464][1875656] Updated weights for policy 0, policy_version 6790 (0.0005) [2023-03-06 13:54:24,235][1875656] Updated weights for policy 0, policy_version 6800 (0.0006) [2023-03-06 13:54:25,000][1875656] Updated weights for policy 0, policy_version 6810 (0.0006) [2023-03-06 13:54:25,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13312.0, 300 sec: 13322.4). Total num frames: 6978560. Throughput: 0: 13311.5. Samples: 6948222. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:54:25,413][1875328] Avg episode reward: [(0, '1641.273')] [2023-03-06 13:54:25,770][1875656] Updated weights for policy 0, policy_version 6820 (0.0006) [2023-03-06 13:54:26,537][1875656] Updated weights for policy 0, policy_version 6830 (0.0007) [2023-03-06 13:54:27,323][1875656] Updated weights for policy 0, policy_version 6840 (0.0006) [2023-03-06 13:54:28,077][1875656] Updated weights for policy 0, policy_version 6850 (0.0005) [2023-03-06 13:54:28,850][1875656] Updated weights for policy 0, policy_version 6860 (0.0007) [2023-03-06 13:54:29,620][1875656] Updated weights for policy 0, policy_version 6870 (0.0006) [2023-03-06 13:54:30,412][1875328] Fps is (10 sec: 13209.8, 60 sec: 13295.0, 300 sec: 13318.9). Total num frames: 7044096. Throughput: 0: 13308.7. Samples: 7027738. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:54:30,412][1875328] Avg episode reward: [(0, '1643.808')] [2023-03-06 13:54:30,424][1875656] Updated weights for policy 0, policy_version 6880 (0.0006) [2023-03-06 13:54:31,208][1875656] Updated weights for policy 0, policy_version 6890 (0.0006) [2023-03-06 13:54:31,984][1875656] Updated weights for policy 0, policy_version 6900 (0.0006) [2023-03-06 13:54:32,761][1875656] Updated weights for policy 0, policy_version 6910 (0.0006) [2023-03-06 13:54:33,536][1875656] Updated weights for policy 0, policy_version 6920 (0.0005) [2023-03-06 13:54:34,308][1875656] Updated weights for policy 0, policy_version 6930 (0.0006) [2023-03-06 13:54:35,077][1875656] Updated weights for policy 0, policy_version 6940 (0.0005) [2023-03-06 13:54:35,412][1875328] Fps is (10 sec: 13209.6, 60 sec: 13294.9, 300 sec: 13318.9). Total num frames: 7110656. Throughput: 0: 13303.4. Samples: 7106770. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:54:35,413][1875328] Avg episode reward: [(0, '1571.521')] [2023-03-06 13:54:35,857][1875656] Updated weights for policy 0, policy_version 6950 (0.0006) [2023-03-06 13:54:36,621][1875656] Updated weights for policy 0, policy_version 6960 (0.0006) [2023-03-06 13:54:37,408][1875656] Updated weights for policy 0, policy_version 6970 (0.0006) [2023-03-06 13:54:38,187][1875656] Updated weights for policy 0, policy_version 6980 (0.0007) [2023-03-06 13:54:38,951][1875656] Updated weights for policy 0, policy_version 6990 (0.0006) [2023-03-06 13:54:39,727][1875656] Updated weights for policy 0, policy_version 7000 (0.0006) [2023-03-06 13:54:40,412][1875328] Fps is (10 sec: 13312.0, 60 sec: 13295.0, 300 sec: 13318.9). Total num frames: 7177216. Throughput: 0: 13303.3. Samples: 7146451. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:54:40,412][1875328] Avg episode reward: [(0, '1644.543')] [2023-03-06 13:54:40,492][1875656] Updated weights for policy 0, policy_version 7010 (0.0005) [2023-03-06 13:54:41,266][1875656] Updated weights for policy 0, policy_version 7020 (0.0006) [2023-03-06 13:54:42,038][1875656] Updated weights for policy 0, policy_version 7030 (0.0006) [2023-03-06 13:54:42,801][1875656] Updated weights for policy 0, policy_version 7040 (0.0006) [2023-03-06 13:54:43,572][1875656] Updated weights for policy 0, policy_version 7050 (0.0005) [2023-03-06 13:54:44,330][1875656] Updated weights for policy 0, policy_version 7060 (0.0006) [2023-03-06 13:54:45,105][1875656] Updated weights for policy 0, policy_version 7070 (0.0006) [2023-03-06 13:54:45,412][1875328] Fps is (10 sec: 13312.1, 60 sec: 13294.9, 300 sec: 13318.9). Total num frames: 7243776. Throughput: 0: 13298.2. Samples: 7226420. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:54:45,413][1875328] Avg episode reward: [(0, '1566.827')] [2023-03-06 13:54:45,871][1875656] Updated weights for policy 0, policy_version 7080 (0.0007) [2023-03-06 13:54:46,639][1875656] Updated weights for policy 0, policy_version 7090 (0.0006) [2023-03-06 13:54:47,424][1875656] Updated weights for policy 0, policy_version 7100 (0.0006) [2023-03-06 13:54:48,188][1875656] Updated weights for policy 0, policy_version 7110 (0.0006) [2023-03-06 13:54:48,945][1875656] Updated weights for policy 0, policy_version 7120 (0.0006) [2023-03-06 13:54:49,719][1875656] Updated weights for policy 0, policy_version 7130 (0.0006) [2023-03-06 13:54:50,412][1875328] Fps is (10 sec: 13209.4, 60 sec: 13294.9, 300 sec: 13312.0). Total num frames: 7309312. Throughput: 0: 13287.2. Samples: 7306170. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:54:50,413][1875328] Avg episode reward: [(0, '1622.416')] [2023-03-06 13:54:50,484][1875656] Updated weights for policy 0, policy_version 7140 (0.0006) [2023-03-06 13:54:51,265][1875656] Updated weights for policy 0, policy_version 7150 (0.0006) [2023-03-06 13:54:52,035][1875656] Updated weights for policy 0, policy_version 7160 (0.0006) [2023-03-06 13:54:52,795][1875656] Updated weights for policy 0, policy_version 7170 (0.0006) [2023-03-06 13:54:53,588][1875656] Updated weights for policy 0, policy_version 7180 (0.0007) [2023-03-06 13:54:54,368][1875656] Updated weights for policy 0, policy_version 7190 (0.0005) [2023-03-06 13:54:55,144][1875656] Updated weights for policy 0, policy_version 7200 (0.0005) [2023-03-06 13:54:55,412][1875328] Fps is (10 sec: 13209.7, 60 sec: 13294.9, 300 sec: 13312.0). Total num frames: 7375872. Throughput: 0: 13285.6. Samples: 7345980. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:54:55,412][1875328] Avg episode reward: [(0, '1660.304')] [2023-03-06 13:54:55,925][1875656] Updated weights for policy 0, policy_version 7210 (0.0006) [2023-03-06 13:54:56,712][1875656] Updated weights for policy 0, policy_version 7220 (0.0006) [2023-03-06 13:54:57,484][1875656] Updated weights for policy 0, policy_version 7230 (0.0006) [2023-03-06 13:54:58,260][1875656] Updated weights for policy 0, policy_version 7240 (0.0006) [2023-03-06 13:54:59,044][1875656] Updated weights for policy 0, policy_version 7250 (0.0006) [2023-03-06 13:54:59,818][1875656] Updated weights for policy 0, policy_version 7260 (0.0006) [2023-03-06 13:55:00,412][1875328] Fps is (10 sec: 13209.5, 60 sec: 13277.9, 300 sec: 13305.0). Total num frames: 7441408. Throughput: 0: 13260.2. Samples: 7424988. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:55:00,413][1875328] Avg episode reward: [(0, '1637.408')] [2023-03-06 13:55:00,591][1875656] Updated weights for policy 0, policy_version 7270 (0.0006) [2023-03-06 13:55:01,360][1875656] Updated weights for policy 0, policy_version 7280 (0.0007) [2023-03-06 13:55:02,129][1875656] Updated weights for policy 0, policy_version 7290 (0.0007) [2023-03-06 13:55:02,904][1875656] Updated weights for policy 0, policy_version 7300 (0.0006) [2023-03-06 13:55:03,679][1875656] Updated weights for policy 0, policy_version 7310 (0.0007) [2023-03-06 13:55:04,438][1875656] Updated weights for policy 0, policy_version 7320 (0.0006) [2023-03-06 13:55:05,205][1875656] Updated weights for policy 0, policy_version 7330 (0.0007) [2023-03-06 13:55:05,412][1875328] Fps is (10 sec: 13209.5, 60 sec: 13277.9, 300 sec: 13305.1). Total num frames: 7507968. Throughput: 0: 13252.2. Samples: 7504592. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:55:05,413][1875328] Avg episode reward: [(0, '1608.147')] [2023-03-06 13:55:05,969][1875656] Updated weights for policy 0, policy_version 7340 (0.0006) [2023-03-06 13:55:06,733][1875656] Updated weights for policy 0, policy_version 7350 (0.0006) [2023-03-06 13:55:07,500][1875656] Updated weights for policy 0, policy_version 7360 (0.0006) [2023-03-06 13:55:08,277][1875656] Updated weights for policy 0, policy_version 7370 (0.0006) [2023-03-06 13:55:09,054][1875656] Updated weights for policy 0, policy_version 7380 (0.0006) [2023-03-06 13:55:09,808][1875656] Updated weights for policy 0, policy_version 7390 (0.0006) [2023-03-06 13:55:10,412][1875328] Fps is (10 sec: 13312.1, 60 sec: 13277.9, 300 sec: 13301.6). Total num frames: 7574528. Throughput: 0: 13252.4. Samples: 7544580. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:55:10,413][1875328] Avg episode reward: [(0, '1722.288')] [2023-03-06 13:55:10,418][1875604] Saving new best policy, reward=1722.288! [2023-03-06 13:55:10,588][1875656] Updated weights for policy 0, policy_version 7400 (0.0006) [2023-03-06 13:55:11,366][1875656] Updated weights for policy 0, policy_version 7410 (0.0006) [2023-03-06 13:55:12,136][1875656] Updated weights for policy 0, policy_version 7420 (0.0007) [2023-03-06 13:55:12,910][1875656] Updated weights for policy 0, policy_version 7430 (0.0006) [2023-03-06 13:55:13,689][1875656] Updated weights for policy 0, policy_version 7440 (0.0007) [2023-03-06 13:55:14,467][1875656] Updated weights for policy 0, policy_version 7450 (0.0006) [2023-03-06 13:55:15,248][1875656] Updated weights for policy 0, policy_version 7460 (0.0006) [2023-03-06 13:55:15,412][1875328] Fps is (10 sec: 13312.1, 60 sec: 13260.8, 300 sec: 13301.6). Total num frames: 7641088. Throughput: 0: 13246.1. Samples: 7623814. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:55:15,412][1875328] Avg episode reward: [(0, '1728.565')] [2023-03-06 13:55:15,413][1875604] Saving new best policy, reward=1728.565! [2023-03-06 13:55:16,030][1875656] Updated weights for policy 0, policy_version 7470 (0.0007) [2023-03-06 13:55:16,827][1875656] Updated weights for policy 0, policy_version 7480 (0.0006) [2023-03-06 13:55:17,578][1875656] Updated weights for policy 0, policy_version 7490 (0.0006) [2023-03-06 13:55:18,357][1875656] Updated weights for policy 0, policy_version 7500 (0.0006) [2023-03-06 13:55:19,143][1875656] Updated weights for policy 0, policy_version 7510 (0.0007) [2023-03-06 13:55:19,937][1875656] Updated weights for policy 0, policy_version 7520 (0.0006) [2023-03-06 13:55:20,412][1875328] Fps is (10 sec: 13209.8, 60 sec: 13243.8, 300 sec: 13298.1). Total num frames: 7706624. Throughput: 0: 13243.5. Samples: 7702725. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:55:20,413][1875328] Avg episode reward: [(0, '1799.341')] [2023-03-06 13:55:20,418][1875604] Saving new best policy, reward=1799.341! [2023-03-06 13:55:20,715][1875656] Updated weights for policy 0, policy_version 7530 (0.0006) [2023-03-06 13:55:21,489][1875656] Updated weights for policy 0, policy_version 7540 (0.0006) [2023-03-06 13:55:22,266][1875656] Updated weights for policy 0, policy_version 7550 (0.0006) [2023-03-06 13:55:23,041][1875656] Updated weights for policy 0, policy_version 7560 (0.0006) [2023-03-06 13:55:23,812][1875656] Updated weights for policy 0, policy_version 7570 (0.0006) [2023-03-06 13:55:24,581][1875656] Updated weights for policy 0, policy_version 7580 (0.0006) [2023-03-06 13:55:25,367][1875656] Updated weights for policy 0, policy_version 7590 (0.0006) [2023-03-06 13:55:25,412][1875328] Fps is (10 sec: 13107.1, 60 sec: 13226.7, 300 sec: 13294.6). Total num frames: 7772160. Throughput: 0: 13243.6. Samples: 7742413. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:55:25,413][1875328] Avg episode reward: [(0, '1772.622')] [2023-03-06 13:55:26,125][1875656] Updated weights for policy 0, policy_version 7600 (0.0006) [2023-03-06 13:55:26,908][1875656] Updated weights for policy 0, policy_version 7610 (0.0005) [2023-03-06 13:55:27,690][1875656] Updated weights for policy 0, policy_version 7620 (0.0006) [2023-03-06 13:55:28,471][1875656] Updated weights for policy 0, policy_version 7630 (0.0007) [2023-03-06 13:55:29,232][1875656] Updated weights for policy 0, policy_version 7640 (0.0006) [2023-03-06 13:55:30,014][1875656] Updated weights for policy 0, policy_version 7650 (0.0005) [2023-03-06 13:55:30,412][1875328] Fps is (10 sec: 13209.5, 60 sec: 13243.7, 300 sec: 13294.6). Total num frames: 7838720. Throughput: 0: 13227.8. Samples: 7821673. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:55:30,412][1875328] Avg episode reward: [(0, '1813.851')] [2023-03-06 13:55:30,417][1875604] Saving /home/qgallouedec/gia/data/envs/metaworld/train_dir/button-press-topdown-v2/checkpoint_p0/checkpoint_000007655_7838720.pth... [2023-03-06 13:55:30,446][1875604] Removing /home/qgallouedec/gia/data/envs/metaworld/train_dir/button-press-topdown-v2/checkpoint_p0/checkpoint_000004539_4647936.pth [2023-03-06 13:55:30,449][1875604] Saving new best policy, reward=1813.851! 
[2023-03-06 13:55:30,801][1875656] Updated weights for policy 0, policy_version 7660 (0.0006) [2023-03-06 13:55:31,552][1875656] Updated weights for policy 0, policy_version 7670 (0.0006) [2023-03-06 13:55:32,348][1875656] Updated weights for policy 0, policy_version 7680 (0.0005) [2023-03-06 13:55:33,145][1875656] Updated weights for policy 0, policy_version 7690 (0.0006) [2023-03-06 13:55:33,904][1875656] Updated weights for policy 0, policy_version 7700 (0.0005) [2023-03-06 13:55:34,673][1875656] Updated weights for policy 0, policy_version 7710 (0.0006) [2023-03-06 13:55:35,412][1875328] Fps is (10 sec: 13209.7, 60 sec: 13226.7, 300 sec: 13291.2). Total num frames: 7904256. Throughput: 0: 13210.5. Samples: 7900639. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:55:35,413][1875328] Avg episode reward: [(0, '1728.371')] [2023-03-06 13:55:35,452][1875656] Updated weights for policy 0, policy_version 7720 (0.0006) [2023-03-06 13:55:36,208][1875656] Updated weights for policy 0, policy_version 7730 (0.0007) [2023-03-06 13:55:36,999][1875656] Updated weights for policy 0, policy_version 7740 (0.0005) [2023-03-06 13:55:37,767][1875656] Updated weights for policy 0, policy_version 7750 (0.0006) [2023-03-06 13:55:38,546][1875656] Updated weights for policy 0, policy_version 7760 (0.0005) [2023-03-06 13:55:39,318][1875656] Updated weights for policy 0, policy_version 7770 (0.0006) [2023-03-06 13:55:40,085][1875656] Updated weights for policy 0, policy_version 7780 (0.0006) [2023-03-06 13:55:40,412][1875328] Fps is (10 sec: 13209.7, 60 sec: 13226.7, 300 sec: 13291.2). Total num frames: 7970816. Throughput: 0: 13206.4. Samples: 7940266. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:55:40,412][1875328] Avg episode reward: [(0, '1855.095')] [2023-03-06 13:55:40,417][1875604] Saving new best policy, reward=1855.095! 
[2023-03-06 13:55:40,861][1875656] Updated weights for policy 0, policy_version 7790 (0.0006) [2023-03-06 13:55:41,653][1875656] Updated weights for policy 0, policy_version 7800 (0.0006) [2023-03-06 13:55:42,423][1875656] Updated weights for policy 0, policy_version 7810 (0.0006) [2023-03-06 13:55:43,191][1875656] Updated weights for policy 0, policy_version 7820 (0.0005) [2023-03-06 13:55:43,969][1875656] Updated weights for policy 0, policy_version 7830 (0.0005) [2023-03-06 13:55:44,743][1875656] Updated weights for policy 0, policy_version 7840 (0.0007) [2023-03-06 13:55:45,412][1875328] Fps is (10 sec: 13209.6, 60 sec: 13209.6, 300 sec: 13287.7). Total num frames: 8036352. Throughput: 0: 13214.7. Samples: 8019646. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:55:45,412][1875328] Avg episode reward: [(0, '1845.141')] [2023-03-06 13:55:45,523][1875656] Updated weights for policy 0, policy_version 7850 (0.0006) [2023-03-06 13:55:46,300][1875656] Updated weights for policy 0, policy_version 7860 (0.0006) [2023-03-06 13:55:47,081][1875656] Updated weights for policy 0, policy_version 7870 (0.0006) [2023-03-06 13:55:47,869][1875656] Updated weights for policy 0, policy_version 7880 (0.0006) [2023-03-06 13:55:48,648][1875656] Updated weights for policy 0, policy_version 7890 (0.0007) [2023-03-06 13:55:49,428][1875656] Updated weights for policy 0, policy_version 7900 (0.0006) [2023-03-06 13:55:50,209][1875656] Updated weights for policy 0, policy_version 7910 (0.0006) [2023-03-06 13:55:50,412][1875328] Fps is (10 sec: 13107.2, 60 sec: 13209.6, 300 sec: 13284.2). Total num frames: 8101888. Throughput: 0: 13196.9. Samples: 8098451. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:55:50,412][1875328] Avg episode reward: [(0, '1775.929')] [2023-03-06 13:55:50,982][1875656] Updated weights for policy 0, policy_version 7920 (0.0006) [2023-03-06 13:55:51,748][1875656] Updated weights for policy 0, policy_version 7930 (0.0006) [2023-03-06 13:55:52,540][1875656] Updated weights for policy 0, policy_version 7940 (0.0006) [2023-03-06 13:55:53,304][1875656] Updated weights for policy 0, policy_version 7950 (0.0006) [2023-03-06 13:55:54,079][1875656] Updated weights for policy 0, policy_version 7960 (0.0007) [2023-03-06 13:55:54,848][1875656] Updated weights for policy 0, policy_version 7970 (0.0006) [2023-03-06 13:55:55,412][1875328] Fps is (10 sec: 13209.6, 60 sec: 13209.6, 300 sec: 13284.2). Total num frames: 8168448. Throughput: 0: 13190.3. Samples: 8138143. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:55:55,412][1875328] Avg episode reward: [(0, '1805.682')] [2023-03-06 13:55:55,617][1875656] Updated weights for policy 0, policy_version 7980 (0.0006) [2023-03-06 13:55:56,383][1875656] Updated weights for policy 0, policy_version 7990 (0.0006) [2023-03-06 13:55:57,164][1875656] Updated weights for policy 0, policy_version 8000 (0.0006) [2023-03-06 13:55:57,941][1875656] Updated weights for policy 0, policy_version 8010 (0.0005) [2023-03-06 13:55:58,735][1875656] Updated weights for policy 0, policy_version 8020 (0.0006) [2023-03-06 13:55:59,507][1875656] Updated weights for policy 0, policy_version 8030 (0.0006) [2023-03-06 13:56:00,280][1875656] Updated weights for policy 0, policy_version 8040 (0.0006) [2023-03-06 13:56:00,412][1875328] Fps is (10 sec: 13209.5, 60 sec: 13209.6, 300 sec: 13284.2). Total num frames: 8233984. Throughput: 0: 13191.0. Samples: 8217410. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:56:00,413][1875328] Avg episode reward: [(0, '1687.756')] [2023-03-06 13:56:01,044][1875656] Updated weights for policy 0, policy_version 8050 (0.0006) [2023-03-06 13:56:01,826][1875656] Updated weights for policy 0, policy_version 8060 (0.0008) [2023-03-06 13:56:02,622][1875656] Updated weights for policy 0, policy_version 8070 (0.0007) [2023-03-06 13:56:03,392][1875656] Updated weights for policy 0, policy_version 8080 (0.0005) [2023-03-06 13:56:04,173][1875656] Updated weights for policy 0, policy_version 8090 (0.0007) [2023-03-06 13:56:04,954][1875656] Updated weights for policy 0, policy_version 8100 (0.0006) [2023-03-06 13:56:05,412][1875328] Fps is (10 sec: 13209.7, 60 sec: 13209.6, 300 sec: 13284.2). Total num frames: 8300544. Throughput: 0: 13194.1. Samples: 8296460. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:56:05,412][1875328] Avg episode reward: [(0, '1703.105')] [2023-03-06 13:56:05,741][1875656] Updated weights for policy 0, policy_version 8110 (0.0006) [2023-03-06 13:56:06,502][1875656] Updated weights for policy 0, policy_version 8120 (0.0006) [2023-03-06 13:56:07,283][1875656] Updated weights for policy 0, policy_version 8130 (0.0006) [2023-03-06 13:56:08,048][1875656] Updated weights for policy 0, policy_version 8140 (0.0007) [2023-03-06 13:56:08,818][1875656] Updated weights for policy 0, policy_version 8150 (0.0006) [2023-03-06 13:56:09,576][1875656] Updated weights for policy 0, policy_version 8160 (0.0006) [2023-03-06 13:56:10,370][1875656] Updated weights for policy 0, policy_version 8170 (0.0006) [2023-03-06 13:56:10,412][1875328] Fps is (10 sec: 13209.7, 60 sec: 13192.6, 300 sec: 13277.3). Total num frames: 8366080. Throughput: 0: 13193.8. Samples: 8336135. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 2.0) [2023-03-06 13:56:10,412][1875328] Avg episode reward: [(0, '1707.271')] [2023-03-06 13:56:11,137][1875656] Updated weights for policy 0, policy_version 8180 (0.0006) [2023-03-06 13:56:11,909][1875656] Updated weights for policy 0, policy_version 8190 (0.0006) [2023-03-06 13:56:12,672][1875656] Updated weights for policy 0, policy_version 8200 (0.0006) [2023-03-06 13:56:13,445][1875656] Updated weights for policy 0, policy_version 8210 (0.0005) [2023-03-06 13:56:14,238][1875656] Updated weights for policy 0, policy_version 8220 (0.0006) [2023-03-06 13:56:15,013][1875656] Updated weights for policy 0, policy_version 8230 (0.0006) [2023-03-06 13:56:15,412][1875328] Fps is (10 sec: 13209.5, 60 sec: 13192.5, 300 sec: 13277.3). Total num frames: 8432640. Throughput: 0: 13195.1. Samples: 8415452. Policy #0 lag: (min: 0.0, avg: 1.1, max: 2.0) [2023-03-06 13:56:15,412][1875328] Avg episode reward: [(0, '1787.400')] [2023-03-06 13:56:15,808][1875656] Updated weights for policy 0, policy_version 8240 (0.0007) [2023-03-06 13:56:16,585][1875656] Updated weights for policy 0, policy_version 8250 (0.0005) [2023-03-06 13:56:17,347][1875656] Updated weights for policy 0, policy_version 8260 (0.0007) [2023-03-06 13:56:18,121][1875656] Updated weights for policy 0, policy_version 8270 (0.0005) [2023-03-06 13:56:18,900][1875656] Updated weights for policy 0, policy_version 8280 (0.0006) [2023-03-06 13:56:19,665][1875656] Updated weights for policy 0, policy_version 8290 (0.0005) [2023-03-06 13:56:20,412][1875328] Fps is (10 sec: 13209.5, 60 sec: 13192.5, 300 sec: 13277.3). Total num frames: 8498176. Throughput: 0: 13203.8. Samples: 8494811. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:56:20,413][1875328] Avg episode reward: [(0, '1657.012')] [2023-03-06 13:56:20,423][1875656] Updated weights for policy 0, policy_version 8300 (0.0006) [2023-03-06 13:56:21,214][1875656] Updated weights for policy 0, policy_version 8310 (0.0006) [2023-03-06 13:56:21,991][1875656] Updated weights for policy 0, policy_version 8320 (0.0006) [2023-03-06 13:56:22,772][1875656] Updated weights for policy 0, policy_version 8330 (0.0006) [2023-03-06 13:56:23,555][1875656] Updated weights for policy 0, policy_version 8340 (0.0006) [2023-03-06 13:56:24,337][1875656] Updated weights for policy 0, policy_version 8350 (0.0006) [2023-03-06 13:56:25,109][1875656] Updated weights for policy 0, policy_version 8360 (0.0006) [2023-03-06 13:56:25,412][1875328] Fps is (10 sec: 13107.2, 60 sec: 13192.5, 300 sec: 13273.8). Total num frames: 8563712. Throughput: 0: 13198.1. Samples: 8534183. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:56:25,412][1875328] Avg episode reward: [(0, '1887.711')] [2023-03-06 13:56:25,422][1875604] Saving new best policy, reward=1887.711! [2023-03-06 13:56:25,880][1875656] Updated weights for policy 0, policy_version 8370 (0.0005) [2023-03-06 13:56:26,667][1875656] Updated weights for policy 0, policy_version 8380 (0.0005) [2023-03-06 13:56:27,444][1875656] Updated weights for policy 0, policy_version 8390 (0.0006) [2023-03-06 13:56:28,208][1875656] Updated weights for policy 0, policy_version 8400 (0.0006) [2023-03-06 13:56:28,987][1875656] Updated weights for policy 0, policy_version 8410 (0.0006) [2023-03-06 13:56:29,765][1875656] Updated weights for policy 0, policy_version 8420 (0.0005) [2023-03-06 13:56:30,412][1875328] Fps is (10 sec: 13209.6, 60 sec: 13192.5, 300 sec: 13273.8). Total num frames: 8630272. Throughput: 0: 13194.2. Samples: 8613387. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:56:30,413][1875328] Avg episode reward: [(0, '1841.575')] [2023-03-06 13:56:30,529][1875656] Updated weights for policy 0, policy_version 8430 (0.0006) [2023-03-06 13:56:31,296][1875656] Updated weights for policy 0, policy_version 8440 (0.0006) [2023-03-06 13:56:32,069][1875656] Updated weights for policy 0, policy_version 8450 (0.0006) [2023-03-06 13:56:32,860][1875656] Updated weights for policy 0, policy_version 8460 (0.0006) [2023-03-06 13:56:33,635][1875656] Updated weights for policy 0, policy_version 8470 (0.0006) [2023-03-06 13:56:34,418][1875656] Updated weights for policy 0, policy_version 8480 (0.0006) [2023-03-06 13:56:35,202][1875656] Updated weights for policy 0, policy_version 8490 (0.0006) [2023-03-06 13:56:35,412][1875328] Fps is (10 sec: 13209.7, 60 sec: 13192.5, 300 sec: 13270.4). Total num frames: 8695808. Throughput: 0: 13202.3. Samples: 8692555. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:56:35,412][1875328] Avg episode reward: [(0, '2015.413')] [2023-03-06 13:56:35,419][1875604] Saving new best policy, reward=2015.413! [2023-03-06 13:56:35,984][1875656] Updated weights for policy 0, policy_version 8500 (0.0006) [2023-03-06 13:56:36,759][1875656] Updated weights for policy 0, policy_version 8510 (0.0006) [2023-03-06 13:56:37,540][1875656] Updated weights for policy 0, policy_version 8520 (0.0006) [2023-03-06 13:56:38,321][1875656] Updated weights for policy 0, policy_version 8530 (0.0006) [2023-03-06 13:56:39,096][1875656] Updated weights for policy 0, policy_version 8540 (0.0006) [2023-03-06 13:56:39,888][1875656] Updated weights for policy 0, policy_version 8550 (0.0007) [2023-03-06 13:56:40,412][1875328] Fps is (10 sec: 13107.1, 60 sec: 13175.4, 300 sec: 13266.9). Total num frames: 8761344. Throughput: 0: 13195.5. Samples: 8731943. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:56:40,413][1875328] Avg episode reward: [(0, '1822.680')] [2023-03-06 13:56:40,655][1875656] Updated weights for policy 0, policy_version 8560 (0.0006) [2023-03-06 13:56:41,427][1875656] Updated weights for policy 0, policy_version 8570 (0.0005) [2023-03-06 13:56:42,195][1875656] Updated weights for policy 0, policy_version 8580 (0.0006) [2023-03-06 13:56:42,981][1875656] Updated weights for policy 0, policy_version 8590 (0.0006) [2023-03-06 13:56:43,758][1875656] Updated weights for policy 0, policy_version 8600 (0.0006) [2023-03-06 13:56:44,526][1875656] Updated weights for policy 0, policy_version 8610 (0.0006) [2023-03-06 13:56:45,302][1875656] Updated weights for policy 0, policy_version 8620 (0.0007) [2023-03-06 13:56:45,412][1875328] Fps is (10 sec: 13209.4, 60 sec: 13192.5, 300 sec: 13266.9). Total num frames: 8827904. Throughput: 0: 13188.0. Samples: 8810872. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:56:45,413][1875328] Avg episode reward: [(0, '1855.466')] [2023-03-06 13:56:46,084][1875656] Updated weights for policy 0, policy_version 8630 (0.0006) [2023-03-06 13:56:46,839][1875656] Updated weights for policy 0, policy_version 8640 (0.0007) [2023-03-06 13:56:47,620][1875656] Updated weights for policy 0, policy_version 8650 (0.0006) [2023-03-06 13:56:48,408][1875656] Updated weights for policy 0, policy_version 8660 (0.0006) [2023-03-06 13:56:49,179][1875656] Updated weights for policy 0, policy_version 8670 (0.0006) [2023-03-06 13:56:49,973][1875656] Updated weights for policy 0, policy_version 8680 (0.0006) [2023-03-06 13:56:50,412][1875328] Fps is (10 sec: 13209.8, 60 sec: 13192.5, 300 sec: 13263.4). Total num frames: 8893440. Throughput: 0: 13189.5. Samples: 8889988. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:56:50,412][1875328] Avg episode reward: [(0, '1808.913')] [2023-03-06 13:56:50,757][1875656] Updated weights for policy 0, policy_version 8690 (0.0006) [2023-03-06 13:56:51,541][1875656] Updated weights for policy 0, policy_version 8700 (0.0006) [2023-03-06 13:56:52,304][1875656] Updated weights for policy 0, policy_version 8710 (0.0006) [2023-03-06 13:56:53,083][1875656] Updated weights for policy 0, policy_version 8720 (0.0006) [2023-03-06 13:56:53,860][1875656] Updated weights for policy 0, policy_version 8730 (0.0006) [2023-03-06 13:56:54,649][1875656] Updated weights for policy 0, policy_version 8740 (0.0006) [2023-03-06 13:56:55,409][1875656] Updated weights for policy 0, policy_version 8750 (0.0006) [2023-03-06 13:56:55,412][1875328] Fps is (10 sec: 13209.7, 60 sec: 13192.5, 300 sec: 13263.4). Total num frames: 8960000. Throughput: 0: 13186.2. Samples: 8929513. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:56:55,412][1875328] Avg episode reward: [(0, '1952.407')] [2023-03-06 13:56:56,189][1875656] Updated weights for policy 0, policy_version 8760 (0.0007) [2023-03-06 13:56:56,937][1875656] Updated weights for policy 0, policy_version 8770 (0.0006) [2023-03-06 13:56:57,739][1875656] Updated weights for policy 0, policy_version 8780 (0.0006) [2023-03-06 13:56:58,510][1875656] Updated weights for policy 0, policy_version 8790 (0.0006) [2023-03-06 13:56:59,272][1875656] Updated weights for policy 0, policy_version 8800 (0.0007) [2023-03-06 13:57:00,061][1875656] Updated weights for policy 0, policy_version 8810 (0.0006) [2023-03-06 13:57:00,412][1875328] Fps is (10 sec: 13209.6, 60 sec: 13192.6, 300 sec: 13259.9). Total num frames: 9025536. Throughput: 0: 13186.9. Samples: 9008861. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:57:00,412][1875328] Avg episode reward: [(0, '1717.853')] [2023-03-06 13:57:00,848][1875656] Updated weights for policy 0, policy_version 8820 (0.0006) [2023-03-06 13:57:01,622][1875656] Updated weights for policy 0, policy_version 8830 (0.0006) [2023-03-06 13:57:02,403][1875656] Updated weights for policy 0, policy_version 8840 (0.0006) [2023-03-06 13:57:03,166][1875656] Updated weights for policy 0, policy_version 8850 (0.0007) [2023-03-06 13:57:03,945][1875656] Updated weights for policy 0, policy_version 8860 (0.0006) [2023-03-06 13:57:04,715][1875656] Updated weights for policy 0, policy_version 8870 (0.0006) [2023-03-06 13:57:05,412][1875328] Fps is (10 sec: 13107.1, 60 sec: 13175.4, 300 sec: 13256.5). Total num frames: 9091072. Throughput: 0: 13177.0. Samples: 9087777. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:57:05,413][1875328] Avg episode reward: [(0, '1674.767')] [2023-03-06 13:57:05,505][1875656] Updated weights for policy 0, policy_version 8880 (0.0006) [2023-03-06 13:57:06,287][1875656] Updated weights for policy 0, policy_version 8890 (0.0006) [2023-03-06 13:57:07,084][1875656] Updated weights for policy 0, policy_version 8900 (0.0007) [2023-03-06 13:57:07,848][1875656] Updated weights for policy 0, policy_version 8910 (0.0006) [2023-03-06 13:57:08,631][1875656] Updated weights for policy 0, policy_version 8920 (0.0005) [2023-03-06 13:57:09,406][1875656] Updated weights for policy 0, policy_version 8930 (0.0006) [2023-03-06 13:57:10,196][1875656] Updated weights for policy 0, policy_version 8940 (0.0006) [2023-03-06 13:57:10,412][1875328] Fps is (10 sec: 13107.2, 60 sec: 13175.5, 300 sec: 13253.0). Total num frames: 9156608. Throughput: 0: 13174.8. Samples: 9127051. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:57:10,412][1875328] Avg episode reward: [(0, '1791.264')] [2023-03-06 13:57:10,972][1875656] Updated weights for policy 0, policy_version 8950 (0.0006) [2023-03-06 13:57:11,737][1875656] Updated weights for policy 0, policy_version 8960 (0.0007) [2023-03-06 13:57:12,527][1875656] Updated weights for policy 0, policy_version 8970 (0.0007) [2023-03-06 13:57:13,308][1875656] Updated weights for policy 0, policy_version 8980 (0.0006) [2023-03-06 13:57:14,098][1875656] Updated weights for policy 0, policy_version 8990 (0.0006) [2023-03-06 13:57:14,866][1875656] Updated weights for policy 0, policy_version 9000 (0.0006) [2023-03-06 13:57:15,412][1875328] Fps is (10 sec: 13107.2, 60 sec: 13158.4, 300 sec: 13249.5). Total num frames: 9222144. Throughput: 0: 13165.8. Samples: 9205849. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:57:15,413][1875328] Avg episode reward: [(0, '1828.010')] [2023-03-06 13:57:15,671][1875656] Updated weights for policy 0, policy_version 9010 (0.0006) [2023-03-06 13:57:16,461][1875656] Updated weights for policy 0, policy_version 9020 (0.0007) [2023-03-06 13:57:17,226][1875656] Updated weights for policy 0, policy_version 9030 (0.0007) [2023-03-06 13:57:18,011][1875656] Updated weights for policy 0, policy_version 9040 (0.0006) [2023-03-06 13:57:18,785][1875656] Updated weights for policy 0, policy_version 9050 (0.0006) [2023-03-06 13:57:19,562][1875656] Updated weights for policy 0, policy_version 9060 (0.0006) [2023-03-06 13:57:20,340][1875656] Updated weights for policy 0, policy_version 9070 (0.0007) [2023-03-06 13:57:20,412][1875328] Fps is (10 sec: 13107.1, 60 sec: 13158.4, 300 sec: 13246.0). Total num frames: 9287680. Throughput: 0: 13155.3. Samples: 9284544. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:57:20,413][1875328] Avg episode reward: [(0, '1953.020')] [2023-03-06 13:57:21,123][1875656] Updated weights for policy 0, policy_version 9080 (0.0006) [2023-03-06 13:57:21,940][1875656] Updated weights for policy 0, policy_version 9090 (0.0008) [2023-03-06 13:57:22,709][1875656] Updated weights for policy 0, policy_version 9100 (0.0006) [2023-03-06 13:57:23,480][1875656] Updated weights for policy 0, policy_version 9110 (0.0006) [2023-03-06 13:57:24,248][1875656] Updated weights for policy 0, policy_version 9120 (0.0006) [2023-03-06 13:57:25,035][1875656] Updated weights for policy 0, policy_version 9130 (0.0006) [2023-03-06 13:57:25,412][1875328] Fps is (10 sec: 13107.2, 60 sec: 13158.4, 300 sec: 13242.6). Total num frames: 9353216. Throughput: 0: 13149.7. Samples: 9323679. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:57:25,413][1875328] Avg episode reward: [(0, '1598.880')] [2023-03-06 13:57:25,814][1875656] Updated weights for policy 0, policy_version 9140 (0.0006) [2023-03-06 13:57:26,583][1875656] Updated weights for policy 0, policy_version 9150 (0.0006) [2023-03-06 13:57:27,341][1875656] Updated weights for policy 0, policy_version 9160 (0.0006) [2023-03-06 13:57:28,125][1875656] Updated weights for policy 0, policy_version 9170 (0.0006) [2023-03-06 13:57:28,913][1875656] Updated weights for policy 0, policy_version 9180 (0.0005) [2023-03-06 13:57:29,688][1875656] Updated weights for policy 0, policy_version 9190 (0.0006) [2023-03-06 13:57:30,412][1875328] Fps is (10 sec: 13209.7, 60 sec: 13158.4, 300 sec: 13239.1). Total num frames: 9419776. Throughput: 0: 13152.5. Samples: 9402733. Policy #0 lag: (min: 0.0, avg: 1.2, max: 2.0) [2023-03-06 13:57:30,413][1875328] Avg episode reward: [(0, '1671.797')] [2023-03-06 13:57:30,417][1875604] Saving /home/qgallouedec/gia/data/envs/metaworld/train_dir/button-press-topdown-v2/checkpoint_p0/checkpoint_000009199_9419776.pth... 
[2023-03-06 13:57:30,448][1875604] Removing /home/qgallouedec/gia/data/envs/metaworld/train_dir/button-press-topdown-v2/checkpoint_p0/checkpoint_000006101_6247424.pth [2023-03-06 13:57:30,470][1875656] Updated weights for policy 0, policy_version 9200 (0.0006) [2023-03-06 13:57:31,252][1875656] Updated weights for policy 0, policy_version 9210 (0.0006) [2023-03-06 13:57:32,042][1875656] Updated weights for policy 0, policy_version 9220 (0.0006) [2023-03-06 13:57:32,833][1875656] Updated weights for policy 0, policy_version 9230 (0.0006) [2023-03-06 13:57:33,626][1875656] Updated weights for policy 0, policy_version 9240 (0.0007) [2023-03-06 13:57:34,399][1875656] Updated weights for policy 0, policy_version 9250 (0.0006) [2023-03-06 13:57:35,178][1875656] Updated weights for policy 0, policy_version 9260 (0.0006) [2023-03-06 13:57:35,412][1875328] Fps is (10 sec: 13107.3, 60 sec: 13141.3, 300 sec: 13232.2). Total num frames: 9484288. Throughput: 0: 13137.0. Samples: 9481153. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:57:35,412][1875328] Avg episode reward: [(0, '1901.015')] [2023-03-06 13:57:35,964][1875656] Updated weights for policy 0, policy_version 9270 (0.0006) [2023-03-06 13:57:36,754][1875656] Updated weights for policy 0, policy_version 9280 (0.0007) [2023-03-06 13:57:37,537][1875656] Updated weights for policy 0, policy_version 9290 (0.0006) [2023-03-06 13:57:38,295][1875656] Updated weights for policy 0, policy_version 9300 (0.0005) [2023-03-06 13:57:39,088][1875656] Updated weights for policy 0, policy_version 9310 (0.0006) [2023-03-06 13:57:39,848][1875656] Updated weights for policy 0, policy_version 9320 (0.0006) [2023-03-06 13:57:40,412][1875328] Fps is (10 sec: 13107.3, 60 sec: 13158.4, 300 sec: 13232.2). Total num frames: 9550848. Throughput: 0: 13130.0. Samples: 9520364. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-03-06 13:57:40,412][1875328] Avg episode reward: [(0, '1956.408')] [2023-03-06 13:57:40,624][1875656] Updated weights for policy 0, policy_version 9330 (0.0006) [2023-03-06 13:57:41,428][1875656] Updated weights for policy 0, policy_version 9340 (0.0006) [2023-03-06 13:57:42,186][1875656] Updated weights for policy 0, policy_version 9350 (0.0006) [2023-03-06 13:57:42,960][1875656] Updated weights for policy 0, policy_version 9360 (0.0006) [2023-03-06 13:57:43,751][1875656] Updated weights for policy 0, policy_version 9370 (0.0007) [2023-03-06 13:57:44,542][1875656] Updated weights for policy 0, policy_version 9380 (0.0006) [2023-03-06 13:57:45,315][1875656] Updated weights for policy 0, policy_version 9390 (0.0006) [2023-03-06 13:57:45,412][1875328] Fps is (10 sec: 13209.4, 60 sec: 13141.3, 300 sec: 13228.7). Total num frames: 9616384. Throughput: 0: 13124.0. Samples: 9599442. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:57:45,413][1875328] Avg episode reward: [(0, '1836.822')] [2023-03-06 13:57:46,081][1875656] Updated weights for policy 0, policy_version 9400 (0.0006) [2023-03-06 13:57:46,859][1875656] Updated weights for policy 0, policy_version 9410 (0.0007) [2023-03-06 13:57:47,654][1875656] Updated weights for policy 0, policy_version 9420 (0.0007) [2023-03-06 13:57:48,437][1875656] Updated weights for policy 0, policy_version 9430 (0.0006) [2023-03-06 13:57:49,224][1875656] Updated weights for policy 0, policy_version 9440 (0.0006) [2023-03-06 13:57:50,007][1875656] Updated weights for policy 0, policy_version 9450 (0.0006) [2023-03-06 13:57:50,412][1875328] Fps is (10 sec: 13107.1, 60 sec: 13141.3, 300 sec: 13225.2). Total num frames: 9681920. Throughput: 0: 13120.9. Samples: 9678216. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:57:50,412][1875328] Avg episode reward: [(0, '1886.765')] [2023-03-06 13:57:50,790][1875656] Updated weights for policy 0, policy_version 9460 (0.0005) [2023-03-06 13:57:51,567][1875656] Updated weights for policy 0, policy_version 9470 (0.0006) [2023-03-06 13:57:52,338][1875656] Updated weights for policy 0, policy_version 9480 (0.0006) [2023-03-06 13:57:53,124][1875656] Updated weights for policy 0, policy_version 9490 (0.0006) [2023-03-06 13:57:53,907][1875656] Updated weights for policy 0, policy_version 9500 (0.0006) [2023-03-06 13:57:54,675][1875656] Updated weights for policy 0, policy_version 9510 (0.0006) [2023-03-06 13:57:55,412][1875328] Fps is (10 sec: 13107.4, 60 sec: 13124.3, 300 sec: 13221.7). Total num frames: 9747456. Throughput: 0: 13123.2. Samples: 9717596. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-03-06 13:57:55,412][1875328] Avg episode reward: [(0, '1706.997')] [2023-03-06 13:57:55,479][1875656] Updated weights for policy 0, policy_version 9520 (0.0007) [2023-03-06 13:57:56,247][1875656] Updated weights for policy 0, policy_version 9530 (0.0006) [2023-03-06 13:57:57,033][1875656] Updated weights for policy 0, policy_version 9540 (0.0006) [2023-03-06 13:57:57,815][1875656] Updated weights for policy 0, policy_version 9550 (0.0006) [2023-03-06 13:57:58,597][1875656] Updated weights for policy 0, policy_version 9560 (0.0006) [2023-03-06 13:57:59,372][1875656] Updated weights for policy 0, policy_version 9570 (0.0006) [2023-03-06 13:58:00,152][1875656] Updated weights for policy 0, policy_version 9580 (0.0007) [2023-03-06 13:58:00,412][1875328] Fps is (10 sec: 13107.1, 60 sec: 13124.2, 300 sec: 13218.3). Total num frames: 9812992. Throughput: 0: 13115.7. Samples: 9796058. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:58:00,413][1875328] Avg episode reward: [(0, '1863.497')] [2023-03-06 13:58:00,945][1875656] Updated weights for policy 0, policy_version 9590 (0.0006) [2023-03-06 13:58:01,708][1875656] Updated weights for policy 0, policy_version 9600 (0.0006) [2023-03-06 13:58:02,479][1875656] Updated weights for policy 0, policy_version 9610 (0.0006) [2023-03-06 13:58:03,255][1875656] Updated weights for policy 0, policy_version 9620 (0.0005) [2023-03-06 13:58:04,068][1875656] Updated weights for policy 0, policy_version 9630 (0.0007) [2023-03-06 13:58:04,845][1875656] Updated weights for policy 0, policy_version 9640 (0.0006) [2023-03-06 13:58:05,412][1875328] Fps is (10 sec: 13107.0, 60 sec: 13124.3, 300 sec: 13214.8). Total num frames: 9878528. Throughput: 0: 13118.8. Samples: 9874890. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:58:05,413][1875328] Avg episode reward: [(0, '1851.918')] [2023-03-06 13:58:05,607][1875656] Updated weights for policy 0, policy_version 9650 (0.0006) [2023-03-06 13:58:06,393][1875656] Updated weights for policy 0, policy_version 9660 (0.0006) [2023-03-06 13:58:07,159][1875656] Updated weights for policy 0, policy_version 9670 (0.0006) [2023-03-06 13:58:07,946][1875656] Updated weights for policy 0, policy_version 9680 (0.0006) [2023-03-06 13:58:08,740][1875656] Updated weights for policy 0, policy_version 9690 (0.0007) [2023-03-06 13:58:09,523][1875656] Updated weights for policy 0, policy_version 9700 (0.0006) [2023-03-06 13:58:10,308][1875656] Updated weights for policy 0, policy_version 9710 (0.0007) [2023-03-06 13:58:10,412][1875328] Fps is (10 sec: 13107.2, 60 sec: 13124.2, 300 sec: 13211.3). Total num frames: 9944064. Throughput: 0: 13125.9. Samples: 9914345. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-03-06 13:58:10,413][1875328] Avg episode reward: [(0, '1931.142')] [2023-03-06 13:58:11,109][1875656] Updated weights for policy 0, policy_version 9720 (0.0006) [2023-03-06 13:58:11,870][1875656] Updated weights for policy 0, policy_version 9730 (0.0006) [2023-03-06 13:58:12,671][1875656] Updated weights for policy 0, policy_version 9740 (0.0007) [2023-03-06 13:58:13,446][1875656] Updated weights for policy 0, policy_version 9750 (0.0006) [2023-03-06 13:58:14,232][1875656] Updated weights for policy 0, policy_version 9760 (0.0006) [2023-03-06 13:58:14,767][1876220] Stopping RolloutWorker_w24... [2023-03-06 13:58:14,767][1875821] Stopping RolloutWorker_w6... [2023-03-06 13:58:14,767][1875895] Stopping RolloutWorker_w18... [2023-03-06 13:58:14,767][1875993] Stopping RolloutWorker_w23... [2023-03-06 13:58:14,767][1876220] Loop rollout_proc24_evt_loop terminating... [2023-03-06 13:58:14,767][1876218] Stopping RolloutWorker_w29... [2023-03-06 13:58:14,767][1876219] Stopping RolloutWorker_w30... [2023-03-06 13:58:14,767][1875898] Stopping RolloutWorker_w15... [2023-03-06 13:58:14,767][1875821] Loop rollout_proc6_evt_loop terminating... [2023-03-06 13:58:14,767][1875895] Loop rollout_proc18_evt_loop terminating... [2023-03-06 13:58:14,767][1875857] Stopping RolloutWorker_w17... [2023-03-06 13:58:14,767][1875665] Stopping RolloutWorker_w5... [2023-03-06 13:58:14,767][1875861] Stopping RolloutWorker_w8... [2023-03-06 13:58:14,767][1875993] Loop rollout_proc23_evt_loop terminating... [2023-03-06 13:58:14,767][1875659] Stopping RolloutWorker_w2... [2023-03-06 13:58:14,767][1875657] Stopping RolloutWorker_w0... [2023-03-06 13:58:14,767][1875856] Stopping RolloutWorker_w20... [2023-03-06 13:58:14,767][1876089] Stopping RolloutWorker_w25... [2023-03-06 13:58:14,767][1875898] Loop rollout_proc15_evt_loop terminating... [2023-03-06 13:58:14,767][1875854] Stopping RolloutWorker_w10... 
[2023-03-06 13:58:14,767][1875604] Stopping Batcher_0... [2023-03-06 13:58:14,767][1875859] Stopping RolloutWorker_w14... [2023-03-06 13:58:14,767][1875658] Stopping RolloutWorker_w1... [2023-03-06 13:58:14,767][1876217] Stopping RolloutWorker_w28... [2023-03-06 13:58:14,768][1875861] Loop rollout_proc8_evt_loop terminating... [2023-03-06 13:58:14,768][1875659] Loop rollout_proc2_evt_loop terminating... [2023-03-06 13:58:14,767][1875858] Stopping RolloutWorker_w13... [2023-03-06 13:58:14,767][1875860] Stopping RolloutWorker_w19... [2023-03-06 13:58:14,767][1876221] Stopping RolloutWorker_w31... [2023-03-06 13:58:14,768][1876218] Loop rollout_proc29_evt_loop terminating... [2023-03-06 13:58:14,768][1876219] Loop rollout_proc30_evt_loop terminating... [2023-03-06 13:58:14,768][1876089] Loop rollout_proc25_evt_loop terminating... [2023-03-06 13:58:14,767][1875862] Stopping RolloutWorker_w9... [2023-03-06 13:58:14,768][1875857] Loop rollout_proc17_evt_loop terminating... [2023-03-06 13:58:14,768][1875665] Loop rollout_proc5_evt_loop terminating... [2023-03-06 13:58:14,768][1875894] Stopping RolloutWorker_w11... [2023-03-06 13:58:14,767][1875661] Stopping RolloutWorker_w3... [2023-03-06 13:58:14,768][1875657] Loop rollout_proc0_evt_loop terminating... [2023-03-06 13:58:14,768][1875855] Stopping RolloutWorker_w21... [2023-03-06 13:58:14,768][1875856] Loop rollout_proc20_evt_loop terminating... [2023-03-06 13:58:14,768][1876221] Loop rollout_proc31_evt_loop terminating... [2023-03-06 13:58:14,768][1875859] Loop rollout_proc14_evt_loop terminating... [2023-03-06 13:58:14,768][1876217] Loop rollout_proc28_evt_loop terminating... [2023-03-06 13:58:14,768][1875862] Loop rollout_proc9_evt_loop terminating... [2023-03-06 13:58:14,768][1875658] Loop rollout_proc1_evt_loop terminating... [2023-03-06 13:58:14,768][1875894] Loop rollout_proc11_evt_loop terminating... [2023-03-06 13:58:14,768][1875858] Loop rollout_proc13_evt_loop terminating... 
[2023-03-06 13:58:14,768][1875860] Loop rollout_proc19_evt_loop terminating... [2023-03-06 13:58:14,768][1875604] Loop batcher_evt_loop terminating... [2023-03-06 13:58:14,768][1875328] Component RolloutWorker_w24 stopped! [2023-03-06 13:58:14,768][1875661] Loop rollout_proc3_evt_loop terminating... [2023-03-06 13:58:14,768][1875896] Stopping RolloutWorker_w16... [2023-03-06 13:58:14,768][1875855] Loop rollout_proc21_evt_loop terminating... [2023-03-06 13:58:14,768][1875604] Saving /home/qgallouedec/gia/data/envs/metaworld/train_dir/button-press-topdown-v2/checkpoint_p0/checkpoint_000009767_10001408.pth... [2023-03-06 13:58:14,768][1875328] Component RolloutWorker_w6 stopped! [2023-03-06 13:58:14,768][1875853] Stopping RolloutWorker_w12... [2023-03-06 13:58:14,768][1876025] Stopping RolloutWorker_w22... [2023-03-06 13:58:14,768][1875896] Loop rollout_proc16_evt_loop terminating... [2023-03-06 13:58:14,769][1876025] Loop rollout_proc22_evt_loop terminating... [2023-03-06 13:58:14,769][1875853] Loop rollout_proc12_evt_loop terminating... [2023-03-06 13:58:14,769][1875328] Component RolloutWorker_w18 stopped! [2023-03-06 13:58:14,769][1875328] Component RolloutWorker_w30 stopped! [2023-03-06 13:58:14,769][1875664] Stopping RolloutWorker_w4... [2023-03-06 13:58:14,769][1875328] Component RolloutWorker_w29 stopped! [2023-03-06 13:58:14,769][1875664] Loop rollout_proc4_evt_loop terminating... [2023-03-06 13:58:14,769][1875328] Component RolloutWorker_w23 stopped! [2023-03-06 13:58:14,770][1875328] Component RolloutWorker_w17 stopped! [2023-03-06 13:58:14,770][1875328] Component RolloutWorker_w5 stopped! [2023-03-06 13:58:14,770][1875328] Component RolloutWorker_w15 stopped! [2023-03-06 13:58:14,770][1875328] Component RolloutWorker_w0 stopped! [2023-03-06 13:58:14,770][1875328] Component RolloutWorker_w10 stopped! [2023-03-06 13:58:14,771][1875328] Component RolloutWorker_w8 stopped! [2023-03-06 13:58:14,771][1875328] Component RolloutWorker_w20 stopped! 
[2023-03-06 13:58:14,771][1875328] Component Batcher_0 stopped! [2023-03-06 13:58:14,772][1875328] Component RolloutWorker_w2 stopped! [2023-03-06 13:58:14,772][1875328] Component RolloutWorker_w14 stopped! [2023-03-06 13:58:14,772][1875328] Component RolloutWorker_w1 stopped! [2023-03-06 13:58:14,772][1875328] Component RolloutWorker_w25 stopped! [2023-03-06 13:58:14,772][1875328] Component RolloutWorker_w13 stopped! [2023-03-06 13:58:14,773][1875328] Component RolloutWorker_w28 stopped! [2023-03-06 13:58:14,773][1875328] Component RolloutWorker_w19 stopped! [2023-03-06 13:58:14,773][1875328] Component RolloutWorker_w31 stopped! [2023-03-06 13:58:14,773][1875328] Component RolloutWorker_w3 stopped! [2023-03-06 13:58:14,773][1875328] Component RolloutWorker_w9 stopped! [2023-03-06 13:58:14,774][1875328] Component RolloutWorker_w21 stopped! [2023-03-06 13:58:14,774][1875328] Component RolloutWorker_w11 stopped! [2023-03-06 13:58:14,774][1875328] Component RolloutWorker_w16 stopped! [2023-03-06 13:58:14,774][1875328] Component RolloutWorker_w22 stopped! [2023-03-06 13:58:14,775][1875328] Component RolloutWorker_w12 stopped! [2023-03-06 13:58:14,775][1875328] Component RolloutWorker_w4 stopped! [2023-03-06 13:58:14,776][1875328] Component RolloutWorker_w7 stopped! [2023-03-06 13:58:14,776][1875897] Stopping RolloutWorker_w7... [2023-03-06 13:58:14,777][1875897] Loop rollout_proc7_evt_loop terminating... [2023-03-06 13:58:14,768][1875854] Loop rollout_proc10_evt_loop terminating... [2023-03-06 13:58:14,792][1876204] Stopping RolloutWorker_w27... [2023-03-06 13:58:14,793][1876204] Loop rollout_proc27_evt_loop terminating... [2023-03-06 13:58:14,793][1875328] Component RolloutWorker_w27 stopped! [2023-03-06 13:58:14,851][1875656] Weights refcount: 2 0 [2023-03-06 13:58:14,855][1876216] Stopping RolloutWorker_w26... [2023-03-06 13:58:14,855][1875328] Component InferenceWorker_p0-w0 stopped! [2023-03-06 13:58:14,856][1876216] Loop rollout_proc26_evt_loop terminating... 
[2023-03-06 13:58:14,856][1875328] Component RolloutWorker_w26 stopped! [2023-03-06 13:58:14,854][1875656] Stopping InferenceWorker_p0-w0... [2023-03-06 13:58:14,860][1875656] Loop inference_proc0-0_evt_loop terminating... [2023-03-06 13:58:14,890][1875604] Removing /home/qgallouedec/gia/data/envs/metaworld/train_dir/button-press-topdown-v2/checkpoint_p0/checkpoint_000007655_7838720.pth [2023-03-06 13:58:14,898][1875604] Saving /home/qgallouedec/gia/data/envs/metaworld/train_dir/button-press-topdown-v2/checkpoint_p0/checkpoint_000009767_10001408.pth... [2023-03-06 13:58:14,976][1875604] Stopping LearnerWorker_p0... [2023-03-06 13:58:14,976][1875604] Loop learner_proc0_evt_loop terminating... [2023-03-06 13:58:14,976][1875328] Component LearnerWorker_p0 stopped! [2023-03-06 13:58:14,977][1875328] Waiting for process learner_proc0 to stop... [2023-03-06 13:58:16,180][1875328] Waiting for process inference_proc0-0 to join... [2023-03-06 13:58:16,180][1875328] Waiting for process rollout_proc0 to join... [2023-03-06 13:58:16,181][1875328] Waiting for process rollout_proc1 to join... [2023-03-06 13:58:16,181][1875328] Waiting for process rollout_proc2 to join... [2023-03-06 13:58:16,181][1875328] Waiting for process rollout_proc3 to join... [2023-03-06 13:58:16,182][1875328] Waiting for process rollout_proc4 to join... [2023-03-06 13:58:16,182][1875328] Waiting for process rollout_proc5 to join... [2023-03-06 13:58:16,182][1875328] Waiting for process rollout_proc6 to join... [2023-03-06 13:58:16,183][1875328] Waiting for process rollout_proc7 to join... [2023-03-06 13:58:16,183][1875328] Waiting for process rollout_proc8 to join... [2023-03-06 13:58:16,183][1875328] Waiting for process rollout_proc9 to join... [2023-03-06 13:58:16,184][1875328] Waiting for process rollout_proc10 to join... [2023-03-06 13:58:16,184][1875328] Waiting for process rollout_proc11 to join... [2023-03-06 13:58:16,184][1875328] Waiting for process rollout_proc12 to join... 
[2023-03-06 13:58:16,185][1875328] Waiting for process rollout_proc13 to join...
[2023-03-06 13:58:16,185][1875328] Waiting for process rollout_proc14 to join...
[2023-03-06 13:58:16,186][1875328] Waiting for process rollout_proc15 to join...
[2023-03-06 13:58:16,186][1875328] Waiting for process rollout_proc16 to join...
[2023-03-06 13:58:16,186][1875328] Waiting for process rollout_proc17 to join...
[2023-03-06 13:58:16,187][1875328] Waiting for process rollout_proc18 to join...
[2023-03-06 13:58:16,187][1875328] Waiting for process rollout_proc19 to join...
[2023-03-06 13:58:16,188][1875328] Waiting for process rollout_proc20 to join...
[2023-03-06 13:58:16,188][1875328] Waiting for process rollout_proc21 to join...
[2023-03-06 13:58:16,188][1875328] Waiting for process rollout_proc22 to join...
[2023-03-06 13:58:16,189][1875328] Waiting for process rollout_proc23 to join...
[2023-03-06 13:58:16,189][1875328] Waiting for process rollout_proc24 to join...
[2023-03-06 13:58:16,189][1875328] Waiting for process rollout_proc25 to join...
[2023-03-06 13:58:16,190][1875328] Waiting for process rollout_proc26 to join...
[2023-03-06 13:58:16,190][1875328] Waiting for process rollout_proc27 to join...
[2023-03-06 13:58:16,191][1875328] Waiting for process rollout_proc28 to join...
[2023-03-06 13:58:16,191][1875328] Waiting for process rollout_proc29 to join...
[2023-03-06 13:58:16,191][1875328] Waiting for process rollout_proc30 to join...
[2023-03-06 13:58:16,192][1875328] Waiting for process rollout_proc31 to join...
[2023-03-06 13:58:16,192][1875328] Batcher 0 profile tree view: batching: 94.7574, releasing_batches: 0.1569
[2023-03-06 13:58:16,192][1875328] InferenceWorker_p0-w0 profile tree view: wait_policy: 0.0001 wait_policy_total: 25.8098 update_model: 13.3238 weight_update: 0.0006 one_step: 0.0121 handle_policy_step: 679.9466 deserialize: 20.9805, stack: 3.6983, obs_to_device_normalize: 122.8592, forward: 296.7784, send_messages: 139.0387 prepare_outputs: 69.4172 to_cpu: 35.7882
[2023-03-06 13:58:16,193][1875328] Learner 0 profile tree view: misc: 0.0546, prepare_batch: 48.8439 train: 97.3978 epoch_init: 0.0416, minibatch_init: 0.0462, losses_postprocess: 2.5670, kl_divergence: 3.7265, after_optimizer: 6.4149 calculate_losses: 32.9949 losses_init: 0.0259, forward_head: 1.8677, bptt_initial: 11.6646, tail: 6.5610, advantages_returns: 0.8074, losses: 3.1972 bptt: 7.8578 bptt_forward_core: 7.5748 update: 49.1908 clip: 6.2957
[2023-03-06 13:58:16,193][1875328] RolloutWorker_w0 profile tree view: wait_for_trajectories: 0.4477, enqueue_policy_requests: 21.2109, env_step: 267.0849, overhead: 20.2326, complete_rollouts: 1.1376 save_policy_outputs: 27.1274 split_output_tensors: 13.4240
[2023-03-06 13:58:16,193][1875328] RolloutWorker_w31 profile tree view: wait_for_trajectories: 0.4643, enqueue_policy_requests: 21.5006, env_step: 275.0220, overhead: 20.6926, complete_rollouts: 1.1464 save_policy_outputs: 27.0130 split_output_tensors: 13.3125
[2023-03-06 13:58:16,194][1875328] Loop Runner_EvtLoop terminating...
[2023-03-06 13:58:16,194][1875328] Runner profile tree view: main_loop: 762.7548
[2023-03-06 13:58:16,195][1875328] Collected {0: 10001408}, FPS: 13112.2