nikxtaco committed on
Commit 17453de
1 Parent(s): 6e8ae96

Upload folder using huggingface_hub

.summary/0/events.out.tfevents.1700033565.4391a95ca488 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2d771ab8482c178e8bf9081104c58c80a76e2adaecff8a1215c9356ad72cda47
+ size 2343
README.md CHANGED
@@ -15,7 +15,7 @@ model-index:
  type: doom_health_gathering_supreme
  metrics:
  - type: mean_reward
- value: 9.73 +/- 4.97
+ value: 10.43 +/- 5.21
  name: mean_reward
  verified: false
  ---
checkpoint_p0/checkpoint_000000979_4009984.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a1bbba73e8dc1d41b2810287eed5e98ebd95145102dc4e56b5d68014f37486f9
+ size 34929669
config.json CHANGED
@@ -65,7 +65,7 @@
  "summaries_use_frameskip": true,
  "heartbeat_interval": 20,
  "heartbeat_reporting_interval": 600,
- "train_for_env_steps": 4000000,
+ "train_for_env_steps": 50000,
  "train_for_seconds": 10000000000,
  "save_every_sec": 120,
  "keep_checkpoints": 2,
replay.mp4 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a52081b89a2f9f08cb0b1216df352de49a4cefc8cd6cb67198b9bfca1b8bec39
- size 18903213
+ oid sha256:eb52ab7546998ece94f53381dc18af3b64b68dcbda359637b9feed826a1c69a5
+ size 20332764
sf_log.txt CHANGED
@@ -2411,3 +2411,696 @@ main_loop: 1253.2841
  [2023-11-15 07:30:43,267][00663] Avg episode rewards: #0: 22.728, true rewards: #0: 9.728
  [2023-11-15 07:30:43,270][00663] Avg episode reward: 22.728, avg true_objective: 9.728
  [2023-11-15 07:31:48,191][00663] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
2414
+ [2023-11-15 07:31:53,927][00663] The model has been pushed to https://huggingface.co/nikxtaco/rl_course_vizdoom_health_gathering_supreme
2415
+ [2023-11-15 07:32:45,353][00663] Environment doom_basic already registered, overwriting...
2416
+ [2023-11-15 07:32:45,356][00663] Environment doom_two_colors_easy already registered, overwriting...
2417
+ [2023-11-15 07:32:45,358][00663] Environment doom_two_colors_hard already registered, overwriting...
2418
+ [2023-11-15 07:32:45,359][00663] Environment doom_dm already registered, overwriting...
2419
+ [2023-11-15 07:32:45,361][00663] Environment doom_dwango5 already registered, overwriting...
2420
+ [2023-11-15 07:32:45,363][00663] Environment doom_my_way_home_flat_actions already registered, overwriting...
2421
+ [2023-11-15 07:32:45,364][00663] Environment doom_defend_the_center_flat_actions already registered, overwriting...
2422
+ [2023-11-15 07:32:45,366][00663] Environment doom_my_way_home already registered, overwriting...
2423
+ [2023-11-15 07:32:45,368][00663] Environment doom_deadly_corridor already registered, overwriting...
2424
+ [2023-11-15 07:32:45,369][00663] Environment doom_defend_the_center already registered, overwriting...
2425
+ [2023-11-15 07:32:45,370][00663] Environment doom_defend_the_line already registered, overwriting...
2426
+ [2023-11-15 07:32:45,372][00663] Environment doom_health_gathering already registered, overwriting...
2427
+ [2023-11-15 07:32:45,374][00663] Environment doom_health_gathering_supreme already registered, overwriting...
2428
+ [2023-11-15 07:32:45,376][00663] Environment doom_battle already registered, overwriting...
2429
+ [2023-11-15 07:32:45,378][00663] Environment doom_battle2 already registered, overwriting...
2430
+ [2023-11-15 07:32:45,379][00663] Environment doom_duel_bots already registered, overwriting...
2431
+ [2023-11-15 07:32:45,381][00663] Environment doom_deathmatch_bots already registered, overwriting...
2432
+ [2023-11-15 07:32:45,383][00663] Environment doom_duel already registered, overwriting...
2433
+ [2023-11-15 07:32:45,385][00663] Environment doom_deathmatch_full already registered, overwriting...
2434
+ [2023-11-15 07:32:45,386][00663] Environment doom_benchmark already registered, overwriting...
2435
+ [2023-11-15 07:32:45,388][00663] register_encoder_factory: <function make_vizdoom_encoder at 0x7e8c58d712d0>
2436
+ [2023-11-15 07:32:45,419][00663] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
2437
+ [2023-11-15 07:32:45,420][00663] Overriding arg 'train_for_env_steps' with value 50000 passed from command line
2438
+ [2023-11-15 07:32:45,424][00663] Experiment dir /content/train_dir/default_experiment already exists!
2439
+ [2023-11-15 07:32:45,425][00663] Resuming existing experiment from /content/train_dir/default_experiment...
2440
+ [2023-11-15 07:32:45,429][00663] Weights and Biases integration disabled
2441
+ [2023-11-15 07:32:45,433][00663] Environment var CUDA_VISIBLE_DEVICES is 0
2442
+
2443
+ [2023-11-15 07:32:48,231][00663] Starting experiment with the following configuration:
2444
+ help=False
2445
+ algo=APPO
2446
+ env=doom_health_gathering_supreme
2447
+ experiment=default_experiment
2448
+ train_dir=/content/train_dir
2449
+ restart_behavior=resume
2450
+ device=gpu
2451
+ seed=None
2452
+ num_policies=1
2453
+ async_rl=True
2454
+ serial_mode=False
2455
+ batched_sampling=False
2456
+ num_batches_to_accumulate=2
2457
+ worker_num_splits=2
2458
+ policy_workers_per_policy=1
2459
+ max_policy_lag=1000
2460
+ num_workers=8
2461
+ num_envs_per_worker=4
2462
+ batch_size=1024
2463
+ num_batches_per_epoch=1
2464
+ num_epochs=1
2465
+ rollout=32
2466
+ recurrence=32
2467
+ shuffle_minibatches=False
2468
+ gamma=0.99
2469
+ reward_scale=1.0
2470
+ reward_clip=1000.0
2471
+ value_bootstrap=False
2472
+ normalize_returns=True
2473
+ exploration_loss_coeff=0.001
2474
+ value_loss_coeff=0.5
2475
+ kl_loss_coeff=0.0
2476
+ exploration_loss=symmetric_kl
2477
+ gae_lambda=0.95
2478
+ ppo_clip_ratio=0.1
2479
+ ppo_clip_value=0.2
2480
+ with_vtrace=False
2481
+ vtrace_rho=1.0
2482
+ vtrace_c=1.0
2483
+ optimizer=adam
2484
+ adam_eps=1e-06
2485
+ adam_beta1=0.9
2486
+ adam_beta2=0.999
2487
+ max_grad_norm=4.0
2488
+ learning_rate=0.0001
2489
+ lr_schedule=constant
2490
+ lr_schedule_kl_threshold=0.008
2491
+ lr_adaptive_min=1e-06
2492
+ lr_adaptive_max=0.01
2493
+ obs_subtract_mean=0.0
2494
+ obs_scale=255.0
2495
+ normalize_input=True
2496
+ normalize_input_keys=None
2497
+ decorrelate_experience_max_seconds=0
2498
+ decorrelate_envs_on_one_worker=True
2499
+ actor_worker_gpus=[]
2500
+ set_workers_cpu_affinity=True
2501
+ force_envs_single_thread=False
2502
+ default_niceness=0
2503
+ log_to_file=True
2504
+ experiment_summaries_interval=10
2505
+ flush_summaries_interval=30
2506
+ stats_avg=100
2507
+ summaries_use_frameskip=True
2508
+ heartbeat_interval=20
2509
+ heartbeat_reporting_interval=600
2510
+ train_for_env_steps=50000
2511
+ train_for_seconds=10000000000
2512
+ save_every_sec=120
2513
+ keep_checkpoints=2
2514
+ load_checkpoint_kind=latest
2515
+ save_milestones_sec=-1
2516
+ save_best_every_sec=5
2517
+ save_best_metric=reward
2518
+ save_best_after=100000
2519
+ benchmark=False
2520
+ encoder_mlp_layers=[512, 512]
2521
+ encoder_conv_architecture=convnet_simple
2522
+ encoder_conv_mlp_layers=[512]
2523
+ use_rnn=True
2524
+ rnn_size=512
2525
+ rnn_type=gru
2526
+ rnn_num_layers=1
2527
+ decoder_mlp_layers=[]
2528
+ nonlinearity=elu
2529
+ policy_initialization=orthogonal
2530
+ policy_init_gain=1.0
2531
+ actor_critic_share_weights=True
2532
+ adaptive_stddev=True
2533
+ continuous_tanh_scale=0.0
2534
+ initial_stddev=1.0
2535
+ use_env_info_cache=False
2536
+ env_gpu_actions=False
2537
+ env_gpu_observations=True
2538
+ env_frameskip=4
2539
+ env_framestack=1
2540
+ pixel_format=CHW
2541
+ use_record_episode_statistics=False
2542
+ with_wandb=False
2543
+ wandb_user=None
2544
+ wandb_project=sample_factory
2545
+ wandb_group=None
2546
+ wandb_job_type=SF
2547
+ wandb_tags=[]
2548
+ with_pbt=False
2549
+ pbt_mix_policies_in_one_env=True
2550
+ pbt_period_env_steps=5000000
2551
+ pbt_start_mutation=20000000
2552
+ pbt_replace_fraction=0.3
2553
+ pbt_mutation_rate=0.15
2554
+ pbt_replace_reward_gap=0.1
2555
+ pbt_replace_reward_gap_absolute=1e-06
2556
+ pbt_optimize_gamma=False
2557
+ pbt_target_objective=true_objective
2558
+ pbt_perturb_min=1.1
2559
+ pbt_perturb_max=1.5
2560
+ num_agents=-1
2561
+ num_humans=0
2562
+ num_bots=-1
2563
+ start_bot_difficulty=None
2564
+ timelimit=None
2565
+ res_w=128
2566
+ res_h=72
2567
+ wide_aspect_ratio=False
2568
+ eval_env_frameskip=1
2569
+ fps=35
2570
+ command_line=--env=doom_health_gathering_supreme --num_workers=8 --num_envs_per_worker=4 --train_for_env_steps=4000
2571
+ cli_args={'env': 'doom_health_gathering_supreme', 'num_workers': 8, 'num_envs_per_worker': 4, 'train_for_env_steps': 4000}
2572
+ git_hash=unknown
2573
+ git_repo_name=not a git repository
2574
+ [2023-11-15 07:32:48,234][00663] Saving configuration to /content/train_dir/default_experiment/config.json...
2575
+ [2023-11-15 07:32:48,237][00663] Rollout worker 0 uses device cpu
2576
+ [2023-11-15 07:32:48,239][00663] Rollout worker 1 uses device cpu
2577
+ [2023-11-15 07:32:48,243][00663] Rollout worker 2 uses device cpu
2578
+ [2023-11-15 07:32:48,244][00663] Rollout worker 3 uses device cpu
2579
+ [2023-11-15 07:32:48,245][00663] Rollout worker 4 uses device cpu
2580
+ [2023-11-15 07:32:48,246][00663] Rollout worker 5 uses device cpu
2581
+ [2023-11-15 07:32:48,251][00663] Rollout worker 6 uses device cpu
2582
+ [2023-11-15 07:32:48,252][00663] Rollout worker 7 uses device cpu
2583
+ [2023-11-15 07:32:48,364][00663] Using GPUs [0] for process 0 (actually maps to GPUs [0])
2584
+ [2023-11-15 07:32:48,367][00663] InferenceWorker_p0-w0: min num requests: 2
2585
+ [2023-11-15 07:32:48,408][00663] Starting all processes...
2586
+ [2023-11-15 07:32:48,410][00663] Starting process learner_proc0
2587
+ [2023-11-15 07:32:48,483][00663] Starting all processes...
2588
+ [2023-11-15 07:32:48,496][00663] Starting process inference_proc0-0
2589
+ [2023-11-15 07:32:48,498][00663] Starting process rollout_proc0
2590
+ [2023-11-15 07:32:48,516][00663] Starting process rollout_proc1
2591
+ [2023-11-15 07:32:48,517][00663] Starting process rollout_proc2
2592
+ [2023-11-15 07:32:48,517][00663] Starting process rollout_proc3
2593
+ [2023-11-15 07:32:48,517][00663] Starting process rollout_proc4
2594
+ [2023-11-15 07:32:48,517][00663] Starting process rollout_proc5
2595
+ [2023-11-15 07:32:48,517][00663] Starting process rollout_proc6
2596
+ [2023-11-15 07:32:48,517][00663] Starting process rollout_proc7
2597
+ [2023-11-15 07:33:05,025][29796] Worker 1 uses CPU cores [1]
2598
+ [2023-11-15 07:33:05,444][29797] Worker 2 uses CPU cores [0]
2599
+ [2023-11-15 07:33:05,498][29798] Worker 3 uses CPU cores [1]
2600
+ [2023-11-15 07:33:05,640][29794] Using GPUs [0] for process 0 (actually maps to GPUs [0])
2601
+ [2023-11-15 07:33:05,641][29794] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
2602
+ [2023-11-15 07:33:05,711][29794] Num visible devices: 1
2603
+ [2023-11-15 07:33:05,741][29802] Worker 7 uses CPU cores [1]
2604
+ [2023-11-15 07:33:05,833][29799] Worker 4 uses CPU cores [0]
2605
+ [2023-11-15 07:33:05,854][29800] Worker 5 uses CPU cores [1]
2606
+ [2023-11-15 07:33:05,873][29781] Using GPUs [0] for process 0 (actually maps to GPUs [0])
2607
+ [2023-11-15 07:33:05,873][29781] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
2608
+ [2023-11-15 07:33:05,903][29795] Worker 0 uses CPU cores [0]
2609
+ [2023-11-15 07:33:05,913][29781] Num visible devices: 1
2610
+ [2023-11-15 07:33:05,915][29781] Starting seed is not provided
2611
+ [2023-11-15 07:33:05,916][29781] Using GPUs [0] for process 0 (actually maps to GPUs [0])
2612
+ [2023-11-15 07:33:05,916][29781] Initializing actor-critic model on device cuda:0
2613
+ [2023-11-15 07:33:05,917][29781] RunningMeanStd input shape: (3, 72, 128)
2614
+ [2023-11-15 07:33:05,918][29781] RunningMeanStd input shape: (1,)
2615
+ [2023-11-15 07:33:05,941][29781] ConvEncoder: input_channels=3
2616
+ [2023-11-15 07:33:05,950][29801] Worker 6 uses CPU cores [0]
2617
+ [2023-11-15 07:33:06,110][29781] Conv encoder output size: 512
2618
+ [2023-11-15 07:33:06,111][29781] Policy head output size: 512
2619
+ [2023-11-15 07:33:06,135][29781] Created Actor Critic model with architecture:
2620
+ [2023-11-15 07:33:06,136][29781] ActorCriticSharedWeights(
2621
+ (obs_normalizer): ObservationNormalizer(
2622
+ (running_mean_std): RunningMeanStdDictInPlace(
2623
+ (running_mean_std): ModuleDict(
2624
+ (obs): RunningMeanStdInPlace()
2625
+ )
2626
+ )
2627
+ )
2628
+ (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
2629
+ (encoder): VizdoomEncoder(
2630
+ (basic_encoder): ConvEncoder(
2631
+ (enc): RecursiveScriptModule(
2632
+ original_name=ConvEncoderImpl
2633
+ (conv_head): RecursiveScriptModule(
2634
+ original_name=Sequential
2635
+ (0): RecursiveScriptModule(original_name=Conv2d)
2636
+ (1): RecursiveScriptModule(original_name=ELU)
2637
+ (2): RecursiveScriptModule(original_name=Conv2d)
2638
+ (3): RecursiveScriptModule(original_name=ELU)
2639
+ (4): RecursiveScriptModule(original_name=Conv2d)
2640
+ (5): RecursiveScriptModule(original_name=ELU)
2641
+ )
2642
+ (mlp_layers): RecursiveScriptModule(
2643
+ original_name=Sequential
2644
+ (0): RecursiveScriptModule(original_name=Linear)
2645
+ (1): RecursiveScriptModule(original_name=ELU)
2646
+ )
2647
+ )
2648
+ )
2649
+ )
2650
+ (core): ModelCoreRNN(
2651
+ (core): GRU(512, 512)
2652
+ )
2653
+ (decoder): MlpDecoder(
2654
+ (mlp): Identity()
2655
+ )
2656
+ (critic_linear): Linear(in_features=512, out_features=1, bias=True)
2657
+ (action_parameterization): ActionParameterizationDefault(
2658
+ (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
2659
+ )
2660
+ )
2661
+ [2023-11-15 07:33:06,412][29781] Using optimizer <class 'torch.optim.adam.Adam'>
2662
+ [2023-11-15 07:33:06,874][29781] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
2663
+ [2023-11-15 07:33:06,915][29781] Loading model from checkpoint
2664
+ [2023-11-15 07:33:06,918][29781] Loaded experiment state at self.train_step=978, self.env_steps=4005888
2665
+ [2023-11-15 07:33:06,919][29781] Initialized policy 0 weights for model version 978
2666
+ [2023-11-15 07:33:06,937][29781] Using GPUs [0] for process 0 (actually maps to GPUs [0])
2667
+ [2023-11-15 07:33:06,946][29781] LearnerWorker_p0 finished initialization!
2668
+ [2023-11-15 07:33:07,279][29794] RunningMeanStd input shape: (3, 72, 128)
2669
+ [2023-11-15 07:33:07,282][29794] RunningMeanStd input shape: (1,)
2670
+ [2023-11-15 07:33:07,302][29794] ConvEncoder: input_channels=3
2671
+ [2023-11-15 07:33:07,473][29794] Conv encoder output size: 512
2672
+ [2023-11-15 07:33:07,476][29794] Policy head output size: 512
2673
+ [2023-11-15 07:33:07,573][00663] Inference worker 0-0 is ready!
2674
+ [2023-11-15 07:33:07,576][00663] All inference workers are ready! Signal rollout workers to start!
2675
+ [2023-11-15 07:33:07,842][29800] Doom resolution: 160x120, resize resolution: (128, 72)
2676
+ [2023-11-15 07:33:07,841][29802] Doom resolution: 160x120, resize resolution: (128, 72)
2677
+ [2023-11-15 07:33:07,845][29796] Doom resolution: 160x120, resize resolution: (128, 72)
2678
+ [2023-11-15 07:33:07,844][29798] Doom resolution: 160x120, resize resolution: (128, 72)
2679
+ [2023-11-15 07:33:07,878][29799] Doom resolution: 160x120, resize resolution: (128, 72)
2680
+ [2023-11-15 07:33:07,880][29795] Doom resolution: 160x120, resize resolution: (128, 72)
2681
+ [2023-11-15 07:33:07,884][29797] Doom resolution: 160x120, resize resolution: (128, 72)
2682
+ [2023-11-15 07:33:07,885][29801] Doom resolution: 160x120, resize resolution: (128, 72)
2683
+ [2023-11-15 07:33:08,353][00663] Heartbeat connected on Batcher_0
2684
+ [2023-11-15 07:33:08,361][00663] Heartbeat connected on LearnerWorker_p0
2685
+ [2023-11-15 07:33:08,413][00663] Heartbeat connected on InferenceWorker_p0-w0
2686
+ [2023-11-15 07:33:09,261][29800] Decorrelating experience for 0 frames...
2687
+ [2023-11-15 07:33:09,346][29799] Decorrelating experience for 0 frames...
2688
+ [2023-11-15 07:33:09,358][29797] Decorrelating experience for 0 frames...
2689
+ [2023-11-15 07:33:09,363][29801] Decorrelating experience for 0 frames...
2690
+ [2023-11-15 07:33:10,434][00663] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 4005888. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
2691
+ [2023-11-15 07:33:10,749][29797] Decorrelating experience for 32 frames...
2692
+ [2023-11-15 07:33:10,751][29801] Decorrelating experience for 32 frames...
2693
+ [2023-11-15 07:33:10,763][29795] Decorrelating experience for 0 frames...
2694
+ [2023-11-15 07:33:11,048][29796] Decorrelating experience for 0 frames...
2695
+ [2023-11-15 07:33:11,096][29802] Decorrelating experience for 0 frames...
2696
+ [2023-11-15 07:33:11,832][29796] Decorrelating experience for 32 frames...
2697
+ [2023-11-15 07:33:12,228][29795] Decorrelating experience for 32 frames...
2698
+ [2023-11-15 07:33:12,576][29801] Decorrelating experience for 64 frames...
2699
+ [2023-11-15 07:33:12,578][29797] Decorrelating experience for 64 frames...
2700
+ [2023-11-15 07:33:13,530][29802] Decorrelating experience for 32 frames...
2701
+ [2023-11-15 07:33:13,738][29795] Decorrelating experience for 64 frames...
2702
+ [2023-11-15 07:33:13,838][29798] Decorrelating experience for 0 frames...
2703
+ [2023-11-15 07:33:13,868][29796] Decorrelating experience for 64 frames...
2704
+ [2023-11-15 07:33:13,867][29801] Decorrelating experience for 96 frames...
2705
+ [2023-11-15 07:33:14,088][00663] Heartbeat connected on RolloutWorker_w6
2706
+ [2023-11-15 07:33:14,903][29797] Decorrelating experience for 96 frames...
2707
+ [2023-11-15 07:33:15,016][29800] Decorrelating experience for 32 frames...
2708
+ [2023-11-15 07:33:15,022][29798] Decorrelating experience for 32 frames...
2709
+ [2023-11-15 07:33:15,038][29795] Decorrelating experience for 96 frames...
2710
+ [2023-11-15 07:33:15,261][00663] Heartbeat connected on RolloutWorker_w2
2711
+ [2023-11-15 07:33:15,433][00663] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
2712
+ [2023-11-15 07:33:15,506][00663] Heartbeat connected on RolloutWorker_w0
2713
+ [2023-11-15 07:33:16,199][29796] Decorrelating experience for 96 frames...
2714
+ [2023-11-15 07:33:16,379][29802] Decorrelating experience for 64 frames...
2715
+ [2023-11-15 07:33:16,468][00663] Heartbeat connected on RolloutWorker_w1
2716
+ [2023-11-15 07:33:16,759][29798] Decorrelating experience for 64 frames...
2717
+ [2023-11-15 07:33:17,730][29799] Decorrelating experience for 32 frames...
2718
+ [2023-11-15 07:33:18,946][29800] Decorrelating experience for 64 frames...
2719
+ [2023-11-15 07:33:19,909][29781] Stopping Batcher_0...
2720
+ [2023-11-15 07:33:19,911][29781] Loop batcher_evt_loop terminating...
2721
+ [2023-11-15 07:33:19,915][00663] Component Batcher_0 stopped!
2722
+ [2023-11-15 07:33:19,917][29781] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000979_4009984.pth...
2723
+ [2023-11-15 07:33:19,950][29796] Stopping RolloutWorker_w1...
2724
+ [2023-11-15 07:33:19,950][00663] Component RolloutWorker_w1 stopped!
2725
+ [2023-11-15 07:33:19,966][00663] Component RolloutWorker_w2 stopped!
2726
+ [2023-11-15 07:33:19,971][29796] Loop rollout_proc1_evt_loop terminating...
2727
+ [2023-11-15 07:33:19,966][29797] Stopping RolloutWorker_w2...
2728
+ [2023-11-15 07:33:19,973][00663] Component RolloutWorker_w6 stopped!
2729
+ [2023-11-15 07:33:19,973][29801] Stopping RolloutWorker_w6...
2730
+ [2023-11-15 07:33:19,976][29797] Loop rollout_proc2_evt_loop terminating...
2731
+ [2023-11-15 07:33:19,980][29801] Loop rollout_proc6_evt_loop terminating...
2732
+ [2023-11-15 07:33:19,987][00663] Component RolloutWorker_w0 stopped!
2733
+ [2023-11-15 07:33:19,987][29795] Stopping RolloutWorker_w0...
2734
+ [2023-11-15 07:33:19,990][29795] Loop rollout_proc0_evt_loop terminating...
2735
+ [2023-11-15 07:33:20,012][29794] Weights refcount: 2 0
2736
+ [2023-11-15 07:33:20,021][00663] Component InferenceWorker_p0-w0 stopped!
2737
+ [2023-11-15 07:33:20,023][29794] Stopping InferenceWorker_p0-w0...
2738
+ [2023-11-15 07:33:20,023][29794] Loop inference_proc0-0_evt_loop terminating...
2739
+ [2023-11-15 07:33:20,086][29781] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000941_3854336.pth
2740
+ [2023-11-15 07:33:20,109][29781] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000979_4009984.pth...
2741
+ [2023-11-15 07:33:20,145][29802] Decorrelating experience for 96 frames...
2742
+ [2023-11-15 07:33:20,322][00663] Component LearnerWorker_p0 stopped!
2743
+ [2023-11-15 07:33:20,324][29781] Stopping LearnerWorker_p0...
2744
+ [2023-11-15 07:33:20,326][29781] Loop learner_proc0_evt_loop terminating...
2745
+ [2023-11-15 07:33:20,604][29798] Decorrelating experience for 96 frames...
2746
+ [2023-11-15 07:33:20,931][00663] Component RolloutWorker_w7 stopped!
2747
+ [2023-11-15 07:33:20,934][29802] Stopping RolloutWorker_w7...
2748
+ [2023-11-15 07:33:20,936][29802] Loop rollout_proc7_evt_loop terminating...
2749
+ [2023-11-15 07:33:21,203][00663] Component RolloutWorker_w3 stopped!
2750
+ [2023-11-15 07:33:21,201][29798] Stopping RolloutWorker_w3...
2751
+ [2023-11-15 07:33:21,206][29798] Loop rollout_proc3_evt_loop terminating...
2752
+ [2023-11-15 07:33:22,117][29800] Decorrelating experience for 96 frames...
2753
+ [2023-11-15 07:33:22,120][29799] Decorrelating experience for 64 frames...
2754
+ [2023-11-15 07:33:22,386][00663] Component RolloutWorker_w5 stopped!
2755
+ [2023-11-15 07:33:22,386][29800] Stopping RolloutWorker_w5...
2756
+ [2023-11-15 07:33:22,388][29800] Loop rollout_proc5_evt_loop terminating...
2757
+ [2023-11-15 07:33:23,934][29799] Decorrelating experience for 96 frames...
2758
+ [2023-11-15 07:33:24,200][29799] Stopping RolloutWorker_w4...
2759
+ [2023-11-15 07:33:24,200][00663] Component RolloutWorker_w4 stopped!
2760
+ [2023-11-15 07:33:24,207][29799] Loop rollout_proc4_evt_loop terminating...
2761
+ [2023-11-15 07:33:24,206][00663] Waiting for process learner_proc0 to stop...
2762
+ [2023-11-15 07:33:24,212][00663] Waiting for process inference_proc0-0 to join...
2763
+ [2023-11-15 07:33:24,215][00663] Waiting for process rollout_proc0 to join...
2764
+ [2023-11-15 07:33:24,220][00663] Waiting for process rollout_proc1 to join...
2765
+ [2023-11-15 07:33:24,228][00663] Waiting for process rollout_proc2 to join...
2766
+ [2023-11-15 07:33:24,230][00663] Waiting for process rollout_proc3 to join...
2767
+ [2023-11-15 07:33:24,485][00663] Waiting for process rollout_proc4 to join...
2768
+ [2023-11-15 07:33:25,056][00663] Waiting for process rollout_proc5 to join...
2769
+ [2023-11-15 07:33:25,062][00663] Waiting for process rollout_proc6 to join...
2770
+ [2023-11-15 07:33:25,064][00663] Waiting for process rollout_proc7 to join...
2771
+ [2023-11-15 07:33:25,066][00663] Batcher 0 profile tree view:
2772
+ batching: 0.0184, releasing_batches: 0.0000
2773
+ [2023-11-15 07:33:25,068][00663] InferenceWorker_p0-w0 profile tree view:
2774
+ wait_policy: 0.0038
2775
+ wait_policy_total: 9.5800
2776
+ update_model: 0.0192
2777
+ weight_update: 0.0013
2778
+ one_step: 0.0029
2779
+ handle_policy_step: 2.5433
2780
+ deserialize: 0.0512, stack: 0.0104, obs_to_device_normalize: 0.4439, forward: 1.6464, send_messages: 0.0528
2781
+ prepare_outputs: 0.2688
2782
+ to_cpu: 0.1773
2783
+ [2023-11-15 07:33:25,069][00663] Learner 0 profile tree view:
2784
+ misc: 0.0000, prepare_batch: 1.0413
2785
+ train: 1.5532
2786
+ epoch_init: 0.0000, minibatch_init: 0.0000, losses_postprocess: 0.0002, kl_divergence: 0.0073, after_optimizer: 0.0407
2787
+ calculate_losses: 0.4753
2788
+ losses_init: 0.0000, forward_head: 0.3270, bptt_initial: 0.1047, tail: 0.0067, advantages_returns: 0.0009, losses: 0.0310
2789
+ bptt: 0.0048
2790
+ bptt_forward_core: 0.0047
2791
+ update: 1.0292
2792
+ clip: 0.0481
2793
+ [2023-11-15 07:33:25,070][00663] RolloutWorker_w0 profile tree view:
2794
+ wait_for_trajectories: 0.0009, enqueue_policy_requests: 0.9218, env_step: 2.9765, overhead: 0.0933, complete_rollouts: 0.0458
2795
+ save_policy_outputs: 0.0550
2796
+ split_output_tensors: 0.0273
2797
+ [2023-11-15 07:33:25,071][00663] RolloutWorker_w7 profile tree view:
2798
+ wait_for_trajectories: 0.0003, enqueue_policy_requests: 0.0148
2799
+ [2023-11-15 07:33:25,073][00663] Loop Runner_EvtLoop terminating...
2800
+ [2023-11-15 07:33:25,077][00663] Runner profile tree view:
2801
+ main_loop: 36.6700
2802
+ [2023-11-15 07:33:25,079][00663] Collected {0: 4009984}, FPS: 111.7
2803
+ [2023-11-15 07:33:25,121][00663] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
2804
+ [2023-11-15 07:33:25,124][00663] Overriding arg 'num_workers' with value 1 passed from command line
2805
+ [2023-11-15 07:33:25,128][00663] Adding new argument 'no_render'=True that is not in the saved config file!
2806
+ [2023-11-15 07:33:25,130][00663] Adding new argument 'save_video'=True that is not in the saved config file!
2807
+ [2023-11-15 07:33:25,133][00663] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
2808
+ [2023-11-15 07:33:25,136][00663] Adding new argument 'video_name'=None that is not in the saved config file!
2809
+ [2023-11-15 07:33:25,137][00663] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
2810
+ [2023-11-15 07:33:25,140][00663] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
2811
+ [2023-11-15 07:33:25,142][00663] Adding new argument 'push_to_hub'=False that is not in the saved config file!
2812
+ [2023-11-15 07:33:25,144][00663] Adding new argument 'hf_repository'=None that is not in the saved config file!
2813
+ [2023-11-15 07:33:25,145][00663] Adding new argument 'policy_index'=0 that is not in the saved config file!
2814
+ [2023-11-15 07:33:25,148][00663] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
2815
+ [2023-11-15 07:33:25,149][00663] Adding new argument 'train_script'=None that is not in the saved config file!
2816
+ [2023-11-15 07:33:25,151][00663] Adding new argument 'enjoy_script'=None that is not in the saved config file!
2817
+ [2023-11-15 07:33:25,152][00663] Using frameskip 1 and render_action_repeat=4 for evaluation
2818
+ [2023-11-15 07:33:25,220][00663] RunningMeanStd input shape: (3, 72, 128)
2819
+ [2023-11-15 07:33:25,222][00663] RunningMeanStd input shape: (1,)
2820
+ [2023-11-15 07:33:25,247][00663] ConvEncoder: input_channels=3
2821
+ [2023-11-15 07:33:25,316][00663] Conv encoder output size: 512
2822
+ [2023-11-15 07:33:25,318][00663] Policy head output size: 512
2823
+ [2023-11-15 07:33:25,349][00663] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000979_4009984.pth...
2824
+ [2023-11-15 07:33:26,067][00663] Num frames 100...
2825
+ [2023-11-15 07:33:26,253][00663] Num frames 200...
2826
+ [2023-11-15 07:33:26,468][00663] Num frames 300...
2827
+ [2023-11-15 07:33:26,674][00663] Num frames 400...
2828
+ [2023-11-15 07:33:26,861][00663] Num frames 500...
2829
+ [2023-11-15 07:33:27,055][00663] Num frames 600...
2830
+ [2023-11-15 07:33:27,250][00663] Num frames 700...
2831
+ [2023-11-15 07:33:27,456][00663] Num frames 800...
2832
+ [2023-11-15 07:33:27,651][00663] Num frames 900...
2833
+ [2023-11-15 07:33:27,893][00663] Avg episode rewards: #0: 23.920, true rewards: #0: 9.920
2834
+ [2023-11-15 07:33:27,895][00663] Avg episode reward: 23.920, avg true_objective: 9.920
+ [2023-11-15 07:33:27,914][00663] Num frames 1000...
+ [2023-11-15 07:33:28,107][00663] Num frames 1100...
+ [2023-11-15 07:33:28,309][00663] Num frames 1200...
+ [2023-11-15 07:33:28,515][00663] Num frames 1300...
+ [2023-11-15 07:33:28,702][00663] Num frames 1400...
+ [2023-11-15 07:33:28,894][00663] Num frames 1500...
+ [2023-11-15 07:33:29,032][00663] Num frames 1600...
+ [2023-11-15 07:33:29,158][00663] Num frames 1700...
+ [2023-11-15 07:33:29,287][00663] Num frames 1800...
+ [2023-11-15 07:33:29,482][00663] Avg episode rewards: #0: 21.440, true rewards: #0: 9.440
+ [2023-11-15 07:33:29,484][00663] Avg episode reward: 21.440, avg true_objective: 9.440
+ [2023-11-15 07:33:29,503][00663] Num frames 1900...
+ [2023-11-15 07:33:29,631][00663] Num frames 2000...
+ [2023-11-15 07:33:29,762][00663] Num frames 2100...
+ [2023-11-15 07:33:29,887][00663] Num frames 2200...
+ [2023-11-15 07:33:30,024][00663] Num frames 2300...
+ [2023-11-15 07:33:30,155][00663] Num frames 2400...
+ [2023-11-15 07:33:30,285][00663] Num frames 2500...
+ [2023-11-15 07:33:30,420][00663] Num frames 2600...
+ [2023-11-15 07:33:30,564][00663] Num frames 2700...
+ [2023-11-15 07:33:30,695][00663] Num frames 2800...
+ [2023-11-15 07:33:30,829][00663] Num frames 2900...
+ [2023-11-15 07:33:30,960][00663] Num frames 3000...
+ [2023-11-15 07:33:31,095][00663] Num frames 3100...
+ [2023-11-15 07:33:31,234][00663] Num frames 3200...
+ [2023-11-15 07:33:31,374][00663] Num frames 3300...
+ [2023-11-15 07:33:31,519][00663] Num frames 3400...
+ [2023-11-15 07:33:31,658][00663] Num frames 3500...
+ [2023-11-15 07:33:31,795][00663] Num frames 3600...
+ [2023-11-15 07:33:31,930][00663] Num frames 3700...
+ [2023-11-15 07:33:32,065][00663] Num frames 3800...
+ [2023-11-15 07:33:32,197][00663] Num frames 3900...
+ [2023-11-15 07:33:32,370][00663] Avg episode rewards: #0: 31.960, true rewards: #0: 13.293
+ [2023-11-15 07:33:32,371][00663] Avg episode reward: 31.960, avg true_objective: 13.293
+ [2023-11-15 07:33:32,393][00663] Num frames 4000...
+ [2023-11-15 07:33:32,537][00663] Num frames 4100...
+ [2023-11-15 07:33:32,673][00663] Num frames 4200...
+ [2023-11-15 07:33:32,816][00663] Num frames 4300...
+ [2023-11-15 07:33:32,947][00663] Num frames 4400...
+ [2023-11-15 07:33:33,087][00663] Num frames 4500...
+ [2023-11-15 07:33:33,216][00663] Num frames 4600...
+ [2023-11-15 07:33:33,355][00663] Num frames 4700...
+ [2023-11-15 07:33:33,488][00663] Num frames 4800...
+ [2023-11-15 07:33:33,626][00663] Num frames 4900...
+ [2023-11-15 07:33:33,756][00663] Num frames 5000...
+ [2023-11-15 07:33:33,887][00663] Num frames 5100...
+ [2023-11-15 07:33:34,018][00663] Num frames 5200...
+ [2023-11-15 07:33:34,083][00663] Avg episode rewards: #0: 30.010, true rewards: #0: 13.010
+ [2023-11-15 07:33:34,085][00663] Avg episode reward: 30.010, avg true_objective: 13.010
+ [2023-11-15 07:33:34,224][00663] Num frames 5300...
+ [2023-11-15 07:33:34,361][00663] Num frames 5400...
+ [2023-11-15 07:33:34,491][00663] Num frames 5500...
+ [2023-11-15 07:33:34,629][00663] Num frames 5600...
+ [2023-11-15 07:33:34,763][00663] Num frames 5700...
+ [2023-11-15 07:33:34,880][00663] Avg episode rewards: #0: 26.096, true rewards: #0: 11.496
+ [2023-11-15 07:33:34,884][00663] Avg episode reward: 26.096, avg true_objective: 11.496
+ [2023-11-15 07:33:34,952][00663] Num frames 5800...
+ [2023-11-15 07:33:35,080][00663] Num frames 5900...
+ [2023-11-15 07:33:35,210][00663] Num frames 6000...
+ [2023-11-15 07:33:35,340][00663] Num frames 6100...
+ [2023-11-15 07:33:35,476][00663] Num frames 6200...
+ [2023-11-15 07:33:35,612][00663] Num frames 6300...
+ [2023-11-15 07:33:35,743][00663] Num frames 6400...
+ [2023-11-15 07:33:35,871][00663] Num frames 6500...
+ [2023-11-15 07:33:36,004][00663] Num frames 6600...
+ [2023-11-15 07:33:36,135][00663] Num frames 6700...
+ [2023-11-15 07:33:36,264][00663] Num frames 6800...
+ [2023-11-15 07:33:36,422][00663] Num frames 6900...
+ [2023-11-15 07:33:36,574][00663] Num frames 7000...
+ [2023-11-15 07:33:36,710][00663] Num frames 7100...
+ [2023-11-15 07:33:36,843][00663] Num frames 7200...
+ [2023-11-15 07:33:36,974][00663] Num frames 7300...
+ [2023-11-15 07:33:37,107][00663] Num frames 7400...
+ [2023-11-15 07:33:37,241][00663] Num frames 7500...
+ [2023-11-15 07:33:37,350][00663] Avg episode rewards: #0: 29.233, true rewards: #0: 12.567
+ [2023-11-15 07:33:37,352][00663] Avg episode reward: 29.233, avg true_objective: 12.567
+ [2023-11-15 07:33:37,437][00663] Num frames 7600...
+ [2023-11-15 07:33:37,574][00663] Num frames 7700...
+ [2023-11-15 07:33:37,718][00663] Num frames 7800...
+ [2023-11-15 07:33:37,854][00663] Num frames 7900...
+ [2023-11-15 07:33:37,986][00663] Num frames 8000...
+ [2023-11-15 07:33:38,119][00663] Num frames 8100...
+ [2023-11-15 07:33:38,274][00663] Avg episode rewards: #0: 26.681, true rewards: #0: 11.681
+ [2023-11-15 07:33:38,276][00663] Avg episode reward: 26.681, avg true_objective: 11.681
+ [2023-11-15 07:33:38,308][00663] Num frames 8200...
+ [2023-11-15 07:33:38,442][00663] Num frames 8300...
+ [2023-11-15 07:33:38,574][00663] Num frames 8400...
+ [2023-11-15 07:33:38,714][00663] Num frames 8500...
+ [2023-11-15 07:33:38,845][00663] Num frames 8600...
+ [2023-11-15 07:33:39,018][00663] Num frames 8700...
+ [2023-11-15 07:33:39,222][00663] Num frames 8800...
+ [2023-11-15 07:33:39,416][00663] Num frames 8900...
+ [2023-11-15 07:33:39,610][00663] Num frames 9000...
+ [2023-11-15 07:33:39,815][00663] Num frames 9100...
+ [2023-11-15 07:33:40,007][00663] Num frames 9200...
+ [2023-11-15 07:33:40,204][00663] Num frames 9300...
+ [2023-11-15 07:33:40,403][00663] Num frames 9400...
+ [2023-11-15 07:33:40,602][00663] Num frames 9500...
+ [2023-11-15 07:33:40,802][00663] Num frames 9600...
+ [2023-11-15 07:33:40,988][00663] Num frames 9700...
+ [2023-11-15 07:33:41,179][00663] Num frames 9800...
+ [2023-11-15 07:33:41,380][00663] Num frames 9900...
+ [2023-11-15 07:33:41,455][00663] Avg episode rewards: #0: 29.006, true rewards: #0: 12.381
+ [2023-11-15 07:33:41,457][00663] Avg episode reward: 29.006, avg true_objective: 12.381
+ [2023-11-15 07:33:41,645][00663] Num frames 10000...
+ [2023-11-15 07:33:41,851][00663] Num frames 10100...
+ [2023-11-15 07:33:42,042][00663] Num frames 10200...
+ [2023-11-15 07:33:42,229][00663] Num frames 10300...
+ [2023-11-15 07:33:42,428][00663] Num frames 10400...
+ [2023-11-15 07:33:42,632][00663] Num frames 10500...
+ [2023-11-15 07:33:42,775][00663] Avg episode rewards: #0: 27.494, true rewards: #0: 11.717
+ [2023-11-15 07:33:42,777][00663] Avg episode reward: 27.494, avg true_objective: 11.717
+ [2023-11-15 07:33:42,884][00663] Num frames 10600...
+ [2023-11-15 07:33:43,070][00663] Num frames 10700...
+ [2023-11-15 07:33:43,272][00663] Num frames 10800...
+ [2023-11-15 07:33:43,470][00663] Num frames 10900...
+ [2023-11-15 07:33:43,662][00663] Num frames 11000...
+ [2023-11-15 07:33:43,855][00663] Num frames 11100...
+ [2023-11-15 07:33:44,056][00663] Num frames 11200...
+ [2023-11-15 07:33:44,250][00663] Num frames 11300...
+ [2023-11-15 07:33:44,455][00663] Num frames 11400...
+ [2023-11-15 07:33:44,588][00663] Num frames 11500...
+ [2023-11-15 07:33:44,724][00663] Num frames 11600...
+ [2023-11-15 07:33:44,784][00663] Avg episode rewards: #0: 26.901, true rewards: #0: 11.601
+ [2023-11-15 07:33:44,785][00663] Avg episode reward: 26.901, avg true_objective: 11.601
+ [2023-11-15 07:34:59,120][00663] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
+ [2023-11-15 07:34:59,693][00663] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
+ [2023-11-15 07:34:59,695][00663] Overriding arg 'num_workers' with value 1 passed from command line
+ [2023-11-15 07:34:59,701][00663] Adding new argument 'no_render'=True that is not in the saved config file!
+ [2023-11-15 07:34:59,705][00663] Adding new argument 'save_video'=True that is not in the saved config file!
+ [2023-11-15 07:34:59,707][00663] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
+ [2023-11-15 07:34:59,709][00663] Adding new argument 'video_name'=None that is not in the saved config file!
+ [2023-11-15 07:34:59,711][00663] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
+ [2023-11-15 07:34:59,712][00663] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
+ [2023-11-15 07:34:59,713][00663] Adding new argument 'push_to_hub'=True that is not in the saved config file!
+ [2023-11-15 07:34:59,714][00663] Adding new argument 'hf_repository'='nikxtaco/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
+ [2023-11-15 07:34:59,715][00663] Adding new argument 'policy_index'=0 that is not in the saved config file!
+ [2023-11-15 07:34:59,716][00663] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
+ [2023-11-15 07:34:59,717][00663] Adding new argument 'train_script'=None that is not in the saved config file!
+ [2023-11-15 07:34:59,718][00663] Adding new argument 'enjoy_script'=None that is not in the saved config file!
+ [2023-11-15 07:34:59,720][00663] Using frameskip 1 and render_action_repeat=4 for evaluation
+ [2023-11-15 07:34:59,762][00663] RunningMeanStd input shape: (3, 72, 128)
+ [2023-11-15 07:34:59,764][00663] RunningMeanStd input shape: (1,)
+ [2023-11-15 07:34:59,781][00663] ConvEncoder: input_channels=3
+ [2023-11-15 07:34:59,840][00663] Conv encoder output size: 512
+ [2023-11-15 07:34:59,843][00663] Policy head output size: 512
+ [2023-11-15 07:34:59,871][00663] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000979_4009984.pth...
+ [2023-11-15 07:35:00,607][00663] Num frames 100...
+ [2023-11-15 07:35:00,796][00663] Num frames 200...
+ [2023-11-15 07:35:00,998][00663] Num frames 300...
+ [2023-11-15 07:35:01,191][00663] Num frames 400...
+ [2023-11-15 07:35:01,426][00663] Num frames 500...
+ [2023-11-15 07:35:01,656][00663] Num frames 600...
+ [2023-11-15 07:35:01,850][00663] Num frames 700...
+ [2023-11-15 07:35:02,050][00663] Num frames 800...
+ [2023-11-15 07:35:02,243][00663] Num frames 900...
+ [2023-11-15 07:35:02,469][00663] Num frames 1000...
+ [2023-11-15 07:35:02,674][00663] Num frames 1100...
+ [2023-11-15 07:35:02,874][00663] Num frames 1200...
+ [2023-11-15 07:35:03,080][00663] Num frames 1300...
+ [2023-11-15 07:35:03,287][00663] Num frames 1400...
+ [2023-11-15 07:35:03,506][00663] Num frames 1500...
+ [2023-11-15 07:35:03,706][00663] Num frames 1600...
+ [2023-11-15 07:35:03,903][00663] Num frames 1700...
+ [2023-11-15 07:35:04,156][00663] Num frames 1800...
+ [2023-11-15 07:35:04,390][00663] Num frames 1900...
+ [2023-11-15 07:35:04,621][00663] Num frames 2000...
+ [2023-11-15 07:35:04,859][00663] Num frames 2100...
+ [2023-11-15 07:35:04,912][00663] Avg episode rewards: #0: 54.999, true rewards: #0: 21.000
+ [2023-11-15 07:35:04,914][00663] Avg episode reward: 54.999, avg true_objective: 21.000
+ [2023-11-15 07:35:05,148][00663] Num frames 2200...
+ [2023-11-15 07:35:05,373][00663] Num frames 2300...
+ [2023-11-15 07:35:05,613][00663] Num frames 2400...
+ [2023-11-15 07:35:05,844][00663] Num frames 2500...
+ [2023-11-15 07:35:06,072][00663] Num frames 2600...
+ [2023-11-15 07:35:06,302][00663] Avg episode rewards: #0: 32.380, true rewards: #0: 13.380
+ [2023-11-15 07:35:06,304][00663] Avg episode reward: 32.380, avg true_objective: 13.380
+ [2023-11-15 07:35:06,376][00663] Num frames 2700...
+ [2023-11-15 07:35:06,582][00663] Num frames 2800...
+ [2023-11-15 07:35:06,827][00663] Num frames 2900...
+ [2023-11-15 07:35:07,020][00663] Num frames 3000...
+ [2023-11-15 07:35:07,242][00663] Num frames 3100...
+ [2023-11-15 07:35:07,527][00663] Num frames 3200...
+ [2023-11-15 07:35:07,789][00663] Avg episode rewards: #0: 25.613, true rewards: #0: 10.947
+ [2023-11-15 07:35:07,791][00663] Avg episode reward: 25.613, avg true_objective: 10.947
+ [2023-11-15 07:35:07,830][00663] Num frames 3300...
+ [2023-11-15 07:35:08,071][00663] Num frames 3400...
+ [2023-11-15 07:35:08,307][00663] Num frames 3500...
+ [2023-11-15 07:35:08,537][00663] Num frames 3600...
+ [2023-11-15 07:35:08,795][00663] Num frames 3700...
+ [2023-11-15 07:35:09,056][00663] Num frames 3800...
+ [2023-11-15 07:35:09,237][00663] Avg episode rewards: #0: 21.877, true rewards: #0: 9.628
+ [2023-11-15 07:35:09,239][00663] Avg episode reward: 21.877, avg true_objective: 9.628
+ [2023-11-15 07:35:09,364][00663] Num frames 3900...
+ [2023-11-15 07:35:09,614][00663] Num frames 4000...
+ [2023-11-15 07:35:09,846][00663] Num frames 4100...
+ [2023-11-15 07:35:10,029][00663] Num frames 4200...
+ [2023-11-15 07:35:10,219][00663] Num frames 4300...
+ [2023-11-15 07:35:10,408][00663] Num frames 4400...
+ [2023-11-15 07:35:10,598][00663] Num frames 4500...
+ [2023-11-15 07:35:10,796][00663] Num frames 4600...
+ [2023-11-15 07:35:10,971][00663] Avg episode rewards: #0: 21.102, true rewards: #0: 9.302
+ [2023-11-15 07:35:10,974][00663] Avg episode reward: 21.102, avg true_objective: 9.302
+ [2023-11-15 07:35:11,044][00663] Num frames 4700...
+ [2023-11-15 07:35:11,172][00663] Num frames 4800...
+ [2023-11-15 07:35:11,300][00663] Num frames 4900...
+ [2023-11-15 07:35:11,434][00663] Num frames 5000...
+ [2023-11-15 07:35:11,567][00663] Num frames 5100...
+ [2023-11-15 07:35:11,701][00663] Num frames 5200...
+ [2023-11-15 07:35:11,857][00663] Num frames 5300...
+ [2023-11-15 07:35:11,997][00663] Num frames 5400...
+ [2023-11-15 07:35:12,129][00663] Num frames 5500...
+ [2023-11-15 07:35:12,265][00663] Num frames 5600...
+ [2023-11-15 07:35:12,407][00663] Num frames 5700...
+ [2023-11-15 07:35:12,538][00663] Num frames 5800...
+ [2023-11-15 07:35:12,669][00663] Num frames 5900...
+ [2023-11-15 07:35:12,803][00663] Num frames 6000...
+ [2023-11-15 07:35:12,942][00663] Num frames 6100...
+ [2023-11-15 07:35:13,088][00663] Avg episode rewards: #0: 23.780, true rewards: #0: 10.280
+ [2023-11-15 07:35:13,090][00663] Avg episode reward: 23.780, avg true_objective: 10.280
+ [2023-11-15 07:35:13,138][00663] Num frames 6200...
+ [2023-11-15 07:35:13,277][00663] Num frames 6300...
+ [2023-11-15 07:35:13,410][00663] Num frames 6400...
+ [2023-11-15 07:35:13,544][00663] Num frames 6500...
+ [2023-11-15 07:35:13,675][00663] Num frames 6600...
+ [2023-11-15 07:35:13,803][00663] Num frames 6700...
+ [2023-11-15 07:35:13,958][00663] Avg episode rewards: #0: 22.397, true rewards: #0: 9.683
+ [2023-11-15 07:35:13,960][00663] Avg episode reward: 22.397, avg true_objective: 9.683
+ [2023-11-15 07:35:13,992][00663] Num frames 6800...
+ [2023-11-15 07:35:14,122][00663] Num frames 6900...
+ [2023-11-15 07:35:14,254][00663] Num frames 7000...
+ [2023-11-15 07:35:14,393][00663] Num frames 7100...
+ [2023-11-15 07:35:14,523][00663] Num frames 7200...
+ [2023-11-15 07:35:14,656][00663] Num frames 7300...
+ [2023-11-15 07:35:14,801][00663] Num frames 7400...
+ [2023-11-15 07:35:14,934][00663] Num frames 7500...
+ [2023-11-15 07:35:15,073][00663] Num frames 7600...
+ [2023-11-15 07:35:15,204][00663] Num frames 7700...
+ [2023-11-15 07:35:15,340][00663] Num frames 7800...
+ [2023-11-15 07:35:15,482][00663] Num frames 7900...
+ [2023-11-15 07:35:15,620][00663] Num frames 8000...
+ [2023-11-15 07:35:15,761][00663] Num frames 8100...
+ [2023-11-15 07:35:15,905][00663] Num frames 8200...
+ [2023-11-15 07:35:16,087][00663] Avg episode rewards: #0: 24.602, true rewards: #0: 10.352
+ [2023-11-15 07:35:16,089][00663] Avg episode reward: 24.602, avg true_objective: 10.352
+ [2023-11-15 07:35:16,116][00663] Num frames 8300...
+ [2023-11-15 07:35:16,249][00663] Num frames 8400...
+ [2023-11-15 07:35:16,384][00663] Num frames 8500...
+ [2023-11-15 07:35:16,517][00663] Num frames 8600...
+ [2023-11-15 07:35:16,655][00663] Num frames 8700...
+ [2023-11-15 07:35:16,795][00663] Num frames 8800...
+ [2023-11-15 07:35:16,926][00663] Num frames 8900...
+ [2023-11-15 07:35:17,111][00663] Avg episode rewards: #0: 23.095, true rewards: #0: 9.984
+ [2023-11-15 07:35:17,113][00663] Avg episode reward: 23.095, avg true_objective: 9.984
+ [2023-11-15 07:35:17,137][00663] Num frames 9000...
+ [2023-11-15 07:35:17,273][00663] Num frames 9100...
+ [2023-11-15 07:35:17,411][00663] Num frames 9200...
+ [2023-11-15 07:35:17,543][00663] Num frames 9300...
+ [2023-11-15 07:35:17,676][00663] Num frames 9400...
+ [2023-11-15 07:35:17,809][00663] Num frames 9500...
+ [2023-11-15 07:35:17,940][00663] Num frames 9600...
+ [2023-11-15 07:35:18,077][00663] Num frames 9700...
+ [2023-11-15 07:35:18,206][00663] Num frames 9800...
+ [2023-11-15 07:35:18,340][00663] Num frames 9900...
+ [2023-11-15 07:35:18,478][00663] Num frames 10000...
+ [2023-11-15 07:35:18,611][00663] Num frames 10100...
+ [2023-11-15 07:35:18,742][00663] Num frames 10200...
+ [2023-11-15 07:35:18,871][00663] Num frames 10300...
+ [2023-11-15 07:35:18,998][00663] Num frames 10400...
+ [2023-11-15 07:35:19,105][00663] Avg episode rewards: #0: 24.032, true rewards: #0: 10.432
+ [2023-11-15 07:35:19,107][00663] Avg episode reward: 24.032, avg true_objective: 10.432
+ [2023-11-15 07:36:27,462][00663] Replay video saved to /content/train_dir/default_experiment/replay.mp4!