Test_Pendulum-v1_A2C
2023-06-24 15:26:40 - SimpleLog - INFO: - General Configs:
2023-06-24 15:26:40 - SimpleLog - INFO: - ================================================================================
2023-06-24 15:26:40 - SimpleLog - INFO: - Name Value Type
2023-06-24 15:26:40 - SimpleLog - INFO: - env_name gym <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - algo_name A2C <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - mode test <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - device cpu <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - seed 1 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - max_episode 50 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - max_step 200 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - collect_traj 0 <class 'bool'>
2023-06-24 15:26:40 - SimpleLog - INFO: - mp_backend single <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - n_workers 2 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - n_learners 1 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - share_buffer 1 <class 'bool'>
2023-06-24 15:26:40 - SimpleLog - INFO: - online_eval 1 <class 'bool'>
2023-06-24 15:26:40 - SimpleLog - INFO: - online_eval_episode 10 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - model_save_fre 10 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - load_checkpoint 1 <class 'bool'>
2023-06-24 15:26:40 - SimpleLog - INFO: - load_path Train_Pendulum-v1_A2C_20230623-232832 <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - load_model_step best <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - interact_summary_fre 1 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - model_summary_fre 1 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - ================================================================================
2023-06-24 15:26:40 - SimpleLog - INFO: - Algo Configs:
2023-06-24 15:26:40 - SimpleLog - INFO: - ================================================================================
2023-06-24 15:26:40 - SimpleLog - INFO: - Name Value Type
2023-06-24 15:26:40 - SimpleLog - INFO: - independ_actor 1 <class 'bool'>
2023-06-24 15:26:40 - SimpleLog - INFO: - share_optimizer 0 <class 'bool'>
2023-06-24 15:26:40 - SimpleLog - INFO: - action_type continuous <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - gamma 0.9 <class 'float'>
2023-06-24 15:26:40 - SimpleLog - INFO: - k_epochs 4 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - lr 0.0001 <class 'float'>
2023-06-24 15:26:40 - SimpleLog - INFO: - actor_lr 0.0001 <class 'float'>
2023-06-24 15:26:40 - SimpleLog - INFO: - critic_lr 0.005 <class 'float'>
2023-06-24 15:26:40 - SimpleLog - INFO: - critic_loss_coef 0.5 <class 'float'>
2023-06-24 15:26:40 - SimpleLog - INFO: - entropy_coef 0.01 <class 'float'>
2023-06-24 15:26:40 - SimpleLog - INFO: - buffer_type ONPOLICY_QUE <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - batch_size 256 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - sgd_batch_size 32 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - actor_hidden_dim 256 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - critic_hidden_dim 256 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - min_policy 0 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - n_steps_per_learn 1 <class 'int'>
2023-06-24 15:26:40 - SimpleLog - INFO: - actor_layers [{'layer_type': 'linear', 'layer_size': [256], 'activation': 'relu'}] <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - critic_layers [{'layer_type': 'linear', 'layer_size': [256], 'activation': 'relu'}] <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - ================================================================================
2023-06-24 15:26:40 - SimpleLog - INFO: - Env Configs:
2023-06-24 15:26:40 - SimpleLog - INFO: - ================================================================================
2023-06-24 15:26:40 - SimpleLog - INFO: - Name Value Type
2023-06-24 15:26:40 - SimpleLog - INFO: - id Pendulum-v1 <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - render_mode None <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - wrapper None <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - ignore_params ['wrapper', 'ignore_params'] <class 'str'>
2023-06-24 15:26:40 - SimpleLog - INFO: - ================================================================================
2023-06-24 15:26:40 - SimpleLog - INFO: - Start testing!
2023-06-24 15:26:40 - SimpleLog - INFO: - Interactor 0 finished episode 1 with reward -661.605 in 200 steps
2023-06-24 15:26:41 - SimpleLog - INFO: - Interactor 1 finished episode 2 with reward -520.007 in 200 steps
2023-06-24 15:26:41 - SimpleLog - INFO: - Interactor 0 finished episode 3 with reward -801.194 in 200 steps
2023-06-24 15:26:41 - SimpleLog - INFO: - Interactor 1 finished episode 4 with reward -776.387 in 200 steps
2023-06-24 15:26:41 - SimpleLog - INFO: - Interactor 0 finished episode 5 with reward -1018.592 in 200 steps
2023-06-24 15:26:41 - SimpleLog - INFO: - Interactor 1 finished episode 6 with reward -924.237 in 200 steps
2023-06-24 15:26:42 - SimpleLog - INFO: - Interactor 0 finished episode 7 with reward -696.645 in 200 steps
2023-06-24 15:26:42 - SimpleLog - INFO: - Interactor 0 finished episode 8 with reward -1031.963 in 200 steps
2023-06-24 15:26:42 - SimpleLog - INFO: - Interactor 1 finished episode 9 with reward -521.132 in 200 steps
2023-06-24 15:26:42 - SimpleLog - INFO: - Interactor 1 finished episode 10 with reward -931.846 in 200 steps
2023-06-24 15:26:42 - SimpleLog - INFO: - Interactor 0 finished episode 11 with reward -301.204 in 200 steps
2023-06-24 15:26:42 - SimpleLog - INFO: - Interactor 1 finished episode 12 with reward -645.720 in 200 steps
2023-06-24 15:26:43 - SimpleLog - INFO: - update_step: 10, online_eval_reward: -760.746
2023-06-24 15:26:43 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: -760.746, save the best model!
2023-06-24 15:26:43 - SimpleLog - INFO: - Interactor 0 finished episode 13 with reward -264.046 in 200 steps
2023-06-24 15:26:43 - SimpleLog - INFO: - Interactor 1 finished episode 14 with reward -517.979 in 200 steps
2023-06-24 15:26:43 - SimpleLog - INFO: - Interactor 0 finished episode 15 with reward -527.257 in 200 steps
2023-06-24 15:26:43 - SimpleLog - INFO: - Interactor 1 finished episode 16 with reward -401.069 in 200 steps
2023-06-24 15:26:43 - SimpleLog - INFO: - Interactor 0 finished episode 17 with reward -693.433 in 200 steps
2023-06-24 15:26:44 - SimpleLog - INFO: - Interactor 0 finished episode 18 with reward -266.325 in 200 steps
2023-06-24 15:26:44 - SimpleLog - INFO: - Interactor 1 finished episode 19 with reward -704.755 in 200 steps
2023-06-24 15:26:44 - SimpleLog - INFO: - Interactor 1 finished episode 20 with reward -916.228 in 200 steps
2023-06-24 15:26:44 - SimpleLog - INFO: - Interactor 0 finished episode 21 with reward -654.939 in 200 steps
2023-06-24 15:26:44 - SimpleLog - INFO: - Interactor 1 finished episode 22 with reward -518.691 in 200 steps
2023-06-24 15:26:44 - SimpleLog - INFO: - Interactor 0 finished episode 23 with reward -438.668 in 200 steps
2023-06-24 15:26:44 - SimpleLog - INFO: - Interactor 1 finished episode 24 with reward -521.045 in 200 steps
2023-06-24 15:26:45 - SimpleLog - INFO: - update_step: 20, online_eval_reward: -775.715
2023-06-24 15:26:45 - SimpleLog - INFO: - Interactor 0 finished episode 25 with reward -657.087 in 200 steps
2023-06-24 15:26:45 - SimpleLog - INFO: - Interactor 0 finished episode 26 with reward -520.093 in 200 steps
2023-06-24 15:26:45 - SimpleLog - INFO: - Interactor 1 finished episode 27 with reward -520.386 in 200 steps
2023-06-24 15:26:45 - SimpleLog - INFO: - Interactor 1 finished episode 28 with reward -883.519 in 200 steps
2023-06-24 15:26:46 - SimpleLog - INFO: - Interactor 0 finished episode 29 with reward -7.415 in 200 steps
2023-06-24 15:26:46 - SimpleLog - INFO: - Interactor 1 finished episode 30 with reward -932.779 in 200 steps
2023-06-24 15:26:46 - SimpleLog - INFO: - Interactor 0 finished episode 31 with reward -872.634 in 200 steps
2023-06-24 15:26:46 - SimpleLog - INFO: - Interactor 1 finished episode 32 with reward -924.046 in 200 steps
2023-06-24 15:26:46 - SimpleLog - INFO: - Interactor 0 finished episode 33 with reward -779.892 in 200 steps
2023-06-24 15:26:46 - SimpleLog - INFO: - Interactor 1 finished episode 34 with reward -1026.092 in 200 steps
2023-06-24 15:26:47 - SimpleLog - INFO: - Interactor 0 finished episode 35 with reward -692.214 in 200 steps
2023-06-24 15:26:47 - SimpleLog - INFO: - Interactor 0 finished episode 36 with reward -656.551 in 200 steps
2023-06-24 15:26:47 - SimpleLog - INFO: - Interactor 1 finished episode 37 with reward -915.715 in 200 steps
2023-06-24 15:26:47 - SimpleLog - INFO: - Interactor 1 finished episode 38 with reward -1004.479 in 200 steps
2023-06-24 15:26:47 - SimpleLog - INFO: - update_step: 30, online_eval_reward: -381.387
2023-06-24 15:26:47 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: -381.387, save the best model!
2023-06-24 15:26:47 - SimpleLog - INFO: - Interactor 0 finished episode 39 with reward -653.117 in 200 steps
2023-06-24 15:26:48 - SimpleLog - INFO: - Interactor 1 finished episode 40 with reward -748.897 in 200 steps
2023-06-24 15:26:48 - SimpleLog - INFO: - Interactor 0 finished episode 41 with reward -522.462 in 200 steps
2023-06-24 15:26:48 - SimpleLog - INFO: - Interactor 1 finished episode 42 with reward -648.596 in 200 steps
2023-06-24 15:26:48 - SimpleLog - INFO: - Interactor 0 finished episode 43 with reward -132.224 in 200 steps
2023-06-24 15:26:48 - SimpleLog - INFO: - Interactor 0 finished episode 44 with reward -394.009 in 200 steps
2023-06-24 15:26:48 - SimpleLog - INFO: - Interactor 1 finished episode 45 with reward -389.747 in 200 steps
2023-06-24 15:26:48 - SimpleLog - INFO: - Interactor 1 finished episode 46 with reward -388.791 in 200 steps
2023-06-24 15:26:49 - SimpleLog - INFO: - Interactor 0 finished episode 47 with reward -531.517 in 200 steps
2023-06-24 15:26:49 - SimpleLog - INFO: - Interactor 1 finished episode 48 with reward -649.490 in 200 steps
2023-06-24 15:26:49 - SimpleLog - INFO: - Interactor 0 finished episode 49 with reward -394.079 in 200 steps
2023-06-24 15:26:49 - SimpleLog - INFO: - Interactor 1 finished episode 50 with reward -648.858 in 200 steps
2023-06-24 15:26:50 - SimpleLog - INFO: - update_step: 40, online_eval_reward: -631.289
2023-06-24 15:26:50 - SimpleLog - INFO: - Finish testing! Time cost: 9.316 s