2023-05-18 17:26:42 - SimpleLog - INFO: - General Configs:
2023-05-18 17:26:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:26:42 - SimpleLog - INFO: -         Name        	       Value        	        Type        
2023-05-18 17:26:42 - SimpleLog - INFO: -       env_name      	        gym         	   <class 'str'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -      algo_name      	      NoisyDQN      	   <class 'str'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -         mode        	       train        	   <class 'str'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -        device       	        cpu         	   <class 'str'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -         seed        	         1          	   <class 'int'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -     max_episode     	        100         	   <class 'int'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -       max_step      	        200         	   <class 'int'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -     collect_traj    	         0          	   <class 'bool'>   
2023-05-18 17:26:42 - SimpleLog - INFO: -      mp_backend     	        ray         	   <class 'str'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -      n_workers      	         2          	   <class 'int'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -      n_learners     	         1          	   <class 'int'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -     share_buffer    	         1          	   <class 'bool'>   
2023-05-18 17:26:42 - SimpleLog - INFO: -     online_eval     	         1          	   <class 'bool'>   
2023-05-18 17:26:42 - SimpleLog - INFO: - online_eval_episode 	         10         	   <class 'int'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -    model_save_fre   	        500         	   <class 'int'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -   load_checkpoint   	         0          	   <class 'bool'>   
2023-05-18 17:26:42 - SimpleLog - INFO: -      load_path      	Train_single_CartPole-v1_NoisyDQN_20230518-133737	   <class 'str'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -   load_model_step   	        best        	   <class 'str'>    
2023-05-18 17:26:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:26:42 - SimpleLog - INFO: - Algo Configs:
2023-05-18 17:26:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:26:42 - SimpleLog - INFO: -         Name        	       Value        	        Type        
2023-05-18 17:26:42 - SimpleLog - INFO: -    epsilon_start    	        0.95        	  <class 'float'>   
2023-05-18 17:26:42 - SimpleLog - INFO: -     epsilon_end     	        0.01        	  <class 'float'>   
2023-05-18 17:26:42 - SimpleLog - INFO: -    epsilon_decay    	        500         	   <class 'int'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -        gamma        	        0.99        	  <class 'float'>   
2023-05-18 17:26:42 - SimpleLog - INFO: -          lr         	       0.0001       	  <class 'float'>   
2023-05-18 17:26:42 - SimpleLog - INFO: -     buffer_size     	       100000       	   <class 'int'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -      batch_size     	         64         	   <class 'int'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -    target_update    	         4          	   <class 'int'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -     value_layers    	[{'layer_type': 'noisy_linear', 'layer_size': [256], 'activation': 'relu', 'std_init': 0.4}, {'layer_type': 'noisy_linear', 'layer_size': [256], 'activation': 'relu', 'std_init': 0.4}]	   <class 'str'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -     buffer_type     	     REPLAY_QUE     	   <class 'str'>    
2023-05-18 17:26:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:26:42 - SimpleLog - INFO: - Env Configs:
2023-05-18 17:26:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:26:42 - SimpleLog - INFO: -         Name        	       Value        	        Type        
2023-05-18 17:26:42 - SimpleLog - INFO: -          id         	    CartPole-v1     	   <class 'str'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -     render_mode     	        None        	   <class 'str'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -       wrapper       	        None        	   <class 'str'>    
2023-05-18 17:26:42 - SimpleLog - INFO: -    ignore_params    	['wrapper', 'ignore_params']	   <class 'str'>    
2023-05-18 17:26:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:26:49 - SimpleLog - INFO: - obs_space: Box([-4.8000002e+00 -3.4028235e+38 -4.1887903e-01 -3.4028235e+38], [4.8000002e+00 3.4028235e+38 4.1887903e-01 3.4028235e+38], (4,), float32), n_actions: Discrete(2)
2023-05-18 17:26:56 - RayLog - INFO: - Worker 1 finished episode 0 with reward 41.0 in 41 steps
2023-05-18 17:26:57 - RayLog - INFO: - Worker 0 finished episode 0 with reward 55.0 in 55 steps
2023-05-18 17:26:57 - RayLog - INFO: - Worker 1 finished episode 1 with reward 15.0 in 15 steps
2023-05-18 17:26:57 - RayLog - INFO: - Worker 0 finished episode 2 with reward 17.0 in 17 steps
2023-05-18 17:26:57 - RayLog - INFO: - Worker 1 finished episode 3 with reward 24.0 in 24 steps
2023-05-18 17:26:57 - RayLog - INFO: - Worker 0 finished episode 4 with reward 12.0 in 12 steps
2023-05-18 17:26:57 - RayLog - INFO: - Worker 0 finished episode 6 with reward 9.0 in 9 steps
2023-05-18 17:26:57 - RayLog - INFO: - Worker 1 finished episode 5 with reward 17.0 in 17 steps
2023-05-18 17:26:58 - RayLog - INFO: - Worker 0 finished episode 7 with reward 11.0 in 11 steps
2023-05-18 17:26:58 - RayLog - INFO: - Worker 1 finished episode 8 with reward 11.0 in 11 steps
2023-05-18 17:26:58 - RayLog - INFO: - Worker 0 finished episode 9 with reward 11.0 in 11 steps
2023-05-18 17:26:58 - RayLog - INFO: - Worker 1 finished episode 10 with reward 13.0 in 13 steps
2023-05-18 17:26:58 - RayLog - INFO: - Worker 0 finished episode 11 with reward 11.0 in 11 steps
2023-05-18 17:26:58 - RayLog - INFO: - Worker 1 finished episode 12 with reward 16.0 in 16 steps
2023-05-18 17:26:58 - RayLog - INFO: - Worker 0 finished episode 13 with reward 12.0 in 12 steps
2023-05-18 17:26:59 - RayLog - INFO: - Worker 0 finished episode 15 with reward 11.0 in 11 steps
2023-05-18 17:26:59 - RayLog - INFO: - Worker 1 finished episode 14 with reward 13.0 in 13 steps
2023-05-18 17:26:59 - RayLog - INFO: - Worker 0 finished episode 16 with reward 14.0 in 14 steps
2023-05-18 17:26:59 - RayLog - INFO: - Worker 0 finished episode 18 with reward 10.0 in 10 steps
2023-05-18 17:26:59 - RayLog - INFO: - Worker 1 finished episode 17 with reward 24.0 in 24 steps
2023-05-18 17:26:59 - RayLog - INFO: - Worker 0 finished episode 19 with reward 12.0 in 12 steps
2023-05-18 17:27:00 - RayLog - INFO: - Worker 1 finished episode 20 with reward 17.0 in 17 steps
2023-05-18 17:27:00 - RayLog - INFO: - Worker 0 finished episode 21 with reward 14.0 in 14 steps
2023-05-18 17:27:00 - RayLog - INFO: - Worker 1 finished episode 22 with reward 9.0 in 9 steps
2023-05-18 17:27:00 - RayLog - INFO: - Worker 1 finished episode 24 with reward 9.0 in 9 steps
2023-05-18 17:27:00 - RayLog - INFO: - Worker 0 finished episode 23 with reward 17.0 in 17 steps
2023-05-18 17:27:00 - RayLog - INFO: - Worker 1 finished episode 25 with reward 13.0 in 13 steps
2023-05-18 17:27:01 - RayLog - INFO: - Worker 0 finished episode 26 with reward 15.0 in 15 steps
2023-05-18 17:27:01 - RayLog - INFO: - Worker 1 finished episode 27 with reward 13.0 in 13 steps
2023-05-18 17:27:01 - RayLog - INFO: - Worker 0 finished episode 28 with reward 11.0 in 11 steps
2023-05-18 17:27:01 - RayLog - INFO: - Worker 1 finished episode 29 with reward 9.0 in 9 steps
2023-05-18 17:27:01 - RayLog - INFO: - Worker 0 finished episode 30 with reward 15.0 in 15 steps
2023-05-18 17:27:01 - RayLog - INFO: - Worker 1 finished episode 31 with reward 16.0 in 16 steps
2023-05-18 17:27:01 - RayLog - INFO: - Worker 0 finished episode 32 with reward 10.0 in 10 steps
2023-05-18 17:27:01 - RayLog - INFO: - Worker 1 finished episode 33 with reward 9.0 in 9 steps
2023-05-18 17:27:02 - RayLog - INFO: - Worker 0 finished episode 34 with reward 10.0 in 10 steps
2023-05-18 17:27:02 - RayLog - INFO: - Worker 1 finished episode 35 with reward 9.0 in 9 steps
2023-05-18 17:27:03 - RayLog - INFO: - learner id: 0, update_step: 500, online_eval_reward: 10.000
2023-05-18 17:27:03 - RayLog - INFO: - learner 0 for current update step obtain a better online_eval_reward: 10.000, save the best model!
2023-05-18 17:27:03 - RayLog - INFO: - Worker 0 finished episode 36 with reward 12.0 in 12 steps
2023-05-18 17:27:03 - RayLog - INFO: - Worker 1 finished episode 37 with reward 17.0 in 17 steps
2023-05-18 17:27:05 - RayLog - INFO: - Worker 0 finished episode 38 with reward 92.0 in 92 steps
2023-05-18 17:27:06 - RayLog - INFO: - Worker 1 finished episode 39 with reward 99.0 in 99 steps
2023-05-18 17:27:06 - RayLog - INFO: - Worker 0 finished episode 40 with reward 26.0 in 26 steps
2023-05-18 17:27:07 - RayLog - INFO: - Worker 0 finished episode 42 with reward 23.0 in 23 steps
2023-05-18 17:27:07 - RayLog - INFO: - Worker 1 finished episode 41 with reward 40.0 in 40 steps
2023-05-18 17:27:07 - RayLog - INFO: - Worker 0 finished episode 43 with reward 21.0 in 21 steps
2023-05-18 17:27:07 - RayLog - INFO: - Worker 1 finished episode 44 with reward 32.0 in 32 steps
2023-05-18 17:27:08 - RayLog - INFO: - Worker 0 finished episode 45 with reward 20.0 in 20 steps
2023-05-18 17:27:08 - RayLog - INFO: - Worker 1 finished episode 46 with reward 28.0 in 28 steps
2023-05-18 17:27:08 - RayLog - INFO: - Worker 0 finished episode 47 with reward 27.0 in 27 steps
2023-05-18 17:27:09 - RayLog - INFO: - Worker 0 finished episode 49 with reward 21.0 in 21 steps
2023-05-18 17:27:09 - RayLog - INFO: - Worker 1 finished episode 48 with reward 30.0 in 30 steps
2023-05-18 17:27:09 - RayLog - INFO: - learner id: 0, update_step: 1000, online_eval_reward: 25.000
2023-05-18 17:27:09 - RayLog - INFO: - learner 0 for current update step obtain a better online_eval_reward: 25.000, save the best model!
2023-05-18 17:27:09 - RayLog - INFO: - Worker 1 finished episode 51 with reward 18.0 in 18 steps
2023-05-18 17:27:09 - RayLog - INFO: - Worker 0 finished episode 50 with reward 24.0 in 24 steps
2023-05-18 17:27:10 - RayLog - INFO: - Worker 0 finished episode 53 with reward 24.0 in 24 steps
2023-05-18 17:27:10 - RayLog - INFO: - Worker 1 finished episode 52 with reward 29.0 in 29 steps
2023-05-18 17:27:11 - RayLog - INFO: - Worker 0 finished episode 54 with reward 24.0 in 24 steps
2023-05-18 17:27:11 - RayLog - INFO: - Worker 1 finished episode 55 with reward 29.0 in 29 steps
2023-05-18 17:27:11 - RayLog - INFO: - Worker 0 finished episode 56 with reward 23.0 in 23 steps
2023-05-18 17:27:11 - RayLog - INFO: - Worker 1 finished episode 57 with reward 33.0 in 33 steps
2023-05-18 17:27:12 - RayLog - INFO: - Worker 0 finished episode 58 with reward 31.0 in 31 steps
2023-05-18 17:27:12 - RayLog - INFO: - Worker 0 finished episode 60 with reward 25.0 in 25 steps
2023-05-18 17:27:12 - RayLog - INFO: - Worker 1 finished episode 59 with reward 39.0 in 39 steps
2023-05-18 17:27:13 - RayLog - INFO: - Worker 0 finished episode 61 with reward 33.0 in 33 steps
2023-05-18 17:27:14 - RayLog - INFO: - Worker 1 finished episode 62 with reward 60.0 in 60 steps
2023-05-18 17:27:14 - RayLog - INFO: - Worker 0 finished episode 63 with reward 39.0 in 39 steps
2023-05-18 17:27:15 - RayLog - INFO: - Worker 0 finished episode 65 with reward 37.0 in 37 steps
2023-05-18 17:27:16 - RayLog - INFO: - learner id: 0, update_step: 1500, online_eval_reward: 35.000
2023-05-18 17:27:16 - RayLog - INFO: - learner 0 for current update step obtain a better online_eval_reward: 35.000, save the best model!
2023-05-18 17:27:16 - RayLog - INFO: - Worker 1 finished episode 64 with reward 58.0 in 58 steps
2023-05-18 17:27:17 - RayLog - INFO: - Worker 0 finished episode 66 with reward 53.0 in 53 steps
2023-05-18 17:27:19 - RayLog - INFO: - Worker 1 finished episode 67 with reward 99.0 in 99 steps
2023-05-18 17:27:19 - RayLog - INFO: - Worker 0 finished episode 68 with reward 91.0 in 91 steps
2023-05-18 17:27:21 - RayLog - INFO: - Worker 1 finished episode 69 with reward 106.0 in 106 steps
2023-05-18 17:27:23 - RayLog - INFO: - learner id: 0, update_step: 2000, online_eval_reward: 200.000
2023-05-18 17:27:23 - RayLog - INFO: - learner 0 for current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:27:25 - RayLog - INFO: - Worker 0 finished episode 70 with reward 200.0 in 200 steps
2023-05-18 17:27:27 - RayLog - INFO: - Worker 1 finished episode 71 with reward 200.0 in 200 steps
2023-05-18 17:27:29 - RayLog - INFO: - learner id: 0, update_step: 2500, online_eval_reward: 200.000
2023-05-18 17:27:30 - RayLog - INFO: - Worker 0 finished episode 72 with reward 200.0 in 200 steps
2023-05-18 17:27:32 - RayLog - INFO: - Worker 1 finished episode 73 with reward 200.0 in 200 steps
2023-05-18 17:27:36 - RayLog - INFO: - Worker 0 finished episode 74 with reward 200.0 in 200 steps
2023-05-18 17:27:37 - RayLog - INFO: - learner id: 0, update_step: 3000, online_eval_reward: 200.000
2023-05-18 17:27:39 - RayLog - INFO: - Worker 1 finished episode 75 with reward 200.0 in 200 steps
2023-05-18 17:27:42 - RayLog - INFO: - Worker 0 finished episode 76 with reward 200.0 in 200 steps
2023-05-18 17:27:44 - RayLog - INFO: - learner id: 0, update_step: 3500, online_eval_reward: 200.000
2023-05-18 17:27:44 - RayLog - INFO: - Worker 1 finished episode 77 with reward 200.0 in 200 steps
2023-05-18 17:27:48 - RayLog - INFO: - Worker 0 finished episode 78 with reward 200.0 in 200 steps
2023-05-18 17:27:50 - RayLog - INFO: - Worker 1 finished episode 79 with reward 200.0 in 200 steps
2023-05-18 17:27:52 - RayLog - INFO: - learner id: 0, update_step: 4000, online_eval_reward: 200.000
2023-05-18 17:27:54 - RayLog - INFO: - Worker 0 finished episode 80 with reward 200.0 in 200 steps
2023-05-18 17:27:56 - RayLog - INFO: - Worker 1 finished episode 81 with reward 200.0 in 200 steps
2023-05-18 17:27:59 - RayLog - INFO: - learner id: 0, update_step: 4500, online_eval_reward: 200.000
2023-05-18 17:28:00 - RayLog - INFO: - Worker 0 finished episode 82 with reward 200.0 in 200 steps
2023-05-18 17:28:02 - RayLog - INFO: - Worker 1 finished episode 83 with reward 200.0 in 200 steps
2023-05-18 17:28:06 - RayLog - INFO: - Worker 0 finished episode 84 with reward 200.0 in 200 steps
2023-05-18 17:28:06 - RayLog - INFO: - learner id: 0, update_step: 5000, online_eval_reward: 200.000
2023-05-18 17:28:08 - RayLog - INFO: - Worker 1 finished episode 85 with reward 200.0 in 200 steps
2023-05-18 17:28:12 - RayLog - INFO: - Worker 0 finished episode 86 with reward 200.0 in 200 steps
2023-05-18 17:28:14 - RayLog - INFO: - learner id: 0, update_step: 5500, online_eval_reward: 200.000
2023-05-18 17:28:14 - RayLog - INFO: - Worker 1 finished episode 87 with reward 200.0 in 200 steps
2023-05-18 17:28:18 - RayLog - INFO: - Worker 0 finished episode 88 with reward 200.0 in 200 steps
2023-05-18 17:28:20 - RayLog - INFO: - Worker 1 finished episode 89 with reward 200.0 in 200 steps
2023-05-18 17:28:21 - RayLog - INFO: - learner id: 0, update_step: 6000, online_eval_reward: 200.000
2023-05-18 17:28:24 - RayLog - INFO: - Worker 0 finished episode 90 with reward 200.0 in 200 steps
2023-05-18 17:28:26 - RayLog - INFO: - Worker 1 finished episode 91 with reward 200.0 in 200 steps
2023-05-18 17:28:29 - RayLog - INFO: - learner id: 0, update_step: 6500, online_eval_reward: 200.000
2023-05-18 17:28:30 - RayLog - INFO: - Worker 0 finished episode 92 with reward 200.0 in 200 steps
2023-05-18 17:28:32 - RayLog - INFO: - Worker 1 finished episode 93 with reward 200.0 in 200 steps
2023-05-18 17:28:37 - RayLog - INFO: - Worker 0 finished episode 94 with reward 200.0 in 200 steps
2023-05-18 17:28:38 - RayLog - INFO: - learner id: 0, update_step: 7000, online_eval_reward: 200.000
2023-05-18 17:28:40 - RayLog - INFO: - Worker 1 finished episode 95 with reward 200.0 in 200 steps
2023-05-18 17:28:44 - RayLog - INFO: - Worker 0 finished episode 96 with reward 200.0 in 200 steps
2023-05-18 17:28:47 - RayLog - INFO: - learner id: 0, update_step: 7500, online_eval_reward: 200.000
2023-05-18 17:28:47 - RayLog - INFO: - Worker 1 finished episode 97 with reward 200.0 in 200 steps
2023-05-18 17:28:52 - RayLog - INFO: - Worker 0 finished episode 98 with reward 200.0 in 200 steps
2023-05-18 17:28:54 - RayLog - INFO: - Worker 1 finished episode 99 with reward 200.0 in 200 steps
2023-05-18 17:28:56 - RayLog - INFO: - learner id: 0, update_step: 8000, online_eval_reward: 200.000
2023-05-18 17:28:58 - RayLog - INFO: - Worker 0 finished episode 100 with reward 200.0 in 200 steps
2023-05-18 17:29:01 - SimpleLog - INFO: - Finish training! total time consumed: 138.97s