DeepRLCourse2022 / results.json
bguan's lunar lander model #3 using PPO trained for 1M timesteps
ee17131
{"mean_reward": 224.75846835024868, "std_reward": 21.411892070032263, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2022-05-09T06:40:11.790025"}