ppo-LunarLander-v2 / results.json
austinzheng
This commit: 1200000 training steps, score exceeds 200
ce0d7a1
{"mean_reward": 280.2584266527464, "std_reward": 16.824916486147476, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2022-12-16T16:53:11.559013"}