ppo-LunarLander-v2 / results.json
erniechiew
PPO default with more iterations
0c09359
{"mean_reward": 291.0646233476815, "std_reward": 23.119697417523252, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2022-12-18T18:45:15.615666"}