ppo-lunarlander / results.json
rram12
tuned LunarLander model trained with PPO
31ab8f3
{"mean_reward": 283.67660548661763, "std_reward": 19.99933910099818, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2022-09-20T02:52:52.918637"}
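
One way to read these numbers is sketched below, assuming the file follows the usual layout of evaluation results for a Stable-Baselines3 style agent. The `summarize` helper, the `mean_reward - std_reward` lower bound, and the 200-point "solved" threshold for LunarLander are illustrative assumptions, not something the file itself states.

```python
import json

# The evaluation record exactly as stored in results.json.
RESULTS_JSON = (
    '{"mean_reward": 283.67660548661763, "std_reward": 19.99933910099818, '
    '"is_deterministic": true, "n_eval_episodes": 10, '
    '"eval_datetime": "2022-09-20T02:52:52.918637"}'
)


def summarize(raw: str, solved_threshold: float = 200.0) -> dict:
    """Hypothetical helper: derive a conservative score from the raw record.

    The threshold of 200 is an assumption (a commonly cited "solved" score
    for LunarLander); it is not part of results.json.
    """
    results = json.loads(raw)
    mean = results["mean_reward"]
    std = results["std_reward"]
    # Conservative estimate: one standard deviation below the mean reward.
    lower_bound = mean - std
    return {
        "mean_reward": mean,
        "std_reward": std,
        "lower_bound": lower_bound,
        "n_eval_episodes": results["n_eval_episodes"],
        "above_threshold": lower_bound >= solved_threshold,
    }


summary = summarize(RESULTS_JSON)
print(f"mean - std over {summary['n_eval_episodes']} episodes: "
      f"{summary['lower_bound']:.2f}")
print(f"above assumed threshold: {summary['above_threshold']}")
```

With only 10 evaluation episodes, the standard deviation is itself a rough estimate, so the lower bound is best treated as indicative rather than precise.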