LunarLander-v2-PPO / results.json
Improve PPO trained agent to 276.44 +/- 21.651967758051594
eac14dc
164 Bytes
{"mean_reward": 255.12738523909792, "std_reward": 64.65579970572723, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2023-05-12T03:35:36.942244"}