PPO-LunarLander-v2 / results.json
unit 1 trained and getting mean reward 255.88 +/- 19.940845710538685
cc50c3c
165 Bytes
{"mean_reward": 257.96114550919697, "std_reward": 16.773534964894235, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2023-05-08T17:15:44.581874"}