LearnHF-LunarLander-v2 / results.json
The key is a while loop on the mean_reward variable when evaluating your agent🥰📚
094b889
165 Bytes
{"mean_reward": 268.89390700135795, "std_reward": 30.990567156284154, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2023-06-12T15:33:56.323331"}
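To read these numbers at a glance, the short sketch below parses the JSON and computes a conservative score, the mean reward minus one standard deviation. The `SOLVED_THRESHOLD` value of 200 and the choice of `mean_reward - std_reward` as the summary metric are assumptions for illustration; neither is stated in the file itself.

```python
import json

# Contents of results.json, copied verbatim from the file above.
raw = (
    '{"mean_reward": 268.89390700135795, "std_reward": 30.990567156284154, '
    '"is_deterministic": true, "n_eval_episodes": 10, '
    '"eval_datetime": "2023-06-12T15:33:56.323331"}'
)

results = json.loads(raw)

# Assumption: 200 is the score conventionally treated as "solved" for LunarLander.
SOLVED_THRESHOLD = 200.0

# Assumption: use mean minus one standard deviation as a conservative summary.
lower_bound = results["mean_reward"] - results["std_reward"]

print(f"episodes evaluated: {results['n_eval_episodes']}")
print(f"mean - std: {lower_bound:.2f}")  # prints "mean - std: 237.90"
print(f"above threshold: {lower_bound > SOLVED_THRESHOLD}")
```

With only 10 evaluation episodes, the standard deviation is a rough estimate, so the margin over the threshold should be read as indicative rather than exact.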