LunarLanderTraining / results.json
first model training using PPO
b041420
{"mean_reward": 243.22743980605702, "std_reward": 24.876900580230856, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2022-12-10T05:26:00.720449"}
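The fields above can be read back and summarized with a few lines of Python. This is a minimal sketch, not the script that produced the file: the `summarize` helper and the `lower_bound` field (mean minus one standard deviation, used here as a conservative score) are assumptions introduced for illustration, and the JSON string is copied verbatim from the file contents above.

```python
import json

# Contents of results.json, copied verbatim from the file above.
RESULTS_JSON = (
    '{"mean_reward": 243.22743980605702, "std_reward": 24.876900580230856, '
    '"is_deterministic": true, "n_eval_episodes": 10, '
    '"eval_datetime": "2022-12-10T05:26:00.720449"}'
)


def summarize(raw: str) -> dict:
    """Parse the evaluation results and derive a conservative score.

    Hypothetical helper: `lower_bound` is an illustrative assumption,
    not a field stored in results.json.
    """
    results = json.loads(raw)
    mean = results["mean_reward"]
    std = results["std_reward"]
    return {
        "mean_reward": mean,
        "std_reward": std,
        "lower_bound": mean - std,  # assumption: mean minus one std
        "n_eval_episodes": results["n_eval_episodes"],
        "is_deterministic": results["is_deterministic"],
    }


summary = summarize(RESULTS_JSON)
print(
    f"{summary['mean_reward']:.2f} +/- {summary['std_reward']:.2f} "
    f"over {summary['n_eval_episodes']} episodes"
)
# prints "243.23 +/- 24.88 over 10 episodes"
```

With only 10 evaluation episodes the standard deviation is a rough estimate, so the derived lower bound is best read as indicative rather than as a tight confidence limit.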