ppo-LunarLander-v2 / results.json
some more training with best model save callback
840b113 verified
{"mean_reward": 274.43176800000003, "std_reward": 18.929525223519658, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2024-03-22T19:14:10.189540"}