Step 6220000. Checkpoint taken from the initial model and trained further at a lower learning rate (2nd).
c002a26
{"mean_reward": 440.4, "std_reward": 169.58254627171982, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2023-01-23T01:01:59.826816"}