ppo-LunarLander-v2 / baseline_1k
lysukhin
Baseline of PPO @ 512k iterations
10b1b7d
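A minimal reproduction sketch, assuming the checkpoint was trained with stable-baselines3 PPO on Gymnasium's LunarLander-v2 and that "512k iterations" refers to total environment timesteps; the library choice and all hyperparameters (defaults) are assumptions, not details confirmed by this repository.

```python
# Hypothetical reproduction sketch -- stable-baselines3 with default PPO
# hyperparameters; these are assumptions, not confirmed by this repository.
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("LunarLander-v2")

# "512k iterations" is interpreted here as 512,000 environment timesteps.
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=512_000)

# Save the checkpoint under a name matching the repository.
model.save("ppo-LunarLander-v2")
```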