Tuned PPO agent trained on LunarLander-v2 (1 million timesteps) 257c5fb verified crossroderick committed on Feb 26
Tuned PPO agent trained on LunarLander-v2 (1 million timesteps) 958737f verified crossroderick committed on Feb 26