PPO (from scratch) - LunarLander-v2

A CleanRL-style PPO agent trained from scratch on LunarLander-v2.

  • mean_reward: 229.63 +/- 69.15 (50 deterministic episodes)
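The card does not include the evaluation script, so the following is only a minimal sketch of how a figure like this could be computed. It assumes a Gymnasium-style environment (`reset` returns `(obs, info)`, `step` returns a 5-tuple) and assumes the "+/-" value is the population standard deviation of the episode returns. `evaluate`, `act`, and `seed` are hypothetical names, not taken from the original code.

```python
import statistics
from typing import Any, Callable, Tuple


def evaluate(
    env: Any,
    act: Callable[[Any], Any],
    n_episodes: int = 50,
    seed: int = 0,
) -> Tuple[float, float]:
    """Run n_episodes and return (mean, std) of the undiscounted episode returns."""
    returns = []
    for episode in range(n_episodes):
        # A distinct seed per episode keeps the evaluation reproducible.
        obs, _info = env.reset(seed=seed + episode)
        done, total = False, 0.0
        while not done:
            # For a deterministic evaluation, `act` should return the greedy
            # (argmax) action rather than sampling from the policy.
            obs, reward, terminated, truncated, _info = env.step(act(obs))
            total += float(reward)
            done = terminated or truncated
        returns.append(total)
    # Assumption: the reported spread is the population standard deviation.
    return statistics.fmean(returns), statistics.pstdev(returns)
```

With Gymnasium installed, `env` would typically come from `gymnasium.make("LunarLander-v2")`, and `act` would wrap the trained policy network.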

The reported metrics are generated automatically from results.json, so the card always matches that file.
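As an illustration of keeping the card in sync with results.json, a metrics line could be rendered along these lines. This is a sketch, not the generator actually used: only `mean_reward` appears in the card, so the keys `std_reward` and `n_eval_episodes` and both function names are assumptions.

```python
import json


def render_metrics_line(results: dict) -> str:
    """Format the metrics line of the card from a results dictionary."""
    # Assumed keys: 'std_reward' and 'n_eval_episodes' are hypothetical.
    return (
        f"mean_reward: {results['mean_reward']:.2f} "
        f"+/- {results['std_reward']:.2f} "
        f"({results['n_eval_episodes']} deterministic episodes)"
    )


def render_from_file(path: str = "results.json") -> str:
    """Load results.json and return the formatted metrics line."""
    with open(path, encoding="utf-8") as f:
        return render_metrics_line(json.load(f))
```

Because the card text is derived from the file rather than typed by hand, re-running the evaluation and regenerating the card keeps the two from drifting apart.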

