PPO Agent for LunarLander-v2

This model was trained for the Hugging Face Deep Reinforcement Learning course using a CleanRL-style PPO implementation in PyTorch.

Environment

  • LunarLander-v3

Evaluation Results

  • Mean reward: 116.31
  • Standard deviation: 18.91
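
The two figures above are summary statistics over per-episode returns. As a minimal sketch of how such numbers are typically derived, the snippet below takes a list of episode returns and reports their mean and standard deviation. The function name `summarize_returns`, the sample returns, and the use of the population standard deviation (which matches NumPy's default `np.std`) are assumptions for illustration; the card does not state how many episodes were run or which estimator was used.

```python
import statistics


def summarize_returns(episode_returns):
    """Return (mean, population standard deviation) of per-episode returns."""
    if not episode_returns:
        raise ValueError("need at least one episode return")
    mean_reward = statistics.fmean(episode_returns)
    # Population std (divides by N); an assumption, see the note above.
    std_reward = statistics.pstdev(episode_returns)
    return mean_reward, std_reward


# Hypothetical returns, made up for illustration; not this model's episodes.
example_returns = [100.0, 120.0, 140.0]
mean_reward, std_reward = summarize_returns(example_returns)
print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}")
```

In practice the list of returns would come from rolling out the trained policy in the environment for a fixed number of evaluation episodes.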