
Deep RL Course - LunarLander-v2 (Unit 1 + Unit 8)

This repo bundles both LunarLander-v2 agents from the Hugging Face Deep RL Course.

unit1/ - PPO via Stable-Baselines3

  • File: unit1/ppo-LunarLander-v2.zip
  • Algorithm: PPO (Stable-Baselines3), MlpPolicy
  • mean_reward: ~229.64 (solved; above the 200-point threshold usually used for LunarLander)
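
A minimal sketch of how the unit1 checkpoint could be loaded and re-evaluated locally. It assumes the repo has been downloaded so that unit1/ppo-LunarLander-v2.zip exists on disk, and that stable-baselines3 and gymnasium with Box2D support are installed. The function name evaluate_unit1 and the episode count are illustrative, not part of the repo.

```python
from pathlib import Path

# Path taken from this repo's layout; adjust if the files live elsewhere.
MODEL_PATH = Path("unit1/ppo-LunarLander-v2.zip")


def evaluate_unit1(model_path=MODEL_PATH, n_eval_episodes=10):
    """Load the SB3 PPO checkpoint and return (mean_reward, std_reward).

    Hypothetical helper: n_eval_episodes=10 is an assumption, not the
    setting used to produce the score reported in this card.
    """
    # Imported here so the module can be imported without the RL stack.
    import gymnasium as gym
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy
    from stable_baselines3.common.monitor import Monitor

    # Monitor records true episode returns for evaluate_policy.
    env = Monitor(gym.make("LunarLander-v2"))
    model = PPO.load(str(model_path), env=env)
    mean_reward, std_reward = evaluate_policy(
        model, env, n_eval_episodes=n_eval_episodes, deterministic=True
    )
    env.close()
    return mean_reward, std_reward
```

Calling evaluate_unit1() returns the mean and standard deviation of the episode return. Recent gymnasium releases no longer register LunarLander-v2, so the environment id or the installed version may need adjusting, and scores will vary from run to run.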

unit8/ - PPO from scratch (CleanRL-style)

  • Weights: unit8/ppo_scratch_lunarlander.pt
  • Code: unit8/ppo_scratch.py
  • Algorithm: PPO implemented from scratch in PyTorch (GAE, clipped surrogate, value loss, entropy bonus, LR annealing, grad clipping)
  • Training: 4,000,000 timesteps, 8 parallel envs
  • mean_reward: ~156 (deterministic actions, averaged over 50 evaluation episodes)
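
Two of the components listed above, GAE and the clipped surrogate, can be sketched in plain Python to show the arithmetic. This is an illustration under stated assumptions, not the code in unit8/ppo_scratch.py: the function names, the done-flag convention, and the defaults (gamma=0.99, gae_lambda=0.95, clip_coef=0.2) are common PPO choices assumed here, and the actual script works on PyTorch tensors.

```python
import math


def compute_gae(rewards, values, dones, last_value, gamma=0.99, gae_lambda=0.95):
    """Generalized Advantage Estimation over one rollout.

    Convention assumed here: dones[t] is True when the episode ended
    on step t, so nothing is bootstrapped across that boundary.
    Returns (advantages, returns) as lists of floats.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value  # value estimate for the state after the rollout
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error, masked at episode boundaries.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Exponentially weighted sum of TD errors.
        gae = delta + gamma * gae_lambda * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


def clipped_surrogate_loss(new_logprobs, old_logprobs, advantages, clip_coef=0.2):
    """PPO clipped policy loss (to be minimized), averaged over samples."""
    losses = []
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_coef, min(ratio, 1.0 + clip_coef))
        # Take the pessimistic (smaller) objective, then negate for a loss.
        losses.append(-min(ratio * adv, clipped * adv))
    return sum(losses) / len(losses)
```

For example, with rewards [1.0, 1.0], values [0.0, 0.0], dones [False, True], gamma=0.5 and gae_lambda=1.0, compute_gae gives advantages [1.5, 1.0]; last_value is ignored because the final step ends the episode.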