Paper: Proximal Policy Optimization Algorithms (arXiv:1707.06347)
How to use AminVilan/ppo-LunarLander-v3 with stable-baselines3:
from huggingface_sb3 import load_from_hub
checkpoint = load_from_hub(
    repo_id="AminVilan/ppo-LunarLander-v3",
    filename="{MODEL FILENAME}.zip",
)
The GitHub repository contains a trained Proximal Policy Optimization (PPO) agent for the Box2D control task LunarLander-v3 from Gymnasium.
The model is implemented and trained using the Stable-Baselines3 library.
LunarLander-v3: 289.24 ± 12.88
You can run the training pipeline locally or in Colab.
Click below to open the training notebook:
Open Notebook in Colab
# Clone the repository
git clone https://github.com/AminVilan/RL-PPO-LunarLander-v3.git
cd RL-PPO-LunarLander-v3
# Open the notebook
jupyter notebook src/ppo_lunarlander_training.ipynb
The trained model is available on the Hugging Face Hub. You can load and run it directly:
import gymnasium as gym
from stable_baselines3 import PPO
from huggingface_sb3 import load_from_hub

# Download the checkpoint from the Hugging Face Hub (returns a local file path)
repo_id = "AminVilan/ppo-LunarLander-v3"
filename = "v01-ppo-LunarLanderV3.zip"
checkpoint = load_from_hub(repo_id=repo_id, filename=filename)

# Load the PPO model from the downloaded checkpoint
model = PPO.load(checkpoint)

# Create the environment; render_mode="human" opens a window and renders every step
env = gym.make("LunarLander-v3", render_mode="human")
obs, info = env.reset()
terminated, truncated = False, False

# Run one episode until it terminates or is truncated
while not (terminated or truncated):
    action, _ = model.predict(obs)
    obs, reward, terminated, truncated, info = env.step(action)

env.close()
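The score reported above (289.24 ± 12.88) reads like a mean and standard deviation of episode returns, but this card does not say how many evaluation episodes were used or whether actions were deterministic. The sketch below shows one way such a figure could be computed. The helper name `evaluate_returns`, the default of 10 episodes, and the deterministic setting are assumptions for illustration, not the author's evaluation protocol.

```python
import statistics


def evaluate_returns(model, env, n_episodes=10, deterministic=True):
    """Run n_episodes and return (mean, std) of the undiscounted episode returns.

    Hypothetical helper: `model` only needs a predict(obs, deterministic=...)
    method and `env` only needs the Gymnasium reset()/step() interface.
    """
    returns = []
    for _ in range(n_episodes):
        obs, info = env.reset()
        terminated, truncated = False, False
        episode_return = 0.0
        while not (terminated or truncated):
            action, _ = model.predict(obs, deterministic=deterministic)
            obs, reward, terminated, truncated, info = env.step(action)
            episode_return += float(reward)
        returns.append(episode_return)
    # Population standard deviation over the collected episode returns
    return statistics.mean(returns), statistics.pstdev(returns)
```

Stable-Baselines3 also ships `stable_baselines3.common.evaluation.evaluate_policy(model, env, n_eval_episodes=10)`, which returns a `(mean_reward, std_reward)` pair of the same kind. For evaluation it is usually more practical to create the environment without `render_mode="human"`.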