PPO Agent playing LunarLander-v2
This is a trained model of a PPO agent playing LunarLander-v2 using the Stable-Baselines3 library.
Usage
To use this model with Stable-Baselines3, follow these steps:
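import gymnasium as gym
from stable_baselines3 import PPO

# Create the environment
env = gym.make("LunarLander-v2")

# Load the trained model (point this at the downloaded .zip file)
model = PPO.load("path/to/model.zip")

# Run the model for a fixed number of steps so that env.close() is reached
obs, info = env.reset()
for _ in range(1000):
    # Pick the deterministic action for the current observation
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    # Start a new episode when the current one ends
    if terminated or truncated:
        obs, info = env.reset()
env.close()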
Environment
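The LunarLander-v2 environment is part of the Gymnasium library. It belongs to Gymnasium's Box2D environments, so the Box2D extra must be installed (pip install "gymnasium[box2d]") before gym.make("LunarLander-v2") will work.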
Training
The model was trained using the following hyperparameters:
{
    'learning_rate': 0.0003,
    'n_steps': 2048,
    'batch_size': 64,
    'n_epochs': 10,
    'gamma': 0.99,
    'gae_lambda': 0.95,
    'clip_range': 0.2,
    'ent_coef': 0.0,
    'vf_coef': 0.5,
    'max_grad_norm': 0.5,
}
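The hyperparameters above coincide with the Stable-Baselines3 defaults for PPO. As a minimal sketch of how they could be passed to the library, the block below builds a PPO model from that dictionary. The policy name "MlpPolicy", the use of a single environment, and the `train` function with its `total_timesteps` and `save_path` arguments are assumptions made for illustration; the model card does not state them. The small helper `gradient_steps_per_rollout` is also hypothetical and only works out how many minibatch updates these settings imply per rollout.

```python
import math

# Hyperparameters copied from the model card.
HYPERPARAMS = {
    "learning_rate": 0.0003,
    "n_steps": 2048,
    "batch_size": 64,
    "n_epochs": 10,
    "gamma": 0.99,
    "gae_lambda": 0.95,
    "clip_range": 0.2,
    "ent_coef": 0.0,
    "vf_coef": 0.5,
    "max_grad_norm": 0.5,
}


def gradient_steps_per_rollout(n_steps, batch_size, n_epochs, n_envs=1):
    """Number of minibatch updates performed after each rollout.

    n_envs=1 is an assumption; the model card does not say how many
    parallel environments were used.
    """
    rollout_size = n_steps * n_envs
    minibatches_per_epoch = math.ceil(rollout_size / batch_size)
    return minibatches_per_epoch * n_epochs


def train(total_timesteps, save_path):
    """Train PPO on LunarLander-v2 with the hyperparameters above.

    "MlpPolicy", total_timesteps and save_path are illustrative
    assumptions, not values taken from the model card.
    """
    # Imported here so the helper above can be used without these packages.
    import gymnasium as gym
    from stable_baselines3 import PPO

    env = gym.make("LunarLander-v2")
    model = PPO("MlpPolicy", env, **HYPERPARAMS)
    model.learn(total_timesteps=total_timesteps)
    model.save(save_path)
    env.close()
    return model
```

With a single environment, each rollout of 2048 steps is split into 32 minibatches of 64 samples, and 10 epochs over them give 320 gradient updates per rollout.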
Results
The trained agent achieved a mean reward of 269.36 +/- 28.12 over 10 evaluation episodes.
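A figure of this kind can be produced with the `evaluate_policy` helper from Stable-Baselines3. The sketch below is one plausible way to do it, not necessarily the procedure used for this model: deterministic actions and the `Monitor` wrapper are assumptions, and the functions `summarize`, `format_result` and `evaluate` are hypothetical names introduced for illustration. A new run will generally not reproduce the exact numbers, since the episodes are sampled at random.

```python
import statistics


def summarize(episode_returns):
    """Mean and population standard deviation of per-episode returns."""
    mean = statistics.fmean(episode_returns)
    std = statistics.pstdev(episode_returns)
    return mean, std


def format_result(mean, std):
    """Format a result in the 'mean +/- std' style used in this card."""
    return f"{mean:.2f} +/- {std:.2f}"


def evaluate(model_path, n_eval_episodes=10):
    """Load a saved PPO model and evaluate it on LunarLander-v2.

    Assumptions: deterministic actions and a Monitor-wrapped
    environment; the model card does not describe the evaluation setup.
    """
    # Imported here so the helpers above can be used without these packages.
    import gymnasium as gym
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy
    from stable_baselines3.common.monitor import Monitor

    env = Monitor(gym.make("LunarLander-v2"))
    model = PPO.load(model_path)
    # return_episode_rewards=True yields the raw per-episode returns
    # and episode lengths instead of the aggregated mean and std.
    episode_returns, _episode_lengths = evaluate_policy(
        model,
        env,
        n_eval_episodes=n_eval_episodes,
        deterministic=True,
        return_episode_rewards=True,
    )
    env.close()
    return summarize(episode_returns)
```

Calling `format_result(*evaluate("path/to/model.zip"))` would then print the result in the same form as the figure reported above.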