PPO Agent playing LunarLander-v2

This is a trained model of a PPO agent playing LunarLander-v2 using the Stable-Baselines3 library.

Usage

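To use this model with Stable-Baselines3, load the saved checkpoint and step it through the environment (replace path/to/model.zip with the location of the downloaded file):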

import gymnasium as gym
from stable_baselines3 import PPO

# Create the environment
env = gym.make("LunarLander-v2")

# Load the trained model
model = PPO.load("path/to/model.zip")

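# Run the model for a fixed number of episodes so that env.close() is reached
episodes_left = 5
obs, info = env.reset()
while episodes_left > 0:
    # deterministic=True picks the most likely action instead of sampling
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)

    # An episode ends on landing/crash (terminated) or a time limit (truncated)
    if terminated or truncated:
        obs, info = env.reset()
        episodes_left -= 1

env.close()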

Environment

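The LunarLander-v2 environment is part of the Gymnasium library, in its Box2D family, so it needs the Box2D extra (pip install "gymnasium[box2d]"). The agent observes an 8-dimensional state vector and chooses among four discrete actions: do nothing, fire the left orientation engine, fire the main engine, or fire the right orientation engine. Gymnasium treats an episode as solved when it scores at least 200 points.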

Training

The model was trained using the following hyperparameters:

{
    'learning_rate': 0.0003,
    'n_steps': 2048,
    'batch_size': 64,
    'n_epochs': 10,
    'gamma': 0.99,
    'gae_lambda': 0.95,
    'clip_range': 0.2,
    'ent_coef': 0.0,
    'vf_coef': 0.5,
    'max_grad_norm': 0.5,
}
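
As a rough illustration of how these values could be passed to Stable-Baselines3, here is a minimal sketch. The values match the library's PPO defaults. The policy type ("MlpPolicy"), the training budget (total_timesteps), the save path, and the single-environment setup are assumptions for the example; the card does not state them. The helper updates_per_rollout is hypothetical and only works out the arithmetic implied by n_steps, batch_size, and n_epochs.

```python
# Hyperparameters copied from the list above.
HYPERPARAMS = {
    "learning_rate": 0.0003,
    "n_steps": 2048,
    "batch_size": 64,
    "n_epochs": 10,
    "gamma": 0.99,
    "gae_lambda": 0.95,
    "clip_range": 0.2,
    "ent_coef": 0.0,
    "vf_coef": 0.5,
    "max_grad_norm": 0.5,
}


def updates_per_rollout(n_steps, batch_size, n_epochs, n_envs=1):
    """Minibatch gradient steps taken after each rollout (hypothetical helper).

    n_envs=1 is an assumption; the card does not say how many
    parallel environments were used.
    """
    rollout_size = n_steps * n_envs          # transitions collected per rollout
    minibatches = rollout_size // batch_size  # minibatches per pass over the rollout
    return minibatches * n_epochs


def train(total_timesteps=1_000_000, save_path="ppo-LunarLander-v2"):
    """Train and save a PPO agent. Budget and path are assumed, not from the card."""
    # Imported here so the arithmetic helper above runs without the RL stack.
    import gymnasium as gym
    from stable_baselines3 import PPO

    env = gym.make("LunarLander-v2")
    # "MlpPolicy" is an assumption; it is the usual choice for vector observations.
    model = PPO("MlpPolicy", env, verbose=1, **HYPERPARAMS)
    model.learn(total_timesteps=total_timesteps)
    model.save(save_path)
    env.close()
    return model


if __name__ == "__main__":
    steps = updates_per_rollout(
        HYPERPARAMS["n_steps"], HYPERPARAMS["batch_size"], HYPERPARAMS["n_epochs"]
    )
    print(f"gradient steps per rollout: {steps}")
```

With one environment, each rollout of 2048 transitions is split into 32 minibatches of 64 and replayed for 10 epochs, giving 320 gradient steps per rollout.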

Results

The trained agent achieved a mean reward of 269.36 +/- 28.12 over 10 evaluation episodes.
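
The card does not say how this figure was produced. The sketch below shows one way a comparable number could be computed, assuming the +/- value is the standard deviation of per-episode returns. The function name evaluate is hypothetical; it relies only on the Gymnasium reset/step interface and the model's predict method, as used in the Usage section. Stable-Baselines3 also ships a ready-made helper, evaluate_policy in stable_baselines3.common.evaluation, which returns the same pair of statistics.

```python
from statistics import mean, pstdev


def evaluate(model, env, n_episodes=10, deterministic=True):
    """Return (mean, std) of undiscounted episode returns.

    Hypothetical helper: `model` needs a predict(obs, deterministic=...) method
    and `env` needs the Gymnasium reset()/step() interface.
    """
    returns = []
    for _ in range(n_episodes):
        obs, info = env.reset()
        done = False
        total = 0.0
        while not done:
            action, _state = model.predict(obs, deterministic=deterministic)
            obs, reward, terminated, truncated, info = env.step(action)
            total += float(reward)
            done = terminated or truncated
        returns.append(total)
    # Population standard deviation, i.e. the spread of the episodes actually run.
    return mean(returns), pstdev(returns)
```

Usage, assuming env and model were created as in the Usage section: mean_reward, std_reward = evaluate(model, env, n_episodes=10). With only 10 episodes the estimate is noisy, so repeated runs can differ noticeably.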
