model-index: - name: stable-baselines3-ppo-LunarLander-v2

ARCHIVED MODEL, DO NOT USE IT

stable-baselines3-ppo-LunarLander-v2 🚀👩‍🚀

This is a saved model of a PPO agent that plays LunarLander-v2. The model is taken from rl-baselines3-zoo.

The goal is to correctly land the lander by controlling firing engines (fire left orientation engine, fire main engine and fire right orientation engine).
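As a rough illustration of the controls listed above, the sketch below maps the discrete action indices of LunarLander-v2 to readable names. The indices follow the Gym documentation for this environment, which also defines a fourth "do nothing" action not mentioned above; the `LanderAction` enum and the `describe` helper are hypothetical names introduced here for illustration only.

```python
from enum import IntEnum


class LanderAction(IntEnum):
    """Discrete actions of LunarLander-v2 (hypothetical helper enum)."""
    DO_NOTHING = 0
    FIRE_LEFT_ORIENTATION_ENGINE = 1
    FIRE_MAIN_ENGINE = 2
    FIRE_RIGHT_ORIENTATION_ENGINE = 3


def describe(action: int) -> str:
    """Turn an action index (e.g. the output of model.predict) into readable text."""
    return LanderAction(int(action)).name.replace("_", " ").lower()


print(describe(2))  # fire main engine
```

A helper like this can make it easier to log what the agent does at each step while watching it play.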

👉 You can watch the agent playing by using this notebook

Use the Model

Install the dependencies

You need the Hugging Face fork of Stable Baselines 3, which adds a function to load saved models directly from the Hugging Face Hub:

pip install git+https://github.com/simoninithomas/stable-baselines3.git

Evaluate the agent

⚠️ This environment requires Linux or macOS. If you are on another system, you can use the Colab notebook instead.

# Import the libraries
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy

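# Load the environment
env = gym.make('LunarLander-v2')

# Load the saved model from the Hugging Face Hub
# (load_from_huggingface is provided by the fork installed above)
model = PPO.load_from_huggingface(hf_model_id="ThomasSimonini/stable-baselines3-ppo-LunarLander-v2", hf_model_filename="LunarLander-v2")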
 
# Evaluate the agent
eval_env = gym.make('LunarLander-v2')
mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10, deterministic=True)
print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}")
 
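# Watch the agent play
obs = env.reset()
for i in range(1000):
    action, _state = model.predict(obs)
    obs, reward, done, info = env.step(action)
    env.render()
    # Start a new episode once the current one ends
    if done:
        obs = env.reset()

# Release the rendering window and other resources
env.close()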
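For readers curious about what an evaluation helper does in outline, here is a minimal sketch of an evaluation loop. It is not the Stable Baselines 3 implementation: `ToyEnv`, `evaluate` and the constant policy are hypothetical stand-ins that mimic the classic Gym API used above (`reset()` returns an observation, `step()` returns a 4-tuple), so the sketch runs without Box2D. The use of the population standard deviation is an assumption made here.

```python
import statistics
from typing import Callable, Tuple


class ToyEnv:
    """Hypothetical stand-in environment using the classic Gym API."""

    def __init__(self, episode_length: int = 5) -> None:
        self.episode_length = episode_length
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return self.t

    def step(self, action: int) -> Tuple[int, float, bool, dict]:
        self.t += 1
        reward = 1.0 if action == 1 else 0.0  # toy reward, unrelated to LunarLander
        done = self.t >= self.episode_length
        return self.t, reward, done, {}


def evaluate(policy: Callable[[int], int], env: ToyEnv,
             n_eval_episodes: int = 10) -> Tuple[float, float]:
    """Run full episodes and return the mean and std of the episode returns."""
    episode_returns = []
    for _ in range(n_eval_episodes):
        obs = env.reset()
        done = False
        total = 0.0
        while not done:
            action = policy(obs)
            obs, reward, done, _info = env.step(action)
            total += reward
        episode_returns.append(total)
    # Assumption: population standard deviation over the episode returns
    return statistics.fmean(episode_returns), statistics.pstdev(episode_returns)


mean_reward, std_reward = evaluate(lambda obs: 1, ToyEnv(), n_eval_episodes=10)
print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}")
```

With a real agent, the policy would wrap `model.predict` and the environment would be the LunarLander-v2 instance.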

Results

Mean Reward (10 evaluation episodes): 245.63 +/- 10.02
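To put this number in context: the Gym documentation treats an episode scoring at least 200 points as a solution for LunarLander. The short check below, using only the figures reported above, shows that the mean stays above that threshold even after subtracting one standard deviation. The variable names are illustrative.

```python
# Figures reported above (10 evaluation episodes)
mean_reward = 245.63
std_reward = 10.02

# Assumption: 200 points per episode counts as "solved", per the Gym documentation
SOLVED_THRESHOLD = 200.0

lower = mean_reward - std_reward
print(f"mean - 1 std = {lower:.2f}; above threshold: {lower >= SOLVED_THRESHOLD}")
```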
