ThomasSimonini (HF staff) committed
Commit 7b0b070
1 Parent(s): 1212adb

Update README.md

Files changed (1): README.md (+8 / -34)
README.md CHANGED
@@ -5,27 +5,19 @@ tags:
 - stable-baselines3
 ---
 # PPO Agent playing PongNoFrameskip-v4
-This is a trained model of a PPO agent playing PongNoFrameskip-v4 using the stable-baselines3 library (our agent is the 🟢 one).
+This is a trained model of a **PPO agent playing PongNoFrameskip-v4 using the stable-baselines3 library** (our agent is the 🟢 one).
 
 <video src="https://huggingface.co/ThomasSimonini/ppo-PongNoFrameskip-v4/resolve/main/output.mp4" controls autoplay loop></video>
 
 ## Evaluation Results
-Mean_reward = 21.00 +/- 0.0
+Mean reward: `21.00 +/- 0.0`
 
 # Usage (with Stable-baselines3)
-## Watch your agent interacts (in Google Colab)
 - You need to use `gym==0.19`, since it **includes the Atari ROMs**.
 - The action space is 6, since we use only the **legal actions**.
 
+Watch your agent interact:
+
-```python
-# Install these libraries in one cell (don't forget to restart the runtime after installing the librairies)
-!pip install stable-baselines3[extra]
-!pip install huggingface_sb3
-!pip install huggingface_hub
-!pip install pickle5
-```
-
-Don't forget to restart the runtime before running the code below:
 ```python
 # Import the libraries
 import os
@@ -37,16 +29,8 @@ from stable_baselines3.common.vec_env import VecNormalize
 
 from stable_baselines3.common.env_util import make_atari_env
 from stable_baselines3.common.vec_env import VecFrameStack
-from stable_baselines3 import PPO
-from stable_baselines3.common.callbacks import CheckpointCallback
-
 
 from huggingface_sb3 import load_from_hub, push_to_hub
-import gym
-from stable_baselines3.common.vec_env import VecVideoRecorder, DummyVecEnv
-
-from stable_baselines3.common.evaluation import evaluate_policy
 
 # Load the model
 checkpoint = load_from_hub("ThomasSimonini/ppo-PongNoFrameskip-v4", "ppo-PongNoFrameskip-v4.zip")
@@ -60,24 +44,14 @@ custom_objects = {
 
 model = PPO.load(checkpoint, custom_objects=custom_objects)
 
-## Evaluate the agent
 env = make_atari_env('PongNoFrameskip-v4', n_envs=1)
 env = VecFrameStack(env, n_stack=4)
 
-mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10)
-print(f"mean_reward={mean_reward:.2f} +/- {std_reward}")
-
-## Generate a video of your agent performing with Colab
-!pip install gym pyvirtualdisplay > /dev/null 2>&1
-!apt-get install -y xvfb python-opengl ffmpeg > /dev/null 2>&1
-!pip install colabgymrender==1.0.2
-
-observation = env.reset()
-terminal = False
-while not terminal:
-    action, _state = model.predict(observation)
-    observation, reward, terminal, info = env.step(action)
-env.play()
+obs = env.reset()
+while True:
+    action, _states = model.predict(obs)
+    obs, rewards, dones, info = env.step(action)
+    env.render()
 ```
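A few hedged notes and examples on the updated usage code follow. First, the commit removes the explicit Colab install cell. A minimal reconstruction from the package list in the previous revision (the exact cell is an assumption, not part of this commit), with `gym` pinned as the bullet above requires:

```python
# Colab install cell -- restart the runtime after it finishes.
# Package list taken from the pre-commit README revision; gym is pinned
# to 0.19 because that release still bundles the Atari ROMs.
!pip install stable-baselines3[extra]
!pip install huggingface_sb3 huggingface_hub pickle5
!pip install gym==0.19
```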
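Second, the claim that the action space is 6 is easy to verify once the environment is built; a quick sketch (assuming `gym==0.19`'s default reduced Atari action set):

```python
from stable_baselines3.common.env_util import make_atari_env

# Pong exposes only its 6 legal actions by default:
# NOOP, FIRE, RIGHT, LEFT, RIGHTFIRE, LEFTFIRE.
env = make_atari_env('PongNoFrameskip-v4', n_envs=1)
print(env.action_space)  # expected: Discrete(6)
```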
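Third, the body of the `custom_objects` dict falls between hunks, so it stays elided above. Purely for illustration (a hypothetical reconstruction, not necessarily what this README defines), a common pattern when a checkpoint was saved under a different stable-baselines3 or Python version is to override the pickled schedule callables:

```python
# Hypothetical example -- the dict actually used in the README is outside this diff's context.
# Overriding these entries lets PPO.load rebuild the checkpoint even when its
# pickled schedule callables cannot be deserialized in the current environment.
custom_objects = {
    "learning_rate": 0.0,
    "lr_schedule": lambda _: 0.0,
    "clip_range": lambda _: 0.0,
}
```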
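Fourth, the commit also removes the `evaluate_policy` snippet that produced the reported `21.00 +/- 0.0`. To reproduce that figure with the `model` and `env` built in the snippet above, the evaluator from the previous revision still works:

```python
from stable_baselines3.common.evaluation import evaluate_policy

# Evaluate the loaded model on the stacked-frame Pong env built above.
mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10)
print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}")
```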
 
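Finally, one caveat: the commit removes `from stable_baselines3 import PPO` even though `PPO.load` is still called, so the trimmed snippet no longer runs as-is. A self-contained sketch of the post-commit usage code with that import restored (`custom_objects` is left as a placeholder, since its body is not shown in this diff):

```python
from huggingface_sb3 import load_from_hub
from stable_baselines3 import PPO  # still required: PPO.load is called below
from stable_baselines3.common.env_util import make_atari_env
from stable_baselines3.common.vec_env import VecFrameStack

# Download the trained checkpoint from the Hugging Face Hub.
checkpoint = load_from_hub("ThomasSimonini/ppo-PongNoFrameskip-v4", "ppo-PongNoFrameskip-v4.zip")

custom_objects = {}  # placeholder -- the real dict is defined in the README, outside this diff

model = PPO.load(checkpoint, custom_objects=custom_objects)

# Rebuild the training-time observation pipeline: Atari preprocessing + 4 stacked frames.
env = make_atari_env('PongNoFrameskip-v4', n_envs=1)
env = VecFrameStack(env, n_stack=4)

# Watch the agent play.
obs = env.reset()
while True:
    action, _states = model.predict(obs)
    obs, rewards, dones, info = env.step(action)
    env.render()
```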