coledie committed on
Commit
d9426ed
1 Parent(s): ada3759

Update README.md

Files changed (1)
  1. README.md +14 -4
README.md CHANGED
@@ -25,12 +25,22 @@ This is a trained model of a **PPO** agent playing **LunarLander-v2**
  using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).
 
  ## Usage (with Stable-baselines3)
- TODO: Add your code
-
 
  ```python
- from stable_baselines3 import ...
+ import gym
+
  from huggingface_sb3 import load_from_hub
 
- ...
+ from stable_baselines3 import PPO
+ from stable_baselines3.common.evaluation import evaluate_policy
+ from stable_baselines3.common.env_util import make_vec_env
+
+ env = make_vec_env('LunarLander-v2', n_envs=16)
+ model = PPO('MlpPolicy', env, verbose=1)
+
+ model.learn(total_timesteps=5 * 10**5)
+
+ eval_env = gym.make('LunarLander-v2')
+ mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10, deterministic=True)
+ print(f"Reward mean: {mean_reward:.2f}, Reward STD: {std_reward:.2f}")
  ```
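
A note on the training budget in the added snippet: `model.learn(total_timesteps=5 * 10**5)` is unlikely to stop at exactly 500,000 steps, because on-policy training in stable-baselines3 collects whole rollouts of `n_steps * n_envs` transitions and continues until the requested budget is reached. The sketch below works through that arithmetic. It assumes PPO's default `n_steps=2048`, which the README does not set explicitly, and `rollout_budget` is a hypothetical helper name, not part of stable-baselines3.

```python
def rollout_budget(total_timesteps: int, n_envs: int, n_steps: int = 2048):
    """Estimate how many rollouts an on-policy run collects.

    Hypothetical helper for illustration only. n_steps=2048 is assumed
    to match PPO's default; pass the real value if the model overrides it.
    Returns (number_of_rollouts, timesteps_actually_collected).
    """
    # Each rollout gathers n_steps transitions from every parallel env.
    steps_per_rollout = n_envs * n_steps
    # Training continues while fewer than total_timesteps have been
    # collected, so the rollout count is the ceiling of the ratio.
    rollouts = -(-total_timesteps // steps_per_rollout)
    return rollouts, rollouts * steps_per_rollout


# Values taken from the README snippet: 16 envs, 5 * 10**5 timesteps.
rollouts, collected = rollout_budget(5 * 10**5, n_envs=16)
print(f"{rollouts} rollouts, {collected} timesteps collected")
```

Under these assumptions each rollout holds 16 * 2048 = 32,768 transitions, so the run rounds up to the next whole rollout instead of ending at the requested total.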