Antonio Serrano Muñoz committed on
Commit
3a26b16
1 Parent(s): 5e70b70
Files changed (3)
  1. README.md +88 -0
  2. agent.pickle +3 -0
  3. agent.pt +3 -0
README.md ADDED
@@ -0,0 +1,88 @@
+ ---
+ library_name: skrl
+ tags:
+ - deep-reinforcement-learning
+ - reinforcement-learning
+ - skrl
+ model-index:
+ - name: PPO
+   results:
+   - metrics:
+     - type: mean_reward
+       value: 493.73 +/- 0.58
+       name: Total reward (mean)
+     task:
+       type: reinforcement-learning
+       name: reinforcement-learning
+     dataset:
+       name: IsaacGymEnvs-Cartpole
+       type: IsaacGymEnvs-Cartpole
+ ---
+
+ <!-- ---
+ torch: 493.73 +/- 0.58
+ jax: 492.06 +/- 3.58
+ numpy: 491.92 +/- 0.57
+ --- -->
+
+ # IsaacGymEnvs-Cartpole-PPO
+
+ Trained agent for [NVIDIA Isaac Gym Preview](https://github.com/NVIDIA-Omniverse/IsaacGymEnvs) environments.
+
+ - **Task:** Cartpole
+ - **Agent:** [PPO](https://skrl.readthedocs.io/en/latest/api/agents/ppo.html)
+
+ # Usage (with skrl)
+
+ Note: Visit the skrl [Examples](https://skrl.readthedocs.io/en/latest/intro/examples.html) section to access the scripts.
+
+ * PyTorch
+
+ ```python
+ from skrl.utils.huggingface import download_model_from_huggingface
+
+ # assuming that there is an agent named `agent`
+ path = download_model_from_huggingface("skrl/IsaacGymEnvs-Cartpole-PPO", filename="agent.pt")
+ agent.load(path)
+ ```
+
+ * JAX
+
+ ```python
+ from skrl.utils.huggingface import download_model_from_huggingface
+
+ # assuming that there is an agent named `agent`
+ path = download_model_from_huggingface("skrl/IsaacGymEnvs-Cartpole-PPO", filename="agent.pickle")
+ agent.load(path)
+ ```
+
+ # Hyperparameters
+
+ Note: Parameters not listed here keep their default values.
63
+ ```python
64
+ # https://skrl.readthedocs.io/en/latest/api/agents/ppo.html#configuration-and-hyperparameters
65
+ cfg = PPO_DEFAULT_CONFIG.copy()
66
+ cfg["rollouts"] = 16 # memory_size
67
+ cfg["learning_epochs"] = 8
68
+ cfg["mini_batches"] = 1 # 16 * 512 / 8192
69
+ cfg["discount_factor"] = 0.99
70
+ cfg["lambda"] = 0.95
71
+ cfg["learning_rate"] = 3e-4
72
+ cfg["learning_rate_scheduler"] = KLAdaptiveRL
73
+ cfg["learning_rate_scheduler_kwargs"] = {"kl_threshold": 0.008}
74
+ cfg["random_timesteps"] = 0
75
+ cfg["learning_starts"] = 0
76
+ cfg["grad_norm_clip"] = 1.0
77
+ cfg["ratio_clip"] = 0.2
78
+ cfg["value_clip"] = 0.2
79
+ cfg["clip_predicted_values"] = True
80
+ cfg["entropy_loss_scale"] = 0.0
81
+ cfg["value_loss_scale"] = 2.0
82
+ cfg["kl_threshold"] = 0
83
+ cfg["rewards_shaper"] = lambda rewards, timestep, timesteps: rewards * 0.1
84
+ cfg["state_preprocessor"] = RunningStandardScaler
85
+ cfg["state_preprocessor_kwargs"] = {"size": env.observation_space, "device": device}
86
+ cfg["value_preprocessor"] = RunningStandardScaler
87
+ cfg["value_preprocessor_kwargs"] = {"size": 1, "device": device}
88
+ ```
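Reviewer note: the `mini_batches` comment in the hunk above (`16 * 512 / 8192`) reads as rollouts times parallel environments divided by a mini-batch size. The following is a minimal sketch of that arithmetic, assuming 512 environments and 8192 samples per mini-batch (both inferred from the comment only, not stated elsewhere in the README). `num_mini_batches` is a hypothetical helper for illustration and is not part of skrl.

```python
def num_mini_batches(rollouts: int, num_envs: int, mini_batch_size: int) -> int:
    """Return how many mini-batches cover one full rollout buffer.

    Hypothetical helper: it only reproduces the arithmetic in the README comment.
    """
    total_samples = rollouts * num_envs  # samples collected before each update
    if total_samples % mini_batch_size != 0:
        raise ValueError("rollout buffer does not split evenly into mini-batches")
    return total_samples // mini_batch_size


# Values assumed from the comment `16 * 512 / 8192`
mini_batches = num_mini_batches(rollouts=16, num_envs=512, mini_batch_size=8192)
print(mini_batches)  # 1
```

With these assumed values the whole rollout buffer forms a single mini-batch, which matches `cfg["mini_batches"] = 1`.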
agent.pickle ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:afe28bbbf6a8a7c306bd0afb57b0bdfd924b272633e629a9b1241312ee3e5d8e
+ size 31572
agent.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:069d9fd7ccef2f1d1c28053dfabc6f4f82502c6587e2491bc97964028626343e
+ size 29410
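Reviewer note: `agent.pickle` and `agent.pt` are committed as Git LFS pointers, so each records the SHA-256 digest and byte size of the real checkpoint. A downloaded file could be checked against its pointer roughly as sketched below. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names introduced here for illustration; only the pointer text is taken from this commit.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)  # e.g. "sha256:<hex digest>"
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Return True if the file's size and digest agree with the pointer."""
    data = path.read_bytes()
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# Pointer text for agent.pt, copied from this commit
AGENT_PT_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:069d9fd7ccef2f1d1c28053dfabc6f4f82502c6587e2491bc97964028626343e
size 29410
"""

pointer = parse_lfs_pointer(AGENT_PT_POINTER)
print(pointer["algo"], pointer["size"])  # sha256 29410
```

Assuming `download_model_from_huggingface` returns a local file path, as the README usage suggests by passing its result to `agent.load`, the check would be `matches_pointer(Path(path), pointer)`.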