Commit 1be96ed by Toni-SM (parent 074a042): Create README.md

---
library_name: skrl
tags:
- deep-reinforcement-learning
- reinforcement-learning
- skrl
model-index:
- name: PPO
  results:
  - metrics:
    - type: mean_reward
      value: 494.34 +/- 0.87
      name: Total reward (mean)
    task:
      type: reinforcement-learning
      name: reinforcement-learning
---

# OmniIsaacGymEnvs-Cartpole-PPO

Trained agent model for the [NVIDIA Omniverse Isaac Gym](https://github.com/NVIDIA-Omniverse/OmniIsaacGymEnvs) environment.

- **Task:** Cartpole
- **Agent:** [PPO](https://skrl.readthedocs.io/en/latest/modules/skrl.agents.ppo.html)

# Usage (with skrl)

A minimal loading sketch. The repository id and checkpoint filename below are assumptions for illustration; check this repository's file list and adapt them, and note that the skrl agent, models, and environment must be set up as they were during training.

```python
from huggingface_hub import hf_hub_download

# Download the trained checkpoint from the Hugging Face Hub
# (repo_id and filename are assumptions; adjust to this repository)
path = hf_hub_download(repo_id="Toni-SM/OmniIsaacGymEnvs-Cartpole-PPO",
                       filename="agent.pt")

# Load the checkpoint into an already-instantiated skrl PPO agent
agent.load(path)
```

# Hyperparameters

```python
# https://skrl.readthedocs.io/en/latest/modules/skrl.agents.ppo.html#configuration-and-hyperparameters
from skrl.agents.torch.ppo import PPO_DEFAULT_CONFIG
from skrl.resources.preprocessors.torch import RunningStandardScaler
from skrl.resources.schedulers.torch import KLAdaptiveRL

# env and device are assumed to be defined already (see the skrl examples)
cfg_agent = PPO_DEFAULT_CONFIG.copy()
cfg_agent["rollouts"] = 16  # memory_size
cfg_agent["learning_epochs"] = 8
cfg_agent["mini_batches"] = 1  # 16 * 512 / 8192
cfg_agent["discount_factor"] = 0.99
cfg_agent["lambda"] = 0.95
cfg_agent["learning_rate"] = 3e-4
cfg_agent["learning_rate_scheduler"] = KLAdaptiveRL
cfg_agent["learning_rate_scheduler_kwargs"] = {"kl_threshold": 0.008}
cfg_agent["random_timesteps"] = 0
cfg_agent["learning_starts"] = 0
cfg_agent["grad_norm_clip"] = 1.0
cfg_agent["ratio_clip"] = 0.2
cfg_agent["value_clip"] = 0.2
cfg_agent["clip_predicted_values"] = True
cfg_agent["entropy_loss_scale"] = 0.0
cfg_agent["value_loss_scale"] = 2.0
cfg_agent["kl_threshold"] = 0
cfg_agent["rewards_shaper"] = lambda rewards, timestep, timesteps: rewards * 0.1
cfg_agent["state_preprocessor"] = RunningStandardScaler
cfg_agent["state_preprocessor_kwargs"] = {"size": env.observation_space, "device": device}
cfg_agent["value_preprocessor"] = RunningStandardScaler
cfg_agent["value_preprocessor_kwargs"] = {"size": 1, "device": device}
# logging to TensorBoard and writing checkpoints
cfg_agent["experiment"]["write_interval"] = 16
cfg_agent["experiment"]["checkpoint_interval"] = 80
```
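
The comment next to `mini_batches` can be unpacked as a quick sanity check. A minimal sketch, assuming the Cartpole task runs 512 parallel environments (the factor implied by `16 * 512 / 8192`; this count is not stated elsewhere in this card):

```python
# Sanity check for the rollout/mini-batch arithmetic in the config above
rollouts = 16      # transitions stored per environment before each update ("memory_size")
num_envs = 512     # assumed number of parallel environments
mini_batches = 1   # mini-batches per learning epoch

batch_size = rollouts * num_envs           # total transitions per PPO update
mini_batch_size = batch_size // mini_batches

print(batch_size)       # 8192
print(mini_batch_size)  # 8192
```

With `mini_batches = 1`, each learning epoch consumes the entire 8192-transition batch in a single gradient step.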