vwxyzjn committed on
Commit b9ba5a5 (1 Parent(s): 4c379f4)

pushing model

README.md CHANGED
@@ -1,10 +1,23 @@
  ---
  tags:
  - CartPole-v1
- - ppo
  - deep-reinforcement-learning
  - reinforcement-learning
  - custom-implementation
  ---

  # (CleanRL) **PPO** Agent Playing **CartPole-v1**
@@ -14,33 +27,33 @@ To learn to code your own PPO agent and train it Unit 8 of the Deep Reinforcemen

  # Hyperparameters
  ```python
- {'exp_name': 'ppo'
- 'seed': 1
- 'torch_deterministic': True
- 'cuda': False
- 'track': False
- 'wandb_project_name': 'cleanRL'
- 'wandb_entity': None
- 'capture_video': True
- 'hf_repo_id': 'cleanrl/ppo'
- 'env_id': 'CartPole-v1'
- 'total_timesteps': 500000
- 'learning_rate': 0.00025
- 'num_envs': 4
- 'num_steps': 128
- 'anneal_lr': True
- 'gamma': 0.99
- 'gae_lambda': 0.95
- 'num_minibatches': 4
- 'update_epochs': 4
- 'norm_adv': True
- 'clip_coef': 0.2
- 'clip_vloss': True
- 'ent_coef': 0.01
- 'vf_coef': 0.5
- 'max_grad_norm': 0.5
- 'target_kl': None
- 'batch_size': 512
- 'minibatch_size': 128}
  ```

 
  ---
  tags:
  - CartPole-v1
  - deep-reinforcement-learning
  - reinforcement-learning
  - custom-implementation
+ model-index:
+ - name: ppo
+   results:
+   - task:
+       type: reinforcement-learning
+       name: reinforcement-learning
+     dataset:
+       name: CartPole-v1
+       type: CartPole-v1
+     metrics:
+     - type: training_episodic_return
+       value: 22.37 +/- 10.05
+       name: training_episodic_return
+       verified: false
  ---

  # (CleanRL) **PPO** Agent Playing **CartPole-v1**
 

  # Hyperparameters
  ```python
+ {'anneal_lr': True,
+ 'batch_size': 512,
+ 'capture_video': True,
+ 'clip_coef': 0.2,
+ 'clip_vloss': True,
+ 'cuda': False,
+ 'ent_coef': 0.01,
+ 'env_id': 'CartPole-v1',
+ 'exp_name': 'ppo',
+ 'gae_lambda': 0.95,
+ 'gamma': 0.99,
+ 'hf_repo_id': 'cleanrl/ppo',
+ 'learning_rate': 0.00025,
+ 'max_grad_norm': 0.5,
+ 'minibatch_size': 128,
+ 'norm_adv': True,
+ 'num_envs': 4,
+ 'num_minibatches': 4,
+ 'num_steps': 128,
+ 'seed': 1,
+ 'target_kl': None,
+ 'torch_deterministic': True,
+ 'total_timesteps': 500000,
+ 'track': False,
+ 'update_epochs': 4,
+ 'vf_coef': 0.5,
+ 'wandb_entity': None,
+ 'wandb_project_name': 'cleanRL'}
  ```

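Note on the hyperparameters: `batch_size` and `minibatch_size` appear to be derived values rather than independent settings. The sketch below checks that reading against the numbers in the dict. It assumes the usual PPO rollout bookkeeping (batch = environments × steps per rollout, split evenly into minibatches); the `num_updates` line is likewise an assumption about how the training loop counts iterations, not something this commit states.

```python
# Values copied from the hyperparameter dict in the README diff.
num_envs = 4
num_steps = 128
num_minibatches = 4
total_timesteps = 500000

# Assumption: one rollout collects num_steps transitions from each environment.
batch_size = num_envs * num_steps

# Assumption: each update epoch splits the batch into equal minibatches.
minibatch_size = batch_size // num_minibatches

# Assumption: the loop runs one policy update per collected batch.
num_updates = total_timesteps // batch_size

print(batch_size, minibatch_size, num_updates)
```

Under these assumptions the computed `batch_size` and `minibatch_size` match the 512 and 128 recorded in the README.
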
events.out.tfevents.1665689377.pop-os.2035746.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dc8e3969e0b2a3417a07fa5a34e9392daafe0951ddbb3d0d60d40b395ad5ed45
+ size 3475
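
Note on the added file: the three lines above are a Git LFS pointer, not the TensorBoard event data itself. Each line is a `key value` pair, so the pointer can be read with a few lines of Python. The helper name `parse_lfs_pointer` is hypothetical and only illustrates the layout shown in this diff; it is not part of Git LFS or of this repository.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its fields (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer contents copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:dc8e3969e0b2a3417a07fa5a34e9392daafe0951ddbb3d0d60d40b395ad5ed45
size 3475
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```
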