Parth673 committed on
Commit 1a4aebc
1 Parent(s): 1d60ab8

First push

README.md ADDED
@@ -0,0 +1,32 @@
+ ---
+ tags:
+ - LunarLander-v2
+ - ppo
+ - deep-reinforcement-learning
+ - reinforcement-learning
+ - custom-implementation
+ - deep-rl-course
+ model-index:
+ - name: PPO
+   results:
+   - task:
+       type: reinforcement-learning
+       name: reinforcement-learning
+     dataset:
+       name: LunarLander-v2
+       type: LunarLander-v2
+     metrics:
+     - type: mean_reward
+       value: 104.83 +/- 18.01
+       name: mean_reward
+       verified: false
+ ---
+
+ # PPO Agent Playing LunarLander-v2
+
+ This is a trained model of a PPO agent playing LunarLander-v2.
+
+ # Hyperparameters
+ See the GitHub repository for full details and the journey of creating this model, which on the surface is not particularly exciting: https://github.com/MattStammers/PPO_Lander_Implementation
+
+ It took me 8 attempts to get the score to nearly reach 0 using a CleanRL implementation with WandB metric tracking. This version was then trained after 10 attempts and converged at about 3 million training steps.
events.out.tfevents.1695069059.rhmmedcatt-ProLiant-ML350-Gen10.23223.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6534fa50be0408aa0878e5d8bb264ea7a140e986a43eac761528e7b199967c45
+ size 2448271
model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fa3561982d92893028d31ca5cfc8a13c9593138e5475bee965053e2e44f111d1
+ size 147301
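
The entries added for model.pt and the TensorBoard events file are Git LFS pointer files, not the binaries themselves: each records the spec version, a SHA-256 object id, and the size in bytes. A minimal sketch of reading such a pointer is shown here; the helper name `parse_lfs_pointer` is hypothetical and is not part of Git LFS or of this repository.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (illustrative helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is a key, a single space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer content for model.pt, copied from the diff above without the '+' markers.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:fa3561982d92893028d31ca5cfc8a13c9593138e5475bee965053e2e44f111d1
size 147301
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```

The parsed size (147301 bytes) is that of the actual model.pt stored in LFS, not of the three-line pointer committed to the repository.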
replay.mp4 ADDED
Binary file (91.4 kB).
 
results.json ADDED
@@ -0,0 +1 @@
+ {"env_id": "LunarLander-v2", "mean_reward": 104.82761980859839, "std_reward": 18.014334589706, "n_evaluation_episodes": 10, "eval_datetime": "2023-09-19T00:59:32.656971"}
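
The `mean_reward` value in the README metadata (104.83 +/- 18.01) appears to be the `mean_reward` and `std_reward` fields of results.json rounded to two decimals. The sketch below reproduces that string from the JSON; the function name `format_metric` is hypothetical, and how the upload script actually built the value is an assumption.

```python
import json

# Contents of results.json, copied from the diff above without the '+' marker.
RESULTS_JSON = (
    '{"env_id": "LunarLander-v2", "mean_reward": 104.82761980859839, '
    '"std_reward": 18.014334589706, "n_evaluation_episodes": 10, '
    '"eval_datetime": "2023-09-19T00:59:32.656971"}'
)


def format_metric(results: dict) -> str:
    """Render mean and standard deviation as 'mean +/- std' with two decimals."""
    return f"{results['mean_reward']:.2f} +/- {results['std_reward']:.2f}"


results = json.loads(RESULTS_JSON)
metric = format_metric(results)
print(metric)  # 104.83 +/- 18.01
```

With only 10 evaluation episodes, the reported standard deviation is a rough estimate, so small differences between runs should be read with caution.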