satcos committed on
Commit
36518c2
1 Parent(s): d9c35af

Upload folder using huggingface_hub

Files changed (4)
  1. README.md +19 -60
  2. hyperparameters.json +1 -0
  3. model.pt +3 -0
  4. results.json +1 -0
README.md CHANGED
@@ -1,67 +1,26 @@
  ---
  tags:
- - generated_from_trainer
- - BipedalWalker-v3
- - deep-reinforcement-learning
- - reinforcement-learning
- datasets:
- - gym_replay
  model-index:
  - name: DT-BipedalWalker-v3
    results:
-   - task:
-       type: reinforcement-learning
-       name: reinforcement-learning
-     dataset:
-       name: BipedalWalker-v3
-       type: BipedalWalker-v3
-     metrics:
-     - type: mean_reward
-       value: 252.17 +/- 12.79
-       name: mean_reward
-       verified: false
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # DT-BipedalWalker-v3
-
- This model is a fine-tuned version of [](https://huggingface.co/) on the gym_replay dataset.
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0001
- - train_batch_size: 64
- - eval_batch_size: 8
- - seed: 42
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - lr_scheduler_warmup_ratio: 0.1
- - num_epochs: 120
-
- ### Training results
-
-
-
- ### Framework versions
-
- - Transformers 4.38.2
- - Pytorch 2.1.2
- - Datasets 2.18.0
- - Tokenizers 0.15.2
 
  ---
  tags:
+ - BipedalWalker-v3
+ - reinforce
+ - reinforcement-learning
+ - custom-implementation
+ - deep-rl-class
  model-index:
  - name: DT-BipedalWalker-v3
    results:
+   - task:
+       type: reinforcement-learning
+       name: reinforcement-learning
+     dataset:
+       name: BipedalWalker-v3
+       type: BipedalWalker-v3
+     metrics:
+     - type: mean_reward
+       value: 252.88 +/- 1.38
+       name: mean_reward
+       verified: false
  ---

+ # **Reinforce** Agent playing **BipedalWalker-v3**
+ This is a trained model of a **Reinforce** agent playing **BipedalWalker-v3** .
+
hyperparameters.json ADDED
@@ -0,0 +1 @@
+ {"env_id": "BipedalWalker-v3", "n_evaluation_episodes": 10}
model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:984b01663c7dd81e9bccf78305843457f00b798eae3433f3ef3f985fddce6ff5
+ size 8211677
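The three lines added for model.pt are a Git LFS pointer, not the weights themselves. As a rough sketch, a pointer like this can be split into its fields with plain string handling. The helper name `parse_lfs_pointer` is hypothetical and does not come from any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>"
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid is written as "<hash-algorithm>:<hex digest>"
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the diff above
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:984b01663c7dd81e9bccf78305843457f00b798eae3433f3ef3f985fddce6ff5
size 8211677
"""

info = parse_lfs_pointer(POINTER)
size_mib = info["size_bytes"] / (1024 * 1024)
```

Here `size_mib` comes out at roughly 7.8, so the stored checkpoint is a little under 8 MiB.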
results.json ADDED
@@ -0,0 +1 @@
+ {"env_id": "BipedalWalker-v3", "mean_reward": 252.88336973987106, "n_evaluation_episodes": 10, "eval_datetime": "2024-03-20T13:03:25.909234"}
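The `252.88 +/- 1.38` reported in the README metadata can be cross-checked against results.json. A minimal sketch, assuming the JSON line above is available as a string. The variable names are illustrative, and because results.json does not store the standard deviation, only the mean can be checked this way.

```python
import json
from datetime import datetime

# JSON payload copied from the results.json diff above
RESULTS_JSON = (
    '{"env_id": "BipedalWalker-v3", "mean_reward": 252.88336973987106, '
    '"n_evaluation_episodes": 10, "eval_datetime": "2024-03-20T13:03:25.909234"}'
)

results = json.loads(RESULTS_JSON)

# Value string as it appears in the README metadata
card_value = "252.88 +/- 1.38"
card_mean = float(card_value.split("+/-")[0])

# The card rounds to two decimals, so allow half a unit in the last place
matches_card = abs(results["mean_reward"] - card_mean) < 0.005

# Parse the evaluation timestamp (ISO 8601 with microseconds)
evaluated_at = datetime.fromisoformat(results["eval_datetime"])
```

With these inputs `matches_card` is true, which is consistent with the README metadata and results.json coming from the same 10-episode evaluation run.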