MattStammers/SAC-Bipedal_Walker_v3-HardcoreTrained

Reinforcement Learning

stable-baselines3

BipedalWalker-v3

deep-reinforcement-learning


MattStammers committed on Aug 16, 2023

Commit 2835925 • 1 Parent(s): 46b399c

Update README.md

Files changed (1)

README.md +3 -1


@@ -36,8 +36,9 @@ from huggingface_sb3 import load_from_hub
 ...
 ```
-Well he does ok but still gets stuck on the rocks. Here are my hyperparameters not that they did me any good:
+Well he does ok but still gets stuck on the rocks. Here are my hyperparameters not that they did me much good 😂:
+```python
 def linear_schedule(initial_value, final_value=0.00001):
     def func(progress_remaining):
         """Progress will decrease from 1 (beginning) to 0 (end)"""
@@ -61,5 +62,6 @@ model = SAC(
     policy_kwargs=dict(net_arch=[400, 300]),
     verbose=1
 )
+```
 These are pretty well tuned but SAC leads to too much exploration and the agent is unable to exploit the required actions to complete the course. I suspect TD3 will be more successful so plan to turn back to that
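
The diff hunks cut off the body of `linear_schedule`, so the actual implementation is not visible here. For readers who want to see how such a schedule typically works, the following is a minimal sketch that assumes a plain linear interpolation between `initial_value` and `final_value`. The return expression and the example starting rate of `1e-3` are assumptions for illustration, not the author's code. Stable-Baselines3 accepts a callable of `progress_remaining` as `learning_rate`, which is why the inner function takes that single argument.

```python
def linear_schedule(initial_value, final_value=0.00001):
    """Build a learning-rate schedule that decays linearly with training progress.

    ASSUMPTION: the interpolation below is a guess at the body elided by the
    diff; the README's real implementation may differ.
    """
    def func(progress_remaining):
        """Progress will decrease from 1 (beginning) to 0 (end)"""
        # At progress_remaining == 1 this gives initial_value,
        # at progress_remaining == 0 it gives final_value.
        return final_value + progress_remaining * (initial_value - final_value)

    return func


if __name__ == "__main__":
    # 1e-3 is an illustrative starting rate, not a value taken from the README.
    schedule = linear_schedule(1e-3)
    for progress in (1.0, 0.5, 0.0):
        print(progress, schedule(progress))
```

A schedule built this way could be passed as `learning_rate=linear_schedule(1e-3)` when constructing the model. Keeping a small non-zero `final_value` means the optimizer still makes updates at the very end of training rather than stalling at a rate of zero.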