Saraaaaaaaaa/Reinforce-Unit4-1

Reinforcement Learning

custom-implementation


Saraaaaaaaaa committed on Apr 24

Commit 6430312 • 1 Parent(s): 69aff6f

Update README.md

Files changed (1)

README.md +6 -0


@@ -24,4 +24,10 @@ model-index:
   # **Reinforce** Agent playing **CartPole-v1**
   This is a trained model of a **Reinforce** agent playing **CartPole-v1**.
   To learn to use this model and train your own, check Unit 4 of the Deep Reinforcement Learning Course: https://huggingface.co/deep-rl-course/unit4/introduction

+  **Policy-based learning** approximates the policy π directly, without having to learn a value function. The objective is then to maximize the performance of the parameterized policy using gradient ascent.
+  TL;DR: The cart learns to balance the pole by optimizing π for the best outcome: *the pole not falling over*.
+  This model was trained for only 500 timesteps, compared with the usual 1000, which is why the cart struggles so much to balance the pole in the video; it has not trained long enough.
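
To make the gradient-ascent idea above concrete, here is a minimal sketch of one Reinforce update. It is not the code used to train this model: it assumes a linear softmax policy written in plain Python instead of the neural network from the course, and the names `softmax_policy`, `discounted_returns`, and `reinforce_update` are hypothetical helpers introduced only for illustration.

```python
import math


def softmax_policy(theta, state):
    """Action probabilities pi(a|s) for a linear softmax policy.

    theta is a list with one weight vector per action (an assumption of
    this sketch, not the course's network); state is a list of floats.
    """
    logits = [sum(w * s for w, s in zip(weights, state)) for weights in theta]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - top) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step t."""
    returns, running = [], 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_update(theta, episode, lr=0.01, gamma=0.99):
    """One gradient-ascent step on a single recorded episode.

    Applies theta += lr * sum_t G_t * grad log pi(a_t | s_t). For a linear
    softmax policy the gradient with respect to the weights of action k is
    (1[k == a_t] - pi(k | s_t)) * s_t.
    """
    states, actions, rewards = episode
    returns = discounted_returns(rewards, gamma)
    new_theta = [list(weights) for weights in theta]
    for state, action, g in zip(states, actions, returns):
        probs = softmax_policy(theta, state)
        for k in range(len(theta)):
            indicator = 1.0 if k == action else 0.0
            coeff = lr * g * (indicator - probs[k])
            for i, s_i in enumerate(state):
                new_theta[k][i] += coeff * s_i
    return new_theta


# Toy usage: two actions and a 4-value state, the same shapes as CartPole-v1.
theta = [[0.0] * 4, [0.0] * 4]
episode = ([[0.1, 0.0, -0.05, 0.0]], [1], [1.0])  # (states, actions, rewards)
theta = reinforce_update(theta, episode)
```

Actions followed by a high return get their probability pushed up, so a policy needs enough episodes of this update before the pole stays upright for long.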