Vijay Shrivarshan Vijayaraja committed: Update README.md

This project implements a Deep Q-Network (DQN) to train an agent to solve the Lunar Lander environment from OpenAI Gym. The goal is to teach the agent to safely control a lunar lander to land on the moon's surface by interacting with the environment.

The project includes:

- A fully implemented DQN algorithm.
- Real-time visualization of the training process using Pygame.
- Dynamic plotting of training progress using Matplotlib.

---
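At the heart of the DQN algorithm above sits a replay memory. The sketch below is a minimal illustration of that idea, not the exact code from this repository; the `BUFFER_SIZE` and `BATCH_SIZE` names simply mirror the hyperparameters listed later in this README, and the 8-dimensional state / 4 discrete actions match the Lunar Lander environment.

```python
import random
from collections import deque, namedtuple

import numpy as np

# Names and values mirror the README's hyperparameter defaults.
BUFFER_SIZE = int(1e5)
BATCH_SIZE = 64

Transition = namedtuple("Transition", ["state", "action", "reward", "next_state", "done"])

class ReplayBuffer:
    """Fixed-size memory of past transitions, sampled uniformly at random.

    Random sampling decorrelates consecutive experiences, which is one of the
    key ingredients that stabilizes DQN training.
    """

    def __init__(self, buffer_size=BUFFER_SIZE):
        self.memory = deque(maxlen=buffer_size)  # oldest transitions fall off

    def add(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size=BATCH_SIZE):
        batch = random.sample(self.memory, k=batch_size)
        # Stack each field into one array so the network sees a whole minibatch.
        states = np.stack([t.state for t in batch])
        actions = np.array([t.action for t in batch])
        rewards = np.array([t.reward for t in batch], dtype=np.float32)
        next_states = np.stack([t.next_state for t in batch])
        dones = np.array([t.done for t in batch], dtype=np.float32)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.memory)

# LunarLander has an 8-dimensional state and 4 discrete actions.
buffer = ReplayBuffer()
rng = np.random.default_rng(0)
for _ in range(200):
    buffer.add(rng.normal(size=8), int(rng.integers(4)), 0.0, rng.normal(size=8), False)
states, actions, rewards, next_states, dones = buffer.sample()
print(states.shape, actions.shape)  # (64, 8) (64,)
```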
You can modify the following hyperparameters in the script to customize training:

**Learning Rate:** LR (default: 5e-4)

**Batch Size:** BATCH_SIZE (default: 64)

**Discount Factor (Gamma):** GAMMA (default: 0.99)

**Replay Buffer Size:** BUFFER_SIZE (default: 1e5)

**Target Network Update Rate:** TAU (default: 1e-3)

**Update Frequency:** UPDATE_EVERY (default: 4)

---
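To see how GAMMA, TAU, and UPDATE_EVERY interact, here is a small sketch under the assumption that the script follows the standard DQN recipe (this is not a copy of its code): the agent computes a one-step TD target discounted by GAMMA, triggers a learning step every UPDATE_EVERY environment steps, and softly blends the local weights into the target network at rate TAU.

```python
import numpy as np

# Defaults copied from the hyperparameter list above.
GAMMA = 0.99
TAU = 1e-3
UPDATE_EVERY = 4

def td_target(rewards, max_next_q, dones, gamma=GAMMA):
    """One-step TD target: r + gamma * max_a' Q_target(s', a'), zeroed at episode end."""
    return rewards + gamma * max_next_q * (1.0 - dones)

def soft_update(target_w, local_w, tau=TAU):
    """Polyak averaging: theta_target <- tau * theta_local + (1 - tau) * theta_target."""
    return [tau * l + (1.0 - tau) * t for l, t in zip(local_w, target_w)]

# A learning step is only triggered every UPDATE_EVERY environment steps:
learn_steps = [t for t in range(1, 17) if t % UPDATE_EVERY == 0]
print(learn_steps)  # [4, 8, 12, 16]

# Mid-episode transition vs. terminal transition (done=1 drops the bootstrap term):
y = td_target(np.array([1.0, 1.0]), np.array([10.0, 10.0]), np.array([0.0, 1.0]))
# y[0] = 1 + 0.99 * 10 = 10.9, y[1] = 1.0

# One soft update moves the target weight only a tiny step toward the local weight.
new_target = soft_update([np.array([0.0])], [np.array([1.0])])
```

The small TAU means the target network trails the local network slowly, which keeps the TD targets from chasing a moving estimate too quickly.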