license: mit
Lunar Lander Deep Q-Learning Model
A Deep Q-Network (DQN) implementation to train an agent for the Lunar Lander environment from OpenAI Gym, complete with an interactive visualizer using Pygame.
NOTE: I used only 10 episodes for the purposes of the accompanying video. I recommend using at least 500 episodes for better results.
Table of Contents
- General Information
- Features
- Tools and Technologies
- Setup
- Usage
- How It Works
- Adjusting Hyperparameters
- Using the Trained Model
- Credits
General Information
This project implements a Deep Q-Network (DQN) to train an agent to solve the Lunar Lander environment from OpenAI Gym. The goal is to teach the agent to safely control a lunar lander to land on the moon's surface by interacting with the environment.
The project includes:
A fully implemented DQN algorithm.
Real-time visualization of the training process using Pygame.
Dynamic plotting of training progress using Matplotlib.
Features
Deep Reinforcement Learning:
- Neural networks approximate the Q-function.
- Implements experience replay and a target network for stability.
Interactive Visualization:
- Displays the Lunar Lander environment in real-time using Pygame.
- Dynamically plots training progress alongside the environment.
Customizable Training:
- Adjustable hyperparameters, including learning rate, batch size, discount factor, and more.
Environment Solving:
- Trains the agent to achieve an average score of 200 over 100 consecutive episodes.
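The solved criterion can be sketched as a rolling mean over the most recent 100 episode scores. This is a minimal illustration; `scores_window` and `record_score` are hypothetical names, not the project's actual API:

```python
from collections import deque

import numpy as np

scores_window = deque(maxlen=100)  # keeps only the 100 most recent scores

def record_score(score):
    """Append an episode score and report whether the environment counts as solved."""
    scores_window.append(score)
    # Solved once a full window of 100 scores averages at least 200.
    return len(scores_window) == 100 and np.mean(scores_window) >= 200.0
```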
Tools and Technologies
Libraries and Frameworks
Reinforcement Learning:
- PyTorch (for building and training the neural network)
- NumPy (for efficient numerical computations)
Visualization:
- Pygame (for real-time visualization of the environment)
- Matplotlib (for plotting training progress)
Environment:
- OpenAI Gym (Lunar Lander environment)
Hardware Support
- CUDA support for GPU acceleration.
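Enabling GPU acceleration typically follows the standard PyTorch device-selection pattern, a sketch of which looks like:

```python
import torch

# Use the GPU when CUDA is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Models and tensors are then moved to the chosen device, e.g.:
# model.to(device); state_tensor = state_tensor.to(device)
```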
Setup
Prerequisites
- Python 3.8 or higher
- A compatible GPU (optional but recommended for faster training)
Installation
Clone the Repository:

```shell
git clone https://github.com/yourusername/lunar-lander-dqn.git
cd lunar-lander-dqn
```

Install Required Packages:

```shell
pip install -r requirements.txt
```

Verify Installation: Ensure that PyTorch is installed with CUDA support if you plan to use a GPU.
Usage
Training the Agent
Run the Training Script:

```shell
python main.py
```

Monitor Training
- View real-time rendering of the Lunar Lander environment.
- Observe the dynamic plot of training scores as the agent learns.
Save Model
- The trained model is automatically saved as `checkpoint.pth` when the environment is solved (average score ≥ 200 over 100 consecutive episodes).
How It Works
Q-Network Architecture
- A feedforward neural network with 2 hidden layers of 64 neurons each
- Input: Current state of the environment
- Output: Q-values for all possible actions
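A sketch of such a network in PyTorch. Lunar Lander has an 8-dimensional state and 4 discrete actions; the class name and default sizes here are illustrative, not necessarily the project's exact code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QNetwork(nn.Module):
    """Maps a state vector to one Q-value per action."""

    def __init__(self, state_size=8, action_size=4, hidden=64):
        super().__init__()
        self.fc1 = nn.Linear(state_size, hidden)   # input -> hidden layer 1
        self.fc2 = nn.Linear(hidden, hidden)       # hidden layer 1 -> hidden layer 2
        self.fc3 = nn.Linear(hidden, action_size)  # hidden layer 2 -> Q-values

    def forward(self, state):
        x = F.relu(self.fc1(state))
        x = F.relu(self.fc2(x))
        return self.fc3(x)
```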
Target Network
- Maintains a separate network to compute target Q-values.
- Updated slowly toward the local network (soft updates) so the learning targets stay stable.
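The slow tracking of the target network can be implemented as a soft update that interpolates its weights toward the local network by a small factor τ. A sketch, assuming both models are standard `nn.Module` instances; the project's exact update may differ:

```python
def soft_update(local_model, target_model, tau=1e-3):
    """Blend target weights toward local weights:
    θ_target ← τ·θ_local + (1 − τ)·θ_target
    """
    for target_param, local_param in zip(target_model.parameters(),
                                         local_model.parameters()):
        target_param.data.copy_(tau * local_param.data +
                                (1.0 - tau) * target_param.data)
```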
Experience Replay
- Stores past experiences in a replay buffer.
- Samples random mini-batches for training to break correlations between consecutive experiences and stabilize learning.
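A minimal replay buffer along these lines (illustrative; the project's actual class may differ in naming and details):

```python
import random
from collections import deque, namedtuple

Experience = namedtuple("Experience",
                        ["state", "action", "reward", "next_state", "done"])

class ReplayBuffer:
    """Fixed-size buffer of past transitions, sampled uniformly at random."""

    def __init__(self, buffer_size=int(1e5), batch_size=64):
        self.memory = deque(maxlen=buffer_size)  # oldest experiences fall off
        self.batch_size = batch_size

    def add(self, state, action, reward, next_state, done):
        self.memory.append(Experience(state, action, reward, next_state, done))

    def sample(self):
        # Uniform sampling breaks the correlation between consecutive steps.
        return random.sample(self.memory, k=self.batch_size)

    def __len__(self):
        return len(self.memory)
```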
Epsilon-Greedy Policy
- Balances exploration and exploitation.
- Decays epsilon over time to focus on exploitation as training progresses.
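The epsilon-greedy rule with decay can be sketched as follows; the `eps_end` and `eps_decay` values here are illustrative defaults, not taken from the project:

```python
import random

def epsilon_greedy(q_values, eps):
    """Pick a random action with probability eps, otherwise the greedy one."""
    if random.random() < eps:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def decay_epsilon(eps, eps_end=0.01, eps_decay=0.995):
    """Shrink epsilon each episode, but never below eps_end."""
    return max(eps_end, eps * eps_decay)
```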
Adjusting Hyperparameters
You can modify the following hyperparameters in the script to customize training:
Learning Rate: LR (default: 5e-4)
Batch Size: BATCH_SIZE (default: 64)
Discount Factor (Gamma): GAMMA (default: 0.99)
Replay Buffer Size: BUFFER_SIZE (default: 1e5)
Target Network Update Rate: TAU (default: 1e-3)
Update Frequency: UPDATE_EVERY (default: 4)
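Collected as module-level constants, the defaults listed above would look like:

```python
BUFFER_SIZE = int(1e5)  # replay buffer size
BATCH_SIZE = 64         # mini-batch size
GAMMA = 0.99            # discount factor
TAU = 1e-3              # soft-update interpolation rate for the target network
LR = 5e-4               # learning rate
UPDATE_EVERY = 4        # learn every N environment steps
```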
Using the Trained Model
Once the model is trained, you can use it to perform inference:
Load the Trained Model: Update your script to load the model:
```python
agent.qnetwork_local.load_state_dict(torch.load('checkpoint.pth'))
```

Run Inference: Use the agent.act() function to make decisions for the Lunar Lander environment:

```python
state, _ = env.reset()
done = False
while not done:
    action = agent.act(state)
    next_state, reward, terminated, truncated, _ = env.step(action)
    done = terminated or truncated
    state = next_state
```
Credits
Created by Vijay Shrivarshan Vijayaraja