Q-Learning Model for CartPole

This project implements a Q-learning model for the CartPole-v1 environment using Gymnasium. The agent is trained to balance a pole on a moving cart by learning optimal actions through trial and error. The learning process uses an epsilon-greedy strategy, where the agent explores random actions at the beginning and gradually shifts towards exploiting learned actions as training progresses.
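
For concreteness, here is a minimal sketch of epsilon-greedy action selection with a decaying exploration rate. The function name, decay schedule, and constants are illustrative assumptions, not taken from train.py:

    import numpy as np

    rng = np.random.default_rng()

    def choose_action(q_table, state, epsilon, n_actions=2):
        # Explore with probability epsilon; otherwise exploit the
        # best-known action for this (discretized) state.
        if rng.random() < epsilon:
            return int(rng.integers(n_actions))
        return int(np.argmax(q_table[state]))

    # Hypothetical decay schedule: epsilon shrinks after each episode,
    # shifting the agent from exploration toward exploitation.
    epsilon, epsilon_min, decay = 1.0, 0.01, 0.995
    epsilon = max(epsilon_min, epsilon * decay)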

Key features of the model:

  • Discretization: Continuous state variables (cart position, cart velocity, pole angle, and pole angular velocity) are discretized into bins for efficient tabular Q-learning.
  • Q-learning algorithm: The agent updates its Q-values based on the Bellman equation, learning from the reward it receives after each action (see the sketch after this list).
  • Epsilon-greedy strategy: The agent balances exploration and exploitation.
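
The following sketch shows one way the discretization and the Bellman update could look. The bin counts, state bounds, and hyperparameters here are assumptions for illustration; train.py may use different values:

    import numpy as np

    # Assumed bin counts and clipping bounds per state variable
    # (cart position, cart velocity, pole angle, pole angular velocity).
    N_BINS = (6, 6, 12, 12)
    BOUNDS = [(-4.8, 4.8), (-3.0, 3.0), (-0.418, 0.418), (-3.5, 3.5)]

    def discretize(obs):
        # Map a continuous observation to a tuple of bin indices.
        idx = []
        for value, bins, (lo, hi) in zip(obs, N_BINS, BOUNDS):
            clipped = min(max(value, lo), hi)
            idx.append(int((clipped - lo) / (hi - lo) * (bins - 1)))
        return tuple(idx)

    # Q-table indexed by (bin indices..., action); CartPole has 2 actions.
    q_table = np.zeros(N_BINS + (2,))

    def update_q(q_table, s, a, reward, s_next, done, alpha=0.1, gamma=0.99):
        # Bellman update: move Q(s, a) toward the bootstrapped target.
        target = reward if done else reward + gamma * np.max(q_table[s_next])
        q_table[s][a] += alpha * (target - q_table[s][a])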

Files:

  • train.py: Code for training the agent.
  • cartPole_qtable.npy: The trained Q-table.
  • replay.mp4: A video showing the agent's performance.

How to Reproduce:

  1. Install the dependencies:

    pip install gymnasium numpy imageio
    
  2. Run the training script:

    python train.py
    
  3. Use the saved Q-table (cartPole_qtable.npy) to evaluate the model; a minimal evaluation sketch follows this list.
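
As an illustration (not the exact evaluation code shipped with this repo), a greedy rollout using the saved Q-table might look like this; it reuses the discretize() helper sketched above:

    import gymnasium as gym
    import numpy as np

    q_table = np.load("cartPole_qtable.npy")

    env = gym.make("CartPole-v1")
    obs, _ = env.reset()
    total_reward, done = 0.0, False
    while not done:
        state = discretize(obs)                  # helper from the sketch above
        action = int(np.argmax(q_table[state]))  # greedy: no exploration
        obs, reward, terminated, truncated, _ = env.step(action)
        total_reward += reward
        done = terminated or truncated
    env.close()
    print(f"Episode reward: {total_reward}")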
