Q-Learning – CliffWalking-v1

A tabular Q-Learning agent, implemented from scratch and trained on CliffWalking-v1.

  • Observation space: Discrete(48)
  • Action space: Discrete(4)
  • Mean Reward: -13.00 ± 0.00 (the optimal return: the shortest safe path takes 13 steps at -1 each)
  • Episodes: 100,000
  • Learning rate: 0.7 | Gamma (discount factor): 0.95 | Epsilon decay: 0.0005
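
For readers who want to see what such a training run might look like, here is a minimal sketch of tabular Q-Learning on a cliff-walking grid. It is not the code used to produce this model. The grid is re-implemented in plain Python (mirroring the standard 4x12 layout, with start state 36, goal state 47, and actions up/right/down/left) so the example runs without extra packages. The learning rate, gamma, and epsilon decay come from the list above; the exponential epsilon schedule, `max_eps`, `min_eps`, `max_steps`, the seed, and the reduced episode count (5,000 instead of 100,000) are assumptions chosen for illustration.

```python
import math
import random

N_ROWS, N_COLS = 4, 12
N_STATES, N_ACTIONS = N_ROWS * N_COLS, 4     # Discrete(48), Discrete(4)
START, GOAL = 36, 47                         # bottom-left and bottom-right cells
MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]   # up, right, down, left


def step(state, action):
    """Return (next_state, reward, terminated) for one deterministic move."""
    row, col = divmod(state, N_COLS)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), N_ROWS - 1)   # bumping a wall keeps the agent in place
    col = min(max(col + d_col, 0), N_COLS - 1)
    if row == N_ROWS - 1 and 0 < col < N_COLS - 1:
        return START, -100, False                # fell off the cliff: back to the start
    nxt = row * N_COLS + col
    return nxt, -1, nxt == GOAL


def greedy(q_row):
    """Index of the highest-valued action (first one wins ties)."""
    return max(range(N_ACTIONS), key=q_row.__getitem__)


def train(episodes=5000, lr=0.7, gamma=0.95, decay=0.0005,
          max_eps=1.0, min_eps=0.05, max_steps=200, seed=0):
    """Epsilon-greedy tabular Q-Learning; the schedule and caps are assumptions."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for ep in range(episodes):
        # Assumed schedule: epsilon decays exponentially with the episode index.
        eps = min_eps + (max_eps - min_eps) * math.exp(-decay * ep)
        state = START
        for _ in range(max_steps):
            if rng.random() < eps:
                action = rng.randrange(N_ACTIONS)    # explore
            else:
                action = greedy(q[state])            # exploit
            nxt, reward, done = step(state, action)
            # Q-Learning target: bootstrap from the best next action unless terminal.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += lr * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def evaluate(q, max_steps=200):
    """Return of one greedy episode from the start state."""
    state, total = START, 0
    for _ in range(max_steps):
        state, reward, done = step(state, greedy(q[state]))
        total += reward
        if done:
            break
    return total


q_table = train()
episode_return = evaluate(q_table)
print(episode_return)
```

Because the grid is deterministic, a single greedy episode is enough to evaluate the learned table in this sketch; evaluating against a real environment would normally average over several episodes.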