Q-Learning – CliffWalking-v1

Observation space: Discrete(48)
Action space: Discrete(4)
Mean Reward: -13.00 ± 0.00
Episodes: 100,000
Learning rate: 0.7 | Gamma: 0.95 | Epsilon decay: 0.0005

Trained on CliffWalking-v1 using tabular Q-Learning from scratch.
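
For readers unfamiliar with the method, the following is a minimal sketch of tabular Q-Learning on a cliff-walking grid, not the training script behind this model. It uses the learning rate, gamma and epsilon decay listed above, but several things are assumptions made for illustration: the `step` function is a simplified pure-Python stand-in for the environment (4 x 12 grid, -1 per move, -100 and a return to the start on stepping into the cliff), the exponential epsilon schedule with bounds 1.0 and 0.05 is a guess at how the decay rate is applied, and the names `train`, `evaluate` and `greedy` are hypothetical. The episode count is reduced to keep the example fast.

```python
import math
import random

N_ROWS, N_COLS = 4, 12                      # 4 x 12 grid -> 48 discrete states
N_STATES, N_ACTIONS = N_ROWS * N_COLS, 4
START, GOAL = 36, 47                        # bottom-left and bottom-right cells
MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left


def step(state, action):
    """Simplified cliff-walking dynamics (assumed stand-in for the real env)."""
    row, col = divmod(state, N_COLS)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), N_ROWS - 1)   # moves off the grid are clipped
    col = min(max(col + d_col, 0), N_COLS - 1)
    if row == N_ROWS - 1 and 0 < col < N_COLS - 1:
        return START, -100, False                # cliff: penalty, back to start
    nxt = row * N_COLS + col
    return nxt, -1, nxt == GOAL


def greedy(q_row):
    """Index of the highest-valued action (first one wins ties)."""
    return max(range(N_ACTIONS), key=q_row.__getitem__)


def train(episodes=5000, alpha=0.7, gamma=0.95, decay=0.0005,
          max_steps=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for episode in range(episodes):
        # Assumed schedule: exponential decay from 1.0 towards 0.05.
        eps = 0.05 + (1.0 - 0.05) * math.exp(-decay * episode)
        state = START
        for _ in range(max_steps):
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                action = rng.randrange(N_ACTIONS)
            else:
                action = greedy(q[state])
            nxt, reward, done = step(state, action)
            # Q-Learning update: bootstrap from the best next-state value.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def evaluate(q, max_steps=100):
    """Total reward of one greedy rollout from the start state."""
    state, total = START, 0
    for _ in range(max_steps):
        state, reward, done = step(state, greedy(q[state]))
        total += reward
        if done:
            break
    return total


if __name__ == "__main__":
    q_table = train()
    print("greedy episode return:", evaluate(q_table))
```

The shortest safe route is one move up, eleven moves right and one move down, 13 moves at -1 each, which is consistent with the reported mean reward of -13.00.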

