Mario PPO Model
This is a PPO agent trained using Stable Baselines3 and Gymnasium on a Mario-like environment.
Environment Details
- Action Space: Simple discrete NES-style actions (7 total)
- Observation: Grayscale, 250×264
- Frame Stack: 4 frames
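The observation settings above can be illustrated with a small NumPy sketch. It is not the preprocessing used for this model: the `(250, 264)` height-by-width order, the channel-first stacking, the luma weights, and the names `to_grayscale` and `FrameStacker` are all assumptions made for illustration.

```python
from collections import deque

import numpy as np


def to_grayscale(frame_rgb: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) uint8 RGB frame to an (H, W) uint8 grayscale frame."""
    # ITU-R BT.601 luma weights; an assumed choice, other weightings exist.
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = frame_rgb.astype(np.float32) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)


class FrameStacker:
    """Keeps the most recent `num_frames` frames and stacks them on axis 0."""

    def __init__(self, num_frames: int = 4):
        self.num_frames = num_frames
        self.frames = deque(maxlen=num_frames)

    def reset(self, first_frame: np.ndarray) -> np.ndarray:
        # At episode start, fill the whole buffer with the first frame.
        self.frames.clear()
        for _ in range(self.num_frames):
            self.frames.append(first_frame)
        return self.observation()

    def push(self, frame: np.ndarray) -> np.ndarray:
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return self.observation()

    def observation(self) -> np.ndarray:
        # Resulting shape: (num_frames, H, W), oldest frame first.
        return np.stack(list(self.frames), axis=0)
```

Stacking consecutive frames gives the policy access to motion information that a single grayscale frame cannot carry.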
Training Info
- Algorithm: PPO
- Framework: Stable Baselines3
- Timesteps: 20 million
- Environment: Gymnasium (v0)
- Device: MPS / CUDA / CPU
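Since the training info lists MPS, CUDA, and CPU as devices, a device fallback could be sketched as follows. The helper `pick_device` and its priority order (CUDA, then MPS, then CPU) are assumptions, not something this model card specifies.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a PyTorch-style device string, preferring a GPU backend.

    The priority order here is an assumption made for illustration.
    """
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

With PyTorch installed, `pick_device(torch.cuda.is_available(), torch.backends.mps.is_available())` yields a string that can be passed as the `device` argument when constructing or loading a Stable Baselines3 `PPO` model.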
Training Timesteps & Checkpoints
| Checkpoint | Timesteps | Notes |
|---|---|---|
| 25M Steps | 25,000,000 | Early-stage learning |
| 50M Steps | 50,000,000 | Better stability |
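One way to produce checkpoints at fixed timestep intervals like these is Stable Baselines3's `CheckpointCallback`. Whether this model was saved that way is not stated here, so the following is only a sketch. The helper names, the `./checkpoints` path, and the `mario_ppo` prefix are hypothetical.

```python
def checkpoint_save_freq(timesteps_between_checkpoints: int, n_envs: int = 1) -> int:
    """Convert a timestep interval into a CheckpointCallback `save_freq`.

    The callback counts calls to `env.step()`, and each call on a vectorized
    environment advances `n_envs` timesteps, so the interval is divided by `n_envs`.
    """
    return max(timesteps_between_checkpoints // n_envs, 1)


def make_checkpoint_callback(
    timesteps_between_checkpoints: int,
    n_envs: int = 1,
    save_path: str = "./checkpoints",  # hypothetical output directory
    name_prefix: str = "mario_ppo",  # hypothetical file name prefix
):
    """Build a callback that saves the model every N timesteps."""
    # Imported lazily so the helper above works without Stable Baselines3.
    from stable_baselines3.common.callbacks import CheckpointCallback

    return CheckpointCallback(
        save_freq=checkpoint_save_freq(timesteps_between_checkpoints, n_envs),
        save_path=save_path,
        name_prefix=name_prefix,
    )
```

The returned callback would be passed to `model.learn(total_timesteps=..., callback=...)`. For a 25,000,000-step interval with 8 parallel environments, `save_freq` comes out to 3,125,000 calls.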
Usage
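from huggingface_hub import hf_hub_download
from stable_baselines3 import PPO

# Download the saved model archive from the Hugging Face Hub (cached after the first call).
model_path = hf_hub_download(repo_id="akantox/mario-rl-model", filename="mario_ppo.zip")

# Load the trained PPO policy from the downloaded archive.
model = PPO.load(model_path)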
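Once loaded, the model can be rolled out in an environment. The sketch below assumes an `env` that follows the Gymnasium API (`reset()` returns `(obs, info)` and `step()` returns a 5-tuple) and applies the same preprocessing as training. Creating that environment is not covered here, and `run_episode` is a hypothetical helper.

```python
def run_episode(model, env, max_steps: int = 10_000) -> float:
    """Play one episode with the model's deterministic policy and return the total reward."""
    obs, _info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        # predict() returns (action, recurrent_state); the state is unused here.
        action, _state = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        # Gymnasium separates a true episode end from a time-limit cutoff.
        if terminated or truncated:
            break
    return total_reward
```

Some environments expect a plain Python integer for discrete actions, in which case `int(action)` may be needed before calling `env.step`.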