Mario PPO Model

This is a PPO agent trained using Stable Baselines3 and Gymnasium on a Mario-like environment.

Environment Details

  • Action Space: Discrete, 7 NES-style button combinations
  • Observation: Grayscale frames, 250×264 pixels
  • Frame Stack: 4 frames (see the preprocessing sketch below)
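
A minimal sketch of how an observation pipeline like this could be assembled with Gymnasium wrappers and Stable Baselines3's vectorized-env utilities. The environment id "MarioLike-v0" is a placeholder, and the exact wrappers used for training are not documented here, so treat this as an assumption rather than the original setup (note that GrayScaleObservation is spelled GrayscaleObservation in newer Gymnasium releases).

import gymnasium as gym
from gymnasium.wrappers import GrayScaleObservation
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env import VecFrameStack

def make_env():
    # "MarioLike-v0" is a hypothetical id -- substitute the actual registered environment.
    env = gym.make("MarioLike-v0")
    # Reduce RGB frames to a single grayscale channel (keep_dim keeps a channel axis for the CNN).
    env = GrayScaleObservation(env, keep_dim=True)
    return env

venv = make_vec_env(make_env, n_envs=1)
# Stack the 4 most recent frames so the policy can infer motion between frames.
venv = VecFrameStack(venv, n_stack=4)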

Training Info

  • Algorithm: PPO
  • Framework: Stable Baselines3
  • Timesteps: 20 million
  • Environment: Gymnasium (v0)
  • Device: MPS / CUDA / CPU (see the training sketch below)
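
The configuration above maps onto a standard Stable Baselines3 training call. This is a minimal sketch assuming the default CnnPolicy and default PPO hyperparameters, and it reuses the frame-stacked venv from the preprocessing sketch; the released checkpoints were not necessarily trained with exactly these settings.

from stable_baselines3 import PPO

# venv is the grayscale, 4-frame-stacked environment built in the preprocessing sketch.
model = PPO(
    "CnnPolicy",    # image observations -> CNN feature extractor
    venv,
    verbose=1,
    device="auto",  # picks MPS, CUDA, or CPU automatically
)
model.learn(total_timesteps=20_000_000)
model.save("mario_ppo")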

Training Timesteps & Checkpoints

Checkpoint   Timesteps    Notes
25M Steps    25,000,000   Early-stage learning
50M Steps    50,000,000   Better stability
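
Checkpoints like the ones listed above can be written out during training with Stable Baselines3's CheckpointCallback. The save frequency and paths below are illustrative assumptions, not the exact values used for this model.

from stable_baselines3.common.callbacks import CheckpointCallback

# Illustrative: save a snapshot every 25,000,000 steps. save_freq is counted per
# environment, so divide by the number of parallel envs if n_envs > 1.
checkpoint_cb = CheckpointCallback(
    save_freq=25_000_000,
    save_path="./checkpoints/",
    name_prefix="mario_ppo",
)

# model is the PPO instance from the training sketch above.
model.learn(total_timesteps=50_000_000, callback=checkpoint_cb)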

Usage

from stable_baselines3 import PPO
from huggingface_hub import hf_hub_download

# Download the trained policy from the Hugging Face Hub, then load it with SB3.
model_path = hf_hub_download(repo_id="akantox/mario-rl-model", filename="mario_ppo.zip")
model = PPO.load(model_path)
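
Once loaded, the model can be rolled out for evaluation. The loop below assumes an environment (venv) built with the same grayscale and 4-frame-stack preprocessing used during training; a mismatched observation pipeline will make model.predict fail or perform poorly.

# venv must match the training-time observation pipeline (grayscale, 4-frame stack).
obs = venv.reset()
dones = [False]
while not dones[0]:
    # deterministic=True uses the most likely action for stable playback.
    action, _states = model.predict(obs, deterministic=True)
    obs, rewards, dones, infos = venv.step(action)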