SuperMarioBros-Nes-v0 Level1-2 PPO

PPO policy checkpoint for completing SuperMarioBros-Nes-v0 Level1-2 with Stable Retro, trained with rlab.

At a Glance

| Item | Value |
| --- | --- |
| Task | Complete SuperMarioBros-Nes-v0 Level1-2 |
| Model | Stable-Baselines3 PPO |
| Format | SB3 .zip checkpoint |
| Checkpoint | model.zip |
| W&B artifact | tsilva/SuperMarioBros-NES/b272-l12-b55-transfer-s6-20260703T171021Z-checkpoint:step-4500000 |
| Checkpoint step | 4500000 |
| Eval completion rate | 100/100 episodes (100.0%) |
| Eval mean reward | 3148.450 |
| Eval max x-position | 3129 |
| Training peak signal | train/info/level_complete/rate/min/last = 0.98 near global step 4741488-4745680 |
| W&B run | b272-l12-b55-transfer-s6-20260703T171021Z |
| YouTube preview | https://www.youtube.com/watch?v=emJ0NHXhUIg |

Quick Start

Install rlab once, import the ROM, then play or evaluate this checkpoint directly from Hugging Face:

uv tool install --from git+https://github.com/tsilva/rlab rlab
rlab import-roms ~/roms --game SuperMarioBros-Nes-v0
rlab play hf://tsilva/SuperMarioBros-NES_Level1-2
rlab eval hf://tsilva/SuperMarioBros-NES_Level1-2

For the original W&B artifact:

rlab play tsilva/SuperMarioBros-NES/b272-l12-b55-transfer-s6-20260703T171021Z-checkpoint:step-4500000 --policy-env fast
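
The artifact reference above follows the usual W&B `entity/project/name:alias` layout. As a rough sketch for scripts that need the individual parts, the snippet below splits such a reference and reads the step from the alias. The `ArtifactRef` class and both helper functions are hypothetical names introduced here for illustration and are not part of rlab or the W&B client. Treating the alias as `step-<number>` is an assumption based only on the reference shown in this card.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactRef:
    """Parts of a reference written as entity/project/name:alias."""
    entity: str
    project: str
    name: str
    alias: str


def parse_artifact_ref(ref: str) -> ArtifactRef:
    """Split 'entity/project/name:alias' into its four parts."""
    path, sep, alias = ref.rpartition(":")
    if not sep or not alias:
        raise ValueError(f"missing ':alias' in {ref!r}")
    parts = path.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected entity/project/name in {ref!r}")
    return ArtifactRef(parts[0], parts[1], parts[2], alias)


def checkpoint_step(alias: str) -> int:
    """Read the step from an alias like 'step-4500000' (assumed naming scheme)."""
    match = re.fullmatch(r"step-(\d+)", alias)
    if match is None:
        raise ValueError(f"alias {alias!r} does not look like 'step-<number>'")
    return int(match.group(1))


REF = (
    "tsilva/SuperMarioBros-NES/"
    "b272-l12-b55-transfer-s6-20260703T171021Z-checkpoint:step-4500000"
)
ref = parse_artifact_ref(REF)
print(ref.entity, ref.project, ref.name, checkpoint_step(ref.alias))
```

Splitting on the last colon keeps the parser tolerant of run names that contain timestamps or other punctuation, as long as the alias itself has no colon.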

Validate

This release was selected from the seed-6 training peak, then freshly evaluated during publication staging:

rlab eval tsilva/SuperMarioBros-NES/b272-l12-b55-transfer-s6-20260703T171021Z-checkpoint:step-4500000 --episodes 100 --deterministic

The preview video in replay.mp4 was generated from the best episode observed during the same deterministic evaluation pass.

Results

| Metric | Value |
| --- | --- |
| Completion rate | 100/100 (100.0%) |
| Mean reward | 3148.450 |
| Max x-position | 3129 |
| Best episode reward | 3148.450 |
| Best episode max x-position | 3129 |
| Checkpoint step | 4500000 |
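
For readers who want to reproduce this kind of summary from their own evaluation logs, here is a minimal sketch of how the metrics above could be aggregated from per-episode records. The `Episode` record and `summarize` function are hypothetical and do not reflect rlab's internal format. Taking "best episode" to mean the highest-reward episode is an assumption. The sample episodes are toy values, not data from this release.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Episode:
    """One evaluation episode (hypothetical record layout)."""
    reward: float
    max_x: int
    completed: bool


def summarize(episodes):
    """Aggregate per-episode records into summary metrics."""
    if not episodes:
        raise ValueError("need at least one episode")
    # Assumption: the "best" episode is the one with the highest reward.
    best = max(episodes, key=lambda e: e.reward)
    done = sum(1 for e in episodes if e.completed)
    return {
        "completed": done,
        "episodes": len(episodes),
        "completion_rate_pct": 100.0 * done / len(episodes),
        "mean_reward": mean(e.reward for e in episodes),
        "max_x": max(e.max_x for e in episodes),
        "best_reward": best.reward,
        "best_max_x": best.max_x,
    }


# Toy input for illustration only; these are not results from this checkpoint.
toy_episodes = [
    Episode(reward=120.0, max_x=800, completed=False),
    Episode(reward=300.0, max_x=3129, completed=True),
    Episode(reward=300.0, max_x=3129, completed=True),
    Episode(reward=280.0, max_x=3129, completed=True),
]
summary = summarize(toy_episodes)
print(summary)
```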

Files

| File | Description |
| --- | --- |
| model.zip | SB3 PPO checkpoint |
| replay.mp4 | Representative preview episode for the Hugging Face RL widget |
| model_metadata.json | Downloaded W&B artifact metadata plus publish-time training-peak note |
| release_manifest.json | Release provenance, eval metrics, and video verification inputs |

Environment Details

| Setting | Value |
| --- | --- |
| env_provider | supermariobrosnes-turbo |
| game | SuperMarioBros-Nes-v0 |
| state | Level1-2 |
| action_set | simple |
| frame_skip | 4 |
| max_pool_frames | False |
| hud_crop_top | 32 |
| observation_size | 84 |
| eval_done_on_events | level_change |
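
To make the preprocessing settings above more concrete, the sketch below works through the arithmetic they imply. It rests on several assumptions not confirmed by this card: that the raw emulator frame is 224 rows by 240 columns, that `hud_crop_top` removes that many pixel rows from the top of the frame, that `observation_size` gives a square resize target, and that `frame_skip` means each agent action is repeated for that many emulator frames. `EnvSettings` and both functions are hypothetical names, not rlab APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvSettings:
    """Values copied from the settings listed above."""
    game: str = "SuperMarioBros-Nes-v0"
    state: str = "Level1-2"
    action_set: str = "simple"
    frame_skip: int = 4
    max_pool_frames: bool = False
    hud_crop_top: int = 32
    observation_size: int = 84


def observation_shapes(cfg, raw_height=224, raw_width=240):
    """Return (cropped, final) shapes as (height, width) tuples.

    The raw frame size defaults are assumptions about the emulator output.
    """
    if not 0 <= cfg.hud_crop_top < raw_height:
        raise ValueError("hud_crop_top must be smaller than the frame height")
    # Assumed behaviour: drop the top rows (the HUD), then resize to a square.
    cropped = (raw_height - cfg.hud_crop_top, raw_width)
    final = (cfg.observation_size, cfg.observation_size)
    return cropped, final


def agent_steps(emulator_frames, cfg):
    """Number of full agent decisions that fit in a span of emulator frames."""
    return emulator_frames // cfg.frame_skip


cfg = EnvSettings()
cropped, final = observation_shapes(cfg)
print(cropped, final, agent_steps(60, cfg))
```

Under these assumptions the policy sees an 84x84 image built from the lower 192 rows of each frame, and makes one decision for every 4 emulator frames.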

Provenance

Limitations

This checkpoint was selected from a training metric peak and then evaluated for publication. Reported metrics are task-specific and should not be treated as cross-environment benchmark results.
