PPO agent for MiniGrid fill_20

This is a PPO agent trained on the MiniGrid fill_20 environment with the Stable Baselines3 library.

Model Details

Environment: MiniGrid fill_20
Algorithm: PPO
Seed: 0
Framework: Stable Baselines3
Repository: ctrlp-zoo

Usage

from stable_baselines3 import PPO
import gymnasium as gym

# Load the trained model
model = PPO.load("best_model.zip")

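# Create the environment (the ID must already be registered with Gymnasium)
env = gym.make("MiniGrid-fill_20")

# Run the trained agent for 1000 steps, resetting at the end of each episode
obs, info = env.reset()
for _ in range(1000):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()

# Release environment resources
env.close()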

Training Configuration

env:
  act:
    key:
    - LEFT
    - RIGHT
    - UP
    - DOWN
    - SPACE
    movement:
    - L
    - R
    - U
    - D
    - TOGGLE
  model:
    load:
      dt: 1.0e-05
      power: 1
      sigma: 5.0e-05
      speed: 5
      type: point
    mesh:
      length:
      - 0.001
      - 0.001
      n_elements:
      - 20
      - 20
    state:
      melt_temp:
        expr: melt_temp
        init: 0.99
        type: parameter
      phase:
        expr: (temp > melt_temp) | phase
        init: false
        type: derived
      temp:
        expr: temp
        init: 0.0
        type: primary
  obs:
    state:
    - phase
    - load
    - mask

Files Included

best_model.zip: The trained model checkpoint
vecnormalize.pkl: VecNormalize running statistics for observation and reward normalization (if applicable)
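
If `vecnormalize.pkl` is present, the agent was presumably trained on normalized observations, so evaluation would need the same statistics. The following is a minimal sketch of restoring them with Stable Baselines3's `VecNormalize.load`. The helper name `load_normalized_env` and the `make_env` factory argument are hypothetical; how the fill_20 environment is constructed is left to the caller.

```python
import os


def load_normalized_env(make_env, stats_path="vecnormalize.pkl"):
    """Vectorize an environment and restore VecNormalize statistics if present.

    `make_env` is a zero-argument callable returning a Gymnasium environment
    (hypothetical; supplied by the caller).
    """
    # Imported here so the helper can be defined without Stable Baselines3 installed.
    from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

    vec_env = DummyVecEnv([make_env])
    if os.path.isfile(stats_path):
        vec_env = VecNormalize.load(stats_path, vec_env)
        vec_env.training = False     # freeze running statistics during evaluation
        vec_env.norm_reward = False  # report unnormalized rewards
    return vec_env
```

Note that a vectorized environment uses a different loop from the one in Usage: `reset()` returns only the observations, and `step()` returns `(obs, rewards, dones, infos)` and resets finished episodes automatically.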

Citation

If you use this model in your research, please cite:

@misc{ctrlp-zoo,
  author = {Schmeitz, R.},
  title = {CTRL-P Zoo: Reinforcement Learning Model Repository},
  year = {2026},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/RSchmeitz/ctrlp-zoo}}
}

