Grid World Environment
This directory contains the implementation of a simple 5x5 Grid World environment, designed to serve two primary purposes within the OpenEnv ecosystem:
- A basic Reinforcement Learning (RL) testbed: Providing a straightforward, deterministic environment for quick prototyping and testing of RL agents.
- A detailed "How-To" guide for building new OpenEnv environments: Demonstrating the architectural patterns, best practices, and core components required to integrate a custom environment into the OpenEnv framework.
Environment Overview
The Grid World environment features:
- Grid Size: A 5x5 square grid.
- Agent: Starts at position (0,0) (top-left).
- Goal: Fixed at (4,4) (bottom-right).
- Actions: UP, DOWN, LEFT, RIGHT.
- Dynamics: Deterministic. An action always moves the agent one step in the chosen direction, unless it would move off the grid, in which case the agent stays in its current cell.
- Reward Function (Sparse):
  - -0.1 for every step that does not reach the goal (a "living cost" or "step penalty").
  - +1.0 for the step that reaches the goal at (4,4). This also terminates the episode.
- Episode Termination: The episode ends when the agent reaches the goal.
Example Gameplay
Imagine the agent trying to find the goal:
- Reset: Agent at (0,0) → Obs(x=0, y=0, reward=0.0, done=False)
- Step DOWN: Agent moves to (1,0) → Obs(x=1, y=0, reward=-0.1, done=False)
- Step RIGHT: Agent moves to (1,1) → Obs(x=1, y=1, reward=-0.1, done=False)
- …
- Step RIGHT (from 4,3): Agent moves to (4,4) → Obs(x=4, y=4, reward=1.0, done=True)
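The rules and the trace above can be sketched as a pure transition function plus a short rollout. This is a minimal illustration, not the repository's code: the names step, rollout, and DELTAS are hypothetical. The offsets for DOWN and RIGHT follow the trace (DOWN increases x, RIGHT increases y), while UP and LEFT are assumed to be their opposites.

```python
from typing import List, Tuple

GRID_SIZE = 5
GOAL = (4, 4)

# (dx, dy) offsets. DOWN and RIGHT follow the example trace; UP and LEFT
# are assumed to be their opposites.
DELTAS = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}


def step(pos: Tuple[int, int], action: str) -> Tuple[Tuple[int, int], float, bool]:
    """Apply one move and return (new_position, reward, done)."""
    dx, dy = DELTAS[action]
    # Clamping means a move off the grid leaves the agent where it is.
    x = max(0, min(pos[0] + dx, GRID_SIZE - 1))
    y = max(0, min(pos[1] + dy, GRID_SIZE - 1))
    done = (x, y) == GOAL
    return (x, y), (1.0 if done else -0.1), done


def rollout(actions: List[str]) -> Tuple[Tuple[int, int], float, bool]:
    """Run actions from (0, 0); return final position, total reward, done."""
    pos, total, done = (0, 0), 0.0, False
    for action in actions:
        pos, reward, done = step(pos, action)
        total += reward
        if done:
            break  # the episode ends at the goal
    return pos, total, done


if __name__ == "__main__":
    # One shortest path: four moves DOWN, then four moves RIGHT (8 steps).
    pos, total, done = rollout(["DOWN"] * 4 + ["RIGHT"] * 4)
    print(pos, round(total, 2), done)
```

Any shortest path takes 8 moves, so its undiscounted return is 7 × (-0.1) + 1.0 = 0.3. Every extra or wasted step lowers that by 0.1.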
How to Build an OpenEnv Environment: A Detailed Guide
This section explains the structure and key design choices of the Grid World environment.
1. Scaffolding and Configuration
This environment supports multi-mode deployment. It uses pyproject.toml for modern local development (via uv) and a Dockerfile for containerized deployment.
Directory Structure
    envs/grid_world_env
    ├── server/
    │   ├── __init__.py                # Package initializer for the server side
    │   ├── app.py                     # The FastAPI application entry point
    │   ├── Dockerfile                 # Container definition (uses requirements.txt)
    │   ├── grid_world_environment.py  # The core environment logic
    │   └── requirements.txt           # Dependencies for the Docker build
    ├── __init__.py                    # Package initializer for the client side
    ├── client.py                      # Python client for interacting with the env server
    ├── models.py                      # Pydantic data structures (Action, Observation)
    ├── openenv.yaml                   # OpenEnv metadata
    ├── pyproject.toml                 # Project configuration for local dev (uv)
    ├── uv.lock                        # Exact dependency versions (generated by uv)
    ├── README.md
    └── test_grid_world.sh             # Integration test script (Docker based)
Core Components Explained
This section dives into the specific code files that power the Grid World, explaining how the OpenEnv framework connects the data, logic, and server layers.
1. models.py: The Data Contract
This file defines the strict "language" used for communication between the Client (RL Agent) and the Server. It relies on Pydantic to enforce type safety.
Key Components
- MoveAction(str, Enum)
  Defines the allowed vocabulary for movement: UP, DOWN, LEFT, RIGHT.
  Using an Enum prevents magic string errors (e.g., sending "up" instead of "UP").
- GridWorldAction(Action)
  Wraps the movement enum in a standardized OpenEnv action structure.
  When the server receives a request, FastAPI automatically validates that the incoming JSON payload matches this schema.
- GridWorldObservation(Observation)
  Defines exactly what the agent observes from the environment:
  - x, y: Integer coordinates representing the agent's position
  - reward: Floating-point value (e.g., -0.1, 1.0)
  - done: Boolean flag indicating episode termination
Note:
By inheriting from pydantic.BaseModel (via Observation), these classes automatically handle JSON serialization and deserialization.
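The data contract can be illustrated without any dependencies. In the sketch below, standard-library dataclasses stand in for the Pydantic-based OpenEnv Action and Observation base classes, so this is not the actual models.py. The field name action and the helpers from_json and to_json are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class MoveAction(str, Enum):
    """Allowed movement vocabulary; any other string is rejected."""
    UP = "UP"
    DOWN = "DOWN"
    LEFT = "LEFT"
    RIGHT = "RIGHT"


@dataclass
class GridWorldAction:
    """Stand-in for the OpenEnv action wrapper (a plain dataclass here)."""
    action: MoveAction  # field name is an assumption

    @classmethod
    def from_json(cls, payload: str) -> "GridWorldAction":
        # MoveAction(...) raises ValueError for values outside the enum,
        # which is how a lowercase "up" gets caught.
        return cls(action=MoveAction(json.loads(payload)["action"]))


@dataclass
class GridWorldObservation:
    """Stand-in for the observation returned to the agent."""
    x: int
    y: int
    reward: float
    done: bool

    def to_json(self) -> str:
        return json.dumps(asdict(self))


if __name__ == "__main__":
    print(GridWorldAction.from_json('{"action": "UP"}'))
    try:
        GridWorldAction.from_json('{"action": "up"}')
    except ValueError as err:
        print("rejected:", err)
    print(GridWorldObservation(x=1, y=0, reward=-0.1, done=False).to_json())
```

With Pydantic, the parsing and serialization helpers written by hand here come from the base model, which is the convenience the note above refers to.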
2. server/grid_world_environment.py: The Logic
This file contains the "physics engine" and rules of the environment. It translates abstract actions into concrete state transitions.
Core Responsibilities
- Inheritance
  GridWorldEnvironment inherits from openenv.core.env_server.Environment, providing the standardized interface required by the OpenEnv server.
- __init__ Method
  - Sets static configuration:
    - Grid size: 5 × 5
    - Goal location: [4, 4]
  - Initializes the persistent state container.
- State Persistence (self._state)
  - HTTP requests are stateless, so the environment instance must remember the agent's position between calls.
  - self._state (an instance of openenv...State) tracks:
    - step_count
    - episode_id
    - agent_x, agent_y
- step() Logic
  - Input: Receives a validated GridWorldAction
  - Dynamics: Applies movement rules and clamps coordinates using max(0, min(..., grid_size - 1)) to prevent the agent from leaving the grid
  - Feedback: Computes a sparse reward:
    - 1.0 if (x, y) == goal
    - -0.1 otherwise
  - Returns a GridWorldObservation
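The responsibilities listed above can be sketched as a small stateful class. This is an illustrative stand-in, not the repository's GridWorldEnvironment. State, Obs, and GridWorldSketch are hypothetical names, plain dataclasses replace the OpenEnv base classes, and the direction offsets are assumptions inferred from the example gameplay.

```python
import uuid
from dataclasses import dataclass


@dataclass
class State:
    """Stand-in for the OpenEnv state container described above."""
    episode_id: str = ""
    step_count: int = 0
    agent_x: int = 0
    agent_y: int = 0


@dataclass
class Obs:
    """Stand-in for GridWorldObservation."""
    x: int
    y: int
    reward: float
    done: bool


class GridWorldSketch:
    """Keeps the agent position between calls, since HTTP requests do not."""

    # Assumed offsets: DOWN increases x and RIGHT increases y.
    _DELTAS = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}

    def __init__(self) -> None:
        self.grid_size = 5
        self.goal = (4, 4)
        self._state = State()

    def reset(self) -> Obs:
        # A fresh episode gets a new id, a zeroed counter, and position (0, 0).
        self._state = State(episode_id=str(uuid.uuid4()))
        return Obs(x=0, y=0, reward=0.0, done=False)

    def step(self, action: str) -> Obs:
        dx, dy = self._DELTAS[action]
        s = self._state
        # Clamp both coordinates so the agent cannot leave the grid.
        s.agent_x = max(0, min(s.agent_x + dx, self.grid_size - 1))
        s.agent_y = max(0, min(s.agent_y + dy, self.grid_size - 1))
        s.step_count += 1
        done = (s.agent_x, s.agent_y) == self.goal
        return Obs(s.agent_x, s.agent_y, 1.0 if done else -0.1, done)


if __name__ == "__main__":
    env = GridWorldSketch()
    print(env.reset())
    print(env.step("DOWN"))
    print(env.step("RIGHT"))
```

Keeping all mutable data in one _state object makes reset a single assignment and leaves nothing from the previous episode behind.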
3. server/app.py: The API
This file is the "glue" that turns the environment logic into a running web service.
Key Elements
- create_app Utility
  Instead of manually defining FastAPI routes, this file uses openenv.core.env_server.create_app. It:
  - Binds the environment logic (GridWorldEnvironment)
  - Connects the data models (GridWorldAction, GridWorldObservation)
  - Automatically generates standard endpoints: /reset, /step, /state, /health
- main() Entry Point
  Defines a main() function that calls uvicorn.run.
  This is what enables the server = "..." script in pyproject.toml to start the server.
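To make the idea of binding one environment instance to the four standard endpoints concrete, here is a conceptual stand-in that uses only the standard library. It does not show how create_app is implemented or what its signature is. CounterEnv, bind_routes, dispatch, and the response shapes (including the /health body) are all hypothetical.

```python
import json
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], Dict[str, Any]]


class CounterEnv:
    """Tiny placeholder environment (hypothetical) so the sketch is runnable."""

    def __init__(self) -> None:
        self.steps = 0

    def reset(self) -> Dict[str, Any]:
        self.steps = 0
        return {"steps": self.steps, "done": False}

    def step(self, action: Dict[str, Any]) -> Dict[str, Any]:
        self.steps += 1
        return {"steps": self.steps, "echo": action, "done": False}

    def state(self) -> Dict[str, Any]:
        return {"steps": self.steps}


def bind_routes(env: CounterEnv) -> Dict[str, Handler]:
    """Map the four standard endpoint paths to handlers bound to one env."""
    return {
        "/reset": lambda body: env.reset(),
        "/step": lambda body: env.step(body),
        "/state": lambda body: env.state(),
        "/health": lambda body: {"status": "ok"},  # assumed response shape
    }


def dispatch(routes: Dict[str, Handler], path: str, raw_body: str = "{}") -> str:
    """Decode a JSON body, call the handler for the path, encode the reply."""
    if path not in routes:
        return json.dumps({"error": "unknown path: " + path})
    return json.dumps(routes[path](json.loads(raw_body)))


if __name__ == "__main__":
    routes = bind_routes(CounterEnv())
    print(dispatch(routes, "/reset"))
    print(dispatch(routes, "/step", '{"action": "UP"}'))
    print(dispatch(routes, "/state"))
```

Because every handler closes over the same env object, state set by one request is visible to the next, which is the property the real server needs as well.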
4. server/Dockerfile: The Container
This file defines how the environment is packaged for production or remote deployment.
Container Setup
- Base Image
  Builds on envtorch-base, ensuring compatible system libraries.
- Dependencies
  Copies and installs server/requirements.txt.
  This keeps the Docker image lightweight and focused only on server-side requirements.
- Execution
  - Exposes port 8000
  - Defines the CMD to launch uvicorn
  The container is ready to accept HTTP requests immediately upon startup.
5. pyproject.toml: Local Development
This file enables a modern local development workflow using uv.
Key Sections
- Project Metadata
  - Package name: grid_world_env
  - Version information
- Dependencies
  Lists libraries required for local execution:
  - fastapi
  - uvicorn
  - gymnasium
  - numpy
- [project.scripts]
  Defines a shortcut command: server = "grid_world_env.server.app:main"
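A script entry such as server = "grid_world_env.server.app:main" names a module and a callable, separated by a colon. The sketch below shows roughly how such a string is resolved to an object. resolve_entry_point is a hypothetical helper, and json:dumps stands in for the project's own target, which is only importable inside the project.

```python
import importlib
from typing import Any


def resolve_entry_point(spec: str) -> Any:
    """Resolve a 'package.module:attribute' string to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    obj: Any = importlib.import_module(module_name)
    if attr_path:
        # The attribute part may itself be dotted, e.g. "Class.method".
        for name in attr_path.split("."):
            obj = getattr(obj, name)
    return obj


if __name__ == "__main__":
    # Stand-in target; the real entry would be "grid_world_env.server.app:main".
    dumps = resolve_entry_point("json:dumps")
    print(dumps({"ok": True}))
```

Running uv run server therefore amounts to importing grid_world_env.server.app and calling its main() function.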
Getting Started
You can run the environment using uv (fastest for development) or Docker (best for deployment).
Option 1: Local Development with uv (Recommended)
Since this project is configured with pyproject.toml, you can start the server with a single uv command.
Steps
- Navigate to the environment folder and start the server:
    cd envs/grid_world_env
    uv run server
- Visit the live Swagger UI in your browser:
    http://localhost:8000/docs
Option 2: Docker Integration Test
To build the full container and run the integration test suite (simulating a production deployment):
Steps
- Navigate to the root OpenEnv directory.
- Run the test script:
    ./envs/grid_world_env/test_grid_world.sh
The script:
- Builds the Docker image
- Starts the container
- Runs a series of curl requests to verify functionality
- Cleans up containers and images after completion
Conclusion
This Grid World environment serves as the reference implementation for building environments in OpenEnv. By following this pattern, custom environments remain:
- Portable across local and containerized setups
- Strictly typed through Pydantic models