☁️🛡️ CloudSecurityAuditor: OpenEnv Environment
Complete Application Documentation
1. What Is This Application?
CloudSecurityAuditor is a standardized AI agent environment that simulates real-world cloud security auditing scenarios. It is built on the OpenEnv specification, an open standard for creating reproducible, programmable environments where AI agents can be trained, tested, and benchmarked.
Think of it as a virtual cybersecurity lab: instead of risking real cloud infrastructure, an AI agent (or a human) can interact with a mock cloud environment that contains intentional security vulnerabilities. The agent must discover, analyze, and remediate those vulnerabilities to earn a reward.
Who Is This For?
| Audience | Use Case |
|---|---|
| AI Researchers | Benchmark LLM-based security agents on structured tasks |
| Security Engineers | Practice cloud audit workflows in a safe sandbox |
| Students | Learn about S3 public buckets, EC2 security groups, and IAM log analysis |
| Hackathon Participants | Demonstrate agent-environment interaction for Meta/OpenEnv challenges |
2. Architecture Overview
┌─────────────────────────────────────────────────────┐
│                    BROWSER (UI)                     │
│  ┌──────────┐  ┌──────────────┐  ┌──────────────┐   │
│  │ Sidebar  │  │ Resource Grid│  │ Execution Log│   │
│  │ (Tasks)  │  │ (S3 / EC2)   │  │ (Terminal)   │   │
│  └──────────┘  └──────────────┘  └──────────────┘   │
└───────────────────────┬─────────────────────────────┘
                        │ HTTP (REST)
                        ▼
┌─────────────────────────────────────────────────────┐
│               FastAPI Server (app.py)               │
│  ┌─────────┐  ┌──────────┐  ┌───────────────────┐   │
│  │ /reset  │  │ /step    │  │ /state / /docs    │   │
│  └────┬────┘  └────┬─────┘  └───────────────────┘   │
│       │            │                                │
│       ▼            ▼                                │
│  ┌─────────────────────────────────────────────┐    │
│  │       CloudAuditEnv (environment.py)        │    │
│  │  ┌─────────┐  ┌────────┐  ┌──────────────┐  │    │
│  │  │ S3 Data │  │EC2 Data│  │ Auth Logs    │  │    │
│  │  └─────────┘  └────────┘  └──────────────┘  │    │
│  └─────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────┘
3. File Structure
scaler/
├── server/
│   ├── app.py                  # FastAPI entry point, static file serving
│   ├── environment.py          # Core environment logic (reset, step, state)
│   ├── models.py               # Pydantic/dataclass models (Action, Observation, State)
│   ├── tasks.py                # Task definitions (Easy, Medium, Hard)
│   └── static/
│       ├── index.html          # Dashboard UI layout
│       ├── index.css           # Dark-mode cybersecurity theme
│       └── app.js              # Frontend logic & API interaction
├── scripts/
│   └── baseline_inference.py   # Example agent that solves the Easy task
├── openenv.yaml                # OpenEnv specification file
├── requirements.txt            # Python dependencies
├── Dockerfile                  # Docker deployment configuration
└── README.md                   # Quick-start guide
4. The Environment Engine (environment.py)
The heart of the application is the CloudAuditEnv class. It implements three methods required by the OpenEnv spec:
reset(task_id) → Observation
- Reinitializes the mock infrastructure (S3 buckets, EC2 instances, auth logs).
- Sets the active task (easy, medium, or hard).
- Returns an initial observation with status info.
step(action) → Observation
- Accepts a `CloudAction` and executes it against the mock infrastructure.
- Returns an updated `CloudObservation` containing discovered resources, details, logs, and a reward signal.
- Automatically terminates the episode after 20 steps (truncation).
state() → CloudState
- Returns internal metadata: episode ID, step count, task ID, completion status, and cumulative score.
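The call pattern described above can be sketched with a toy stand-in. Everything here is an assumption made for illustration: `ToyAuditEnv`, `ToyObservation`, and `ToyState` are hypothetical names, the real `CloudAuditEnv` in `environment.py` may be structured differently, and treating the 20th step as the truncating one is only one reading of the 20-step limit.

```python
import uuid
from dataclasses import dataclass
from typing import Optional

MAX_STEPS = 20  # truncation limit stated in the documentation


@dataclass
class ToyObservation:
    info: str = ""
    reward: float = 0.0
    done: bool = False


@dataclass
class ToyState:
    episode_id: str
    step_count: int = 0
    task_id: Optional[str] = None
    is_completed: bool = False
    score: float = 0.0


class ToyAuditEnv:
    """Hypothetical stand-in for CloudAuditEnv; only the call pattern is shown."""

    def __init__(self) -> None:
        self._state = ToyState(episode_id=str(uuid.uuid4()))

    def reset(self, task_id: str) -> ToyObservation:
        # A fresh episode: new ID, zeroed counters, selected task.
        self._state = ToyState(episode_id=str(uuid.uuid4()), task_id=task_id)
        return ToyObservation(info=f"Environment reset. Task: {task_id}")

    def step(self, action: dict) -> ToyObservation:
        # Count the step and flag truncation once the limit is reached.
        self._state.step_count += 1
        truncated = self._state.step_count >= MAX_STEPS
        return ToyObservation(info=f"Executed {action.get('action')}", done=truncated)

    def state(self) -> ToyState:
        return self._state
```

An agent loop would call `reset("easy")` once, then `step(...)` until the returned observation has `done` set, reading `state()` whenever it needs the step count or score.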
5. Mock Infrastructure
The environment simulates the following cloud resources:
S3 Buckets (3 total)
| ID | Region | Public? | Environment |
|---|---|---|---|
| prod-data-001 | us-east-1 | ✅ Yes | prod |
| prod-logs-002 | us-east-1 | ❌ No | prod |
| dev-test-01 | us-west-2 | ✅ Yes | dev |
EC2 Instances (2 total)
| ID | Type | State | Environment | Open Ports |
|---|---|---|---|---|
| i-0abcdef1234567890 | t2.micro | running | dev | 22 (SSH), 3389 (RDP) ⚠️ |
| i-0987654321fedcba0 | m5.large | running | prod | 443 (HTTPS) |
Auth Logs (auth-logs)
| Timestamp | User | Action | IP |
|---|---|---|---|
| 2026-04-05T10:00:00Z | admin | Login | 1.1.1.1 |
| 2026-04-05T10:15:00Z | iam-role-01 | DeleteStorage ⚠️ | 192.168.1.50 |
| 2026-04-05T10:30:00Z | user-02 | ListBuckets | 2.2.2.2 |
6. Action Space
The agent interacts with the environment using a CloudAction object. Available action types:
| Action | Parameters | Description |
|---|---|---|
| `list` | `resource_type` (s3, ec2) | Lists all resources of a given type |
| `describe` | `resource_id` | Returns full details for a specific resource |
| `modify` | `resource_id`, `patch` | Updates resource configuration (e.g., security group rules) |
| `logs` | `resource_id` (e.g., auth-logs) | Fetches log entries for a service |
| `submit` | `answer` | Submits the final answer for grading |
Example Actions (via Dashboard or API)
# List all S3 buckets
list s3
# Describe an EC2 instance
describe i-0abcdef1234567890
# Fetch authentication logs
logs auth-logs
# Submit an answer for Easy task
submit prod-data-001
# Submit an answer for Hard task
submit 192.168.1.50
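To show how the text commands above could map onto the `/step` request body, here is a minimal sketch. `parse_command` is a hypothetical helper, not the dashboard's actual parser (that lives in `app.js` and may behave differently); the field names are taken from the action table, and `modify` is left out because its `patch` argument needs a JSON payload.

```python
from typing import Any, Dict


def parse_command(command: str) -> Dict[str, Any]:
    """Turn a dashboard-style command into a /step request body (illustrative)."""
    parts = command.strip().split(maxsplit=1)
    if not parts:
        raise ValueError("empty command")
    verb = parts[0].lower()
    arg = parts[1].strip() if len(parts) > 1 else ""
    if not arg:
        raise ValueError(f"'{verb}' needs an argument")

    if verb == "list":
        action = {"action": "list", "resource_type": arg}
    elif verb in ("describe", "logs"):
        action = {"action": verb, "resource_id": arg}
    elif verb == "submit":
        action = {"action": "submit", "answer": arg}
    else:
        raise ValueError(f"unsupported command: {verb}")
    # The API wraps the action under an outer "action" key.
    return {"action": action}


print(parse_command("list s3"))
print(parse_command("submit prod-data-001"))
```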
7. Observation Space
Every step() and reset() returns a CloudObservation:
| Field | Type | Description |
|---|---|---|
| `resources` | `List[Dict]` | List of discovered resource records |
| `details` | `Dict` | Full metadata for a single described resource |
| `logs` | `List[Dict]` | Log entries (timestamp, user, action, IP) |
| `status` | `str` | Human-readable status message |
| `info` | `str` | Additional context (e.g., grading feedback) |
| `reward` | `float` | Scalar reward (0.0 to 1.0) |
| `done` | `bool` | Whether the episode has ended |
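The fields in this table could be modelled as a dataclass along the following lines. This is a sketch only: `CloudObservationSketch` is a hypothetical name, and the `None` defaults are an assumption based on the `null` values in the `/reset` response; the actual definition in `models.py` may differ.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional


@dataclass
class CloudObservationSketch:
    """Illustrative mirror of the observation table; not the project's model."""

    resources: Optional[List[Dict[str, Any]]] = None  # set by list actions
    details: Optional[Dict[str, Any]] = None          # set by describe actions
    logs: Optional[List[Dict[str, Any]]] = None       # set by logs actions
    status: Optional[str] = None
    info: Optional[str] = None
    reward: float = 0.0
    done: bool = False


obs = CloudObservationSketch(info="Environment reset. Task: easy")
print(asdict(obs))
```

Keeping every payload field optional lets one type serve all action kinds, at the cost of callers having to check which field was populated.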
8. Tasks & Grading
Task 1 (Easy): S3 Public Audit
Goal: Identify all S3 buckets that are both public: true AND tagged env: prod.
| Step | Action | Expected Result |
|---|---|---|
| 1 | `list s3` | Returns 3 buckets |
| 2 | Filter for public + prod | prod-data-001 |
| 3 | `submit prod-data-001` | Reward: 1.0 ✅ |
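The filtering step in this walkthrough can be sketched in a few lines. The bucket records are copied from the sample `/step` response in the API reference; `public_prod_buckets` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Bucket records as shown in the sample /step response.
buckets = [
    {"id": "prod-data-001", "region": "us-east-1", "public": True, "tags": {"env": "prod"}},
    {"id": "prod-logs-002", "region": "us-east-1", "public": False, "tags": {"env": "prod"}},
    {"id": "dev-test-01", "region": "us-west-2", "public": True, "tags": {"env": "dev"}},
]


def public_prod_buckets(records: List[Dict[str, Any]]) -> List[str]:
    """Return IDs of buckets that are public and tagged env=prod."""
    return [
        r["id"]
        for r in records
        if r.get("public") is True and r.get("tags", {}).get("env") == "prod"
    ]


print(public_prod_buckets(buckets))  # ['prod-data-001']
```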
Task 2 (Medium): EC2 Security Patch
Goal: Find EC2 instance i-0abcdef1234567890, which has port 3389 (RDP) open to 0.0.0.0/0, and close that port by modifying the security group so that only port 22 is allowed.
| Step | Action | Expected Result |
|---|---|---|
| 1 | `list ec2` | Returns 2 instances |
| 2 | `describe i-0abcdef1234567890` | Shows RDP port open |
| 3 | `modify i-0abcdef1234567890` with patch `{"rules": [{"port": 22, "cidr": "0.0.0.0/0"}]}` | Reward: 1.0 ✅ |
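The effect of the patch in step 3 can be illustrated as follows. The patch itself is taken from the table above, but the shape of the instance record (a top-level `rules` list) and the helpers `apply_patch` and `world_open_ports` are assumptions for this sketch; the real environment may store and validate security group rules differently.

```python
from copy import deepcopy
from typing import Any, Dict, List

# Hypothetical record shape; the real describe output may use other keys.
instance = {
    "id": "i-0abcdef1234567890",
    "rules": [
        {"port": 22, "cidr": "0.0.0.0/0"},
        {"port": 3389, "cidr": "0.0.0.0/0"},
    ],
}


def apply_patch(resource: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Shallow-merge a patch into a copy of the resource."""
    updated = deepcopy(resource)
    updated.update(deepcopy(patch))
    return updated


def world_open_ports(resource: Dict[str, Any]) -> List[int]:
    """Ports reachable from any address (0.0.0.0/0)."""
    return sorted(
        rule["port"]
        for rule in resource.get("rules", [])
        if rule.get("cidr") == "0.0.0.0/0"
    )


patch = {"rules": [{"port": 22, "cidr": "0.0.0.0/0"}]}
patched = apply_patch(instance, patch)
print(world_open_ports(instance))  # [22, 3389]
print(world_open_ports(patched))   # [22]
```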
Task 3 (Hard): IAM Log Forensic
Goal: A rogue IAM role (iam-role-01) has performed unauthorized actions. Analyze the auth-logs to identify the IP address that performed DeleteStorage.
| Step | Action | Expected Result |
|---|---|---|
| 1 | `logs auth-logs` | Returns 3 log entries |
| 2 | Find DeleteStorage action | IP: 192.168.1.50 |
| 3 | `submit 192.168.1.50` | Reward: 1.0 ✅ |
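Step 2 of this walkthrough amounts to a filter over the log entries. The sketch below uses the three entries from the Auth Logs table; the lowercase key names (`timestamp`, `user`, `action`, `ip`) are assumptions, since the exact keys returned by the environment are not shown here, and `ips_for_action` is a hypothetical helper.

```python
from typing import Dict, List

# Entries from the Auth Logs table; key names are assumed.
auth_logs = [
    {"timestamp": "2026-04-05T10:00:00Z", "user": "admin", "action": "Login", "ip": "1.1.1.1"},
    {"timestamp": "2026-04-05T10:15:00Z", "user": "iam-role-01", "action": "DeleteStorage", "ip": "192.168.1.50"},
    {"timestamp": "2026-04-05T10:30:00Z", "user": "user-02", "action": "ListBuckets", "ip": "2.2.2.2"},
]


def ips_for_action(logs: List[Dict[str, str]], action: str) -> List[str]:
    """Return the source IPs of every log entry with the given action."""
    return [entry["ip"] for entry in logs if entry.get("action") == action]


print(ips_for_action(auth_logs, "DeleteStorage"))  # ['192.168.1.50']
```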
9. API Reference
Base URL: http://localhost:7860
POST /reset
Reset the environment to a specific task.
Request:
{ "task_id": "easy" }
Response:
{
  "observation": {
    "resources": null,
    "details": null,
    "status": null,
    "logs": null,
    "info": "Environment reset. Task: easy"
  },
  "reward": 0.0,
  "done": false
}
POST /step
Execute an action in the environment.
Request:
{
  "action": {
    "action": "list",
    "resource_type": "s3"
  }
}
Response:
{
  "observation": {
    "resources": [
      { "id": "prod-data-001", "region": "us-east-1", "public": true, "tags": { "env": "prod" } },
      { "id": "prod-logs-002", "region": "us-east-1", "public": false, "tags": { "env": "prod" } },
      { "id": "dev-test-01", "region": "us-west-2", "public": true, "tags": { "env": "dev" } }
    ],
    "status": "Listed 3 s3 resources."
  },
  "reward": 0.0,
  "done": false
}
GET /state
Get internal environment state.
Response:
{
  "episode_id": "a1b2c3d4-...",
  "step_count": 3,
  "task_id": "easy",
  "is_completed": false,
  "score": 0.0
}
GET /docs
Interactive Swagger UI for API exploration.
GET /
Dashboard UI (the web interface).
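A small client for these endpoints might look like the sketch below, which uses only the Python standard library. `build_post`, `send`, and `list_s3_after_reset` are hypothetical helper names; the URLs and request bodies follow the examples above, and the server is assumed to be running locally on port 7860.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://localhost:7860"


def build_post(path: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Build a JSON POST request for the given endpoint path."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def list_s3_after_reset() -> Dict[str, Any]:
    """Reset to the easy task, then list S3 buckets (needs a running server)."""
    send(build_post("/reset", {"task_id": "easy"}))
    return send(build_post("/step", {"action": {"action": "list", "resource_type": "s3"}}))
```

With the server running, `list_s3_after_reset()` should return a response shaped like the `/step` example above, with the bucket records under `observation.resources`.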
10. Dashboard UI
The application includes a dark-mode cybersecurity dashboard, served at http://localhost:7860.
Features
- Sidebar Task Selector: Switch between Easy, Medium, and Hard challenges with one click.
- Infrastructure Overview: Visual resource cards for S3 buckets and EC2 instances. Vulnerable resources are highlighted with red borders and blinking status dots.
- Execution Log: Terminal-style console showing timestamped action logs with color-coded entries (blue for actions, green for system, yellow for rewards, red for errors).
- Manual Command Input: Type commands like `list s3`, `describe i-0abcdef1234567890`, `logs auth-logs`, or `submit prod-data-001` directly in the dashboard.
- Live Stats HUD: Displays current task name, cumulative reward, and environment status (Active/Completed).
Design
- Theme: Cyber-noir dark mode with deep navy background (#0a0e14)
- Accents: Neon cyan (#00f5ff) for primary elements
- Typography: Inter (body), Outfit (headings), JetBrains Mono (code/logs)
- Effects: Glassmorphism panels, fade-in card animations, pulsing vulnerability indicators
11. Running the Application
Local Development
# Install dependencies
pip install -r requirements.txt
# Start the server
python -m server.app
# Open in browser
open http://localhost:7860
Running the Baseline Agent
# Solves the Easy task automatically
python scripts/baseline_inference.py
Docker Deployment
# Build the image
docker build -t cloud-security-auditor .
# Run the container
docker run -p 7860:7860 cloud-security-auditor
Hugging Face Spaces Deployment
- Create a new Space on Hugging Face.
- Select Docker as the SDK.
- Upload the repository contents (including `openenv.yaml` and `Dockerfile`).
- The entrypoint is automatically set via `openenv.yaml`.
12. Technology Stack
| Component | Technology |
|---|---|
| Backend | Python 3.10, FastAPI, Uvicorn |
| Environment | openenv-core ≥ 0.1.1 |
| Data Models | Python dataclasses |
| Frontend | Vanilla HTML/CSS/JS |
| Fonts | Google Fonts (Inter, Outfit, JetBrains Mono) |
| Deployment | Docker, Hugging Face Spaces |
13. OpenEnv Specification (openenv.yaml)
name: cloud-security-auditor
version: "0.2.0"
description: "A real-world cloud security audit environment for AI agents."
hardware:
  tier: "cpu-small"
  vCPU: 2
  RAM: 4Gi
port: 7860
entrypoint: "uvicorn server.app:app --host 0.0.0.0 --port 7860"
tags:
  - security
  - cloud
  - task-based
evaluation:
  tasks:
    - id: "easy"
      name: "S3 Public Audit"
      difficulty: "easy"
    - id: "medium"
      name: "EC2 Security Patch"
      difficulty: "medium"
    - id: "hard"
      name: "IAM Log Forensic"
      difficulty: "hard"
14. Extending the Environment
Adding a New Task
- Add the task definition to `server/tasks.py`.
- Add the corresponding mock data to `_initialize_state()` in `environment.py`.
- Add the grading logic to the `step()` method under `CloudActionType.SUBMIT`.
- Add a new task button to `index.html` in the sidebar.
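For answer-based tasks, the grading step could be as simple as a lookup table, so that a new task only needs a new entry. This is a hedged sketch: `EXPECTED_ANSWERS` and `grade_submission` are hypothetical names, the expected answers are the ones given in the Easy and Hard task walkthroughs, and the real logic inside `step()` may be organised differently. The Medium task is omitted because it is completed through `modify` rather than `submit`.

```python
from typing import Dict

# Expected answers from the Easy and Hard task walkthroughs.
EXPECTED_ANSWERS: Dict[str, str] = {
    "easy": "prod-data-001",
    "hard": "192.168.1.50",
}


def grade_submission(task_id: str, answer: str) -> float:
    """Return 1.0 for an exact match with the task's expected answer, else 0.0."""
    expected = EXPECTED_ANSWERS.get(task_id)
    if expected is None:
        return 0.0  # unknown task, or a task not graded via submit
    return 1.0 if answer.strip() == expected else 0.0


print(grade_submission("easy", "prod-data-001"))  # 1.0
print(grade_submission("hard", "1.1.1.1"))        # 0.0
```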
Adding a New Resource Type
- Add the resource data to `self.resources` in `environment.py`.
- Add a handler for `CloudActionType.LIST` and `CloudActionType.DESCRIBE` for the new type.
- Update `detectResourceType()` in `app.js` to render the correct card icon/label.
Built for the Meta Hackathon / OpenEnv Challenge • April 2026