Prepare HF Space submission validation and compliance.
Files changed:
- Dockerfile +23 -0
- README.md +117 -1
- __init__.py +25 -0
- client.py +107 -0
- env.py +67 -0
- inference.py +163 -0
- models.py +113 -0
- openenv.yaml +19 -0
- pyproject.toml +36 -0
- scripts/pre_submit_validate.sh +365 -0
- scripts/validate-submission.sh +185 -0
- server/Dockerfile +80 -0
- server/__init__.py +11 -0
- server/app.py +101 -0
- server/cloud_devops_env_environment.py +384 -0
- server/requirements.txt +6 -0
Dockerfile
ADDED

```dockerfile
# Use a lightweight, stable Python image
FROM python:3.10-slim

# Set working directory
WORKDIR /app

# Copy project files
COPY pyproject.toml .
COPY openenv.yaml .
COPY models.py .
COPY env.py .
COPY __init__.py .
COPY client.py .
COPY server ./server

# Install dependencies (no-cache to save space)
RUN pip install --no-cache-dir .

# Expose the standard OpenEnv port
EXPOSE 8000

# Start the FastAPI/OpenEnv app directly (openenv serve is not implemented in v0.2.3)
CMD ["uvicorn", "server.app:app", "--host", "0.0.0.0", "--port", "8000"]
```
README.md
CHANGED

````markdown
short_description: Cloud SRE/DevOps RL environment
---

# Cloud DevOps RLEnv

Cloud DevOps RLEnv is an OpenEnv-compatible environment for training and evaluating agents on realistic cloud SRE and DevOps incident-response tasks.

## Environment Description And Motivation

Production incidents are often multi-step: triage, inspect resources, check logs, apply a safe remediation, and then verify the fix. This environment simulates that loop with deterministic scenarios and shaped rewards.

Goals:
- Benchmark planning and tool-use behavior for cloud operations agents.
- Reward correct diagnosis over blind action execution.
- Provide repeatable task outcomes for fair grading and comparison.

## Action Space

Action model: `CloudAction`

Fields:
- `command` (required): one of `list_resources`, `describe_resource`, `view_logs`, `update_security_group`, `restart_service`, `submit_solution`.
- `resource_id` (optional): target resource identifier (required for most non-list actions).
- `parameters` (optional): structured key/value arguments used by mutating actions.

Notes:
- `update_security_group` expects `parameters.port` and usually `parameters.action`.
- `restart_service` targets a single instance by `resource_id`.

## Observation And State Space

Observation model: `CloudObservation`

Primary observation fields:
- `output`: command result payload.
- `error`: command error, when present.
- `system_health_status`: `CRITICAL`, `DEGRADED`, or `HEALTHY`.
- `done`: terminal flag.
- `reward`: scalar step reward.
- `metadata`: includes task name, resolution status, step count, and other diagnostics.

Hidden state model: `CloudState`
- `task_difficulty`: `easy`, `medium`, or `hard`.
- `resources`: underlying resource graph and logs.
- `step_count`: total actions issued.
- `is_resolved`: whether the incident root cause is remediated.

## Task Definitions And Expected Difficulty

- `easy`:
  Open port `80` on `sg-web` so web traffic can flow.
  Expected difficulty: low.
- `medium`:
  Inspect API logs to identify a DB connectivity failure, then open port `5432` on `sg-db`.
  Expected difficulty: medium (requires diagnosis before remediation).
- `hard`:
  Trace a load balancer timeout to `i-web2`, inspect the target, then restart the correct service.
  Expected difficulty: high (multi-hop diagnosis and anti-shortcut checks).

## Setup And Usage

From the repository root:

```bash
# Validate OpenEnv package structure and manifest
..\\.venv\\Scripts\\openenv validate

# Run the pre-submission validator (skip live inference)
bash scripts/pre_submit_validate.sh --skip-inference

# Build the local submission image
docker build -t cloud-devops-env:phase1 -f Dockerfile .
```

Optional local server run:

```bash
uvicorn server.app:app --host 0.0.0.0 --port 8000
```

## Inference Contract

`inference.py` uses the OpenAI client and reads the following environment variables:
- `API_BASE_URL`
- `MODEL_NAME`
- `HF_TOKEN`

It emits strict structured logs:
- `[START] { ... }` per task
- `[STEP] { ... }` per environment action
- `[END] { ... }` per task summary

## Baseline Scores

Representative deterministic scripted-policy targets:

| Task | Baseline Score (0-1) | Notes |
| --- | --- | --- |
| easy | 1.0 | Includes identifying and fixing the security group rule |
| medium | 0.8-1.0 | Depends on whether the optional diagnostic reward is collected |
| hard | 1.0 | Requires the correct root-cause path before restart |

Validation expectations:
- Aggregate scores are clamped to `[0.0, 1.0]`.
- `SUCCESS_SCORE_THRESHOLD` for inference summaries is `0.8`.

## Hugging Face Space Deployment

1. Push this repository to your Space (Docker SDK).
2. Ensure the `README.md` front matter (above) is present.
3. Set Space secrets/variables:
   - `HF_TOKEN` (secret)
   - `API_BASE_URL` (for example `https://router.huggingface.co/v1`)
   - `MODEL_NAME` (chosen model slug)
4. Wait for the Space build to complete.
5. Verify endpoints:
   - `GET /health` returns `200`
   - `POST /reset` returns `200`

Reference: https://huggingface.co/docs/hub/spaces-config-reference
````
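The clamp-and-threshold scoring rule described in the Baseline Scores section can be sketched in plain Python. The `summarize` function name is illustrative, not part of the shipped code; the constants mirror the values stated above.

```python
# Illustrative sketch (not the shipped implementation): per-step rewards
# are summed, clamped to [0.0, 1.0], and compared against the 0.8
# success threshold used by the inference summaries.
MAX_TOTAL_REWARD = 1.0
SUCCESS_SCORE_THRESHOLD = 0.8


def summarize(rewards):
    """Return (score, success) for a list of per-step rewards."""
    score = min(max(sum(rewards), 0.0), MAX_TOTAL_REWARD)
    return score, score >= SUCCESS_SCORE_THRESHOLD


print(summarize([0.25, 0.25, 0.5]))  # full-credit trajectory
print(summarize([0.25, 0.25]))       # partial credit, below threshold
```

Note the clamping: a trajectory that over-collects shaped rewards still caps at 1.0, and a negative sum floors at 0.0.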
__init__.py
ADDED

```python
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

"""Cloud Devops Env Environment."""

from .client import CloudDevopsEnv
from .models import (
    CloudAction,
    CloudDevopsAction,
    CloudDevopsObservation,
    CloudObservation,
    CloudState,
)

__all__ = [
    "CloudAction",
    "CloudObservation",
    "CloudState",
    "CloudDevopsAction",
    "CloudDevopsObservation",
    "CloudDevopsEnv",
]
```
client.py
ADDED

```python
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

"""Cloud Devops Env Environment Client."""

from typing import Any, Dict

from openenv.core import EnvClient
from openenv.core.client_types import StepResult
from openenv.core.env_server.types import State

from .models import CloudAction, CloudObservation


class CloudDevopsEnv(EnvClient[CloudAction, CloudObservation, State]):
    """
    Client for the Cloud Devops Env Environment.

    This client maintains a persistent WebSocket connection to the environment server,
    enabling efficient multi-step interactions with lower latency.
    Each client instance has its own dedicated environment session on the server.

    Example:
        >>> # Connect to a running server
        >>> with CloudDevopsEnv(base_url="http://localhost:8000") as client:
        ...     result = client.reset()
        ...     print(result.observation.system_health_status)
        ...
        ...     result = client.step(CloudAction(command="list_resources"))
        ...     print(result.observation.output)

    Example with Docker:
        >>> # Automatically start container and connect
        >>> client = CloudDevopsEnv.from_docker_image("cloud_devops_env-env:latest")
        >>> try:
        ...     result = client.reset()
        ...     result = client.step(CloudAction(command="list_resources"))
        ... finally:
        ...     client.close()
    """

    def _step_payload(self, action: CloudAction) -> Dict[str, Any]:
        """
        Convert CloudAction to JSON payload for the step message.

        Args:
            action: CloudAction instance

        Returns:
            Dictionary representation suitable for JSON encoding
        """
        payload: Dict[str, Any] = {
            "command": action.command,
            "resource_id": action.resource_id,
            "parameters": action.parameters,
        }
        if action.message is not None:
            payload["message"] = action.message
        return payload

    def _parse_result(self, payload: Dict[str, Any]) -> StepResult[CloudObservation]:
        """
        Parse server response into StepResult[CloudObservation].

        Args:
            payload: JSON response data from server

        Returns:
            StepResult with CloudObservation
        """
        obs_data = payload.get("observation", {})
        observation = CloudObservation(
            output=obs_data.get("output", ""),
            error=obs_data.get("error"),
            system_health_status=obs_data.get("system_health_status", "CRITICAL"),
            message_length=obs_data.get("message_length", 0),
            echoed_message=obs_data.get("echoed_message"),
            done=payload.get("done", False),
            reward=payload.get("reward"),
            metadata=obs_data.get("metadata", {}),
        )

        return StepResult(
            observation=observation,
            reward=payload.get("reward"),
            done=payload.get("done", False),
        )

    def _parse_state(self, payload: Dict[str, Any]) -> State:
        """
        Parse server response into State object.

        Args:
            payload: JSON response from state request

        Returns:
            State object with episode_id and step_count
        """
        return State(
            episode_id=payload.get("episode_id"),
            step_count=payload.get("step_count", 0),
        )
```
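To make the parsing contract concrete, here is a dependency-free sketch of the server payload shape `_parse_result` consumes, with the same defaulting applied for missing fields. The payload values are hypothetical examples; only the field names come from the client code above.

```python
# Hypothetical server payload, shaped like what _parse_result consumes.
payload = {
    "observation": {
        "output": "2 resources: i-web1, sg-web",
        "error": None,
        "system_health_status": "DEGRADED",
        "metadata": {"task": "easy", "step_count": 1},
    },
    "reward": 0.1,
    "done": False,
}

# The same defaulting the client applies when fields are absent:
# empty output, CRITICAL health, done=False.
obs = payload.get("observation", {})
parsed = {
    "output": obs.get("output", ""),
    "system_health_status": obs.get("system_health_status", "CRITICAL"),
    "reward": payload.get("reward"),
    "done": payload.get("done", False),
}
print(parsed["system_health_status"])
```

Note that `reward` and `done` live at the top level of the payload while the rest is nested under `observation`.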
env.py
ADDED

```python
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

"""Async entrypoint wrapper for external evaluators and custom graders."""

from __future__ import annotations

from typing import Any, Dict

from pydantic import BaseModel

try:
    from .models import CloudAction, CloudObservation, CloudState
    from .server.cloud_devops_env_environment import CloudDevopsEnvironment
except ImportError:
    from models import CloudAction, CloudObservation, CloudState
    from server.cloud_devops_env_environment import CloudDevopsEnvironment


class EnvResult(BaseModel):
    """Canonical environment result payload for async evaluator loops."""

    observation: CloudObservation
    reward: float
    done: bool
    info: Dict[str, Any]


class CloudDevOpsEnv:
    """Async-compatible facade over the OpenEnv server-side environment logic."""

    def __init__(self, task_name: str = "easy"):
        self._impl = CloudDevopsEnvironment(task_name=task_name)

    @property
    def achievements(self) -> set[str]:
        """Expose completed shaped-reward checkpoints for debugging/evaluation."""
        return set(self._impl._achievements)

    async def reset(self) -> EnvResult:
        """Reset the environment to the initial task state."""
        obs = self._impl.reset()
        return EnvResult(
            observation=obs,
            reward=float(obs.reward or 0.0),
            done=bool(obs.done),
            info=dict(obs.metadata or {}),
        )

    async def step(self, action: CloudAction) -> EnvResult:
        """Execute an action and return a structured async result."""
        obs = self._impl.step(action)
        return EnvResult(
            observation=obs,
            reward=float(obs.reward or 0.0),
            done=bool(obs.done),
            info=dict(obs.metadata or {}),
        )

    async def state(self) -> CloudState:
        """Return hidden environment state for deterministic evaluators."""
        state = self._impl.state
        assert isinstance(state, CloudState)
        return state
```
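The facade pattern used here (synchronous environment logic wrapped in `async` methods so evaluator loops can `await` it) can be illustrated with a stand-in implementation. `StubImpl`, `AsyncFacade`, and `demo` are invented names for this sketch; the real logic lives in `server/cloud_devops_env_environment.py`.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Dict


# Stand-in for the server-side environment class, used only to
# illustrate the async facade shape; not the shipped implementation.
@dataclass
class StubImpl:
    step_count: int = 0

    def reset(self) -> Dict[str, Any]:
        self.step_count = 0
        return {"reward": 0.0, "done": False}

    def step(self, action: str) -> Dict[str, Any]:
        self.step_count += 1
        return {"reward": 0.1, "done": False}


class AsyncFacade:
    """Same shape as CloudDevOpsEnv: a sync impl behind async methods."""

    def __init__(self) -> None:
        self._impl = StubImpl()

    async def reset(self) -> Dict[str, Any]:
        return self._impl.reset()

    async def step(self, action: str) -> Dict[str, Any]:
        return self._impl.step(action)


async def demo() -> int:
    env = AsyncFacade()
    await env.reset()
    for _ in range(3):
        await env.step("list_resources")
    return env._impl.step_count


print(asyncio.run(demo()))  # prints 3
```

The real facade adds typed `EnvResult` payloads and state introspection, but the control flow is the same.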
inference.py
ADDED

````python
import asyncio
import json
import os
from typing import Any, Dict, List, Tuple

from openai import OpenAI
from pydantic import ValidationError

from env import CloudDevOpsEnv
from models import CloudAction

API_BASE_URL = os.getenv("API_BASE_URL", "https://router.huggingface.co/v1")
MODEL_NAME = os.getenv("MODEL_NAME", "google/gemma-4-31B-it")
HF_TOKEN = os.getenv("HF_TOKEN")

BENCHMARK = "CloudDevOpsEnv"
MAX_STEPS = 15
MAX_TOTAL_REWARD = 1.0
SUCCESS_SCORE_THRESHOLD = 0.8


def log_start(task: str, env: str, model: str) -> None:
    log_data = {"task": task, "env": env, "model": model}
    print(f"[START] {json.dumps(log_data)}", flush=True)


def log_step(step: int, action: Any, reward: float, done: bool, error: Any) -> None:
    action_dict = action.model_dump() if hasattr(action, "model_dump") else str(action)
    log_data = {
        "step": step,
        "action": action_dict,
        "reward": reward,
        "done": done,
        "error": error,
    }
    print(f"[STEP] {json.dumps(log_data)}", flush=True)


def log_end(success: bool, steps: int, score: float, rewards: List[float]) -> None:
    log_data = {"success": success, "steps": steps, "score": score, "rewards": rewards}
    print(f"[END] {json.dumps(log_data)}", flush=True)


def get_model_action(
    client: OpenAI,
    step: int,
    last_obs: str,
    last_error: str,
    history: List[Dict[str, str]],
) -> Tuple[CloudAction, str]:
    """Prompt the LLM and parse its response into a CloudAction."""
    system_prompt = (
        "You are an expert AI DevOps Engineer diagnosing a cloud infrastructure issue. "
        "You must respond ONLY with a raw JSON object matching this schema:\n"
        "{\n"
        '  "command": "list_resources" | "describe_resource" | "view_logs" | "update_security_group" | "restart_service" | "submit_solution",\n'
        '  "resource_id": "string (optional)",\n'
        '  "parameters": {"key": "value"} (optional)\n'
        "}\n"
        "Do not include markdown blocks like ```json. Just output the JSON."
    )

    user_prompt = f"Step {step}.\nLast Observation:\n{last_obs}\n"
    if last_error:
        user_prompt += f"\nLast Error:\n{last_error}\n"
    user_prompt += "\nWhat is your next action JSON?"

    messages = [{"role": "system", "content": system_prompt}] + history + [
        {"role": "user", "content": user_prompt}
    ]

    try:
        response = client.chat.completions.create(
            model=MODEL_NAME,
            messages=messages,
            temperature=0.1,
            max_tokens=200,
        )
        raw_text = (response.choices[0].message.content or "").strip()

        if raw_text.startswith("```json"):
            raw_text = raw_text.replace("```json", "").replace("```", "").strip()

        action_dict = json.loads(raw_text)
        return CloudAction(**action_dict), raw_text
    except (json.JSONDecodeError, ValidationError) as exc:
        print(f"[DEBUG] Model parse failed: {exc}", flush=True)
        return CloudAction(command="list_resources"), "failed_parse"
    except Exception as exc:
        print(f"[DEBUG] API request failed: {exc}", flush=True)
        return CloudAction(command="list_resources"), "api_error"


async def run_task(task_name: str, client: OpenAI) -> None:
    env = CloudDevOpsEnv(task_name=task_name)

    history: List[Dict[str, str]] = []
    rewards: List[float] = []
    steps_taken = 0
    score = 0.0
    success = False

    log_start(task=task_name, env=BENCHMARK, model=MODEL_NAME)

    try:
        result = await env.reset()
        last_obs = result.observation.output
        last_error = result.observation.error or ""

        for step in range(1, MAX_STEPS + 1):
            if result.done:
                break

            action, raw_response = get_model_action(
                client, step, last_obs, last_error, history
            )

            result = await env.step(action)
            obs = result.observation
            reward = result.reward or 0.0
            done = result.done
            error = obs.error

            rewards.append(reward)
            steps_taken = step
            last_obs = obs.output
            last_error = error or ""

            log_step(step=step, action=action, reward=reward, done=done, error=error)

            history.append({"role": "assistant", "content": raw_response})
            history.append(
                {
                    "role": "user",
                    "content": f"Observation: {last_obs}\nError: {last_error}",
                }
            )

            if done:
                break

        score = sum(rewards)
        score = min(max(score, 0.0), MAX_TOTAL_REWARD)
        success = score >= SUCCESS_SCORE_THRESHOLD

    finally:
        log_end(success=success, steps=steps_taken, score=score, rewards=rewards)


async def main() -> None:
    if not HF_TOKEN:
        print("[WARNING] HF_TOKEN environment variable not set. API calls will likely fail.")

    client = OpenAI(base_url=API_BASE_URL, api_key=HF_TOKEN)

    tasks = ["easy", "medium", "hard"]
    for task in tasks:
        print(f"\n--- Running Task: {task.upper()} ---")
        await run_task(task, client)


if __name__ == "__main__":
    asyncio.run(main())
````
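The fence-stripping and parse-fallback logic inside `get_model_action` can be exercised without an API call. This sketch extracts that logic into a standalone function (`extract_action_dict` is an illustrative name, not part of the file above) so the fallback behavior is easy to verify.

```python
import json


def extract_action_dict(raw_text):
    """Replicate the fence-stripping and parse-fallback logic of
    get_model_action, minus the OpenAI call (illustrative sketch)."""
    text = raw_text.strip()
    # Strip a markdown fence the model was told not to emit but sometimes does.
    if text.startswith("```json"):
        text = text.replace("```json", "").replace("```", "").strip()
    try:
        return json.loads(text), text
    except json.JSONDecodeError:
        # Mirrors the safe default action used on parse failure.
        return {"command": "list_resources"}, "failed_parse"


fenced = '```json\n{"command": "view_logs", "resource_id": "i-web2"}\n```'
print(extract_action_dict(fenced)[0]["command"])        # view_logs
print(extract_action_dict("Sure! My plan is...")[0])    # fallback action
```

Falling back to `list_resources` keeps the episode alive on a malformed response instead of crashing the run, at the cost of one wasted step.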
models.py
ADDED

```python
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

"""
Data models for the Cloud Devops Env Environment.

The cloud_devops_env environment simulates cloud/devops incident response tasks.
"""

import json
from typing import Any, Dict, Literal, Optional

from openenv.core.env_server.types import Action, Observation, State
from pydantic import Field, field_validator


class CloudAction(Action):
    """Action space (what the agent can do)."""

    command: Literal[
        "list_resources",
        "describe_resource",
        "view_logs",
        "update_security_group",
        "restart_service",
        "submit_solution",
    ] = Field(..., description="The cloud API command to execute.")
    resource_id: Optional[str] = Field(
        default=None,
        description=(
            "The ID of the target resource (e.g., 'i-12345'). "
            "Required for all commands except list_resources."
        ),
    )
    parameters: Optional[Dict[str, Any]] = Field(
        default=None,
        description=(
            "Key-value pairs for updates "
            "(e.g., {'port': '80', 'action': 'allow'} for update_security_group)."
        ),
    )
    message: Optional[str] = Field(
        default=None,
        description="Legacy field from template env; safe to remove after server/client migration.",
    )

    @field_validator("parameters", mode="before")
    @classmethod
    def _coerce_parameters(cls, value: Any) -> Any:
        """Allow /web text input to pass JSON for dict parameters."""
        if value is None or value == "":
            return None
        if isinstance(value, dict):
            return value
        if isinstance(value, str):
            try:
                parsed = json.loads(value)
            except json.JSONDecodeError as exc:
                raise ValueError(
                    "parameters must be a JSON object string, e.g. {\"port\":80,\"action\":\"allow\"}"
                ) from exc
            if not isinstance(parsed, dict):
                raise ValueError("parameters JSON must decode to an object/dictionary")
            return parsed
        raise ValueError("parameters must be a dictionary or JSON object string")


class CloudObservation(Observation):
    """Observation space (what the agent sees)."""

    output: str = Field(
        ...,
        description="The terminal/API response from the last command executed.",
    )
    error: Optional[str] = Field(
        default=None,
        description="Error message if the last command failed or was invalid.",
    )
    system_health_status: str = Field(
        ...,
        description="Current status of the system (e.g., 'CRITICAL', 'DEGRADED', 'HEALTHY').",
    )
    echoed_message: Optional[str] = Field(
        default=None,
        description="Legacy field from template env; safe to remove after server/client migration.",
    )
    message_length: int = Field(
        default=0,
        description="Legacy field from template env; safe to remove after server/client migration.",
    )


class CloudState(State):
    """State space (the hidden environment state)."""

    task_difficulty: str = Field(..., description="Current task: easy, medium, or hard.")
    resources: Dict[str, Dict[str, Any]] = Field(
        ...,
        description="The hidden JSON state of all mock cloud resources.",
    )
    step_count: int = Field(..., description="Number of actions taken so far.")
    is_resolved: bool = Field(
        ...,
        description="Whether the root cause has been successfully fixed.",
    )


# Backward-compatible aliases for scaffolded files that still use template names.
CloudDevopsAction = CloudAction
CloudDevopsObservation = CloudObservation
```
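The `_coerce_parameters` validator accepts a dict, a JSON object string, or empty input. Its behavior can be demonstrated without pydantic by lifting the same logic into a plain function (`coerce_parameters` is an illustrative name for this sketch):

```python
import json


def coerce_parameters(value):
    """Standalone copy of the _coerce_parameters logic above, for
    illustration outside pydantic."""
    if value is None or value == "":
        return None
    if isinstance(value, dict):
        return value
    if isinstance(value, str):
        # json.JSONDecodeError is a ValueError subclass, so callers can
        # catch ValueError for both malformed JSON and non-object input.
        parsed = json.loads(value)
        if not isinstance(parsed, dict):
            raise ValueError("parameters JSON must decode to an object/dictionary")
        return parsed
    raise ValueError("parameters must be a dictionary or JSON object string")


print(coerce_parameters('{"port": 80, "action": "allow"}'))
print(coerce_parameters(""))  # empty text input becomes None
```

This is what lets the `/web` text form submit `{"port": 80}` as a string while programmatic clients pass a dict directly.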
openenv.yaml
ADDED

```yaml
spec_version: 1
name: cloud_devops_env
type: space
runtime: fastapi
app: server.app:app
port: 8000

metadata:
  project: cloud-devops-env
  description: A real-world Cloud SRE/DevOps simulation environment.
entrypoint:
  file: env.py
  class: CloudDevOpsEnv
models:
  file: models.py
  action: CloudAction
  observation: CloudObservation
  state: CloudState
```
pyproject.toml
ADDED

```toml
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

[build-system]
requires = ["setuptools>=45", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "openenv-cloud_devops_env"
version = "0.1.0"
description = "Cloud Devops Env environment for OpenEnv"
requires-python = ">=3.10"
dependencies = [
    "openenv-core[core]>=0.2.2",
    "pydantic>=2.0.0",
    "openai>=1.0.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-cov>=4.0.0",
]

[project.scripts]
# Server entry point - enables running via: uv run --project . server
# or: python -m cloud_devops_env.server.app
server = "cloud_devops_env.server.app:main"

[tool.setuptools]
include-package-data = true
packages = ["cloud_devops_env", "cloud_devops_env.server"]
package-dir = { "cloud_devops_env" = ".", "cloud_devops_env.server" = "server" }
```
scripts/pre_submit_validate.sh
ADDED
|
@@ -0,0 +1,365 @@
#!/usr/bin/env bash
#
# pre_submit_validate.sh
#
# Extended pre-submission checks for OpenEnv hackathon submissions.
# This script complements scripts/validate-submission.sh by also checking
# inference contract requirements and baseline reproducibility.

set -euo pipefail

DOCKER_BUILD_TIMEOUT=600
INFERENCE_TIMEOUT=1200

PING_URL=""
REPO_DIR="."
SKIP_DOCKER=false
SKIP_INFERENCE=false
PYTHON_BIN=""
OPENENV_BIN=""
OPENENV_USE_MODULE=false
DOCKER_CONTAINER_ID=""

usage() {
  cat <<'EOF'
Usage: scripts/pre_submit_validate.sh [options]

Options:
  --ping-url <url>    HF Space URL (e.g., https://team-space.hf.space)
  --repo-dir <path>   Repo root directory (default: current directory)
  --skip-docker       Skip docker build check
  --skip-inference    Skip inference baseline check
  -h, --help          Show this help message

Required environment variables for inference checks:
  API_BASE_URL
  MODEL_NAME
  HF_TOKEN
EOF
}

run_with_timeout() {
  local secs="$1"; shift
  if command -v timeout >/dev/null 2>&1; then
    timeout "$secs" "$@"
  elif command -v gtimeout >/dev/null 2>&1; then
    gtimeout "$secs" "$@"
  else
    "$@" &
    local pid=$!
    ( sleep "$secs" && kill "$pid" 2>/dev/null ) &
    local watcher=$!
    wait "$pid" 2>/dev/null
    local rc=$?
    kill "$watcher" 2>/dev/null || true
    wait "$watcher" 2>/dev/null || true
    return $rc
  fi
}

log() {
  printf "[%s] %s\n" "$(date -u +%H:%M:%S)" "$*"
}

die() {
  log "FAILED -- $*"
  exit 1
}

pass() {
  log "PASSED -- $*"
}

cleanup() {
  if [ -n "$DOCKER_CONTAINER_ID" ]; then
    docker rm -f "$DOCKER_CONTAINER_ID" >/dev/null 2>&1 || true
  fi
}

trap cleanup EXIT

resolve_python_bin() {
  local candidates=(
    "$REPO_DIR/.venv/bin/python"
    "$REPO_DIR/.venv/Scripts/python.exe"
    "$REPO_DIR/../.venv/bin/python"
    "$REPO_DIR/../.venv/Scripts/python.exe"
  )

  for c in "${candidates[@]}"; do
    if [ -x "$c" ]; then
      PYTHON_BIN="$c"
      return 0
    fi
  done

  if command -v python >/dev/null 2>&1; then
    PYTHON_BIN="$(command -v python)"
    return 0
  fi
  if command -v python3 >/dev/null 2>&1; then
    PYTHON_BIN="$(command -v python3)"
    return 0
  fi

  return 1
}

resolve_openenv_cmd() {
  local candidates=(
    "$REPO_DIR/.venv/bin/openenv"
    "$REPO_DIR/.venv/Scripts/openenv.exe"
    "$REPO_DIR/../.venv/bin/openenv"
    "$REPO_DIR/../.venv/Scripts/openenv.exe"
  )

  for c in "${candidates[@]}"; do
    if [ -x "$c" ]; then
      OPENENV_BIN="$c"
      return 0
    fi
  done

  if command -v openenv >/dev/null 2>&1; then
    OPENENV_BIN="$(command -v openenv)"
    return 0
  fi

  return 1
}

while [ "$#" -gt 0 ]; do
  case "$1" in
    --ping-url)
      shift
      [ "$#" -gt 0 ] || die "--ping-url requires a value"
      PING_URL="$1"
      ;;
    --repo-dir)
      shift
      [ "$#" -gt 0 ] || die "--repo-dir requires a value"
      REPO_DIR="$1"
      ;;
    --skip-docker)
      SKIP_DOCKER=true
      ;;
    --skip-inference)
      SKIP_INFERENCE=true
      ;;
    -h|--help)
      usage
      exit 0
      ;;
    *)
      die "Unknown option: $1"
      ;;
  esac
  shift
done

REPO_DIR="$(cd "$REPO_DIR" && pwd)"
cd "$REPO_DIR"

log "Repo: $REPO_DIR"

resolve_python_bin || die "No usable Python interpreter found"
log "Python: $PYTHON_BIN"

if resolve_openenv_cmd; then
  log "OpenEnv CLI: $OPENENV_BIN"
else
  OPENENV_USE_MODULE=true
  log "OpenEnv CLI via module: $PYTHON_BIN -m openenv"
fi

log "Step 1/8: Checking OpenEnv standard file layout"
required_files=(
  "openenv.yaml"
  "models.py"
  "env.py"
  "inference.py"
  "server/app.py"
  "server/cloud_devops_env_environment.py"
)
for f in "${required_files[@]}"; do
  [ -f "$f" ] || die "Missing required file: $f"
done
pass "Core OpenEnv file layout looks valid"

log "Step 2/8: Checking inference contract requirements"
[ -f "inference.py" ] || die "inference.py must exist in repo root"
grep -q "from openai import OpenAI" inference.py || die "inference.py must import OpenAI client"
grep -q "OpenAI(" inference.py || die "inference.py must instantiate OpenAI client"
grep -q "\[START\]" inference.py || die "inference.py must emit [START] logs"
grep -q "\[STEP\]" inference.py || die "inference.py must emit [STEP] logs"
grep -q "\[END\]" inference.py || die "inference.py must emit [END] logs"
pass "Inference script contract checks passed"

log "Step 3/8: Validating OpenEnv manifest and typed models"
if [ "$OPENENV_USE_MODULE" = true ]; then
  "$PYTHON_BIN" -m openenv validate >/tmp/openenv-validate.out 2>&1 || {
    cat /tmp/openenv-validate.out
    die "openenv validate failed"
  }
else
  "$OPENENV_BIN" validate >/tmp/openenv-validate.out 2>&1 || {
    cat /tmp/openenv-validate.out
    die "openenv validate failed"
  }
fi
pass "openenv validate passed"

log "Step 4/8: Optional HF Space ping check"
if [ -n "$PING_URL" ]; then
  PING_URL="${PING_URL%/}"
  code=$(curl -s -o /tmp/pre-submit-ping.out -w "%{http_code}" -X POST \
    -H "Content-Type: application/json" -d '{}' \
    "$PING_URL/reset" --max-time 30 || printf "000")
  [ "$code" = "200" ] || die "HF Space /reset returned HTTP $code"
  pass "HF Space responds to /reset (HTTP 200)"
else
  log "SKIPPED -- no --ping-url provided"
fi

log "Step 5/8: Docker build + run check"
if [ "$SKIP_DOCKER" = true ]; then
  log "SKIPPED -- --skip-docker enabled"
else
  command -v docker >/dev/null 2>&1 || die "docker not found"
  if [ -f "Dockerfile" ]; then
    context="."
  elif [ -f "server/Dockerfile" ]; then
    context="server"
  else
    die "No Dockerfile found at root or server/"
  fi
  run_with_timeout "$DOCKER_BUILD_TIMEOUT" docker build "$context" >/tmp/pre-submit-docker.out 2>&1 || {
    tail -n 40 /tmp/pre-submit-docker.out
    die "docker build failed"
  }
  pass "Docker build succeeded"

  IMAGE_TAG="openenv-pre-submit-local"
  run_with_timeout "$DOCKER_BUILD_TIMEOUT" docker build -t "$IMAGE_TAG" "$context" >/tmp/pre-submit-docker-tagged.out 2>&1 || {
    tail -n 40 /tmp/pre-submit-docker-tagged.out
    die "docker build (tagged) failed"
  }

  DOCKER_CONTAINER_ID="$(docker run -d -p 127.0.0.1::8000 "$IMAGE_TAG" 2>/tmp/pre-submit-docker-run.err || true)"
  [ -n "$DOCKER_CONTAINER_ID" ] || {
    cat /tmp/pre-submit-docker-run.err
    die "docker run failed"
  }

  HOST_PORT="$(docker port "$DOCKER_CONTAINER_ID" 8000/tcp | tail -n 1 | awk -F: '{print $NF}')"
  [ -n "$HOST_PORT" ] || die "could not resolve mapped host port for container"

  HEALTH_OK=false
  for _ in $(seq 1 30); do
    health_code=$(curl -s -o /tmp/pre-submit-health.out -w "%{http_code}" \
      "http://127.0.0.1:${HOST_PORT}/health" --max-time 3 || printf "000")
    if [ "$health_code" = "200" ]; then
      HEALTH_OK=true
      break
    fi
    sleep 1
  done
  [ "$HEALTH_OK" = true ] || {
    docker logs "$DOCKER_CONTAINER_ID" | tail -n 50
    die "container did not become healthy on /health"
  }

  reset_code=$(curl -s -o /tmp/pre-submit-reset.out -w "%{http_code}" -X POST \
    -H "Content-Type: application/json" -d '{}' \
    "http://127.0.0.1:${HOST_PORT}/reset" --max-time 10 || printf "000")
  [ "$reset_code" = "200" ] || {
    docker logs "$DOCKER_CONTAINER_ID" | tail -n 50
    die "container /reset returned HTTP $reset_code"
  }

  pass "Containerized execution check passed (/health and /reset)"

  docker rm -f "$DOCKER_CONTAINER_ID" >/dev/null 2>&1 || true
  DOCKER_CONTAINER_ID=""
fi

log "Step 6/8: Environment variable checks"
if [ "$SKIP_INFERENCE" = true ]; then
  log "SKIPPED -- --skip-inference enabled"
else
  [ -n "${API_BASE_URL:-}" ] || die "API_BASE_URL is not set"
  [ -n "${MODEL_NAME:-}" ] || die "MODEL_NAME is not set"
  [ -n "${HF_TOKEN:-}" ] || die "HF_TOKEN is not set"
  pass "Required API_BASE_URL / MODEL_NAME / HF_TOKEN are set"
fi

log "Step 7/8: Baseline reproducibility (inference.py)"
if [ "$SKIP_INFERENCE" = true ]; then
  log "SKIPPED -- --skip-inference enabled"
else
  run_with_timeout "$INFERENCE_TIMEOUT" "$PYTHON_BIN" inference.py >/tmp/pre-submit-inference.out 2>&1 || {
    tail -n 80 /tmp/pre-submit-inference.out
    die "inference.py failed or timed out"
  }
  pass "inference.py completed within timeout"
fi

log "Step 8/8: Structured logs + task/grader checks"
if [ "$SKIP_INFERENCE" = true ]; then
  log "SKIPPED -- --skip-inference enabled"
else
  "$PYTHON_BIN" - <<'PY'
import json
from pathlib import Path

path = Path('/tmp/pre-submit-inference.out')
text = path.read_text(encoding='utf-8', errors='replace').splitlines()

starts = []
ends = []
step_count = 0

for line in text:
    line = line.strip()
    if line.startswith('[START] '):
        payload = json.loads(line[len('[START] '):])
        starts.append(payload)
    elif line.startswith('[STEP] '):
        json.loads(line[len('[STEP] '):])
        step_count += 1
    elif line.startswith('[END] '):
        payload = json.loads(line[len('[END] '):])
        ends.append(payload)

if len(starts) < 3:
    raise SystemExit('Expected at least 3 [START] task logs')

unique_tasks = {str(s.get('task', '')) for s in starts if s.get('task')}
if len(unique_tasks) < 3:
    raise SystemExit('Expected at least 3 unique tasks in [START] logs')

if len(ends) != len(starts):
    raise SystemExit('Mismatch between [START] and [END] log counts')

if step_count == 0:
    raise SystemExit('No [STEP] logs found')

for i, end in enumerate(ends, start=1):
    score = float(end.get('score', -1.0))
    rewards = end.get('rewards', [])
    if not (0.0 <= score <= 1.0):
        raise SystemExit(f'END #{i} score out of range [0,1]: {score}')
    if not isinstance(rewards, list):
        raise SystemExit(f'END #{i} rewards must be a list')
    for r in rewards:
        rv = float(r)
        if not (-1.0 <= rv <= 1.0):
            raise SystemExit(f'END #{i} step reward out of sanity range [-1,1]: {rv}')

print('Structured logs and task/grader checks passed')
PY
  pass "Structured [START]/[STEP]/[END] logs and score-range checks passed"
fi

log "All checks passed. Submission is ready."
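Step 8 above expects inference.py to print one JSON payload per `[START]`/`[STEP]`/`[END]` line. A minimal sketch of a log that would satisfy the checker, and the same prefix-stripping parse it performs (the task name and action strings here are hypothetical placeholders, not part of the contract; only the `task`, `score`, and `rewards` fields are actually validated):

```python
# Sketch of the structured-log contract: one JSON object per tagged line.
import json

lines = [
    '[START] {"task": "restart-crashed-pod"}',   # hypothetical task name
    '[STEP] {"action": "inspect", "reward": 0.1}',
    '[END] {"task": "restart-crashed-pod", "score": 0.8, "rewards": [0.1, 0.7]}',
]

# Same parse as the Step 8 checker: strip the tag prefix, json-decode the rest.
starts = [json.loads(l[len('[START] '):]) for l in lines if l.startswith('[START] ')]
ends = [json.loads(l[len('[END] '):]) for l in lines if l.startswith('[END] ')]

assert len(starts) == len(ends)
assert all(0.0 <= e["score"] <= 1.0 for e in ends)          # score in [0, 1]
assert all(-1.0 <= r <= 1.0 for e in ends for r in e["rewards"])  # rewards sane
print("log contract ok")
```

Note the checker also requires at least three distinct `task` values across `[START]` lines, so a real run needs more than one episode.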
scripts/validate-submission.sh
ADDED
|
@@ -0,0 +1,185 @@
#!/usr/bin/env bash
#
# validate-submission.sh — OpenEnv Submission Validator
#
# Checks that your HF Space is live, Docker image builds, and openenv validate passes.
#
# Prerequisites:
#   - Docker: https://docs.docker.com/get-docker/
#   - openenv-core: pip install openenv-core
#   - curl (usually pre-installed)
#
# Run:
#   curl -fsSL https://raw.githubusercontent.com/<owner>/<repo>/main/scripts/validate-submission.sh | bash -s -- <ping_url> [repo_dir]
#
# Or download and run locally:
#   chmod +x validate-submission.sh
#   ./validate-submission.sh <ping_url> [repo_dir]
#
# Arguments:
#   ping_url   Your HuggingFace Space URL (e.g. https://your-space.hf.space)
#   repo_dir   Path to your repo (default: current directory)
#
# Examples:
#   ./validate-submission.sh https://my-team.hf.space
#   ./validate-submission.sh https://my-team.hf.space ./my-repo
#

set -uo pipefail

DOCKER_BUILD_TIMEOUT=600
if [ -t 1 ]; then
  RED='\033[0;31m'
  GREEN='\033[0;32m'
  YELLOW='\033[1;33m'
  BOLD='\033[1m'
  NC='\033[0m'
else
  RED='' GREEN='' YELLOW='' BOLD='' NC=''
fi

run_with_timeout() {
  local secs="$1"; shift
  if command -v timeout &>/dev/null; then
    timeout "$secs" "$@"
  elif command -v gtimeout &>/dev/null; then
    gtimeout "$secs" "$@"
  else
    "$@" &
    local pid=$!
    ( sleep "$secs" && kill "$pid" 2>/dev/null ) &
    local watcher=$!
    wait "$pid" 2>/dev/null
    local rc=$?
    kill "$watcher" 2>/dev/null
    wait "$watcher" 2>/dev/null
    return $rc
  fi
}

portable_mktemp() {
  local prefix="${1:-validate}"
  mktemp "${TMPDIR:-/tmp}/${prefix}-XXXXXX" 2>/dev/null || mktemp
}

CLEANUP_FILES=()
cleanup() { rm -f "${CLEANUP_FILES[@]+"${CLEANUP_FILES[@]}"}"; }
trap cleanup EXIT

PING_URL="${1:-}"
REPO_DIR="${2:-.}"

if [ -z "$PING_URL" ]; then
  printf "Usage: %s <ping_url> [repo_dir]\n" "$0"
  printf "\n"
  printf "  ping_url   Your HuggingFace Space URL (e.g. https://your-space.hf.space)\n"
  printf "  repo_dir   Path to your repo (default: current directory)\n"
  exit 1
fi

if ! REPO_DIR="$(cd "$REPO_DIR" 2>/dev/null && pwd)"; then
  printf "Error: directory '%s' not found\n" "${2:-.}"
  exit 1
fi
PING_URL="${PING_URL%/}"
export PING_URL
PASS=0

log()  { printf "[%s] %b\n" "$(date -u +%H:%M:%S)" "$*"; }
pass() { log "${GREEN}PASSED${NC} -- $1"; PASS=$((PASS + 1)); }
fail() { log "${RED}FAILED${NC} -- $1"; }
hint() { printf "    ${YELLOW}Hint:${NC} %b\n" "$1"; }
stop_at() {
  printf "\n"
  printf "${RED}${BOLD}Validation stopped at %s.${NC} Fix the above before continuing.\n" "$1"
  exit 1
}

printf "\n"
printf "${BOLD}========================================${NC}\n"
printf "${BOLD}  OpenEnv Submission Validator${NC}\n"
printf "${BOLD}========================================${NC}\n"
log "Repo: $REPO_DIR"
log "Ping URL: $PING_URL"
printf "\n"

log "${BOLD}Step 1/3: Pinging HF Space${NC} ($PING_URL/reset) ..."

CURL_OUTPUT=$(portable_mktemp "validate-curl")
CLEANUP_FILES+=("$CURL_OUTPUT")
HTTP_CODE=$(curl -s -o "$CURL_OUTPUT" -w "%{http_code}" -X POST \
  -H "Content-Type: application/json" -d '{}' \
  "$PING_URL/reset" --max-time 30 2>"$CURL_OUTPUT" || printf "000")

if [ "$HTTP_CODE" = "200" ]; then
  pass "HF Space is live and responds to /reset"
elif [ "$HTTP_CODE" = "000" ]; then
  fail "HF Space not reachable (connection failed or timed out)"
  hint "Check your network connection and that the Space is running."
  hint "Try: curl -s -o /dev/null -w '%%{http_code}' -X POST $PING_URL/reset"
  stop_at "Step 1"
else
  fail "HF Space /reset returned HTTP $HTTP_CODE (expected 200)"
  hint "Make sure your Space is running and the URL is correct."
  hint "Try opening $PING_URL in your browser first."
  stop_at "Step 1"
fi

log "${BOLD}Step 2/3: Running docker build${NC} ..."

if ! command -v docker &>/dev/null; then
  fail "docker command not found"
  hint "Install Docker: https://docs.docker.com/get-docker/"
  stop_at "Step 2"
fi

if [ -f "$REPO_DIR/Dockerfile" ]; then
  DOCKER_CONTEXT="$REPO_DIR"
elif [ -f "$REPO_DIR/server/Dockerfile" ]; then
  DOCKER_CONTEXT="$REPO_DIR/server"
else
  fail "No Dockerfile found in repo root or server/ directory"
  stop_at "Step 2"
fi

log "  Found Dockerfile in $DOCKER_CONTEXT"

BUILD_OK=false
BUILD_OUTPUT=$(run_with_timeout "$DOCKER_BUILD_TIMEOUT" docker build "$DOCKER_CONTEXT" 2>&1) && BUILD_OK=true

if [ "$BUILD_OK" = true ]; then
  pass "Docker build succeeded"
else
  fail "Docker build failed (timeout=${DOCKER_BUILD_TIMEOUT}s)"
  printf "%s\n" "$BUILD_OUTPUT" | tail -20
  stop_at "Step 2"
fi

log "${BOLD}Step 3/3: Running openenv validate${NC} ..."

if ! command -v openenv &>/dev/null; then
  fail "openenv command not found"
  hint "Install it: pip install openenv-core"
  stop_at "Step 3"
fi

VALIDATE_OK=false
VALIDATE_OUTPUT=$(cd "$REPO_DIR" && openenv validate 2>&1) && VALIDATE_OK=true

if [ "$VALIDATE_OK" = true ]; then
  pass "openenv validate passed"
  [ -n "$VALIDATE_OUTPUT" ] && log "  $VALIDATE_OUTPUT"
else
  fail "openenv validate failed"
  printf "%s\n" "$VALIDATE_OUTPUT"
  stop_at "Step 3"
fi

printf "\n"
printf "${BOLD}========================================${NC}\n"
printf "${GREEN}${BOLD}  All 3/3 checks passed!${NC}\n"
printf "${GREEN}${BOLD}  Your submission is ready to submit.${NC}\n"
printf "${BOLD}========================================${NC}\n"
printf "\n"

exit 0
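Both validator scripts fall back to a hand-rolled background-job-plus-watcher pattern when neither `timeout` nor `gtimeout` is installed. The same behavior has a direct stdlib analogue in Python; a quick sketch of the two outcomes (completion within the limit vs. the process being killed once the limit elapses), assuming a POSIX `sleep` binary is available:

```python
# Stdlib analogue of run_with_timeout: subprocess.run kills the child
# and raises TimeoutExpired when the deadline passes.
import subprocess

# Fast command finishes well within the limit.
ok = subprocess.run(["sleep", "0"], timeout=5)
assert ok.returncode == 0

# Slow command is terminated once the limit elapses.
try:
    subprocess.run(["sleep", "10"], timeout=1)
    timed_out = False
except subprocess.TimeoutExpired:
    timed_out = True
assert timed_out
print("timeout behavior ok")
```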
server/Dockerfile
ADDED
|
@@ -0,0 +1,80 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

# Multi-stage build using openenv-base
# This Dockerfile is flexible and works for both:
# - In-repo environments (with local OpenEnv sources)
# - Standalone environments (with openenv from PyPI/Git)
# The build script (openenv build) handles context detection and sets appropriate build args.

ARG BASE_IMAGE=ghcr.io/meta-pytorch/openenv-base:latest
FROM ${BASE_IMAGE} AS builder

WORKDIR /app

# Ensure git is available (required for installing dependencies from VCS)
RUN apt-get update && \
    apt-get install -y --no-install-recommends git && \
    rm -rf /var/lib/apt/lists/*

# Build argument to control whether we're building standalone or in-repo
ARG BUILD_MODE=in-repo
ARG ENV_NAME=cloud_devops_env

# Copy environment code (always at root of build context)
COPY . /app/env

# For in-repo builds, openenv is already vendored in the build context
# For standalone builds, openenv will be installed via pyproject.toml
WORKDIR /app/env

# Ensure uv is available (for local builds where base image lacks it)
RUN if ! command -v uv >/dev/null 2>&1; then \
      curl -LsSf https://astral.sh/uv/install.sh | sh && \
      mv /root/.local/bin/uv /usr/local/bin/uv && \
      mv /root/.local/bin/uvx /usr/local/bin/uvx; \
    fi

# Install dependencies using uv sync
# If uv.lock exists, use it; otherwise resolve on the fly
RUN --mount=type=cache,target=/root/.cache/uv \
    if [ -f uv.lock ]; then \
      uv sync --frozen --no-install-project --no-editable; \
    else \
      uv sync --no-install-project --no-editable; \
    fi

RUN --mount=type=cache,target=/root/.cache/uv \
    if [ -f uv.lock ]; then \
      uv sync --frozen --no-editable; \
    else \
      uv sync --no-editable; \
    fi

# Final runtime stage
FROM ${BASE_IMAGE}

WORKDIR /app

# Copy the virtual environment from builder
COPY --from=builder /app/env/.venv /app/.venv

# Copy the environment code
COPY --from=builder /app/env /app/env

# Set PATH to use the virtual environment
ENV PATH="/app/.venv/bin:$PATH"

# Set PYTHONPATH so imports work correctly
ENV PYTHONPATH="/app/env:$PYTHONPATH"

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# Run the FastAPI server
# The module path is constructed to work with the /app/env structure
CMD ["sh", "-c", "cd /app/env && uvicorn server.app:app --host 0.0.0.0 --port 8000"]
server/__init__.py
ADDED
|
@@ -0,0 +1,11 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

"""Cloud Devops Env environment server components."""

from .cloud_devops_env_environment import CloudDevopsEnvironment

__all__ = ["CloudDevopsEnvironment"]
server/app.py
ADDED
|
@@ -0,0 +1,101 @@
| 1 |
+
# Copyright (c) Meta Platforms, Inc. and affiliates.
|
| 2 |
+
# All rights reserved.
|
| 3 |
+
#
|
| 4 |
+
# This source code is licensed under the BSD-style license found in the
|
| 5 |
+
# LICENSE file in the root directory of this source tree.
|
| 6 |
+
|
| 7 |
+
"""
|
| 8 |
+
FastAPI application for the Cloud Devops Env Environment.
|
| 9 |
+
|
| 10 |
+
This module creates an HTTP server that exposes the CloudDevopsEnvironment
|
| 11 |
+
over HTTP and WebSocket endpoints, compatible with EnvClient.
|
| 12 |
+
|
| 13 |
+
Endpoints:
|
| 14 |
+
- POST /reset: Reset the environment
|
| 15 |
+
- POST /step: Execute an action
|
| 16 |
+
- GET /state: Get current environment state
|
| 17 |
+
- GET /schema: Get action/observation schemas
|
| 18 |
+
- WS /ws: WebSocket endpoint for persistent sessions
|
| 19 |
+
|
| 20 |
+
Usage:
|
| 21 |
+
# Development (with auto-reload):
|
| 22 |
+
uvicorn server.app:app --reload --host 0.0.0.0 --port 8000
|
| 23 |
+
|
| 24 |
+
# Production:
|
| 25 |
+
uvicorn server.app:app --host 0.0.0.0 --port 8000 --workers 4
|
| 26 |
+
|
| 27 |
+
# Or run directly:
|
| 28 |
+
python -m server.app
|
| 29 |
+
"""
|
| 30 |
+
|
| 31 |
+
import os
|
| 32 |
+
from pathlib import Path
|
| 33 |
+
|
| 34 |
+
# Default to enabling the OpenEnv web interface for local development.
|
| 35 |
+
# You can still disable it explicitly: ENABLE_WEB_INTERFACE=false
|
| 36 |
+
os.environ.setdefault("ENABLE_WEB_INTERFACE", "true")
|
| 37 |
+
os.environ.setdefault(
    "ENV_README_PATH",
    str(Path(__file__).resolve().parent.parent / "README.md"),
)

try:
    from openenv.core.env_server.http_server import create_app
except Exception as e:  # pragma: no cover
    raise ImportError(
        "openenv is required for the web interface. Install dependencies with 'uv sync'."
    ) from e

try:
    from ..models import CloudDevopsAction, CloudDevopsObservation
    from .cloud_devops_env_environment import CloudDevopsEnvironment
except (ModuleNotFoundError, ImportError):
    from models import CloudDevopsAction, CloudDevopsObservation
    from server.cloud_devops_env_environment import CloudDevopsEnvironment


# Create the app with web interface and README integration
app = create_app(
    CloudDevopsEnvironment,
    CloudDevopsAction,
    CloudDevopsObservation,
    env_name="cloud_devops_env",
    max_concurrent_envs=1,  # increase this number to allow more concurrent WebSocket sessions
)


def main(host: str | None = None, port: int | None = None):
    """
    Entry point for direct execution via uv run or python -m.

    This function enables running the server without Docker:
        uv run --project . server
        uv run --project . server --port 8001
        python -m cloud_devops_env.server.app

    Args:
        host: Host address to bind to. If not provided, CLI args are parsed.
        port: Port number to listen on. If not provided, CLI args are parsed.

    For production deployments, consider using uvicorn directly with
    multiple workers:
        uvicorn cloud_devops_env.server.app:app --workers 4
    """
    import argparse

    import uvicorn

    # Console-script entry points invoke main() with no parameters, so parse
    # CLI flags here to make `server --host ... --port ...` work as expected.
    if host is None and port is None:
        parser = argparse.ArgumentParser(add_help=False)
        parser.add_argument("--host", type=str, default="0.0.0.0")
        parser.add_argument("--port", type=int, default=8000)
        args, _ = parser.parse_known_args()
        host = args.host
        port = args.port

    uvicorn.run(app, host=host or "0.0.0.0", port=port or 8000)


if __name__ == "__main__":
    main()
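The console-script entry point above relies on `parse_known_args()` rather than `parse_args()` so that flags the server does not own pass through without crashing the launcher. A minimal self-contained sketch of that behavior (the `--reload` flag below is a hypothetical extra argument, not an option this server defines):

```python
import argparse

# Parse only the flags the server owns; collect anything unrecognized
# instead of erroring out, as parse_args() would.
parser = argparse.ArgumentParser(add_help=False)
parser.add_argument("--host", type=str, default="0.0.0.0")
parser.add_argument("--port", type=int, default=8000)
args, unknown = parser.parse_known_args(["--port", "8001", "--reload"])
print(args.host)   # 0.0.0.0
print(args.port)   # 8001
print(unknown)     # ['--reload']
```

This is why `server --port 8001` works even when the surrounding tooling appends its own flags to the command line.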
server/cloud_devops_env_environment.py
ADDED
@@ -0,0 +1,384 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

"""
Cloud Devops Env Environment Implementation.

A deterministic mock cloud/devops environment with reward shaping and
anti-farming guardrails for hackathon evaluation.
"""

from __future__ import annotations

import copy
from uuid import uuid4

from openenv.core.env_server.interfaces import Environment
from openenv.core.env_server.types import State

try:
    from ..models import CloudAction, CloudObservation, CloudState
except ImportError:
    from models import CloudAction, CloudObservation, CloudState


class CloudDevopsEnvironment(Environment):
    """
    A deterministic mock cloud/devops environment.

    Tasks:
    - easy: open port 80 on sg-web
    - medium: inspect noisy API logs, then open port 5432 on sg-db
    - hard: trace 502 from lb-main to i-web2, then restart i-web2 (not i-web1)

    Example:
        >>> env = CloudDevopsEnvironment()
        >>> obs = env.reset()
        >>> print(obs.system_health_status)  # "CRITICAL"
        >>>
        >>> obs = env.step(CloudAction(command="list_resources"))
        >>> print(obs.output)
    """

    # Enable concurrent WebSocket sessions.
    # Set to True if your environment isolates state between instances.
    # When True, multiple WebSocket clients can connect simultaneously, each
    # getting their own environment instance (when using factory mode in app.py).
    SUPPORTS_CONCURRENT_SESSIONS: bool = True
    MAX_STEPS: int = 20
    VALID_TASKS = {"easy", "medium", "hard"}

    def __init__(self, task_name: str = "easy"):
        """Initialize the cloud_devops_env environment."""
        normalized_task = (task_name or "easy").lower()
        if normalized_task not in self.VALID_TASKS:
            raise ValueError(f"Unknown task: {task_name}")

        self.task_name = normalized_task
        self._state_data: CloudState | None = None
        self._achievements: set[str] = set()

    def _build_noise_resources(self) -> dict[str, dict[str, object]]:
        """Generate deterministic decoy resources to force retrieval and filtering."""
        resources: dict[str, dict[str, object]] = {}
        for i in range(1, 21):
            suffix = f"{i:02d}"
            resources[f"i-backend-{suffix}"] = {
                "type": "Instance",
                "status": "running",
                "logs": (
                    "[2026-04-06 17:00:00] INFO node-exporter: "
                    "standard metrics reported successfully"
                ),
            }
            resources[f"sg-backend-{suffix}"] = {
                "type": "SecurityGroup",
                "rules": [{"port": 443, "action": "allow"}],
            }
        return resources

    def _build_task_resources(self) -> dict[str, dict[str, object]]:
        resources = self._build_noise_resources()

        if self.task_name == "easy":
            resources.update(
                {
                    "i-web": {"type": "Instance", "status": "running"},
                    "sg-web": {
                        "type": "SecurityGroup",
                        "rules": [{"port": 22, "action": "allow"}],
                    },
                }
            )
            return resources

        if self.task_name == "medium":
            resources.update(
                {
                    "i-api": {
                        "type": "Instance",
                        "status": "running",
                        "logs": (
                            "[2026-04-06 17:01:22] [CRITICAL] "
                            "sqlalchemy.exc.OperationalError: "
                            "(psycopg2.OperationalError) connection to server at "
                            "'10.0.4.5' (i-db), port 5432 failed: Connection timed out. "
                            "Is the server running and accepting TCP/IP connections?"
                        ),
                    },
                    "i-db": {"type": "Instance", "status": "running"},
                    "sg-db": {
                        "type": "SecurityGroup",
                        "rules": [{"port": 22, "action": "allow"}],
                    },
                }
            )
            return resources

        resources.update(
            {
                "lb-main": {
                    "type": "LoadBalancer",
                    "logs": (
                        "2026/04/06 17:02:09 [error] 3197#3197: *4189 upstream timed out "
                        "(110: Connection timed out) while reading response header from upstream, "
                        "client: 10.0.2.14, server: api.prod.local, request: \"GET /checkout HTTP/1.1\", "
                        "upstream: \"http://i-web2:8080/checkout\", host: \"api.prod.local\"\n"
                        "2026/04/06 17:02:10 [error] 3197#3197: *4190 no live upstreams while "
                        "connecting to upstream \"i-web2\""
                    ),
                },
                "i-web1": {
                    "type": "Instance",
                    "status": "running",
                    "logs": (
                        "[2026-04-06 17:02:11] INFO web-service: readiness probe passed\n"
                        "[2026-04-06 17:02:12] INFO jvm: heap usage stable at 42%"
                    ),
                },
                "i-web2": {
                    "type": "Instance",
                    "status": "degraded",
                    "logs": (
                        "kernel: Out of memory: Killed process 12345 (java) total-vm:4194304kB, "
                        "anon-rss:3145728kB\n"
                        "systemd[1]: web-service.service: Main process exited, code=killed, "
                        "status=9/KILL"
                    ),
                },
                "sg-web": {
                    "type": "SecurityGroup",
                    "rules": [{"port": 80, "action": "allow"}],
                },
            }
        )
        return resources

    def _reward_once(self, achievement: str, points: float) -> float:
        if achievement in self._achievements:
            return 0.0
        self._achievements.add(achievement)
        return points

    def reset(self) -> CloudObservation:  # type: ignore[override]
        """Reset the environment to the initial state for the selected task."""
        self._achievements.clear()
        self._state_data = CloudState(
            episode_id=str(uuid4()),
            task_difficulty=self.task_name,
            resources=copy.deepcopy(self._build_task_resources()),
            step_count=0,
            is_resolved=False,
        )

        return CloudObservation(
            output=(
                "Environment initialized. System status is currently CRITICAL. "
                "Use 'list_resources' to begin triage."
            ),
            error=None,
            system_health_status="CRITICAL",
            done=False,
            reward=0.0,
            metadata={
                "step_count": 0,
                "resolved": False,
                "task": self.task_name,
                "total_resources": len(self._state_data.resources),
            },
            echoed_message="Cloud Devops Env environment ready!",
            message_length=0,
        )

    def step(self, action: CloudAction) -> CloudObservation:  # type: ignore[override]
        """Execute the agent action and return the next observation."""
        if self._state_data is None:
            self.reset()

        assert self._state_data is not None
        state = self._state_data

        state.step_count += 1
        reward = 0.0
        done = False
        output = ""
        error = None

        try:
            if action.command == "list_resources":
                res_list = [
                    f"{resource_id} ({data['type']})"
                    for resource_id, data in sorted(state.resources.items())
                ]
                output = "Available Resources:\n" + "\n".join(res_list)

            elif action.command == "describe_resource":
                if not action.resource_id or action.resource_id not in state.resources:
                    raise ValueError(f"Resource {action.resource_id} not found.")

                output = str(state.resources[action.resource_id])

                if self.task_name == "easy" and action.resource_id == "sg-web":
                    reward += self._reward_once("read_sg", 0.2)
                elif self.task_name == "medium" and action.resource_id == "sg-db":
                    reward += self._reward_once("read_sg", 0.2)
                elif self.task_name == "hard" and action.resource_id == "i-web2":
                    reward += self._reward_once("inspect_target", 0.2)

            elif action.command == "view_logs":
                if not action.resource_id:
                    raise ValueError("resource_id is required for view_logs.")

                res = state.resources.get(action.resource_id)
                if not res:
                    raise ValueError(f"Resource {action.resource_id} not found.")

                output = str(res.get("logs", "No logs available for this resource."))

                if self.task_name == "medium" and action.resource_id == "i-api":
                    reward += self._reward_once("read_logs", 0.2)
                elif self.task_name == "hard" and action.resource_id == "lb-main":
                    reward += self._reward_once("inspect_lb", 0.2)
                elif self.task_name == "hard" and action.resource_id == "i-web2":
                    reward += self._reward_once("inspect_target", 0.2)

            elif action.command == "update_security_group":
                if not action.resource_id:
                    raise ValueError("resource_id is required for update_security_group.")

                res = state.resources.get(action.resource_id)
                if not res or res.get("type") != "SecurityGroup":
                    raise ValueError(f"Invalid Security Group ID: {action.resource_id}")
                if not action.parameters or "port" not in action.parameters:
                    raise ValueError("Missing 'port' in parameters.")

                rule = copy.deepcopy(action.parameters)
                rules = res.get("rules")
                if not isinstance(rules, list):
                    raise ValueError(f"Security group {action.resource_id} has invalid rules.")
                rules.append(rule)
                output = f"Successfully updated {action.resource_id} with rule: {rule}"

                port = int(rule["port"])
                if (
                    self.task_name == "easy"
                    and action.resource_id == "sg-web"
                    and port == 80
                ):
                    state.is_resolved = True
                    reward += 0.8
                    done = True
                    output += "\nSUCCESS: Web server is now accessible!"
                elif (
                    self.task_name == "medium"
                    and action.resource_id == "sg-db"
                    and port == 5432
                ):
                    if "read_logs" in self._achievements:
                        state.is_resolved = True
                        reward += 0.6
                        done = True
                        output += "\nSUCCESS: Database connection restored!"
                    else:
                        reward -= 0.1
                        output += (
                            "\nWARNING: Change applied without incident triage. "
                            "Inspect API logs before closing the incident."
                        )

            elif action.command == "restart_service":
                if not action.resource_id:
                    raise ValueError("resource_id is required for restart_service.")
                if action.resource_id not in state.resources:
                    raise ValueError(f"Resource {action.resource_id} not found.")

                output = f"Service on {action.resource_id} restarted."

                if self.task_name == "hard":
                    if action.resource_id == "i-web2":
                        investigated_root_cause = (
                            "inspect_lb" in self._achievements
                            and "inspect_target" in self._achievements
                        )
                        if investigated_root_cause:
                            state.resources["i-web2"]["status"] = "running"
                            state.resources["i-web2"][
                                "logs"
                            ] = "INFO: Restart successful. Memory cleared."
                            state.is_resolved = True
                            reward += 0.8
                            done = True
                            output += "\nSUCCESS: OutOfMemory loop broken. System stable."
                        else:
                            reward -= 0.1
                            output += (
                                "\nWARNING: Restart denied by change policy. "
                                "Find failing upstream from lb-main and inspect i-web2 first."
                            )
                    elif action.resource_id == "i-web1":
                        reward -= 0.2
                        output += (
                            "\nWARNING: You restarted a healthy production server! "
                            "Users dropped."
                        )

            elif action.command == "submit_solution":
                if state.is_resolved:
                    done = True
                    output = "Solution verified. System is HEALTHY."
                else:
                    if self.task_name == "hard":
                        # In hard mode, unresolved submission should not abort the run.
                        done = False
                        reward -= 0.1
                        output = (
                            "Solution incorrect. Incident is still CRITICAL. "
                            "Continue triage and remediation before submitting."
                        )
                    else:
                        done = True
                        output = "Solution incorrect. System is still CRITICAL."

            else:
                raise ValueError(f"Unsupported command: {action.command}")

        except Exception as exc:
            error = str(exc)
            output = f"Command Failed: {error}"

        if state.step_count >= self.MAX_STEPS and not done:
            done = True
            timeout_suffix = "\nTIMEOUT: Max steps reached."
            output = f"{output}{timeout_suffix}" if output else timeout_suffix.strip()

        reward = max(-1.0, min(1.0, reward))
        status = "HEALTHY" if state.is_resolved else "CRITICAL"
        info = {
            "step_count": state.step_count,
            "resolved": state.is_resolved,
            "task": self.task_name,
            "achievements": sorted(self._achievements),
            "total_resources": len(state.resources),
        }

        return CloudObservation(
            output=output,
            error=error,
            system_health_status=status,
            done=done,
            reward=reward,
            metadata=info,
            echoed_message=output,
            message_length=len(output),
        )

    @property
    def state(self) -> State:
        """Return hidden environment state for evaluators/debugging."""
        if self._state_data is None:
            self.reset()
        assert self._state_data is not None
        return self._state_data
server/requirements.txt
ADDED
@@ -0,0 +1,6 @@
openenv[core]>=0.2.0
fastapi>=0.115.0
uvicorn>=0.24.0