Krishna1107 committed on
Commit
c8f3b98
·
1 Parent(s): 6a5922c

full devops

README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- title: CI/CD + Docker Debug Environment
  emoji: 🔧
  colorFrom: blue
  colorTo: green
@@ -8,19 +8,20 @@ app_port: 8000
  pinned: false
  ---

- # CI/CD Debug Environment

- An OpenEnv-compatible environment where AI agents learn to debug broken GitHub Actions workflows and Dockerfiles. Built for the OpenEnv Hackathon by Scaler School of Technology (partners: Meta, HuggingFace, PyTorch).

- ## Why CI/CD Debugging?

- Every developer who ships code hits CI/CD failures. A misconfigured Dockerfile, a broken GitHub Actions workflow, a missing secret — these are the bugs that waste hours of developer time every week. They're hard to debug because:

  - Error messages are cryptic ("unable to prepare context: unable to evaluate symlinks")
  - The feedback loop is slow (push, wait for CI, read logs, fix, repeat)
- - Multiple config files interact in non-obvious ways (Dockerfile + workflow + secrets)

- This environment teaches AI agents to do what senior DevOps engineers do: read the error, trace it to the root cause, and fix it.

  ---

@@ -30,7 +31,7 @@ This environment teaches AI agents to do what senior DevOps engineers do: read t
  ┌──────────────────────────────────────────────────────────────┐
  │ 1. RESET │
  │ Agent receives: │
- │ - Broken config files (Dockerfile / workflow YAML)
  │ - Error message from the failed build/deploy │
  │ - Available secrets list │
  │ - Number of issues to find │
@@ -60,11 +61,11 @@ This environment teaches AI agents to do what senior DevOps engineers do: read t

  ---

- ## The 6 Tasks (30 Scenarios)

  ### Task 1: Dockerfile Syntax Errors — Easy

- Simple typos and instruction errors that break `docker build`. These are the bugs every developer makes on day one.

  | # | Scenario | What's Broken | Real-World Context |
  |---|----------|---------------|-------------------|
@@ -72,67 +73,115 @@ Simple typos and instruction errors that break `docker build`. These are the bug
  | 2 | `invalid_base_image` | `FROM python:3.9-slimm` — extra 'm' in tag | Happens when copy-pasting image tags |
  | 3 | `invalid_run_syntax` | `RUN pip install ... \n && python setup.py` — broken line continuation | Formatting multi-line RUN commands is tricky |
  | 4 | `invalid_expose` | `EXPOSE "eighty"` — string instead of port number | EXPOSE only accepts numeric ports |
- | 5 | `missing_from_instruction` | No `FROM` instruction at all | Dockerfile must start with FROM (or ARG before FROM) |

  ### Task 2: Dockerfile Runtime Errors — Medium

- The Dockerfile builds successfully, but the container crashes when you run it. These are harder because the error appears at runtime, not build time.

  | # | Scenario | What's Broken | Real-World Context |
  |---|----------|---------------|-------------------|
  | 1 | `missing_workdir` | No WORKDIR — files scatter to `/` | Container runs but `npm start` can't find `package.json` |
- | 2 | `cmd_entrypoint_conflict` | Both ENTRYPOINT and CMD defined as full commands | Process starts incorrectly; CMD should be args-only when ENTRYPOINT exists |
- | 3 | `entrypoint_not_executable` | Shell script lacks execute permission | `chmod +x` missing — "permission denied" at container start |
- | 4 | `missing_required_env` | App needs `DATABASE_URL` but it's not set | Container starts then crashes: "DATABASE_URL is not defined" |
- | 5 | `non_root_privileged_port` | Non-root user tries to bind port 80 | Security best practice (non-root) conflicts with port < 1024 |

  ### Task 3: Workflow Syntax & Structure — Easy

- GitHub Actions YAML has structural problems. GitHub rejects these before any job runs.

  | # | Scenario | What's Broken | Real-World Context |
  |---|----------|---------------|-------------------|
- | 1 | `checkout_after_build` | `docker build` runs before `actions/checkout` | No source code checked out — "Dockerfile not found" |
- | 2 | `missing_runs_on` | Job has no `runs-on` field | GitHub Actions rejects: every job needs a runner |
- | 3 | `invalid_trigger_syntax` | `branches: main` instead of `branches: [main]` | Must be a YAML list, not a scalar string |
- | 4 | `missing_step_uses_or_run` | Step has a name but no `uses:` or `run:` | Invalid step — must do something |
- | 5 | `missing_on_trigger` | No `on:` block at all | Workflow never triggers — GitHub doesn't know when to run it |

  ### Task 4: Workflow Secrets & Permissions — Medium

- Secrets exist in the repository but aren't wired correctly to the workflow steps. These are the bugs that make you say "but the secret is right there!"

  | # | Scenario | What's Broken | Real-World Context |
  |---|----------|---------------|-------------------|
- | 1 | `missing_env_secrets` | `$DOCKER_PASSWORD` in `run:` but no `env:` mapping | Secrets must be explicitly passed via `env:` block |
- | 2 | `wrong_secret_syntax` | `${ secrets.TOKEN }` instead of `${{ secrets.TOKEN }}` | Single braces vs double braces — subtle syntax difference |
- | 3 | `missing_token_permissions` | Pushing to GHCR without `permissions: packages: write` | GITHUB_TOKEN is read-only by default since 2023 |
- | 4 | `secret_not_in_env` | `curl` uses `$SLACK_WEBHOOK_URL` but it's not in `env:` | Same pattern as #1 — very common mistake |
- | 5 | `ghcr_wrong_credentials` | Using `DOCKER_PASSWORD` for GHCR login | GHCR uses `GITHUB_TOKEN`, not Docker Hub credentials |

  ### Task 5: CI + Docker Integration — Medium-Hard

- The workflow AND the Dockerfile interact. Fixing one file alone isn't enough — you need to understand how they work together.

  | # | Scenario | What's Broken | Real-World Context |
  |---|----------|---------------|-------------------|
- | 1 | `missing_buildx_for_platforms` | Multi-platform build without `setup-buildx-action` | Standard Docker builder can't cross-compile; need BuildKit |
- | 2 | `login_secrets_not_wired` | `docker login` step missing `env:` for secrets | Auth fails — "unauthorized: authentication required" |
- | 3 | `wrong_build_context` | Context is `./backend` but Dockerfile path is `./Dockerfile` | Path mismatch — build can't find the Dockerfile |
- | 4 | `cache_without_mode_max` | GHA cache export missing `mode=max` | Cache doesn't persist intermediate layers; slow rebuilds |
- | 5 | `push_without_login` | `docker push` without `docker login` first | "denied: requested access to the resource is denied" |

  ### Task 6: Multi-Stage Pipeline & Matrix — Hard

- Complex pipelines with multiple interacting bugs. The agent must find and fix 2-3 issues across multiple files.

  | # | Scenario | What's Broken | Real-World Context |
  |---|----------|---------------|-------------------|
- | 1 | `artifact_path_mismatch` | `COPY --from=builder /app/dist` but React outputs to `/app/build` | Framework output directories vary — CRA uses `build/`, Vite uses `dist/` |
- | 2 | `matrix_platform_arg` | Uses `$BUILDPLATFORM` without `ARG BUILDPLATFORM` declaration | Multi-arch builds need platform ARGs declared before FROM |
- | 3 | `cross_job_artifact` | Test job downloads artifact but missing `needs: build` | Jobs run in parallel by default — artifact doesn't exist yet |
- | 4 | `multiple_issues` | Dockerfile typo + workflow secrets not wired (2 bugs) | Real debugging: problems compound across files |
- | 5 | `matrix_version_failure` | Matrix includes Node 14 but code needs >= 16 + missing `needs:` | Version compatibility + job ordering — 2 bugs to find |

  ---

@@ -162,86 +211,21 @@ Scoring is **deterministic** (same actions always produce the same score) and **
  ### The Formula

  ```
- FINAL SCORE = Partial Fixes + Complete Bonus + Efficiency - Hint Penalty
  ```

- Clamped to `[0.0, 1.0]`.

  ### Component Breakdown

- #### 1. Partial Fix Credit (40% max)
-
- ```
- partial = 0.40 x (issues_fixed / issues_total)
- ```
-
- | Fixed | Total | Partial Score |
- |-------|-------|---------------|
- | 0/2 | 2 | 0.00 |
- | 1/2 | 2 | 0.20 |
- | 2/2 | 2 | 0.40 |
- | 1/3 | 3 | 0.133 |
-
- #### 2. Complete Solution Bonus (30% max)
-
- ```
- complete = 0.30 if ALL issues fixed
- complete = 0.00 otherwise
- ```
-
- All-or-nothing. Fix 2/3 issues? You get 0. Fix 3/3? You get 0.30.
-
- #### 3. Efficiency Bonus (30% max)
-
- ```
- if issues_fixed == 0: efficiency = 0.00 (no credit for doing nothing)
- if steps <= issues_total: efficiency = 0.30 (optimal — full bonus)
- if steps > issues_total: efficiency = 0.30 - 0.03 per extra step
- ```
-
- Rewards agents that fix issues quickly. The "optimal" number of steps equals the number of issues (one fix per step).
-
- | Issues | Steps Taken | Efficiency Score |
- |--------|-------------|-----------------|
- | 1 | 1 | 0.30 (optimal) |
- | 1 | 3 | 0.24 |
- | 1 | 8 | 0.09 |
- | 2 | 2 | 0.30 (optimal) |
- | 2 | 5 | 0.21 |
- | 0 fixed | any | 0.00 |
-
- #### 4. Hint Penalty (-5% each)
-
- ```
- penalty = 0.05 x hints_used
- ```
-
- Each `request_hint` action costs 5% off the final score.
-
- ### Score Examples
-
- | Scenario | Partial | Complete | Efficiency | Hints | **Final Score** |
- |----------|---------|----------|------------|-------|-----------------|
- | Fixed 0/2 issues | 0.00 | 0.00 | 0.00 | 0 | **0.000** |
- | Fixed 1/2 in 3 steps | 0.20 | 0.00 | 0.27 | 0 | **~0.470** |
- | Fixed 2/2 in 5 steps | 0.40 | 0.30 | 0.21 | 0 | **~0.910** |
- | Fixed 1/1 in 1 step | 0.40 | 0.30 | 0.30 | 0 | **1.000** |
- | Fixed 1/1 + 2 hints | 0.40 | 0.30 | 0.30 | -0.10 | **0.900** |
- | Submitted immediately | 0.00 | 0.00 | 0.00 | 0 | **0.000** |
-
- ### Per-Step Rewards (Dense Feedback)
-
- The agent also gets **immediate rewards** after each action (not just at the end):
-
- | Event | Reward |
- |-------|--------|
- | Fix validated (issue resolved) | +0.3 per issue fixed |
- | Successful validation improvement | +0.1 |
- | Failed edit (old_content didn't match) | -0.02 |
- | Request hint | -0.05 |
- | Submit (terminal) | 0.0 |
-
- This dense reward signal helps RL agents learn faster than sparse pass/fail grading.

  ---

@@ -249,8 +233,8 @@ This dense reward signal helps RL agents learn faster than sparse pass/fail grad

  | Endpoint | Method | Description |
  |----------|--------|-------------|
- | `/` | GET | Root health check |
- | `/health` | GET | OpenEnv health endpoint — returns `{"status": "healthy"}` |
  | `/metadata` | GET | Environment name, description, version, tags |
  | `/schema` | GET | Action, observation, and state JSON schemas |
  | `/reset` | POST | Start a new episode (optional: `task_id`, `scenario_id`, `seed`) |
@@ -268,61 +252,34 @@ This dense reward signal helps RL agents learn faster than sparse pass/fail grad
  # 1. Start an episode
  curl -X POST http://localhost:8000/reset \
  -H "Content-Type: application/json" \
- -d '{"task_id": "dockerfile_syntax", "scenario_id": "typo_filename"}'
-
- # Response: observation with broken Dockerfile + error message

- # 2. Fix the typo
  curl -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{
  "action": {
  "action_type": "edit_file",
  "edits": [{
- "file_path": "Dockerfile",
- "old_content": "COPY requirments.txt .",
- "new_content": "COPY requirements.txt ."
  }]
  }
  }'

- # Response: reward=0.4, issues_fixed=1/1
-
- # 3. Submit
- curl -X POST http://localhost:8000/step \
- -H "Content-Type: application/json" \
- -d '{"action": {"action_type": "submit"}}'
-
- # Response: done=true, episode complete
  ```

  ---

- ## Baseline Results (Llama 3.1 70B)
-
- Tested with `meta-llama/Llama-3.1-70B-Instruct` via HuggingFace router:
-
- | Task | Score | Notes |
- |------|-------|-------|
- | dockerfile_syntax | 1.000 | Solved perfectly in 1 step |
- | dockerfile_runtime | 1.000 | Solved perfectly in 1 step |
- | workflow_syntax_structure | 0.000 | LLM struggled with exact whitespace matching |
- | workflow_secrets_permissions | 1.000 | Solved perfectly in 1 step |
- | ci_docker_integration | 0.000 | Multi-step fix needed; LLM edits didn't match exactly |
- | multi_stage_pipeline_matrix | 0.283 | Fixed 1/3 issues |
- | **OVERALL** | **0.547** | |
-
- This shows the environment is both **solvable** (3 perfect scores) and **challenging** (2 zero scores, 1 partial). The main difficulty is exact string matching for edits — a realistic constraint that mirrors real file editing.
-
- ---
-
  ## Quick Start

  ### Local Development

  ```bash
  pip install -r requirements.txt
- python -m uvicorn server.main:app --host 0.0.0.0 --port 8000
  ```

  ### Run Tests
@@ -334,8 +291,8 @@ pytest tests/ -v
  ### Docker

  ```bash
- docker build -t cicd-docker-env .
- docker run -p 8000:8000 cicd-docker-env
  ```

  ### Baseline Inference (with LLM)
@@ -352,7 +309,7 @@ python inference.py
  ## Project Structure

  ```
- cicd-docker-env/
  ├── openenv.yaml # OpenEnv environment specification
  ├── inference.py # LLM baseline (OpenAI client + HF router)
  ├── baseline_runner.py # Heuristic baseline for /baseline endpoint
@@ -360,24 +317,28 @@ cicd-docker-env/
  ├── requirements.txt # Python dependencies

  ├── server/
- │ ├── main.py # FastAPI with 12 endpoints
  │ ├── models.py # Pydantic models (type-safe API)
  │ ├── environment.py # Core environment loop (reset/step/state)
  │ ├── tasks/
  │ │ ├── base.py # BaseTask with scenario loading
- │ │ ├── task_registry.py # Maps task_id → task class
  │ │ ├── task_1_build_errors.py # 5 Dockerfile syntax scenarios
  │ │ ├── task_2_docker_runtime.py # 5 Dockerfile runtime scenarios
  │ │ ├── task_3_workflow_syntax.py # 5 workflow structure scenarios
  │ │ ├── task_4_workflow_secrets_permissions.py # 5 secrets scenarios
  │ │ ├── task_5_ci_docker_integration.py # 5 integration scenarios
- │ │ └── task_6_multi_stage_matrix.py # 5 multi-issue scenarios
  │ ├── graders/
- │ │ ├── __init__.py # Deterministic trajectory grader
- │ │ └── base.py # Base grader with weight constants
  │ └── simulators/
  │ ├── docker_simulator.py # 15+ Dockerfile validation rules
- │ └── workflow_simulator.py # 15+ workflow validation rules

  └── tests/
  ├── test_endpoints.py # API endpoint tests
@@ -389,12 +350,12 @@ cicd-docker-env/

  ## Design Decisions

- 1. **Docker + GitHub Actions combined**: These two tools intersect in every modern deployment pipeline. Debugging their interaction is the hardest part of DevOps.
- 2. **Simulated validation (no real Docker)**: Static analysis rules instead of running actual containers. This gives deterministic results, fast execution, and no security concerns.
- 3. **Dense rewards**: Partial credit at every step (+0.3 per fix, -0.02 per failed edit) rather than sparse pass/fail. Helps RL agents learn faster.
- 4. **Difficulty progression**: Easy tasks are single-file, single-issue. Hard tasks are multi-file, multi-issue with interacting bugs.
- 5. **Exact string matching for edits**: Mirrors real file editing — whitespace matters. This is intentionally challenging for LLMs.
- 6. **30 scenarios from real bugs**: Every scenario is based on actual developer mistakes documented on Stack Overflow, GitHub Issues, and Docker/GitHub Actions documentation.

  ## License

  ---
+ title: Cloud-Native DevOps Debug Environment
  emoji: 🔧
  colorFrom: blue
  colorTo: green

  pinned: false
  ---

+ # Cloud-Native DevOps Debug Environment

+ An OpenEnv-compatible environment where AI agents learn to debug broken GitHub Actions workflows, Dockerfiles, and Kubernetes manifests. Built for the OpenEnv Hackathon by Scaler School of Technology (partners: Meta, HuggingFace, PyTorch).

+ ## Why Cloud-Native Debugging?

+ Every developer who ships code hits deployment pipeline failures. A misconfigured Dockerfile, a broken GitHub Actions workflow, a missing secret, a Kubernetes selector mismatch — these are the bugs that waste hours of developer time every week. They're hard to debug because:

  - Error messages are cryptic ("unable to prepare context: unable to evaluate symlinks")
  - The feedback loop is slow (push, wait for CI, read logs, fix, repeat)
+ - Multiple config files interact in non-obvious ways (Dockerfile + workflow + secrets + K8s manifests)
+ - Kubernetes errors require cross-resource reasoning (Deployment labels must match Service selectors)

+ This environment teaches AI agents to do what senior DevOps engineers do: read the error, trace it to the root cause across multiple files, and fix it.

  ---

  ┌──────────────────────────────────────────────────────────────┐
  │ 1. RESET │
  │ Agent receives: │
+ │ - Broken config files (Dockerfile / workflow / K8s YAML)
  │ - Error message from the failed build/deploy │
  │ - Available secrets list │
  │ - Number of issues to find │

  ---

+ ## The 10 Tasks (50 Scenarios)

  ### Task 1: Dockerfile Syntax Errors — Easy

+ Simple typos and instruction errors that break `docker build`.

  | # | Scenario | What's Broken | Real-World Context |
  |---|----------|---------------|-------------------|
  | 2 | `invalid_base_image` | `FROM python:3.9-slimm` — extra 'm' in tag | Happens when copy-pasting image tags |
  | 3 | `invalid_run_syntax` | `RUN pip install ... \n && python setup.py` — broken line continuation | Formatting multi-line RUN commands is tricky |
  | 4 | `invalid_expose` | `EXPOSE "eighty"` — string instead of port number | EXPOSE only accepts numeric ports |
+ | 5 | `missing_from_instruction` | No `FROM` instruction at all | Dockerfile must start with FROM |

  ### Task 2: Dockerfile Runtime Errors — Medium

+ The Dockerfile builds successfully, but the container crashes at runtime.

  | # | Scenario | What's Broken | Real-World Context |
  |---|----------|---------------|-------------------|
  | 1 | `missing_workdir` | No WORKDIR — files scatter to `/` | Container runs but `npm start` can't find `package.json` |
+ | 2 | `cmd_entrypoint_conflict` | Both ENTRYPOINT and CMD defined as full commands | Process starts incorrectly |
+ | 3 | `entrypoint_not_executable` | Shell script lacks execute permission | `chmod +x` missing — "permission denied" |
+ | 4 | `missing_required_env` | App needs `DATABASE_URL` but it's not set | Container crashes: "DATABASE_URL is not defined" |
+ | 5 | `non_root_privileged_port` | Non-root user tries to bind port 80 | Security best practice conflicts with port < 1024 |

  ### Task 3: Workflow Syntax & Structure — Easy

+ GitHub Actions YAML has structural problems that GitHub rejects before any job runs.

  | # | Scenario | What's Broken | Real-World Context |
  |---|----------|---------------|-------------------|
+ | 1 | `checkout_after_build` | `docker build` before `actions/checkout` | No source code — "Dockerfile not found" |
+ | 2 | `missing_runs_on` | Job has no `runs-on` field | Every job needs a runner |
+ | 3 | `invalid_trigger_syntax` | `branches: main` instead of `branches: [main]` | Must be a YAML list |
+ | 4 | `missing_step_uses_or_run` | Step has a name but no `uses:` or `run:` | Invalid step |
+ | 5 | `missing_on_trigger` | No `on:` block at all | Workflow never triggers |

  ### Task 4: Workflow Secrets & Permissions — Medium

+ Secrets exist but aren't wired correctly to the workflow steps.

  | # | Scenario | What's Broken | Real-World Context |
  |---|----------|---------------|-------------------|
+ | 1 | `missing_env_secrets` | `$DOCKER_PASSWORD` without `env:` mapping | Secrets must be passed via `env:` block |
+ | 2 | `wrong_secret_syntax` | `${ secrets.TOKEN }` instead of `${{ secrets.TOKEN }}` | Single vs double braces |
+ | 3 | `missing_token_permissions` | Pushing to GHCR without `permissions: packages: write` | GITHUB_TOKEN is read-only by default |
+ | 4 | `secret_not_in_env` | `$SLACK_WEBHOOK_URL` not in `env:` | Very common mistake |
+ | 5 | `ghcr_wrong_credentials` | Using `DOCKER_PASSWORD` for GHCR login | GHCR uses `GITHUB_TOKEN` |
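
The `wrong_secret_syntax` row above reduces to a mechanical pattern check: GitHub Actions only expands expressions written as `${{ ... }}`, so a single-brace `${ secrets.X }` reaches the shell verbatim and the secret never resolves. A minimal sketch of such a rule (the function name and regex are illustrative, not the environment's actual workflow simulator):

```python
import re

# Matches ${ secrets.X } (single braces) but not ${{ secrets.X }}.
SINGLE_BRACE = re.compile(r"\$\{(?!\{)\s*secrets\.[A-Za-z_][A-Za-z0-9_]*\s*\}")

def find_bad_secret_refs(workflow_text: str) -> list[str]:
    """Return every malformed single-brace secret reference in the text."""
    return SINGLE_BRACE.findall(workflow_text)
```

The negative lookahead after `${` is what lets valid `${{ secrets.X }}` expressions pass while the single-brace variant is flagged.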

  ### Task 5: CI + Docker Integration — Medium-Hard

+ The workflow AND the Dockerfile interact. Fixing one file alone isn't enough.

  | # | Scenario | What's Broken | Real-World Context |
  |---|----------|---------------|-------------------|
+ | 1 | `missing_buildx_for_platforms` | Multi-platform build without `setup-buildx-action` | Need BuildKit for cross-compile |
+ | 2 | `login_secrets_not_wired` | `docker login` missing `env:` for secrets | "unauthorized: authentication required" |
+ | 3 | `wrong_build_context` | Context is `./backend` but Dockerfile path is `./Dockerfile` | Path mismatch |
+ | 4 | `cache_without_mode_max` | GHA cache export missing `mode=max` | Cache doesn't persist |
+ | 5 | `push_without_login` | `docker push` without `docker login` first | "denied: requested access" |

  ### Task 6: Multi-Stage Pipeline & Matrix — Hard

+ Complex pipelines with multiple interacting bugs. Agent must find 2-3 issues across files.

  | # | Scenario | What's Broken | Real-World Context |
  |---|----------|---------------|-------------------|
+ | 1 | `artifact_path_mismatch` | `COPY --from=builder /app/dist` but React outputs to `/app/build` | CRA uses `build/`, Vite uses `dist/` |
+ | 2 | `matrix_platform_arg` | `$BUILDPLATFORM` without `ARG BUILDPLATFORM` | Multi-arch needs platform ARGs |
+ | 3 | `cross_job_artifact` | Test job downloads artifact but missing `needs: build` | Jobs run in parallel by default |
+ | 4 | `multiple_issues` | Dockerfile typo + workflow secrets not wired (2 bugs) | Problems compound across files |
+ | 5 | `matrix_version_failure` | Matrix includes Node 14 but code needs >= 16 + missing `needs:` | 2 bugs to find |
+
+ ### Task 7: Kubernetes Pod Failures — Medium
+
+ Pod crashes and scheduling failures in Kubernetes deployments.
+
+ | # | Scenario | What's Broken | Real-World Context |
+ |---|----------|---------------|-------------------|
+ | 1 | `oom_killed` | Memory limit 64Mi too low — CrashLoopBackOff/OOMKilled | Most common K8s production issue |
+ | 2 | `image_pull_backoff` | Image tag typo `nginx:latset` → ImagePullBackOff | Copy-paste tag errors |
+ | 3 | `wrong_command` | `command: ["python", "workers.py"]` but file is `worker.py` | File name mismatch |
+ | 4 | `missing_configmap` | `envFrom: configMapRef: app-config` but ConfigMap doesn't exist | CreateContainerConfigError |
+ | 5 | `liveness_probe_failing` | Liveness probe port 3000 but app listens on 8080 | Probe misconfiguration causes restarts |
+
+ ### Task 8: Kubernetes Service & Ingress Issues — Hard
+
+ Networking issues where pods run fine but traffic doesn't reach them.
+
+ | # | Scenario | What's Broken | Real-World Context |
+ |---|----------|---------------|-------------------|
+ | 1 | `selector_mismatch` | Service selector `app: api` but pod label is `app: api-server` | No endpoints — most common K8s networking bug |
+ | 2 | `port_mismatch` | Service targetPort 8080 but container listens on 3000 | Connection refused |
+ | 3 | `ingress_wrong_service` | Ingress references `api-svc` but service name is `api-service` | Ingress 404 |
+ | 4 | `network_policy_blocking` | NetworkPolicy with empty ingress rules blocks all traffic | Database unreachable |
+ | 5 | `missing_ingress_class` | No `ingressClassName: nginx` specified | Ingress controller doesn't pick it up |
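
The `selector_mismatch` bug above is, mechanically, a containment check: a Service only gets endpoints if every key/value pair in its selector appears in the pod template's labels. A sketch of that check, assuming simple equality-based selectors (not the environment's real K8s simulator):

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """True if every selector key/value appears among the pod labels.
    Pods may carry extra labels; the selector must be a subset."""
    return all(pod_labels.get(key) == value for key, value in selector.items())
```

With `app: api` against `app: api-server` this returns False, which is exactly why the Service ends up with no endpoints.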
+
+ ### Task 9: CI/CD Build & Push Pipeline — Hard
+
+ GHA-to-Docker-to-Registry pipeline failures spanning multiple files.
+
+ | # | Scenario | What's Broken | Real-World Context |
+ |---|----------|---------------|-------------------|
+ | 1 | `ghcr_token_not_mapped` | `$GITHUB_TOKEN` shell var not mapped from secrets | GHCR login fails |
+ | 2 | `image_tag_mismatch` | Build uses `github.ref_name` but push uses `github.sha` | "image not found locally" |
+ | 3 | `missing_packages_write` | No `permissions: packages: write` for GHCR push | "permission_denied: write_package" |
+ | 4 | `build_arg_not_passed` | Dockerfile `ARG APP_VERSION` but no `--build-arg` in workflow | Version file is empty |
+ | 5 | `multistage_output_mismatch` | `COPY --from=builder /app/dist` but react-scripts outputs to `/app/build` | Wrong output directory |
+
+ ### Task 10: Full Stack Deployment Pipeline — Expert
+
+ Multi-error scenarios spanning the entire stack: GHA + Dockerfile + K8s manifests. 2-4 bugs per scenario requiring cross-file reasoning.
+
+ | # | Scenario | What's Broken | Real-World Context |
+ |---|----------|---------------|-------------------|
+ | 1 | `full_pipeline_ghcr_and_selector` | GHCR token not mapped + K8s Service selector mismatch | 2 bugs across workflow + K8s |
+ | 2 | `full_pipeline_three_bugs` | Missing checkout + no WORKDIR + wrong container/service port | 4 bugs across 4 files |
+ | 3 | `full_pipeline_ghcr_dockerfile_k8s` | Wrong GHCR secret + base image typo + OOM memory limit | 3 bugs across all layers |
+ | 4 | `full_pipeline_permissions_image_ingress` | Missing packages:write + hardcoded image placeholder + no ingressClassName | 3 bugs |
+ | 5 | `full_pipeline_secrets_build_probe` | Docker secrets not wired + wrong build output dir + probe port mismatch | 4 bugs across all layers |

  ---

  ### The Formula

  ```
+ FINAL SCORE = Base + Partial Fixes + Complete Bonus + Efficiency - Hint Penalty - Failed Edit Penalty
  ```

+ Clamped to `(0.01, 0.99)`.

  ### Component Breakdown

+ | Component | Weight | Description |
+ |-----------|--------|-------------|
+ | Base score | 5% | Participation credit |
+ | Partial fixes | 35% | Proportional to `issues_fixed / issues_total` |
+ | Complete bonus | 25% | All issues fixed |
+ | Efficiency | 25% | Decays with extra steps beyond optimal |
+ | Hint penalty | -4% each | Per `request_hint` action |
+ | Failed edit penalty | -2% each | Per edit with no valid file path |

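
Read as code, the component table gives roughly the following shape. The clamp bounds come from the text above; the table does not state the per-step efficiency decay rate, so the `0.03` below is an assumption, as is the function itself:

```python
def final_score(fixed: int, total: int, steps: int,
                hints: int = 0, failed_edits: int = 0) -> float:
    """Sketch of the scoring table; 0.03 decay per extra step is assumed."""
    base = 0.05                                 # participation credit
    partial = 0.35 * (fixed / total)            # proportional fix credit
    complete = 0.25 if fixed == total else 0.0  # all-or-nothing bonus
    if fixed == 0:
        efficiency = 0.0                        # no credit for doing nothing
    else:
        efficiency = max(0.0, 0.25 - 0.03 * max(0, steps - total))
    score = base + partial + complete + efficiency
    score -= 0.04 * hints + 0.02 * failed_edits
    return min(0.99, max(0.01, score))          # clamp to (0.01, 0.99)
```

Under these weights, fixing 1/1 issue in one step with no hints lands at 0.90, since base + partial + complete + efficiency sum to 90%.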
  ---

  | Endpoint | Method | Description |
  |----------|--------|-------------|
+ | `/` | GET | Root page |
+ | `/health` | GET | Health check — returns `{"status": "healthy"}` |
  | `/metadata` | GET | Environment name, description, version, tags |
  | `/schema` | GET | Action, observation, and state JSON schemas |
  | `/reset` | POST | Start a new episode (optional: `task_id`, `scenario_id`, `seed`) |

  # 1. Start an episode
  curl -X POST http://localhost:8000/reset \
  -H "Content-Type: application/json" \
+ -d '{"task_id": "k8s_pod_failures", "scenario_id": "oom_killed"}'

+ # 2. Fix the memory limit
  curl -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{
  "action": {
  "action_type": "edit_file",
  "edits": [{
+ "file_path": "k8s/deployment.yaml",
+ "old_content": "memory: \"64Mi\"",
+ "new_content": "memory: \"256Mi\""
  }]
  }
  }'

+ # Response: reward=0.3, issues_fixed=1/1, done=true
  ```

  ---

  ## Quick Start

  ### Local Development

  ```bash
  pip install -r requirements.txt
+ python -m uvicorn server.app:app --host 0.0.0.0 --port 8000
  ```

  ### Run Tests

  ### Docker

  ```bash
+ docker build -t cloud-native-devops-env .
+ docker run -p 8000:8000 cloud-native-devops-env
  ```

  ### Baseline Inference (with LLM)
 
  ## Project Structure

  ```
+ cloud-native-devops-env/
  ├── openenv.yaml # OpenEnv environment specification
  ├── inference.py # LLM baseline (OpenAI client + HF router)
  ├── baseline_runner.py # Heuristic baseline for /baseline endpoint
  ├── requirements.txt # Python dependencies

  ├── server/
+ │ ├── app.py # FastAPI with 12 endpoints
  │ ├── models.py # Pydantic models (type-safe API)
  │ ├── environment.py # Core environment loop (reset/step/state)
  │ ├── tasks/
  │ │ ├── base.py # BaseTask with scenario loading
+ │ │ ├── task_registry.py # Maps task_id → task class (10 tasks)
  │ │ ├── task_1_build_errors.py # 5 Dockerfile syntax scenarios
  │ │ ├── task_2_docker_runtime.py # 5 Dockerfile runtime scenarios
  │ │ ├── task_3_workflow_syntax.py # 5 workflow structure scenarios
  │ │ ├── task_4_workflow_secrets_permissions.py # 5 secrets scenarios
  │ │ ├── task_5_ci_docker_integration.py # 5 integration scenarios
+ │ │ ├── task_6_multi_stage_matrix.py # 5 multi-issue scenarios
+ │ │ ├── k8s_pod.py # 5 Kubernetes pod failure scenarios
+ │ │ ├── k8s_networking.py # 5 K8s networking scenarios
+ │ │ ├── pipeline_build_deploy.py # 5 GHA→Docker→Registry scenarios
+ │ │ └── pipeline_full.py # 5 full-stack multi-error scenarios
  │ ├── graders/
+ │ │ └── __init__.py # Deterministic trajectory grader
  │ └── simulators/
  │ ├── docker_simulator.py # 15+ Dockerfile validation rules
+ │ ├── workflow_simulator.py # 15+ workflow validation rules
+ │ └── k8s_simulator.py # Kubernetes manifest validator

  └── tests/
  ├── test_endpoints.py # API endpoint tests

  ## Design Decisions

+ 1. **Full cloud-native stack**: Docker + GitHub Actions + Kubernetes, the three pillars of modern deployment pipelines.
+ 2. **Simulated validation (no real Docker/K8s)**: Static analysis rules give deterministic results, fast execution, and no security concerns.
+ 3. **Dense rewards**: Partial credit at every step (+0.3 per fix, -0.02 per failed edit) rather than sparse pass/fail.
+ 4. **Difficulty progression**: Easy tasks are single-file, single-issue. Expert tasks are multi-file, multi-issue with interacting bugs across all three layers.
+ 5. **Exact string matching for edits**: Mirrors real file editing — whitespace matters.
+ 6. **50 scenarios from real bugs**: Every scenario is based on actual developer mistakes documented on Stack Overflow, GitHub Issues, and official documentation.
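
Decision #5 is worth making concrete: an edit succeeds only when `old_content` matches the file byte-for-byte. A hypothetical sketch of that application step (helper name and return shape are illustrative, not the environment's actual code):

```python
def apply_edit(file_text: str, old: str, new: str) -> tuple[str, bool]:
    """Replace the first exact occurrence of `old`; whitespace matters.
    A miss leaves the file untouched (the env scores that as a failed edit)."""
    if old not in file_text:
        return file_text, False  # failed edit: nothing changed
    return file_text.replace(old, new, 1), True
```

An agent that writes `RUN pip` with a single space where the file has two fails the match, which is why whitespace-sensitive scenarios tend to be the hardest for LLM agents.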
 
  ## License
 
inference.py CHANGED
@@ -1,4 +1,4 @@
- """Baseline inference script for CI/CD Debug Environment.

  Uses OpenAI-compatible client to call Llama 3.1 70B via HuggingFace router.
  Required by OpenEnv specification.
@@ -29,8 +29,8 @@ ENV_URL = os.getenv("ENV_URL", "http://localhost:8000")
  LOCAL_IMAGE_NAME = os.getenv("LOCAL_IMAGE_NAME")
  MAX_STEPS = 8 # leave 2 steps buffer before env hard-limit of 10

- SYSTEM_PROMPT = """You are an expert DevOps engineer debugging CI/CD pipelines.
- You will receive broken Dockerfile and/or GitHub Actions workflow files along with error messages.

  Your job is to:
  1. Analyze the error message carefully
@@ -49,6 +49,18 @@ When you identify a fix, respond with a JSON object in this exact format:
  ]
  }

  If you believe all issues are fixed and want to submit, respond with:
  {"action": "submit"}

@@ -62,6 +74,8 @@ Rules:
  - Common issues: typos, wrong syntax, missing fields, wrong secret references
  - For GitHub Actions: check secret syntax (${{ }} not ${ }), env blocks, permissions
  - For Dockerfiles: check instruction syntax, file paths, base image tags
  - Always respond with valid JSON only, no markdown fences"""

@@ -280,7 +294,7 @@ def run_all_tasks(client: OpenAI) -> Dict[str, float]:

  def main():
  """Entry point for baseline inference."""
- print("CI/CD Debug Environment - Baseline Inference")
  print(f"API: {API_BASE_URL}")
  print(f"Model: {MODEL_NAME}")
  print(f"Environment: {ENV_URL}")

+ """Baseline inference script for Cloud-Native Debug Environment.

  Uses OpenAI-compatible client to call Llama 3.1 70B via HuggingFace router.
  Required by OpenEnv specification.

  LOCAL_IMAGE_NAME = os.getenv("LOCAL_IMAGE_NAME")
  MAX_STEPS = 8 # leave 2 steps buffer before env hard-limit of 10

+ SYSTEM_PROMPT = """You are an expert DevOps engineer debugging cloud-native deployment pipelines.
+ You will receive broken Dockerfile, GitHub Actions workflow, and/or Kubernetes manifest files along with error messages.

  Your job is to:
  1. Analyze the error message carefully

  ]
  }

+ To create a new file (e.g. a missing ConfigMap), use an empty old_content:
+ {
+ "reasoning": "Create missing ConfigMap manifest",
+ "edits": [
+ {
+ "file_path": "k8s/configmap.yaml",
+ "old_content": "",
+ "new_content": "apiVersion: v1\\nkind: ConfigMap\\n..."
+ }
+ ]
+ }
+
  If you believe all issues are fixed and want to submit, respond with:
  {"action": "submit"}


  - Common issues: typos, wrong syntax, missing fields, wrong secret references
  - For GitHub Actions: check secret syntax (${{ }} not ${ }), env blocks, permissions
  - For Dockerfiles: check instruction syntax, file paths, base image tags
+ - For Kubernetes: check label selectors, port matching, resource limits, probe configs, ingress rules
+ - For full-stack pipelines: issues may span multiple files (workflow + Dockerfile + K8s manifests)
  - Always respond with valid JSON only, no markdown fences"""


  def main():
  """Entry point for baseline inference."""
+ print("Cloud-Native Debug Environment - Baseline Inference")
  print(f"API: {API_BASE_URL}")
  print(f"Model: {MODEL_NAME}")
  print(f"Environment: {ENV_URL}")
openenv.yaml CHANGED
@@ -1,8 +1,8 @@
- name: cicd-docker-env
  version: "1.0.0"
  description: >
- Debug broken GitHub Actions workflows and Dockerfiles.
- AI agents identify and fix CI/CD infrastructure issues.

  author: Krishna
  license: MIT
@@ -10,8 +10,10 @@ tags:
  - devops
  - docker
  - github-actions
  - debugging
  - infrastructure

  environment:
  type: text
@@ -56,6 +58,30 @@ tasks:
  difficulty: hard
  num_scenarios: 5

  graders:
  dockerfile_syntax:
  type: deterministic
@@ -75,6 +101,18 @@ graders:
  multi_stage_pipeline_matrix:
  type: deterministic
  score_range: [0.0, 1.0]

  baseline:
  script: inference.py
@@ -85,6 +123,10 @@ baseline:
  workflow_secrets_permissions: 0.50
  ci_docker_integration: 0.45
  multi_stage_pipeline_matrix: 0.30

  resources:
  vcpu: 2

+ name: cloud-native-devops-env
  version: "1.0.0"
  description: >
+ Debug broken GitHub Actions workflows, Dockerfiles, and Kubernetes manifests.
+ AI agents identify and fix cloud-native deployment pipeline issues.

  author: Krishna
  license: MIT

  - devops
  - docker
  - github-actions
+ - kubernetes
  - debugging
  - infrastructure
+ - cloud-native

  environment:
  type: text

  difficulty: hard
  num_scenarios: 5

+ - id: k8s_pod_failures
+ name: Kubernetes Pod Failures
+ description: Fix Kubernetes pod failures including CrashLoopBackOff, ImagePullBackOff, and resource issues
+ difficulty: medium
+ num_scenarios: 5
+
+ - id: k8s_networking
+ name: Kubernetes Service & Ingress Issues
+ description: Fix Kubernetes networking issues including Service selectors, port mismatches, and Ingress configuration
+ difficulty: hard
+ num_scenarios: 5
+
+ - id: pipeline_build_deploy
+ name: CI/CD Build & Push Pipeline
+ description: Debug GHA-to-Docker-to-Registry pipeline failures across multiple files
+ difficulty: hard
+ num_scenarios: 5
+
+ - id: pipeline_full_stack
+ name: Full Stack Deployment Pipeline
+ description: Debug complex multi-error deployment pipelines across GHA workflows, Dockerfiles, and Kubernetes manifests
+ difficulty: expert
+ num_scenarios: 5
+
  graders:
  dockerfile_syntax:
  type: deterministic

  multi_stage_pipeline_matrix:
  type: deterministic
  score_range: [0.0, 1.0]
+ k8s_pod_failures:
+ type: deterministic
+ score_range: [0.0, 1.0]
+ k8s_networking:
+ type: deterministic
+ score_range: [0.0, 1.0]
+ pipeline_build_deploy:
+ type: deterministic
+ score_range: [0.0, 1.0]
+ pipeline_full_stack:
+ type: deterministic
+ score_range: [0.0, 1.0]

  baseline:
  script: inference.py

  workflow_secrets_permissions: 0.50
  ci_docker_integration: 0.45
  multi_stage_pipeline_matrix: 0.30
+ k8s_pod_failures: 0.50
+ k8s_networking: 0.40
+ pipeline_build_deploy: 0.35
+ pipeline_full_stack: 0.20

  resources:
  vcpu: 2
pyproject.toml CHANGED
@@ -3,16 +3,16 @@ requires = ["setuptools>=68.0", "wheel"]
  build-backend = "setuptools.build_meta"

  [project]
- name = "cicd-docker-env"
  version = "1.0.0"
- description = "OpenEnv environment for debugging CI/CD infrastructure — GitHub Actions workflows and Dockerfiles."
  readme = "README.md"
  license = {text = "MIT"}
  requires-python = ">=3.10"
  authors = [
  {name = "Krishna"},
  ]
- keywords = ["openenv", "cicd", "docker", "github-actions", "debugging"]
  classifiers = [
  "Programming Language :: Python :: 3",
  "License :: OSI Approved :: MIT License",
@@ -44,7 +44,7 @@ inference = [
  server = "server.app:main"

  [project.urls]
- Homepage = "https://huggingface.co/spaces/jester1177/cicd-docker-env"
  Repository = "https://github.com/melohub-xbit/GitHubActions-Docker-OpenEnv"

  [tool.setuptools.packages.find]

  build-backend = "setuptools.build_meta"

  [project]
+ name = "cloud-native-devops-env"
  version = "1.0.0"
+ description = "OpenEnv environment for debugging cloud-native deployment pipelines — GitHub Actions workflows, Dockerfiles, and Kubernetes manifests."
  readme = "README.md"
  license = {text = "MIT"}
  requires-python = ">=3.10"
  authors = [
  {name = "Krishna"},
  ]
+ keywords = ["openenv", "cicd", "docker", "github-actions", "kubernetes", "debugging", "cloud-native"]
  classifiers = [
  "Programming Language :: Python :: 3",
  "License :: OSI Approved :: MIT License",

  server = "server.app:main"

  [project.urls]
+ Homepage = "https://huggingface.co/spaces/jester1177/cloud-native-devops-env"
  Repository = "https://github.com/melohub-xbit/GitHubActions-Docker-OpenEnv"

  [tool.setuptools.packages.find]
server/app.py CHANGED
@@ -1,4 +1,4 @@
- """FastAPI server for the CI/CD Debug Environment."""

  from pathlib import Path
  from typing import Optional
@@ -31,8 +31,8 @@ from server.tasks.task_registry import TASK_REGISTRY
  STATIC_DIR = Path(__file__).resolve().parent / "static"

  app = FastAPI(
- title="CI/CD + Docker Debug Environment",
- description="OpenEnv-style environment for Docker + GitHub Actions debugging",
  version="1.0.0",
  )

@@ -64,11 +64,11 @@ async def health():
  @app.get("/metadata")
  async def metadata():
  return {
- "name": "cicd-docker-env",
- "description": "Debug broken GitHub Actions workflows and Dockerfiles. AI agents identify and fix CI/CD infrastructure issues.",
  "version": "1.0.0",
  "author": "Krishna",
- "tags": ["devops", "docker", "github-actions", "debugging", "infrastructure"],
  }

@@ -95,7 +95,7 @@ async def mcp(request: dict = None):
  "result": {
  "protocolVersion": "2024-11-05",
  "capabilities": {"tools": {}},
- "serverInfo": {"name": "cicd-docker-env", "version": "1.0.0"},
  },
  }
  elif method == "tools/list":

+ """FastAPI server for the Cloud-Native DevOps Debug Environment."""

  from pathlib import Path
  from typing import Optional

  STATIC_DIR = Path(__file__).resolve().parent / "static"

  app = FastAPI(
+ title="Cloud-Native Debug Environment",
+ description="OpenEnv-style environment for Docker + GitHub Actions + Kubernetes debugging",
  version="1.0.0",
  )


  @app.get("/metadata")
  async def metadata():
  return {
+ "name": "cloud-native-devops-env",
+ "description": "Debug broken GitHub Actions workflows, Dockerfiles, and Kubernetes manifests. AI agents identify and fix cloud-native deployment pipeline issues.",
  "version": "1.0.0",
  "author": "Krishna",
+ "tags": ["devops", "docker", "github-actions", "kubernetes", "debugging", "infrastructure", "cloud-native"],
  }


  "result": {
  "protocolVersion": "2024-11-05",
  "capabilities": {"tools": {}},
+ "serverInfo": {"name": "cloud-native-devops-env", "version": "1.0.0"},
  },
  }
  elif method == "tools/list":
server/environment.py CHANGED
@@ -15,6 +15,7 @@ from server.models import (
  TaskDifficulty,
  )
  from server.simulators.docker_simulator import DockerSimulator
  from server.simulators.workflow_simulator import WorkflowSimulator
  from server.tasks.task_registry import TASK_REGISTRY, get_task

@@ -73,14 +74,17 @@ class CICDDebugEnvironment:
  docker_result = self.docker_sim.validate(self.current_files.get("Dockerfile"), self.current_files)
  workflow_file = self._find_workflow_file()
  workflow_result = self.workflow_sim.validate(workflow_file, self.current_files)
  return {
  "docker_build_valid": bool(docker_result.get("build_success", False)),
  "workflow_parse_valid": bool(workflow_result.get("parse_success", False)),
  }

  def __init__(self):
  self.docker_sim = DockerSimulator()
  self.workflow_sim = WorkflowSimulator()

  self.current_task_id: Optional[str] = None
  self.current_scenario_id: Optional[str] = None
@@ -203,6 +207,20 @@ class CICDDebugEnvironment:
  applied_count = 0
  for edit in action.edits:
  if edit.file_path not in self.current_files:
  feedbacks.append(f"File not found: {edit.file_path}")
  continue

@@ -277,6 +295,9 @@ class CICDDebugEnvironment:
  if not before_validation["workflow_parse_valid"] and after_validation["workflow_parse_valid"]:
  reward += 0.1
  feedbacks.append("Workflow parse validity improved")

  if applied_count == 0:
  self.last_action_success = False
@@ -290,6 +311,10 @@ class CICDDebugEnvironment:
  for fix in self.expected_fixes:
  file_path = fix["file"]
  if file_path not in self.current_files:
  continue
  current_content = self.current_files[file_path].content
  if fix["type"] == "contains" and fix["expected"] in current_content:
@@ -313,35 +338,84 @@ class CICDDebugEnvironment:
  docker_result = self.docker_sim.validate(self.current_files.get("Dockerfile"), self.current_files)
  workflow_file = self._find_workflow_file()
  workflow_result = self.workflow_sim.validate(workflow_file, self.current_files)

  reward = 0.0
  parts: List[str] = []

- if docker_result["build_success"]:
- reward += 0.3
- parts.append("Docker build: PASS")
  else:
- parts.append(f"Docker build: FAIL - {docker_result.get('error', 'unknown')}")

- if docker_result["run_success"]:
- reward += 0.2
- parts.append("Docker run: PASS")
- else:
- parts.append(f"Docker run: FAIL - {docker_result.get('run_error', 'unknown')}")

- if workflow_result["parse_success"]:
- reward += 0.2
- parts.append("Workflow parse: PASS")
- else:
- parts.append(f"Workflow parse: FAIL - {workflow_result.get('error', 'unknown')}")

- if workflow_result["execution_success"]:
- reward += 0.3
- parts.append("Workflow execution: PASS")
- else:
- parts.append(f"Workflow execution: FAIL - {workflow_result.get('exec_error', 'unknown')}")

- self.last_action_success = reward >= 0.8
  return reward, "; ".join(parts)

  def _handle_hint_request(self) -> Tuple[float, str]:

  TaskDifficulty,
  )
  from server.simulators.docker_simulator import DockerSimulator
+ from server.simulators.k8s_simulator import KubernetesSimulator
  from server.simulators.workflow_simulator import WorkflowSimulator
  from server.tasks.task_registry import TASK_REGISTRY, get_task


  docker_result = self.docker_sim.validate(self.current_files.get("Dockerfile"), self.current_files)
  workflow_file = self._find_workflow_file()
  workflow_result = self.workflow_sim.validate(workflow_file, self.current_files)
+ k8s_result = self.k8s_sim.validate(self.current_files)
  return {
  "docker_build_valid": bool(docker_result.get("build_success", False)),
  "workflow_parse_valid": bool(workflow_result.get("parse_success", False)),
+ "k8s_valid": bool(k8s_result.get("valid", True)),
  }

  def __init__(self):
  self.docker_sim = DockerSimulator()
  self.workflow_sim = WorkflowSimulator()
+ self.k8s_sim = KubernetesSimulator()

  self.current_task_id: Optional[str] = None
  self.current_scenario_id: Optional[str] = None

  applied_count = 0
  for edit in action.edits:
  if edit.file_path not in self.current_files:
+ # Allow creating new files (needed for K8s ConfigMap scenarios etc.)
+ if action.action_type == ActionType.EDIT_FILE and edit.new_content:
+ ft = FileType.OTHER
+ if edit.file_path.startswith("k8s/") or edit.file_path.endswith(".yaml") or edit.file_path.endswith(".yml"):
+ ft = FileType.KUBERNETES
+ self.current_files[edit.file_path] = FileContent(
+ path=edit.file_path,
+ content=edit.new_content,
+ file_type=ft,
+ line_count=edit.new_content.count("\n") + 1,
+ )
+ feedbacks.append(f"Created new file: {edit.file_path}")
+ applied_count += 1
+ continue
  feedbacks.append(f"File not found: {edit.file_path}")
  continue

  if not before_validation["workflow_parse_valid"] and after_validation["workflow_parse_valid"]:
  reward += 0.1
  feedbacks.append("Workflow parse validity improved")
+ if not before_validation["k8s_valid"] and after_validation["k8s_valid"]:
+ reward += 0.1
+ feedbacks.append("Kubernetes manifest validity improved")

  if applied_count == 0:
  self.last_action_success = False

  for fix in self.expected_fixes:
  file_path = fix["file"]
  if file_path not in self.current_files:
+ # For "contains" checks on missing files, the fix is not applied
+ # For "not_contains" checks on missing files, consider it fixed
+ if fix["type"] == "not_contains":
+ fixes_applied += 1
  continue
  current_content = self.current_files[file_path].content
  if fix["type"] == "contains" and fix["expected"] in current_content:

  docker_result = self.docker_sim.validate(self.current_files.get("Dockerfile"), self.current_files)
  workflow_file = self._find_workflow_file()
  workflow_result = self.workflow_sim.validate(workflow_file, self.current_files)
+ k8s_result = self.k8s_sim.validate(self.current_files)
+
+ has_k8s = any(fc.file_type == FileType.KUBERNETES for fc in self.current_files.values())
+ has_docker = "Dockerfile" in self.current_files
+ has_workflow = workflow_file is not None

  reward = 0.0
  parts: List[str] = []

+ # Determine weight distribution based on what file types are present
+ if has_docker and has_workflow and has_k8s:
+ # Full stack: Docker 20%, Workflow 30%, K8s 30%, fix progress 20%
+ docker_w, wf_w, k8s_w = 0.20, 0.30, 0.30
+ elif has_docker and has_workflow:
+ docker_w, wf_w, k8s_w = 0.50, 0.50, 0.0
+ elif has_docker and has_k8s:
+ docker_w, wf_w, k8s_w = 0.40, 0.0, 0.40
+ elif has_workflow and has_k8s:
+ docker_w, wf_w, k8s_w = 0.0, 0.40, 0.40
+ elif has_k8s:
+ docker_w, wf_w, k8s_w = 0.0, 0.0, 0.80
+ elif has_docker:
+ docker_w, wf_w, k8s_w = 0.50, 0.0, 0.0
  else:
+ docker_w, wf_w, k8s_w = 0.0, 0.50, 0.0

+ # Docker validation
+ if has_docker:
+ if docker_result.get("build_success"):
+ reward += docker_w * 0.6
+ parts.append("Docker build: PASS")
+ else:
+ parts.append(f"Docker build: FAIL - {docker_result.get('error', 'unknown')}")

+ if docker_result.get("run_success"):
+ reward += docker_w * 0.4
+ parts.append("Docker run: PASS")
+ else:
+ parts.append(f"Docker run: FAIL - {docker_result.get('run_error', 'unknown')}")

+ # Workflow validation
+ if has_workflow:
+ if workflow_result["parse_success"]:
+ reward += wf_w * 0.4
+ parts.append("Workflow parse: PASS")
+ else:
+ parts.append(f"Workflow parse: FAIL - {workflow_result.get('error', 'unknown')}")
+
+ if workflow_result["execution_success"]:
+ reward += wf_w * 0.6
+ parts.append("Workflow execution: PASS")
+ else:
+ parts.append(f"Workflow execution: FAIL - {workflow_result.get('exec_error', 'unknown')}")
+
+ # Kubernetes validation
+ if has_k8s:
+ if k8s_result["valid"]:
+ reward += k8s_w * 0.4
+ parts.append("K8s manifests: VALID")
+ else:
+ k8s_errors = k8s_result.get("errors", [])
+ parts.append(f"K8s manifests: INVALID - {'; '.join(k8s_errors[:2])}")
+
+ pod_status = k8s_result.get("pod_status", "N/A")
+ if pod_status == "Running":
+ reward += k8s_w * 0.3
+ parts.append(f"K8s pod status: {pod_status}")
+ else:
+ parts.append(f"K8s pod status: {pod_status}")
+
+ svc_status = k8s_result.get("service_status", "N/A")
+ if "active" in svc_status.lower() or svc_status == "N/A":
+ reward += k8s_w * 0.3
+ parts.append(f"K8s service: {svc_status}")
+ else:
+ parts.append(f"K8s service: {svc_status}")

+ self.last_action_success = reward >= 0.6
  return reward, "; ".join(parts)

  def _handle_hint_request(self) -> Tuple[float, str]:
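
As a sanity check on the weight table above: each simulator's sub-checks sum to its full weight (Docker 0.6+0.4, workflow 0.4+0.6, K8s 0.4+0.3+0.3), so the maximum submit-time reward is just the sum of the three weights. A small standalone sketch restating the diff's constants (the `max_submit_reward` helper is illustrative, not part of the environment):

```python
# Restates the submit-time weight table from the diff and computes the
# maximum achievable submit reward for each combination of present files.
WEIGHTS = {
    # (has_docker, has_workflow, has_k8s): (docker_w, wf_w, k8s_w)
    (True, True, True): (0.20, 0.30, 0.30),   # 0.20 reserved for fix progress
    (True, True, False): (0.50, 0.50, 0.0),
    (True, False, True): (0.40, 0.0, 0.40),
    (False, True, True): (0.0, 0.40, 0.40),
    (False, False, True): (0.0, 0.0, 0.80),
    (True, False, False): (0.50, 0.0, 0.0),
    (False, True, False): (0.0, 0.50, 0.0),
}

def max_submit_reward(has_docker: bool, has_workflow: bool, has_k8s: bool) -> float:
    docker_w, wf_w, k8s_w = WEIGHTS[(has_docker, has_workflow, has_k8s)]
    # Docker splits 0.6/0.4, workflow 0.4/0.6, K8s 0.4/0.3/0.3 — each sums to 1.0,
    # so a fully green submission earns the whole weight for that layer.
    return docker_w * (0.6 + 0.4) + wf_w * (0.4 + 0.6) + k8s_w * (0.4 + 0.3 + 0.3)

print(round(max_submit_reward(True, True, True), 2))   # 0.8
print(round(max_submit_reward(True, True, False), 2))  # 1.0
```

Note that the full-stack combination tops out at 0.8 at submit time; the remaining 0.2 has to come from step-wise fix progress.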
server/models.py CHANGED
@@ -23,11 +23,20 @@ class ActionType(str, Enum):
  REQUEST_HINT = "request_hint"

  class FileType(str, Enum):
  DOCKERFILE = "dockerfile"
  WORKFLOW = "workflow"
  DOCKER_COMPOSE = "docker_compose"
  REQUIREMENTS = "requirements"
  OTHER = "other"

38
  TEST = "test"
39
  PUSH = "push"
40
  DEPLOY = "deploy"
 
 
 
 
 
41
 
42
 
43
  class FileContent(BaseModel):
@@ -97,9 +111,9 @@ class TaskInfo(BaseModel):

  class EnvironmentInfo(BaseModel):
- name: str = "cicd-docker-env"
  version: str = "1.0.0"
- description: str = "Debug CI/CD infrastructure issues"
  tasks: List[TaskInfo]
  max_steps: int = 10
  action_space: Dict[str, Any]
@@ -108,7 +122,7 @@ class EnvironmentInfo(BaseModel):

  class GraderResult(BaseModel):
  task_id: str
- score: float = Field(..., gt=0.0, lt=1.0)
  max_score: float = 1.0
  breakdown: Dict[str, float] = Field(default_factory=dict)
  feedback: str = ""
 
  REQUEST_HINT = "request_hint"

+ class TaskDifficultyExtended(str, Enum):
+ EASY = "easy"
+ MEDIUM = "medium"
+ MEDIUM_HARD = "medium-hard"
+ HARD = "hard"
+ EXPERT = "expert"
+
+
  class FileType(str, Enum):
  DOCKERFILE = "dockerfile"
  WORKFLOW = "workflow"
  DOCKER_COMPOSE = "docker_compose"
  REQUIREMENTS = "requirements"
+ KUBERNETES = "kubernetes"
  OTHER = "other"

  TEST = "test"
  PUSH = "push"
  DEPLOY = "deploy"
+ K8S_VALIDATION = "k8s_validation"
+ K8S_RUNTIME = "k8s_runtime"
+ K8S_NETWORKING = "k8s_networking"
+ PIPELINE_BUILD = "pipeline_build"
+ PIPELINE_DEPLOY = "pipeline_deploy"

  class FileContent(BaseModel):

  class EnvironmentInfo(BaseModel):
+ name: str = "cloud-native-devops-env"
  version: str = "1.0.0"
+ description: str = "Debug cloud-native deployment pipeline issues"
  tasks: List[TaskInfo]
  max_steps: int = 10
  action_space: Dict[str, Any]

  class GraderResult(BaseModel):
  task_id: str
+ score: float = Field(..., ge=0.0, le=1.0)
  max_score: float = 1.0
  breakdown: Dict[str, float] = Field(default_factory=dict)
  feedback: str = ""
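
The `score` bound change above fixes a real validation bug: with exclusive `gt`/`lt` bounds, a perfect 1.0 or a zero score would be rejected by pydantic. A dependency-free sketch of the two rules (pydantic's `Field(gt=..., lt=...)` and `Field(ge=..., le=...)` enforce the same comparisons):

```python
# What each bound style accepts, written as plain comparisons.
def accepted_exclusive(score: float) -> bool:
    return 0.0 < score < 1.0    # old: Field(..., gt=0.0, lt=1.0)

def accepted_inclusive(score: float) -> bool:
    return 0.0 <= score <= 1.0  # new: Field(..., ge=0.0, le=1.0)

# A perfect 1.0 and a zero score are legitimate grader outputs:
print(accepted_exclusive(1.0), accepted_inclusive(1.0))  # False True
print(accepted_exclusive(0.0), accepted_inclusive(0.0))  # False True
```

Since the graders legitimately emit both 0.0 (no progress) and 1.0 (fully fixed), the inclusive bounds are the correct constraint.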
server/simulators/k8s_simulator.py ADDED
@@ -0,0 +1,328 @@
+ """Kubernetes manifest validator and simulator — deterministic, rule-based."""
+
+ import re
+ from typing import Any, Dict, List, Optional
+
+ import yaml
+
+ from server.models import FileContent
+
+
+ # Valid top-level K8s resource kinds we recognise
+ VALID_KINDS = {
+ "Deployment", "StatefulSet", "DaemonSet", "ReplicaSet",
+ "Pod", "Service", "Ingress", "ConfigMap", "Secret",
+ "PersistentVolumeClaim", "PersistentVolume",
+ "Job", "CronJob", "Namespace", "ServiceAccount",
+ "Role", "RoleBinding", "ClusterRole", "ClusterRoleBinding",
+ "HorizontalPodAutoscaler", "NetworkPolicy",
+ }
+
+ VALID_API_VERSIONS = {
+ "v1", "apps/v1", "batch/v1", "networking.k8s.io/v1",
+ "rbac.authorization.k8s.io/v1", "autoscaling/v2",
+ "autoscaling/v1", "policy/v1",
+ }
+
+
+ def _parse_memory(mem_str: str) -> int:
+ """Parse K8s memory string to bytes."""
+ mem_str = str(mem_str).strip()
+ multipliers = {
+ "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
+ "K": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
+ }
+ for suffix, mult in multipliers.items():
+ if mem_str.endswith(suffix):
+ return int(mem_str[:-len(suffix)]) * mult
+ if mem_str.isdigit():
+ return int(mem_str)
+ return 0
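
`_parse_memory` is what drives the simulated OOM check further down the file (limits below 128Mi are treated as OOMKilled). A standalone copy of the suffix-table logic for quick experimentation (renamed `parse_memory`; otherwise identical to the diff):

```python
# Standalone copy of the _parse_memory suffix-table logic from the diff.
# Binary suffixes (Ki/Mi/Gi/Ti) are listed before decimal ones (K/M/G/T),
# so "256Mi" matches "Mi" before the bare "M" could be considered.
def parse_memory(mem_str: str) -> int:
    """Parse a Kubernetes memory quantity string into bytes."""
    mem_str = str(mem_str).strip()
    multipliers = {
        "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
        "K": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
    }
    for suffix, mult in multipliers.items():
        if mem_str.endswith(suffix):
            return int(mem_str[:-len(suffix)]) * mult
    if mem_str.isdigit():
        return int(mem_str)  # bare integers are already bytes
    return 0  # unrecognised quantity

print(parse_memory("256Mi"))                 # 268435456
print(parse_memory("64Mi") < 128 * 1024**2)  # True -> would simulate OOMKilled
```

Returning 0 for unrecognised quantities is a deliberately forgiving choice: the OOM rule only fires for limits strictly between 0 and 128Mi, so unparseable values never trigger a false CrashLoopBackOff.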
+
+
+ class KubernetesSimulator:
+ """Simulates kubectl apply / kubectl get output.
+
+ Validates K8s manifests without a real cluster.
+ """
+
+ def validate(self, manifests: Dict[str, FileContent]) -> Dict[str, Any]:
+ """Validate all Kubernetes manifests in the file set.
+
+ Returns dict with keys:
+ valid: bool
+ errors: list of error strings
+ pod_status: simulated pod status
+ service_status: simulated service endpoint status
+ """
+ k8s_files: Dict[str, Any] = {}
+ errors: List[str] = []
+
+ # Parse all K8s YAML files
+ for path, fc in manifests.items():
+ if fc.file_type.value != "kubernetes":
+ continue
+ try:
+ docs = list(yaml.safe_load_all(fc.content))
+ for doc in docs:
+ if doc and isinstance(doc, dict):
+ k8s_files[path] = doc
+ except yaml.YAMLError as exc:
+ errors.append(f"YAML parse error in {path}: {exc}")
+
+ if not k8s_files and not errors:
+ return {"valid": True, "errors": [], "pod_status": "N/A", "service_status": "N/A"}
+
+ if errors:
+ return {"valid": False, "errors": errors, "pod_status": "Error", "service_status": "Error"}
+
+ # Validate each manifest
+ all_resources: List[Dict[str, Any]] = []
+ for path, doc in k8s_files.items():
+ resource_errors = self._validate_resource(path, doc)
+ errors.extend(resource_errors)
+ all_resources.append({"path": path, "doc": doc})
+
+ # Cross-resource validation
+ cross_errors = self._validate_cross_resources(all_resources)
+ errors.extend(cross_errors)
+
+ # Simulate pod status
+ pod_status = self._simulate_pod_status(all_resources)
+ service_status = self._simulate_service_status(all_resources)
+
+ return {
+ "valid": len(errors) == 0,
+ "errors": errors,
+ "pod_status": pod_status,
+ "service_status": service_status,
+ }
+
101
+ def _validate_resource(self, path: str, doc: Dict[str, Any]) -> List[str]:
102
+ """Validate a single K8s resource document."""
103
+ errors: List[str] = []
104
+
105
+ kind = doc.get("kind", "")
106
+ api_version = doc.get("apiVersion", "")
107
+
108
+ if not kind:
109
+ errors.append(f"{path}: missing 'kind' field")
110
+ elif kind not in VALID_KINDS:
111
+ errors.append(f"{path}: unknown kind '{kind}'")
112
+
113
+ if not api_version:
114
+ errors.append(f"{path}: missing 'apiVersion' field")
115
+ elif api_version not in VALID_API_VERSIONS:
116
+ errors.append(f"{path}: unknown apiVersion '{api_version}'")
117
+
118
+ metadata = doc.get("metadata", {})
119
+ if not isinstance(metadata, dict) or not metadata.get("name"):
120
+ errors.append(f"{path}: metadata.name is required")
121
+
122
+ # Kind-specific validation
123
+ if kind == "Deployment":
124
+ errors.extend(self._validate_deployment(path, doc))
125
+ elif kind == "Service":
126
+ errors.extend(self._validate_service(path, doc))
127
+ elif kind == "Ingress":
128
+ errors.extend(self._validate_ingress(path, doc))
129
+
130
+ return errors
131
+
132
+ def _validate_deployment(self, path: str, doc: Dict[str, Any]) -> List[str]:
133
+ errors: List[str] = []
134
+ spec = doc.get("spec", {})
135
+ if not isinstance(spec, dict):
136
+ errors.append(f"{path}: Deployment spec must be a mapping")
137
+ return errors
138
+
139
+ selector = spec.get("selector", {})
140
+ template = spec.get("template", {})
141
+
142
+ if not selector or not selector.get("matchLabels"):
143
+ errors.append(f"{path}: Deployment must have spec.selector.matchLabels")
144
+ return errors
145
+
146
+ tmpl_labels = template.get("metadata", {}).get("labels", {})
147
+ sel_labels = selector.get("matchLabels", {})
148
+
149
+ # selector must match template labels
150
+ for k, v in sel_labels.items():
151
+ if tmpl_labels.get(k) != v:
152
+ errors.append(
153
+ f"{path}: selector matchLabels ({k}={v}) does not match template labels"
154
+ )
155
+
156
+ # Validate containers
157
+ containers = template.get("spec", {}).get("containers", [])
158
+ if not containers:
159
+ errors.append(f"{path}: Deployment must have at least one container")
160
+
161
+ for c in containers:
162
+ if not c.get("image"):
163
+ errors.append(f"{path}: container '{c.get('name', '?')}' missing image")
164
+
165
+ return errors
166
+
167
+ def _validate_service(self, path: str, doc: Dict[str, Any]) -> List[str]:
168
+ errors: List[str] = []
169
+ spec = doc.get("spec", {})
170
+ if not isinstance(spec, dict):
171
+ errors.append(f"{path}: Service spec must be a mapping")
172
+ return errors
173
+
174
+ if not spec.get("selector"):
175
+ errors.append(f"{path}: Service must have spec.selector")
176
+
177
+ ports = spec.get("ports", [])
178
+ if not ports:
179
+ errors.append(f"{path}: Service must define at least one port")
180
+
181
+ for p in ports:
182
+ if not p.get("port"):
183
+ errors.append(f"{path}: Service port entry missing 'port' field")
184
+
185
+ return errors
186
+
187
+ def _validate_ingress(self, path: str, doc: Dict[str, Any]) -> List[str]:
188
+ errors: List[str] = []
189
+ spec = doc.get("spec", {})
190
+ rules = spec.get("rules", [])
191
+ if not rules:
192
+ errors.append(f"{path}: Ingress must define at least one rule")
193
+ return errors
194
+
195
+ def _validate_cross_resources(self, resources: List[Dict[str, Any]]) -> List[str]:
196
+ """Validate cross-resource dependencies (e.g. Service selector matches Deployment labels)."""
197
+ errors: List[str] = []
198
+
199
+ # Collect all pod labels from Deployments/StatefulSets
200
+ pod_labels_by_name: Dict[str, Dict[str, str]] = {}
201
+ for r in resources:
202
+ doc = r["doc"]
203
+            kind = doc.get("kind", "")
+            if kind in ("Deployment", "StatefulSet", "DaemonSet"):
+                tmpl = doc.get("spec", {}).get("template", {})
+                labels = tmpl.get("metadata", {}).get("labels", {})
+                name = doc.get("metadata", {}).get("name", "?")
+                pod_labels_by_name[name] = labels
+
+        # Check Service selectors match some pod labels
+        for r in resources:
+            doc = r["doc"]
+            if doc.get("kind") != "Service":
+                continue
+            svc_name = doc.get("metadata", {}).get("name", "?")
+            selector = doc.get("spec", {}).get("selector", {})
+            if not selector:
+                continue
+
+            matched = False
+            for dep_name, labels in pod_labels_by_name.items():
+                if all(labels.get(k) == v for k, v in selector.items()):
+                    matched = True
+                    break
+            if not matched and pod_labels_by_name:
+                errors.append(
+                    f"Service '{svc_name}' selector {selector} does not match any pod labels"
+                )
+
+        return errors
+
+    def _simulate_pod_status(self, resources: List[Dict[str, Any]]) -> str:
+        """Simulate what pod status would be."""
+        for r in resources:
+            doc = r["doc"]
+            kind = doc.get("kind", "")
+            if kind not in ("Deployment", "StatefulSet", "DaemonSet", "Pod"):
+                continue
+
+            if kind == "Pod":
+                containers = doc.get("spec", {}).get("containers", [])
+            else:
+                containers = doc.get("spec", {}).get("template", {}).get("spec", {}).get("containers", [])
+
+            for c in containers:
+                image = c.get("image", "")
+
+                # Check for image typos (common: latset, lates, etc.)
+                if image and ":" in image:
+                    tag = image.split(":")[-1]
+                    if tag in ("latset", "lates", "latets"):
+                        return "ImagePullBackOff"
+
+                # Check for hardcoded placeholder images
+                if "OWNER/REPO" in image or "TAG" in image:
+                    return "ImagePullBackOff"
+
+                # Check memory limits
+                resources_spec = c.get("resources", {})
+                limits = resources_spec.get("limits", {})
+                mem_limit = limits.get("memory", "")
+                if mem_limit:
+                    mem_bytes = _parse_memory(str(mem_limit))
+                    # Simulate OOM if memory limit is very low
+                    if 0 < mem_bytes < 128 * 1024 * 1024:  # < 128Mi
+                        return "CrashLoopBackOff (OOMKilled)"
+
+                # Check command
+                command = c.get("command", [])
+                if command and isinstance(command, list):
+                    if any("wrong" in str(cmd).lower() or "typo" in str(cmd).lower() for cmd in command):
+                        return "CrashLoopBackOff"
+
+                # Check env refs to missing configmaps
+                env_from = c.get("envFrom", [])
+                for ef in env_from:
+                    cm_ref = ef.get("configMapRef", {})
+                    if cm_ref and cm_ref.get("name"):
+                        # Check if configmap exists in resources
+                        cm_exists = any(
+                            res["doc"].get("kind") == "ConfigMap"
+                            and res["doc"].get("metadata", {}).get("name") == cm_ref["name"]
+                            for res in resources
+                        )
+                        if not cm_exists:
+                            return f"CreateContainerConfigError (ConfigMap '{cm_ref['name']}' not found)"
+
+        return "Running"
+
+    def _simulate_service_status(self, resources: List[Dict[str, Any]]) -> str:
+        """Simulate service endpoint status."""
+        services = [r for r in resources if r["doc"].get("kind") == "Service"]
+        deployments = [r for r in resources if r["doc"].get("kind") in ("Deployment", "StatefulSet")]
+
+        if not services:
+            return "N/A"
+
+        for svc_r in services:
+            svc = svc_r["doc"]
+            selector = svc.get("spec", {}).get("selector", {})
+            if not selector:
+                continue
+
+            matched = False
+            for dep_r in deployments:
+                dep = dep_r["doc"]
+                tmpl_labels = dep.get("spec", {}).get("template", {}).get("metadata", {}).get("labels", {})
+                if all(tmpl_labels.get(k) == v for k, v in selector.items()):
+                    matched = True
+
+                    # Check port matching
+                    svc_ports = svc.get("spec", {}).get("ports", [])
+                    container_ports = []
+                    for c in dep.get("spec", {}).get("template", {}).get("spec", {}).get("containers", []):
+                        for p in c.get("ports", []):
+                            container_ports.append(p.get("containerPort"))
+
+                    for sp in svc_ports:
+                        tp = sp.get("targetPort")
+                        if tp and tp not in container_ports and container_ports:
+                            return f"Service port mismatch: targetPort {tp} not in container ports {container_ports}"
+                    break
+
+            if not matched:
+                svc_name = svc.get("metadata", {}).get("name", "?")
+                return f"No endpoints (selector {selector} matches no pods)"
+
+        return "Endpoints active"
server/static/index.html CHANGED
@@ -3,8 +3,8 @@
  <head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
- <title>CI/CD + Docker Debug Environment</title>
- <meta name="description" content="OpenEnv environment where AI agents learn to debug broken GitHub Actions workflows and Dockerfiles.">
+ <title>Cloud-Native DevOps Debug Environment</title>
+ <meta name="description" content="OpenEnv environment where AI agents learn to debug broken GitHub Actions workflows, Dockerfiles, and Kubernetes manifests.">
  <link rel="preconnect" href="https://fonts.googleapis.com">
  <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700;800&family=JetBrains+Mono:wght@400;500&display=swap" rel="stylesheet">
  <style>
@@ -504,7 +504,7 @@
  OpenEnv Environment &middot; Live
  </div>
  <h1>
- <span class="gradient-text">CI/CD + Docker</span><br>
+ <span class="gradient-text">Cloud-Native DevOps</span><br>
  Debug Environment
  </h1>
  <p>
server/tasks/k8s_networking.py ADDED
@@ -0,0 +1,463 @@
+"""Task: Kubernetes Service & Ingress Issues — MEDIUM-HARD.
+
+Agent fixes networking issues in Kubernetes:
+selector mismatch, port mismatch, ingress path errors,
+NetworkPolicy blocking traffic, missing ingress annotations.
+"""
+
+from server.models import TaskDifficulty
+from server.tasks.base import BaseTask
+
+
+class K8sNetworkingTask(BaseTask):
+    NAME = "Kubernetes Service & Ingress Issues"
+    DESCRIPTION = "Fix Kubernetes networking issues including Service selectors, port mismatches, and Ingress configuration"
+    DIFFICULTY = TaskDifficulty.HARD
+    AVAILABLE_SECRETS = []
+
+    SCENARIOS = [
+        # Scenario 1: Service selector does not match Deployment labels
+        {
+            "id": "selector_mismatch",
+            "files": [
+                {
+                    "path": "k8s/deployment.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: apps/v1\n"
+                        "kind: Deployment\n"
+                        "metadata:\n"
+                        "  name: api\n"
+                        "spec:\n"
+                        "  replicas: 3\n"
+                        "  selector:\n"
+                        "    matchLabels:\n"
+                        "      app: api-server\n"
+                        "  template:\n"
+                        "    metadata:\n"
+                        "      labels:\n"
+                        "        app: api-server\n"
+                        "    spec:\n"
+                        "      containers:\n"
+                        "      - name: api\n"
+                        "        image: myapp:latest\n"
+                        "        ports:\n"
+                        "        - containerPort: 8080\n"
+                    ),
+                },
+                {
+                    "path": "k8s/service.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: v1\n"
+                        "kind: Service\n"
+                        "metadata:\n"
+                        "  name: api-service\n"
+                        "spec:\n"
+                        "  selector:\n"
+                        "    app: api\n"
+                        "  ports:\n"
+                        "  - port: 80\n"
+                        "    targetPort: 8080\n"
+                    ),
+                },
+            ],
+            "error": {
+                "phase": "k8s_networking",
+                "message": (
+                    "$ kubectl get endpoints api-service\n"
+                    "NAME          ENDPOINTS   AGE\n"
+                    "api-service   <none>      5m\n"
+                    "\n"
+                    "$ kubectl describe service api-service\n"
+                    "Name:       api-service\n"
+                    "Selector:   app=api\n"
+                    "Type:       ClusterIP\n"
+                    "Endpoints:  <none>\n"
+                    "\n"
+                    "$ kubectl get pods --show-labels\n"
+                    "NAME                  READY   STATUS    LABELS\n"
+                    "api-7f8d9c6b5-x2k9m   1/1     Running   app=api-server\n"
+                    "api-7f8d9c6b5-y3l0n   1/1     Running   app=api-server\n"
+                    "api-7f8d9c6b5-z4m1o   1/1     Running   app=api-server\n"
+                    "\n"
+                    "Note: Service selector 'app=api' does not match pod label 'app=api-server'"
+                ),
+            },
+            "expected_fixes": [
+                {
+                    "file": "k8s/service.yaml",
+                    "type": "contains",
+                    "expected": "app: api-server",
+                    "hint": "Service selector 'app: api' doesn't match Deployment label 'app: api-server'",
+                }
+            ],
+        },
+
+        # Scenario 2: Service targetPort does not match container port
+        {
+            "id": "port_mismatch",
+            "files": [
+                {
+                    "path": "k8s/deployment.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: apps/v1\n"
+                        "kind: Deployment\n"
+                        "metadata:\n"
+                        "  name: frontend\n"
+                        "spec:\n"
+                        "  replicas: 2\n"
+                        "  selector:\n"
+                        "    matchLabels:\n"
+                        "      app: frontend\n"
+                        "  template:\n"
+                        "    metadata:\n"
+                        "      labels:\n"
+                        "        app: frontend\n"
+                        "    spec:\n"
+                        "      containers:\n"
+                        "      - name: frontend\n"
+                        "        image: frontend:v1.0\n"
+                        "        ports:\n"
+                        "        - containerPort: 3000\n"
+                    ),
+                },
+                {
+                    "path": "k8s/service.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: v1\n"
+                        "kind: Service\n"
+                        "metadata:\n"
+                        "  name: frontend-svc\n"
+                        "spec:\n"
+                        "  selector:\n"
+                        "    app: frontend\n"
+                        "  ports:\n"
+                        "  - port: 80\n"
+                        "    targetPort: 8080\n"
+                    ),
+                },
+            ],
+            "error": {
+                "phase": "k8s_networking",
+                "message": (
+                    "$ kubectl get endpoints frontend-svc\n"
+                    "NAME           ENDPOINTS         AGE\n"
+                    "frontend-svc   10.244.0.5:8080   3m\n"
+                    "\n"
+                    "$ curl http://frontend-svc\n"
+                    "curl: (7) Failed to connect to frontend-svc port 80: Connection refused\n"
+                    "\n"
+                    "$ kubectl exec -it test-pod -- wget -qO- http://10.244.0.5:3000\n"
+                    "<!DOCTYPE html><html>...</html>\n"
+                    "\n"
+                    "Note: Service targetPort is 8080 but container listens on 3000"
+                ),
+            },
+            "expected_fixes": [
+                {
+                    "file": "k8s/service.yaml",
+                    "type": "contains",
+                    "expected": "targetPort: 3000",
+                    "hint": "Service targetPort (8080) doesn't match container port (3000)",
+                }
+            ],
+        },
+
+        # Scenario 3: Ingress path not matching backend service
+        {
+            "id": "ingress_wrong_service",
+            "files": [
+                {
+                    "path": "k8s/deployment.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: apps/v1\n"
+                        "kind: Deployment\n"
+                        "metadata:\n"
+                        "  name: api\n"
+                        "spec:\n"
+                        "  replicas: 2\n"
+                        "  selector:\n"
+                        "    matchLabels:\n"
+                        "      app: api\n"
+                        "  template:\n"
+                        "    metadata:\n"
+                        "      labels:\n"
+                        "        app: api\n"
+                        "    spec:\n"
+                        "      containers:\n"
+                        "      - name: api\n"
+                        "        image: myapi:v1.0\n"
+                        "        ports:\n"
+                        "        - containerPort: 8080\n"
+                    ),
+                },
+                {
+                    "path": "k8s/service.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: v1\n"
+                        "kind: Service\n"
+                        "metadata:\n"
+                        "  name: api-service\n"
+                        "spec:\n"
+                        "  selector:\n"
+                        "    app: api\n"
+                        "  ports:\n"
+                        "  - port: 80\n"
+                        "    targetPort: 8080\n"
+                    ),
+                },
+                {
+                    "path": "k8s/ingress.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: networking.k8s.io/v1\n"
+                        "kind: Ingress\n"
+                        "metadata:\n"
+                        "  name: api-ingress\n"
+                        "spec:\n"
+                        "  rules:\n"
+                        "  - host: api.example.com\n"
+                        "    http:\n"
+                        "      paths:\n"
+                        "      - path: /\n"
+                        "        pathType: Prefix\n"
+                        "        backend:\n"
+                        "          service:\n"
+                        "            name: api-svc\n"
+                        "            port:\n"
+                        "              number: 80\n"
+                    ),
+                },
+            ],
+            "error": {
+                "phase": "k8s_networking",
+                "message": (
+                    "$ kubectl describe ingress api-ingress\n"
+                    "Name:  api-ingress\n"
+                    "Rules:\n"
+                    "  Host             Path  Backends\n"
+                    "  ----             ----  --------\n"
+                    "  api.example.com\n"
+                    "                   /     api-svc:80 (<error: endpoints \"api-svc\" not found>)\n"
+                    "\n"
+                    "$ kubectl get svc\n"
+                    "NAME          TYPE        CLUSTER-IP   PORT(S)\n"
+                    "api-service   ClusterIP   10.96.0.10   80/TCP\n"
+                    "\n"
+                    "Note: Ingress references service 'api-svc' but the actual service name is 'api-service'"
+                ),
+            },
+            "expected_fixes": [
+                {
+                    "file": "k8s/ingress.yaml",
+                    "type": "contains",
+                    "expected": "name: api-service",
+                    "hint": "Ingress backend references 'api-svc' but the Service is named 'api-service'",
+                }
+            ],
+        },
+
+        # Scenario 4: NetworkPolicy blocking all ingress traffic
+        {
+            "id": "network_policy_blocking",
+            "files": [
+                {
+                    "path": "k8s/deployment.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: apps/v1\n"
+                        "kind: Deployment\n"
+                        "metadata:\n"
+                        "  name: database\n"
+                        "spec:\n"
+                        "  replicas: 1\n"
+                        "  selector:\n"
+                        "    matchLabels:\n"
+                        "      app: database\n"
+                        "  template:\n"
+                        "    metadata:\n"
+                        "      labels:\n"
+                        "        app: database\n"
+                        "    spec:\n"
+                        "      containers:\n"
+                        "      - name: postgres\n"
+                        "        image: postgres:15\n"
+                        "        ports:\n"
+                        "        - containerPort: 5432\n"
+                        "        env:\n"
+                        "        - name: POSTGRES_PASSWORD\n"
+                        '          value: "secretpass"\n'
+                    ),
+                },
+                {
+                    "path": "k8s/service.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: v1\n"
+                        "kind: Service\n"
+                        "metadata:\n"
+                        "  name: database-svc\n"
+                        "spec:\n"
+                        "  selector:\n"
+                        "    app: database\n"
+                        "  ports:\n"
+                        "  - port: 5432\n"
+                        "    targetPort: 5432\n"
+                    ),
+                },
+                {
+                    "path": "k8s/networkpolicy.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: networking.k8s.io/v1\n"
+                        "kind: NetworkPolicy\n"
+                        "metadata:\n"
+                        "  name: db-policy\n"
+                        "spec:\n"
+                        "  podSelector:\n"
+                        "    matchLabels:\n"
+                        "      app: database\n"
+                        "  policyTypes:\n"
+                        "  - Ingress\n"
+                        "  ingress: []\n"
+                    ),
+                },
+            ],
+            "error": {
+                "phase": "k8s_networking",
+                "message": (
+                    "$ kubectl exec -it api-pod -- pg_isready -h database-svc -p 5432\n"
+                    "database-svc:5432 - no response\n"
+                    "\n"
+                    "$ kubectl get pods\n"
+                    "NAME                       READY   STATUS    RESTARTS   AGE\n"
+                    "database-6b8f9d7c4-kj3m2   1/1     Running   0          5m\n"
+                    "api-pod                    1/1     Running   0          5m\n"
+                    "\n"
+                    "$ kubectl get networkpolicy\n"
+                    "NAME        POD-SELECTOR   AGE\n"
+                    "db-policy   app=database   5m\n"
+                    "\n"
+                    "$ kubectl describe networkpolicy db-policy\n"
+                    "Spec:\n"
+                    "  PodSelector: app=database\n"
+                    "  Allowing ingress traffic: <none> (Selected pods are isolated for ingress connectivity)\n"
+                    "\n"
+                    "Note: NetworkPolicy with empty ingress list blocks ALL inbound traffic to the database"
+                ),
+            },
+            "expected_fixes": [
+                {
+                    "file": "k8s/networkpolicy.yaml",
+                    "type": "contains",
+                    "expected": "app: api",
+                    "hint": "NetworkPolicy has empty ingress rules (blocks all traffic). Add an ingress rule allowing traffic from pods with label 'app: api'.",
+                }
+            ],
+        },
+
+        # Scenario 5: Ingress missing ingressClassName
+        {
+            "id": "missing_ingress_class",
+            "files": [
+                {
+                    "path": "k8s/deployment.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: apps/v1\n"
+                        "kind: Deployment\n"
+                        "metadata:\n"
+                        "  name: webapp\n"
+                        "spec:\n"
+                        "  replicas: 2\n"
+                        "  selector:\n"
+                        "    matchLabels:\n"
+                        "      app: webapp\n"
+                        "  template:\n"
+                        "    metadata:\n"
+                        "      labels:\n"
+                        "        app: webapp\n"
+                        "    spec:\n"
+                        "      containers:\n"
+                        "      - name: webapp\n"
+                        "        image: webapp:v2.0\n"
+                        "        ports:\n"
+                        "        - containerPort: 8080\n"
+                    ),
+                },
+                {
+                    "path": "k8s/service.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: v1\n"
+                        "kind: Service\n"
+                        "metadata:\n"
+                        "  name: webapp-svc\n"
+                        "spec:\n"
+                        "  selector:\n"
+                        "    app: webapp\n"
+                        "  ports:\n"
+                        "  - port: 80\n"
+                        "    targetPort: 8080\n"
+                    ),
+                },
+                {
+                    "path": "k8s/ingress.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: networking.k8s.io/v1\n"
+                        "kind: Ingress\n"
+                        "metadata:\n"
+                        "  name: webapp-ingress\n"
+                        "spec:\n"
+                        "  rules:\n"
+                        "  - host: webapp.example.com\n"
+                        "    http:\n"
+                        "      paths:\n"
+                        "      - path: /\n"
+                        "        pathType: Prefix\n"
+                        "        backend:\n"
+                        "          service:\n"
+                        "            name: webapp-svc\n"
+                        "            port:\n"
+                        "              number: 80\n"
+                    ),
+                },
+            ],
+            "error": {
+                "phase": "k8s_networking",
+                "message": (
+                    "$ kubectl describe ingress webapp-ingress\n"
+                    "Name:     webapp-ingress\n"
+                    "Address: \n"
+                    "Rules:\n"
+                    "  Host                Path  Backends\n"
+                    "  ----                ----  --------\n"
+                    "  webapp.example.com  /     webapp-svc:80 (10.244.0.5:8080)\n"
+                    "\n"
+                    "$ curl -H 'Host: webapp.example.com' http://<loadbalancer-ip>/\n"
+                    "curl: (7) Failed to connect: Connection refused\n"
+                    "\n"
+                    "$ kubectl get ingressclass\n"
+                    "NAME    CONTROLLER             PARAMETERS   AGE\n"
+                    "nginx   k8s.io/ingress-nginx   <none>       10d\n"
+                    "\n"
+                    "Note: Ingress has no ingressClassName specified. The cluster requires "
+                    "explicit ingressClassName: nginx"
+                ),
+            },
+            "expected_fixes": [
+                {
+                    "file": "k8s/ingress.yaml",
+                    "type": "contains",
+                    "expected": "ingressClassName: nginx",
+                    "hint": "Ingress needs 'ingressClassName: nginx' in the spec to be picked up by the nginx ingress controller",
+                }
+            ],
+        },
+    ]
server/tasks/k8s_pod.py ADDED
@@ -0,0 +1,352 @@
+"""Task: Kubernetes Pod Failures — MEDIUM.
+
+Agent fixes common pod failure scenarios:
+OOMKilled, ImagePullBackOff, wrong command, missing ConfigMap, liveness probe.
+"""
+
+from server.models import TaskDifficulty
+from server.tasks.base import BaseTask
+
+
+class K8sPodTask(BaseTask):
+    NAME = "Kubernetes Pod Failures"
+    DESCRIPTION = "Fix Kubernetes pod failures including CrashLoopBackOff, ImagePullBackOff, and resource issues"
+    DIFFICULTY = TaskDifficulty.MEDIUM
+    AVAILABLE_SECRETS = []
+
+    SCENARIOS = [
+        # Scenario 1: CrashLoopBackOff — OOMKilled (memory limit too low)
+        {
+            "id": "oom_killed",
+            "files": [
+                {
+                    "path": "k8s/deployment.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: apps/v1\n"
+                        "kind: Deployment\n"
+                        "metadata:\n"
+                        "  name: api-server\n"
+                        "spec:\n"
+                        "  replicas: 3\n"
+                        "  selector:\n"
+                        "    matchLabels:\n"
+                        "      app: api\n"
+                        "  template:\n"
+                        "    metadata:\n"
+                        "      labels:\n"
+                        "        app: api\n"
+                        "    spec:\n"
+                        "      containers:\n"
+                        "      - name: api\n"
+                        "        image: myapp:v1.2.3\n"
+                        "        resources:\n"
+                        "          limits:\n"
+                        '            memory: "64Mi"\n'
+                        '            cpu: "100m"\n'
+                        "        ports:\n"
+                        "        - containerPort: 8080\n"
+                    ),
+                }
+            ],
+            "error": {
+                "phase": "k8s_runtime",
+                "message": (
+                    "$ kubectl get pods\n"
+                    "NAME                         READY   STATUS             RESTARTS   AGE\n"
+                    "api-server-7d4b8c9f5-x2k9m   0/1     CrashLoopBackOff   5          3m\n"
+                    "\n"
+                    "$ kubectl describe pod api-server-7d4b8c9f5-x2k9m\n"
+                    "...\n"
+                    "State:       Waiting\n"
+                    "  Reason:    CrashLoopBackOff\n"
+                    "Last State:  Terminated\n"
+                    "  Reason:    OOMKilled\n"
+                    "  Exit Code: 137\n"
+                    "...\n"
+                    "Events:\n"
+                    "  Warning  OOMKilling  3m  kubelet  Memory limit 64Mi exceeded"
+                ),
+            },
+            "expected_fixes": [
+                {
+                    "file": "k8s/deployment.yaml",
+                    "type": "contains",
+                    "expected": 'memory: "256Mi"',
+                    "hint": "Container is OOMKilled with 64Mi limit. The app needs at least 256Mi.",
+                }
+            ],
+        },
+
+        # Scenario 2: ImagePullBackOff — image tag typo
+        {
+            "id": "image_pull_backoff",
+            "files": [
+                {
+                    "path": "k8s/deployment.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: apps/v1\n"
+                        "kind: Deployment\n"
+                        "metadata:\n"
+                        "  name: web-app\n"
+                        "spec:\n"
+                        "  replicas: 2\n"
+                        "  selector:\n"
+                        "    matchLabels:\n"
+                        "      app: web\n"
+                        "  template:\n"
+                        "    metadata:\n"
+                        "      labels:\n"
+                        "        app: web\n"
+                        "    spec:\n"
+                        "      containers:\n"
+                        "      - name: web\n"
+                        "        image: nginx:latset\n"
+                        "        ports:\n"
+                        "        - containerPort: 80\n"
+                    ),
+                }
+            ],
+            "error": {
+                "phase": "k8s_runtime",
+                "message": (
+                    "$ kubectl get pods\n"
+                    "NAME                      READY   STATUS             RESTARTS   AGE\n"
+                    "web-app-5f8d7b6c4-abc12   0/1     ImagePullBackOff   0          2m\n"
+                    "\n"
+                    "$ kubectl describe pod web-app-5f8d7b6c4-abc12\n"
+                    "...\n"
+                    "Events:\n"
+                    '  Warning  Failed  2m  kubelet  Failed to pull image "nginx:latset": '
+                    "rpc error: code = NotFound desc = failed to pull and unpack image: "
+                    "reference not found\n"
+                    "  Warning  Failed  2m  kubelet  Error: ImagePullBackOff\n"
+                    "..."
+                ),
+            },
+            "expected_fixes": [
+                {
+                    "file": "k8s/deployment.yaml",
+                    "type": "contains",
+                    "expected": "image: nginx:latest",
+                    "hint": "Image tag has a typo: 'latset' should be 'latest'",
+                }
+            ],
+        },
+
+        # Scenario 3: CrashLoopBackOff — wrong command
+        {
+            "id": "wrong_command",
+            "files": [
+                {
+                    "path": "k8s/deployment.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: apps/v1\n"
+                        "kind: Deployment\n"
+                        "metadata:\n"
+                        "  name: worker\n"
+                        "spec:\n"
+                        "  replicas: 1\n"
+                        "  selector:\n"
+                        "    matchLabels:\n"
+                        "      app: worker\n"
+                        "  template:\n"
+                        "    metadata:\n"
+                        "      labels:\n"
+                        "        app: worker\n"
+                        "    spec:\n"
+                        "      containers:\n"
+                        "      - name: worker\n"
+                        "        image: python:3.11-slim\n"
+                        '        command: ["python", "workers.py"]\n'
+                        "        resources:\n"
+                        "          limits:\n"
+                        '            memory: "512Mi"\n'
+                        '            cpu: "500m"\n'
+                    ),
+                },
+                {
+                    "path": "app/worker.py",
+                    "type": "other",
+                    "content": (
+                        "import time\n"
+                        "\n"
+                        "def main():\n"
+                        "    while True:\n"
+                        "        print('Processing...')\n"
+                        "        time.sleep(5)\n"
+                        "\n"
+                        "if __name__ == '__main__':\n"
+                        "    main()\n"
+                    ),
+                },
+            ],
+            "error": {
+                "phase": "k8s_runtime",
+                "message": (
+                    "$ kubectl get pods\n"
+                    "NAME                     READY   STATUS             RESTARTS   AGE\n"
+                    "worker-6b8f9d7c4-kj3m2   0/1     CrashLoopBackOff   4          2m\n"
+                    "\n"
+                    "$ kubectl logs worker-6b8f9d7c4-kj3m2\n"
+                    "python: can't open file '/workers.py': [Errno 2] No such file or directory\n"
+                    "\n"
+                    "$ kubectl describe pod worker-6b8f9d7c4-kj3m2\n"
+                    "...\n"
+                    "State:       Waiting\n"
+                    "  Reason:    CrashLoopBackOff\n"
+                    "Last State:  Terminated\n"
+                    "  Reason:    Error\n"
+                    "  Exit Code: 2\n"
+                    "..."
+                ),
+            },
+            "expected_fixes": [
+                {
+                    "file": "k8s/deployment.yaml",
+                    "type": "contains",
+                    "expected": 'command: ["python", "worker.py"]',
+                    "hint": "The command references 'workers.py' but the file is named 'worker.py' (no 's')",
+                }
+            ],
+        },
+
+        # Scenario 4: CreateContainerConfigError — missing ConfigMap
+        {
+            "id": "missing_configmap",
+            "files": [
+                {
+                    "path": "k8s/deployment.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: apps/v1\n"
+                        "kind: Deployment\n"
+                        "metadata:\n"
+                        "  name: backend\n"
+                        "spec:\n"
+                        "  replicas: 2\n"
+                        "  selector:\n"
+                        "    matchLabels:\n"
+                        "      app: backend\n"
+                        "  template:\n"
+                        "    metadata:\n"
+                        "      labels:\n"
+                        "        app: backend\n"
+                        "    spec:\n"
+                        "      containers:\n"
+                        "      - name: backend\n"
+                        "        image: mybackend:v2.0\n"
+                        "        ports:\n"
+                        "        - containerPort: 8080\n"
+                        "        envFrom:\n"
+                        "        - configMapRef:\n"
+                        "            name: app-config\n"
+                        "        resources:\n"
+                        "          limits:\n"
+                        '            memory: "512Mi"\n'
+                        '            cpu: "500m"\n'
+                    ),
+                },
+            ],
+            "error": {
+                "phase": "k8s_runtime",
+                "message": (
+                    "$ kubectl get pods\n"
+                    "NAME                      READY   STATUS                       RESTARTS   AGE\n"
+                    "backend-5c9d8f7b6-lm4n5   0/1     CreateContainerConfigError   0          1m\n"
+                    "\n"
+                    "$ kubectl describe pod backend-5c9d8f7b6-lm4n5\n"
+                    "...\n"
+                    "Events:\n"
+                    '  Warning  Failed  1m  kubelet  Error: configmap "app-config" not found\n'
+                    "..."
+                ),
+            },
+            "expected_fixes": [
+                {
+                    "file": "k8s/configmap.yaml",
+                    "type": "contains",
+                    "expected": "name: app-config",
+                    "hint": "The ConfigMap 'app-config' is referenced but doesn't exist. Create a ConfigMap manifest.",
+                }
+            ],
+        },
+
+        # Scenario 5: Pod not ready — liveness probe failing
+        {
+            "id": "liveness_probe_failing",
+            "files": [
+                {
+                    "path": "k8s/deployment.yaml",
+                    "type": "kubernetes",
+                    "content": (
+                        "apiVersion: apps/v1\n"
+                        "kind: Deployment\n"
+                        "metadata:\n"
+                        "  name: api\n"
+                        "spec:\n"
+                        "  replicas: 2\n"
+                        "  selector:\n"
+                        "    matchLabels:\n"
+                        "      app: api\n"
+                        "  template:\n"
+                        "    metadata:\n"
+                        "      labels:\n"
+                        "        app: api\n"
+                        "    spec:\n"
+                        "      containers:\n"
+                        "      - name: api\n"
+                        "        image: myapi:v3.1\n"
+                        "        ports:\n"
+                        "        - containerPort: 8080\n"
+                        "        livenessProbe:\n"
+                        "          httpGet:\n"
+                        "            path: /healthz\n"
+                        "            port: 3000\n"
+                        "          initialDelaySeconds: 5\n"
+                        "          periodSeconds: 10\n"
+                        "        readinessProbe:\n"
+                        "          httpGet:\n"
+                        "            path: /ready\n"
+                        "            port: 8080\n"
+                        "          initialDelaySeconds: 5\n"
+                        "          periodSeconds: 10\n"
+                        "        resources:\n"
+                        "          limits:\n"
+                        '            memory: "512Mi"\n'
+                        '            cpu: "500m"\n'
+                    ),
+                },
+            ],
+            "error": {
+                "phase": "k8s_runtime",
+                "message": (
+                    "$ kubectl get pods\n"
+                    "NAME                  READY   STATUS    RESTARTS      AGE\n"
+                    "api-7f8d9c6b5-gh7j8   0/1     Running   3 (30s ago)   2m\n"
+                    "\n"
+                    "$ kubectl describe pod api-7f8d9c6b5-gh7j8\n"
+                    "...\n"
+                    "Events:\n"
+                    "  Warning  Unhealthy  90s  kubelet  Liveness probe failed: "
+                    "Get \"http://10.244.0.5:3000/healthz\": dial tcp 10.244.0.5:3000: "
+                    "connect: connection refused\n"
+                    "  Normal   Killing    90s  kubelet  Container api failed liveness probe, "
+                    "will be restarted\n"
+                    "...\n"
+                    "\n"
+                    "Note: The application listens on port 8080, not 3000."
+                ),
+            },
+            "expected_fixes": [
+                {
+                    "file": "k8s/deployment.yaml",
+                    "type": "contains",
+                    "expected": "port: 8080\n          initialDelaySeconds: 5\n          periodSeconds: 10\n        readinessProbe:",
+                    "hint": "The liveness probe port (3000) doesn't match the container port (8080). Change liveness probe port to 8080.",
+                }
+            ],
+        },
+    ]
server/tasks/pipeline_build_deploy.py ADDED
@@ -0,0 +1,361 @@
+"""Task: CI/CD Build & Push Pipeline — HARD.
+
+Agent debugs combined GHA + Docker + Registry pipeline failures:
+GHCR login missing token, wrong image tag in workflow, missing permissions,
+Dockerfile + workflow arg mismatch, multi-stage build output mismatch.
+"""
+
+from server.models import TaskDifficulty
+from server.tasks.base import BaseTask
+
+
+class PipelineBuildDeployTask(BaseTask):
+    NAME = "CI/CD Build & Push Pipeline"
+    DESCRIPTION = "Debug GHA-to-Docker-to-Registry pipeline failures across multiple files"
+    DIFFICULTY = TaskDifficulty.HARD
+    AVAILABLE_SECRETS = ["GITHUB_TOKEN", "DOCKER_USERNAME", "DOCKER_PASSWORD"]
+
+    SCENARIOS = [
+        # Scenario 1: GHCR login — GITHUB_TOKEN not mapped to env
+        {
+            "id": "ghcr_token_not_mapped",
+            "files": [
+                {
+                    "path": ".github/workflows/deploy.yml",
+                    "type": "workflow",
+                    "content": (
+                        "name: Build and Push to GHCR\n"
+                        "on:\n"
+                        "  push:\n"
+                        "    branches: [main]\n"
+                        "\n"
+                        "jobs:\n"
+                        "  build:\n"
+                        "    runs-on: ubuntu-latest\n"
+                        "    steps:\n"
+                        "      - uses: actions/checkout@v4\n"
+                        "\n"
+                        "      - name: Login to GHCR\n"
+                        "        run: echo $GITHUB_TOKEN | docker login ghcr.io -u ${{ github.actor }} --password-stdin\n"
+                        "\n"
+                        "      - name: Build image\n"
+                        "        run: docker build -t ghcr.io/${{ github.repository }}:${{ github.sha }} .\n"
+                        "\n"
+                        "      - name: Push image\n"
+                        "        run: docker push ghcr.io/${{ github.repository }}:${{ github.sha }}\n"
+                    ),
+                },
+                {
+                    "path": "Dockerfile",
+                    "type": "dockerfile",
+                    "content": (
+                        "FROM node:20-alpine\n"
+                        "WORKDIR /app\n"
+                        "COPY package*.json ./\n"
+                        "RUN npm ci\n"
+                        "COPY . .\n"
+                        "EXPOSE 3000\n"
+                        'CMD ["npm", "start"]\n'
+                    ),
+                },
+                {
+                    "path": "package.json",
+                    "type": "other",
+                    "content": '{"name": "myapp", "scripts": {"start": "node server.js"}}',
+                },
+            ],
+            "error": {
+                "phase": "pipeline_build",
+                "message": (
+                    "Run: Build and Push to GHCR\n"
+                    "\n"
+                    "Step: Login to GHCR\n"
+                    "Error: Cannot perform an interactive login from a non TTY device\n"
+                    "Error: GITHUB_TOKEN environment variable is not set\n"
+                    "\n"
+                    "The GITHUB_TOKEN secret is available but not mapped to an environment variable."
+                ),
+                "exit_code": 1,
+                "failed_step": "Login to GHCR",
+            },
+            "expected_fixes": [
+                {
+                    "file": ".github/workflows/deploy.yml",
+                    "type": "contains",
+                    "expected": "GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}",
+                    "hint": "The GITHUB_TOKEN shell variable is used in the run command but not mapped from secrets via env block",
+                }
+            ],
+        },
+
+        # Scenario 2: Image tag mismatch between build and push steps
+        {
+            "id": "image_tag_mismatch",
+            "files": [
+                {
+                    "path": ".github/workflows/build.yml",
+                    "type": "workflow",
+                    "content": (
+                        "name: Build and Push\n"
+                        "on:\n"
+                        "  push:\n"
+                        "    tags: ['v*']\n"
+                        "\n"
+                        "jobs:\n"
+                        "  build:\n"
+                        "    runs-on: ubuntu-latest\n"
+                        "    steps:\n"
+                        "      - uses: actions/checkout@v4\n"
+                        "\n"
+                        "      - name: Login to DockerHub\n"
+                        "        run: echo ${{ secrets.DOCKER_PASSWORD }} | docker login -u ${{ secrets.DOCKER_USERNAME }} --password-stdin\n"
+                        "\n"
+                        "      - name: Build image\n"
+                        "        run: docker build -t myuser/myapp:${{ github.ref_name }} .\n"
+                        "\n"
+                        "      - name: Push image\n"
+                        "        run: docker push myuser/myapp:${{ github.sha }}\n"
+                    ),
+                },
+                {
+                    "path": "Dockerfile",
+                    "type": "dockerfile",
+                    "content": (
+                        "FROM python:3.11-slim\n"
+                        "WORKDIR /app\n"
+                        "COPY requirements.txt .\n"
+                        "RUN pip install -r requirements.txt\n"
+                        "COPY . .\n"
+                        "EXPOSE 8000\n"
+                        'CMD ["python", "app.py"]\n'
+                    ),
+                },
+                {
+                    "path": "requirements.txt",
+                    "type": "requirements",
+                    "content": "flask==3.0.0\ngunicorn==21.2.0\n",
+                },
+            ],
+            "error": {
+                "phase": "pipeline_build",
+                "message": (
+                    "Run: Build and Push\n"
+                    "\n"
+                    "Step: Build image ✓\n"
+                    "Step: Push image ✗\n"
+                    "Error: An image does not exist locally with the tag: myuser/myapp:<sha>\n"
+                    "\n"
+                    "The build used github.ref_name as the tag but push used github.sha. "
+                    "These are different values."
+                ),
+                "exit_code": 1,
+                "failed_step": "Push image",
+            },
+            "expected_fixes": [
+                {
+                    "file": ".github/workflows/build.yml",
+                    "type": "contains",
+                    "expected": "docker push myuser/myapp:${{ github.ref_name }}",
+                    "hint": "Build tags image with github.ref_name but push uses github.sha — use the same tag",
+                }
+            ],
+        },
+
+        # Scenario 3: Missing packages:write permission for GHCR push
+        {
+            "id": "missing_packages_write",
+            "files": [
+                {
+                    "path": ".github/workflows/publish.yml",
+                    "type": "workflow",
+                    "content": (
+                        "name: Publish to GHCR\n"
+                        "on:\n"
+                        "  release:\n"
+                        "    types: [published]\n"
+                        "\n"
+                        "jobs:\n"
+                        "  publish:\n"
+                        "    runs-on: ubuntu-latest\n"
+                        "    steps:\n"
+                        "      - uses: actions/checkout@v4\n"
+                        "\n"
+                        "      - name: Login to GHCR\n"
+                        "        run: echo ${{ secrets.GITHUB_TOKEN }} | docker login ghcr.io -u ${{ github.actor }} --password-stdin\n"
+                        "\n"
+                        "      - name: Build\n"
+                        "        run: docker build -t ghcr.io/${{ github.repository }}:${{ github.event.release.tag_name }} .\n"
+                        "\n"
+                        "      - name: Push\n"
+                        "        run: docker push ghcr.io/${{ github.repository }}:${{ github.event.release.tag_name }}\n"
+                    ),
+                },
+                {
+                    "path": "Dockerfile",
+                    "type": "dockerfile",
+                    "content": (
+                        "FROM python:3.11-slim\n"
+                        "WORKDIR /app\n"
+                        "COPY . .\n"
+                        'CMD ["python", "app.py"]\n'
+                    ),
+                },
+            ],
+            "error": {
+                "phase": "pipeline_build",
+                "message": (
+                    "Run: Publish to GHCR\n"
+                    "\n"
+                    "Step: Login to GHCR ✓\n"
+                    "Step: Build ✓\n"
+                    "Step: Push ✗\n"
+                    "Error: denied: permission_denied: write_package\n"
+                    "Error: GITHUB_TOKEN does not have packages:write permission\n"
+                    "\n"
+                    "The default GITHUB_TOKEN only has read access to packages. "
+                    "Add a permissions block to the job."
+                ),
+                "exit_code": 1,
+                "failed_step": "Push",
+            },
+            "expected_fixes": [
+                {
+                    "file": ".github/workflows/publish.yml",
224
+ "type": "contains",
225
+ "expected": "packages: write",
226
+ "hint": "GHCR push requires 'permissions: packages: write' in the job or workflow",
227
+ }
228
+ ],
229
+ },
230
+
231
+ # Scenario 4: Dockerfile ARG not passed from workflow build-arg
232
+ {
233
+ "id": "build_arg_not_passed",
234
+ "files": [
235
+ {
236
+ "path": ".github/workflows/build.yml",
237
+ "type": "workflow",
238
+ "content": (
239
+ "name: Build with Version\n"
240
+ "on:\n"
241
+ " push:\n"
242
+ " branches: [main]\n"
243
+ "\n"
244
+ "jobs:\n"
245
+ " build:\n"
246
+ " runs-on: ubuntu-latest\n"
247
+ " steps:\n"
248
+ " - uses: actions/checkout@v4\n"
249
+ "\n"
250
+ " - name: Build image\n"
251
+ " run: docker build -t myapp:${{ github.sha }} .\n"
252
+ ),
253
+ },
254
+ {
255
+ "path": "Dockerfile",
256
+ "type": "dockerfile",
257
+ "content": (
258
+ "FROM python:3.11-slim\n"
259
+ "ARG APP_VERSION\n"
260
+ "WORKDIR /app\n"
261
+ "COPY . .\n"
262
+ "RUN echo $APP_VERSION > /app/version.txt\n"
263
+ "EXPOSE 8000\n"
264
+ 'CMD ["python", "app.py"]\n'
265
+ ),
266
+ },
267
+ ],
268
+ "error": {
269
+ "phase": "pipeline_build",
270
+ "message": (
271
+ "Run: Build with Version\n"
272
+ "\n"
273
+ "Step: Build image ✓ (with warnings)\n"
274
+ "Warning: /app/version.txt is empty — APP_VERSION build arg was not provided\n"
275
+ "\n"
276
+ "The Dockerfile declares ARG APP_VERSION but the docker build command "
277
+ "does not pass --build-arg APP_VERSION=..."
278
+ ),
279
+ "exit_code": 0,
280
+ "failed_step": "Build image",
281
+ },
282
+ "expected_fixes": [
283
+ {
284
+ "file": ".github/workflows/build.yml",
285
+ "type": "contains",
286
+ "expected": "--build-arg APP_VERSION=",
287
+ "hint": "Dockerfile uses ARG APP_VERSION but the build command doesn't pass --build-arg",
288
+ }
289
+ ],
290
+ },
291
+
292
+ # Scenario 5: Multi-stage build — wrong output directory name
293
+ {
294
+ "id": "multistage_output_mismatch",
295
+ "files": [
296
+ {
297
+ "path": ".github/workflows/build.yml",
298
+ "type": "workflow",
299
+ "content": (
300
+ "name: Build Frontend\n"
301
+ "on:\n"
302
+ " push:\n"
303
+ " branches: [main]\n"
304
+ "\n"
305
+ "jobs:\n"
306
+ " build:\n"
307
+ " runs-on: ubuntu-latest\n"
308
+ " steps:\n"
309
+ " - uses: actions/checkout@v4\n"
310
+ "\n"
311
+ " - name: Build image\n"
312
+ " run: docker build -t frontend:latest .\n"
313
+ ),
314
+ },
315
+ {
316
+ "path": "Dockerfile",
317
+ "type": "dockerfile",
318
+ "content": (
319
+ "FROM node:20-alpine AS builder\n"
320
+ "WORKDIR /app\n"
321
+ "COPY package*.json ./\n"
322
+ "RUN npm ci\n"
323
+ "COPY . .\n"
324
+ "RUN npm run build\n"
325
+ "\n"
326
+ "FROM nginx:alpine\n"
327
+ "COPY --from=builder /app/dist /usr/share/nginx/html\n"
328
+ "EXPOSE 80\n"
329
+ 'CMD ["nginx", "-g", "daemon off;"]\n'
330
+ ),
331
+ },
332
+ {
333
+ "path": "package.json",
334
+ "type": "other",
335
+ "content": '{"name": "frontend", "scripts": {"build": "react-scripts build", "start": "react-scripts start"}}',
336
+ },
337
+ ],
338
+ "error": {
339
+ "phase": "pipeline_build",
340
+ "message": (
341
+ "Run: Build Frontend\n"
342
+ "\n"
343
+ "Step: Build image ✗\n"
344
+ "Error: COPY failed: stat app/dist: file does not exist\n"
345
+ "\n"
346
+ "react-scripts build outputs to /app/build, not /app/dist. "
347
+ "The COPY --from=builder path is wrong."
348
+ ),
349
+ "exit_code": 1,
350
+ "failed_step": "Build image",
351
+ },
352
+ "expected_fixes": [
353
+ {
354
+ "file": "Dockerfile",
355
+ "type": "contains",
356
+ "expected": "COPY --from=builder /app/build",
357
+ "hint": "react-scripts outputs to 'build/' not 'dist/'. Change COPY --from=builder /app/dist to /app/build",
358
+ }
359
+ ],
360
+ },
361
+ ]
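The `expected_fixes` entries above use two check types, `contains` and `not_contains`, each keyed to a single file. A minimal sketch of how such an entry could be evaluated against an agent's submitted files — the `fix_satisfied` helper is hypothetical, not part of the environment's actual API:

```python
# Hypothetical helper (not environment API): evaluates one expected_fixes
# entry against a mapping of file path -> submitted file content.
def fix_satisfied(fix: dict, files: dict) -> bool:
    content = files.get(fix["file"], "")
    found = fix["expected"] in content
    # "contains" passes when the substring is present;
    # "not_contains" passes when it is absent.
    return found if fix["type"] == "contains" else not found


# Example: the image_tag_mismatch scenario is satisfied once the push step
# uses github.ref_name, matching the tag applied by the build step.
fixed_workflow = "run: docker push myuser/myapp:${{ github.ref_name }}\n"
fix = {
    "file": ".github/workflows/build.yml",
    "type": "contains",
    "expected": "docker push myuser/myapp:${{ github.ref_name }}",
}
assert fix_satisfied(fix, {".github/workflows/build.yml": fixed_workflow})
```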
server/tasks/pipeline_full.py ADDED
@@ -0,0 +1,654 @@
1
+ """Task: Full Stack Deployment Pipeline — EXPERT.
2
+
3
+ Agent debugs multi-error scenarios spanning the entire stack:
4
+ GHA workflow + Dockerfile + Kubernetes manifests.
5
+ Multiple bugs per scenario requiring cross-file reasoning.
6
+ """
7
+
8
+ from server.models import TaskDifficulty
9
+ from server.tasks.base import BaseTask
10
+
11
+
12
+ class PipelineFullTask(BaseTask):
13
+ NAME = "Full Stack Deployment Pipeline"
14
+ DESCRIPTION = "Debug complex multi-error deployment pipelines across GHA workflows, Dockerfiles, and Kubernetes manifests"
15
+ DIFFICULTY = TaskDifficulty.HARD
16
+ AVAILABLE_SECRETS = ["GITHUB_TOKEN", "DOCKER_USERNAME", "DOCKER_PASSWORD"]
17
+
18
+ SCENARIOS = [
19
+ # Scenario 1: GHCR token missing env + K8s service selector mismatch
20
+ {
21
+ "id": "full_pipeline_ghcr_and_selector",
22
+ "files": [
23
+ {
24
+ "path": ".github/workflows/deploy.yml",
25
+ "type": "workflow",
26
+ "content": (
27
+ "name: Build and Deploy\n"
28
+ "on:\n"
29
+ " push:\n"
30
+ " branches: [main]\n"
31
+ "\n"
32
+ "jobs:\n"
33
+ " deploy:\n"
34
+ " runs-on: ubuntu-latest\n"
35
+ " steps:\n"
36
+ " - uses: actions/checkout@v4\n"
37
+ "\n"
38
+ " - name: Build Docker image\n"
39
+ " run: docker build -t ghcr.io/${{ github.repository }}:${{ github.sha }} .\n"
40
+ "\n"
41
+ " - name: Login to GHCR\n"
42
+ " run: echo $GITHUB_TOKEN | docker login ghcr.io -u ${{ github.actor }} --password-stdin\n"
43
+ "\n"
44
+ " - name: Push image\n"
45
+ " run: docker push ghcr.io/${{ github.repository }}:${{ github.sha }}\n"
46
+ ),
47
+ },
48
+ {
49
+ "path": "Dockerfile",
50
+ "type": "dockerfile",
51
+ "content": (
52
+ "FROM node:20-alpine\n"
53
+ "WORKDIR /app\n"
54
+ "COPY package*.json ./\n"
55
+ "RUN npm ci\n"
56
+ "COPY . .\n"
57
+ "EXPOSE 3000\n"
58
+ 'CMD ["npm", "start"]\n'
59
+ ),
60
+ },
61
+ {
62
+ "path": "package.json",
63
+ "type": "other",
64
+ "content": '{"name": "myapp", "scripts": {"start": "node server.js"}}',
65
+ },
66
+ {
67
+ "path": "k8s/deployment.yaml",
68
+ "type": "kubernetes",
69
+ "content": (
70
+ "apiVersion: apps/v1\n"
71
+ "kind: Deployment\n"
72
+ "metadata:\n"
73
+ " name: myapp\n"
74
+ "spec:\n"
75
+ " replicas: 3\n"
76
+ " selector:\n"
77
+ " matchLabels:\n"
78
+ " app: myapp\n"
79
+ " template:\n"
80
+ " metadata:\n"
81
+ " labels:\n"
82
+ " app: myapp\n"
83
+ " spec:\n"
84
+ " containers:\n"
85
+ " - name: app\n"
86
+ " image: ghcr.io/OWNER/REPO:TAG\n"
87
+ " ports:\n"
88
+ " - containerPort: 3000\n"
89
+ ),
90
+ },
91
+ {
92
+ "path": "k8s/service.yaml",
93
+ "type": "kubernetes",
94
+ "content": (
95
+ "apiVersion: v1\n"
96
+ "kind: Service\n"
97
+ "metadata:\n"
98
+ " name: myapp-service\n"
99
+ "spec:\n"
100
+ " selector:\n"
101
+ " app: my-app\n"
102
+ " ports:\n"
103
+ " - port: 80\n"
104
+ " targetPort: 3000\n"
105
+ ),
106
+ },
107
+ ],
108
+ "error": {
109
+ "phase": "pipeline_deploy",
110
+ "message": (
111
+ "Run: Build and Deploy\n"
112
+ "\n"
113
+ "Step: Login to GHCR ✗\n"
114
+ "Error: Cannot perform an interactive login from a non TTY device\n"
115
+ "Error: GITHUB_TOKEN environment variable is not set\n"
116
+ "\n"
117
+ "---\n"
118
+ "(If login had succeeded, deployment would also fail with:)\n"
119
+ "Error: Service 'myapp-service' has no endpoints — selector 'app=my-app' "
120
+ "doesn't match any pods (pods have label 'app=myapp')"
121
+ ),
122
+ },
123
+ "expected_fixes": [
124
+ {
125
+ "file": ".github/workflows/deploy.yml",
126
+ "type": "contains",
127
+ "expected": "GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}",
128
+ "hint": "GITHUB_TOKEN is used as shell variable but not mapped from secrets via env block",
129
+ },
130
+ {
131
+ "file": "k8s/service.yaml",
132
+ "type": "contains",
133
+ "expected": "app: myapp",
134
+ "hint": "Service selector 'app: my-app' doesn't match Deployment label 'app: myapp'",
135
+ },
136
+ ],
137
+ },
138
+
139
+ # Scenario 2: Dockerfile missing WORKDIR + workflow missing checkout + K8s wrong port
140
+ {
141
+ "id": "full_pipeline_three_bugs",
142
+ "files": [
143
+ {
144
+ "path": ".github/workflows/ci.yml",
145
+ "type": "workflow",
146
+ "content": (
147
+ "name: CI Pipeline\n"
148
+ "on:\n"
149
+ " push:\n"
150
+ " branches: [main]\n"
151
+ "\n"
152
+ "jobs:\n"
153
+ " build:\n"
154
+ " runs-on: ubuntu-latest\n"
155
+ " steps:\n"
156
+ " - name: Build image\n"
157
+ " run: docker build -t myapp:${{ github.sha }} .\n"
158
+ "\n"
159
+ " - name: Run tests\n"
160
+ " run: docker run myapp:${{ github.sha }} npm test\n"
161
+ ),
162
+ },
163
+ {
164
+ "path": "Dockerfile",
165
+ "type": "dockerfile",
166
+ "content": (
167
+ "FROM node:18-alpine\n"
168
+ "COPY package*.json ./\n"
169
+ "RUN npm ci\n"
170
+ "COPY . .\n"
171
+ "EXPOSE 3000\n"
172
+ 'CMD ["npm", "start"]\n'
173
+ ),
174
+ },
175
+ {
176
+ "path": "package.json",
177
+ "type": "other",
178
+ "content": '{"name": "myapp", "scripts": {"start": "node server.js", "test": "jest"}}',
179
+ },
180
+ {
181
+ "path": "k8s/deployment.yaml",
182
+ "type": "kubernetes",
183
+ "content": (
184
+ "apiVersion: apps/v1\n"
185
+ "kind: Deployment\n"
186
+ "metadata:\n"
187
+ " name: myapp\n"
188
+ "spec:\n"
189
+ " replicas: 2\n"
190
+ " selector:\n"
191
+ " matchLabels:\n"
192
+ " app: myapp\n"
193
+ " template:\n"
194
+ " metadata:\n"
195
+ " labels:\n"
196
+ " app: myapp\n"
197
+ " spec:\n"
198
+ " containers:\n"
199
+ " - name: app\n"
200
+ " image: myapp:latest\n"
201
+ " ports:\n"
202
+ " - containerPort: 8080\n"
203
+ ),
204
+ },
205
+ {
206
+ "path": "k8s/service.yaml",
207
+ "type": "kubernetes",
208
+ "content": (
209
+ "apiVersion: v1\n"
210
+ "kind: Service\n"
211
+ "metadata:\n"
212
+ " name: myapp-svc\n"
213
+ "spec:\n"
214
+ " selector:\n"
215
+ " app: myapp\n"
216
+ " ports:\n"
217
+ " - port: 80\n"
218
+ " targetPort: 8080\n"
219
+ ),
220
+ },
221
+ ],
222
+ "error": {
223
+ "phase": "pipeline_deploy",
224
+ "message": (
225
+ "Run: CI Pipeline\n"
226
+ "\n"
227
+ "Step: Build image ✗\n"
228
+ "Error: Checkout must happen before Docker build steps\n"
229
+ "(No actions/checkout@v4 step found before docker build)\n"
230
+ "\n"
231
+ "---\n"
232
+ "Additionally:\n"
233
+ "- Dockerfile has no WORKDIR set — npm will fail to find package.json\n"
234
+ "- K8s deployment containerPort is 8080 but app listens on 3000 "
235
+ "(service targetPort also wrong)"
236
+ ),
237
+ },
238
+ "expected_fixes": [
239
+ {
240
+ "file": ".github/workflows/ci.yml",
241
+ "type": "contains",
242
+ "expected": "actions/checkout@v4",
243
+ "hint": "Workflow needs a checkout step before docker build",
244
+ },
245
+ {
246
+ "file": "Dockerfile",
247
+ "type": "contains",
248
+ "expected": "WORKDIR /app",
249
+ "hint": "Dockerfile needs WORKDIR /app before COPY commands",
250
+ },
251
+ {
252
+ "file": "k8s/deployment.yaml",
253
+ "type": "contains",
254
+ "expected": "containerPort: 3000",
255
+ "hint": "Container port should be 3000 to match the app's EXPOSE/listen port",
256
+ },
257
+ {
258
+ "file": "k8s/service.yaml",
259
+ "type": "contains",
260
+ "expected": "targetPort: 3000",
261
+ "hint": "Service targetPort should be 3000 to match container port",
262
+ },
263
+ ],
264
+ },
265
+
266
+ # Scenario 3: Wrong GHCR password secret + Dockerfile base image typo + K8s OOM
267
+ {
268
+ "id": "full_pipeline_ghcr_dockerfile_k8s",
269
+ "files": [
270
+ {
271
+ "path": ".github/workflows/release.yml",
272
+ "type": "workflow",
273
+ "content": (
274
+ "name: Release Pipeline\n"
275
+ "on:\n"
276
+ " release:\n"
277
+ " types: [published]\n"
278
+ "\n"
279
+ "jobs:\n"
280
+ " release:\n"
281
+ " runs-on: ubuntu-latest\n"
282
+ " steps:\n"
283
+ " - uses: actions/checkout@v4\n"
284
+ "\n"
285
+ " - name: Login to GHCR\n"
286
+ " run: echo ${{ secrets.DOCKER_PASSWORD }} | docker login ghcr.io -u ${{ github.actor }} --password-stdin\n"
287
+ "\n"
288
+ " - name: Build\n"
289
+ " run: docker build -t ghcr.io/${{ github.repository }}:${{ github.event.release.tag_name }} .\n"
290
+ "\n"
291
+ " - name: Push\n"
292
+ " run: docker push ghcr.io/${{ github.repository }}:${{ github.event.release.tag_name }}\n"
293
+ ),
294
+ },
295
+ {
296
+ "path": "Dockerfile",
297
+ "type": "dockerfile",
298
+ "content": (
299
+ "FROM python:3.9-slimm\n"
300
+ "WORKDIR /app\n"
301
+ "COPY requirements.txt .\n"
302
+ "RUN pip install -r requirements.txt\n"
303
+ "COPY . .\n"
304
+ "EXPOSE 8000\n"
305
+ 'CMD ["gunicorn", "app:app", "-b", "0.0.0.0:8000"]\n'
306
+ ),
307
+ },
308
+ {
309
+ "path": "requirements.txt",
310
+ "type": "requirements",
311
+ "content": "flask==3.0.0\ngunicorn==21.2.0\n",
312
+ },
313
+ {
314
+ "path": "k8s/deployment.yaml",
315
+ "type": "kubernetes",
316
+ "content": (
317
+ "apiVersion: apps/v1\n"
318
+ "kind: Deployment\n"
319
+ "metadata:\n"
320
+ " name: api\n"
321
+ "spec:\n"
322
+ " replicas: 3\n"
323
+ " selector:\n"
324
+ " matchLabels:\n"
325
+ " app: api\n"
326
+ " template:\n"
327
+ " metadata:\n"
328
+ " labels:\n"
329
+ " app: api\n"
330
+ " spec:\n"
331
+ " containers:\n"
332
+ " - name: api\n"
333
+ " image: ghcr.io/myorg/myapp:latest\n"
334
+ " ports:\n"
335
+ " - containerPort: 8000\n"
336
+ " resources:\n"
337
+ " limits:\n"
338
+ ' memory: "64Mi"\n'
339
+ ' cpu: "100m"\n'
340
+ ),
341
+ },
342
+ ],
343
+ "error": {
344
+ "phase": "pipeline_deploy",
345
+ "message": (
346
+ "Run: Release Pipeline\n"
347
+ "\n"
348
+ "Step: Login to GHCR ✗\n"
349
+ "Error: GHCR requires GITHUB_TOKEN for authentication, not DOCKER_PASSWORD\n"
350
+ "\n"
351
+ "---\n"
352
+ "Additional issues found:\n"
353
+ "- Dockerfile: pull access denied for python:3.9-slimm (typo in base image tag)\n"
354
+ "- K8s: Pod CrashLoopBackOff with OOMKilled (64Mi memory limit too low for gunicorn)"
355
+ ),
356
+ },
357
+ "expected_fixes": [
358
+ {
359
+ "file": ".github/workflows/release.yml",
360
+ "type": "contains",
361
+ "expected": "secrets.GITHUB_TOKEN",
362
+ "hint": "GHCR uses GITHUB_TOKEN, not DOCKER_PASSWORD",
363
+ },
364
+ {
365
+ "file": "Dockerfile",
366
+ "type": "not_contains",
367
+ "expected": "python:3.9-slimm",
368
+ "hint": "Base image tag has a typo: 'slimm' should be 'slim'",
369
+ },
370
+ {
371
+ "file": "k8s/deployment.yaml",
372
+ "type": "contains",
373
+ "expected": 'memory: "256Mi"',
374
+ "hint": "Memory limit 64Mi is too low for gunicorn — increase to at least 256Mi",
375
+ },
376
+ ],
377
+ },
378
+
379
+ # Scenario 4: Missing permissions block + hardcoded K8s image + missing ingress class
380
+ {
381
+ "id": "full_pipeline_permissions_image_ingress",
382
+ "files": [
383
+ {
384
+ "path": ".github/workflows/deploy.yml",
385
+ "type": "workflow",
386
+ "content": (
387
+ "name: Deploy to Production\n"
388
+ "on:\n"
389
+ " push:\n"
390
+ " branches: [main]\n"
391
+ "\n"
392
+ "jobs:\n"
393
+ " build-and-push:\n"
394
+ " runs-on: ubuntu-latest\n"
395
+ " steps:\n"
396
+ " - uses: actions/checkout@v4\n"
397
+ "\n"
398
+ " - name: Login to GHCR\n"
399
+ " run: echo ${{ secrets.GITHUB_TOKEN }} | docker login ghcr.io -u ${{ github.actor }} --password-stdin\n"
400
+ "\n"
401
+ " - name: Build and push\n"
402
+ " run: |\n"
403
+ " docker build -t ghcr.io/${{ github.repository }}:${{ github.sha }} .\n"
404
+ " docker push ghcr.io/${{ github.repository }}:${{ github.sha }}\n"
405
+ ),
406
+ },
407
+ {
408
+ "path": "Dockerfile",
409
+ "type": "dockerfile",
410
+ "content": (
411
+ "FROM node:20-alpine\n"
412
+ "WORKDIR /app\n"
413
+ "COPY package*.json ./\n"
414
+ "RUN npm ci\n"
415
+ "COPY . .\n"
416
+ "EXPOSE 3000\n"
417
+ 'CMD ["npm", "start"]\n'
418
+ ),
419
+ },
420
+ {
421
+ "path": "package.json",
422
+ "type": "other",
423
+ "content": '{"name": "app", "scripts": {"start": "node index.js"}}',
424
+ },
425
+ {
426
+ "path": "k8s/deployment.yaml",
427
+ "type": "kubernetes",
428
+ "content": (
429
+ "apiVersion: apps/v1\n"
430
+ "kind: Deployment\n"
431
+ "metadata:\n"
432
+ " name: webapp\n"
433
+ "spec:\n"
434
+ " replicas: 3\n"
435
+ " selector:\n"
436
+ " matchLabels:\n"
437
+ " app: webapp\n"
438
+ " template:\n"
439
+ " metadata:\n"
440
+ " labels:\n"
441
+ " app: webapp\n"
442
+ " spec:\n"
443
+ " containers:\n"
444
+ " - name: webapp\n"
445
+ " image: ghcr.io/OWNER/REPO:TAG\n"
446
+ " ports:\n"
447
+ " - containerPort: 3000\n"
448
+ ),
449
+ },
450
+ {
451
+ "path": "k8s/service.yaml",
452
+ "type": "kubernetes",
453
+ "content": (
454
+ "apiVersion: v1\n"
455
+ "kind: Service\n"
456
+ "metadata:\n"
457
+ " name: webapp-svc\n"
458
+ "spec:\n"
459
+ " selector:\n"
460
+ " app: webapp\n"
461
+ " ports:\n"
462
+ " - port: 80\n"
463
+ " targetPort: 3000\n"
464
+ ),
465
+ },
466
+ {
467
+ "path": "k8s/ingress.yaml",
468
+ "type": "kubernetes",
469
+ "content": (
470
+ "apiVersion: networking.k8s.io/v1\n"
471
+ "kind: Ingress\n"
472
+ "metadata:\n"
473
+ " name: webapp-ingress\n"
474
+ "spec:\n"
475
+ " rules:\n"
476
+ " - host: webapp.example.com\n"
477
+ " http:\n"
478
+ " paths:\n"
479
+ " - path: /\n"
480
+ " pathType: Prefix\n"
481
+ " backend:\n"
482
+ " service:\n"
483
+ " name: webapp-svc\n"
484
+ " port:\n"
485
+ " number: 80\n"
486
+ ),
487
+ },
488
+ ],
489
+ "error": {
490
+ "phase": "pipeline_deploy",
491
+ "message": (
492
+ "Run: Deploy to Production\n"
493
+ "\n"
494
+ "Step: Build and push ✗\n"
495
+ "Error: denied: permission_denied: write_package\n"
496
+ "GITHUB_TOKEN does not have packages:write permission\n"
497
+ "\n"
498
+ "---\n"
499
+ "Additional issues:\n"
500
+ "- K8s Deployment image is hardcoded as 'ghcr.io/OWNER/REPO:TAG' — "
501
+ "should reference the actual built image\n"
502
+ "- Ingress has no ingressClassName — won't be picked up by nginx controller"
503
+ ),
504
+ },
505
+ "expected_fixes": [
506
+ {
507
+ "file": ".github/workflows/deploy.yml",
508
+ "type": "contains",
509
+ "expected": "packages: write",
510
+ "hint": "Add permissions block with 'packages: write' to allow GHCR push",
511
+ },
512
+ {
513
+ "file": "k8s/deployment.yaml",
514
+ "type": "not_contains",
515
+ "expected": "OWNER/REPO:TAG",
516
+ "hint": "Replace hardcoded 'OWNER/REPO:TAG' placeholder with actual image reference",
517
+ },
518
+ {
519
+ "file": "k8s/ingress.yaml",
520
+ "type": "contains",
521
+ "expected": "ingressClassName: nginx",
522
+ "hint": "Add ingressClassName: nginx to the Ingress spec",
523
+ },
524
+ ],
525
+ },
526
+
527
+ # Scenario 5: Workflow secrets not wired + Dockerfile wrong output dir + K8s probe port wrong
528
+ {
529
+ "id": "full_pipeline_secrets_build_probe",
530
+ "files": [
531
+ {
532
+ "path": ".github/workflows/build.yml",
533
+ "type": "workflow",
534
+ "content": (
535
+ "name: Build and Push\n"
536
+ "on:\n"
537
+ " push:\n"
538
+ " branches: [main]\n"
539
+ "\n"
540
+ "jobs:\n"
541
+ " build:\n"
542
+ " runs-on: ubuntu-latest\n"
543
+ " steps:\n"
544
+ " - uses: actions/checkout@v4\n"
545
+ "\n"
546
+ " - name: Login to DockerHub\n"
547
+ " run: echo $DOCKER_PASSWORD | docker login -u $DOCKER_USERNAME --password-stdin\n"
548
+ "\n"
549
+ " - name: Build\n"
550
+ " run: docker build -t myuser/frontend:${{ github.sha }} .\n"
551
+ "\n"
552
+ " - name: Push\n"
553
+ " run: docker push myuser/frontend:${{ github.sha }}\n"
554
+ ),
555
+ },
556
+ {
557
+ "path": "Dockerfile",
558
+ "type": "dockerfile",
559
+ "content": (
560
+ "FROM node:20-alpine AS builder\n"
561
+ "WORKDIR /app\n"
562
+ "COPY package*.json ./\n"
563
+ "RUN npm ci\n"
564
+ "COPY . .\n"
565
+ "RUN npm run build\n"
566
+ "\n"
567
+ "FROM nginx:alpine\n"
568
+ "COPY --from=builder /app/dist /usr/share/nginx/html\n"
569
+ "EXPOSE 80\n"
570
+ 'CMD ["nginx", "-g", "daemon off;"]\n'
571
+ ),
572
+ },
573
+ {
574
+ "path": "package.json",
575
+ "type": "other",
576
+ "content": '{"name": "frontend", "scripts": {"build": "react-scripts build", "start": "react-scripts start"}}',
577
+ },
578
+ {
579
+ "path": "k8s/deployment.yaml",
580
+ "type": "kubernetes",
581
+ "content": (
582
+ "apiVersion: apps/v1\n"
583
+ "kind: Deployment\n"
584
+ "metadata:\n"
585
+ " name: frontend\n"
586
+ "spec:\n"
587
+ " replicas: 2\n"
588
+ " selector:\n"
589
+ " matchLabels:\n"
590
+ " app: frontend\n"
591
+ " template:\n"
592
+ " metadata:\n"
593
+ " labels:\n"
594
+ " app: frontend\n"
595
+ " spec:\n"
596
+ " containers:\n"
597
+ " - name: frontend\n"
598
+ " image: myuser/frontend:latest\n"
599
+ " ports:\n"
600
+ " - containerPort: 80\n"
601
+ " livenessProbe:\n"
602
+ " httpGet:\n"
603
+ " path: /healthz\n"
604
+ " port: 3000\n"
605
+ " initialDelaySeconds: 10\n"
606
+ " periodSeconds: 5\n"
607
+ ),
608
+ },
609
+ ],
610
+ "error": {
611
+ "phase": "pipeline_deploy",
612
+ "message": (
613
+ "Run: Build and Push\n"
614
+ "\n"
615
+ "Step: Login to DockerHub ✗\n"
616
+ "Error: DOCKER_USERNAME and DOCKER_PASSWORD env vars are empty — "
617
+ "secrets not wired via env block\n"
618
+ "\n"
619
+ "---\n"
620
+ "Additional issues:\n"
621
+ "- Dockerfile: COPY failed: stat app/dist: file does not exist "
622
+ "(react-scripts outputs to 'build/' not 'dist/')\n"
623
+ "- K8s: Liveness probe port 3000 doesn't match container port 80 "
624
+ "(nginx listens on 80)"
625
+ ),
626
+ },
627
+ "expected_fixes": [
628
+ {
629
+ "file": ".github/workflows/build.yml",
630
+ "type": "contains",
631
+ "expected": "DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}",
632
+ "hint": "Docker login secrets need to be mapped via env block",
633
+ },
634
+ {
635
+ "file": ".github/workflows/build.yml",
636
+ "type": "contains",
637
+ "expected": "DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}",
638
+ "hint": "Both DOCKER_USERNAME and DOCKER_PASSWORD must be in env block",
639
+ },
640
+ {
641
+ "file": "Dockerfile",
642
+ "type": "contains",
643
+ "expected": "COPY --from=builder /app/build",
644
+ "hint": "react-scripts outputs to 'build/' not 'dist/'",
645
+ },
646
+ {
647
+ "file": "k8s/deployment.yaml",
648
+ "type": "contains",
649
+ "expected": "port: 80",
650
+ "hint": "Liveness probe port should be 80 to match nginx container",
651
+ },
652
+ ],
653
+ },
654
+ ]
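Scenario 1's selector bug is representative of the cross-file reasoning these scenarios demand: a Service only routes traffic to pods whose labels satisfy every key/value pair in its `spec.selector`. A toy sketch with plain dicts standing in for parsed YAML — `selector_matches` is illustrative, not environment code:

```python
# Illustrative only: a Service selects a pod when every key/value pair in
# spec.selector also appears in the Deployment's pod template labels.
def selector_matches(service: dict, deployment: dict) -> bool:
    selector = service["spec"]["selector"]
    labels = deployment["spec"]["template"]["metadata"]["labels"]
    return all(labels.get(key) == value for key, value in selector.items())


deployment = {"spec": {"template": {"metadata": {"labels": {"app": "myapp"}}}}}
broken = {"spec": {"selector": {"app": "my-app"}}}  # scenario 1's bug
fixed = {"spec": {"selector": {"app": "myapp"}}}    # the expected fix

assert not selector_matches(broken, deployment)  # Service has no endpoints
assert selector_matches(fixed, deployment)
```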
server/tasks/task_registry.py CHANGED
@@ -10,6 +10,10 @@ from server.tasks.task_3_workflow_syntax import WorkflowSyntaxStructureTask
 from server.tasks.task_4_workflow_secrets_permissions import WorkflowSecretsPermissionsTask
 from server.tasks.task_5_ci_docker_integration import CIDockerIntegrationTask
 from server.tasks.task_6_multi_stage_matrix import MultiStageMatrixTask
+from server.tasks.k8s_pod import K8sPodTask
+from server.tasks.k8s_networking import K8sNetworkingTask
+from server.tasks.pipeline_build_deploy import PipelineBuildDeployTask
+from server.tasks.pipeline_full import PipelineFullTask
 
 TASK_REGISTRY: Dict[str, Type[BaseTask]] = {
     "dockerfile_syntax": DockerfileSyntaxTask,
@@ -18,6 +22,10 @@ TASK_REGISTRY: Dict[str, Type[BaseTask]] = {
     "workflow_secrets_permissions": WorkflowSecretsPermissionsTask,
     "ci_docker_integration": CIDockerIntegrationTask,
     "multi_stage_pipeline_matrix": MultiStageMatrixTask,
+    "k8s_pod_failures": K8sPodTask,
+    "k8s_networking": K8sNetworkingTask,
+    "pipeline_build_deploy": PipelineBuildDeployTask,
+    "pipeline_full_stack": PipelineFullTask,
 }
uv.lock CHANGED
@@ -333,7 +333,19 @@ wheels = [
 ]
 
 [[package]]
-name = "cicd-docker-env"
+name = "click"
+version = "8.3.2"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "colorama", marker = "sys_platform == 'win32'" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/57/75/31212c6bf2503fdf920d87fee5d7a86a2e3bcf444984126f13d8e4016804/click-8.3.2.tar.gz", hash = "sha256:14162b8b3b3550a7d479eafa77dfd3c38d9dc8951f6f69c78913a8f9a7540fd5", size = 302856, upload-time = "2026-04-03T19:14:45.118Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/e4/20/71885d8b97d4f3dde17b1fdb92dbd4908b00541c5a3379787137285f602e/click-8.3.2-py3-none-any.whl", hash = "sha256:1924d2c27c5653561cd2cae4548d1406039cb79b858b747cfea24924bbc1616d", size = 108379, upload-time = "2026-04-03T19:14:43.505Z" },
+]
+
+[[package]]
+name = "cloud-native-devops-env"
 version = "1.0.0"
 source = { editable = "." }
 dependencies = [
@@ -374,18 +386,6 @@ requires-dist = [
 ]
 provides-extras = ["dev", "inference"]
 
-[[package]]
-name = "click"
-version = "8.3.2"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "colorama", marker = "sys_platform == 'win32'" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/57/75/31212c6bf2503fdf920d87fee5d7a86a2e3bcf444984126f13d8e4016804/click-8.3.2.tar.gz", hash = "sha256:14162b8b3b3550a7d479eafa77dfd3c38d9dc8951f6f69c78913a8f9a7540fd5", size = 302856, upload-time = "2026-04-03T19:14:45.118Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/e4/20/71885d8b97d4f3dde17b1fdb92dbd4908b00541c5a3379787137285f602e/click-8.3.2-py3-none-any.whl", hash = "sha256:1924d2c27c5653561cd2cae4548d1406039cb79b858b747cfea24924bbc1616d", size = 108379, upload-time = "2026-04-03T19:14:43.505Z" },
-]
-
 [[package]]
 name = "colorama"
 version = "0.4.6"