sniki28 committed on
Commit
7a24c99
·
verified ·
1 Parent(s): 36bfac2

Upload server/app.py with huggingface_hub

Files changed (1)
  1. server/app.py +831 -0
server/app.py ADDED
@@ -0,0 +1,831 @@
1
+ """
2
+ FastAPI server exposing the Content Moderation Queue OpenEnv environment.
3
+
4
+ Each call to /reset creates an isolated session with its own state.
5
+ Pass the returned session_id to /step and /state to avoid interference
6
+ between concurrent users or test runs.
7
+ """
8
+
9
+ import uuid
10
+ from typing import Optional, Dict
11
+ from fastapi import FastAPI, HTTPException, Query
12
+ from fastapi.middleware.cors import CORSMiddleware
13
+ from fastapi.responses import HTMLResponse
14
+
15
+ from environment import ContentModerationEnv
16
+ from environment.models import Action, Observation, StepResult, EnvironmentState
17
+
18
+ # ═══════════════════════════════════════════════════════════════════════════
19
+ # Swagger CSS - warm cozy theme
20
+ # ═══════════════════════════════════════════════════════════════════════════
21
+
22
+ SWAGGER_CSS = """
23
+ body { background: #fdf0dc !important; }
24
+
25
+ .swagger-ui .topbar {
26
+ background: linear-gradient(135deg, #f4a833, #e8923a) !important;
27
+ border-bottom: 2px solid #e8923a !important;
28
+ padding: 10px 0 !important;
29
+ }
30
+ .swagger-ui .topbar a { color: #fff !important; }
31
+ .swagger-ui .topbar .download-url-wrapper .select-label select {
32
+ border-color: rgba(255,255,255,0.3) !important;
33
+ color: #fff !important;
34
+ }
35
+
36
+ .swagger-ui { color: #5a4530 !important; font-family: 'Inter', system-ui, sans-serif !important; }
37
+ .swagger-ui .info .title { color: #3d2b1a !important; font-weight: 700 !important; }
38
+ .swagger-ui .info .description p { color: #7a6550 !important; }
39
+ .swagger-ui .info .title small.version-stamp {
40
+ background: #f4a833 !important;
41
+ color: #fff !important;
42
+ border: none !important;
43
+ border-radius: 12px !important;
44
+ }
45
+
46
+ .swagger-ui .scheme-container {
47
+ background: #fbe8c8 !important;
48
+ border: 1px solid #f0d4a0 !important;
49
+ border-radius: 14px !important;
50
+ box-shadow: 0 2px 8px rgba(180,140,80,0.08) !important;
51
+ }
52
+
53
+ /* Operation blocks */
54
+ .swagger-ui .opblock {
55
+ border-radius: 14px !important;
56
+ box-shadow: 0 2px 10px rgba(180,140,80,0.08) !important;
57
+ margin-bottom: 14px !important;
58
+ overflow: hidden !important;
59
+ }
60
+ .swagger-ui .opblock .opblock-summary { border: none !important; }
61
+ .swagger-ui .opblock .opblock-summary-method { border-radius: 8px !important; font-weight: 700 !important; }
62
+ .swagger-ui .opblock .opblock-summary-description { color: #7a6550 !important; }
63
+ .swagger-ui .opblock .opblock-summary-path { color: #3d2b1a !important; }
64
+
65
+ /* GET blocks - warm teal */
66
+ .swagger-ui .opblock-get {
67
+ background: #fef9f0 !important;
68
+ border: 1.5px solid #b8d8c8 !important;
69
+ }
70
+ .swagger-ui .opblock-get .opblock-summary-method {
71
+ background: #5baa8a !important;
72
+ color: #fff !important;
73
+ }
74
+ .swagger-ui .opblock-get .opblock-summary { border-color: transparent !important; }
75
+
76
+ /* POST blocks - warm orange */
77
+ .swagger-ui .opblock-post {
78
+ background: #fef6ed !important;
79
+ border: 1.5px solid #f0c880 !important;
80
+ }
81
+ .swagger-ui .opblock-post .opblock-summary-method {
82
+ background: #f4a833 !important;
83
+ color: #fff !important;
84
+ }
85
+ .swagger-ui .opblock-post .opblock-summary { border-color: transparent !important; }
86
+
87
+ /* Body */
88
+ .swagger-ui .opblock-body { background: #fdf5e8 !important; }
89
+ .swagger-ui .opblock-body pre {
90
+ background: #fef9f0 !important;
91
+ color: #5a4530 !important;
92
+ border: 1px solid #f0d4a0 !important;
93
+ border-radius: 10px !important;
94
+ }
95
+ .swagger-ui .opblock-description-wrapper p { color: #7a6550 !important; }
96
+
97
+ /* Tables */
98
+ .swagger-ui table thead tr td, .swagger-ui table thead tr th {
99
+ color: #7a6550 !important;
100
+ border-color: #f0d4a0 !important;
101
+ }
102
+ .swagger-ui table tbody tr td {
103
+ color: #5a4530 !important;
104
+ border-color: #f5e0c0 !important;
105
+ }
106
+
107
+ /* Parameters */
108
+ .swagger-ui .parameter__name { color: #3d2b1a !important; }
109
+ .swagger-ui .parameter__type { color: #5baa8a !important; }
110
+ .swagger-ui .parameter__name.required::after { color: #e86040 !important; }
111
+ .swagger-ui .parameters-col_description p { color: #7a6550 !important; }
112
+
113
+ /* Inputs */
114
+ .swagger-ui input[type=text], .swagger-ui textarea, .swagger-ui select {
115
+ background: #fef9f0 !important;
116
+ color: #3d2b1a !important;
117
+ border: 1.5px solid #f0d4a0 !important;
118
+ border-radius: 10px !important;
119
+ font-family: 'JetBrains Mono', monospace !important;
120
+ }
121
+ .swagger-ui input[type=text]:focus, .swagger-ui textarea:focus {
122
+ border-color: #f4a833 !important;
123
+ box-shadow: 0 0 0 3px rgba(244,168,51,0.15) !important;
124
+ }
125
+
126
+ /* Execute button */
127
+ .swagger-ui .btn.execute {
128
+ background: linear-gradient(135deg, #f4a833, #e8923a) !important;
129
+ color: #fff !important;
130
+ border: none !important;
131
+ border-radius: 10px !important;
132
+ box-shadow: 0 3px 12px rgba(244,168,51,0.3) !important;
133
+ font-weight: 600 !important;
134
+ padding: 8px 24px !important;
135
+ }
136
+ .swagger-ui .btn.execute:hover {
137
+ box-shadow: 0 5px 20px rgba(244,168,51,0.4) !important;
138
+ transform: translateY(-1px);
139
+ }
140
+
141
+ /* Try-out button */
142
+ .swagger-ui .try-out__btn {
143
+ color: #f4a833 !important;
144
+ border-color: #f0c880 !important;
145
+ border-radius: 10px !important;
146
+ }
147
+ .swagger-ui .try-out__btn:hover { background: rgba(244,168,51,0.08) !important; }
148
+
149
+ /* Cancel */
150
+ .swagger-ui .btn-group .cancel { color: #7a6550 !important; border-color: #e0c8a0 !important; }
151
+
152
+ /* Responses */
153
+ .swagger-ui .responses-inner { background: transparent !important; }
154
+ .swagger-ui .response-col_status { color: #5baa8a !important; font-weight: 600 !important; }
155
+ .swagger-ui .response-col_description { color: #7a6550 !important; }
156
+
157
+ /* Live response */
158
+ .swagger-ui .microlight {
159
+ background: #fef9f0 !important;
160
+ color: #5a4530 !important;
161
+ border-radius: 10px !important;
162
+ border: 1px solid #f0d4a0 !important;
163
+ }
164
+
165
+ /* Models */
166
+ .swagger-ui section.models {
167
+ border: 1.5px solid #f0d4a0 !important;
168
+ border-radius: 14px !important;
169
+ background: #fef6ed !important;
170
+ }
171
+ .swagger-ui section.models h4 { color: #3d2b1a !important; border-color: #f0d4a0 !important; }
172
+ .swagger-ui .model-title { color: #3d2b1a !important; }
173
+ .swagger-ui .model { color: #5a4530 !important; }
174
+ .swagger-ui .model .property { color: #7a6550 !important; }
175
+ .swagger-ui .model .property.primitive { color: #5baa8a !important; }
176
+ .swagger-ui .prop-type { color: #c47830 !important; }
177
+ .swagger-ui .model-box { background: #fdf5e8 !important; border-radius: 10px !important; }
178
+ .swagger-ui section.models .model-container {
179
+ background: #fdf5e8 !important;
180
+ border-radius: 10px !important;
181
+ margin: 4px 0 !important;
182
+ }
183
+
184
+ /* Links */
185
+ .swagger-ui a { color: #e08030 !important; }
186
+ .swagger-ui a:hover { color: #c06020 !important; }
187
+
188
+ /* Section tags */
189
+ .swagger-ui .opblock-tag { color: #3d2b1a !important; border-color: #f0d4a0 !important; }
190
+
191
+ /* Expand arrows */
192
+ .swagger-ui .expand-operation svg, .swagger-ui .expand-methods svg { fill: #c4a070 !important; }
193
+
194
+ /* Markdown */
195
+ .swagger-ui .markdown p, .swagger-ui .renderedMarkdown p { color: #7a6550 !important; }
196
+ .swagger-ui .markdown li, .swagger-ui .renderedMarkdown li { color: #7a6550 !important; }
197
+ .swagger-ui .markdown code {
198
+ background: #fbe8c8 !important;
199
+ color: #c47830 !important;
200
+ border-radius: 6px !important;
201
+ padding: 1px 6px !important;
202
+ }
203
+
204
+ /* Scrollbar */
205
+ ::-webkit-scrollbar { width: 6px; }
206
+ ::-webkit-scrollbar-track { background: #fdf0dc; }
207
+ ::-webkit-scrollbar-thumb { background: #e0c8a0; border-radius: 3px; }
208
+ """
209
+
210
+ # ═══════════════════════════════════════════════════════════════════════════
211
+ # App setup
212
+ # ═══════════════════════════════════════════════════════════════════════════
213
+
214
+ app = FastAPI(
215
+     title="Content Moderation Queue - OpenEnv",
216
+     description=(
217
+         "A real-world content moderation environment where AI agents learn "
218
+         "to triage social media posts using a tiered policy framework. "
219
+         "Implements the full OpenEnv spec: step() / reset() / state(). "
220
+         "Each /reset call creates an isolated session; pass session_id to /step and /state."
221
+     ),
222
+     version="1.0.0",
223
+     docs_url=None,
224
+ )
225
+
226
+ app.add_middleware(
227
+     CORSMiddleware,
228
+     allow_origins=["*"],
229
+     allow_methods=["*"],
230
+     allow_headers=["*"],
231
+ )
232
+
233
+ _sessions: Dict[str, ContentModerationEnv] = {}
234
+ MAX_SESSIONS = 200
235
+ _shared_env = ContentModerationEnv()
236
+
237
+
238
+ def _get_session(session_id: str) -> ContentModerationEnv:
239
+     if session_id not in _sessions:
240
+         raise HTTPException(
241
+             status_code=404,
242
+             detail=f"Session '{session_id}' not found. Call POST /reset first to create a session."
243
+         )
244
+     return _sessions[session_id]
245
+
246
+
247
+ def _new_session() -> tuple[str, ContentModerationEnv]:
248
+     sid = str(uuid.uuid4())[:8]
249
+     if len(_sessions) >= MAX_SESSIONS:
250
+         oldest = next(iter(_sessions))  # dicts preserve insertion order, so this is the oldest session
251
+         del _sessions[oldest]
252
+     env = ContentModerationEnv()
253
+     _sessions[sid] = env
254
+     return sid, env
255
+
256
+
257
+ # ═══════════════════════════════════════════════════════════════════════════
258
+ # Landing page - warm cozy theme
259
+ # ═══════════════════════════════════════════════════════════════════════════
260
+
261
+ LANDING_HTML = """
262
+ <!DOCTYPE html>
263
+ <html lang="en">
264
+ <head>
265
+ <meta charset="UTF-8">
266
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
267
+ <title>Content Moderation Queue &mdash; OpenEnv</title>
268
+ <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700;800&display=swap" rel="stylesheet">
269
+ <style>
270
+ :root {
271
+ --cream: #fdf0dc;
272
+ --cream2: #fbe8c8;
273
+ --card: #fef6ed;
274
+ --card2: #fef9f0;
275
+ --border: #f0d4a0;
276
+ --border2: #e8c890;
277
+ --text: #3d2b1a;
278
+ --text2: #5a4530;
279
+ --muted: #9a8060;
280
+ --orange: #f4a833;
281
+ --orange2: #e8923a;
282
+ --peach: #f8c06a;
283
+ --teal: #5baa8a;
284
+ --red-soft: #e86040;
285
+ --rose: #e88070;
286
+ --sand: #d4b888;
287
+ }
288
+
289
+ * { margin: 0; padding: 0; box-sizing: border-box; }
290
+
291
+ body {
292
+ font-family: 'Inter', system-ui, sans-serif;
293
+ background: var(--cream);
294
+ color: var(--text2);
295
+ min-height: 100vh;
296
+ background-image:
297
+ radial-gradient(ellipse at 30% 0%, rgba(244,168,51,0.08) 0%, transparent 50%),
298
+ radial-gradient(ellipse at 80% 100%, rgba(232,146,58,0.06) 0%, transparent 50%);
299
+ }
300
+
301
+ /* ═══ HERO ═══ */
302
+ .hero {
303
+ background: linear-gradient(160deg, #fbe8c8 0%, #f8d8a4 40%, #f4c87a 100%);
304
+ padding: 52px 24px 44px;
305
+ text-align: center;
306
+ position: relative;
307
+ overflow: hidden;
308
+ border-bottom: 2px solid var(--border);
309
+ }
310
+ .hero::before {
311
+ content: '';
312
+ position: absolute;
313
+ bottom: -2px; left: 0; right: 0;
314
+ height: 40px;
315
+ background: url("data:image/svg+xml,%3Csvg viewBox='0 0 1200 40' xmlns='http://www.w3.org/2000/svg'%3E%3Cpath d='M0,20 Q150,0 300,20 Q450,40 600,20 Q750,0 900,20 Q1050,40 1200,20 V40 H0 Z' fill='%23fdf0dc'/%3E%3C/svg%3E") no-repeat center;
316
+ background-size: cover;
317
+ }
318
+
319
+ .hero-icon {
320
+ width: 72px; height: 72px;
321
+ background: rgba(255,255,255,0.6);
322
+ border: 2px solid rgba(255,255,255,0.8);
323
+ border-radius: 20px;
324
+ display: inline-flex;
325
+ align-items: center;
326
+ justify-content: center;
327
+ font-size: 2.2rem;
328
+ margin-bottom: 16px;
329
+ box-shadow: 0 4px 16px rgba(200,150,60,0.15);
330
+ }
331
+
332
+ .hero h1 {
333
+ font-size: 2rem;
334
+ font-weight: 800;
335
+ color: var(--text);
336
+ letter-spacing: -0.5px;
337
+ margin-bottom: 10px;
338
+ }
339
+
340
+ .badges { display: flex; gap: 8px; justify-content: center; margin-bottom: 16px; }
341
+ .badge {
342
+ padding: 4px 14px;
343
+ border-radius: 20px;
344
+ font-size: 0.68rem;
345
+ font-weight: 700;
346
+ text-transform: uppercase;
347
+ letter-spacing: 0.8px;
348
+ }
349
+ .b-env { background: var(--orange); color: #fff; }
350
+ .b-live { background: var(--teal); color: #fff; }
351
+ .b-ver { background: rgba(255,255,255,0.6); color: var(--muted); border: 1px solid var(--border); }
352
+
353
+ .hero-desc {
354
+ max-width: 520px;
355
+ margin: 0 auto;
356
+ color: var(--text2);
357
+ font-size: 0.92rem;
358
+ line-height: 1.65;
359
+ opacity: 0.85;
360
+ }
361
+
362
+ .container { max-width: 900px; margin: 0 auto; padding: 32px 24px; }
363
+
364
+ /* ═══ CARD BASE ═══ */
365
+ .card {
366
+ background: var(--card);
367
+ border: 1.5px solid var(--border);
368
+ border-radius: 16px;
369
+ box-shadow: 0 2px 10px rgba(180,140,80,0.06);
370
+ }
371
+
372
+ /* ═══ STATS ═══ */
373
+ .stats { display: grid; grid-template-columns: repeat(4, 1fr); gap: 14px; margin-bottom: 32px; }
374
+ .stat { padding: 20px 14px; text-align: center; }
375
+ .stat-val {
376
+ font-size: 1.8rem;
377
+ font-weight: 800;
378
+ color: var(--orange2);
379
+ }
380
+ .stat-lbl {
381
+ font-size: 0.7rem;
382
+ color: var(--muted);
383
+ text-transform: uppercase;
384
+ letter-spacing: 0.8px;
385
+ margin-top: 4px;
386
+ font-weight: 500;
387
+ }
388
+
389
+ /* ═══ SECTION TITLE ═══ */
390
+ .stitle {
391
+ font-size: 0.72rem;
392
+ font-weight: 700;
393
+ color: var(--muted);
394
+ text-transform: uppercase;
395
+ letter-spacing: 1.5px;
396
+ margin-bottom: 14px;
397
+ display: flex;
398
+ align-items: center;
399
+ gap: 12px;
400
+ }
401
+ .stitle::after {
402
+ content: '';
403
+ flex: 1;
404
+ height: 1.5px;
405
+ background: linear-gradient(90deg, var(--border), transparent);
406
+ }
407
+
408
+ /* ═══ HOW IT WORKS ═══ */
409
+ .flow { display: grid; grid-template-columns: repeat(4, 1fr); gap: 12px; margin-bottom: 32px; }
410
+ .flow-card {
411
+ padding: 20px 14px;
412
+ text-align: center;
413
+ position: relative;
414
+ overflow: hidden;
415
+ }
416
+ .flow-card::before {
417
+ content: '';
418
+ position: absolute;
419
+ top: 0; left: 0; right: 0;
420
+ height: 3px;
421
+ background: linear-gradient(90deg, var(--orange), var(--peach));
422
+ border-radius: 16px 16px 0 0;
423
+ }
424
+ .flow-n {
425
+ display: inline-flex;
426
+ width: 32px; height: 32px;
427
+ align-items: center;
428
+ justify-content: center;
429
+ background: var(--orange);
430
+ color: #fff;
431
+ border-radius: 10px;
432
+ font-size: 0.85rem;
433
+ font-weight: 700;
434
+ }
435
+ .flow-t { font-size: 0.88rem; font-weight: 600; color: var(--text); margin-top: 8px; }
436
+ .flow-d { font-size: 0.75rem; color: var(--muted); margin-top: 4px; }
437
+ .flow-c {
438
+ display: inline-block;
439
+ margin-top: 8px;
440
+ font-family: 'JetBrains Mono', monospace;
441
+ font-size: 0.67rem;
442
+ color: var(--teal);
443
+ background: var(--cream2);
444
+ padding: 3px 10px;
445
+ border-radius: 8px;
446
+ border: 1px solid var(--border);
447
+ font-weight: 500;
448
+ }
449
+
450
+ /* ═══ TASKS ═══ */
451
+ .tasks { display: grid; grid-template-columns: repeat(3, 1fr); gap: 14px; margin-bottom: 32px; }
452
+ .task { padding: 22px 18px; position: relative; overflow: hidden; }
453
+ .task::before {
454
+ content: '';
455
+ position: absolute;
456
+ left: 0; top: 0; bottom: 0;
457
+ width: 4px;
458
+ border-radius: 16px 0 0 16px;
459
+ }
460
+ .t-easy::before { background: var(--teal); }
461
+ .t-med::before { background: var(--orange); }
462
+ .t-hard::before { background: var(--red-soft); }
463
+ .task-diff {
464
+ font-size: 0.62rem;
465
+ font-weight: 700;
466
+ text-transform: uppercase;
467
+ letter-spacing: 1.2px;
468
+ margin-bottom: 6px;
469
+ }
470
+ .t-easy .task-diff { color: var(--teal); }
471
+ .t-med .task-diff { color: var(--orange2); }
472
+ .t-hard .task-diff { color: var(--red-soft); }
473
+ .task-name { font-size: 0.92rem; font-weight: 700; color: var(--text); margin-bottom: 6px; }
474
+ .task-desc { font-size: 0.78rem; color: var(--muted); line-height: 1.5; margin-bottom: 12px; }
475
+ .chips { display: flex; gap: 6px; flex-wrap: wrap; }
476
+ .chip {
477
+ font-size: 0.65rem;
478
+ font-weight: 500;
479
+ padding: 3px 10px;
480
+ border-radius: 8px;
481
+ background: var(--cream2);
482
+ border: 1px solid var(--border);
483
+ color: var(--muted);
484
+ }
485
+
486
+ /* ═══ ACTIONS ═══ */
487
+ .actions { display: grid; grid-template-columns: repeat(3, 1fr); gap: 10px; margin-bottom: 32px; }
488
+ .act { padding: 14px 16px; transition: transform 0.2s; }
489
+ .act:hover { transform: translateY(-2px); }
490
+ .act-name {
491
+ font-family: 'JetBrains Mono', monospace;
492
+ font-size: 0.82rem;
493
+ font-weight: 600;
494
+ margin-bottom: 3px;
495
+ }
496
+ .act-desc { font-size: 0.72rem; color: var(--muted); }
497
+ .a-approve .act-name { color: var(--teal); }
498
+ .a-warn .act-name { color: var(--orange); }
499
+ .a-remove .act-name { color: var(--orange2); }
500
+ .a-bant .act-name { color: var(--red-soft); }
501
+ .a-banp .act-name { color: #c43030; }
502
+ .a-esc .act-name { color: #9070b0; }
503
+
504
+ /* ═══ ENDPOINTS ═══ */
505
+ .ep-table { margin-bottom: 32px; overflow: hidden; }
506
+ .ep {
507
+ display: flex;
508
+ align-items: center;
509
+ padding: 12px 18px;
510
+ gap: 14px;
511
+ border-bottom: 1px solid rgba(240,212,160,0.5);
512
+ }
513
+ .ep:last-child { border-bottom: none; }
514
+ .ep-m {
515
+ font-size: 0.64rem;
516
+ font-weight: 700;
517
+ padding: 4px 10px;
518
+ border-radius: 8px;
519
+ text-transform: uppercase;
520
+ letter-spacing: 0.5px;
521
+ min-width: 48px;
522
+ text-align: center;
523
+ }
524
+ .ep-g { background: rgba(91,170,138,0.12); color: var(--teal); border: 1px solid rgba(91,170,138,0.25); }
525
+ .ep-p { background: rgba(244,168,51,0.12); color: var(--orange2); border: 1px solid rgba(244,168,51,0.25); }
526
+ .ep-path {
527
+ font-family: 'JetBrains Mono', monospace;
528
+ font-size: 0.8rem;
529
+ color: var(--text);
530
+ flex: 1;
531
+ font-weight: 500;
532
+ }
533
+ .ep-info { font-size: 0.78rem; color: var(--muted); }
534
+
535
+ /* ═══ REWARD ═══ */
536
+ .reward { padding: 22px 20px; margin-bottom: 32px; }
537
+ .reward h3 { font-size: 0.95rem; font-weight: 700; color: var(--text); margin-bottom: 14px; }
538
+ .r-item { display: flex; align-items: center; gap: 10px; margin-bottom: 10px; font-size: 0.82rem; color: var(--text2); }
539
+ .r-dot { width: 10px; height: 10px; border-radius: 50%; flex-shrink: 0; }
540
+ .rg { background: var(--teal); }
541
+ .ry { background: var(--orange); }
542
+ .rb { background: #5a9ac0; }
543
+ .rr { background: var(--rose); }
544
+
545
+ /* ═══ BASELINE ═══ */
546
+ .baseline { padding: 22px 20px; margin-bottom: 32px; }
547
+ .baseline h3 { font-size: 0.92rem; font-weight: 700; color: var(--text); margin-bottom: 6px; }
548
+ .bl-sub { font-size: 0.73rem; color: var(--muted); margin-bottom: 16px; }
549
+ .sc-row { display: flex; align-items: center; gap: 12px; margin-bottom: 10px; }
550
+ .sc-lbl { font-size: 0.78rem; width: 72px; font-weight: 600; }
551
+ .sc-bg {
552
+ flex: 1; height: 24px; border-radius: 12px;
553
+ background: var(--cream2);
554
+ border: 1px solid var(--border);
555
+ overflow: hidden;
556
+ }
557
+ .sc-bar { height: 100%; border-radius: 11px; }
558
+ .sc-bar-g { background: linear-gradient(90deg, #4a9a7a, var(--teal)); }
559
+ .sc-bar-y { background: linear-gradient(90deg, #d4922a, var(--orange)); }
560
+ .sc-bar-r { background: linear-gradient(90deg, #c84830, var(--red-soft)); }
561
+ .sc-val { font-size: 0.82rem; font-weight: 700; width: 48px; text-align: right; color: var(--text); }
562
+
563
+ /* ═══ CTA ═══ */
564
+ .ctas { display: flex; gap: 12px; margin-bottom: 36px; }
565
+ .cta {
566
+ flex: 1;
567
+ display: block;
568
+ text-align: center;
569
+ padding: 14px 20px;
570
+ border-radius: 12px;
571
+ text-decoration: none;
572
+ font-weight: 600;
573
+ font-size: 0.88rem;
574
+ transition: all 0.25s ease;
575
+ }
576
+ .cta:hover { transform: translateY(-2px); }
577
+ .cta-1 {
578
+ background: linear-gradient(135deg, var(--orange), var(--orange2));
579
+ color: #fff;
580
+ box-shadow: 0 4px 16px rgba(244,168,51,0.25);
581
+ }
582
+ .cta-1:hover { box-shadow: 0 6px 24px rgba(244,168,51,0.35); }
583
+ .cta-2 {
584
+ background: var(--card);
585
+ color: var(--teal);
586
+ border: 1.5px solid rgba(91,170,138,0.3);
587
+ }
588
+ .cta-2:hover { background: rgba(91,170,138,0.06); }
589
+ .cta-3 {
590
+ background: var(--card);
591
+ color: var(--text2);
592
+ border: 1.5px solid var(--border);
593
+ }
594
+
595
+ /* ═══ FOOTER ═══ */
596
+ .footer {
597
+ text-align: center;
598
+ padding: 28px 0;
599
+ border-top: 1.5px solid var(--border);
600
+ color: var(--muted);
601
+ font-size: 0.75rem;
602
+ }
603
+ .footer a { color: var(--orange2); text-decoration: none; }
604
+ .footer a:hover { text-decoration: underline; }
605
+
606
+ /* ═══ DECORATIVE BLOBS ═══ */
607
+ .blob {
608
+ position: absolute;
609
+ border-radius: 50%;
610
+ opacity: 0.12;
611
+ pointer-events: none;
612
+ }
613
+ .blob-1 { width: 200px; height: 200px; background: var(--orange); top: -60px; right: -40px; }
614
+ .blob-2 { width: 140px; height: 140px; background: var(--peach); bottom: -30px; left: -30px; }
615
+
616
+ @media (max-width: 720px) {
617
+ .stats, .flow { grid-template-columns: repeat(2, 1fr); }
618
+ .tasks, .actions { grid-template-columns: 1fr; }
619
+ .ctas { flex-direction: column; }
620
+ }
621
+ </style>
622
+ </head>
623
+ <body>
624
+
625
+ <div class="hero">
626
+ <div class="blob blob-1"></div>
627
+ <div class="blob blob-2"></div>
628
+ <div class="hero-icon">&#128737;&#65039;</div>
629
+ <h1>Content Moderation Queue</h1>
630
+ <div class="badges">
631
+ <span class="badge b-env">OpenEnv</span>
632
+ <span class="badge b-live">Live</span>
633
+ <span class="badge b-ver">v1.0.0</span>
634
+ </div>
635
+ <p class="hero-desc">A real-world RL environment simulating Trust &amp; Safety moderation.
636
+ AI agents triage social media posts, handle appeals, detect crisis content,
637
+ and apply graduated policy enforcement.</p>
638
+ </div>
639
+
640
+ <div class="container">
641
+
642
+ <div class="stats">
643
+ <div class="card stat"><div class="stat-val">30</div><div class="stat-lbl">Labeled Posts</div></div>
644
+ <div class="card stat"><div class="stat-val">3</div><div class="stat-lbl">Difficulty Levels</div></div>
645
+ <div class="card stat"><div class="stat-val">6</div><div class="stat-lbl">Action Types</div></div>
646
+ <div class="card stat"><div class="stat-val">9</div><div class="stat-lbl">Violation Types</div></div>
647
+ </div>
648
+
649
+ <div class="stitle">How It Works</div>
650
+ <div class="flow">
651
+ <div class="card flow-card"><div class="flow-n">1</div><div class="flow-t">Reset</div><div class="flow-d">Start episode, pick difficulty</div><div class="flow-c">POST /reset</div></div>
652
+ <div class="card flow-card"><div class="flow-n">2</div><div class="flow-t">Observe</div><div class="flow-d">Read post, history, context</div><div class="flow-c">session_id</div></div>
653
+ <div class="card flow-card"><div class="flow-n">3</div><div class="flow-t">Decide</div><div class="flow-d">Choose action + violation</div><div class="flow-c">POST /step</div></div>
654
+ <div class="card flow-card"><div class="flow-n">4</div><div class="flow-t">Score</div><div class="flow-d">Get reward 0.0 - 1.0</div><div class="flow-c">GET /state</div></div>
655
+ </div>
656
+
657
+ <div class="stitle">Tasks</div>
658
+ <div class="tasks">
659
+ <div class="card task t-easy">
660
+ <div class="task-diff">Easy</div>
661
+ <div class="task-name">Binary Content Moderation</div>
662
+ <div class="task-desc">Classify 8 posts as safe or harmful. Clear signals: spam, hate speech vs everyday content.</div>
663
+ <div class="chips"><span class="chip">8 posts</span><span class="chip">2 actions</span><span class="chip">Binary</span></div>
664
+ </div>
665
+ <div class="card task t-med">
666
+ <div class="task-diff">Medium</div>
667
+ <div class="task-name">Tiered Policy Enforcement</div>
668
+ <div class="task-desc">Apply 5-level severity. Harsh criticism, repeat offenders, political content, zero-tolerance.</div>
669
+ <div class="chips"><span class="chip">10 posts</span><span class="chip">5 actions</span><span class="chip">Partial credit</span></div>
670
+ </div>
671
+ <div class="card task t-hard">
672
+ <div class="task-diff">Hard</div>
673
+ <div class="task-name">Full Queue + Appeals</div>
674
+ <div class="task-desc">Appeals, gaming slang vs threats, crisis escalation, repeat offenders, second-appeal denials.</div>
675
+ <div class="chips"><span class="chip">12 posts</span><span class="chip">6 actions</span><span class="chip">Context bonus</span></div>
676
+ </div>
677
+ </div>
678
+
679
+ <div class="stitle">Action Space</div>
680
+ <div class="actions">
681
+ <div class="card act a-approve"><div class="act-name">approve</div><div class="act-desc">Safe, policy-compliant content</div></div>
682
+ <div class="card act a-warn"><div class="act-name">warn</div><div class="act-desc">Minor first-time violation</div></div>
683
+ <div class="card act a-remove"><div class="act-name">remove</div><div class="act-desc">Clear policy violation</div></div>
684
+ <div class="card act a-bant"><div class="act-name">ban_temporary</div><div class="act-desc">Repeat or serious offense</div></div>
685
+ <div class="card act a-banp"><div class="act-name">ban_permanent</div><div class="act-desc">Zero-tolerance or 5+ violations</div></div>
686
+ <div class="card act a-esc"><div class="act-name">escalate</div><div class="act-desc">Ambiguous, appeals, or crisis</div></div>
687
+ </div>
688
+
689
+ <div class="stitle">API Endpoints</div>
690
+ <div class="card ep-table">
691
+ <div class="ep"><span class="ep-m ep-g">GET</span><span class="ep-path">/health</span><span class="ep-info">Liveness check</span></div>
692
+ <div class="ep"><span class="ep-m ep-g">GET</span><span class="ep-path">/tasks</span><span class="ep-info">List all tasks</span></div>
693
+ <div class="ep"><span class="ep-m ep-p">POST</span><span class="ep-path">/reset?task_id=task_easy</span><span class="ep-info">Start episode, get session_id</span></div>
694
+ <div class="ep"><span class="ep-m ep-p">POST</span><span class="ep-path">/step?session_id=abc123</span><span class="ep-info">Submit action, get reward</span></div>
695
+ <div class="ep"><span class="ep-m ep-g">GET</span><span class="ep-path">/state?session_id=abc123</span><span class="ep-info">Full state + score</span></div>
696
+ </div>
697
+
698
+ <div class="stitle">Reward Design</div>
699
+ <div class="card reward">
700
+ <div class="r-item"><div class="r-dot rg"></div><div><strong>Non-sparse:</strong> every post scores independently (0.0 - 1.0)</div></div>
701
+ <div class="r-item"><div class="r-dot ry"></div><div><strong>Partial credit:</strong> one severity level off scores ~0.65 instead of 0.0</div></div>
702
+ <div class="r-item"><div class="r-dot rb"></div><div><strong>Context bonus:</strong> +0.3 for history-dependent or context-dependent posts</div></div>
703
+ <div class="r-item"><div class="r-dot rr"></div><div><strong>Violation ID:</strong> correctly identifying the violation category earns bonus</div></div>
704
+ </div>
705
+
706
+ <div class="stitle">Baseline Scores</div>
707
+ <div class="card baseline">
708
+ <h3>Meta Llama 3 8B Instruct</h3>
709
+ <div class="bl-sub">temperature=0 | seed=42 | reproducible</div>
710
+ <div class="sc-row"><span class="sc-lbl" style="color:var(--teal)">Easy</span><div class="sc-bg"><div class="sc-bar sc-bar-g" style="width:50%"></div></div><span class="sc-val">0.500</span></div>
711
+ <div class="sc-row"><span class="sc-lbl" style="color:var(--orange2)">Medium</span><div class="sc-bg"><div class="sc-bar sc-bar-y" style="width:53%"></div></div><span class="sc-val">0.533</span></div>
712
+ <div class="sc-row"><span class="sc-lbl" style="color:var(--red-soft)">Hard</span><div class="sc-bg"><div class="sc-bar sc-bar-r" style="width:42%"></div></div><span class="sc-val">0.423</span></div>
713
+ </div>
714
+
715
+ <div class="ctas">
716
+ <a class="cta cta-1" href="/docs">Interactive API Docs</a>
717
+ <a class="cta cta-2" href="/tasks">View Tasks</a>
718
+ <a class="cta cta-3" href="/health">Health Check</a>
719
+ </div>
720
+
721
+ <div class="footer">
722
+ Content Moderation Queue &mdash; OpenEnv v1.0.0<br>
723
+ Built for the Meta AI Hackathon | <a href="/docs">API Docs</a>
724
+ </div>
725
+
726
+ </div>
727
+ </body>
728
+ </html>
729
+ """
730
+
731
+ @app.get("/", response_class=HTMLResponse, include_in_schema=False)
732
+ def root():
733
+     return LANDING_HTML
734
+
735
+
736
+ @app.get("/docs", include_in_schema=False)
737
+ def custom_docs():
738
+     return HTMLResponse(f"""
739
+ <!DOCTYPE html>
740
+ <html><head>
741
+ <title>Content Moderation Queue &mdash; API Docs</title>
742
+ <meta charset="utf-8"/>
743
+ <meta name="viewport" content="width=device-width, initial-scale=1">
744
+ <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/swagger-ui-dist@5/swagger-ui.css">
745
+ <style>{SWAGGER_CSS}</style>
746
+ </head><body>
747
+ <div id="swagger-ui"></div>
748
+ <script src="https://cdn.jsdelivr.net/npm/swagger-ui-dist@5/swagger-ui-bundle.js"></script>
749
+ <script>
750
+ SwaggerUIBundle({{
751
+ url: '/openapi.json',
752
+ dom_id: '#swagger-ui',
753
+ presets: [SwaggerUIBundle.presets.apis, SwaggerUIBundle.SwaggerUIStandalonePreset],
754
+ layout: "BaseLayout",
755
+ defaultModelsExpandDepth: 1,
756
+ docExpansion: "list",
757
+ }})
758
+ </script>
759
+ </body></html>
760
+     """)
761
+
762
+
763
+ # ═══════════════════════════════════════════════════════════════════════════
764
+ # API Endpoints
765
+ # ═══════════════════════════════════════════════════════════════════════════
766
+
767
+ @app.get("/health")
768
+ def health():
769
+     """Liveness probe: returns 200 when the server is ready."""
770
+     return {"status": "ok", "environment": "content-moderation-queue", "version": "1.0.0"}
771
+
772
+
773
+ @app.get("/tasks")
774
+ def list_tasks():
775
+     """List all available tasks with metadata."""
776
+     return {"tasks": _shared_env.list_tasks()}
777
+
778
+
779
+ @app.post("/reset", response_model=Observation)
780
+ def reset(
781
+     task_id: str = Query(default="task_easy", description="One of: task_easy, task_medium, task_hard"),
782
+     seed: Optional[int] = Query(default=None, description="Seed for post order. None=random each episode, integer=fixed reproducible order"),
783
+ ):
784
+     """
785
+     Start a new episode. Creates an **isolated session** for you.
786
+
787
+     - **task_id**: Which task to run (task_easy | task_medium | task_hard)
788
+     - **seed**: Optional. Omit for random post order (RL training). Pass an integer (e.g. 42) for reproducible order.
789
+
790
+     The response includes a **session_id**; copy it and pass it to every `/step` and `/state` call.
791
+     """
792
+     try:
793
+         sid, env = _new_session()
794
+         obs = env.reset(task_id=task_id, seed=seed, session_id=sid)
795
+         return obs
796
+     except ValueError as e:
797
+         raise HTTPException(status_code=400, detail=str(e))
798
+
799
+
800
+ @app.post("/step", response_model=StepResult)
801
+ def step(
802
+     action: Action,
803
+     session_id: str = Query(..., description="session_id from /reset response"),
804
+ ):
805
+     """
806
+     Submit a moderation decision for the current post in your session.
807
+
808
+     **Action fields:**
809
+     - `action_type`: One of approve / warn / remove / ban_temporary / ban_permanent / escalate
810
+     - `reasoning`: Optional explanation (logged, not graded)
811
+     - `violation_type`: Optional. One of spam / hate_speech / harassment / misinformation / csam / illegal_services / doxxing / self_harm_risk / none
812
+
813
+     Returns the next Observation, reward (0.0-1.0), done flag, and info dict.
814
+     """
815
+     env = _get_session(session_id)
816
+     try:
817
+         return env.step(action)
818
+     except RuntimeError as e:
819
+         raise HTTPException(status_code=400, detail=str(e))
820
+
821
+
822
+ @app.get("/state", response_model=EnvironmentState)
823
+ def state(
824
+     session_id: str = Query(..., description="session_id from /reset response"),
825
+ ):
826
+     """
827
+     Return a full snapshot of your session's current state.
828
+     Includes the step count, cumulative reward, all decisions so far, and final_score once the episode is done.
829
+     """
830
+     env = _get_session(session_id)
831
+     return env.state()
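
The session bookkeeping in `_new_session` and `_get_session` amounts to a bounded map that evicts its oldest entry once `MAX_SESSIONS` is reached, relying on Python dicts preserving insertion order. The sketch below isolates that idea so it can be exercised without the server. It is not code from this commit: the `SessionStore` class, its lock, and the ID-collision retry are hypothetical additions for illustration, and plain strings stand in for `ContentModerationEnv` instances. The lock is included on the assumption that handlers run concurrently, since FastAPI executes plain `def` endpoints in a thread pool.

```python
import threading
import uuid
from typing import Dict, Generic, TypeVar

T = TypeVar("T")


class SessionStore(Generic[T]):
    """Hypothetical bounded session map with oldest-first (FIFO) eviction."""

    def __init__(self, max_sessions: int = 200) -> None:
        if max_sessions < 1:
            raise ValueError("max_sessions must be at least 1")
        self._max = max_sessions
        self._items: Dict[str, T] = {}
        self._lock = threading.Lock()

    def create(self, value: T) -> str:
        """Store value under a fresh 8-character ID and return that ID."""
        with self._lock:
            sid = uuid.uuid4().hex[:8]
            # Assumption: retry on the rare collision of a truncated UUID,
            # so an existing session is never silently overwritten.
            while sid in self._items:
                sid = uuid.uuid4().hex[:8]
            if len(self._items) >= self._max:
                # Dicts keep insertion order, so the first key is the oldest.
                oldest = next(iter(self._items))
                del self._items[oldest]
            self._items[sid] = value
            return sid

    def get(self, sid: str) -> T:
        """Return the stored value; raises KeyError for unknown or evicted IDs."""
        with self._lock:
            return self._items[sid]

    def __len__(self) -> int:
        with self._lock:
            return len(self._items)
```

In a server like the one above, the `KeyError` from `get` would be translated into the 404 response that `_get_session` raises. Eviction here is by creation time only, so a long-running but still active session can be dropped when many newer sessions are created.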