AR Scene Graph Builder: OpenEnv Environment
An RL environment for AR spatial placement that simulates the object-anchoring decisions made by Meta Ray-Ban / Orion AR glasses. It trains an agent to place virtual AR objects (notifications, screens, arrows, panels, widgets) in physical rooms with optimal gaze alignment, spatial validity, occlusion avoidance, and ergonomic comfort.
Overview
| Property | Value |
|---|---|
| Domain | Augmented Reality / Spatial Computing |
| Action Space | {action_type, position, scale, anchor_surface_id, orientation, layer} |
| Observation | Room + users + placed objects + pending object |
| Reward Range | [0.0, 1.0], shaped at every step |
| Tasks | 3 tasks (Easy → Hard), 3 cases each |
| API | FastAPI on port 7860 |
RL Problem
The agent receives a description of a physical room (surfaces, furniture, users) and must decide where and how to place virtual AR objects. It is scored on five weighted criteria:
| Criterion | Description |
|---|---|
| spatial_validity | Object inside room, not overlapping furniture |
| gaze_alignment | Angle between user gaze and object direction |
| occlusion | How much existing objects block the new placement |
| comfort_zone | Object within user's ergonomic comfort cone |
| priority_respect | High-priority objects placed in the sweet spot |
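To make the gaze_alignment criterion concrete, here is a minimal sketch of one way such a score could be computed. The function name, the linear falloff, and the 90-degree cutoff are assumptions for illustration, not the formula used in env/reward.py. It uses only the standard library so it runs without dependencies, whereas the repository's geometry helpers are described as NumPy-based.

```python
import math

EPSILON = 0.001  # tolerance quoted in Reward Details


def gaze_alignment_score(user_pos, gaze_dir, obj_pos, max_angle_deg=90.0):
    """Hypothetical score: 1.0 when the object is dead ahead of the user,
    falling linearly to 0.0 at max_angle_deg away from the gaze axis."""
    # Vector from the user to the candidate object position.
    to_obj = [o - u for o, u in zip(obj_pos, user_pos)]
    dist = math.sqrt(sum(c * c for c in to_obj))
    gaze_norm = math.sqrt(sum(c * c for c in gaze_dir))
    if dist < EPSILON or gaze_norm < EPSILON:
        return 0.0  # degenerate geometry: no meaningful angle

    # Cosine of the angle between gaze and user-to-object direction,
    # clamped to guard against floating-point drift outside [-1, 1].
    cos_angle = sum(a * b for a, b in zip(to_obj, gaze_dir)) / (dist * gaze_norm)
    cos_angle = max(-1.0, min(1.0, cos_angle))
    angle_deg = math.degrees(math.acos(cos_angle))
    return max(0.0, 1.0 - angle_deg / max_angle_deg)
```

Under these assumptions, an object directly along the gaze vector scores 1.0, and anything at or beyond 90 degrees off-axis (including positions behind the user) scores 0.0.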
Project Structure
ar-scene-graph-env/
│
├── Dockerfile
├── openenv.yaml
├── requirements.txt
├── README.md
├── inference.py
│
├── env/
│   ├── __init__.py
│   ├── models.py        # Pydantic v2 data models
│   ├── tasks.py         # Hardcoded task / case data (9 cases total)
│   ├── environment.py   # ARSceneGraphEnv class
│   ├── scene_graph.py   # Pure-NumPy 3D geometry helpers
│   ├── reward.py        # Shaped reward computation
│   └── graders.py       # Episode-level graders (1 per task)
│
├── api/
│   ├── __init__.py
│   └── server.py        # FastAPI app
│
└── tests/
    ├── test_env.py
    └── test_graders.py
Tasks
Task 1: Easy (task1_easy)
- Room: 4 × 3 × 3 m, empty
- 1 user, 1 notification bubble to place
- Budget: 5 steps
- Reward threshold: 0.6
Task 2: Medium (task2_medium)
- Room: 6 × 4 × 4 m, 3 furniture items
- 1 user, 4 mixed AR objects (virtual screen, notification, arrow, widget)
- Budget: 15 steps
- Reward threshold: 0.5
Task 3: Hard (task3_hard)
- Room: 8 × 4 × 6 m, 8 furniture items
- 2 users with different positions and gaze vectors
- 6 objects (some shared between users, some user-specific)
- Budget: 25 steps
- Reward threshold: 0.4
API
Start the server
uvicorn api.server:app --host 0.0.0.0 --port 7860
POST /reset
{
"task_id": "task1_easy",
"case_index": 0,
"seed": 42
}
All fields optional. Returns ResetResult with initial Observation.
POST /step
{
"action_type": "place",
"position": [2.0, 1.5, 2.8],
"scale": 1.0,
"anchor_surface_id": "wall_n",
"orientation": [0.0, 0.0, 0.0],
"layer": 0
}
Returns StepResult with reward (∈ [0, 1]), done, and per-component info.
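The two POST endpoints above can be exercised from Python as well as from curl. The following is a minimal client sketch using only the standard library. The helper names (make_place_action, post_json, run_single_placement) are hypothetical, and the response is assumed to be a JSON object; the payload fields and port are taken from the examples above.

```python
import json
import urllib.request

SERVER_URL = "http://localhost:7860"  # port from the Overview table


def make_place_action(position, anchor_surface_id, scale=1.0,
                      orientation=(0.0, 0.0, 0.0), layer=0):
    """Build a /step payload with the fields shown in the example above."""
    return {
        "action_type": "place",
        "position": list(position),
        "scale": scale,
        "anchor_surface_id": anchor_surface_id,
        "orientation": list(orientation),
        "layer": layer,
    }


def post_json(path, payload):
    """POST a JSON body to the server and decode the JSON response."""
    request = urllib.request.Request(
        SERVER_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def run_single_placement():
    """Reset task1_easy, place one object, and return the step response."""
    post_json("/reset", {"task_id": "task1_easy", "case_index": 0, "seed": 42})
    return post_json("/step", make_place_action([2.0, 1.5, 2.8], "wall_n"))
```

With the server running, calling run_single_placement() should return the decoded StepResult; nothing is sent over the network until that function is called.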
GET /state
Returns the full StateResult; works even before reset.
GET /health
{"status": "ok", "version": "1.0.0"}
Inference
Set environment variables and run:
export HF_TOKEN=<your-api-key>
export MODEL_NAME=gpt-4o-mini
export SERVER_URL=http://localhost:7860
python inference.py
Log format (exact):
[START] task=task1_easy_case0 env=ar-scene-graph-env model=gpt-4o-mini
[STEP] step=1 action=place reward=0.82 done=True error=None
[END] success=True steps=1 score=0.82 rewards=[0.82]
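Because the log format is stated to be exact, it can help to centralize the formatting in small helpers. This is a sketch inferred from the three sample lines above, not the code in inference.py: the function names are hypothetical, and both the two-decimal rounding and the ", " separator for multi-element reward lists are assumptions, since the sample shows only a single reward.

```python
def format_start(task, env, model):
    """Format the opening line of an episode log."""
    return f"[START] task={task} env={env} model={model}"


def format_step(step, action, reward, done, error=None):
    """Format one step line; reward is rendered with two decimals (assumed)."""
    return (f"[STEP] step={step} action={action} "
            f"reward={reward:.2f} done={done} error={error}")


def format_end(success, steps, score, rewards):
    """Format the closing line; the list separator is an assumption."""
    rewards_str = "[" + ", ".join(f"{r:.2f}" for r in rewards) + "]"
    return (f"[END] success={success} steps={steps} "
            f"score={score:.2f} rewards={rewards_str}")
```

Called with the values from the sample episode, these helpers reproduce the three lines shown above character for character.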
Docker
# Build
docker build -t ar-scene-graph-env .
# Run
docker run -p 7860:7860 ar-scene-graph-env
Testing
pip install -r requirements.txt
pytest tests/ -v
Tech Stack
- Python 3.10+
- FastAPI 0.110 + Uvicorn 0.29
- Pydantic v2: full type annotations, model validators
- NumPy: all 3D geometry (no scipy / trimesh / etc.)
- OpenAI SDK: LLM inference client
- pytest + httpx: testing
Reward Details
reward = Σ(weight_i × score_i) - penalties
reward = clamp(round(reward, 4), 0.0, 1.0)
Penalties:
skip action             -0.20
scale out of [0.5, 2]   -0.10
overlaps placed obj     -0.15
placed behind user      -0.25
All geometry is pure NumPy. EPSILON = 0.001 for all float comparisons.
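The formula and penalty table can be sketched in a few lines. The weights below are placeholders (equal weights of 0.2), since this README does not list the actual per-criterion values, and the penalty keys are hypothetical names for the four rows of the table. The penalty amounts, the rounding to four decimals, and the clamp to [0.0, 1.0] follow the text above.

```python
CRITERIA = (
    "spatial_validity",
    "gaze_alignment",
    "occlusion",
    "comfort_zone",
    "priority_respect",
)

# Placeholder weights: assumed equal, NOT the values used in env/reward.py.
WEIGHTS = {name: 0.2 for name in CRITERIA}

# Penalty amounts from the table above; the key names are hypothetical.
PENALTIES = {
    "skip": 0.20,
    "scale_out_of_range": 0.10,
    "overlaps_placed": 0.15,
    "behind_user": 0.25,
}


def shaped_reward(scores, triggered=()):
    """Weighted sum of criterion scores minus penalties, rounded and clamped."""
    raw = sum(WEIGHTS[name] * scores[name] for name in CRITERIA)
    raw -= sum(PENALTIES[name] for name in triggered)
    return min(1.0, max(0.0, round(raw, 4)))
```

For instance, with every criterion at 1.0 and the object placed behind the user, this sketch yields 1.0 - 0.25 = 0.75; a skip with all-zero scores clamps to 0.0 rather than going negative.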
Design Notes
- state() never crashes: it returns safe defaults before reset() is called.
- step() never raises: all exceptions are caught and a safe fallback is returned.
- reset() is fully reproducible: the same (task_id, case_index, seed) always produces identical observations.
- After done=True, all subsequent step() calls return reward=0.0.
- Scale is auto-corrected to [0.5, 2.0]; layer to [0, 10].
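Three of these guarantees (range correction, no exceptions from step(), zero reward after done) can be illustrated with a small wrapper. This is a sketch under stated assumptions, not the ARSceneGraphEnv implementation: GuardedEnv, sanitize_action, and the (reward, done) return shape are hypothetical, and the default scale and layer are taken from the /step example.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def sanitize_action(action):
    """Return a copy with scale and layer forced into the documented ranges."""
    fixed = dict(action)
    fixed["scale"] = clamp(float(action.get("scale", 1.0)), 0.5, 2.0)
    fixed["layer"] = clamp(int(action.get("layer", 0)), 0, 10)
    return fixed


class GuardedEnv:
    """Hypothetical wrapper around a step function: callable(action) -> (reward, done)."""

    def __init__(self, step_fn):
        self._step_fn = step_fn
        self._done = False

    def step(self, action):
        if self._done:
            return 0.0, True  # episode finished: reward stays at 0.0
        try:
            reward, done = self._step_fn(sanitize_action(action))
        except Exception:
            # Safe fallback (assumed shape): no reward, episode left open.
            return 0.0, False
        self._done = bool(done)
        return reward, self._done
```

The wrapper keeps the safety logic in one place, so the underlying step function can focus on geometry and scoring.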