OpenEnv Challenge — Deliverables & Status
Competition
OpenEnv Challenge: SOTA Environments to drive general intelligence
Sponsors: PyTorch team at Meta, HuggingFace, Unsloth
Prizes:
- $10K in HuggingFace credits
- Invitation to publish on PyTorch.org blog
Judging Criteria
Evaluated primarily on the submission blog. Judging panel grades on:
- Creative and robust use of OpenEnv
- Technical excellence
- Storytelling
- Open-source demo
- Green Agent wrapper for the environment
Required Deliverables
1. HuggingFace Space
Environment on the HF Hub. Judges interact with the action space (DESCRIBE, SAMPLE, QUERY, ANSWER) against real Spider databases.
Live at: https://huggingface.co/spaces/hjerpe/sql_env
Docker image: registry.hf.space/hjerpe-sql_env:latest
Published via uv run openenv push on 2026-03-29 (see specs/F007-DEMO.md).
Status: Live. Endpoints /health, /docs, /web, /reset, /step, /ws
exposed by the FastAPI server in envs/sql_env/server/. Python client:
SQLEnv(base_url="https://hjerpe-sql-env.hf.space").
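A quick way to smoke-test the endpoints listed above is to build their URLs from the Space ID and probe /health. The sketch below is illustrative only: `space_host`, `endpoint_urls`, and `check_health` are hypothetical helpers that are not part of the repo, and the owner/name-to-host mapping is an assumption inferred from the two URLs given in this section. Nothing is assumed about the body that /health returns, only its HTTP status.

```python
import re
import urllib.request

# Endpoint paths taken from the status line above.
ENDPOINTS = ("/health", "/docs", "/web", "/reset", "/step", "/ws")


def space_host(space_id: str) -> str:
    """Map 'owner/name' to a direct Space host name (assumed convention)."""
    owner, name = space_id.split("/", 1)
    # Lowercase and collapse every non-alphanumeric run into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", f"{owner}-{name}".lower()).strip("-")
    return f"{slug}.hf.space"


def endpoint_urls(space_id: str) -> dict[str, str]:
    """Return a full URL per endpoint; /ws gets the wss:// scheme."""
    host = space_host(space_id)
    return {
        path: f"{'wss' if path == '/ws' else 'https'}://{host}{path}"
        for path in ENDPOINTS
    }


def check_health(space_id: str, timeout: float = 10.0) -> bool:
    """Return True if GET /health answers with HTTP 200 (needs network)."""
    url = endpoint_urls(space_id)["/health"]
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False
```

Calling `check_health("hjerpe/sql_env")` from a machine with network access should indicate whether the Space is awake; a sleeping Space may need a retry after it has restarted.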
2. Training notebooks/scripts (GitHub)
Colab-ready notebooks:
- notebooks/train_grpo.ipynb: Full SFT + GRPO pipeline, Colab L4, ~7h
- notebooks/compare_methods.ipynb: Base vs GRPO evaluation (zero-shot, 1-shot, 3-shot, GRPO v1, v2)
- notebooks/showcase_sqlenv.ipynb: Interactive environment demo with Random and Oracle baselines
Status: Complete
3. Blog post (HuggingFace)
Analyst exploration framing, reward architecture with theory, training results (0% to ~30%), failure analysis, lessons learned.
Draft: docs/blog-post-v1.md
Status: Draft v1 complete, not yet published
Additional Deliverables
4. GitHub repo
Clean codebase: zero ruff errors, typed Pydantic models, 280 passing tests, architecture docs, training artifacts.
Status: Complete (F016 quality sweep done)
5. Trained checkpoints (HuggingFace Hub)
- hjerpe/sqlenv-qwen3-0.6b-grpo (v1)
- hjerpe/sqlenv-qwen3-0.6b-grpo-v2 (v2)
Status: Uploaded
6. Green Agent wrapper
OpenEnv evaluation wrapper pattern. A Policy protocol with
evaluate(env, policy, n_episodes, seed) that reports success rate,
average reward, and average steps. Includes RandomPolicy and
OraclePolicy baselines for standardized comparison.
Implementation: evaluation/policies.py, evaluation/oracle_policy.py
Tests: tests/test_evaluation.py (17 tests, all passing)
Used by: notebooks/showcase_sqlenv.ipynb, notebooks/compare_methods.ipynb
Status: Complete
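The wrapper pattern described above can be sketched as follows. This is not the code in evaluation/policies.py: the `act` method name, the `EvalResult` dataclass, and the toy `GuessEnv` are assumptions made for illustration, and "success" is defined here simply as an episode with positive total reward. Only the `evaluate(env, policy, n_episodes, seed)` shape and the three reported metrics come from the description.

```python
import random
from dataclasses import dataclass
from typing import Any, Protocol


class Policy(Protocol):
    """Anything with an act(observation) method counts as a policy."""

    def act(self, observation: Any) -> Any: ...


@dataclass
class EvalResult:
    success_rate: float
    avg_reward: float
    avg_steps: float


class GuessEnv:
    """Toy stand-in environment (hypothetical): guess a hidden digit."""

    def __init__(self, max_steps: int = 5) -> None:
        self.max_steps = max_steps
        self.target = 0
        self.steps = 0

    def reset(self, seed: int) -> dict:
        self.target = random.Random(seed).randrange(10)
        self.steps = 0
        # Exposing the gold answer lets the oracle baseline read it.
        return {"gold": self.target}

    def step(self, action: int) -> tuple[dict, float, bool]:
        self.steps += 1
        correct = action == self.target
        done = correct or self.steps >= self.max_steps
        return {"gold": self.target}, (1.0 if correct else 0.0), done


class RandomPolicy:
    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)

    def act(self, observation: Any) -> int:
        return self._rng.randrange(10)


class OraclePolicy:
    def act(self, observation: Any) -> int:
        return observation["gold"]


def evaluate(env: GuessEnv, policy: Policy, n_episodes: int, seed: int = 0) -> EvalResult:
    """Run n_episodes and report success rate, average reward, average steps."""
    if n_episodes <= 0:
        raise ValueError("n_episodes must be positive")
    successes, total_reward, total_steps = 0, 0.0, 0
    for episode in range(n_episodes):
        obs = env.reset(seed=seed + episode)  # one distinct seed per episode
        done, episode_reward = False, 0.0
        while not done:
            obs, reward, done = env.step(policy.act(obs))
            episode_reward += reward
            total_steps += 1
        successes += int(episode_reward > 0)  # assumed success criterion
        total_reward += episode_reward
    return EvalResult(
        successes / n_episodes, total_reward / n_episodes, total_steps / n_episodes
    )
```

Because every episode is seeded from `seed + episode`, two policies evaluated with the same arguments see the same sequence of episodes, which is what makes the Random and Oracle numbers comparable.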
7. TRL environment_factory adapter
HuggingFace TRL's native OpenEnv integration: pass a class with
reset() + named tool methods as environment_factory= and GRPOTrainer
runs the multi-turn tool-calling loop automatically (no custom
rollout_func).
Implementation: training/trl_adapter.py — class SQLEnvTRL exposing
describe(), sample(), query(), answer() as tool methods plus
sql_env_reward_func. Used by notebooks/train_grpo.ipynb (cell 16:
environment_factory=SQLEnvTRL).
Note: the adapter instantiates a local in-process SQLEnvironment,
not a WebSocket client to the hosted HF Space. This is intentional:
training needs N parallel sessions (one per generation), and a local
environment is faster and avoids the Space's default 1-session
concurrency limit.
Status: Complete
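To make the "class with reset() + named tool methods" shape concrete, here is a minimal toy version backed by an in-memory SQLite database. It is not training/trl_adapter.py: `ToySQLEnv`, its single hard-coded question, the exact-match scoring, and `toy_reward_func` are all hypothetical, and the sketch does not import TRL or make any claim about the signatures GRPOTrainer expects. It only mirrors the four tool names from the description and, like the real adapter, runs in-process.

```python
import sqlite3


class ToySQLEnv:
    """Hypothetical stand-in for SQLEnvTRL: reset() plus four tool methods."""

    def __init__(self) -> None:
        self.conn = None
        self.gold = ""
        self.reward = 0.0
        self.done = False

    def reset(self, **kwargs) -> str:
        """Start a fresh episode and return the question text."""
        if self.conn is not None:
            self.conn.close()
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript(
            "CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);"
            "INSERT INTO singer VALUES (1, 'Ann', 31), (2, 'Bo', 45), (3, 'Cy', 27);"
        )
        self.gold, self.reward, self.done = "3", 0.0, False
        return "How many singers are there?"

    def _tables(self) -> list[str]:
        rows = self.conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
        return [r[0] for r in rows]

    def describe(self, table: str) -> str:
        """Return the column names and types of a table."""
        if table not in self._tables():
            return f"Error: unknown table {table!r}"
        info = self.conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        return ", ".join(f"{col[1]} {col[2]}" for col in info)

    def sample(self, table: str, n: int = 3) -> str:
        """Return up to n example rows from a table."""
        if table not in self._tables():
            return f"Error: unknown table {table!r}"
        rows = self.conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (n,)).fetchall()
        return "\n".join(str(r) for r in rows)

    def query(self, sql: str) -> str:
        """Run a read-only SELECT and return at most 10 rows."""
        if not sql.lstrip().lower().startswith("select"):
            return "Error: only SELECT statements are allowed"
        try:
            rows = self.conn.execute(sql).fetchmany(10)
        except sqlite3.Error as exc:
            return f"Error: {exc}"
        return "\n".join(str(r) for r in rows)

    def answer(self, value: str) -> str:
        """Submit a final answer; this ends the episode and sets the reward."""
        self.done = True
        self.reward = 1.0 if str(value).strip() == self.gold else 0.0
        return "Episode finished."


def toy_reward_func(environments: list) -> list[float]:
    """Collect one reward per finished environment (hypothetical helper)."""
    return [env.reward for env in environments]
```

Each generation gets its own `ToySQLEnv()` instance, so N parallel rollouts never share a connection. That independence is the property the note above relies on when it prefers local instances over the hosted Space.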
Our Position
No other interactive SQL exploration environment exists. SQL Repair (WALKMAN303) is a single-turn fix-it task. Calendar Gym (Turing) is real-world but not SQL. Ours is the only multi-turn strategy-discovery environment for database exploration.
Key narrative: "The environment is the product." The trained agent demonstrates that the environment works, but the contribution is the action space, reward architecture, and episode structure.
Open Items
- Done: deploy HuggingFace Space (live at https://huggingface.co/spaces/hjerpe/sql_env since 2026-03-29)
- Publish blog post on HuggingFace (planned 2026-04-12)
- Final review of blog-post-v1.md
- Verify notebooks run clean on fresh Colab
- Post-launch: enable SUPPORTS_CONCURRENT_SESSIONS=True + max_concurrent_envs=64 on the Space for external users who want to retrain against the hosted endpoint
Resources
- OpenEnv tutorial: https://colab.research.google.com/github/meta-pytorch/OpenEnv/blob/main/examples/OpenEnv_Tutorial.ipynb
- OpenEnv GitHub: https://github.com/meta-pytorch/OpenEnv
- OpenEnv docs: https://meta-pytorch.org/OpenEnv/
- Environment hub: https://huggingface.co/openenv
- Discord: https://discord.com/invite/YsTYBh6PD9