# Deployment Guide (Max / Person C)

## Local Development
```bash
# Create and activate virtualenv
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install server deps
pip install -r server/requirements.txt

# Install replicalab package
pip install -e . --no-deps

# Run the server
uvicorn server.app:app --host 0.0.0.0 --port 7860 --reload
```
The server should now be available at http://localhost:7860.
Quick smoke test:

```bash
curl http://localhost:7860/health

curl -X POST http://localhost:7860/reset \
  -H "Content-Type: application/json" \
  -d '{"seed": 42, "scenario": "math_reasoning", "difficulty": "easy"}'
```
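The same smoke test can be scripted. A minimal sketch using only the Python standard library (the helper names are illustrative, not part of the repo), assuming the server started above is listening on port 7860:

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"

def reset_payload(seed=42, scenario="math_reasoning", difficulty="easy"):
    """Build the JSON body for POST /reset, mirroring the curl example."""
    return {"seed": seed, "scenario": scenario, "difficulty": difficulty}

def post_json(url, payload, timeout=30):
    """POST a JSON payload and return the parsed JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Requires the server to be running locally.
    print(post_json(f"{BASE_URL}/reset", reset_payload()))
```

Run it after starting uvicorn; it prints the `/reset` response body (session and episode IDs plus the first observation).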
## Docker (Local)

```bash
docker build -f server/Dockerfile -t replicalab .
docker run -p 7860:7860 replicalab
```
### Verified endpoints (API 08 sign-off, 2026-03-08)

After `docker run -p 7860:7860 replicalab`, the following were verified against the real env (not the stub):
```bash
curl http://localhost:7860/health
# → {"status":"ok","env":"real"}

curl http://localhost:7860/scenarios
# → {"scenarios":[{"family":"math_reasoning",...}, ...]}

curl -X POST http://localhost:7860/reset \
  -H "Content-Type: application/json" \
  -d '{"seed":42,"scenario":"math_reasoning","difficulty":"easy"}'
# → {"session_id":"...","episode_id":"...","observation":{...}}

# Use session_id from the reset response:
curl -X POST http://localhost:7860/step \
  -H "Content-Type: application/json" \
  -d '{"session_id":"<SESSION_ID>","action":{"action_type":"propose_protocol","sample_size":3,"controls":["baseline"],"technique":"algebraic_proof","duration_days":1,"required_equipment":[],"required_reagents":[],"questions":[],"rationale":"Test."}}'
# → {"observation":{...},"reward":0.0,"done":false,"info":{...}}
```
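For driving `/step` from code, the action body in the curl example above can be built with a small helper. A sketch (the function name and defaults are illustrative; the fields mirror the curl payload exactly):

```python
def propose_protocol_action(session_id: str, rationale: str = "Test.") -> dict:
    """Build the /step request body for a propose_protocol action.

    Field values mirror the verified curl example; adjust per scenario.
    """
    return {
        "session_id": session_id,
        "action": {
            "action_type": "propose_protocol",
            "sample_size": 3,
            "controls": ["baseline"],
            "technique": "algebraic_proof",
            "duration_days": 1,
            "required_equipment": [],
            "required_reagents": [],
            "questions": [],
            "rationale": rationale,
        },
    }
```

POST the returned dict to `/step` with the `session_id` obtained from `/reset`.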
With optional hosted-model secrets:

```bash
docker run -p 7860:7860 \
  -e MODEL_API_KEY=replace-me \
  replicalab
```
## Hugging Face Spaces Deployment
### What is already configured (API 09)

The repo is now deployment-ready for HF Spaces:

- Root `Dockerfile`: HF Spaces requires the Dockerfile at the repo root. The root-level `Dockerfile` is identical to `server/Dockerfile`; keep them in sync, or delete `server/Dockerfile` once the team standardizes.
- `README.md` frontmatter: the root README now contains the required YAML frontmatter that HF Spaces parses on push:

  ```yaml
  title: ReplicaLab
  emoji: 🧪
  colorFrom: blue
  colorTo: green
  sdk: docker
  app_port: 7860
  pinned: false
  ```

- Non-root user: the Dockerfile creates and runs as `appuser` (UID 1000), which HF Spaces requires for security.
- Port 7860: both the `EXPOSE` directive and the `uvicorn` CMD use port 7860, matching the `app_port` in the frontmatter.
### Step-by-step deployment (for Max)
1. **Create the Space**
   - Go to https://huggingface.co/new-space
   - Fill in:
     - Owner: your HF username or the team org
     - Space name: `replicalab` (or `replicalab-demo`)
     - License: MIT
     - SDK: Docker
     - Hardware: CPU Basic (the free tier is fine for the server)
     - Visibility: Public
   - Click **Create Space**
2. **Add the Space as a git remote**

   ```bash
   # From the repo root
   git remote add hf https://huggingface.co/spaces/<YOUR_HF_USERNAME>/replicalab

   # If the org is different:
   # git remote add hf https://huggingface.co/spaces/<ORG>/replicalab
   ```
3. **Push the repo**

   ```bash
   # Push the current branch to the Space
   git push hf ayush:main

   # Or if deploying from master:
   # git push hf master:main
   ```

   HF Spaces will automatically detect the Dockerfile, build the image, and start the container.
4. **Monitor the build**
   - Go to https://huggingface.co/spaces/<YOUR_HF_USERNAME>/replicalab
   - Click the **Logs** tab (or the **Build** tab during the first deploy)
   - Wait for the build to complete (typically 2-5 minutes)
   - The Space status should change from "Building" to "Running"
5. **Verify the deployment (API 10 scope)**

   Once the Space is running:

   ```bash
   # Health check
   curl https://ayushozha-replicalab.hf.space/health

   # Reset an episode
   curl -X POST https://ayushozha-replicalab.hf.space/reset \
     -H "Content-Type: application/json" \
     -d '{"seed": 42, "scenario": "math_reasoning", "difficulty": "easy"}'

   # List scenarios
   curl https://ayushozha-replicalab.hf.space/scenarios
   ```

   WebSocket test (using websocat or wscat):

   ```bash
   wscat -c wss://ayushozha-replicalab.hf.space/ws
   # Then type: {"type": "ping"}
   # Expect:    {"type": "pong"}
   ```
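The wscat check can also be done from Python. A hedged sketch using the `websocket-client` package (the same package the troubleshooting section expects in `server/requirements.txt`; the helper names are illustrative):

```python
import json

WS_URL = "wss://ayushozha-replicalab.hf.space/ws"

def ping_message() -> str:
    """JSON ping frame the /ws endpoint expects."""
    return json.dumps({"type": "ping"})

def ws_ping(url: str = WS_URL) -> dict:
    """Open a WebSocket, send a ping, and return the parsed reply.

    Uses websocket-client (`pip install websocket-client`); imported lazily
    so the module still loads when the dependency is absent.
    """
    from websocket import create_connection
    ws = create_connection(url, timeout=30)
    try:
        ws.send(ping_message())
        return json.loads(ws.recv())
    finally:
        ws.close()

if __name__ == "__main__":
    # Expects a {"type": "pong"} reply from a running deployment.
    print(ws_ping())
```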
### Verified live deployment (API 10 sign-off, 2026-03-08)

- Public Space URL: https://huggingface.co/spaces/ayushozha/replicalab
- API base URL: https://ayushozha-replicalab.hf.space

All four endpoints verified against the live Space with the real env:

- `GET /health` → 200 `{"status":"ok","env":"real"}`
- `GET /scenarios` → 200 `{"scenarios":[...3 families...]}`
- `POST /reset` → 200 `{"session_id":"...","episode_id":"...","observation":{...}}`
- `POST /step` → 200 `{"reward":2.312798,"done":true,"info":{"verdict":"accept",...}}`

Full episode verified: reset → propose_protocol → accept → terminal reward with real judge scoring (rigor=0.465, feasibility=1.000, fidelity=0.325, total_reward=2.313, verdict=accept).
## Secrets and API Key Management (API 17)

### Current state
The server is fully self-contained with no external API calls. No secrets or API keys are required to run the environment, judge, or scoring pipeline. All reward computation is deterministic and local.
### Where secrets live (by context)

| Context | Location | What to set | Required? |
|---|---|---|---|
| HF Space | Space Settings → Repository secrets | Nothing currently | No |
| Local dev | Shell env vars or a `.env` file (gitignored) | Nothing currently | No |
| Docker | `-e KEY=value` flags on `docker run` | Nothing currently | No |
| Colab notebook | `google.colab.userdata` or env vars | `HF_TOKEN` for model downloads, `REPLICALAB_URL` for the hosted env | Yes, for training |
### Colab notebook secrets
When running the training notebook, the following are needed:
| Secret | Purpose | Where to set | Required? |
|---|---|---|---|
| `HF_TOKEN` | Download gated models (Qwen3-4B) from the HF Hub | Colab Secrets panel (key icon) | Yes |
| `REPLICALAB_URL` | URL of the hosted environment | Hardcode or Colab secret | Optional; defaults to https://ayushozha-replicalab.hf.space |
To set in Colab:

1. Click the key icon in the left sidebar
2. Add `HF_TOKEN` with your Hugging Face access token
3. Access it in code:

```python
from google.colab import userdata

hf_token = userdata.get("HF_TOKEN")
```
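For `REPLICALAB_URL`, a fallback sketch that works both inside and outside Colab (the helper is hypothetical, not notebook code from the repo; the default URL is the hosted deployment above):

```python
import os

DEFAULT_URL = "https://ayushozha-replicalab.hf.space"

def replicalab_url() -> str:
    """Resolve the environment URL: Colab secret, then env var, then default."""
    try:
        from google.colab import userdata  # only importable inside Colab
        url = userdata.get("REPLICALAB_URL")
        if url:
            return url
    except Exception:
        # Not running in Colab, or the secret is unset / not granted
        # to this notebook; fall through to the env var / default.
        pass
    return os.environ.get("REPLICALAB_URL", DEFAULT_URL)
```

Outside Colab the import fails, so the env var or the hosted default is used; inside Colab the Secrets panel wins when the secret is set.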
### Future secrets (not currently needed)
If a frontier hosted evaluator is added later:
| Secret name | Purpose | Required? |
|---|---|---|
| `MODEL_API_KEY` | Hosted evaluator access key | Only if a hosted evaluator is added |
| `MODEL_BASE_URL` | Alternate provider endpoint | Only if using a proxy |
These would be set in HF Space Settings → Repository secrets, and accessed via `os.environ.get("MODEL_API_KEY")` in server code.
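If those secrets are introduced, the server-side read could look like this sketch (hypothetical helper, not current server code; the `MODEL_BASE_URL` default is a placeholder):

```python
import os
from typing import Optional

def hosted_evaluator_config() -> Optional[dict]:
    """Return hosted-evaluator settings, or None when no key is configured.

    Returning None lets the server fall back to the local deterministic
    judge, keeping the no-secrets default behaviour described above.
    """
    api_key = os.environ.get("MODEL_API_KEY")
    if not api_key:
        return None
    return {
        "api_key": api_key,
        # Placeholder default; a real deployment would point at the provider.
        "base_url": os.environ.get("MODEL_BASE_URL", "https://api.example.com/v1"),
    }
```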
## Re-deploying after code changes

```bash
# Just push again; HF rebuilds automatically
git push hf ayush:main
```
To force a full rebuild (e.g. after dependency changes):

- Go to Space **Settings**
- Click **Factory reboot** under the **Danger zone** section
## Known limitations

- **Free CPU tier**: 2 vCPU and 16 GB RAM. This is sufficient for the FastAPI server but NOT for running RL training; training happens in Colab.
- **Cold starts**: free-tier Spaces sleep after 48 hours of inactivity. The first request after sleep takes 30-60 seconds while the Space wakes up.
- **Persistent storage**: episode replays and logs are in-memory only and reset when the container restarts. This is acceptable for the hackathon demo.
- **Heavy hosted models require billing-enabled hardware**: as of 2026-03-09, the checked HF token authenticates successfully, but the backing account reports `canPay=false` and has no org attached, so it is currently suitable for model downloads but not for provisioning paid large-model serving through HF Spaces hardware or Inference Endpoints.
## Environment URLs Reference

| Service | Local | Hosted |
|---|---|---|
| FastAPI app | http://localhost:7860 | https://ayushozha-replicalab.hf.space |
| Health | http://localhost:7860/health | https://ayushozha-replicalab.hf.space/health |
| WebSocket | ws://localhost:7860/ws | wss://ayushozha-replicalab.hf.space/ws |
| Scenarios | http://localhost:7860/scenarios | https://ayushozha-replicalab.hf.space/scenarios |
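Every row in the table derives from a single base URL, so client code only needs one setting. A small hypothetical helper (not part of the repo) that builds the full set from either base:

```python
def endpoint_urls(base: str) -> dict:
    """Derive app/health/scenarios/WebSocket URLs from an http(s) base URL."""
    # Swap the scheme for the WebSocket endpoint: https->wss, http->ws.
    ws_base = base.replace("https://", "wss://").replace("http://", "ws://")
    return {
        "app": base,
        "health": f"{base}/health",
        "scenarios": f"{base}/scenarios",
        "websocket": f"{ws_base}/ws",
    }
```

Passing `http://localhost:7860` or `https://ayushozha-replicalab.hf.space` reproduces the local and hosted columns respectively.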
## Northflank CLI Access

### Local verification (2026-03-08)

- Installed globally with `npm i -g @northflank/cli`
- Verified locally with `northflank --version`
- Current verified version: `0.10.16`
### Login

```bash
northflank login -n <context-name> -t <token>
```

`<token>` must come from the user's Northflank account or team secret manager. Do not commit it to the repo.
### Service access commands for replica-labs/replicalab-ai

```bash
northflank forward service --projectId replica-labs --serviceId replicalab-ai
northflank get service logs --tail --projectId replica-labs --serviceId replicalab-ai
northflank ssh service --projectId replica-labs --serviceId replicalab-ai
northflank exec service --projectId replica-labs --serviceId replicalab-ai
northflank upload service file --projectId replica-labs --serviceId replicalab-ai --localPath dir/file.txt --remotePath /home/file.txt
northflank download service file --projectId replica-labs --serviceId replicalab-ai --localPath dir/file.txt --remotePath /home/file.txt
```
### Current Northflank runtime findings (2026-03-09)

- The manual training job `replicalab-train` exists in `replica-labs`, but `northflank start job run --projectId replica-labs --jobId replicalab-train` currently fails with `409 No deployment configured`.
- The job still has runtime variables configured, including the older remote `MODEL_NAME=Qwen/Qwen3-8B`, so even after the missing deployment is fixed the runtime config should be reviewed before launching training.
- The live service `replicalab-ai` is deployed on the same `nf-gpu-hack-16-64` billing plan, but a direct probe from inside the container found no `nvidia-smi` binary and no `/dev/nvidia*` device nodes. Treat GPU/H100 availability as unverified until a container can prove hardware visibility from inside the runtime.
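The in-container probe described above can be repeated with a short script (illustrative, not a repo file; run it inside the container, e.g. via `northflank exec service ...`):

```shell
# probe_gpu: report GPU visibility from inside the container runtime.
probe_gpu() {
  # Device nodes appear when the GPU is passed through to the container.
  if ls /dev/nvidia* >/dev/null 2>&1; then
    echo "device nodes: yes"
  else
    echo "device nodes: no"
  fi
  # The driver tooling confirms the GPU model when present.
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name --format=csv,noheader
  else
    echo "nvidia-smi: not found"
  fi
}
probe_gpu
```

Both checks reporting "no"/"not found" reproduces the unverified-GPU finding above; a passing probe prints the GPU name (e.g. the H100 confirmed on the notebook service).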
### Current Northflank notebook findings (2026-03-09)

- There is a separate live notebook service in project `notebook-openport`: `jupyter-pytorch`.
- The active public notebook DNS is `app--jupyter-pytorch--9y6g97v7czb9.code.run` on port `8888` (`/lab` for the Jupyter UI).
- Northflank reports that service with GPU config `gpuType=h100-80`, `gpuCount=1`, and an in-container probe confirmed `NVIDIA H100 80GB HBM3`.
- The notebook image is `quay.io/jupyter/pytorch-notebook:cuda12-2025-08-18`.
- The notebook currently contains a repo clone and GRPO outputs, but the saved notebook/log state is not clean: training produced adapter checkpoints through step 200, then a later notebook evaluation/inference run failed with a `string indices must be integers, not 'str'` content-format error.
### Windows note

Global npm binaries resolve from `C:\Users\ayush\AppData\Roaming\npm` on this machine. If `northflank` is not found in a new shell, reopen the terminal so the updated `PATH` is reloaded.
## Hand-off To Ayush

Local server:

- WebSocket: `ws://localhost:7860/ws`
- REST health: `http://localhost:7860/health`
- Running against: real env (not stub)
Hosted deployment (verified 2026-03-08):

- Base URL: https://ayushozha-replicalab.hf.space
- `/health` returns `200` with `{"status":"ok","env":"real"}`
- WebSocket path: `wss://ayushozha-replicalab.hf.space/ws`
- Full episode tested: propose → accept → reward with real judge scores
## Troubleshooting

| Issue | Fix |
|---|---|
| `ReplicaLabEnv not found` warning at startup | The real env is now available; ensure `replicalab/scoring/rubric.py` is present and `httpx` + `websocket-client` are in `server/requirements.txt` |
| Docker build fails | Re-check `server/requirements.txt` and the Docker build context |
| CORS error from the frontend | Re-check allowed origins in `server/app.py` |
| WebSocket closes after idle time | Send periodic ping messages or reconnect |
| Session not found (REST) | Call `/reset` again to create a new session |