# Deployment Guide (Max / Person C)

## Local Development
```bash
# Create and activate virtualenv
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install server deps
pip install -r server/requirements.txt

# Install replicalab package
pip install -e . --no-deps

# Run the server
uvicorn server.app:app --host 0.0.0.0 --port 7860 --reload
```
The server should now be available at http://localhost:7860.
Quick smoke test:

```bash
curl http://localhost:7860/health

curl -X POST http://localhost:7860/reset \
  -H "Content-Type: application/json" \
  -d '{"seed": 42, "scenario": "math_reasoning", "difficulty": "easy"}'
```
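The same smoke test can be scripted. A minimal sketch using only the Python standard library (the helper names are illustrative, not part of the repo), assuming the server started above is listening on port 7860:

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"

def reset_payload(seed=42, scenario="math_reasoning", difficulty="easy"):
    """Build the JSON body for POST /reset, mirroring the curl example."""
    return {"seed": seed, "scenario": scenario, "difficulty": difficulty}

def post_json(url, payload, timeout=30):
    """POST a JSON payload and return the parsed JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Requires the server to be running locally.
    print(post_json(f"{BASE_URL}/reset", reset_payload()))
```

Run it after starting uvicorn; it prints the `/reset` response body (session and episode IDs plus the first observation).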
## Docker (Local)

```bash
docker build -f server/Dockerfile -t replicalab .
docker run -p 7860:7860 replicalab
```
### Verified endpoints (API 08 sign-off, 2026-03-08)

After `docker run -p 7860:7860 replicalab`, the following were verified against the real env (not the stub):
```bash
curl http://localhost:7860/health
# → {"status":"ok","env":"real"}

curl http://localhost:7860/scenarios
# → {"scenarios":[{"family":"math_reasoning",...}, ...]}

curl -X POST http://localhost:7860/reset \
  -H "Content-Type: application/json" \
  -d '{"seed":42,"scenario":"math_reasoning","difficulty":"easy"}'
# → {"session_id":"...","episode_id":"...","observation":{...}}

# Use session_id from the reset response:
curl -X POST http://localhost:7860/step \
  -H "Content-Type: application/json" \
  -d '{"session_id":"<SESSION_ID>","action":{"action_type":"propose_protocol","sample_size":3,"controls":["baseline"],"technique":"algebraic_proof","duration_days":1,"required_equipment":[],"required_reagents":[],"questions":[],"rationale":"Test."}}'
# → {"observation":{...},"reward":0.0,"done":false,"info":{...}}
```
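For driving `/step` from code, the action body in the curl example above can be built with a small helper. A sketch (the function name and defaults are illustrative; the fields mirror the curl payload exactly):

```python
def propose_protocol_action(session_id: str, rationale: str = "Test.") -> dict:
    """Build the /step request body for a propose_protocol action.

    Field values mirror the verified curl example; adjust per scenario.
    """
    return {
        "session_id": session_id,
        "action": {
            "action_type": "propose_protocol",
            "sample_size": 3,
            "controls": ["baseline"],
            "technique": "algebraic_proof",
            "duration_days": 1,
            "required_equipment": [],
            "required_reagents": [],
            "questions": [],
            "rationale": rationale,
        },
    }
```

POST the returned dict to `/step` with the `session_id` obtained from `/reset`.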
With optional hosted-model secrets:

```bash
docker run -p 7860:7860 \
  -e MODEL_API_KEY=replace-me \
  replicalab
```
## Hugging Face Spaces Deployment
### What is already configured (API 09)

The repo is now deployment-ready for HF Spaces:

- Root `Dockerfile`: HF Spaces requires the Dockerfile at the repo root. The root-level `Dockerfile` is identical to `server/Dockerfile`; keep them in sync, or delete `server/Dockerfile` once the team standardizes.
- `README.md` frontmatter: the root README now contains the required YAML frontmatter that HF Spaces parses on push:

  ```yaml
  title: ReplicaLab
  emoji: 🧪
  colorFrom: blue
  colorTo: green
  sdk: docker
  app_port: 7860
  pinned: false
  ```

- Non-root user: the Dockerfile creates and runs as `appuser` (UID 1000), which HF Spaces requires for security.
- Port 7860: both the `EXPOSE` directive and the `uvicorn` CMD use port 7860, matching the `app_port` in the frontmatter.
### Step-by-step deployment (for Max)
1. **Create the Space**
   - Go to https://huggingface.co/new-space
   - Fill in:
     - Owner: your HF username or the team org
     - Space name: `replicalab` (or `replicalab-demo`)
     - License: MIT
     - SDK: Docker
     - Hardware: CPU Basic (the free tier is fine for the server)
     - Visibility: Public
   - Click **Create Space**
2. **Add the Space as a git remote**

   ```bash
   # From the repo root
   git remote add hf https://huggingface.co/spaces/<YOUR_HF_USERNAME>/replicalab

   # If the org is different:
   # git remote add hf https://huggingface.co/spaces/<ORG>/replicalab
   ```
3. **Push the repo**

   ```bash
   # Push the current branch to the Space
   git push hf ayush:main

   # Or if deploying from master:
   # git push hf master:main
   ```

   HF Spaces will automatically detect the Dockerfile, build the image, and start the container.
4. **Monitor the build**
   - Go to https://huggingface.co/spaces/<YOUR_HF_USERNAME>/replicalab
   - Click the **Logs** tab (or the **Build** tab during the first deploy)
   - Wait for the build to complete (typically 2-5 minutes)
   - The Space status should change from "Building" to "Running"
5. **Verify the deployment (API 10 scope)**

   Once the Space is running:

   ```bash
   # Health check
   curl https://ayushozha-replicalab.hf.space/health

   # Reset an episode
   curl -X POST https://ayushozha-replicalab.hf.space/reset \
     -H "Content-Type: application/json" \
     -d '{"seed": 42, "scenario": "math_reasoning", "difficulty": "easy"}'

   # List scenarios
   curl https://ayushozha-replicalab.hf.space/scenarios
   ```

   WebSocket test (using websocat or wscat):

   ```bash
   wscat -c wss://ayushozha-replicalab.hf.space/ws
   # Then type: {"type": "ping"}
   # Expect:    {"type": "pong"}
   ```
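The wscat check can also be done from Python. A hedged sketch using the `websocket-client` package (the same package the troubleshooting section expects in `server/requirements.txt`; the helper names are illustrative):

```python
import json

WS_URL = "wss://ayushozha-replicalab.hf.space/ws"

def ping_message() -> str:
    """JSON ping frame the /ws endpoint expects."""
    return json.dumps({"type": "ping"})

def ws_ping(url: str = WS_URL) -> dict:
    """Open a WebSocket, send a ping, and return the parsed reply.

    Uses websocket-client (`pip install websocket-client`); imported lazily
    so the module still loads when the dependency is absent.
    """
    from websocket import create_connection
    ws = create_connection(url, timeout=30)
    try:
        ws.send(ping_message())
        return json.loads(ws.recv())
    finally:
        ws.close()

if __name__ == "__main__":
    # Expects a {"type": "pong"} reply from a running deployment.
    print(ws_ping())
```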
### Verified live deployment (API 10 sign-off, 2026-03-08)

- Public Space URL: https://huggingface.co/spaces/ayushozha/replicalab
- API base URL: https://ayushozha-replicalab.hf.space

All four endpoints verified against the live Space with the real env:

- `GET /health` → 200 `{"status":"ok","env":"real"}`
- `GET /scenarios` → 200 `{"scenarios":[...3 families...]}`
- `POST /reset` → 200 `{"session_id":"...","episode_id":"...","observation":{...}}`
- `POST /step` → 200 `{"reward":2.312798,"done":true,"info":{"verdict":"accept",...}}`

Full episode verified: reset → propose_protocol → accept → terminal reward with real judge scoring (rigor=0.465, feasibility=1.000, fidelity=0.325, total_reward=2.313, verdict=accept).
## Secrets and API Key Management (API 17)

### Current state
The server is fully self-contained with no external API calls. No secrets or API keys are required to run the environment, judge, or scoring pipeline. All reward computation is deterministic and local.
### Where secrets live (by context)

| Context | Location | What to set | Required? |
|---|---|---|---|
| HF Space | Space Settings → Repository secrets | Nothing currently | No |
| Local dev | Shell env vars or a `.env` file (gitignored) | Nothing currently | No |
| Docker | `-e KEY=value` flags on `docker run` | Nothing currently | No |
| Colab notebook | `google.colab.userdata` or env vars | `HF_TOKEN` for model downloads, `REPLICALAB_URL` for the hosted env | Yes, for training |
### Colab notebook secrets
When running the training notebook, the following are needed:
| Secret | Purpose | Where to set | Required? |
|---|---|---|---|
| `HF_TOKEN` | Download gated models (Qwen3-4B) from the HF Hub | Colab Secrets panel (key icon) | Yes |
| `REPLICALAB_URL` | URL of the hosted environment | Hardcode or Colab secret | Optional; defaults to https://ayushozha-replicalab.hf.space |
To set in Colab:

1. Click the key icon in the left sidebar
2. Add `HF_TOKEN` with your Hugging Face access token
3. Access it in code:

```python
from google.colab import userdata

hf_token = userdata.get("HF_TOKEN")
```
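For `REPLICALAB_URL`, a fallback sketch that works both inside and outside Colab (the helper is hypothetical, not notebook code from the repo; the default URL is the hosted deployment above):

```python
import os

DEFAULT_URL = "https://ayushozha-replicalab.hf.space"

def replicalab_url() -> str:
    """Resolve the environment URL: Colab secret, then env var, then default."""
    try:
        from google.colab import userdata  # only importable inside Colab
        url = userdata.get("REPLICALAB_URL")
        if url:
            return url
    except Exception:
        # Not running in Colab, or the secret is unset / not granted
        # to this notebook; fall through to the env var / default.
        pass
    return os.environ.get("REPLICALAB_URL", DEFAULT_URL)
```

Outside Colab the import fails, so the env var or the hosted default is used; inside Colab the Secrets panel wins when the secret is set.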
### Future secrets (not currently needed)
If a frontier hosted evaluator is added later:
| Secret name | Purpose | Required? |
|---|---|---|
| `MODEL_API_KEY` | Hosted evaluator access key | Only if a hosted evaluator is added |
| `MODEL_BASE_URL` | Alternate provider endpoint | Only if using a proxy |
These would be set in HF Space Settings → Repository secrets, and accessed via `os.environ.get("MODEL_API_KEY")` in server code.
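If those secrets are introduced, the server-side read could look like this sketch (hypothetical helper, not current server code; the `MODEL_BASE_URL` default is a placeholder):

```python
import os
from typing import Optional

def hosted_evaluator_config() -> Optional[dict]:
    """Return hosted-evaluator settings, or None when no key is configured.

    Returning None lets the server fall back to the local deterministic
    judge, keeping the no-secrets default behaviour described above.
    """
    api_key = os.environ.get("MODEL_API_KEY")
    if not api_key:
        return None
    return {
        "api_key": api_key,
        # Placeholder default; a real deployment would point at the provider.
        "base_url": os.environ.get("MODEL_BASE_URL", "https://api.example.com/v1"),
    }
```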
## Re-deploying after code changes

```bash
# Just push again; HF rebuilds automatically
git push hf ayush:main
```
To force a full rebuild (e.g. after dependency changes):

- Go to Space **Settings**
- Click **Factory reboot** under the **Danger zone** section
## Known limitations

- **Free CPU tier**: 2 vCPU and 16 GB RAM. This is sufficient for the FastAPI server but NOT for running RL training; training happens in Colab.
- **Cold starts**: free-tier Spaces sleep after 48 hours of inactivity. The first request after sleep takes 30-60 seconds while the Space wakes up.
- **Persistent storage**: episode replays and logs are in-memory only and reset when the container restarts. This is acceptable for the hackathon demo.
- **Heavy hosted models require billing-enabled hardware**: as of 2026-03-09, the checked HF token authenticates successfully, but the backing account reports `canPay=false` and has no org attached, so it is currently suitable for model downloads but not for provisioning paid large-model serving through HF Spaces hardware or Inference Endpoints.
## Environment URLs Reference

| Service | Local | Hosted |
|---|---|---|
| FastAPI app | http://localhost:7860 | https://ayushozha-replicalab.hf.space |
| Health | http://localhost:7860/health | https://ayushozha-replicalab.hf.space/health |
| WebSocket | ws://localhost:7860/ws | wss://ayushozha-replicalab.hf.space/ws |
| Scenarios | http://localhost:7860/scenarios | https://ayushozha-replicalab.hf.space/scenarios |
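Every row in the table derives from a single base URL, so client code only needs one setting. A small hypothetical helper (not part of the repo) that builds the full set from either base:

```python
def endpoint_urls(base: str) -> dict:
    """Derive app/health/scenarios/WebSocket URLs from an http(s) base URL."""
    # Swap the scheme for the WebSocket endpoint: https->wss, http->ws.
    ws_base = base.replace("https://", "wss://").replace("http://", "ws://")
    return {
        "app": base,
        "health": f"{base}/health",
        "scenarios": f"{base}/scenarios",
        "websocket": f"{ws_base}/ws",
    }
```

Passing `http://localhost:7860` or `https://ayushozha-replicalab.hf.space` reproduces the local and hosted columns respectively.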
## Northflank CLI Access

### Local verification (2026-03-08)

- Installed globally with `npm i -g @northflank/cli`
- Verified locally with `northflank --version`
- Current verified version: `0.10.16`
### Login

```bash
northflank login -n <context-name> -t <token>
```

`<token>` must come from the user's Northflank account or team secret manager. Do not commit it to the repo.
### Service access commands for replica-labs/replicalab-ai

```bash
northflank forward service --projectId replica-labs --serviceId replicalab-ai
northflank get service logs --tail --projectId replica-labs --serviceId replicalab-ai
northflank ssh service --projectId replica-labs --serviceId replicalab-ai
northflank exec service --projectId replica-labs --serviceId replicalab-ai
northflank upload service file --projectId replica-labs --serviceId replicalab-ai --localPath dir/file.txt --remotePath /home/file.txt
northflank download service file --projectId replica-labs --serviceId replicalab-ai --localPath dir/file.txt --remotePath /home/file.txt
```
### Current Northflank runtime findings (2026-03-09)

- The manual training job `replicalab-train` exists in `replica-labs`, but `northflank start job run --projectId replica-labs --jobId replicalab-train` currently fails with `409 No deployment configured`.
- The job still has runtime variables configured, including the older remote `MODEL_NAME=Qwen/Qwen3-8B`, so even after the missing deployment is fixed the runtime config should be reviewed before launching training.
- The live service `replicalab-ai` is deployed on the same `nf-gpu-hack-16-64` billing plan, but a direct probe from inside the container found no `nvidia-smi` binary and no `/dev/nvidia*` device nodes. Treat GPU/H100 availability as unverified until a container can prove hardware visibility from inside the runtime.
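The in-container probe described above can be repeated with a short script (illustrative, not a repo file; run it inside the container, e.g. via `northflank exec service ...`):

```shell
# probe_gpu: report GPU visibility from inside the container runtime.
probe_gpu() {
  # Device nodes appear when the GPU is passed through to the container.
  if ls /dev/nvidia* >/dev/null 2>&1; then
    echo "device nodes: yes"
  else
    echo "device nodes: no"
  fi
  # The driver tooling confirms the GPU model when present.
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name --format=csv,noheader
  else
    echo "nvidia-smi: not found"
  fi
}
probe_gpu
```

Both checks reporting "no"/"not found" reproduces the unverified-GPU finding above; a passing probe prints the GPU name (e.g. the H100 confirmed on the notebook service).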
### Current Northflank notebook findings (2026-03-09)

- There is a separate live notebook service in project `notebook-openport`: `jupyter-pytorch`.
- The active public notebook DNS is `app--jupyter-pytorch--9y6g97v7czb9.code.run` on port `8888` (`/lab` for the Jupyter UI).
- Northflank reports that service with GPU config `gpuType=h100-80`, `gpuCount=1`, and an in-container probe confirmed `NVIDIA H100 80GB HBM3`.
- The notebook image is `quay.io/jupyter/pytorch-notebook:cuda12-2025-08-18`.
- The notebook currently contains a repo clone and GRPO outputs, but the saved notebook/log state is not clean: training produced adapter checkpoints through step 200, then a later notebook evaluation/inference run failed with a `string indices must be integers, not 'str'` content-format error.
### Windows note

Global npm binaries resolve from `C:\Users\ayush\AppData\Roaming\npm` on this machine. If `northflank` is not found in a new shell, reopen the terminal so the updated `PATH` is reloaded.
## Hand-off To Ayush

Local server:

- WebSocket: `ws://localhost:7860/ws`
- REST health: `http://localhost:7860/health`
- Running against: real env (not stub)
Hosted deployment (verified 2026-03-08):

- Base URL: https://ayushozha-replicalab.hf.space
- `/health` returns `200` with `{"status":"ok","env":"real"}`
- WebSocket path: `wss://ayushozha-replicalab.hf.space/ws`
- Full episode tested: propose → accept → reward with real judge scores
## Troubleshooting

| Issue | Fix |
|---|---|
| `ReplicaLabEnv not found` warning at startup | The real env is now available; ensure `replicalab/scoring/rubric.py` is present and `httpx` + `websocket-client` are in `server/requirements.txt` |
| Docker build fails | Re-check `server/requirements.txt` and the Docker build context |
| CORS error from the frontend | Re-check allowed origins in `server/app.py` |
| WebSocket closes after idle time | Send periodic ping messages or reconnect |
| Session not found (REST) | Call `/reset` again to create a new session |