feat: Add local CUDA support, MCP server, Spaces GPU selection, and stacking roadmap
- Remove ZeroGPU dependency; optimize for local CUDA (4090/3090/3070 Ti)
- Add MCP server (mcp_server.py) with sharp_predict and list_outputs tools
- Add hardware_config.py for Spaces GPU selection with persistence
- Add Settings tab in Gradio UI for hardware configuration
- Support all HuggingFace Spaces GPUs (ZeroGPU through A100)
- Enable Gradio API by default (show_api=True)
- Add comprehensive WARP.md with codebase map and documentation
- Complete multi-image stacking roadmap with implementation phases
New files:
- WARP.md: Project guidance for WARP/AI assistants
- mcp_server.py: MCP server for programmatic access
- hardware_config.py: GPU hardware selection module
Environment:
- SHARP_PORT (default: 49200) for Gradio
- SHARP_MCP_PORT (default: 49201) for MCP
- CUDA_VISIBLE_DEVICES for multi-GPU selection
Files changed:
- .gitignore +1 -0
- WARP.md +344 -0
- app.py +145 -3
- hardware_config.py +252 -0
- mcp_server.py +224 -0
- model_utils.py +71 -20
- pyproject.toml +2 -1
- requirements.txt +2 -1

--- a/.gitignore
+++ b/.gitignore
@@ -217,3 +217,4 @@ __marimo__/
 
 # Kilo Code
 .kilocode/
+.hardware_config.json

--- /dev/null
+++ b/WARP.md
@@ -0,0 +1,344 @@
# WARP.md

This file provides guidance to WARP (warp.dev) when working with code in this repository.

## Project Overview

SHARP (Single-image 3D Gaussian scene prediction) Gradio demo. Wraps Apple's SHARP model to predict 3D Gaussian scenes from single images, export `.ply` files, and optionally render camera trajectory videos.

Optimized for local CUDA (4090/3090/3070 Ti) or HuggingFace Spaces GPU. Includes an MCP server for programmatic access.

## Development Commands

```bash
# Install dependencies (uses uv package manager)
uv sync

# Run the Gradio app (port 49200 by default)
uv run python app.py

# Run MCP server (stdio transport)
uv run python mcp_server.py

# Lint with ruff
uv run ruff check .
uv run ruff format .
```

## Codebase Map

```
ml-sharp/
├── app.py                        # Gradio UI (tabs: Run, Examples, About, Settings)
│   ├── build_demo()              # Main UI builder
│   ├── run_sharp()               # Inference entrypoint called by UI
│   └── discover_examples()       # Load precompiled examples
├── model_utils.py                # Core inference + rendering
│   ├── ModelWrapper              # Checkpoint loading, predictor caching
│   │   ├── predict_to_ply()      # Image → Gaussians → PLY
│   │   └── render_video()        # Gaussians → MP4 trajectory
│   ├── PredictionOutputs         # Dataclass for inference results
│   ├── configure_gpu_mode()      # Switch between local/Spaces GPU
│   └── predict_and_maybe_render_gpu  # Module-level entrypoint
├── hardware_config.py            # GPU hardware selection & persistence
│   ├── HardwareConfig            # Dataclass with mode, hardware, duration
│   ├── get_hardware_choices()    # Dropdown options
│   └── SPACES_HARDWARE_SPECS     # HF Spaces GPU specs & pricing
├── mcp_server.py                 # MCP server for programmatic access
│   ├── sharp_predict             # Tool: image → PLY + video
│   ├── list_outputs              # Tool: list generated files
│   └── sharp://info              # Resource: GPU status, config
├── assets/examples/              # Precompiled example outputs
├── outputs/                      # Runtime outputs (PLY, MP4)
├── .hardware_config.json         # Persisted hardware settings
├── pyproject.toml                # Dependencies (uv)
└── WARP.md                       # This file
```

### Data Flow

```
Image → load_rgb() → predict_image() → Gaussians3D → save_ply() → PLY
                                            ↓
                                      render_video() → MP4
```

## Architecture

### Core Files

- `app.py` — Gradio UI with tabs for Run/Examples/About/Settings. Handles example discovery from `assets/examples/` via manifest.json or filename conventions.
- `model_utils.py` — SHARP model wrapper with checkpoint loading (HF Hub → CDN fallback), inference via `predict_to_ply()`, and CUDA video rendering via `render_video()`.
- `hardware_config.py` — GPU hardware selection between local CUDA and HuggingFace Spaces. Persists to `.hardware_config.json`.
- `mcp_server.py` — MCP server exposing the `sharp_predict` tool and the `sharp://info` resource.

### Key Patterns

**Local CUDA mode**: The model is kept on the GPU by default (`SHARP_KEEP_MODEL_ON_DEVICE=1`) for better performance on dedicated GPUs.

**Spaces GPU mode**: Uses the `@spaces.GPU` decorator for dynamic GPU allocation on HuggingFace Spaces. Configurable via the Settings tab.

**Checkpoint resolution order**:
1. `SHARP_CHECKPOINT_PATH` env var
2. HF Hub cache
3. HF Hub download
4. Upstream CDN via `torch.hub`
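
As an illustration, that fallback chain could look roughly like the sketch below (function name and the CDN argument are illustrative; the real logic lives in `model_utils.py`):

```python
import os
from pathlib import Path

import torch
from huggingface_hub import hf_hub_download

REPO_ID = os.getenv("SHARP_HF_REPO_ID", "apple/Sharp")
FILENAME = os.getenv("SHARP_HF_FILENAME", "sharp_2572gikvuh.pt")


def load_checkpoint(cdn_url: str) -> dict:
    """Resolve the SHARP checkpoint using the documented fallback order."""
    # 1. Explicit override wins.
    override = os.getenv("SHARP_CHECKPOINT_PATH")
    if override and Path(override).exists():
        return torch.load(override, map_location="cpu")
    # 2 + 3. hf_hub_download() serves from the local HF cache when present,
    # otherwise downloads from the Hub.
    try:
        path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)
        return torch.load(path, map_location="cpu")
    except Exception:
        # 4. Last resort: upstream CDN via torch.hub (also cached locally).
        return torch.hub.load_state_dict_from_url(cdn_url, map_location="cpu")
```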

**Video rendering**: Requires CUDA (gsplat). Falls back gracefully on CPU-only systems by returning `None` for the video path.

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `SHARP_PORT` | `49200` | Gradio server port |
| `SHARP_MCP_PORT` | `49201` | MCP server port |
| `SHARP_CHECKPOINT_PATH` | — | Override local checkpoint path |
| `SHARP_HF_REPO_ID` | `apple/Sharp` | HuggingFace repo |
| `SHARP_HF_FILENAME` | `sharp_2572gikvuh.pt` | Checkpoint filename |
| `SHARP_KEEP_MODEL_ON_DEVICE` | `1` | Keep model on GPU (set `0` to free VRAM) |
| `CUDA_VISIBLE_DEVICES` | — | GPU selection (e.g., `0` or `0,1`) |

## Gradio API

The API is enabled by default. Access it at `http://localhost:49200/?view=api`.

### Endpoint: `/api/run_sharp`

```python
import requests

response = requests.post(
    "http://localhost:49200/api/run_sharp",
    json={
        "data": [
            "/path/to/image.jpg",  # image_path
            "rotate_forward",      # trajectory_type
            0,     # output_long_side (0 = match input)
            60,    # num_frames
            30,    # fps
            True,  # render_video
        ]
    },
)
result = response.json()["data"]
video_path, ply_path, status = result
```

## MCP Server

Run the MCP server for integration with AI agents:

```bash
uv run python mcp_server.py
```

### MCP Config (for clients like Warp)

```json
{
  "mcpServers": {
    "sharp": {
      "command": "uv",
      "args": ["run", "python", "mcp_server.py"],
      "cwd": "/home/robin/CascadeProjects/ml-sharp"
    }
  }
}
```

### Tools

- `sharp_predict(image_path, render_video=True, trajectory_type="rotate_forward", ...)` — Run inference
- `list_outputs()` — List generated PLY/MP4 files

### Resources

- `sharp://info` — GPU status, configuration
- `sharp://help` — Usage documentation

## Multi-GPU Configuration

Select a GPU via environment variable:

```bash
# Use GPU 0 (e.g., 4090)
CUDA_VISIBLE_DEVICES=0 uv run python app.py

# Use GPU 1 (e.g., 3090)
CUDA_VISIBLE_DEVICES=1 uv run python app.py
```
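
To check which index maps to which card before launching, a PyTorch one-liner is enough:

```bash
uv run python -c "import torch; print([torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())])"
```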

## HuggingFace Spaces GPU

The app supports HuggingFace Spaces paid GPUs for faster inference or larger models. Configure via the **Settings** tab.

### Available Hardware

| Hardware | VRAM | Price/hr | Best For |
|----------|------|----------|----------|
| ZeroGPU (H200) | 70GB | Free (PRO) | Demos, dynamic allocation |
| T4 small | 16GB | $0.40 | Light workloads |
| T4 medium | 16GB | $0.60 | Standard workloads |
| L4x1 | 24GB | $0.80 | Standard inference |
| L4x4 | 96GB | $3.80 | Multi-GPU |
| L40Sx1 | 48GB | $1.80 | Large models |
| L40Sx4 | 192GB | $8.30 | Very large models |
| A10G small | 24GB | $1.00 | Balanced |
| A10G large | 24GB | $1.50 | More CPU/RAM |
| A100 large | 80GB | $2.50 | Maximum VRAM |

### Deploying to Spaces

1. Push to a HuggingFace Space
2. Set hardware in the Space settings (or use `suggested_hardware` in README.md)
3. The app auto-detects the Spaces environment via the `SPACE_ID` env var

### README.md Metadata for Spaces

```yaml
---
title: SHARP - 3D Gaussian Scene Prediction
emoji: 🔪
colorFrom: purple
colorTo: indigo
sdk: gradio
sdk_version: 6.2.0
python_version: 3.13.11
app_file: app.py
suggested_hardware: l4x1  # or zero-gpu, a100-large, etc.
startup_duration_timeout: 1h
preload_from_hub:
  - apple/Sharp sharp_2572gikvuh.pt
---
```

## Examples System

Place precompiled outputs in `assets/examples/`:
- `<name>.{jpg,png,webp}` + `<name>.mp4` + `<name>.ply`
- Or define `assets/examples/manifest.json` with `{label, image, video, ply}` entries (a sample follows below)
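
For the manifest route, a hypothetical entry with those keys could look like this (the top-level array shape is an assumption; `discover_examples()` in `app.py` is the authoritative reader):

```json
[
  {
    "label": "Living room",
    "image": "living_room.jpg",
    "video": "living_room.mp4",
    "ply": "living_room.ply"
  }
]
```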

## Multi-Image Stacking Roadmap

SHARP predicts 3D Gaussians from a single image. To "stack" multiple images into a unified scene:

### Required Components

1. **Pose Estimation** (`multi_view.py`)
   - Estimate relative camera poses between images
   - Options: COLMAP, hloc, or PnP-based
   - Transform each prediction to a common world frame

2. **Gaussian Merging** (`gaussian_merge.py`)
   - Concatenate Gaussian parameters (means, covariances, colors, opacities)
   - Deduplicate overlapping regions via density-based filtering
   - Optional: fine-tune the merged scene with a photometric loss

3. **UI Changes**
   - Multi-upload widget
   - Alignment preview/validation
   - Progress indicator for multi-image processing

### Data Structures

```python
@dataclass
class AlignedGaussians:
    gaussians: Gaussians3D
    world_transform: torch.Tensor  # 4x4 SE(3)
    source_image: Path


def merge_gaussians(aligned: list[AlignedGaussians]) -> Gaussians3D:
    # 1. Transform each Gaussian's means by world_transform
    # 2. Concatenate all parameters
    # 3. Density-based pruning in overlapping regions
    ...
```
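
Step 3 is the subtle part; a minimal sketch of density-based pruning over Gaussian means with a KD-tree (assumes SciPy, which is not a dependency yet):

```python
import numpy as np
from scipy.spatial import cKDTree


def prune_overlaps(means: np.ndarray, radius: float = 0.01) -> np.ndarray:
    """Greedy dedup over (N, 3) means: keep a point, drop later points within radius."""
    tree = cKDTree(means)
    keep = np.ones(len(means), dtype=bool)
    for i in range(len(means)):
        if not keep[i]:
            continue  # already suppressed by an earlier kept point
        for j in tree.query_ball_point(means[i], r=radius):
            if j > i:
                keep[j] = False
    return keep
```

`merge_gaussians` would then index every concatenated parameter array with the returned mask.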

### Dependencies to Add

- `pycolmap` or `hloc` for pose estimation
- `open3d` for point cloud operations (optional)

### Implementation Phases

#### Phase 1: Basic Multi-Image Pipeline
- [ ] Add `multi_view.py` with `estimate_relative_pose(img1, img2)` using feature matching (see the sketch after this list)
- [ ] Add `gaussian_merge.py` with naive concatenation (no dedup)
- [ ] UI: multi-file upload in a new "Stack" tab
- [ ] Export merged PLY
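
A hedged sketch of that `estimate_relative_pose` helper via essential-matrix recovery in OpenCV (assumes `opencv-python` and a known camera intrinsics matrix `K`, neither of which the repo ships today):

```python
import cv2
import numpy as np


def estimate_relative_pose(img1_path: str, img2_path: str, K: np.ndarray):
    """Estimate (R, t) of camera 2 relative to camera 1; t is only known up to scale."""
    img1 = cv2.imread(img1_path, cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread(img2_path, cv2.IMREAD_GRAYSCALE)

    # SIFT features + Lowe's ratio test over k-NN matches
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]

    pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

    E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
    return R, t
```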

#### Phase 2: Pose Estimation Options
- [ ] Integrate COLMAP sparse reconstruction for >2 images
- [ ] Add hloc (Hierarchical Localization) as a lightweight alternative
- [ ] Fallback: manual pose input for known camera rigs

#### Phase 3: Gaussian Deduplication
- [ ] Implement KD-tree based nearest-neighbor pruning
- [ ] Merge overlapping Gaussians by averaging parameters
- [ ] Add confidence weighting based on view angle

#### Phase 4: Refinement (Optional)
- [ ] Photometric loss optimization on the merged scene
- [ ] Iterative alignment refinement
- [ ] Support for depth priors from stereo/MVS

### API Design

```python
# multi_view.py
def estimate_poses(
    images: list[Path],
    method: Literal["colmap", "hloc", "pnp"] = "hloc",
) -> list[np.ndarray]:  # list of 4x4 world-to-camera transforms
    ...


# gaussian_merge.py
def merge_scenes(
    predictions: list[PredictionOutputs],
    poses: list[np.ndarray],
    deduplicate: bool = True,
    dedup_radius: float = 0.01,  # meters
) -> Gaussians3D:
    ...


# app.py (Stack tab)
def run_stack(
    images: list[str],  # Gradio multi-file upload
    pose_method: str,
    deduplicate: bool,
) -> tuple[str | None, str | None, str]:  # video, ply, status
    ...
```

### MCP Extension

```python
# mcp_server.py additions
@mcp.tool()
def sharp_stack(
    image_paths: list[str],
    pose_method: str = "hloc",
    deduplicate: bool = True,
    render_video: bool = True,
) -> dict:
    """Stack multiple images into a unified 3D Gaussian scene."""
    ...
```

### Technical Considerations

**Coordinate Systems**:
- SHARP outputs Gaussians in camera-centric coordinates
- They must be transformed to the world frame using the estimated poses (see the sketch below)
- Convention: Y-up, -Z forward (OpenGL style)
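
A minimal sketch of that transform with torch, assuming means are stored as an `(N, 3)` tensor:

```python
import torch


def transform_means(means: torch.Tensor, world_transform: torch.Tensor) -> torch.Tensor:
    """Apply a 4x4 SE(3) world_transform to (N, 3) Gaussian means: x' = R x + t."""
    R = world_transform[:3, :3]  # rotation
    t = world_transform[:3, 3]   # translation
    return means @ R.T + t

# Orientations/covariances must be rotated by R as well:
# cov' = R @ cov @ R.T per Gaussian; isotropic scales are unchanged.
```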

**Memory Management**:
- Each SHARP prediction takes roughly 50-200MB of GPU memory
- Batch processing with model unload between predictions
- Consider a streaming merge for >10 images

**Quality Metrics**:
- Reprojection error for pose validation
- Gaussian density histogram for coverage analysis
- Visual comparison with ground truth (if available)

--- a/app.py
+++ b/app.py
@@ -29,7 +29,22 @@ from typing import Final
 
 import gradio as gr
 
-
+import os
+
+from model_utils import (
+    TrajectoryType,
+    predict_and_maybe_render_gpu,
+    configure_gpu_mode,
+    get_gpu_status,
+)
+from hardware_config import (
+    get_hardware_choices,
+    parse_hardware_choice,
+    get_config,
+    update_config,
+    SPACES_HARDWARE_SPECS,
+    is_running_on_spaces,
+)
 
 # -----------------------------------------------------------------------------
 # Paths & constants
@@ -42,6 +57,7 @@ EXAMPLES_DIR: Final[Path] = ASSETS_DIR / "examples"
 
 IMAGE_EXTS: Final[tuple[str, ...]] = (".png", ".jpg", ".jpeg", ".webp")
 DEFAULT_QUEUE_MAX_SIZE: Final[int] = 32
+DEFAULT_PORT: Final[int] = int(os.getenv("SHARP_PORT", "49200"))
 
 THEME: Final = gr.themes.Soft(
     primary_hue="indigo",
@@ -239,6 +255,68 @@ def _validate_image(image_path: str | None) -> None:
         raise gr.Error("Upload an image first.")
 
 
+# -----------------------------------------------------------------------------
+# Hardware Configuration
+# -----------------------------------------------------------------------------
+
+
+def _get_current_hardware_value() -> str:
+    """Get current hardware choice value for dropdown."""
+    config = get_config()
+    if config.mode == "local":
+        return "local"
+    return f"spaces:{config.spaces_hardware}"
+
+
+def _format_gpu_status() -> str:
+    """Format GPU status as markdown."""
+    status = get_gpu_status()
+    config = get_config()
+
+    lines = ["### Current Status"]
+    lines.append(f"- **Mode:** {'Local CUDA' if config.mode == 'local' else 'HuggingFace Spaces'}")
+
+    if config.mode == "spaces":
+        hw_spec = SPACES_HARDWARE_SPECS.get(config.spaces_hardware, {})
+        lines.append(f"- **Spaces Hardware:** {hw_spec.get('name', config.spaces_hardware)}")
+        lines.append(f"- **VRAM:** {hw_spec.get('vram', 'N/A')}")
+        lines.append(f"- **Price:** {hw_spec.get('price', 'N/A')}")
+        lines.append(f"- **Duration:** {config.spaces_duration}s")
+    else:
+        lines.append(f"- **CUDA Available:** {'✅ Yes' if status['cuda_available'] else '❌ No'}")
+        lines.append(f"- **Spaces Module:** {'✅ Installed' if status['spaces_available'] else '❌ Not installed'}")
+
+    if status["devices"]:
+        lines.append("\n### Local GPUs")
+        for dev in status["devices"]:
+            lines.append(f"- **GPU {dev['index']}:** {dev['name']} ({dev['total_memory_gb']}GB)")
+
+    if is_running_on_spaces():
+        lines.append("\n⚠️ *Running on HuggingFace Spaces*")
+
+    return "\n".join(lines)
+
+
+def _apply_hardware_config(choice: str, duration: int) -> str:
+    """Apply hardware configuration and return status."""
+    mode, spaces_hw = parse_hardware_choice(choice)
+
+    # Update config
+    update_config(
+        mode=mode,
+        spaces_hardware=spaces_hw if spaces_hw else "zero-gpu",
+        spaces_duration=duration,
+    )
+
+    # Configure GPU mode in model_utils
+    configure_gpu_mode(
+        use_spaces=(mode == "spaces"),
+        duration=duration,
+    )
+
+    return _format_gpu_status()
+
+
 def run_sharp(
     image_path: str | None,
     trajectory_type: TrajectoryType,
@@ -354,7 +432,7 @@ def build_demo() -> gr.Blocks:
             )
 
             render_toggle = gr.Checkbox(
-                label="Render MP4 (CUDA
+                label="Render MP4 (requires CUDA)",
                 value=True,
             )
 
@@ -490,6 +568,65 @@ def build_demo() -> gr.Blocks:
                 """.strip()
             )
 
+        with gr.Tab("⚙️ Settings", id="settings"):
+            with gr.Column(elem_id="settings-panel"):
+                gr.Markdown("### GPU Hardware Selection")
+                gr.Markdown(
+                    "Select local CUDA or HuggingFace Spaces GPU for inference. "
+                    "Spaces GPUs require deploying to HuggingFace Spaces."
+                )
+
+                with gr.Row():
+                    with gr.Column(scale=3):
+                        hw_dropdown = gr.Dropdown(
+                            label="Hardware",
+                            choices=get_hardware_choices(),
+                            value=_get_current_hardware_value(),
+                            interactive=True,
+                        )
+
+                        duration_slider = gr.Slider(
+                            label="Spaces GPU Duration (seconds)",
+                            info="Max time for @spaces.GPU decorator (ZeroGPU only)",
+                            minimum=60,
+                            maximum=300,
+                            step=30,
+                            value=get_config().spaces_duration,
+                            interactive=True,
+                        )
+
+                        apply_btn = gr.Button("Apply & Save", variant="primary")
+
+                    with gr.Column(scale=2):
+                        hw_status = gr.Markdown(
+                            value=_format_gpu_status(),
+                            elem_id="hw-status",
+                        )
+
+                apply_btn.click(
+                    fn=_apply_hardware_config,
+                    inputs=[hw_dropdown, duration_slider],
+                    outputs=[hw_status],
+                )
+
+                gr.Markdown(
+                    """
+                    ---
+                    ### Spaces Hardware Reference
+
+                    | Hardware | VRAM | Price | Best For |
+                    |----------|------|-------|----------|
+                    | ZeroGPU (H200) | 70GB | Free (PRO) | Demos, dynamic allocation |
+                    | T4 small/medium | 16GB | $0.40-0.60/hr | Light workloads |
+                    | L4x1 | 24GB | $0.80/hr | Standard inference |
+                    | L40Sx1 | 48GB | $1.80/hr | Large models |
+                    | A10G large | 24GB | $1.50/hr | Balanced cost/performance |
+                    | A100 large | 80GB | $2.50/hr | Maximum VRAM |
+
+                    *Prices as of Dec 2024. See [HuggingFace Spaces GPU docs](https://huggingface.co/docs/hub/spaces-gpus).*
+                    """
+                )
+
     demo.queue(max_size=DEFAULT_QUEUE_MAX_SIZE, default_concurrency_limit=1)
     return demo
 
@@ -497,4 +634,9 @@ def build_demo() -> gr.Blocks:
 demo = build_demo()
 
 if __name__ == "__main__":
-    demo.launch(
+    demo.launch(
+        theme=THEME,
+        css=CSS,
+        server_port=DEFAULT_PORT,
+        show_api=True,
+    )

--- /dev/null
+++ b/hardware_config.py
@@ -0,0 +1,252 @@
"""Hardware configuration for local CUDA and HuggingFace Spaces GPU selection.

This module provides:
- Hardware mode selection (local CUDA vs Spaces GPU)
- Persistent configuration via JSON file
- HuggingFace Spaces GPU hardware options

Spaces GPU pricing (as of Dec 2024):
- ZeroGPU (H200): Free (PRO subscribers), dynamic allocation
- T4-small: $0.40/hr, 16GB VRAM
- T4-medium: $0.60/hr, 16GB VRAM
- L4x1: $0.80/hr, 24GB VRAM
- L4x4: $3.80/hr, 96GB VRAM
- L40Sx1: $1.80/hr, 48GB VRAM
- L40Sx4: $8.30/hr, 192GB VRAM
- A10G-small: $1.00/hr, 24GB VRAM
- A10G-large: $1.50/hr, 24GB VRAM
- A100-large: $2.50/hr, 80GB VRAM
"""

from __future__ import annotations

import json
import os
from dataclasses import dataclass, field
from pathlib import Path
from typing import Final, Literal

# Hardware mode: local CUDA or HuggingFace Spaces
HardwareMode = Literal["local", "spaces"]

# Spaces hardware flavors (from HF docs)
SpacesHardware = Literal[
    "zero-gpu",      # ZeroGPU (H200, dynamic, free for PRO)
    "t4-small",      # Nvidia T4 small
    "t4-medium",     # Nvidia T4 medium
    "l4x1",          # 1x Nvidia L4
    "l4x4",          # 4x Nvidia L4
    "l40s-x1",       # 1x Nvidia L40S
    "l40s-x4",       # 4x Nvidia L40S
    "a10g-small",    # Nvidia A10G small
    "a10g-large",    # Nvidia A10G large
    "a10g-largex2",  # 2x Nvidia A10G large
    "a10g-largex4",  # 4x Nvidia A10G large
    "a100-large",    # Nvidia A100 large (80GB)
]

# Hardware specs for display
SPACES_HARDWARE_SPECS: Final[dict[str, dict]] = {
    "zero-gpu": {
        "name": "ZeroGPU (H200)",
        "vram": "70GB",
        "price": "Free (PRO)",
        "description": "Dynamic allocation, best for demos",
    },
    "t4-small": {
        "name": "Nvidia T4 small",
        "vram": "16GB",
        "price": "$0.40/hr",
        "description": "4 vCPU, 15GB RAM",
    },
    "t4-medium": {
        "name": "Nvidia T4 medium",
        "vram": "16GB",
        "price": "$0.60/hr",
        "description": "8 vCPU, 30GB RAM",
    },
    "l4x1": {
        "name": "1x Nvidia L4",
        "vram": "24GB",
        "price": "$0.80/hr",
        "description": "8 vCPU, 30GB RAM",
    },
    "l4x4": {
        "name": "4x Nvidia L4",
        "vram": "96GB",
        "price": "$3.80/hr",
        "description": "48 vCPU, 186GB RAM",
    },
    "l40s-x1": {
        "name": "1x Nvidia L40S",
        "vram": "48GB",
        "price": "$1.80/hr",
        "description": "8 vCPU, 62GB RAM",
    },
    "l40s-x4": {
        "name": "4x Nvidia L40S",
        "vram": "192GB",
        "price": "$8.30/hr",
        "description": "48 vCPU, 382GB RAM",
    },
    "a10g-small": {
        "name": "Nvidia A10G small",
        "vram": "24GB",
        "price": "$1.00/hr",
        "description": "4 vCPU, 14GB RAM",
    },
    "a10g-large": {
        "name": "Nvidia A10G large",
        "vram": "24GB",
        "price": "$1.50/hr",
        "description": "12 vCPU, 46GB RAM",
    },
    "a10g-largex2": {
        "name": "2x Nvidia A10G large",
        "vram": "48GB",
        "price": "$3.00/hr",
        "description": "24 vCPU, 92GB RAM",
    },
    "a10g-largex4": {
        "name": "4x Nvidia A10G large",
        "vram": "96GB",
        "price": "$5.00/hr",
        "description": "48 vCPU, 184GB RAM",
    },
    "a100-large": {
        "name": "Nvidia A100 large",
        "vram": "80GB",
        "price": "$2.50/hr",
        "description": "12 vCPU, 142GB RAM, best for large models",
    },
}

CONFIG_FILE: Final[Path] = Path(__file__).resolve().parent / ".hardware_config.json"


@dataclass
class HardwareConfig:
    """Persistent hardware configuration."""

    mode: HardwareMode = "local"
    spaces_hardware: SpacesHardware = "zero-gpu"
    spaces_duration: int = 180  # seconds for @spaces.GPU decorator
    local_device: str = "auto"  # auto, cuda, cpu, mps
    keep_model_on_device: bool = True

    def to_dict(self) -> dict:
        return {
            "mode": self.mode,
            "spaces_hardware": self.spaces_hardware,
            "spaces_duration": self.spaces_duration,
            "local_device": self.local_device,
            "keep_model_on_device": self.keep_model_on_device,
        }

    @classmethod
    def from_dict(cls, data: dict) -> "HardwareConfig":
        return cls(
            mode=data.get("mode", "local"),
            spaces_hardware=data.get("spaces_hardware", "zero-gpu"),
            spaces_duration=data.get("spaces_duration", 180),
            local_device=data.get("local_device", "auto"),
            keep_model_on_device=data.get("keep_model_on_device", True),
        )

    def save(self, path: Path = CONFIG_FILE) -> None:
        """Save configuration to JSON file."""
        path.write_text(json.dumps(self.to_dict(), indent=2))

    @classmethod
    def load(cls, path: Path = CONFIG_FILE) -> "HardwareConfig":
        """Load configuration from JSON file, or return defaults."""
        if path.exists():
            try:
                data = json.loads(path.read_text())
                return cls.from_dict(data)
            except Exception:
                pass
        return cls()


def get_hardware_choices() -> list[tuple[str, str]]:
    """Get hardware choices for Gradio dropdown.

    Returns list of (display_name, value) tuples.
    """
    choices = [
        ("🖥️ Local CUDA (auto-detect)", "local"),
    ]

    for hw_id, spec in SPACES_HARDWARE_SPECS.items():
        label = f"☁️ {spec['name']} - {spec['vram']} VRAM ({spec['price']})"
        choices.append((label, f"spaces:{hw_id}"))

    return choices


def parse_hardware_choice(choice: str) -> tuple[HardwareMode, SpacesHardware | None]:
    """Parse hardware choice string into mode and hardware type."""
    if choice == "local":
        return "local", None
    elif choice.startswith("spaces:"):
        hw = choice.replace("spaces:", "")
        return "spaces", hw  # type: ignore
    else:
        return "local", None


def is_running_on_spaces() -> bool:
    """Check if we're running on HuggingFace Spaces."""
    return os.getenv("SPACE_ID") is not None


def get_spaces_module():
    """Import and return the spaces module if available."""
    try:
        import spaces
        return spaces
    except ImportError:
        return None


# Global config instance
_config: HardwareConfig | None = None


def get_config() -> HardwareConfig:
    """Get the global hardware configuration."""
    global _config
    if _config is None:
        _config = HardwareConfig.load()
    return _config


def update_config(
    mode: HardwareMode | None = None,
    spaces_hardware: SpacesHardware | None = None,
    spaces_duration: int | None = None,
    local_device: str | None = None,
    keep_model_on_device: bool | None = None,
    save: bool = True,
) -> HardwareConfig:
    """Update and optionally save the hardware configuration."""
    global _config
    config = get_config()

    if mode is not None:
        config.mode = mode
    if spaces_hardware is not None:
        config.spaces_hardware = spaces_hardware
    if spaces_duration is not None:
        config.spaces_duration = spaces_duration
    if local_device is not None:
        config.local_device = local_device
    if keep_model_on_device is not None:
        config.keep_model_on_device = keep_model_on_device

    if save:
        config.save()

    _config = config
    return config
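
Typical use of this module, grounded in the functions above:

```python
from hardware_config import get_config, get_hardware_choices, update_config

print(get_config().mode)          # "local" on a fresh checkout (defaults)
print(get_hardware_choices()[0])  # ("🖥️ Local CUDA (auto-detect)", "local")

# Switch to a Spaces L4 and persist the choice to .hardware_config.json
update_config(mode="spaces", spaces_hardware="l4x1", spaces_duration=120)
```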

--- /dev/null
+++ b/mcp_server.py
@@ -0,0 +1,224 @@
"""SHARP MCP Server for programmatic access to 3D Gaussian prediction.

Run standalone:
    uv run python mcp_server.py

Or integrate with MCP clients via stdio transport.
"""

from __future__ import annotations

import json
import os
from pathlib import Path
from typing import Literal

import torch
from mcp.server.fastmcp import FastMCP

from model_utils import (
    DEFAULT_OUTPUTS_DIR,
    ModelWrapper,
    TrajectoryType,
    get_global_model,
)

MCP_PORT: int = int(os.getenv("SHARP_MCP_PORT", "49201"))

mcp = FastMCP(
    "sharp",
    description="SHARP: Single-image 3D Gaussian scene prediction",
)

# -----------------------------------------------------------------------------
# Tools
# -----------------------------------------------------------------------------


@mcp.tool()
def sharp_predict(
    image_path: str,
    render_video: bool = True,
    trajectory_type: TrajectoryType = "rotate_forward",
    num_frames: int = 60,
    fps: int = 30,
    output_long_side: int | None = None,
) -> dict:
    """Predict 3D Gaussians from a single image.

    Args:
        image_path: Absolute path to input image (jpg/png/webp).
        render_video: Whether to render a camera trajectory video (requires CUDA).
        trajectory_type: Camera trajectory type (swipe/shake/rotate/rotate_forward).
        num_frames: Number of frames for video rendering.
        fps: Frames per second for video.
        output_long_side: Output resolution (longest side). None = match input.

    Returns:
        dict with keys:
        - ply_path: Path to exported PLY file
        - video_path: Path to rendered MP4 (or null if not rendered)
        - cuda_available: Whether CUDA was available
    """
    image_path_obj = Path(image_path)
    if not image_path_obj.exists():
        raise FileNotFoundError(f"Image not found: {image_path}")

    model = get_global_model()
    video_path, ply_path = model.predict_and_maybe_render(
        image_path_obj,
        trajectory_type=trajectory_type,
        num_frames=num_frames,
        fps=fps,
        output_long_side=output_long_side,
        render_video=render_video,
    )

    return {
        "ply_path": str(ply_path),
        "video_path": str(video_path) if video_path else None,
        "cuda_available": torch.cuda.is_available(),
    }


@mcp.tool()
def sharp_render(
    ply_path: str,
    trajectory_type: TrajectoryType = "rotate_forward",
    num_frames: int = 60,
    fps: int = 30,
    output_long_side: int | None = None,
) -> dict:
    """Render a video from an existing PLY file.

    Note: This requires re-predicting from the original image since Gaussians
    are not stored in standard PLY format. For now, returns an error.
    Future versions may support loading Gaussians from PLY.

    Args:
        ply_path: Path to PLY file (from previous prediction).
        trajectory_type: Camera trajectory type.
        num_frames: Number of frames.
        fps: Frames per second.
        output_long_side: Output resolution.

    Returns:
        dict with error message (feature not yet implemented).
    """
    return {
        "error": "Rendering from PLY not yet implemented. Use sharp_predict with render_video=True.",
        "hint": "PLY files store only point data, not the full Gaussian parameters needed for rendering.",
    }


@mcp.tool()
def list_outputs() -> dict:
    """List all generated output files (PLY and MP4).

    Returns:
        dict with keys:
        - outputs_dir: Path to outputs directory
        - ply_files: List of PLY file paths
        - video_files: List of MP4 file paths
    """
    outputs_dir = DEFAULT_OUTPUTS_DIR
    ply_files = sorted(outputs_dir.glob("*.ply"))
    video_files = sorted(outputs_dir.glob("*.mp4"))

    return {
        "outputs_dir": str(outputs_dir),
        "ply_files": [str(f) for f in ply_files],
        "video_files": [str(f) for f in video_files],
    }


# -----------------------------------------------------------------------------
# Resources
# -----------------------------------------------------------------------------


@mcp.resource("sharp://info")
def get_info() -> str:
    """Get SHARP server info including GPU status and configuration."""
    cuda_available = torch.cuda.is_available()
    gpu_info = []

    if cuda_available:
        for i in range(torch.cuda.device_count()):
            props = torch.cuda.get_device_properties(i)
            gpu_info.append({
                "index": i,
                "name": props.name,
                "total_memory_gb": round(props.total_memory / (1024**3), 2),
                "compute_capability": f"{props.major}.{props.minor}",
            })

    info = {
        "model": "SHARP (Apple ml-sharp)",
        "description": "Single-image 3D Gaussian scene prediction",
        "cuda_available": cuda_available,
        "cuda_device_count": torch.cuda.device_count() if cuda_available else 0,
        "gpus": gpu_info,
        "outputs_dir": str(DEFAULT_OUTPUTS_DIR),
        "checkpoint_sources": [
            "SHARP_CHECKPOINT_PATH env var",
            "HuggingFace Hub (apple/Sharp)",
            "Upstream CDN (torch.hub)",
        ],
        "env_vars": {
            "SHARP_CHECKPOINT_PATH": os.getenv("SHARP_CHECKPOINT_PATH", "(not set)"),
            "SHARP_KEEP_MODEL_ON_DEVICE": os.getenv("SHARP_KEEP_MODEL_ON_DEVICE", "1"),
            "CUDA_VISIBLE_DEVICES": os.getenv("CUDA_VISIBLE_DEVICES", "(not set)"),
        },
    }

    return json.dumps(info, indent=2)


@mcp.resource("sharp://help")
def get_help() -> str:
    """Get usage help for the SHARP MCP server."""
    help_text = """
# SHARP MCP Server

## Tools

### sharp_predict
Predict 3D Gaussians from a single image.

Parameters:
- image_path (required): Absolute path to input image
- render_video: Whether to render MP4 (default: true, requires CUDA)
- trajectory_type: swipe | shake | rotate | rotate_forward (default: rotate_forward)
- num_frames: Number of video frames (default: 60)
- fps: Video frame rate (default: 30)
- output_long_side: Output resolution, null = match input

### list_outputs
List all generated PLY and MP4 files.

## Resources

### sharp://info
Server info, GPU status, configuration.

### sharp://help
This help text.

## Environment Variables

- SHARP_MCP_PORT: MCP server port (default: 49201)
- SHARP_CHECKPOINT_PATH: Local checkpoint path override
- SHARP_KEEP_MODEL_ON_DEVICE: Keep model on GPU (default: 1)
- CUDA_VISIBLE_DEVICES: GPU selection (e.g., "0" or "0,1")
"""
    return help_text.strip()


# -----------------------------------------------------------------------------
# Main
# -----------------------------------------------------------------------------

if __name__ == "__main__":
    # Run as stdio transport for MCP clients
    mcp.run()
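
From the client side, the server can be exercised over stdio with the MCP Python SDK; a sketch (the client API here follows the SDK's documented quickstart — verify against your installed `mcp` version):

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    params = StdioServerParameters(command="uv", args=["run", "python", "mcp_server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "sharp_predict",
                {"image_path": "/path/to/image.jpg", "render_video": False},
            )
            print(result)


asyncio.run(main())
```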

--- a/model_utils.py
+++ b/model_utils.py
@@ -23,10 +23,13 @@ from typing import Final, Literal
 
 import torch
 
+# Optional Spaces GPU support (for HuggingFace Spaces deployment)
 try:
     import spaces
-
+    _SPACES_AVAILABLE = True
+except ImportError:
     spaces = None  # type: ignore[assignment]
+    _SPACES_AVAILABLE = False
 
 try:
     # Prefer HF cache / Hub downloads (works with Spaces `preload_from_hub`).
@@ -175,15 +178,19 @@ class ModelWrapper:
 
         self.device_preference = device_preference
 
-        #
+        # Local CUDA: keep model on device by default for better performance
        if keep_model_on_device is None:
-            keep_env = (
-
-            )
-            self.keep_model_on_device = keep_env == "1"
+            keep_env = os.getenv("SHARP_KEEP_MODEL_ON_DEVICE", "1")
+            self.keep_model_on_device = keep_env != "0"
         else:
             self.keep_model_on_device = keep_model_on_device
 
+        # Support CUDA device selection via env var
+        cuda_device = os.getenv("CUDA_VISIBLE_DEVICES")
+        if cuda_device and device_preference == "auto":
+            # Let PyTorch handle device mapping via CUDA_VISIBLE_DEVICES
+            pass
+
         self._lock = threading.RLock()
         self._predictor: torch.nn.Module | None = None
         self._predictor_device: torch.device | None = None
@@ -560,16 +567,8 @@ class ModelWrapper:
 
 
 # -----------------------------------------------------------------------------
-#
+# Module-level entrypoints
 # -----------------------------------------------------------------------------
-#
-# IMPORTANT: Do NOT decorate bound instance methods with `@spaces.GPU` on ZeroGPU.
-# The wrapper uses multiprocessing queues and pickles args/kwargs. If `self` is
-# included, Python will try to pickle the whole instance. ModelWrapper contains
-# a threading.RLock (not pickleable) and the model itself should not be pickled.
-#
-# Expose module-level functions that accept only pickleable arguments and
-# create/cache the ModelWrapper inside the GPU worker process.
 
 DEFAULT_OUTPUTS_DIR: Final[Path] = _ensure_dir(Path(__file__).resolve().parent / "outputs")
 
@@ -605,8 +604,60 @@ def predict_and_maybe_render(
     )
 
 
-#
-
-
-
-
+# -----------------------------------------------------------------------------
+# GPU-wrapped entrypoint (Spaces or local)
+# -----------------------------------------------------------------------------
+
+
+def _create_spaces_gpu_wrapper(duration: int = 180):
+    """Create a Spaces GPU-wrapped version of predict_and_maybe_render.
+
+    This is called dynamically based on hardware configuration.
+    """
+    if spaces is not None and _SPACES_AVAILABLE:
+        return spaces.GPU(duration=duration)(predict_and_maybe_render)
+    return predict_and_maybe_render
+
+
+# Default export: use local CUDA unless explicitly configured for Spaces.
+# The actual wrapper is created dynamically based on hardware_config.
+predict_and_maybe_render_gpu = predict_and_maybe_render
+
+
+def configure_gpu_mode(use_spaces: bool = False, duration: int = 180) -> None:
+    """Configure the GPU mode at runtime.
+
+    Args:
+        use_spaces: If True and spaces module available, use @spaces.GPU decorator
+        duration: Duration for @spaces.GPU decorator (seconds)
+    """
+    global predict_and_maybe_render_gpu
+
+    if use_spaces and _SPACES_AVAILABLE and spaces is not None:
+        predict_and_maybe_render_gpu = spaces.GPU(duration=duration)(predict_and_maybe_render)
+    else:
+        predict_and_maybe_render_gpu = predict_and_maybe_render
+
+
+def get_gpu_status() -> dict:
+    """Get current GPU status information."""
+    import torch
+
+    status = {
+        "cuda_available": torch.cuda.is_available(),
+        "spaces_available": _SPACES_AVAILABLE,
+        "device_count": torch.cuda.device_count() if torch.cuda.is_available() else 0,
+        "devices": [],
+    }
+
+    if torch.cuda.is_available():
+        for i in range(torch.cuda.device_count()):
+            props = torch.cuda.get_device_properties(i)
+            status["devices"].append({
+                "index": i,
+                "name": props.name,
+                "total_memory_gb": round(props.total_memory / (1024**3), 2),
+                "compute_capability": f"{props.major}.{props.minor}",
+            })
+
+    return status
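
One subtlety with `configure_gpu_mode`: it rebinds a module-level name, so callers should resolve the entrypoint through the module rather than a `from`-import taken beforehand. A sketch (the argument list mirrors the method call shown in mcp_server.py and is an assumption):

```python
from pathlib import Path

import model_utils

model_utils.configure_gpu_mode(use_spaces=False)

# A plain `from model_utils import predict_and_maybe_render_gpu` captured
# before configure_gpu_mode() would keep pointing at the old binding.
video, ply = model_utils.predict_and_maybe_render_gpu(
    Path("input.jpg"),
    trajectory_type="rotate_forward",
    render_video=True,
)
```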

--- a/pyproject.toml
+++ b/pyproject.toml
@@ -7,8 +7,9 @@ requires-python = ">=3.13"
 dependencies = [
     "gradio==6.1.0",
     "huggingface-hub>=1.2.3",
+    "mcp>=1.0.0",
     "sharp",
-    "spaces
+    "spaces>=0.30.0",
     "torch>=2.9.1",
     "torchvision>=0.24.1",
 ]

--- a/requirements.txt
+++ b/requirements.txt
@@ -1,6 +1,7 @@
 gradio==6.2.0
-spaces==0.44.0
 huggingface_hub>=1.2.3
+spaces>=0.30.0
 torch
 torchvision
 sharp @ git+https://github.com/apple/ml-sharp.git@cdb4ddc6796402bee5487c7312260f2edd8bd5f0
+mcp>=1.0.0